Es Cue El: 2011

Tuesday, December 13, 2011

T-SQL Tuesday #025 : Tricks of the Trade

This week's SQL Tuesday theme is "Tips and Tricks", which led me to start making an inventory of the tricks and tools I use almost daily as a SQL developer.

As an old(er) dog, many of the tricks I use are things I've learned over the past couple of decades, and as I document them, I'm finding that in many cases, there's new ways of doing them (DMO's, for example). It's no excuse for not learning the new, better methods, but in my defense I'll just say that I have to support clients back to SQL 2000 (and occasionally even 7!) so I find that the old ways still work really well, even if newer versions introduce newer features.

First, some tweaks to the environment:

My first trick is the only one with a price tag. I've been working with Redgate's SQL Developer Bundle all summer, and I can't imagine ever not using it for SQL development. It's pricey, but I'd estimate that I work 50-100% faster with it. If there's money in the budget, ask your boss to get you the best tools available. I'll be happy to talk to your boss if you want a testimonial of how much productivity you can buy with these tools.

Next, color your connections! SSMS will do this if you tell it to, but I prefer the coloring provided as part of the free SSMS Tools Pack. Nothing reminds you that you're on the production server like a giant red bar at the edge of the query editor. SSMS Tools Pack also automatically saves all your tabs periodically with a searchable history, gives you SQL snippets (similar to SQL Prompt's), helps analyze execution plans, and about ten other handy things, all free.

Finally, if you're staring at the query editor all day, why not try a soothing color scheme? I really prefer a black background with light text. Tomas Restrepo wrote a handy little tool for porting color themes between Visual Studio and SSMS. I've found it handy to backup the following registry key after I set the colors I like:

HKEY_CURRENT_USER\Software\Microsoft\Microsoft SQL Server\90\Tools\Shell\FontAndColors

which makes it handy to switch between themes. I use a dark background for development, and a white background for presentations. There's lots of free themes out there at Studio Styl.es for you to try. (Note: if you import a registry key with a new color scheme, you can implement the changes without starting SSMS by going to Tools/Options, touching Fonts and Colors, and hitting OK.)

Everyone's a DBA Sometimes:

Even though I'm usually in development mode, when things get slow, any database person will do.

Activity monitor will help you find blocking processes, but often when there's blocking, there's panic. I keep this code in my snippets (or create a stored procedure and add it to SSMS's keyboard shortcuts):

SELECT * FROM master..sysprocesses WHERE blocked<>0 and blocked<>spid
SELECT * FROM master..sysprocesses WHERE blocked=0 and spid in
(select blocked from master..sysprocesses where blocked<>0 and spid<>blocked)

It immediately tells me if anyone's being blocked, and the head of the blocking chain. It also works in every version of SQL. DBCC INPUTBUFFER(spid) is usually not far behind.

Yes, you can do this from Activity Monitor and most tools have a way to find it relatively quickly, but having a snippet defined lets me run it with two keystrokes, immediately and repeatedly.

The other thing I setup in any environment is a snippet or stored procedure to gather information from a variety of application-specific logs table. Being able to see the last ten errors from a few apps, the ETL processes, maybe even some realtime stats like number of sessions, is invaluable when things are going wrong.

Development Tricks

----------------------------------------------

I've come to rely on Redgate's SQL Seach (it's free!) for finding anything in the database. If I don't have access to it, I usually have this in a snippet for searching the text of stored procedures, triggers, etc:

SELECT schema_name(SO.schema_id)+'.'+SO.NAME,SO.TYPE_DESC,SM.DEFINITION,SO.CREATE_DATE,SO.MODIFY_DATE
FROM SYS.SQL_MODULES SM
INNER JOIN SYS.OBJECTS SO ONSO.OBJECT_ID=SM.OBJECT_ID
WHERE SM.DEFINITION LIKE '%blah%'

----------------------------------------------

Before I started using numbers tables (sometimes called "tally tables"), I thought my life was complete. Now I use them for all kinds of things, particularly populating data. Anytime I need a set of integers, my go-to set is:

SELECT number FROM master..spt_values WHERE type='p'

It contains integers 0-2047, and if you need bigger, you can join it to itself to get 4 million, 8 billion, etc.

You can check out the above links for ideas about how to use numbers tables, but for example if I need a table of days for the next five years or so, you can whip one up with:

SELECT DATEADD(d,Number,'1/1/12') as MyDate, Number%7+1 as DayOfWeek
FROM (SELECT Number FROM master..spt_values WHERE type='p') numbers

----------------------------------------------

Hardly a day goes by when I don't have to convert to or from a comma-delimited list, usually for passing sets around when table-valued parameters aren't an option. FOR XML PATH is a quick way to get a list of values into a single, comma-delimited string:

DECLARE @mylist VARCHAR(MAX)
SET @mylist=(SELECT TOP 10 ','+name FROM sysobjects WHERE type='p' FOR XML PATH(''))
SET @mylist=STUFF(@mylist,1,1,'')
SELECT @mylist

You can use a numbers table to quickly split the delimited string back into its components. (Also see Jeff Moden's definitive piece on tally-table splitters):

SET @mylist=','+@mylist+','
;WITH n AS (SELECT number FROM master..spt_values WHERE type='p' AND number < LEN(@mylist))
SELECT SUBSTRING(@mylist,number+1,CHARINDEX(',',@mylist,number+1)-number-1) FROM n WHERE SUBSTRING(@mylist,number,1)=','

----------------------------------------------

Finally, hardly a day goes by when I don't need to get something random. Although RAND() will get you a random number, that doesn't help if you want to generate a set of random numbers. You can always use a loop or a cursor, but there's a better way.

SELECT ABS(CHECKSUM(NEWID()))%9+1 FROM MASTER.sys.columns

will return a list of random numbers between 1 and 9. This can be particularly handy for generating test data, particularly when combined with TOP() and CROSS APPLY (more on this in another post).

Of course, NEWID() also comes in handy for getting a random selection of existing rows.

SELECT TOP 15 * FROM MyTable ORDER BY NEWID()

will return a random set of 15 records every time it's run.

Social Tricks

If you're reading this blog, you probably know the most important trick: be connected! There's a huge network of wicked-smaht, generous SQL experts out there, and more free blogs and videos out there than you'll ever have time for. Here's some of my favorites:

Brent Ozar PLF's Weekly News letter is the best thing about Monday, and usually points me to a dozen cool things, SQL and not.
There's probably a SQL PASS chapter near you. And if there's not, they have Virtual Chapters. These are great way to learn about SQL topics, meet great SQL people, network, and much much more. Check yours out!
SQL Saturdays are a great, free alternative to the incomparable PASS Summit. Some pretty phenomenal speakers donate their time and wisdom to these, and again they're a great chance to meet fellow SQL geeks. (Hint: many presenters also generously upload their presentation materials to the SQL Saturday sites.)
SQL Bits, SQL Server Cental (particularly their Stairway series), and SQL Server Team Blogs are all great spots to go read and watch tons more great SQL content.
And of course, if you're not on Twitter, get on! Almost all of the SQL MVP's, presenters, and even the SQL development team follow #sqlserver and #sqlhelp. Brent Ozar even wrote a free book specifically for SQL Twitter virgins, so you have no excuse. A typical Twitter experience: I tweeted thanks for some SQL feature I liked, and a few minutes later, the Microsoft engineer who wrote it tweeted back, "You're welcome!" How cool is that?

Tuesday, December 6, 2011

Can Your Network Do This?

A couple of months ago, my wife and I took a road trip from Cleveland to Pittsburgh. Stuck in bumper to bumper traffic, the woman behind me suddenly lept from her car, ran up behind my car, and took a picture of my license plate. She hollered something like "I love SQL!" and disappeared back into traffic.

It took three days for a friend to send this back to me in a tweet:

Aside from the cosmic coolness of having attractive, nerd-appreciating women accosting me in traffic, it occurred to me that this was an indication of a pretty good professional network. Can your network do this? How many degrees of Kevin Bacon are you from a random stranger on the freeway?

One of my personal and professional goals this year was to do more networking. I don't suppose I'm alone in the tech world in putting "networking" right up there with "root canal" in the list of things I'd rather not be doing or having done to me. Images come to mind of schmoozing and making uncomfortable small talk, and telling people we should do lunch, or take lunch, or whatever verb it is people do to lunch while networking.

As it turns out, networking can largely be whatever you want it to be, so long as you're connecting to other people. Yes, I've "done" a couple of really uncomfortable lunches with people trying to sell me something, or see if I could sell something for them. But I've also been writing this blog, tweeting, drinking beer with other SQL geeks, and attending some interesting (for me) conferences. Yes, it's still an effort-- I'm an introvert, and as much fun as I have hanging out with SQL people, I'd often much rather just be at home with my wife and cats. But really, it hasn't been a chore-- just a conscious effort to get more involved in the SQL community.

And you can do it, even if you have major social anxiety.

Here's some things that I've found good luck with:

Find a PASS chapter. There are groups in most major cities, and if there isn't one near you, there are virtual chapters. It's a chance to hang out with other SQL people, learn some things, and just get a feel for what's going on in the SQL world, both at large and in your town.
Get involved with SQL Saturday, SQL Rally, or SQL Summit events. Some of these are run on shoestring budgets and eager for any help you can give. Even if your job is just handing out flyers or stuffing bags, being a volunteer gets you backstage access to presenters, MVPs, and others in the community. If you can't volunteer, at least try to attend.
Blog something. I should spend more time on this, and web presence in general, but having a blog is an opportunity to share your knowledge and get your name out there a little bit.
Read other people's blogs. SQL Bits, SQL Server Cental (particularly their Stairway series), and SQL Server Team Blogs are all great places to find some great SQL blogs to start.
Tweet something. The SQL community is hugely involved in Twitter, and that includes the guys and gals who write SQL Server. I once tweeted "Thanks, Microsoft!" about some SQL feature that I really liked, and a few minutes later, the developer who wrote that feature tweeted back, "Glad you like it!". How cool is that? Follow #sqlserver and #sqlhelp to start, and read Brent Ozar's great introduction to Twitter for SQL folks.
Of course, get yourself LinkedIn. It's not like Facebook, honest-- it's just a good way to keep in contact with people you've worked with.

Notice that some of these don't even require you to leave the living room. Of course, you'll make more meaningful connections if you actually get out and talk to people. You might even-- gasp-- make a friend or two. But at a minimum, the goal is to stay connected to the community.

One big epiphany I had this year was gleaned from the classic What Color Is Your Parachute: a big goal of networking is to increase the number of people you can call (or email) to get introduced to someone who can help you. That's it. These people don't have to be able to vouch for your work personally (although that helps), they just have to be able to introduce you.

So, for example, I volunteered to help organize SQL Saturday with my local PASS chapter, which introduced me to dozen or so SQL people here in Cleveland. Now, aside from my ability to order great sandwiches, they don't know much about my SQL skills. But collectively they probably know someone in every SQL shop in town, or they know someone who does. So now I not only have some cool SQL friends, but I have a little network of people, and when I find myself asking, "I wonder who I can talk to at XYZ Corp," chances are one of these people can hook me up with a name and an introduction.

Further out at the edges of my network are the presenters I've met and watched at SQL Saturday, bloggers I've read, and the aforementioned SQL developers who tweet or lurk on the same hash tags as I do. I don't expect Tim Ford or Jeremiah Peschka or Brent Ozar know who I am... but I know them. (And I know people who know them, bwuahaha.) More importantly, I know they're the experts in their areas, and that they have a wealth of knowledge and materials that they generously share in talks and blogs. They're also active twitterers (tweeters?). If I had some question about DMO's, window functions in Denali, or how SANs work, I know where to go.

As you're making your New Year's resolutions next year, think about what you can do to increase the size of your SQL network. Also, think about becoming a nicer driver-- the person behind you might be closer than you think!

Friday, September 23, 2011

America's Next Top (ER) Model: Plowed!

I've been doing a lot of modeling lately, and it's starting to affect my personal life. In today's article, I'll explore the personal story of how the stress of modeling found me getting plowed every day, and how I managed to stop.

A Day in the Life of a Model
As I mentioned, I'm do a lot of data modelling these days. My team uses Visio for all of our ER diagrams. It works pretty well, and with the help of a free plug-in from Orthogonal, we can generate our initial DDL scripts right from the diagram.

Here's where the stress of modeling starts:

I have a nice ER diagram, all organized in a visually pleasing way:

.. and then I go to move one of the entities a little bit, and that's when it happens:

A whole bunch of other entities scatter out of the way, often landing outside the page borders. Rounding up each of these stray sheep typically results in a few more sheep wandering off in random directions, and it's a long, frustrating day on the ranch.

I'm not even sure how to begin Googling, because there isn't a good word for the behavior. I tried things like "visio object repel" or "visio objects rearrange" or "visio sucks sucks sucks sucks er diagrams".

Getting Plowed

In the end, I found it the old fashioned way: randomly clicking around until I found something that looked dodgy. It turns out, the behavior is called "plowing," and as usual, Microsoft is trying to help you. With plowing enabled, objects behave much like the girls did towards me in junior high school: they jump way out of the way as soon as you get anywhere near them.

What's happening above is something I'd dubbed a "cascading plow" (mmm, cascading plow...). The entity I moved got close to another entity, which jumped ten feet in some random direction, where it in turn ended up perilously close to another entity, which jumped out of the way, and so on, and so on...

From what I can tell, there's two ways to turn this behavior off:

Per Object

Right-click any entity and choose Format, Behavior, and go to the Placement tab. Here you can set one of two properties:

"Move other shapes way on drop", which you want to set to "Plow no shapes"
"Do not allow other shapes to move this shape away on drop.

Which one you set is kind of a world-view thing:

Do you stop Moe from poking everyone else in the eye, or do you teach everyone else to do the Nose Salute every time Moe comes at them? I suppose you could do both, but really I don't think Larry should have to always be on the lookout, so I set the first option to "Plow no shapes." (Nyuk nyuk nyuk)

With this enabled, you can now move your entity anywhere you like, and it won't push other entities around anymore. Of course, you'll run the "risk" of being able to have entities touch or even --gasp!-- overlap, but you're in control of that. Connectors (i.e. relationships) will continue to route themselves automatically, but I find Visio does this pretty well.

Of course, you'll have to set this property on each and every entity you add to your diagram.

Forever and Ever, Amen

I don't really want to have to set the behavior on every entity from now until forever. Having fixed one of my existing ER diagrams, the Plow No Shapes behavior works perfectly, and I don't want to ever go back to bumper car mode.

You can't officially modify the default stencils that ship with Visio (a stencil is the official name for a category of like controls-- when you select File --> Shapes, the little green window-pane items are stencils). So if you like, you can right-click Entity, and choose Add to My Shapes, and add it to your Favorites stencil or a new stencil.

Me, I'm fine with all the other controls in the Entity Relationship stencil, which opens by default with any new "Database Model Diagram". I just want to change the behavior of the standard Entity, forever. If you want to do that, you'll want to open up the standard Entity Relationship stencil, which on my system is located in c:\Program Files\Microsoft Office\Visio11\1033\ENTITY_U.VSS.

With the stencil opened for editing (either your own or the standard ENTITY_U one), you can now right-click the Entity object, and choose Edit Master --> Edit Master Shape. Right-click the shape, Format, Behavior, Placement, Plow No Shapes, and save.

Now when you create new ER diagrams, the Entity object won't act like an ass anymore. You can move your entities around any way you like, and nothing with jump (or be plowed) out of the way.

It's made my modelling life a whole lot easier, although I still have trouble with the four-inch heels.

Sunday, April 24, 2011

Do You Know .NET?

I got an email from a friend recently, saying:

"As I search for my next situation, I stop when I see .NET as a requirement.

So... what does it take to learn .NET?"

This turns out to be the same friend who inspired this blog post. I guess he just has a knack for asking blog-worthy questions.

Like many people I know from my last job, this guy's what I'd call a technology generalist. He's spent most of his career supporting, troubleshooting, tweaking, and business-analyzing software. A lot of shops have people like this around: they know enough about a variety of technologies to move data around from spreadsheet to database, maybe a little scripting skills, enough SQL knowledge to write a query or ten. They're often the people who respond to the constant stream of ad-hoc queries coming from senior leadership or finance.

As a database specialist and sometimes-programmer, I've dealt with this question myself. What does it mean to "know" .NET? I once tried to explain to a sales team what .NET was, and it just about killed us all. "Dot NET" as a marketing term has come to stand for as broad-- and as vague-- a set of technologies as "green" has: it may very well have a very specific meaning to the person using it, or it might just be that person's shorthand for "cutting edge" or "Microsoft".

However, if ".NET" is listed further down in the job description, I understand it to mean three things:

"We work primarily on MS Windows-based software"
"We use C# (probably), VB.NET (maybe), or some other .NET programming language"
"We're not using "old" technology like VB6, C++, Fortran, etc.

How much experience they expect you to have, and how much exposure to .NET languages the job actually has, can vary a lot.

Here's how I approach it, and my advice to my friend: if the primary focus of the job is .NET development, look elsewhere. These jobs are for people who are computer programmers by trade. It's a little like the difference between a General Practitioner and a brain surgeon. A GP could operate on your brain in a pinch, and do a much better job than a non-doctor. But you're still probably better off with a mediocre brain surgeon than an exceptional GP.

The two answers I want to be able to give (honestly) in an interview are the following:

I've worked with/supported/extended/etc programs that were written using .NET technologies-- I think it was mostly ASP.NET for the Web/UI layer, C# for business logic, and SQL Server on the back-end.
I've done a little C# programming recreationally-- not enough to consider myself a programmer, but I wrote a little web app to store recipes in a database, and I'm familiar enough with what a C# program looks like-- if I looked through someone's else's code, I could probably work out what it's doing.

I want to establish right up front that I'm not a programmer, but that I have enough experience and professional curiosity to have dabbled in it a bit.

At this point, the employer should have enough to tick (or not tick) that requirement off the list. If they really need someone who can take over primary coding responsibilities, I'm not the guy, and I don't want to be. If they just need someone who can maybe maintain and tweak a couple little C# utilities that the last guy wrote, as 10% of the job duties, I can probably do that. It might take me ten times longer than a "real" programmer, but let's talk about how my skills fit the other 90% of the job.

So, how does someone like my friend get from "I get the idea that .NET is a framework" to being able to cover these two points? Here's my advice:

For the first answer, ask questions about the software you're working on-- even if you're just doing tech support, or gathering requirements, or whatever. Talk to the programmers and ask them what languages and technologies they're using, and why. Don't be afraid to ask stupid questions: programmers have a reputation for being brusque, but at some point they have to explain what they do to their spouses and moms. Ask them, "What exactly is Ruby on Rails anyway?" and the good ones should be able to give you some analogies at least, or tell you why it differs from other development languages.

Also, as long as you're willing to assume the "dumb guy" role, most people get a big ego boost expounding on their area of expertise to a newbie. It's why guys love explaining the finer points of football to an interested female (and why girls do so well asking, "Tell me again about the guys in the stripey outfits?") Programmers love being the smartest guy in the room, and you can get a lot of knowledge by starting questions with, "I understand you're the X guru here..."

Ask these questions, all the time, about whatever web site or app you're supporting. If you're no longer at the job, contact some former colleagues and ask them. The goal isn't to learn all these technologies-- just to be able to talk about them a little. Good sales guys do this all the time-- they may not know a rack from a pinion, but they want to be able to tell the average Joe why he wants one.

For the second answer, you'll have to work a little harder. You're going to want to teach yourself-- or have someone teach you-- C#. This seems to be the most common language in use out there, so it's probably the place to start, although VB.NET is a bit friendlier and will work just as well for this objective. If you have some familiarity with any kind of computer programming-- maybe you took a Fortran class way back when-- and you have some familiarity with variables and parameters, you've got a head start. If not, you'll be learning new concepts as well as a new language, so it might take longer.

Get yourself a copy of Microsoft Visual Studio Express here (it's free!) and pick up a beginner's C# book-- lots of people seem to like the Head First series, like this one on C#. A lot of these are broken down into one or two hour lessons, and if you dedicate a few nights a week to studying, you can learn the basics in a month or two. Alternatively, check with your local community college or continuing ed program. Most offer introductory classes. As you work through each exercise, make up little variants to test yourself: "What if I want my window to be red instead of blue?" "What if I want to let the user choose the window color?"

Make sure the book you use shows you how to interact with a database, and how to write a web application as well as a Windows/desktop one. This will force you to learn a little SQL, and understand how IIS and web sites work.

From there, it's kind of up to you. Maybe you'll get the bug and want to take some more classes on computer programming, or you'll decide you want to write the next Angry Birds and make your millions $3 at a time.

Most likely, you'll learn a little bit about programming-- much of which you'll forget-- and maybe pick up some skills you can use at work. I've found that the most useful thing to do, once you know a little of a language, is to challenge yourself to some business task: write a little application that looks through all the documents in folder for and reports how many times the word "mortgage" appears. You'll have to search the web to learn how to loop through files in a directory, how to open up various file types, how to search. These are the things a technology generalist often needs to be able to do with .NET: just complicated enough that you don't want to do it by hand, but simple enough to accomplish with 100 lines of code or less.

Remember, the goal here isn't to fake your way through a test, or to just learn enough "buzz words" to impress the interviewer (actually, that is the goal of the first answer). The goal here is to understand enough about C# to be able to use it to do some basic tasks for you. You want to be the GP who can do a sloppy brain surgery if you absolutely have to. When the job calls on someone to write a little routine that sends an email if the stock goes over $40 a share, you want to have just enough skills to Google "C# get stock price" and "C# send email" and be able to cobble the examples together to make something that gets the job done. Then you can tell the next interviewer, "I'm not a programmer, but once I had to write a little app to monitor the stock price and send an email, and it worked."

I can't stress this last point enough. People go to school for years to learn professional programming techniques, and then often spend years honing these skills on the job. And many of these people still might not describe themselves as really "knowing" .NET. You're not trying to become the next rockstar programmer in a month.

But, there's a ton of little tasks on the job that can be solved with a few lines of the right code. The powerful thing about .NET is that there are thousands of libraries out there to open spreadsheets, send emails, move files around, monitor RSS feeds, get the weather, and any other of a million little tasks, and there are millions of web sites and forum posts explaining how to use them. A tech generalist has to be a little like a librarian: you might not know everything, but you know how to find out. A business analyst, or tech support rep, or QA engineer who can cobble together a little .NET code when they have to is often a very valuable person to have on your team.

Now, go out and "learn" .NET!

Tuesday, March 22, 2011

Entity Framework and the Inheritance Problem

[tl;dr Don't use EF's default TPT mapping if you care about... anything]

A friend and I were having lunch, and I mentioned a recent session I'd attended at the last 24 Hours of PASS called "Entity Framework: Not as Evil as You May Think". I talked about how much I hate ORM's and the hideous SQL that they generated.

As it turns out, my friend had recently started using Entity Framework for a project he was working on. As it turns out, his was the kind of project that EF is great for: the database won't ever hold more than a few thousand records in a handful of tables, and the whole thing will run in SQL CE in a single-user database. In short, performance and scalability aren't huge concerns-- he just wants a basic data store.

But he had a problem. During the initial population of the database, they might insert 1000 or more records into a certain table. This process was taking minutes, and he'd tracked it down to a single line of code that checked to see if the record was already in the database. A little snooping revealed that EF was generating a SQL statement that was 37,000 characters long. He suspected that the parse and compile time alone accounted for the 1+ second each lookup took.

My interest piqued, I had him take me back to his place and fire up Visual Studio so we could take a look. I've never actually used EF or SQL CE, and I wanted to watch it all in action. After fumbling around trying to remember how to see the query EF was generating (SQL CE doesn't have a Profiler), we found the ugly offender. It was a mess.

What we found, as described in this article, is that Table-Per-Type inheritance is pretty much a nightmare in EF. (Microsoft claims this is a "high priority work item" for the next version of EF, so we'll see how that goes.)

Let's take a look at how this goes. Suppose I have a base class, "Animal", and four classes which inherit from Animal: Dog, Cat, Tutle, and Fish. The Entity Model is shown here:

(click to enlarge)

When you generate a database from this model, you get a SQL table for each class (hence, Table Per Type, duh), with foreign keys joining Cat, Dog, etc to Animal. So far so good.

Now let's run a simple line of code to check whether an Animal named "Bob" already exists:

Animal a= animalContext.Animals1.FirstOrDefault(Animal => Animal.Name == "Bob");

And, let's see what EF came up with (sorry, you'll need to page down a few times):

SELECT
[Limit1].[C1] AS [C1],
[Limit1].[Id] AS [Id],
[Limit1].[Name] AS [Name],
[Limit1].[Age] AS [Age],
[Limit1].[C2] AS [C2],
[Limit1].[C3] AS [C3],
[Limit1].[C4] AS [C4],
[Limit1].[C5] AS [C5]
FROM ( SELECT TOP (1)
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Age] AS [Age],
CASE WHEN (( NOT (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL))) AND ( NOT (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL))) AND ( NOT (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL))) AND ( NOT (([UnionAll3].[C9] = 1) AND ([UnionAll3].[C9] IS NOT NULL)))) THEN '0X' WHEN (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL)) THEN '0X0X' WHEN (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL)) THEN '0X1X' WHEN (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL)) THEN '0X2X' ELSE '0X3X' END AS [C1],
CASE WHEN (( NOT (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL))) AND ( NOT (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL))) AND ( NOT (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL))) AND ( NOT (([UnionAll3].[C9] = 1) AND ([UnionAll3].[C9] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL)) THEN [UnionAll3].[C2] WHEN (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) END AS [C2],
CASE WHEN (( NOT (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL))) AND ( NOT (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL))) AND ( NOT (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL))) AND ( NOT (([UnionAll3].[C9] = 1) AND ([UnionAll3].[C9] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL)) THEN [UnionAll3].[C3] WHEN (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) END AS [C3],
CASE WHEN (( NOT (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL))) AND ( NOT (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL))) AND ( NOT (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL))) AND ( NOT (([UnionAll3].[C9] = 1) AND ([UnionAll3].[C9] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL)) THEN [UnionAll3].[C4] END AS [C4],
CASE WHEN (( NOT (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL))) AND ( NOT (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL))) AND ( NOT (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL))) AND ( NOT (([UnionAll3].[C9] = 1) AND ([UnionAll3].[C9] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C6] = 1) AND ([UnionAll3].[C6] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C7] = 1) AND ([UnionAll3].[C7] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll3].[C8] = 1) AND ([UnionAll3].[C8] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll3].[C5] END AS [C5]
FROM [dbo].[Animals1] AS [Extent1]
LEFT OUTER JOIN (SELECT
[UnionAll2].[C1] AS [C1],
[UnionAll2].[C2] AS [C2],
[UnionAll2].[C3] AS [C3],
[UnionAll2].[C4] AS [C4],
[UnionAll2].[C5] AS [C5],
[UnionAll2].[C6] AS [C6],
[UnionAll2].[C7] AS [C7],
[UnionAll2].[C8] AS [C8],
[UnionAll2].[C9] AS [C9]
FROM (SELECT
[UnionAll1].[Id] AS [C1],
[UnionAll1].[IsClawed] AS [C2],
[UnionAll1].[C1] AS [C3],
[UnionAll1].[C2] AS [C4],
[UnionAll1].[C3] AS [C5],
[UnionAll1].[C4] AS [C6],
[UnionAll1].[C5] AS [C7],
[UnionAll1].[C6] AS [C8],
[UnionAll1].[C7] AS [C9]
FROM (SELECT
[Extent2].[Id] AS [Id],
[Extent2].[IsClawed] AS [IsClawed],
CAST(NULL AS varchar(1)) AS [C1],
CAST(NULL AS varchar(1)) AS [C2],
CAST(NULL AS varchar(1)) AS [C3],
cast(1 as bit) AS [C4],
cast(0 as bit) AS [C5],
cast(0 as bit) AS [C6],
cast(0 as bit) AS [C7]
FROM [dbo].[Animals1_Cat] AS [Extent2]
UNION ALL
SELECT
[Extent3].[Id] AS [Id],
CAST(NULL AS varchar(1)) AS [C1],
CAST(NULL AS varchar(1)) AS [C2],
CAST(NULL AS varchar(1)) AS [C3],
[Extent3].[FreshWater] AS [FreshWater],
cast(0 as bit) AS [C4],
cast(0 as bit) AS [C5],
cast(0 as bit) AS [C6],
cast(1 as bit) AS [C7]
FROM [dbo].[Animals1_Fish] AS [Extent3]) AS [UnionAll1]
UNION ALL
SELECT
[Extent4].[Id] AS [Id],
CAST(NULL AS varchar(1)) AS [C1],
[Extent4].[IsHouseTrained] AS [IsHouseTrained],
CAST(NULL AS varchar(1)) AS [C2],
CAST(NULL AS varchar(1)) AS [C3],
cast(0 as bit) AS [C4],
cast(1 as bit) AS [C5],
cast(0 as bit) AS [C6],
cast(0 as bit) AS [C7]
FROM [dbo].[Animals1_Dog] AS [Extent4]) AS [UnionAll2]
UNION ALL
SELECT
[Extent5].[Id] AS [Id],
CAST(NULL AS varchar(1)) AS [C1],
CAST(NULL AS varchar(1)) AS [C2],
[Extent5].[ShellColor] AS [ShellColor],
CAST(NULL AS varchar(1)) AS [C3],
cast(0 as bit) AS [C4],
cast(0 as bit) AS [C5],
cast(1 as bit) AS [C6],
cast(0 as bit) AS [C7]
FROM [dbo].[Animals1_Turtle] AS [Extent5]) AS [UnionAll3] ON [Extent1].[Id] = [UnionAll3].[C1]
WHERE N'Bob' = [Extent1].[Name]
) AS [Limit1]

Notice the goofy kind-of-bit-mapped discriminator columns C4-C7. Clever, I'm sure. On my friend's database, there were five derived classes, and lots more columns, resulting in the famed 37kb SQL statement. And, of course, EF is going to use a beast similar to this for any table access to any of the base or derived classes.

This, it turns out, is how Entity Framework handles a simple inheritance with TPT. Neither my friend or I are experts on EF, but there doesn't appear to be a workaround here, without manually hacking the hell out of the mapping files-- which, as an added benefit, seems to make the graphical model designer in VS choke.

Here, I think, is the best way to work around this for now. Luckily, there's no legacy databases to worry about, so redesigning the table structure is an option.

1. Download and install the Entity Designer Database Generation Power Pack (EDDGPP to its friends).
2. Restart Visual Studio
3. Now in the entity model properties, you'll find a whole lot of new xaml files installed under the Database Generation Workflow property, as shown below.

4. Choose "Generate T-SQL Via T4 (TPH).xaml (VS)
5. Re-generate the database from the model.

This creates a single table for Animal (TPH stans for Table-Per-Hierarchy), with all the additional attributes from Dog, Cat, etc added as additional columns in Animal. EF also automatically generates a discriminator column in Animal, which is not mapped to a property in the object model.

There's a couple of caveats here:

Each property in each of the classes needs to be mapped as nullable, or have a default value defined. (Which makes sense: Fish won't have an IsClawed property, so we need a valid value for that column in SQL).
Each property in all the classes-- both base and derived-- need to be unique (which should be self-evident from the table structure).

With that all done, the object model is exactly the same as it was using TPT mapping, but the SQL is a LOT cleaner. Here's our code again:

Animal a= animalContext.Animals1.FirstOrDefault(Animal => Animal.Name == "Bob");

And here's the SQL generated:

SELECT TOP (1)

[Extent1].[__Disc__] AS [__Disc__],

[Extent1].[Id] AS [Id],

[Extent1].[Name] AS [Name],

[Extent1].[Age] AS [Age],

[Extent1].[IsClawed] AS [IsClawed],

[Extent1].[IsHouseTrained] AS [IsHouseTrained],

[Extent1].[ShellColor] AS [ShellColor],

[Extent1].[FreshWater] AS [FreshWater]

FROM [dbo].[Animals1] AS [Extent1]

WHERE ([Extent1].[__Disc__] IN ('Animal','Cat','Dog','Turtle','Fish')) AND (N'Bob' = [Extent1].[Name])

That makes a lot more sense all around, and I'm guessing it will perform a ton better. The drawback (if it's really a drawback) is that now all animals are stored in one table, which may not be desirable in some situations, and of course we had to redesign the database, which could be a bit of an issue if there's existing data in the TPT tables.

So, is Entity Framework evil? Not always: in most cases, the default EF mappings were fine for my friend, and as a developer-not-a-DBA, he's able to load and persist objects with ease. But, God help you when the EF magic decides to do something truly ponderous like this to persist a pretty simple object model.

Thursday, February 24, 2011

A SQL Poem

One of the jobs I recently applied for asked for a SQL poem as part of the application. I love stuff like this so I took a little time and came up with the following. (This actually did happen to me once...)

I didn't originally title it, but I think I'll call it...

"Wait... STAT!" by S

The week was starting bad for Joe

His app was running way too slow.

He summoned Steve, from regions nether

To patch performance back together.

“Well, here’s your problem-- clear as Windex:

You’ve got a missing clustered index.

This calls for not just any geek; we'll

Need one who speaks fluent SQL."

Functions, unctions, query plans:

"We'll need more seeks, and fewer scans.

And then, to get some extra speed, yer

Gonna need a stored procedure."

Steve worked fast, with mumbled zeal,

Declared, "Success! Just watch 'er squeal!"

Said Joe, "While I applaud your fervor,

My problem's on this other server."

Tuesday, February 15, 2011

Google Interview Question: "Explain a database in three sentences to your eight-year-old nephew."

[Reposted from my non-SQL blog. Apologies to the handful of readers who follow both.]

From this list of Google interview questions, I found the question: "Explain a database in three sentences to your eight year old nephew." Here's my attempts.

Attempt 1: Trying to Capture Third Normal Form

On a sheet of notebook paper, make a list of all your video game systems, and number them 1 through whatever.
On another sheet of paper, make a list of all your video games, also numbered 1 through whatever.
For each video game, find the game system that it goes with on the first list, and write the number from the first list next to the game on the second list.

Attempt 2: Eight-year-olds Are Still Kinda Into Santa

Santa has a list of every kid in the world, which includes the kid's name, address, age, and naughty/nice score.
Santa has another list of every toy in the world-- how much it costs, what color it is, how much it weighs (for the sleigh).
Santa has a special "many-to-many" list with just the kid-number and toy-number, to keep track of which kids want which toys, so it's really easy to figure out things like how many of each toy he needs to buy.

Attempt 3: They Learn This Stuff in the Womb, I Swear

You know how you implemented MongoDB for the social networking site you wrote last year, after the metadata structure got sufficiently gnarly for a simple key/value store, and users were demanding the ability to search on any attribute?
Well, when I was a kid, we had these things called Relational Database Management Systems, which were really good at enforcing relational integrity (which was important because the computers still depended on algorithms instead of magic like they do today) but you had to know a thing or two about relational data design, and the RDMS got kind of slow when everyone on the planet started uploading holographic 3D videos of what they had for lunch every day.
Now go get your Uncle S a beer.

Friday, February 11, 2011

SQL Saturday "Action Items"

Kendra Little's "Cerealizable" Isolation Level

SQL Saturday #60 was a huge success! As a volunteer, I'm proud to be part of a team that put on a great event. As an attendee, I'm kind of blown away at the quality of the event, especially knowing that it was all donated time and money. (None of the speakers were paid, and many actually flew to Cleveland in February on their own dimes! You guys rock!)

The best thing about going to events like SQL Saturday and the PASS Summit is that you get to meet and hear all these fantastic speakers. (The worst thing is, there's never time to meet and listen to all of them!) This week's new "discoveries" for me: Jeremiah Peschka, Aaron Bertrand (although I'm a huge fan of Aaron's "Bad Habits" series already), Kendra Little, and Tom LaRock. All of these folks are wicked smaht and fun to listen to, and Kendra even draws groovy little cartoon drawrings like the one I shared here.

As usual, I went home with visions of new SQL tools and techniques dancing in my head. Here's some things on my "to investigate" list:

Erin Stellato introduced me to ClearTrace!!! Die, die, read80trace!!! Okay, that's not fair-- read80trace has helped me through many performance audits, but it's kind of unruly to use. ClearTrace looks like the answer to a SQL Profiler-analyzer's dreams. Must learn it.
About everyone made it clear that Twitter is not just for posting your breakfast intentions anymore. Brent Ozar wrote a little book to help old folks like me catch up. I actually joined twitter last night, and I feel more connected already, at least to the SQL community.
Tom LaRock finally explained waitstats in a way that made sense. I'm vowing to finally start looking at waitstats to diagnose performance issues.
Kendra Little got me excited about snapshot isolation level again. I thought it was about the coolest thing ever when it came out, but then never had the time or load test facilities to try it out. I'm still convinced this will provide huge performance boosts to clients of my former employer, where blocking and deadlocking reign.
Lastly, I'm going to get on the schedule of my local SQL user group to present on some SQL topic. Giving SQL presentations was my favorite part of my last job, and seeing all these great presenters has me itching to get back on the other side of the lectern again.

That should be enough to keep me busy for a while.

Wednesday, February 9, 2011

An Encryption Problem

Here's the scenario: we have a SQL Server database that until recently stored sensitive information in plaintext (yikes!) And we have an Access 2003 database in which we built a custom little front-end, with linked tables to the SQL database. The vendor recently encrypted some of the fields in the database (thanks!) and now we need to update the Access database to comply.

The documentation from the vendor provided a basic example of accessing the encrypted information:

OPEN SYMMETRIC KEY APPKEY DECRYPTION BY CERTIFICATE APPCERT

SELECT CONVERT(VARCHAR,DECRYPTBYKEY(CUSTOMER.SSN)) AS SSN FROM CUSTOMER

That should be a pretty easy change in most code-- just open the key, wrap the encrypted field in a couple of functions, and you're good to go. But linked tables in Access present an interesting problem. I'll spare you the laundry list of things I tried, but suffice it to say that lots of approaches almost worked, but nothing quite let me get the key open before calling DecryptByKey.

What saved me was a little function called DecryptByKeyAutoCert, which extracts the private key from the cert and uses it to decrypt the value, all in one fell swoop. Without the need to open the key separately, I could create a SQL view, and use INSTEAD OF triggers for the updates. With baited breath, I linked my new view in Access... and it worked!

In short, here's the SQL that makes it all possible:

create view my_customer as
select customerid, fname, lname, convert(varchar(12),DecryptByKeyAutoCert(cert_id('APPCERT'),null,customer.ssn)) as ssn
from customer

create trigger trg_my_customer_upd on my_customer
instead of update as
begin
   open symmetric key APPKEY decryption by certificate APPCERT
   update customer set fname=i.fname, lname=i.lname, ssn=EncryptByKey(key_guid('APPKEY'),i.ssn)
   from inserted i where i.customerid=customer.customerid
   close symmetric key APPKEY
end

create trigger trg_my_customer_ins on my_customer
instead of insert as
begin
   open symmetric key APPKEY decryption by certificate APPCERT
   insert into customer (customerid, fname, lname, ssn)
   select customerid, fname, lname, EncryptByKey(key_guid('APPKEY'),ssn)
   from inserted
   close symmetric key APPKEY
end

With this view created, we can now issue simple SELECT, UPDATE, and INSERT queries against my_customer, treating the ssn column as a plaintext column. The view decrypts the data as it comes into the resultset, and the triggers ecrypt the data as it goes back into the database. In short, the my_customer view now acts just like the original customer table.

All we had to do then was to delete the original linked table in Access, link to the new view, and rename the linked table to the original name. No need to rewrite any other logic in our mini-application!

I simplified a few things here, and in our case the cert was created without a password, but it's pretty easy to add that in if you need do.