Bad habits to kick : using alias types
In my last post in this series, I talked about using the visual designers in SSMS. This time, I wanted to treat the use of alias types.
To clarify what I am talking about here, alias types used to be called "user-defined data types." Then, when CLR came around in SQL Server 2005, they gave us the ability to create our own CLR user-defined types (UDTs) – a different beast altogether. I won't get into the differences, but to avoid confusion, they came up with a new name (an alias, if you will; how ironic) for the old-style user-defined data types, which better reflects what they actually do: alias types.
In general, the thinking behind
user-defined data types alias types is that you can use this central definition to declare the properties of a column type across your entire database. For example, if you wanted to always store e-mail addresses in a VARCHAR(320) column, you could do this:
CREATE TYPE [email] FROM VARCHAR(320) NOT NULL;
(In SQL Server 2000, you would use sp_addtype; for this discussion, I'll assume SQL Server 2005 or better.)
Then you could create a table and procedure like this:
CREATE TABLE dbo.Contacts ( ContactID INT PRIMARY KEY, EmailAddress [email] UNIQUE ); CREATE PROCEDURE dbo.Contact_Create @ContactID INT, @EmailAddress [email] AS BEGIN SET NOCOUNT ON; INSERT dbo.Contacts ( ContactID, EmailAddress ) SELECT @ContactID, @EmailAddress; END GO
This would ensure that your developers always use a VARCHAR(320) column for e-mail addresses, because they are instructed to use the [email] data type instead of picking their own at random. This provides consistency throughout your tables, and prevents you from losing data when someone accidentally declares a variable or parameter as VARCHAR(100) or VARCHAR(255). Of course someone can still do that, if they "forget" that they are supposed to use the [email] data type… but SQL Server can't help you solve that problem.
One issue with this approach is that the definition of e-mail address (or at least how you understood it at the time you created the type) can change over time. I know that many people use VARCHAR(128) or VARCHAR(255) for e-mail addresses, even though those are far too short (at least according to the standard). In this example, if you picked a big enough size, it is unlikely to affect you, as the standard hasn't really changed (apart from the inclusion of Far East languages, requiring Unicode). Do you want a real example of a supposedly fixed size data element that has changed? ISBN. I would bet that at one point Barnes & Noble thought it might be a good idea to create an alias type called "ISBN" from CHAR(10) so that people wouldn't be using VARCHAR(10), NVARCHAR(10), VARCHAR(50), VARCHAR(MAX), and other silly ways to define the ISBN column. Then when ISBN-13 codes became standard, they would have regretted doing so. Why? Well, let's try modifying an alias type once it's in use; let's say we need to expand our email type from above to support 350 characters. Our instinct would be to use DDL like this:
ALTER TYPE [email] FROM VARCHAR(320) NOT NULL;
But we can't:
Msg 102, Level 15, State 1, Line 1 Incorrect syntax near 'TYPE'.
In fact, there is no DDL statement that supports ALTER TYPE. (And no, sp_rename or using synonyms won't help you here, nor is there a system procedure called sp_altertype.) So can't we just drop the type and re-create it? You would think you could do the following:
DROP TYPE [email]; GO CREATE TYPE [email] FROM VARCHAR(350) NOT NULL; GO
But you get this error message:
Msg 3732, Level 16, State 1, Line 1 Cannot drop type 'email' because it is being referenced by object 'Contacts'. There may be other objects that reference this type.
You need to follow this process instead:
- find *all* of the objects that use this alias type (this includes parameters definitions and variable declarations);
- modify them all so that they point at the "regular" type (or a different alias type);
- drop the alias type using DROP TYPE;
- create the alias type with the new definition;
- modify all of your objects to point back at the original alias type again.
Now take the case with ISBN, which might also serve as the primary key and have related tables scattered throughout the database. In addition to the above steps, before you start step 2 above, you will also have to temporarily drop all of the constraints, and drop them in the correct order.
Once you have finished all 5 (or 6) of these steps, why not just leave them that way, and point your users at your data dictionary and/or documentation to ensure that they choose the correct data type? At least then when you have data type changes next time, you can phase the new type definition in gradually, instead of an all-or-nothing change.
Further to this, what if you need to use the same custom data type in multiple databases? Now, in addition to the above complications, you must also synchronize the alias types common to all of your databases, or just hope and pray that their definitions will never change.
There are other complications with using alias types that you won't come across when using the built-in types. For example, you cannot use CONVERT() or CAST() to change a column, variable or literal to an alias type; if you do, you get:
Msg 243, Level 16, State 2, Line 1 Type email is not a defined system type.
You also don't get syntax highlighting in Management Studio, so unlike other data type names, your alias type names will appear black instead of blue in query editor windows.
You may also experience issues when you use external or 3rd party providers or drivers to access stored procedures with these alias types, or try to data-bind, or implement Entity Framework or Linq 2 SQL solutions. I'll admit I haven't tried it, but I envision the integration will not be 100% seamless.
Another anomaly with alias types is that you cannot declare #temp tables that use them, unless the alias type also exists in tempdb (but you can use them in @table variables, which also use tempdb; go figure):
Msg 2715, Level 16, State 7, Line 1 Column, parameter, or variable #1: Cannot find data type email.
For further clarification, you can select *into* a new #temp table from a source that includes a column with an alias type, however the #temp table inherits the properties of the base data type (so in the above example, the column would be defined as VARCHAR(320), not email).
I hope I have demonstrated that, while tempting, the benefits of alias types are far outweighed by the potential problems they can cause down the road. In fact, I am hard-pressed to think of a single implementation of an alias type that would provide any tangible benefits at all. CLR UDTs are a different story, as these can be much more complex and can include additional features like properties and methods. And since the CLR UDTs are based in .NET, you can implement validation rules using real regular expressions. Of course, as with any complex feature, they come with a set of restrictions and prerequisites as well, so I will leave it as an exercise to the reader to determine how much of an advantage these provide *for you* over alias types.
I am working on a series of "Bad habits to kick" articles, in an effort to motivate people to drop some of the things that I hate to see when I inherit code. Up next: mis-handling date / range queries.