Bad habits to kick : declaring VARCHAR without (length)
In my last post in this series, I talked about using meaningless table aliases. This time I'm going to talk about a pet peeve of mine: declaring varchar / nvarchar variables or parameters without specifying how many characters they should hold.
Thankfully, I see this issue seldom in the code I've inherited in my own systems. But I do see it quite a bit on blog entries, forum posts and newsgroup questions. Because the length attribute is optional, people seem to make the assumption that defining a VARCHAR in T-SQL is like defining a string in C#: "it will just expand to whatever size you stuff into it." The problem is, if you don't define a length, SQL Server will assign one for you, and it may not be as wide as you expect. In some scenarios, it will be 1 (yes, that is the number one, not a typo), and in others it will be 30. Here is an example:
DECLARE @foo VARCHAR; SET @foo = REPLICATE('a', 255); SELECT LEN(@foo); -- 1 SELECT LEN(CAST(REPLICATE('a', 255) AS VARCHAR)); -- 30
I have asked for the default size of VARCHAR values with unspecified lengths to be consistent; Erland wants the "optional" part deprecated. We've each pleaded our case on Connect:
Anyway, back to the original assumption (that any size will do); this assumption is not devastating when you are creating a table, because you will get an error message if you try to stuff more than 1 character. An example:
CREATE TABLE dbo.foo(bar VARCHAR); GO INSERT dbo.foo SELECT REPLICATE('b', 2); GO DROP TABLE dbo.foo;
The column is actually created as a VARCHAR(1). So the result of the INSERT statement:
.Net SqlClient Data Provider: Msg 8152, Level 16, State 14, Line 1 String or binary data would be truncated. The statement has been terminated.
It will only take one or two tries before you (or some of your users!) discover the problem; surely it will be corrected quickly. However, it can very easily lead to data loss or corruption in other scenarios, and you might not notice for some time. Imagine if you forget to specify the length of a parameter to a stored procedure; since any data value > 30 characters will be truncated silently, portions of your strings will simply disappear! A more elaborate example to illustrate this:
USE [tempdb]; GO CREATE TABLE dbo.foo ( bar VARCHAR(255) ); GO CREATE PROCEDURE dbo.foo_create @bar VARCHAR AS BEGIN SET NOCOUNT ON; INSERT dbo.foo(bar) VALUES(@bar); END GO DECLARE @bar VARCHAR(255); SET @bar = REPLICATE('x', 255); EXEC dbo.foo_create @bar = @bar; SELECT [variable] = LEN(@bar), [stored] = LEN(bar) FROM dbo.foo; GO DROP PROCEDURE dbo.foo_create; DROP TABLE dbo.foo; GO
variable stored -------- -------- 255 1
This is bad news. Unless your testing process includes checking the data in the table(s) directly after the procedure call, you might not figure this problem out for a while.
So, please don't be lazy: declare your string lengths, always!
Similar problems, though probably not as severe, can happen with DECIMAL and NUMERIC types. George Mastros blogged about it this morning.
I am working on a series of "Bad habits to kick" articles, in an effort to motivate people to drop some of the things that I hate to see when I inherit code. Up next: using dashes and spaces in entity names.