Bad Habits to Kick : Inconsistent table aliasing
See the full index.
Today, I wanted to touch on a topic I've touched on before: aliasing.
Over the weekend, there was a twitter conversation between @unclebiguns and @jaybonk (it started with this post), and it reminded me of another bad habit I see quite a bit: inconsistent aliasing.
Now, in this very series, I talked about aliasing back in November. But, in that case, I was talking about choosing poor and meaningless aliases like a, b, and c – and the problems that can cause when someone needs to make sense of the query later.
In this case, I am talking about bad practices involving being inconsistent with the use of table aliases. I'll illustrate with two examples:
Only aliasing some of the tables in the query
This is one form I see a lot, where an alias is used for some tables, but not all. You end up having short forms for table references and then in other spots you have to spell out the entire name of the table every time.
SELECT o.OrderID, OrderDetails.ProductID FROM dbo.Orders AS o INNER JOIN OrderDetails ON o.OrderID = OrderDetails.OrderID WHERE o.OrderID = 1;
If you have to maintain this query and it gets more complex later, you will be cursing yourself because you will have to write "OrderDetails" umpteen more times. Or you will be lazy, and also be guilty of:
Aliasing all tables, but not aliasing some of the columns
Another form of this pet peeve that I see is when the developer leads a wild goose chase, where we get to guess which table a column comes from. Of course the original coder knows what they were thinking, and they can get away with it because that column happens to only exist in one table mentioned in the query (at least for now).
SELECT o.OrderID, ProductID, Quantity FROM dbo.Orders AS o INNER JOIN dbo.OrderDetails AS od ON o.OrderID = od.OrderID WHERE o.OrderID = 1 AND ProductID IN (4,5,6,7,8);
The problem is that if another table is introduced to the query (say, the Products table, to get the name of the products ordered), now you will have to go through all the column references to avoid ambiguous column name errors.
Basically I am against any type of shortcut where changing the query later will be much more cumbersome than it should be. For the sake of future maintenance, write your queries clearly, concisely, and consistently. As I've said before, I am not suggesting that you have to alias like I do, or that you have to alias at all… but if you're going to use table aliases, use them the same way every time.
See the full index.
Well that didn't show up very well once it was published! All AS statements, table names, table aliases, ON statements etc should be all sit one under the other.
mjswart- I completely agree
Coming from a maths background and kind of falling into SQL, I try to keep all my production code consistently formatted and well commented. I've found that most developers don't put as much effort into their code. This makes it really difficult to read and understand the code they produce when it comes round to amending it.
I think that tables should always be aliased and joins stated explicitly. Also, because I write a lot reports, I try to put user friendly column aliases whenever possible as it saves time down the line. Since I will often have to pull off many different existing and calculated columns, I've tried to develop a consistent format that makes it easy to see my progress as I'm developing the code. Aliasing columns is pivotal for this. I've found that explicitly stating AS 'Column' in the same location for each column is the most readable (for me anyway).
My work usually ends up looking like this:
I don't have SSMS on my PC
at home so this han't been validated
T1.Name AS 'Customer Name',
T1.Address1 + T1.PostCode AS 'Customer Address',
T2.NOK_Name AS 'Next of Kin Name',
T2.Add + T2.PostCode AS 'Next of Kin Address',
COUNT(*) AS 'Calls re: Payment,
SUM( CASE WHEN HOUR(TD.Date)>18
END) AS 'Out of hrs calls re: Payment'
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.TableID = T2.Table1_ID
LEFT JOIN Table3Daily TD ON T1.TableID = TD.Table1_ID
WHERE TD.EntryTypeID = 18 –Call re:Payment code
T1.Address1 + T1.PostCode ,
T2.Add + T2.PostCode
It just happened that I was also collecting some bad coding standards recently for my blog, and bad aliasing is amongst them. I find this extremely annoying and waste of money/time to have to dig through all views and tables to find the one a column belongs to. There are no good tools yet that would work as well as intelisense in Visual Studio.
I realised that I was using old version of BOL
Sorry for the inconvenience caused
Ok, so on 1., we're agreed.
I still can't agree with you on 2. Are you really coding SELECT statements with so many contiguous <alias>=<expression> lines that they fill the screen, and do you really assume it's an UPDATE without scrolling up to check? I guess we must code very differently but my SELECTs are not composed completely of expressions… and UPDATEs that use other columns rather than variables are pretty rare, as I'm not a big believer in maintaining multiple aspects of information that could otherwise be derived at query time (which such expressions would).
And on 3., I looked at this URL:
Replacement certainly does include:
alias = expression
So I'm not sure which version of BOL you are looking at, but I can assure you that SQL Server will continue to support this syntax, at least for three or so more versions.
2 Assume you have SELECT statement with lot of expressions and you are using alias names
If it is in a procedure and part of a big code, when you parse thru eyes, they are more like UPDATE statemnet than SELECT
as per BOL
Deprecated feature : 'string_alias' = expression
Replacement : expression AS alias
expression AS [alias]
But I guss it may support
string_alias = expression
and deprecate (usage of single quote)
'string_alias' = expression
But not sure why string_alias = expression is not specified as one of the replacements
Sorry Madhivian, I don't have a 2005 version of Books Online installed. Can you post a real URL please?
Anyway, as for your reasons why I "need" to change it to AS:
1. I have a lot of code that uses various other things that aren't ANSI standard. If I removed all of the code that wasn't ANSI standard I'd have a lot of modules that no longer work, and it would be a wasted exercise anyway. I am not of the Celko mindset, where I'm worried that we will switch to Oracle tomorrow and DB2 next Wednesday.
2. I have never confused a SELECT statement with an UPDATE statement. Can you elaborate how this is a problem?
3. I would be absolutely shocked if this suddenly stopped working. That will break a LOT of code and Microsoft would need to have a very, very, very good reason to implement this kind of restriction that breaks backward compatibility.
I still think it is much more useful to have the column alias at the beginning of the line than the end. But you are more than welcome to have to look for the column aliases at the end of the line. As I said earlier in these comments, this really is quite subjective.
Because comma is such an important character in SQL, i don't think it is missed. Also, because i nearly always put each COLUMN name on it's own line, i don't have that issue. Aliases go on the same line.
I believe format with tabs and spaces to be far superior than the AS.
If you use
You need to change it to
(complex expression) as alias
1 Your preference is not ANSI standard
2 It looks like a update statement where expression is copied to a column
3 Most importantly, SQL Server will not support it in future release
See the point specified at BOL at the link ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/instsql9/html/c10eeaa5-3d3c-49b4-a4bd-5dc4fb190142.htm
Did you read part 1 of my article?
There is a change you may get different result if comma is omitted between the columns
Compare the resultsets
select employeeid,orderdate,shippeddate,shipcity from orders
select employeeid,orderdate,shippeddate shipcity from orders
Oh, i forgot about =. I switch between Oracle and SQL Server, so i do not use that.
Nothing major on the format, just wrap and put the alias after the end:
WHEN 4 THEN 42
WHEN 2 THEN 17
Brian, on your second point, I'm not sure if we agree or not. I prefer this for column aliasing:
Alias = (complex expression)
(complex expression) AS alias
The latter I find tedious, and since you say you hate AS for column aliases, I'm kind of surprised that you have some way to line wrap the latter format using AS so that it is easier to find the alias names. Can you show an example?
>I am a fan of using AS for table aliases
I never use it. But on TABLE s it bother me less.
>this can be tedious when you have complex expressions defining column output.
Not if the lines are wrapped manually.
I am a fan of using AS for table aliases, but certainly not for column aliases. In that case I'd much rather have the column names align on the left, than have to scan each line to find the alias at the end – this can be tedious when you have complex expressions defining column output.
I *hate* AS. It is redundant. It adds clutter. It is reminiscent of some code generator.
IOW "AS" is the opposite of what SQL stands for.
Also alias names should be preceeded by the keyword AS
Not too long ago, I saw an alias for a table named GRADE_OBJECT that was just asking for trouble. (And it did indeed create trouble before too long).
I call it "lazy coding" and it drives me crazy too
Thanks Aaron, couldn't agree more. It's too bad you often see queries like your examples in books, including SQL Server exam prep books. Of course, there seems to be a social aspect to this as well, a belief that elite developers write sloppy code. I've encountered a fair amount eye rolling and such for trying to keep code consistent and readable… I can't help it, I started out in accounting!
>Only aliasing some of the tables in the query
I don;t see an issue with that. Perhaps convenience, but its really a case by case judgment. If the second coder does not want to type it out, he can alias it himself!
>Aliasing all tables, but not aliasing some of the columns
This indeed is a big problem. Being explicit about scope is always a good thing. My rule is, if there is more than one line in the FROM clause, every COLUMN must be referenced with its TABLE.