See the full index.
Today I'd like to touch on using aliases. No, I don't mean fake passports and a life of crime, I mean using an alias as shorthand for referencing a table or view in a query.
Some code I once had to clean up on an inherited system looked a lot like this:
SELECT * FROM Orders a INNER JOIN OrderDetails b ON < ... > INNER JOIN SalesPerson c ON < ... > INNER JOIN Customers d ON < ... > INNER JOIN Addresses e ON < ... > WHERE < ... a bunch of criteria against the a, b, c, d, e aliases ... >;
I'm not talking about one query here. I am talking about dozens and dozens of stored procedures all coded like this. And not even consistently… in the same procedure, OrderDetails could be aliased as "a" in one query, "e" in another, and "c" in two others. I don't like this kind of convention one bit. For one, it makes the code very hard to maintain, because when you are looking at the filter criteria, you constantly have to scan up in the query to see what table or view the "e" alias refers to *this time*…
Some people will argue that applying some kind of order or sequence to the tables in the join might make sense, especially when the order does matter (for example when you mix in OUTER JOIN or CROSS APPLY operations). But those details, if they are not apparent enough from the code itself, should be conveyed via comments. If some of those rules change, nobody wants to go back and change all your b aliases to c and c to b, for example, if those two tables must be joined in the opposite order for some reason.
So, what is a better way to do it? In my opinion, the alias should be between 1-3 letters, and give some sort of indication of which table or view it is referencing. So taking the above example, I would re-write it as follows:
SELECT < ... column list ... > FROM dbo.Orders AS o INNER JOIN dbo.OrderDetails AS od ON < ... > INNER JOIN dbo.SalesPerson AS sp ON < ... > INNER JOIN dbo.Customers AS c ON < ... > INNER JOIN dbo.Addresses AS a ON < ... > WHERE < ... >;
Note that I would never leave SELECT *
alone (see my reasons here). And since I have adopted the schema concept for a lot of my work, I always add the dbo.
prefix where other developers may have forgotten. Also, I have been getting into the habit of using AS
to alias my tables and views, just to be more standards-compliant; as explained in previous posts, this isn't something I feel strongly enough to go revamp my entire codebase, but I certainly strive to add in the AS
keyword whenever I am visiting or maintaining existing code. (I still don't like aliasing *columns* using AS
. But that's a story for another day.)
If you have a very complex query, it can get a little tougher to stick to any sensible rules, because if you have three tables or views that all start with c, you need to get more creative. An option is to not alias the ones with shorter names, and reference them directly. Or at least take their two- or three- or four-part name and alias it as just the table name itself. I see many queries like this (probably the product of one of the crappy crutches visual designers in SSMS):
SELECT DATABASE1.dbo.Customers.key_column, DATABASE1.dbo.Customers.Column2, DATABASE2.dbo.Orders.Column3, DATABASE2.dbo.Orders.Column4 FROM DATABASE1.dbo.Customers INNER JOIN DATABASE2.dbo.Orders ON DATABASE1.dbo.Customers.key_column = DATABASE2.dbo.Orders.key_column;
Blecch! Imagine that the query is more complex, and there are other tables or views that start with C, O, etc. In that case, I would much rather see:
SELECT Customers.key_column, Customers.Column2, Orders.Column3, Orders.Column4 FROM DATABASE1.dbo.Customers AS Customers INNER JOIN DATABASE2.dbo.Orders AS Orders ON Customers.key_column = Orders.key_column;
Or, of course, if these really are the only two tables in the query:
SELECT c.key_column, c.Column2, o.Column3, o.Column4 FROM DATABASE1.dbo.Customers AS c INNER JOIN DATABASE2.dbo.Orders AS o ON c.key_column = o.key_column;
The queries don't make sense logically, but I was just trying to quickly illustrate the point that 1-3 letter aliases are not always appropriate. But please, don't take this as a suggestion to not use aliases at all, regardless of complexity. I love using aliases and I think they make queries much simpler looking and easier to maintain than they could potentially be otherwise.
I will confess that, while I always use sensible alias names for tables and views in all of my JOINs, I am guilty of using single letters like x, y, and z for things like CTEs or subqueries (usually when I am not referencing them in the final result). I am trying to get better at this, and noticing my own habit is actually what spurred this post. So please don't take it like I am wagging my fingers at everyone; I am right there with you on some of my bad habits, shortcuts and other lazy behavior. When I have actually blogged about it, this gives me much less of an excuse to tolerate seeing it in my own code.
See the full index.
I avoid using aliases when possible as I find the code more readable without – this might be because I an dyslexic and thus folk without that disability may more easily be able to read X.This and Y.That, to me the Alias adds clutter.
Our databases have column names unique within the whole database, rather than just the table, so we don't have to Alias to avoid ambiguity such as having an [id] column in both Order and Customer tables. We give each table a mnemonic and use that as as the prefix for each column name, which I suppose doubles as an Alias. The prefix is also used in foreign keys:
SELECT …
FROM Order
JOIN Customer
ON Cust_ID = Ord_Cust_ID
In a derived table I (usually) alias the column in the derived table (i.e. specifically to avoid ambiguity and having to then use an alias prefix on the columns). Its much the same as using an Alias though:
SELECT Ord_ID, O2_Ord_ID, …
FROM Order
JOIN
(
SELECT Ord_ID AS [O2_Ord_ID]
FROM Order
WHERE …
) AS O2
ON O2_Ord_ID = Ord_ID
but I don't have to alias columns SELECTed from both tables to make them unique for the APP – as their names are already unique.
its just that, for me, I find that easier to read than
SELECT O1.Ord_ID, O2.Ord_ID AS O2_Ord_ID, …
FROM Order AS O1
JOIN
(
SELECT Ord_ID
FROM Order
WHERE …
) AS O2
ON O2.Ord_ID = O1.Ord_ID
Pet hate: I wish AS was a compulsory keyword (for backwards compatibility I'd be happy with an EXPLICIT or "FUSSY" directive that enforced compliance so that old code could continue to work),
SELECT
Col1
Col2
aliases Col1 as Col2 and I can spend ages looking for the problem only to find that I missed the comma. That MIGHT??!! make a case for aliasing the tables because
SELECT
A.Col1
A.Col2
is a syntax error. Now I've shot myself in the foot I'll shut up!
The tables I support have capitals for the first letter of each word comprising the table name (in my opinion, also a "best practice") and I can usually alias with 1-3 characters. For example, Inventory table alias is 'i', InventoryDetails is 'id', ReleaseMaster is 'rm', ReleaseDetails is 'rd', ManualAsnMaster is 'mam', and ManualAsnDetails is 'mad'. I can quickly write queries and joins, and if I get the "ambiguous column name" message, I can easily determine which alias to add to the column name without consulting my JOIN statements. In the case of the comment above {(a.tbl_id < b.tbl_id) is quite readable.}, I add a '2' to the "secondary" table name (rd.tbl_id < rd2.tbl_id). I use AS on column alias to make web pages and SSRS reports conform to the "common" language (i.e. people know ReleaseMasterID as "Order Number"), but I've never (before) considering using it on tables. I appreciate the insights expressed on this blog and responses.
I don't know Andriy, I'm not sure that case has so much to do with it as color:
http://i.stack.imgur.com/FD36N.png
> Personally, I like using AS, simply because it makes the aliases easier to spot
This is my preference as well, but my guess is it wouldn't make them easier for one to spot if one didn't also prefer writing SQL keywords in upper case.
@raju There is no such thing as "best practice" here IMHO – this is all about preferences (and reasons behind them). Personally, I like using AS, simply because it makes the aliases easier to spot. Compare these two visually:
… cst.TerritoryID, cst.AccountNumberFROM Sales.SalesOrderHeader hdrINNER JOIN Sales.Customer cstON hdr.CustomerID = cst.CustomerID…
… cst.TerritoryID, cst.AccountNumberFROM Sales.SalesOrderHeader AS hdrINNER JOIN Sales.Customer AS cstON hdr.CustomerID = cst.CustomerID… This can become a lot worse if you are not so liberal with carriage returns and tabs – spotting the alias in a long line like this is even tougher:SELECT …, cst.TerritoryID, cst.AccountNumber FROM Sales.SalesOrderHeader hdr JOIN Sales.Customer cst ON hdr.CustomerID = cst.CustomerID
@David I think you are mistaking aliases for obfuscated aliases. I don't think there is anything wrong with using aliases like changing SalesOrderHeader to header, for example. When you're dealing with a query against order headers and order details, I don't think anyone is going to have to do any mental work parsing the statement and being sure they aren't talking about a header in football (soccer).
@John I'm not suggesting obfuscating everything (or anything). But let's take a look at a simple query against AdventureWorks:
SELECT Sales.SalesOrderHeader.SalesOrderID, Sales.SalesOrderHeader.OrderDate, Sales.SalesOrderHeader.CustomerID, Sales.SalesOrderHeader.TotalDue, Sales.Customer.StoreID, Sales.Customer.TerritoryID, Sales.Customer.AccountNumberFROM Sales.SalesOrderHeaderINNER JOIN Sales.CustomerON Sales.SalesOrderHeader.CustomerID = Sales.Customer.CustomerIDWHERE Sales.SalesOrderHeader.[Status] = 5 AND Sales.SalesOrderHeader.TotalDue > 1000ORDER BY Sales.SalesOrderHeader.TotalDue DESC, Sales.Customer.CustomerID;
Isn't this easier to read, and no more difficult to reverse engineer?
SELECT header.SalesOrderID, header.OrderDate, header.CustomerID, header.TotalDue, cust.StoreID, cust.TerritoryID, cust.AccountNumberFROM Sales.SalesOrderHeader AS headerINNER JOIN Sales.Customer AS custON header.CustomerID = cust.CustomerIDWHERE header.[Status] = 5 AND header.TotalDue > 1000ORDER BY header.TotalDue DESC, cust.CustomerID;
This isn't even all that complex a query – only two tables – but still it seems like a much more sane approach to me than (a) using meaningless aliases like a,b,c and (b) not using aliases at all and repeating the entire table or schema+table reference throughout the query.
I don't see the benefit in having proper table names and then using aliases to make things less obvious. I've seen people start rewriting sql to their own preference so often, just to be able to understand the existing code. Using 1-3 letter aliases don't make things more clear for your colleague that has to maintain your code.
Do what I do: Avoid table aliases altogether when possible. I can type common English words like Account, Customer, Order, and so on pretty quickly. The table aliases just make the humans who have to read the queries later do MORE MENTAL WORK and waste time trying to mentally parse the statement.
Generally it's the thinking time and testing time, not the raw typing time, that is the limiting factor in writing SQL code.
Avoid aliases when possible! Except for derived tables and so on, you will be better served, and your queries will be easier to read. And programmers new to SQL won't mistakenly think that table aliases are a required part of the Join syntax. That misconception frustrates me the most.
For aliasing table names is it a better practice to use the "AS" keyword or alias without the "AS" keyword.
I use the table name as the alias whenever that is unambiguous, in which case a full alias is selected (virtually no abbreviations). It doesn't take much effort to use instead of an abbreviation and results in a very readable query. Time savings during debugging is much more valuable to me than time savings during writing.
What are the oods of Microsoft using the same 2-letter acronym as opposed to an 18-letter name ?
Don't get me wrong, I am in favour of using aliases. I just want to know what I might have to face later.
And the question is still open.
Aaron,
Even worse than the (reasonably) well formatted ugly query you posted is one like the following (which is a version of your posted query). Why, oh WHY, does MS (and most other DBMS houses) feel compelled to reduce my nicely formatted query to a stream-of-consciousness burst of text?!?!?!?!
SELECT DATABASE1.dbo.Customers.key_column, DATABASE1.dbo.Customers.Column2, DATABASE2.dbo.Orders.Column3, DATABASE2.dbo.Orders.Column4 FROM DATABASE1.dbo.Customers INNER JOIN DATABASE2.dbo.Orders ON DATABASE1.dbo.Customers.key_column = DATABASE2.dbo.Orders.key_column;
As for the bad habits, I recently ran into one from a developer I work with. He felt compelled to not only use sarbitrary single character table alias but he also used arbitrary alias for computed columns within the query (of which he had far too many, IMHO!). He apparently thought he had to do a LTRIM(RTRIM( )) on everything and then stick an A1, A2, or A3 as the column alias (giving you B.A1, A.B3, etc.!).
Not to be a smart-ass J, but the same thing that happens if you use a 5- or 10- or 18-letter alias and Microsoft later creates a new reserved word that matches the alias you've chosen.
What happens when you use a two-letter alais and later Microsoft creates a new reserved 2-letter code ?
With SQL 2008 it's easy to maintain long AS names since now we've got intellisense!
I used to use inner and outer as standard aliases for a single TABLEs in a correlated sub-query. I picked up that habit from Understanding SQL by Martin Gruber. http://www.amazon.com/Understanding-SQL-Martin-Gruber/dp/0895886448/
I had to stop that when inner and outer became keywords for the poorly named inner and outer joins. I never did understand why they called it that. It confuses so many people probably because of the name alone.
Grant, when the table is actually called Table1, then the alias t1 actually makes a lot of sense, and I can't think of a better one! I'm guilty of making tables named foo, bar, mort, splunge and other meaningless words. I need to get into the habit of making tables that represent real things, as it may help eventual understanding more if we're not talking about widgets or blingwads.
Excellent! I'm right there with you on this one. Weird little x, y, z alias' make me nuts and make the code harder to read.
I am guilty when writing sample code of using tables named Table1 and then aliasing t1, but… I think that one's ok since we're not talking business oriented procs that need to be maintained, plus they're usually small.
Sheila, if there is only one table in the query that starts with "a", then "add" is no more helpful to me, and is less useful in reducing clutter. If I have to type 3 or more letters, I may as well just type the whole table name.
Of course, like just about all of the posts in this series, it is very subjective, just my take.
I agree that arbitrary single letter aliases are not appropriate for use but I would take this further and say that any non-meaningful alias presents a problem. Debugging old code in a complex stored proc that is suddenly producing strange results is much more difficult if the table aliases are not meaningful. What is wrong with using things like cust for customer instead of c or add for address instead of a. It is not hard to figure out that add is the address table.
I prefer to use the major letters of the table name as the alias. The one case where I find a, b handy and meaningful is when removing duplicates from a table(in the case of using an identity type column to classify which to remove). (a.tbl_id < b.tbl_id) is quite readable.
On the other side of the spectrum –
Working in a place that uses very descriptive table names, I really hate when people qualify the columns with the table name. It makes the query very hard to read.
Rowland, no, I am against using single-letter aliases when the single letter is chosen arbitrarily or alphabetically. It should make some logical sense to be connected with the table or view it is aliasing. Note that in the example where I use the cross-database query and the entire table name, I am still aliasing (I see a lot of people repeat the three- or four-part name throughout their query, and it just makes it look like a big mess). I'm not saying you should or shouldn't alias, that's up to you. Just suggesting to not use meaningless aliases like a,b,c unless your table or view names actually start with those letters.
So you're against using aliases except when you're for them? <g>
This is more a matter of syle IMHO — neither good or bad. I do like to use a single letter from each table and when the letter has already been taken add additional letters until the alias is unique.
Outside of that — who cares?
I couldn't agree more. I've had the bad habit of using t1, t2, t3 without the AS operator, but I've recently been teaching a basic SQL class to some of our technicians and realized that it doesn't make sense. The queries are much more readable when we use logical aliases and the AS operator. Thanks for the post!
Excellent, I wish everyone would read this and take it on board!
I agree. Great article. I'm still laughing at "crappy crutches". 🙂
Agreed 100%! Queries that use meaningless single-letter abbreviations are nearly impossible to read. I worked on a bunch of code last year that used A,B,C… and in complex queries I would have to first rewrite them before I could debug.
That said, I'm also really bad about CTEs and derived tables, and recently I've noticed myself getting even worse, doing things like:
;WITH x(t) AS (SELECT COUNT(*) FROM SomeTable)
SELECT q
FROM
(
SELECT t*2
FROM x
) AS p(q)
I'll join you in trying to get better. Thanks for the nudge.
I can think of several
– not prefixing tables in a query with the schema owner (i.e. dbo.Customers)
– not using explicit column lists when using INSERT…SELECT
– not using SET NOCOUNT ON in procedures and triggers