August 22, 2010 | SQL Server

What's in a name?

In a lot of my projects, the issue of naming/coding conventions comes up.  I am a big fan of PascalCase/plural for tables and views (e.g. dbo.CustomerAddresses, dbo.OrderDetails), PascalCase for column names (e.g. FirstName, HomePhone); and for stored procedures, I am partial to the Entity_Action naming convention (e.g. dbo.Customer_Create, dbo.Customer_Update).  I know some like to separate out their words with underscores, but with the exception of stored procedures, I find this tedious. 

One of the stickier points in several cases has been how to case short suffixes, such as the ID in CustomerID.  Let's go back to what the ID is; an identifier.  I often liken it to an identification card, which usually contains either a visible or a bar-coded identifier of some sort (SSN, driver's license #, etc).  While sometimes this card is called an I.D. and sometimes an ID, I have yet to see it called an Id or an id.

For short suffixes that are actually acronyms, I prefer all upper-case.  Most developers I've come across seem to like to lower-case the remainder of the acronym.  And in some cases they have to; e.g. in JavaScript you refer to element.innerHtml; if you try element.innerHTML you get an error message.  I can still tell what Html and Xml are when they aren't in upper-case, but it can be cumbersome to read that way, particularly with the Id / ID suffix.  Because often "CustomerId" can look like "Customerld" – the difference is only slightly more clear in a fixed width, code-ish font:

CustomerId vs. Customerld

Among other things, using an upper-case ID ensures that everyone knows we aren't talking about the customer's "pleasure principle" (according to Freud).  And isn't this easier to read? 

CustomerID

The "I dee" part rings out much better for me when the acronym is upper case.  And when I find queries written by others where they have used their convention, I always re-write to conform to my convention.  And while it is as weak an argument as writing standards-compliant code for fear that you will ever port to a different RDBMS, I find it hard to justify leaving code as is when it will fail if we ever have to use a case sensitive collation.

Even when I will always name the database entities in this way, if you want to call it CustomerId in the application code, feel free.  Just be consistent.  And don't expect me to leave it that way if you hand control of the code over to me.

What are your thoughts?  Are you a proponent of CustomerId?  In the application code only, or in both the app and the database?  If so, why?

23 comments on this post

    • jamiet - August 22, 2010, 10:52 PM

      CustomerId over CustomerID for me, if only because I feel that that is more conformant to CamelCase. However, unlike yourself if I encountered code that did it the other way it wouldn't bother me enough to go and change it πŸ™‚

    • Aaron Bertrand - August 22, 2010, 11:10 PM

      But Jamie, my point is that ID is not a word.  Do you want to see a column named ContentXML or ContentXml?

    • Erik - August 22, 2010, 11:18 PM

      If the biggest problem in your system is the casing/underscoring of names, I *really* want to work where you work.
      My jobs never have enough resources to go back and clean up the system inconsistencies brought about by the various developers now and in the past.
      I envy you for even having the time to think about this issue.

    • AaronBertrand - August 22, 2010, 11:28 PM

      Well, I'll be perfectly honest… we have plenty of these problems in our legacy application codebase.  The database is quite mature and we have, over time, eliminated most of the inconsistencies and problems in the schema.  I do still deal with issues, but few and far between are they interesting enough to blog about.  As a DBA/Architect I don't have much opportunity to invest in the problems with app code or I'm sure I would be writing a LOT of C#, rant-style blog posts.When I said above that it comes up a lot, I'm talking about when developers are requesting new tables or when we are finalizing the design of a new feature or in a new project altogether.  This problem doesn't come up a lot in general, and is certainly not one of our biggest problems.  I am just a big fan and proponent of consistency.  If you set out thinking that casing and naming consistency is not important, do you think you'll ever go back and fix it?

    • jamiet - August 22, 2010, 11:31 PM

      Yeah fair point, but I view ID as an abbreviation of "Identification" and hence I apply the Camel Casing as I would to the word.
      Potato Potartoe!
      "I envy you for even having the time to think about this issue."
      I second that πŸ™‚

    • Chuck Rummel - August 23, 2010, 12:58 AM

      I think the bigger point is not which standard to use, but that you should have one and follow it, even if it's not perfect.
      The downside is as teams change over time and new members each start using their own "standards" (i.e. whatever they were already used to) without looking at (or choosing to ignore) existing standards. I can tell who wrote what procedures based on what coding conventions they used.

    • AaronBertrand - August 23, 2010, 1:06 AM

      Thanks Chuck, yes, I've made the same point many times:
      /blogs/aaron_bertrand/archive/2009/10/11/bad-habits-to-kick-inconsistent-naming-conventions.aspx
      /blogs/aaron_bertrand/archive/2008/10/30/my-stored-procedure-best-practices-checklist.aspx
      Do I think CustomerID is perfect?  Of course not.  However, it is the convention I use, so if I've started creating a schema and you're adding to it, please follow my rules.  πŸ™‚

    • Rommel - August 23, 2010, 1:09 AM

      ID is not a abbreviation (or contraction) but an acronym. XML and HTML are also acronyms not abbreviations. Acronyms are spelled (usually) in all caps. So imho it should be ContentXML, ContentHTML and CustomerID.
      For abbreviations (Num for Number for example), it should be InvoiceNum, not InvoiceNUM.

    • cinahcaM madA - August 23, 2010, 2:11 AM

      I would prefer CustomerId over CustomerID, only because that's how .NET does it and I'm totally used to it at this point. Prior to working with .NET I would have agreed with you. But I also prefer customer_id over CustomerId, for the same reason: every project I've been on for the last three years has used lowercase with underscores as the naming standard and now that I'm used to it I prefer it over Camel or Pascal case. Which tells me, as a rather choosy person, that none of these things really matter. It's just what we're used to. Consistency is really the only important thing.
      On that note, I don't understand why you'd want a different standard for procedure names — can you explain the rationale?

    • AaronBertrand - August 23, 2010, 2:20 AM

      Adam, sure.  So when you have a table like CustomerAddresses or a column like FirstName, you are usually talking about joining two nouns or a noun with an adjective.  You only have one CustomerAddresses table, so finding it in the list is not cumbersome, but you probably have half a dozen or more procedures that reference it.  In the procedure name, you are talking about one or more nouns on the left, joined with a VERB.  It is the verb that I like to separate, not words in general.  Because there are more procedures than tables in every system I've worked with, and because I am typing table and column names a lot more often than I'm typing procedure names, that extra visual separation is very helpful and worth the more tedious typing.

    • Greg Low - August 23, 2010, 5:39 AM

      Hi Adam/Aaron,
      Actually, the .NET naming conventions make an exception for two letters or less.
      http://msdn.microsoft.com/en-us/library/x2dbyw72(v=VS.71).aspx
      From my reading of it, that would make it ID, not Id but also Num not NUM.
      From your list, I'd also struggle with why underscores for word separation and not PascalCasing for procs. That seems inconsistent with the object names.
      I notice you've also got them all in the dbo schema but that's a discussion for another day πŸ™‚
      Regards,
      Greg

    • Paul White - August 23, 2010, 6:45 AM

      Clearly the most sensible answer is to use a case-sensitive database collation and maintain views and/or synonyms to implement all possibilities to keep everyone happy πŸ™‚
      FWIW, I'm also acclimatised to the lower-case-plus-underscores convention, but am quite happy to use whatever is normal for the system I happen to be working on.
      customer_id or CustomerID, but never cUsTOm3r_1D.
      Paul

    • Luciano Evaristo Guerche (GorΕ‘e) - August 23, 2010, 3:33 PM

      Aaron,
      "In computer programming if the first letter is capitalized, it is called Pascal case; if not, then camel case."
      [Source: http://en.wikipedia.org/wiki/CamelCase]
      Pascal Case, Camel Case and Upper case are also explained at http://msdn.microsoft.com/en-us/library/x2dbyw72%28v=VS.71%29.aspx
      [Thanks Greg for the link]
      Cheers,

    • Aaron Bertrand - August 23, 2010, 4:30 PM

      Greg, I use schemas for separation of certain things but more on a functional level (e.g. reporting vs. metadata) than feeling the need to have Customer objects live in different schemas than Order objects.  If I have a view that shows both customer and order information, what schema should it live in?
      As for the underscores for separation in the procedures, I don't know how else to explain.  I have a lot more procedures than tables, and the words we're talking about specifically are the action verbs, e.g. _Create, _Update, etc.  I can quickly look at the list of procedures and see which ones belong to Customer but not Customer Address.  If I didn't use the underscores, then I would have this:
      CustomerAdd
      CustomerAddressAdd
      CustomerAddressUpdate
      CustomerUpdate
      With the underscores, the compound words are clearly separated from the singulars:
      Customer_Add
      Customer_Update
      CustomerAddress_Add
      CustomerAddress_Update
      I could do the same with the tables, but it's not going to make much of a difference, since there are only a small number:
      Customers
      CustomerAddresses
      vs.
      Customer_Addresses
      Customers

    • Mark (AjarnMark) Caldwell - August 23, 2010, 6:11 PM

      Aaron, I agree with both ID and underscores on procedures.  I definitely like having object separated from action in the procedure name.
      But I disagree on plurals.  I hate plurals in object names (and I know using EF4 on my next project I will probably live with the plurals).  One reason I hate plurals in table names is that when I'm searching through the list, I want to see dbo.Customer BEFORE its compounds like dbo.CustomerAddress.  Especially if I'm using intellisense, and I type dbo.Cust I want Customer to be the first choice, not the last one.

    • Aaron Bertrand - August 23, 2010, 6:24 PM

      Mark, I can't think of a table (which contains a set) as a singular – an Employees table contains employees; an Employee table implies to me that it only contains one row.  One of Celko's few good rants is about how Employee is bad, Employees is better, but Personnel is best.  It's unfortunate that we don't have a common and consistent way in the English language to refer to a set without plopping an s on the end of the word.  Again, these are my reasons.  I am not trying to make sure everyone agrees with me, because that serves no purpose and is impractical anyway.  If you have reasons for your standard, then by all means, employ it.  Again, the more important point to take away is, *always* follow your standard.  If you let exceptions creep in then you may as well not have a standard at all.
      As for IntelliSense, I turn this feature off – and yes SQL Prompt too – because it is more of a hindrance than an aid to the way I work.  I work in systems where there is a lot of cross-database activity, and a lot of this involves synonyms.  When I use a three-part name IntelliSense is way too slow to load the list and the delays are infuriating.  And synonyms of course do not resolve correctly, so this causes issues as well.

    • Wes W. - August 23, 2010, 6:56 PM

      I'm also of the "CustomerId" camp.  The "Id" is not an acronym but an abbreviation for identification. Further, if the field name had and Id then acronym follow it, its easier to read with "Id."  For example, CustomerIdST is easier to read, and more accurate, than CustomerIDST.

    • AaronBertrand - August 23, 2010, 7:03 PM

      Not sure I agree that ID is an abbreviation.  I see I.D. a lot.  Also what does ST stand for?  I'm not sure either of these is any more readable:
      CustomerIDServingTime
      CustomerIdServingTime
      In fact, I still have this urge to pronounce "id" as opposed to "i dee" when I see it expressed as a word this way.  It is an acronym to me because that is how you say it.  You don't say "id" or "ide"…
      I doubt you're likely to have a column with CustomerID<anything>, anyway… you are more likely to associate a value with the entity itself, so it is more likely to be CustomerST – but I would still probably prefer to spell out what ST stands for.  Do you have a real-world example where you have <something>Id<something>?

    • Wes W. - August 23, 2010, 11:54 PM

      ST doesn't matter, just an example of mixing conventions between  abbreviations and acronyms, then throwing exceptions to the rules into the mix.
      Rather than beat the dead horse here, these kind folks already beat us to it:
       http://weblogs.asp.net/andrewseven/archive/2004/08/12/213440.aspx
       http://blogs.msdn.com/b/brada/archive/2003/11/26/50848.aspx

    • AaronBertrand - August 24, 2010, 4:52 AM

      I guess it is only an exception if you firmly believe that ID is an abbreviation and couldn't possibly be an acronym.  I am a firm believer that it is an acronym and not an abbreviation.

    • Wes W. - August 24, 2010, 4:02 PM

      We can likely both agree its ambiguous at best.  What I'd like to see is statement by ISO or NIST, then I'd happily use whatever they suggest, giving preference to an ISO standard in case of conflict.

    • Linchi Shea - August 25, 2010, 11:22 PM

      I agree that the only thing that matters is consistency. The fact of life is that nobody works in a vacuum or raraly in a brand new world. There is always history and almost always when it comes to naming conventions, continuity (aka consistency) trumps–should trump–whatever choosiness. But even that is often a luxury, precisely because people who come before you fancy different conventions, don't respect the conventions already there, and insist on introducing their own and different conventions, destorying the purpose of having a convention.

    • Daniel Smith - August 31, 2010, 12:23 AM

      It could be argued that ID is a real acronym – standing for Identity Document πŸ™‚

Comments are closed.