Why I always start CTEs with a statement terminator
Common table expressions were a fantastic introduction to the T-SQL language; both as a way to isolate derived table logic and facilitate previously complex problems like hierarchy/recursion. I wrote about them a few years ago in #BackToBasics: Common Table Expressions (CTEs).
All the examples in that post start like this:
;WITH cte AS (...
It's odd to see a query that starts with a statement terminator, but let me explain.
Since CTEs use an already overloaded keyword (
WITH), the parser has a very specific requirement:
But many are not in the habit of properly terminating all statements (even though it's been required by the ANSI standard since SQL-92 and not doing so has been deprecated for over a decade).
Inevitably, someone will post a question with code like this:
DECLARE @foo INT Declare @bar VARCHAR(8) set @foo = 5 Set @bar = 'catsup' Select /* <some query where a CTE would make sense> */ ...
And I want to be able to post this, encouraging better habits on multiple fronts:
DECLARE @foo int = 5, @bar varchar(8) = 'catsup'; WITH cte AS (SELECT...
But I can't. Because almost 100% of the time, they will copy just the CTE portion and leave their existing (and admittedly not broken on its own) part of the code alone. Which then produces this verbose error message:
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
And since the line points directly at the code they copied from me, I start fielding complaints that my solution sucks, that it broke their existing procedure, that it killed their overnight scheduled jobs and got them paged while they were dreaming about Kenny Rogers, or what have you.
That got old fast.
I thought to myself, "Self, for my own sanity, how can I prevent these people from shooting themselves in the foot with my cough broken code?" Knowing that I couldn't meddle with their existing codebase, or control what they do before they jam my solution into the middle of it, I remembered that two statements can be separated by one, zero (sometimes), or a hundred statement terminators. This is terrible but valid code:
;;;; ;; ;;;;; ;;;; ;;; ;; ;;; DECLARE @i int; ;; ;;;;; ;;;;;;; ;;;;;;;;; ;;;;; ;;; ;;;;;;;;; SET @i = -4; ;;;; ;;;;; ;;;;; ;;;;;;;;;;;;;; ;;;;;; ;;;;; /* I'm a comment */ ;;;;;;; ;;;;;; ;;;;;;;;;;;;;; ;;;;;; ;;;;SELECT name FROM sys.procedures;;;;; ;;;;; ;; ;;;;;;;;;;; ;;;;;;;;;
So, I thought, why not just add an additional defensive statement terminator before the CTE? Whether the previous statement already has a terminator, it can't possibly hurt anything, except subjective style points and feelings. (An exception is when a CTE is used in a view or inline table-valued function, where there isn't even a preceding statement eligible for a terminator.)
The pattern above became:
DECLARE @foo int = 5, @bar varchar(8) = 'catsup'; ;WITH cte AS (SELECT...
Like magic, the complaints stopped. (Well, to be replaced by different ones. Read on.)
Sure, it raises eyebrows, and there are industry purists that I respect who won't ever pervert their code with this atrocity. From time to time, it causes outbursts from people making fun of this so-called "begininator," or calling me defensive when I explain, or asking me if it wouldn't be better to just assume and expect that everyone terminates all their statements properly. Which makes me laugh, but brings me full circle.
Until statement terminators are mandatory (which would have also been a good April Fools' Joke), or there is a new synonym like
WITH_CTE, this is what you'll continue to see from me when I'm posting CTEs for others to use or learn from.
Sorry (not sorry). I've even started formatting answers like this to help pre-empt some of that noise:
;;; /* for safety */ ;;; WITH cte AS (...
Maybe I'll start adding a link to this post in my leading comment, too.
Note that I'm not suggesting you overhaul your own private codebase, or add leading semi-colons to code review checks in your organization. I'm certainly not adding leading statement terminators to every CTE in the codebase at Stack Overflow, and have no plans to suggest it in our internal documentation as I write up other guidelines. I'm only suggesting that, when posting sample code in blog posts and forum answers, that it's a safe, defensive, non-harmful approach for consumption by less experienced users (or holdouts on statement terminators in general).
And no, I really don't expect Microsoft to ever follow through on their long-standing deprecation notice, or add a synonym to give us a more parser-friendly option.
Nah this is trying to protect people from sloppily copy and pasting code.. People mindlessly doing that without understanding what they're doing shouldn't be catered for, you'll just encourage it. If they copy this into their code without a care for the context the code is in then it isn't unreasonable that there would be a syntax error. Should you also write code to check that the variables you are going to declare aren't already declared by another piece of code earlier up?
I don't really see it that way. We have too many people posting code without statement separators anywhere, and new users are learning from that example. In my case I am protecting them from getting an error (and myself from them attacking me for it) but also raising visibility of the need for statement terminators. Like I said in the article, you code the way you like, but when I am posting answers, my code is going to be defensive. It doesn't hurt readability and I don't agree with you that it is causing harm or encouraging bad things.
I think it might even do the opposite. Someone who has never seen a semi-colon in T-SQL may ask "Why does that statement start with a semi-colon?", and the answer to that question will teach them several valuable things.
;;;;;;;;;I'm with the purists on this one 😀
Erland Sommarskog makes a good case for always putting a semicolon before THROW as well: https://www.sommarskog.se/error_handling/Part1.html#:~:text=if%20you%20leave%20out%20the%20semicolon,uninitiated
Industry purists will already know about the SQL-92 semicolon terminator requirement so should be happy to drop the ";" before "WITH" if they wish. No harm done!
Anybody who used your code without the leading semicolon and got paged in the middle of the night deserved to be paged. Every shop has a QA environment and in some shops it is separate from production.
I took the first half of your comment seriously and then it took me 20 minutes to control my laughter so I could type. Just so I have this right, we should avoid using safe and defensive coding measures and let people shoot themselves in the foot, because every single person ever asking a question on Stack Overflow works for a company with a QA department that will catch all of their mistakes? Maybe every company you work for has a QA department, but I can assure you that not every company does, and not every company that has one would catch this specific issue on review, either.
I think what he is saying is that every company does have a QA department, but some companies are using their PROD environment as QA. Some shops do it correctly by having separate environments. Those who use PROD as QA are reaping what they sow.
But I know first-hand that every company does NOT have a QA department. Some that do have a QA _department_ don't necessarily have (or adequately use) a QA _environment_, and still test in production. Some that don't have a QA department _do_ have a QA environment and test properly first.
There is no "all" or "every" that any of us can authoritatively define.
We don't know the state of all of these environments and why they are the way they are. Budget, negligence, does it matter? Is it ever the fault of the person asking the question, and do we really expect them to always have the power to fix things at a higher level? I prefer not to punish the whole lot by saying "too bad!" YMMV.
I understand why you have decided to add the ;;; /* for safety */ ;;; but it is sad people aren't willing to learn/improve their coding habits. This was one of my first lessons learned as a newbie to T-SQL.
"Starting T-SQL statements with a ; instead of ending them with a ; is a sign of barbarism."
Albert Einstein 1922
"A ; at the beginning instead of the end is the beginning of the end."
Dalai Lama 2022
"Only lazy bums start a CTE with a ;"
“With ';WITH', My Friend, You Are Entering A World Of Pain.”
“This Is Not ‘Nam. This Is Coding. There Are Rules.”
"You think the semicolon is at the start of the CTE, when it's really at the end of the preceding statement."
Ahh, this must be the post you aluded to the other day that I wouldn't "like" Aaron. 🙂
I agree that our opinions differ, but I don't begrudge you for your reasoning; there are unfortunately too many people that simply don't know that a statement terminator is and so beginning statements that require the previous (emphasis on previous) statement to be properly terminated stops the inevitable error.
I will, however, admit that practices like yours have resulted in me seeing people think the keyword is ";WITH"; I have even seen at least someone use FROM dbo.TableName ;WITH (NOLOCK), and we obviously have the problem with inline table functions and views not supporting the syntax. That isn't for me to say you're "wrong", as you're writing defensive code against the main culprit, it just introduces a less encountered problem.
I might update my post on my opposite view with a link to here, so that people can at least see both sides of the fence. 🙂 Sometimes different view points are good, so that you can make your own informed decision.
Good post Aaron. Yes, people should be using semicolon statement terminators (but they never will be required to, so many people won't). Adding that beginning semicolon before a CTE makes perfect sense. Safe code is better than "pretty" code.
I wouldn't worry about the "industry purists" too much. I also wouldn't worry very much about people who just blindly copy part of your code, not understanding how it works, who have a problem when it has a syntax or parsing error.
You can't satisfy everyone, all of the time…