T-SQL Tuesday #24 : Dude, where's the rest of my procedure?
Last week, I talked about the case against INFORMATION_SCHEMA views – I provided several examples where I feel the INFORMATION_SCHEMA views fall short of the catalog views, and expressed my belief that you are better off programming consistently against the catalog views all the time, instead of only when the INFORMATION_SCHEMA views fail. Having known at that time about the T-SQL Tuesday topic, I intentionally left one incriminating piece of evidence out of that discussion, and that is how INFORMATION_SCHEMA.ROUTINES feels about your procedures and functions that are longer than 4,000 characters (8,000 bytes).
In SQL Server 2000, you had three ways to retrieve the definition of a procedure or function: INFORMATION_SCHEMA.ROUTINES, syscomments, and sp_helptext. The problem with sp_helptext is that you can't use it interactively – say you don't know the name of the procedure(s) you're trying to retrieve, but want to base it on a search of the text (e.g. return all the stored procedures that contain 'dbo.foo'). You could get there with either of the views, but in fact all three of these approaches have issues:
- INFORMATION_SCHEMA.ROUTINES only contains the first 4,000 characters of the procedure body. If your procedure is longer, and the text you're looking for falls outside of that (or even if you just want the whole procedure), you're going to be out of luck. This is because the definition of the view purposely truncates the definition:
... convert(nvarchar(4000), object_definition(o.object_id)) AS ROUTINE_DEFINITION, ...
I assume this truncation occurs so that the view still conforms to the standard – so even though your entire procedure body is available, it won't show you the whole thing.
- syscomments is an odd bird, in that it chops your text up into 4,000-character chunks. So while you can manually piece your procedure definition back together (unlike INFORMATION_SCHEMA.ROUTINES, you can't rely on a search of the text for modules that are longer than 4,000 characters, because your search phrase may only occur in a location where it straddles two rows in the result. I'd dig into the mechanics behind this, but you can't get very far… while you can certainly run the following to see what the view does:
EXEC sp_helptext 'sys.syscomments';
You can see that it references a system table, sys.sysschobjs, and internal functions like sysconv() and OPENROWSET(TABLE SQLSRC):
CREATE VIEW sys.syscomments AS SELECT o.id AS id, convert(smallint, case when o.type in ('P', 'RF') then 1 else 0 end) AS number, s.colid, s.status, convert(varbinary(8000), s.text) AS ctext, convert(smallint, 2 + 4 * (s.status & 1)) AS texttype, convert(smallint, 0) AS language, sysconv(bit, s.status & 1) AS encrypted, sysconv(bit, 0) AS compressed, s.text FROM sys.sysschobjs CROSS APPLY OpenRowset(TABLE SQLSRC, o.id, 0) s UNION ALL ...
While sys.sysschobjs is generally off-limits…
SELECT * FROM sys.sysschobjs;
Msg 208, Level 16, State 1, Line 1 Invalid object name 'sys.sysschobjs'.
…if you really want to look at what's in there, you can do so using a DAC connection – and no, this has nothing to do with DACPACs, I am talking about the Dedicated Administrator Connection. Once connected via the DAC, you can take a peek inside of many of these system tables that are usually hidden:
<a title="http://bertrandaaron.files.wordpress.com/2011/11/tsql-24-0a.png" target="_blank" href="https://sqlblog.org/wp-content/uploads/2018/01/39648_tsql-24-0a.png"><img border="0" height="136" width="750" src="https://sqlblog.org/wp-content/uploads/2018/01/39648_tsql-24-0a-1.png"></a> <i> (click to embiggen)</i>
But even under DAC, you can't call the other functions – you could easily change sysconv() to CONVERT(), but outside of system scope, you'll still get a syntax error on the OPENROWSET() call. So without knowing some serious SQL Server internals, you can't really determine why or how the definition is internally stored in chunks, or dig any deeper into the source…
- sp_helptext is another peculiar one. It breaks up your modules by carriage return / line feed pairs, and returns each "line" as a separate row. If you have continuous strings that are longer than 255 characters, it breaks them into 255-character chunks. We can see why by looking at the definition:
EXEC sp_helptext 'sys.sp_helptext';
I'm not going to reproduce the body here, but you will clearly see that there are a lot of loops, cursors and messy string manipulation routines going on in there (the procedure is 208 lines in SQL Server 2012!), making sure that no output line exceeds 255 characters. Why do they do this? Who knows – very skinny CRT screens back when the procedure was first written? The main point is that, even if you dump the results of this procedure into a #temp table, you have even more of a chance to see your search criteria straddle two rows, leading again to missing results.
Let's demonstrate these issues with a quick example. Here is a stored procedure that is > 4,000 characters, and happens to have the string 'supercalifragilisticexpialidocious' straddle the 4,000-character mark. I'm using dynamic SQL here so I don't have to actually publish huge long strings on this blog (or expect you to copy them):
DECLARE @sql NVARCHAR(MAX); SET @sql = N'CREATE PROCEDURE dbo.fooblat AS BEGIN SET NOCOUNT ON; SELECT ''' + REPLICATE('x', 3900) + ''''; SET @sql += ' SELECT ''-supercalifragilisticexpialidocious-xxxxxxxx''; END';
Now, let's see how this procedure looks using the various methods described above.
Here is how the procedure is stored in the INFORMATION_SCHEMA view. I'm going to take the right-most 10 characters to show where the definition of the body gets cut off: <blockquote> SELECT ROUTINE_DEFINITION, [end] = RIGHT(ROUTINE_DEFINITION, 10), l = LEN(ROUTINE_DEFINITION), dl = DATALENGTH(ROUTINE_DEFINITION) FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME = 'fooblat';
So, seeing as the whole word isn't even there, and that the metadata has only revealed the first 4,000 characters of our procedure, it should come as no surprise that the following search will yield 0 results:
SELECT ROUTINE_NAME FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_DEFINITION LIKE '%supercalifragilistic%';
So it looks like I've spoiled the party for the INFORMATION_SCHEMA views once again. Sorry about that.
As described above, we can see with the following query that sys.syscomments returns our procedure split up into multiple rows, and our search term once again straddles rows: <blockquote> SELECT colid, [text], l = LEN([text]), dl = DATALENGTH([text]) FROM sys.syscomments WHERE id = OBJECT_ID('dbo.fooblat');
Which in turn makes a search for this term useless:
SELECT OBJECT_NAME([id]) FROM sys.syscomments WHERE [text] LIKE '%supercalifragilistic%';
So, it doesn't seem like sys.syscomments is going to be a useful way to find and retrieve procedure definitions, either.
I won't get into the whole rigamarole of dumping the output of sp_helptext for all stored procedures into a #temp table and searching there, since I think you are all smarter than to try this in the first place. But I will show the output of sp_helptext against our procedure, so you can once again see how the body is chunked out into multiple rows: <blockquote> EXEC sp_helptext 'dbo.fooblat';
Now it turns out that, in this case, the magic word didn't straddle rows (so a convoluted search process might have still yielded this procedure), but that is just dumb luck. With a forced CR/LF pair every 255 characters, and depending on your coding style, you are certainly bound to see some straddling here that will make searches very unreliable. Again, I don't expect any of you are doing anything like this, but wanted to mention it for completeness.
I wouldn't be writing this post if the intention was just to crap on old-school methods for finding and retrieving procedures and functions. I see lots of folks on StackOverflow and dba.StackExchange.com still responding to queries with suggestions to use syscomments or INFORMATION_SCHEMA, even though they may not realize that better alternatives exist. I hope I've convinced you above that these old approaches aren't the way to go. So what should you use instead?
Since SQL Server 2005, this has been my go-to place for retrieving or searching the definition of procedures and functions. Now, keep in mind this is much easier to demonstrate when we switch to Results to Text mode (Ctrl+T): <blockquote> SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.fooblat'));
So you can see that our big long line actually gets returned the way it was intended. This gives me much more confidence about a search, since there are no forced formatting changes or truncation that can screw things up.
Now, if we want to find all stored procedures that contain the text 'supercalifragilistic'? Let's switch back to Results to Grid mode first (Ctrl + D) and then run this query. Using a dedicated view for only procedures prevents us from needlessly looking at all of the other object types:
SELECT [schema] = OBJECT_SCHEMA_NAME([object_id]), name FROM sys.procedures WHERE OBJECT_DEFINITION([object_id]) LIKE '%supercalifragilistic%';
Unlike the three methods above, we finally have success! If we want to expand our search to procedures and functions, we can use sys.objects and a filter on [type]:
SELECT [schema] = OBJECT_SCHEMA_NAME([object_id]), name FROM sys.objects WHERE OBJECT_DEFINITION([object_id]) LIKE '%supercalifragilistic%' AND [type] IN ('P', 'IF', 'FN', 'TF');
This yields the same results as above.
This is a view that contains the definition and other properties for many types of objects - views, procedures, functions, triggers, etc. Under the covers it uses OBJECT_DEFINITION() to expose the body through the column [definition]. So borrowing from the query above, we could use sys.sql_modules instead of referencing the function directly, in case we want to use more convoluted joins or filters, or wanted other properties that come directly from sys.sql_modules: <blockquote> SELECT [schema] = s.name, o.name FROM sys.objects AS o INNER JOIN sys.schemas AS s ON o.[schema_id] = s.[schema_id] INNER JOIN sys.sql_modules AS m ON o.[object_id] = m.[object_id] WHERE m.[definition] LIKE '%supercalifragilistic%' AND o.[type] IN ('P', 'IF', 'FN', 'TF');
Again, this will yield the same results as above.
Yes, sp_helptext is easy to type, and it can be tough to shake old habits of using INFORMATION_SCHEMA or syscomments. But I think you will be better off overall if you use sys.sql_modules and OBJECT_DEFINITION() for all of your metadata access to procedures and functions. They will always return the body of the module as intended, without chunking or unnecessary carriage returns and line feeds.