In SQL Server 2000, INFORMATION_SCHEMA was the way I derived all of my metadata information – table names, procedure names, column names and data types, relationships… the list goes on and on. I used system tables like sysindexes from time to time, but I tried to stay away from them when I could.
In SQL Server 2005, this all changed with the introduction of catalog views. For one thing, they're a lot easier to type. sys.tables vs. INFORMATION_SCHEMA.TABLES? Come on; no contest there – even on a case insensitive collation where you don't have to make those ugly upper-case words consistent. But further to that, I found them generally easier to use, in addition to including feature and implementation details that are not covered by the INFORMATION_SCHEMA views. A few examples:
Here are the columns available in the INFORMATION_SCHEMA.COLUMNS view:
TABLE_CATALOG TABLE_SCHEMA TABLE_NAME COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH NUMERIC_PRECISION NUMERIC_PRECISION_RADIX NUMERIC_SCALE DATETIME_PRECISION CHARACTER_SET_CATALOG CHARACTER_SET_SCHEMA CHARACTER_SET_NAME COLLATION_CATALOG COLLATION_SCHEMA COLLATION_NAME DOMAIN_CATALOG DOMAIN_SCHEMA DOMAIN_NAME
Nothing in there about IDENTITY properties. Whereas in sys.columns, we have the is_identity column, as well as the catalog view sys.identity_columns, with all kinds of additional information. Now, it should be no surprise that this data is missing from the INFORMATION_SCHEMA views, since IDENTITY columns are proprietary to SQL Server and aren't part of the standard. But if you need to retrieve metadata information, this will often include the need for information about IDENTITY columns, and if you're trying to avoid using the catalog views (or proprietary SQL Server functions like COLUMNPROPERTY), you'll be out of luck.
Indexes, partitions, statistics…
Since these are physical implementation details, there is no coverage anywhere in INFORMATION_SCHEMA. Go ahead, create an index (that is not also participating in a constraint), or partition a table, and try to find any evidence of either of these structures in any INFORMATION_SCHEMA view (including filtered indexes, XML indexes, spatial indexes, fulltext indexes, and include columns). Also try to find out if a table has a clustered index, if a primary key is clustered or non-clustered, or if there are any statistics available. This is all important data, and the person trying to find the information may not have any idea about the line between physical and logical, or why the "recommended" approach to obtaining metadata should treat them differently or ignore physical details altogether.
Non-standard Foreign Key constraints
SQL Server allows something that the standard doesn't allow: foreign key constraints referencing unique indexes. Here is an example:
CREATE TABLE dbo.FKExample ( ColA INT, ColB INT ); CREATE UNIQUE INDEX UQ ON dbo.FKExample(ColA, ColB); CREATE TABLE dbo.FKExampleRef ( ColA INT, ColB INT, CONSTRAINT FK_UQ FOREIGN KEY(ColA, ColB) REFERENCES dbo.FKExample(ColA, ColB) );
To determine these relationships using INFORMATION_SCHEMA views, we'd normally perform a join between INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS, INFORMATION_SCHEMA.TABLE_CONSTRAINTS, and INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE. But since the unique index is not a constraint, there is no data in INFORMATION_SCHEMA.TABLE_CONSTRAINTS, so no way to dig deeper and determine the columns used in the "constraint" without going out to catalog views.
SELECT CONSTRAINT_NAME, UNIQUE_CONSTRAINT_NAME FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS;
CONSTRAINT_NAME UNIQUE_CONSTRAINT_NAME --------------- ---------------------------- FK_UQ UQ
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS WHERE CONSTRAINT_NAME = 'UQ' UNION ALL SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE WHERE CONSTRAINT_NAME = 'UQ';
CONSTRAINT_NAME --------------- 0 row(s) affected.
Because this "constraint" was implemented only as an index, the INFORMATION_SCHEMA views do not contain any further information about it. But we can still get the information using sys.foreign_keys, sys.foreign_key_columns, and sys.columns, just like we would for a foreign key that referenced a unique constraint or a primary key constraint:
SELECT FK = fk.name, FKTable = QUOTENAME(OBJECT_SCHEMA_NAME(fkcol.[object_id])) + '.' + QUOTENAME(OBJECT_NAME(fkcol.[object_id])), FKCol = fkcol.name, ' references => ', PKTable = QUOTENAME(OBJECT_SCHEMA_NAME(pkcol.[object_id])) + '.' + QUOTENAME(OBJECT_NAME(pkcol.[object_id])), PKCol = pkcol.name FROM sys.foreign_keys AS fk INNER JOIN sys.foreign_key_columns AS fkc ON fk.[object_id] = fkc.constraint_object_id INNER JOIN sys.columns AS fkcol ON fkc.parent_object_id = fkcol.[object_id] AND fkc.parent_column_id = fkcol.column_id INNER JOIN sys.columns AS pkcol ON fkc.referenced_object_id = pkcol.[object_id] AND fkc.referenced_column_id = pkcol.column_id ORDER BY fkc.constraint_column_id;
If you were relying on a similar join within the INFORMATION_SCHEMA views, you would not realize that this relationship existed.
While INFORMATION_SCHEMA.COLUMNS stores the *definition* of a default constraint, nowhere in these views will you find the *name* of the constraint. This can be particularly problematic if you use shorthand and allow SQL Server to name your constraints. Instead you will have to look at sys.default_constraints.
Computed columns are another area where the INFORMATION_SCHEMA views turn a blind eye. Consider the following simple example:
CREATE TABLE dbo.ComputedExample ( a INT, b AS (CONVERT(INT, a + 10)) );
Reviewing the list of INFORMATION_SCHEMA.COLUMNS columns listed above, or just running a query, you can see no place for the definition of the second column, or any discernible difference between the two columns – they're both integers and, apart from COLUMN_NAME and ORDINAL_POSITION, all the column values are identical. In sys.columns, we have a flag called is_computed, and we can retrieve the definition of the column from sys.computed_columns using the following query:
SELECT c.name, cc.[definition] FROM sys.columns AS c LEFT OUTER JOIN sys.computed_columns AS cc ON c.[object_id] = cc.[object_id] AND c.is_computed = 1 WHERE c.[object_id] = OBJECT_ID(N'dbo.ComputedExample') ORDER BY c.column_id;
As far as the INFORMATION_SCHEMA views are concerned, b is just an ordinary column.
Much like computed columns, there is no spot in the INFORMATION_SCHEMA.COLUMNS view to indicate whether a column is sparse or a column_set. In sys.columns you will find the columns is_sparse and is_column_set to quickly identify which columns in which tables have been defined as sparse and how the columns are used.
If you are trying to get the definition for a stored procedure, function, or other module from INFORMATION_SCHEMA, and the definition happens to be greater than 4,000 characters, good luck. This column is truncated to NVARCHAR(4000), even though it is defined as NVARCHAR(MAX). This isn't all that much worse than what the backward compatibility view syscomments does (it stores the definition on separate rows in chunks of 4,000 characters). If you might have to sometimes use sys.sql_modules or OBJECT_DEFINITION() to retrieve the entire definition in one piece, why would you ever want to use a less reliable way?
SQL Server 2012 added INFORMATION_SCHEMA.SEQUENCES for the Sequence functionality they added. However, only the information covered by the standard is included in the view. If you want to determine if the Sequence is exhausted, for example, or the current value of the Sequence object, you won't find that from INFORMATION_SCHEMA.SEQUENCES; you'll have to go to sys.sequences anyway. Example:
CREATE SEQUENCE dbo.SequenceExample AS INT MINVALUE 1 MAXVALUE 65535 START WITH 1 CACHE 1000 CYCLE; GO SELECT NEXT VALUE FOR dbo.SequenceExample; GO 3 SELECT current_value, is_exhausted FROM sys.sequences WHERE name = 'SequenceExample';
object_id is missing
As we know, much of SQL Server's internal mechanisms identify metadata using surrogate identifiers like object_id, which don't exist in the INFORMATION_SCHEMA views – they are entirely name-based. This makes joins much more cumbersome – instead of:
SELECT /* cols */ FROM sys.tables AS t INNER JOIN sys.columns AS c ON t.[object_id] = c.[object_id];
You have to say:
SELECT /* cols */ FROM INFORMATION_SCHEMA.TABLES AS t INNER JOIN INFORMATION_SCHEMA.COLUMNS AS c ON t.TABLE_SCHEMA = c.TABLE_SCHEMA AND t.TABLE_NAME = c.TABLE_NAME;
It doesn't look like that much more code, but combined with all the other extra typing you're doing, it will add up if you are writing a lot of metadata queries.
Alias type inconsistency
If you use alias types (in spite of my warnings about them), you'll notice some inconsistency. In INFORMATION_SCHEMA.COLUMNS and INFORMATION_SCHEMA.ROUTINE_COLUMNS, for example, the alias name is in a column called DOMAIN_NAME. In INFORMATION_SCHEMA.PARAMETERS, the column name is USER_DEFINED_TYPE_NAME. And in INFORMATION_SCHEMA.SEQUENCES the column is called DECLARED_DATA_TYPE. (With the catalog views, everything maps back to sys.types columns user_type_id and/or system_type_id.) Now, the view INFORMATION_SCHEMA.COLUMN_DOMAIN_USAGE has a row for every usage within tables, views and table-valued functions – however your joins or unions to pull this information are going to be quite convoluted if you have to look up how the column is referenced differently in every single view.
Misleading information about updatable views
While there are a whole bunch of criteria used to determine whether a view is updatable (including details in the query that is trying to perform the update, such as whether the columns affected reside in only one of the base tables affected by the update), the INFORMATION_SCHEMA.VIEWS view will *always* tell you a view is not updatable.
Further to these things that I have observed, Microsoft seems to be going out of its way to steer you away from the INFORMATION_SCHEMA views as well. For example, the INFORMATION_SCHEMA.SCHEMATA topic says:
** Important ** Do not use INFORMATION_SCHEMA views to determine the schema of an object. INFORMATION_SCHEMA views only represent a subset of the metadata of an object. The only reliable way to find the schema of an object is to query the sys.objects catalog view.
And the INFORMATION_SCHEMA.TABLES topic says:
** Important ** only reliable way to find the schema of an object is to query the sys.objects catalog view. INFORMATION_SCHEMA views could be incomplete since they are not updated for all new features.
The warnings originally stated that the INFORMATION_SCHEMA views were incorrect, but they fixed the wording after I complained here.
Also, the System Information Schema Views topic says:
The information schema views included in SQL Server comply with the ISO standard definition for the INFORMATION_SCHEMA.
Some changes have been made to the information schema views that break backward compatibility. These changes are described in the topics for the specific views.
It is important to note that this statement of compliance was true for ANSI/ISO SQL-92, but not for revisions that have been made to the standard since then. There are changes present in SQL:1999, for example, that have not been reflected in the SQL Server implementation. And as I've highlighted above, there are many features in SQL Server that fall outside of the SQL standard that also do not appear within the INFORMATION_SCHEMA views.
So you don't have to remember when you have to fall back to the catalog views to get a complete picture, do what I do, and avoid the INFORMATION_SCHEMA views altogether. This will make your programming more consistent; after all, why use the catalog views only when you HAVE to?
I know that for some of you this is counter-intuitive because you deal with other RDBMS platforms where you also use INFORMATION_SCHEMA. I don't really have a good answer for folks in that boat, except to be cognizant of the inconsistencies and omissions on each platform, and perhaps to resign yourself to learning the metadata approach relevant to each platform rather than try a one-size-fits-all approach.