September 12, 2019 | SQL Server

Bad habits to kick : avoiding the schema prefix

In my last post in this series, I covered the dreadful SELECT * and other ways we avoid typing out a column list. This time I want to discuss the schema prefix. I originally published this post in 2009 and updated it in 2019 with an example showing the effect on the plan cache.

This applies to both creating and referencing objects. Do not make any assumptions about which schema an object belongs to. All your objects belong to dbo? OK, use the dbo prefix anyway. Why? Because you will use additional schemas someday, or some third party will force them on you, or even Microsoft will (for example, Change Data Capture). Why leave it up to chance? Typing "dbo." is not that much work… and once you get into the habit, it will be no work at all. This is another case where nothing is lost by being explicit, but there is plenty to lose otherwise.

Without an explicit schema, SQL Server will first look under the default schema of your database user, and then fall back to dbo. This can cause problems, obviously, if you have a table called your_default_schema.foo and there is also a table in the database called dbo.foo. SQL Server will pick the one under your default schema, which *might* be the wrong choice if it is not what you intended, but how can it know any better? If you tell it explicitly which schema you are after, there is no chance for confusion. In fact, you might have created one of the objects accidentally, by not using the schema prefix during creation. (As an aside, you always have to qualify scalar user-defined functions with a schema prefix. So, if you use functions a lot, you're probably already well on your way.)
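If you are not sure which schema unqualified names resolve against for a given user, you can ask the database. This is a minimal sketch using the built-in SCHEMA_NAME() function and the sys.database_principals catalog view; the column aliases are only illustrative:

```sql
-- Default schema of the current caller: the first place SQL Server
-- looks when a name has no schema prefix.
SELECT [user_name]      = USER_NAME(),
       [default_schema] = SCHEMA_NAME();

-- Default schema for each SQL user, Windows user, and Windows group
-- in the current database.
SELECT name, default_schema_name
  FROM sys.database_principals
  WHERE [type] IN ('S', 'U', 'G')
  ORDER BY name;
```

Any user whose default schema is not dbo is a candidate for the kind of surprise described above.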

Here is a quick example:

USE [master];
SET NOCOUNT ON;
GO
 
CREATE DATABASE blat;
GO
 
USE blat;
GO
 
CREATE TABLE bar ( x varchar(32) ); -- this is dbo.bar!
 
INSERT bar( x ) SELECT 'dbo created this.';
GO
 
SELECT x FROM bar;
GO
 
CREATE SCHEMA foo AUTHORIZATION dbo;
GO
 
CREATE USER foo WITHOUT LOGIN WITH DEFAULT_SCHEMA = foo; -- otherwise the default schema is dbo
GO
 
EXEC sys.sp_addrolemember @rolename = N'db_owner', @membername = N'foo';
GO
 
EXECUTE AS USER = N'foo';
GO
 
CREATE TABLE bar ( x varchar(32) ); -- this is foo.bar!
 
INSERT bar( x ) SELECT 'foo created this.';
 
SELECT x FROM bar;
GO
 
REVERT;
GO
 
SELECT [table] = s.name + '.' + t.name -- qualify name: both views have that column
  FROM sys.tables AS t
  INNER JOIN sys.schemas AS s
  ON t.[schema_id] = s.[schema_id]
  WHERE t.[name] = 'bar';
GO

Results:

x
--------------------------------
dbo created this.
 
x
--------------------------------
foo created this.
 
table
--------------------------------
dbo.bar
foo.bar
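
With two-part names, the same statements no longer depend on who runs them. A short sketch, assuming the blat database and both bar tables exist as shown in the results above:

```sql
USE blat;
GO

-- Explicit schema: each statement names exactly one table.
SELECT x FROM dbo.bar;  -- the row inserted by dbo
SELECT x FROM foo.bar;  -- the row inserted by foo
GO

EXECUTE AS USER = N'foo';
GO
SELECT x FROM dbo.bar;  -- still dbo's row, whatever foo's default schema is
GO
REVERT;
GO
```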

Impact on Execution Plans

Another scenario to be aware of is when you have users or applications with different default schemas executing the exact same query – they can't share that query plan because one of the plan attributes forces the plan to be cached differently. Let's create a table, a separate schema, and a user with that schema as their default:

CREATE TABLE dbo.SomeTable(i int);
GO
CREATE SCHEMA SecurityGroup;
GO
CREATE USER Guard WITHOUT LOGIN WITH DEFAULT_SCHEMA = SecurityGroup;
GO
GRANT SELECT ON dbo.SomeTable TO Guard;
GO

Now, we're going to run an ad hoc query against dbo.SomeTable, but without mentioning the schema, first as ourselves (presumably dbo or sa) and then as Guard:

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
GO
SELECT i FROM SomeTable;
GO
 
EXECUTE AS USER = N'Guard';
GO
SELECT i FROM SomeTable;
GO
REVERT;

Both queries get the same results – an empty result set from dbo.SomeTable. The interesting part is in the plan cache:

SELECT t.[text], p.size_in_bytes, 
  p.usecounts --, [schema_id] = pa.value, [schema] = s.name
FROM sys.dm_exec_cached_plans AS p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
CROSS APPLY sys.dm_exec_plan_attributes(p.plan_handle) AS pa
LEFT OUTER JOIN sys.schemas AS s
  ON s.[schema_id] = CONVERT(INT, pa.[value])
WHERE t.[text] LIKE N'%SELECT%SomeTable%'
AND t.[text] NOT LIKE N'%dm_exec%'
AND pa.attribute = N'user_id';

Results:

text                        size_in_bytes    usecounts
------------------------    -------------    ---------
SELECT i FROM SomeTable;    16384            1
SELECT i FROM SomeTable;    32768            1

If you uncomment the additional columns, you'll see that the plan attribute that varies (user_id, despite its name) reflects the default schema of the user who ran the query.

Now, let's run the same batch again, this time with the schema prefix:

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
GO
SELECT i FROM dbo.SomeTable;
GO
 
EXECUTE AS USER = N'Guard';
GO
SELECT i FROM dbo.SomeTable;
GO
REVERT;

Then we can check the plan cache again, and this time the results are different:

text                            size_in_bytes    usecounts
----------------------------    -------------    ---------
SELECT i FROM dbo.SomeTable;    16384            2

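With the schema prefix, both executions reuse the same plan. The user_id attribute is -2 in this case, which indicates that the plan does not depend on implicit name resolution, so the schema name from the join comes back NULL.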
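
To get a rough idea of whether this is already happening on a server, you could look for identical query text cached under more than one default schema. The query below is a sketch rather than a definitive diagnostic: it reuses the same DMVs as the query above, relies on the documented meaning of -2 for the user_id attribute (the plan does not depend on implicit name resolution), and requires permission to view server state. The column aliases are my own.

```sql
-- Same query text cached once per default schema, because the text
-- relies on implicit name resolution.
SELECT t.[text],
       distinct_schemas = COUNT(DISTINCT CONVERT(int, pa.[value])),
       cached_plans     = COUNT(*),
       total_bytes      = SUM(CONVERT(bigint, p.size_in_bytes))
FROM sys.dm_exec_cached_plans AS p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
CROSS APPLY sys.dm_exec_plan_attributes(p.plan_handle) AS pa
WHERE pa.attribute = N'user_id'
  AND CONVERT(int, pa.[value]) <> -2
GROUP BY t.[text]
HAVING COUNT(DISTINCT CONVERT(int, pa.[value])) > 1
ORDER BY total_bytes DESC;
```

Plans can also be duplicated for other reasons, such as differing SET options, so treat the output as a list of candidates to review.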

There are exceptions, of course, like when you actually *want* a single query to adapt to the default schema of the user running it. But I suspect these cases are few and far between (and might be worthy of a design review).

Summary

Using schemas is a complex topic, and I don't want to get into all of the security ramifications. Those of you who are already using multiple schemas have probably already hit most of the big issues. I just wanted to suggest that you get into the habit of using the prefix whenever you create or reference objects in T-SQL code, even if you are currently only using objects in dbo. You might thank me later.
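
If you want to see how far existing code is from that habit, one possible starting point is the sys.sql_expression_dependencies catalog view, which records a NULL referenced_schema_name when a module references an entity without a schema prefix. This is a sketch under that assumption, with illustrative column aliases. It only covers stored modules (not ad hoc or application SQL) and may return some false positives.

```sql
-- Modules in the current database that reference an object
-- by a one-part name.
SELECT referencing_object = OBJECT_SCHEMA_NAME(d.referencing_id)
                            + N'.' + OBJECT_NAME(d.referencing_id),
       unqualified_name   = d.referenced_entity_name
  FROM sys.sql_expression_dependencies AS d
  WHERE d.referenced_schema_name IS NULL     -- no schema was written in the code
    AND d.referenced_database_name IS NULL   -- ignore cross-database references
    AND d.referenced_class = 1               -- objects and columns only
  ORDER BY referencing_object, unqualified_name;
```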

I am working on a series of "Bad habits to kick" articles, in an effort to motivate people to drop some of the things that I hate to see when I inherit code. Up next: inconsistent naming conventions.

12 comments on this post

    • Jim Danby - October 11, 2009, 7:10 PM

      I fear that this is too simple a description of a complex topic.

    • AaronBertrand - October 11, 2009, 7:41 PM

      Jim, the issue I wanted to address was leaving out the dbo. prefix in an all-dbo world.  I could spend a week writing about all the security implications and use cases of a multi-schema world, but that wasn't the goal of my post (and I doubt I could do any more justice to that than Books Online and authors before me have already done).  Obviously, if you have some wisdom to share, please do so…

    • Greg Joiner - October 12, 2009, 4:08 PM

      I think Jim's suggestive/supportive comment meant that a simple example of having some comment code to illustrate this serious issue would have improved your article..
      a dbo.foo and a Aaron.foo and showing a select foo getting Aaron when you wanted to get a dbo. etc.

    • AaronBertrand - October 12, 2009, 4:38 PM

      Point taken; I've added an example.  Hope it is useful.

    • Brian Tkatch - October 12, 2009, 7:33 PM

      I take the exact opposite position. I always leave the schema off. Then again, i come from an Oracle background where different databases are less used.
      I like to copy all my code into a new schema to test something. With one schema called aaron, and another called aaronv2, it is very easy to copy everything over when no schema is mentioned.
      This is very similar to websites and relative URLs, which i think is also a very good idea.
      The only time i mention a schema, is when the project itself demands that a particular schema be used.

    • Luciano Evaristo Guerche (Gorše) - October 13, 2009, 7:54 PM

      Aaron,
      As far as I remember that scenario you described related to schema(s) and table(s) scoping is named "shadowing".
      Regards,

    • AaronBertrand - October 19, 2009, 12:23 AM

      I noticed Erland's great article on dynamic SQL has some more ammo in favor of being in the habit of always specifying schema:
      http://www.sommarskog.se/dynamic_sql.html#queryplans

    • AaronBertrand - September 2, 2014, 5:28 PM
    • Dacius - June 29, 2015, 9:47 PM

      Well said.

    • Forrest - August 12, 2015, 9:33 PM

      A simple way to introduce this concept to new SQL developers is to convince them that learning to use and specify schemas isn't just about extra typing and security; it's also a very good organization practice.
      On one hand that isn't entirely what schemas are about and you CAN overdo it, but it DOES help you keep your fences up between projects, and you can't accidentally wipe out critical code as easily.
      If you've only written a handful of scripts and have no access to the security layer of your environment, then it may be lost on you; but consider taking ALL of your development scripts EVER and simply dumping them in one giant folder on your hard-drive and then trying to find that one query you KNOW you wrote just 6 months ago for a project that would be handy now.

    • stuartd - January 20, 2016, 7:13 PM

      Thank you for spelling "lose" correctly!

    • Jason - March 23, 2016, 8:39 PM

      I always specify the schema in creation scripts for that object. However, inside a proc for example, there are only two scenarios. 1. you want to enable the schema fall through (that is to say, if a jason.Customer table exists, then use that instead of dbo.Customer) which you might do in a shared dev environment. 2. Your objects are named so that they are distinct within that database so that the only difference between two different tables or views is NEVER just the schema. Anything else would be a poor architecture because it would be prone to mistakes.
      I'm not saying you can't or shouldn't specify the schema if want to, but I disagree with the premise that it's a best practice that should be followed by all.
