SQL Server v.Next (Denali) : Contained Databases
A problem that has plagued SQL Server for a long time is that a database is not very portable. Sure, you can backup or detach a database and restore or attach it elsewhere, but when you do that, you are missing a lot of the pieces that make that database a part of an application, and a lot of those pieces are really considered administration pieces as opposed to application pieces. You don't get any synchronization of "outside of the database" items such as security, roles, linked servers, CLR, database mail, service broker objects, replication, and SQL Server Agent jobs; also, if the target server has a different collation, you are going to have issues when creating #temp tables or joining to different databases. You also may be aware of many issues with moving databases to a new server, and having to deal with logins and other security aspects that tend to be incomplete (or wrong) after the initial move.
Enter "Contained Databases"
In SQL Server Denali, some of these issues are being addressed with a feature called "contained databases." I am not sure if that will be the marketing term you'll hear come release time, but for now I'm going with it. Since it's built into the DDL, the name will be harder to change than, say, when they changed Dynamic Management Framework (DMF) to Policy Based Management (PBM). Essentially, contained databases will help the database to be more of a black box, separating the application and database-specific responsibilities from the administration and server- or instance-specific responsibilities.
In Denali, the first iteration of the contained databases feature will provide the following solutions:
- you can create a database-specific user without a login (and you can create multiple such users with the same name for different databases);
- you can have *some* compatible use of tempdb with databases of different collations, since #temp tables will be created in the collation of the calling database context; and,
- you can use a DMV to show most objects or code you have that will threaten containment.
Objects and other entities can now be classed into two different categories: contained and uncontained. A contained entity is one that has no external dependencies, or at least no dependencies on the instance or server. An example of a contained object is one that lives within the database and has no external references. For a full list of contained entities, see this Books Online topic:
An uncontained object is one that either has an explicit external dependencies (e.g. three- or four-part names) or where it cannot be determined (e.g. dynamic SQL). For a full list of explicit objects that are considered uncontained, see this Books Online topic:
Note that some things were missed; for example, HOST_NAME should not be in this list (and it has been reported on Connect). Finally, if you want to understand what SQL Server features were modified to accommodate containment, see this Books Online topic:
The most interesting thing here is the fact that CREATE / ALTER DATABASE has been changed, and actually works differently when affecting a contained database. A new option called CURRENT has been added, so that if you move the database to a new instance or change the name of the database, the command will still work. The documentation states that in a contained database you MUST use ALTER DATABASE CURRENT, and in an uncontained database, you MUST use ALTER DATABASE <database name>, but – in the current CTP at least – neither limitation is enforced. The topic has some other useful information on a new collation type, limitations of temporary tables, and user options.
Turning containment on
In Denali, only partial containment is supported. This means that you can create uncontained entities, and you can discover these entities (using a new DMV), but the engine will not prevent you from creating them. In a future version of SQL Server, full containment will be supported; this means that containment can actually be enforced.
So, let's create a couple of databases (with different collation), and play with some of these features. Note that you need to set a configuration option at the server level in order to use DDL dealing with CONTAINMENT.
USE [master]; GO CREATE DATABASE [ContainedDB1] COLLATE SQL_Latin1_General_CP1_CI_AS; GO CREATE DATABASE [ContainedDB2] COLLATE Finnish_Swedish_100_CI_AI_SC; GO EXEC sp_configure 'show advanced options', 1; GO RECONFIGURE WITH OVERRIDE; GO EXEC sp_configure 'contained database authentication', 1; EXEC sp_configure 'show advanced options', 0; GO RECONFIGURE WITH OVERRIDE; GO ALTER DATABASE [ContainedDB1] SET CONTAINMENT = PARTIAL; GO ALTER DATABASE [ContainedDB2] SET CONTAINMENT = PARTIAL; GO
Creating a user without a login
You can now create a user that is not tied to a login. The syntax is quite similar to CREATE LOGIN, and you will still have access to most of the other features such as password complexity, default_schema and default_language. Here we will create two users, both named MyUser, with different passwords for each database:
USE [ContainedDB1]; GO CREATE USER [MyUser] WITH PASSWORD = 'DB1'; GO USE [ContainedDB2]; GO CREATE USER [MyUser] WITH PASSWORD = 'DB2'; GO
If I try to do this in an uncontained database, I get the following error:
Msg 33233, Level 16, State 1, Line 1 You can only create a user with a password in a contained database.
You may ask yourself, if there are two users with the same name, how does SQL Server figure out which one you mean? Well, users are now authenticated at the database level first (database is specified in the connection string), and SQL Server will try to find a login with the same name if no user is found at the database level. Mainly for performance reasons, it will not do the opposite: try to find a login, then when it fails, poll every single database for a user with the same name – it will just fail outright if there is no database specified. So, if you specify a valid contained database user name but no database context, the credentials will fail even if the password is correct. This means you will want to stop relying on the default database setting (which in fact does not exist for a contained user), and explicitly specify the "Initial Catalog" property in your connection strings.
You can see the new properties page for contained database users, and they are of the type "SQL user with password":
So what's the big deal? Well, now I can move either or both of these databases to others server, and just re-point connection strings; no login or SID mess. I can also do the same with Windows principals, but this isn't as interesting because they are not tied to logins for uncontained databases either:
However, note that in a contained database, the Windows principal is automatically granted access to the engine and the current database, whereas in an uncontained database, this must be granted through an associated login or via a Windows group. For more information on the updated CREATE USER syntax, see this Books Online topic. There is also a new system stored procedure, sys.sp_migrate_user_to_contained, to help you migrate each SQL auth-based user to a contained database user with password.
Note that when you move a database to a new server with different password complexity rules, these will not be validated unless you re-create a login or alter their password. So it can be possible to end up with weak passwords even on a server where complexity rules are expected to be enforced.
Security is a big issue, and I'm sure I've barely scratched the surface with how users and logins will or will not be able to co-exist in a contained database world. Please let me know about any scenarios that you are interested in investigating. Also be sure to check out this Books Online topic, "Threats Against Contained Databases." One interesting note from that topic, is just another nail in the coffin for AUTO_CLOSE: contained databases with AUTO_CLOSE enabled could be subject to a denial of service attack, because of the extra resources required to authenticate a contained database user.
Resolving tempdb collation issues
It should be no surprise what happens if you run this code in an uncontained or pre-Denali database (on a server with a collation other than Finnish_Swedish_100_CI_AI):
USE [master]; GO CREATE DATABASE [test] COLLATE Finnish_Swedish_100_CI_AI; GO USE [test]; GO CREATE TABLE dbo.foo ( bar NVARCHAR(32) ); CREATE TABLE #foo ( bar NVARCHAR(32) ); GO SELECT * FROM #foo AS a INNER JOIN dbo.foo AS b ON a.bar = b.bar; GO DROP TABLE #foo; GO USE [master]; GO DROP DATABASE [test]; GO
The join condition yields this error message, because the temporary table was created in the default server collation, not the database's collation:
Msg 468, Level 16, State 9, Line 3 Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Finnish_Swedish_100_CI_AI" in the equal to operation.
In a contained database in Denali, temporary objects are created with the database collation. So if you make the database in the above sample contained and re-run the code, it will work just fine.
Of course this doesn't solve all of the issues… you can still have conflicts if you join your #temp tables to objects in databases with different collations. And I do not mean to trivialize collation issues, I just don't have a lot of exposure to the problems, so if you are in a collation-sensitive environment, you will want to spend some more time investigating whether this creates more problems than it solves. But if your T-SQL code is truly contained to the database and collation issues are minor, most of this should make moving the database to a new instance easier.
Discovering threats to database portability
It can be important to understand what parts of your database won't be portable (a.k.a. contained) even if you are not utilizing database containment functionality. So, there is a new DMV called sys.dm_db_uncontained_entities that will highlight these items for you. Not only will it highlight the objects, but for procedure / functions / views / triggers, it will highlight the statement offset, so you can drill right down to the statement and be made immediately aware of multiple violations in the same module. Let's use a contained database we've created above, and add a bunch of objects that we know will threaten containment:
USE [master]; GO -- create a login: CREATE LOGIN MyLogin WITH PASSWORD = 'DB1'; GO USE [ContainedDB1]; GO -- create a user bound to the above login: CREATE USER [MyLogin] FROM LOGIN [MyLogin]; GO -- create a procedure with dynamic SQL + deferred name resolution: CREATE PROCEDURE dbo.foo AS BEGIN EXEC('SELECT * FROM table1'); EXEC('SELECT * FROM table2'); SELECT * FROM dbo.object_does_not_exist; END GO -- create a synonym to an external, uncontained object: CREATE SYNONYM dbo.bar FOR [master].sys.backup_devices; GO -- create a procedure that references the synonym: CREATE PROCEDURE dbo.getbar AS BEGIN SELECT * FROM dbo.bar; END GO -- create a procedure that calls xp_cmdshell: CREATE PROCEDURE dbo.use_xp_cmdshell AS BEGIN EXEC xp_cmdshell 'dir C:\'; END GO -- create a procedure that relies on database mail: CREATE PROCEDURE dbo.use_dbmail AS BEGIN EXEC msdb.dbo.sp_send_dbmail; END GO -- create a silly function that generates a random object_id: CREATE FUNCTION dbo.GenRandNumber() RETURNS BIGINT AS BEGIN RETURN ( SELECT TOP 1 [object_id] FROM msdb.sys.objects; ); END GO -- create a table with a default constraint that references the function: CREATE TABLE dbo.nonsense ( id INT NOT NULL DEFAULT dbo.GenRandNumber() ); GO
I threw this query together to show how the DMV can be used to pinpoint the problems and to dive directly into resolving them:
SELECT e.feature_name, [object] = COALESCE( QUOTENAME(SCHEMA_NAME(o.[schema_id])) + '.' + QUOTENAME(o.[name]), QUOTENAME(SCHEMA_NAME(s.[schema_id])) + '.' + QUOTENAME(s.[name]) ), [line] = COALESCE(e.statement_line_number, 0), [statement / synonym target / route / user/login] = COALESCE( s.[base_object_name], SUBSTRING( m.[definition], e.statement_offset_begin / 2, e.statement_offset_end / 2 - e.statement_offset_begin / 2 ) COLLATE CATALOG_DEFAULT, r.[name], 'User : ' + p.[name] + ' / Login : ' + sp.[name] ) FROM sys.dm_db_uncontained_entities AS e LEFT OUTER JOIN sys.objects AS o ON e.major_id = o.[object_id] AND e.class = 1 LEFT OUTER JOIN sys.sql_modules AS m ON e.major_id = m.[object_id] AND e.class = 1 LEFT OUTER JOIN sys.synonyms AS s ON e.major_id = s.[object_id] AND e.class = 1 LEFT OUTER JOIN sys.routes AS r ON e.major_id = r.[route_id] AND e.class = 19 LEFT OUTER JOIN sys.database_principals AS p ON e.major_id = p.principal_id AND e.class = 4 LEFT OUTER JOIN sys.server_principals AS sp ON p.[sid] = sp.[sid];
You'll see that the stored procedure dbo.foo was picked up by the DMV three times: once for each use of dynamic SQL, and once for reference to an object that doesn't yet exist (this is captured so you're away that this *may* become an uncontained object when it eventually gets created). I thought the line number could be pretty useful too, so that if you loaded the definition of the object, you could go straight to that line instead of performing a search.
Note that the DMV did not pick up the SYNONYM that referenced a three-part name; you'd have to identify any synonyms and then brute force discover them. I filed Connect #622368 in the hopes that they will rectify this issue.
Also note that the table using a function that itself has an uncontained reference is not picked up. I didn't file a Connect item for this one as I think it is a relatively rare issue, however if you have situations like this it is something to keep in mind.
Finally, I am not sure exactly what you can do about the AutoCreatedLocal route; this is the first time I've ever noticed this entity.
There are some other entities covered by the DMV that I didn't treat here, such as Assembly, Type, Full-text Index, Database DDL Trigger, and Audit Specification, but if you are using these items, you should see them show up in the list as well (and you can adjust the query to include any relevant information from those catalog views).
There are obviously plenty of angles to work on, and it will likely take several versions to get to all of them. Having linked servers and agent at the database level, for example, could be very interesting. I can sense the level of investment here and have witnessed the team's excitement first hand, so I am looking forward to the evolution of this feature over the next few major releases.
For more information on contained databases, there is a great Books Online topic in the works: