In my last post in this series, I talked about some problems associated with relying on undocumented behaviors and commands. This time I wanted to touch on SQL Server configuration, and some of the items I see where more thought should have been put into the process when SQL Server was initially installed, rather than just accepting the defaults or picking the options that are "easiest."
Not surprisingly, since SQL Server is so much easier to set up than other RDBMS platforms I won't mention, a lot of people fire up the setup wizard and click Next / Next / Next without having to fill in much information at all, never mind really think about it. By stepping through the more interesting dialogs in a SQL Server 2008 R2 installation wizard, I will try to highlight some of the things that can happen during such an install, that can cause problems later:
Feature Selection screen
No joke, I see folks check *ALL* of the boxes here (and the laziest ones press "Select All"):
I can only assume that they have no idea what the application needs, and don't want to have to come back and install things later. So every install could have Analysis Services, Reporting Services, Full-Text Search, replication, BIDS, Books Online, etc. even if the destination servers are not workstations and the application does not need any of the additional services. My opinion is to take a minimalist approach here – install only what you need on a server. Not only will it reduce the memory / CPU footprint of background services you'll never need, but since most of this stuff ends up on the system drive by default, you'll also prevent filling up that precious (and far too often under-configured) space too quickly.
In the case of a cluster install, then this shouldn't be done in a rush in the first place. The requirements of the application that will utilize the cluster should be dictated well in advance, so there is no question about whether Analysis Services or Full-Text Search will be used. Though the clustered version of setup doesn't allow you to un-check some of those boxes anyway – the reason is that things like Full-Text Search are "really hard" (not my words) to configure for clustering after the engine is already installed.
As an aside, usually "Shared feature directory" is customizable, and this could be a good candidate for moving to a location other than your C:\ drive. If you are adding a new instance or already have client components installed, however, you must continue to use the directory you chose when you first installed SQL Server. This is the case in the screen shot above, because I already have several instances and workstation components installed.
Instance Configuration screen
The default here is to put your instance at C:\Program Files\Microsoft SQL Server\:
As I briefly mentioned in the previous section, your C:\ may not be able to handle an instance of SQL Server at all, never mind one with busy usage or large volumes of data. On the other hand, you may be installing SQL Server in a virtual machine with only one drive letter, or on a system with only one hard drive. In that case, well, I hope you are prepared for the event when tempdb or a database's transaction log has filled the disk.
Server Configuration screen
Ah yes, this is the one where you pick the accounts for the SQL Server services:
The "let's get to a place where we can click Next fastest" methodology takes two approaches here: set all services to run as NT AUTHORITY\SYSTEM or NETWORK SERVICE, or set all services to run using a domain admin. I can't think of many situations where either of these choices is appropriate – usually the best choice is a dedicated domain account (not a domain administrator, and not a local administrator either). Also, don't go overboard here… I've seen cases where people have gone through the Add… button and added everyone in their domain to the SQL Server administrators list. This is overkill, and not something you should need to do during setup anyway – if you need to add logins you can do that after you have the engine up and running. I don't want to get into all of the security nuances or the principle of least privilege (I plan to cite some examples of this in a future post in this series); I just want to point out that some thought should be put into how security for the instance should be set up. Granted, this is one of the settings I'll talk about in this article that is relatively easy to change after installation.
The collation setting, thankfully, is a lot harder to screw up. Wait, what is collation? The short answer: it is used for determining valid character sets, comparing strings, and sorting (for more info, start here in Books Online). You have to intentionally switch tabs on this screen, and intentionally pick something other than the default. I would strongly recommend that, unless you are absolutely positive you need a non-default configuration, that you don't even switch to this tab. While some applications might require a specific non-default collation, I'd be very careful about picking anything other than the default for an instance-wide setting. You're pretty much stuck with this choice for as long as the instance is alive, so if you need to change this later, you will be looking at building up SQL Server from scratch. I have yet to come across an application I was building out that needed a non-default collation, but in the past dealt with an inherited application with multiple database servers where some servers were in CodePage 1, and others were in CodePage 437. It wasn't pretty when we needed to consolidate the application onto a single instance – let's just say I got very familiar with the BCP command.
Database Engine Configuration screen
Here on the first tab ("Account Provisioning"), you pick the authentication mode (and sa password if using mixed mode):
For a lot of shops, mixed mode is used because there are applications that connect to SQL Server from beyond the domain (and it is not always easy to get domain trusts enforced ). While you can connect to SQL Server using Windows Authentication in SSMS to a different domain without a proper relationship in place (see this blog post on runas /netonly by James Kovacs). But if you are in a situation where you can enforce Windows Authentication, this is the more secure option of the two. As with many other choices, make sure you understand how your application will use the database, rather than just breezing through the defaults.
On the second tab, Data Directories, you have much more flexibility than you did in previous versions:
You can dictate the location of the data root (where master, msdb, and model live – you can't customize the location of the resource database, it will always be in <instance root>\MSSQL\BINN\), user database and log directories (where new database files will be created), "temp DB" (let's ignore minor spelling issues for now), and backups. When you can, split these things out as much as possible: the more spindles the better. Having them all on C:\ does not make much sense if you have other drives, or attached storage, or even an underlying SAN that could/should be utilized. Unfortunately, since these are hidden on a tab that is not presented by default, and nothing even suggests that you might want to make changes here at all, I see many cases where these were all left on C:\ – even when there is a perfectly healthy SAN with LUNs carved out for this server. Now, that problem will bite you eventually when C:\ runs out of space, or you may reserve the right to explicitly define the location of the data and log files when you actually create the database. All that is fine, but if you set the default location to a more reasonable place, then you're covered when you forget, or a database is created by someone less familiar with your architecture.
The last tab deals with FILESTREAM. This is another area where a rushed administrator might attempt to set up FILESTREAM access, without really being sure whether your application requires the feature (again to avoid having to come back and configure it later). I have not played with FILESTREAM enough to know how much of a pain it is to later turn it on or off when it has or has not been configured and that was the wrong choice. So as much as possible, try to determine *before* installing SQL Server whether or not you will need this capability.
Error and Usage Reporting screen
Perhaps this is only true in SQL Server 2008 R2 because it is not a final release, but when I get to this screen, both checkboxes are checked:
Which means SQL Server will, by default, send error reports to Microsoft. It's up to you whether or not you want to do this (and you may have a corporate report server set up to receive Windows and SQL Server error reports), but on production instances of SQL Server, I will always strive to ensure that these two checkboxes are unchecked. This is not due to a fascination with Big Brother or George Orwell, but rather to prevent as much network chatter as possible.
And there are other things as well
Once you have SQL Server set up, you may want to look at changing some of the default behaviors. I don't want to say that you should change any or all of these, just want to give an idea of things you should investigate *possibly* changing after SQL Server is running, but before they cause problems later. Just a quick sweep of things off the top of my head:
- set up min/max memory, /3GB, /PAE – make sure SQL Server is seeing the memory it's supposed to, and only the memory it's supposed to.
- if you need to have auditing / Common Criteria compliance set up, may as well get that out of the way now and ensure it is functional.
- for SQL Server 2008 and above, Enterprise Edition at least, you can set WITH COMPRESSION as the default backup option under Server Properties / Database Settings / Backup and restore.
- configure max degree of parallelism – sometimes you don't want your instance to have the ability to create plans with all processors, or even with more than one.
- straighten out the proxy account for SQL Server Agent early (create and test a job that does things you'll need to do, such as access file system, call cmdExec steps, etc).
- if appropriate, change tempdb file allocation (typical wisdom, if you have enough spindles, is to create a data file per physical core).
- change the autogrowth settings on the model database : 1 MB for data and 10% for log? This may have been appropriate last decade but it makes no sense today IMHO.
- check out other settings in model which will be inherited by user databases (initial size, recovery model, snapshot isolation, compatibility level, default cursor scope, broker enabled).
Then, there is the set of things you don't know that you don't know
Now, there are cases where you don't get to set the preferences, such as when some software package needs to install Express locally, and does so behind your back, so to speak. And you might not even know that they did so. Or they assume you have Express installed, then use the ATTACHDBFILENAME argument in their connection string, hard-coded to "C:\Program Files\<their app>\<their DB>.mdf" … then proceed to grow that database on your C: drive. What percentage of users installing an application like this will be able to quickly diagnose why their system drive has run out of space? What percentage of those will be able to shut down the app, modify the connection string, and move the MDF file to a more appropriate location? I think it's a shame that software still gets produced this way… any such package should first ask if you already have an existing instance of SQL Server somewhere that you'd like to use for their application. I have to imagine that this practice is responsible for a large portion of the world's SQL Server sprawl.
Those are some of the basic things that I often see mis-configured in various SQL Server instances that I come across in the wild. I hope you can take at least one thing I touched on in this article and make your next SQL Server installation smoother.