The other day I wrote a blog post about my initial trials with SQL Storage Compress 5.0. For the most part my experience with the product was fantastic, however there were a few areas where I felt bit uneasy – and wanted to address those concerns.
Since I am a troublemaker by nature, one of the first things I tried in my initial round of testing was to access a compressed database while the HyperBac service was stopped. I simply stopped the service through the HyperBac Configuration Manager, and quickly discovered that my compressed databases were no longer accessible. Assuming there is no data loss, this situation is easy enough to recover from, by following these steps:
You can also use a more tedious technique, which decompresses each individual mdfx/ldfx file into its equivalent mdf/ldf file, and then simply attach those files as if it were a normal database. This is similar to the technique you would have to follow if you were using the product and your license expired, or if you decommissioned a SQL Server and only had the .mdfx/.ldfx files available:
hyperutil.exe /S"<path>\<file>.mdfx" /O"<path>\<file>.mdf" /E
hyperutil.exe /S"<path>\<file>.ldfx" /O"<path>\<file>.ldf" /E
CREATE DATABASE <db_name>
ON PRIMARY (FILENAME = '<path>\<file>.mdf')
LOG ON (FILENAME = '<path>\<file>.ldf')
Of course in this case you will no longer be able to automatically use compression against these databases, unless you backup and restore them as mdfx and/or ldfx files again.
In a thread on the Red Gate Forum, Jeffrey Aven agreed that it might be beneficial to warn users of the impact when pressing the Stop button in the HyperBac Configuration Manager; so, they are going to consider addressing this in a future update to the product. While this doesn't help people who might (even inadvertently) stop the service via Task Manager, the Services applet or NET STOP, at least the casual goof might be prevented.
Log Size / Duration
In my first tests, the log file for the HyperBac compressed database was roughly 10X larger than the log file for native or non-compressed data. I checked multiple times that for each test everything was the same – identical workload, same isolation level, same autogrowth settings, same recovery model. While I had no problem creating this scenario multiple times on the first day, I've had trouble re-creating it since the first time I stopped and re-started the HyperBac service. Today I ran some tests with a little more data being pumped through to see if I could re-create it, and thought I would share more pretty graphs. I conducted four different tests instead of three:
Here were the comparisons of time required to perform data load and update workload tasks:
<img src="https://sqlblog.org/wp-content/uploads/2018/01/28388_RG_2_Time.png"> <i>Time, in seconds, required to perform workload tasks</i>
From this we can see that, when the background service has to compress both data file and log file information, the time required to run the workload is dramatically affected.
Next are the comparisons to data, log and overall size using the various compression methods:
<img src="https://sqlblog.org/wp-content/uploads/2018/01/28388_RG_2_Size.png"> <i>Size, in MB, of data and log files after workload</i>
While the log is obviously larger when we don't have HyperBac compress it, it seems that for volatile workloads (at least in typical OLTP applications) it may be worth considering only compressing the data file(s) and leaving the log file in the default mode. The duration for the workload when both log and data were being compressed by the HyperBac service was more than 3X longer than compressing just the data file. And while the uncompressed log file was much larger in that case, the overall size was not much larger than native compression, and the time required was much more in line as well.
(I'll be sharing my workload scripts with Red Gate, and can publish them here on request as well.)
Again I did not perform any tracing, but I did pay a little more attention to spot checking Task Manager to see the impact on CPU. During INSERT/UPDATE operations the CPU was running pretty hot; the biggest consumers were (predictably) sqlservr.exe and HyperBacSrv.exe:
I then discovered that the HyperBacSrv.exe process utilizes a substantial amount of CPU even when there is no activity against SSC compressed databases; it utilized anywhere from 12 to 60% of the CPU while a workload was running against a non-compressed database. I'm not saying this is alarming, just something to watch for when running your own tests. The CPU overhead of native compression and how it affects sqlservr.exe still appears to be in the 5% range for my scenario – but again I will need to run more elaborate trace and perfmon tests to provide more accurate overhead analysis.
SQL Storage Compress 5.0 seems to be a candidate worthy of testing. In environments with adequate CPU resources and where disk space and/or I/O are your primary bottlenecks, or where you don't have the ability to use native data compression, the product can fill a tangible gap. But as always, to determine if a product will be beneficial to your environment, I will always recommend that you perform your own benchmarks and are aware of the risks and overhead.