August 26, 2010 | SQL Server

First Look : Red Gate SQL Storage Compress 5.0

Recently, Red Gate Software released a new version of their product called SQL Storage Compress 5.0.  This is an alternative to using the native Data Compression and Backup Compression features introduced in SQL Server 2008.  In order to utilize Data Compression, you must be running Enterprise Edition or above (and yes, for the nit-pickers, I know this includes Developer and Evaluation, but these aren't real production editions).  For Backup Compression, this was relaxed in SQL Server 2008 R2 to also include Standard Edition, however Data Compression remains a big ticket item.  So certainly on the surface, SQL Storage Compress seems to be a decent solution for people who:

(a) run SQL Server Standard Edition, and can't take advantage of the native data compression features;
(b) have a mix of editions, and want to use the same compression solution everywhere; or,
(c) commonly perform backup/restore operations between different editions.

Setup

Setting it up is quite easy – you run the installer, it installs a service and a configuration tool, and basically you only have to change a few things in your normal routines in order to take advantage of compression from the product.

  • If you want to compress data (resulting in smaller MDF/NDF files), you simply have to create or restore databases with .mdfx / .ldfx / .ndfx extensions instead of .mdf / .ldf / .ndf.  The background service will actively compress these data files in real time.
     
  • If you want to compress backups (resulting in smaller backup files), you simply have to change your native backup statements to save to .hbc or .hbc2 filenames instead of the traditional .BAK extension.  (Alternatively, you can configure the service to also compress .BAK files, however I don't recommend this as it could become confusing in your environment.)  For anyone who has used HyperBac, some of this may sound familiar, and that is because Red Gate acquired HyperBac earlier this year.  In fact much of the branding remains:


     

  • SQL Storage Compress also supports encryption options, but I'll leave those for another day.

At first I was surprised that nothing really happened at the end of setup, except that the activation screen was raised.  To get to the configuration tool, go to Start > Programs > Red Gate > HyperBac > HyperBac Configuration Manager.  This is where you can change settings, most commonly which extensions should actively be compressed by the service:

On the other tabs, you can also start and stop the HyperBac service, inspect its status, and configure locations for log and index files.  Red Gate has published a detailed walk-through which helps you learn how to utilize the product once you have it up and running.
 

The Tests

To keep things simple, I wanted to create new databases from scratch, mark them as compressed, and then populate them with data.  I wanted to measure the following aspects:

  • speed of initial load operations
  • speed of read operations
  • speed of write operations
  • size of data and log files
  • size of backup files

Of course I wanted to compare these metrics using both native and Red Gate compression, and use a baseline where no compression was used at all.

Rather than repeat all of the boring test setup and sample data, I will point you to this post of mine from last year:

/blogs/aaron_bertrand/archive/2009/08/23/sql-server-2008-r2-digging-deeper-into-unicode-compression.aspx

The tests I ran this week were essentially identical to the tests from the above blog post;there were a few exceptions, for example to set up the Red Gate compressed database, I had to restore an existing backup using WITH MOVE to .  I haven't yet tested whether the service works equally on a database where the CREATE DATABASE statement uses the modified extensions; I am not 100% clear on whether the service only listens for RESTORE statements and doesn't pay attention to CREATE DATABASE.  More tests later.

The machine, also, is the same – an 8GB VM with 4 cores.  The hardware isn't an essential detail of the tests, because they're all running on the same box – but keep in mind that these are local virtual disks, so the I/O certainly doesn't represent what you might be dealing with on a real server with a decent disk subsystem.  As always, I recommend you do your own performance tests in your own environment!
 

The Results

For the speed tests, I divided the timing of the results into the three major operational categories: initial load, read, and write, and then stacked these to show overall duration.  From the below graph you can see that, as you might expect, the fastest overall speed for the workload was in the database with no compression.  For the database using SQL Storage Compress, the initial data load was slower than native compression, but subsequent operations were faster.

 
<img src="https://sqlblog.org/wp-content/uploads/2018/01/28289_Screen20shot202010-08-2620at2011.29.2320AM.png" width="480" border="0" height="289">
 <i>Duration, in seconds, of identical workload replay</i>

Next I compared the data and log file sizes.  SQL Storage Compress certainly squeezed the same data into the smallest data file, by almost 50% compared to native data compression.  However, this came at quite a cost in log space: the same workload produced almost 300 MB of log compared to about 40 MB of log in both the native compressed and non-compressed databases:

 
<img src="https://sqlblog.org/wp-content/uploads/2018/01/28289_Screen20shot202010-08-2620at2011.29.3420AM.png">
 <i>Size, in MB, of compressed and non-compressed data and log files</i>

Finally, I ran some backups to test eventual backup size.  I ran four different tests: simple backup with no compression, a simple backup with native compression, and the two HyperBac compression modes (normal and fast).  The timings for these backups were irrelevant, because they all completed within one or two seconds (just not enough data to perform a meaningful comparison).  But the resulting sizes had a few surprises:

 
<img src="https://sqlblog.org/wp-content/uploads/2018/01/28289_Screen20shot202010-08-2620at2011.29.0820AM.png" width="482" border="0" height="319"> 
 <i>Backup size, in MB, of compressed and non-compressed data</i>

Note that even though the original size of the SQL Storage Compressed database was the smallest, taking a normal, non-compressed backup actually yields a much larger file – relatively equivalent to a non-compressed backup of non-compressed data.  As of SQL Server 2008 R2, backup compression is now included in Standard Edition and above, so whether or not you have SQL Storage Compress installed, most of you will be able to compress your backups in one way or another – so this is unlikely to hit anyone in practice.  Without thinking too deeply about what is in this backup file, the result is peculiar to me.  Other than that, the chart speaks for itself – for compressed or non-compressed data, native backup compression and HyperBac compression provide almost identical compression ratios.  Native compression has the edge if you are in a situation where you need to restore databases to servers without the SQL Storage Compress service running.  I hope to do more tests in the future comparing speed and resource usage as well as size on much larger databases.
 

Caveats

As mentioned before, this was a pretty basic test on low-end hardware.  This is in a virtual machine with a single C:\, so the lack of I/O distribution probably impacted both solutions negatively when compared to a real-world scenario.  Could the SQL Storage Compress performance been optimized?  Possibly, especially if I had consulted Red Gate.  In general, though, the test was apples-to-apples… and it was quite true to what a typical user might experience when trying out the product for the first time.  I'll have to put some thought into making some higher-octane hardware available for further and more realistic performance tests.
 

Questions

As I was running the tests, some questions occurred to me.  Three I have answers for; one I do not.

  1. What if I only want to compress some of my data?

    In the native Data Compression solution, you can pick and choose which objects you want to use compression, and you can turn these on and off more or less ad hoc as well.  You can do this with SQL Storage Compress as well, but you need to be much better prepared up front: since the Red Gate solution works on a file basis, you would have to place objects on specific physical files within different filegroups in order to compress some objects but not others.  This is another thing I plan to play with in future tests.
     

  2. What if my trial expires?

    Here they explain how to recover your database if you have compressed it and the trial has expired:

            http://www.red-gate.com/SQLStorageCompress/ExpiredTrial

  3. What if the service stops?

    If the background service stops, your database will become inaccessible.  Personally I think there should be a much bigger warning on the Stop button in the Configuration Manager.  If you stop the service and re-start it, you will get this error message if you try to use the database:

    Msg 945, Level 14, State 2, Line 1
    Database 'RGCompression' cannot be opened due to inaccessible files or insufficient memory or disk space.  See the SQL Server errorlog for details.

Management Studio still shows the database in the tree, but it is offline (not explicitly, but for all intents and purposes):

       
        A compressed database in SSMS if the SQL Storage Compress service has stopped

And as the error message suggests, there are plenty of errors in the SQL Server error log, all of which – at least on first glance – point to "database gone, no paddle" (click to embiggen):

       
        Error log after SQL Storage Compress service stops (click to embiggen)

This aspect looks pretty brittle to me so far – I'll have a follow-up post that explains how to recover your databases should this happen to you.

  • Does the HyperBac service need to be a dependency in a cluster?

    I would expect so, and if it isn't a requirement, it is probably still a good idea, especially given the above.  But I did not have a cluster available to run these initial tests.  If you have any information about this, please let me know.

  • Conclusions

    I've either realized or further cemented a few ideas in the process of conducting these tests.  In no particular order:

    • Whether or not you use native or 3rd party compression for your data files, you should always use backup compression if possible.  You get tremendous space savings, at a very low cost of additional overhead – and in all the tests I've performed with various products, almost always faster.  Again it is not really important whether you use the native backup compression solution or a 3rd party product, as long as you can restore the database on all the required instances.
       
    • SQL Trace can be a real pain sometimes.  I set up several traces so that I could include more pretty graphs, but they all ended up with 2 useless rows.  I obviously boneheaded something, but was too tired to start over.  I should probably have SQL Sentry installed in this VM.  Again, something to work out for more elaborate tests in the future.
       
    • SQL Storage Compress 5.0 can provide some advantages over native compression methods.  So, it could be a viable alternative for those running Standard Edition and lower, and even for those with the ability to use native compression, depending on priorities.  I'll mention again that you shouldn't take my graphs above as gospel; if you want to evaluate how the solution works for you, I strongly recommend you download the 14-day trial edition and conduct your own tests.  For backup compression specifically, you should also consider the following products:
    • The two issues with this product that stand out to me are:
       
      • Why are the log files so big?
      • How do we recover a database if the service has stopped?

      EDIT: I address these questions to some degree in a follow-up post.

     

    6 comments on this post

      • AaronBertrand - August 26, 2010, 9:54 PM

        I've asked a question on Red Gate's support forum, where they're usually pretty good about supplying honest, non-marketing-y answers:
        http://www.red-gate.com/MessageBoard/viewtopic.php?p=42239

      • Jeff Aven - August 27, 2010, 3:01 AM

        Great article Aaron,
        SQL Storage Compress will listen for all activity to the designated file(s) including CREATE DATABASE as well as RESTORE, SELECT, INSERT, UPDATE,…  In the case of CREATE DATABASE, if you create a new 1TB database, the size reported by SQL will be 1TB but the physical data file space will be only about 16MB (until you populate the database – at which time the data will be compressed as it is entered).
        The log file should be compressed as well so this would need to be investigated, you can check this by taking the database offline and getting the meta data and compression properties on the log file by using the HyperBac WinExtractor or hyperutil.exe utilities.
        Also regarding recovery should the service stop, you can simply take the databases offline, decompress them and attach them as native files if necessary.  Furthermore you can perform BACKUP operations of the compressed DB and then RESTORE WITH MOVE back to native, uncompressed data/log files.
        SQL Storage Compress adheres to all of the durability standards of SQL Server so it is designed to provide the same integrity and consistency as you would get natively if you encountered a hardware/disk failure for example.

      • William Ross - August 27, 2010, 3:54 PM

        I thought that compressed data moved faster, not slower than uncompressed data. It seems then that you are trading performance for disk space? That seems to only make sense, then on a backup, yes?

      • AaronBertrand - August 27, 2010, 4:42 PM

        William, on all of the tests I've done with compressed backups, I enjoyed faster *and* smaller backups.  The trade-off cost was CPU, not time – and we typically have more CPU to spare than I/O, disk space, or time.
        That said, in the case above, if you look at the duration chart you'll see that while SQL Storage Compress had slower initial load speeds, the select and update were as fast or faster than both native compression and no compression.  Now, this was not a very thorough test, just a simple workload replay, and I am still working with Jeff on determining whether the high log activity contributed to the slow initial load.  As I mentioned multiple times, do not come to broad conclusions based on my one test.

      • TheSQLGuru - August 29, 2010, 4:58 PM

        Aaron, I have been loving the Hyperbac product(s) for a LONG time now, and with the recently introduces Virtual Restore as well as the new Storage Compress capabilities this product is probably a no-brainer for the vast majority of SQL Server customers out there.  The CPU-processing-power-to-IO-throughput gap just keeps getting wider (at least until everyone moves to SSDs) and Hyperbac can go a long way to mitigating that gap.  

      • Robert - September 28, 2010, 2:39 PM

        Hello Aaron. You asked about dependencies on a cluster. I installed Storage Compress on our test cluster, and no cluster resources where added, only a local service on the individual nodes. I reckon you have to add this service as a generic cluster resource just to be safe. maybe they'll add it in storage compress 6…

    Comments are closed.