Data Compression in SQL Server 2008
This article explains the types of data compression introduced in SQL Server 2008, when each is useful, and how to estimate the effect of compressing a table object or an entire database.
Data compression is a newly enhanced facility in SQL Server 2008. It extends functionality that existed in SQL Server 2005, where the table-level vardecimal storage format could compress decimal and numeric column data; the 2008 version stretches this capability much further.
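For reference, the older SQL Server 2005 (SP2) vardecimal option was enabled per table with sp_tableoption; the database and table names below are illustrative:

```sql
-- Allow vardecimal storage in the database (SQL Server 2005 SP2 syntax)
EXEC sp_db_vardecimal_storage_format N'MyDatabase', N'ON';

-- Enable the vardecimal storage format on one table
EXEC sp_tableoption N'dbo.MyTable', 'vardecimal storage format', 1;
```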
So this compression is a big leap in 2008 and differs from ordinary file-level compression: if you enable data compression on a table or index, the underlying storage format changes depending on whether ROW or PAGE compression is applied. The compressed format is understood only by the SQL Server 2008 storage engine and is not available to previous versions.
In addition to offering support for NTFS file compression and vardecimal, SQL Server 2008 offers row-level compression and page-level compression (which includes row-level compression), AND it offers the ability to turn these on at the partition level or at the table level for all partitions. While I think the per-partition option is excellent, you might still want to separate your OLTP and read-only data into separate tables for other benefits (like online index operations, which I mentioned here); still, the "table-level only" restrictions are certainly starting to disappear. And more granular options always mean better manageability.
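The options above map to straightforward DDL. These are sketches using illustrative table, index, and partition names; substitute your own objects:

```sql
-- Enable PAGE compression for an entire table
ALTER TABLE dbo.AuditLog REBUILD
WITH (DATA_COMPRESSION = PAGE);

-- Enable ROW compression on a single index
ALTER INDEX IX_AuditLog_EventDate ON dbo.AuditLog REBUILD
WITH (DATA_COMPRESSION = ROW);

-- Mix settings per partition on a partitioned table: compress the
-- older partitions, leave the volatile current partition alone
ALTER TABLE dbo.Orders REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = PAGE ON PARTITIONS (1 TO 3),
      DATA_COMPRESSION = NONE ON PARTITIONS (4));
```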
Deciding which data compression suits your requirements:
· Estimate the potential storage savings you could get by implementing compression (you could actually end up using MORE storage under certain circumstances).
· Complete a baseline performance analysis of your database server and reproduce it in a development or staging environment. Enable compression and evaluate performance against that baseline.
· Look at which tables can provide you the biggest benefit. We have found that an audit table can actually achieve 75% savings in disk space with no impact to the application on top of the database. Tables with a lot of repetitive or numeric data, or CHAR columns that are not fully populated, are usually excellent candidates for compression.
· Check and recheck your analysis and requirements against the baseline, and seek feedback from the users if you do implement it in a production environment or any existing system.
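As part of the baseline step above, it helps to record the current footprint of candidate tables before enabling anything; the table name here is illustrative:

```sql
-- Record the pre-compression size of a candidate table
-- (reserved, data, index, and unused space in KB)
EXEC sp_spaceused N'dbo.AuditLog';
```

Keeping these numbers lets you compare actual post-compression space against both the baseline and the estimates discussed below.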
Estimating the savings from data compression on a table object or database:
There is a stored procedure in SQL Server 2008 called sp_estimate_data_compression_savings. This procedure accepts 5 parameters.
The two critical parameters are @object_name and @data_compression.
The object name refers to the table that you wish to evaluate, and @data_compression can have one of three values (NONE, ROW, PAGE). Depending upon the value passed, the procedure performs estimations for the two compression types and, perhaps strangely, for no compression (useful for estimating the effect of removing compression from an already compressed object).
The additional parameters provide some more advanced estimation options. @schema_name allows you to estimate the savings against tables in a different schema (by default the procedure only looks at tables within your own schema). @index_id lets you estimate the savings for a single index on a table based upon the index id; left at the default value of NULL, it will assess all of the indexes on that table. @partition_number lets you target particular partitions of a table. This can be very useful for estimating the savings on older, rarely changed data living on one partition, which you may want to compress, as opposed to more volatile data on a different partition, which may not be a good candidate for compression. Worth noting: you have to specify an index id in order to evaluate a partition.
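Putting the five parameters together, a typical call looks like this (the schema and table names are illustrative):

```sql
-- Estimate how much space dbo.AuditLog would take under PAGE compression.
-- NULL index_id and partition_number cover all indexes and partitions.
EXEC sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'AuditLog',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = N'PAGE';
```

The result set reports current size and estimated compressed size per index/partition, so you can compare ROW versus PAGE by running the call once with each value.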
NOTE: The sp_estimate_data_compression_savings procedure is a bit of a heavy hitter, and if you check all tables in a database (or on an instance) using something like USP_Compression_Savings_By_DB, your system will be hit all the harder. I recommend running it against a recent backup of your production database restored onto another system. At the very least, run it during the period of least activity on your system.
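For illustration, a database-wide check can be as simple as cursoring over sys.tables and calling the estimate procedure for each one. This is a minimal sketch, not the actual USP_Compression_Savings_By_DB mentioned above:

```sql
-- Sketch: run the PAGE-compression estimate for every user table
-- in the current database (one result set per table).
DECLARE @schema sysname, @table sysname;

DECLARE tbl_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT s.name, t.name
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id;

OPEN tbl_cur;
FETCH NEXT FROM tbl_cur INTO @schema, @table;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sp_estimate_data_compression_savings
        @schema_name      = @schema,
        @object_name      = @table,
        @index_id         = NULL,
        @partition_number = NULL,
        @data_compression = N'PAGE';
    FETCH NEXT FROM tbl_cur INTO @schema, @table;
END;
CLOSE tbl_cur;
DEALLOCATE tbl_cur;
```

In practice you would insert each result set into a work table so the per-table numbers can be sorted and compared in one place.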