A Comparison of Managed Compression Algorithms

A test suite comparing various managed compression algorithms

I often use compression in various .NET projects. One example is my FileCache project; another is my Compressed Cookies project.

The concept here is that when you need to save a cached item as a file, a compression algorithm with the right balance of speed and compression level may make it faster to compress and save the file - because of its smaller size - than to save the entire file uncompressed. Even if the time with compression is about the same, it may be preferable to use it in order to conserve disk space. Of course, we also need to look at decompression speed, since each time we use one of these compressed files it must be decompressed first.
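The compress-then-save idea can be sketched with the framework's own DeflateStream (this is an illustrative helper I'm naming here, not part of any of the tested libraries):

```csharp
using System.IO;
using System.IO.Compression;

public static class CachedFileWriter
{
    // Illustrative: compress a byte array with Deflate and write it to disk.
    public static void SaveCompressed(string path, byte[] data)
    {
        using (FileStream fs = File.Create(path))
        using (DeflateStream ds = new DeflateStream(fs, CompressionMode.Compress))
        {
            // Disposing the DeflateStream flushes the final compressed block
            ds.Write(data, 0, data.Length);
        }
    }

    // Read a file saved by SaveCompressed and inflate it back to the original bytes.
    public static byte[] LoadCompressed(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        using (DeflateStream ds = new DeflateStream(fs, CompressionMode.Decompress))
        using (MemoryStream ms = new MemoryStream())
        {
            ds.CopyTo(ms); // Stream.CopyTo requires .NET 4.0 or later
            return ms.ToArray();
        }
    }
}
```

Whether this beats writing the raw bytes depends on the algorithm's speed versus your disk's throughput - which is exactly what the tests below try to measure.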

The same is true with data sent over the wire - in many cases, even though content is compressed at one end, transmitted, then decompressed at the other end, it can be faster than sending uncompressed data. This is true with TCP, HTTP, and various implementations of WCF.

I analyzed several managed compression algorithms by setting up a test. The test app downloads an uncompressed Word test file that is over 2 MB in size. This is stored in a byte array, and then the same original is used to test both the timing and the compressed size of the different compression algorithms, including the managed Deflate algorithm that is included with the .NET Framework.

At one point I included 7Zip Managed (LZMA), but it was so slow in comparison to the others that I decided to leave it out of the test. 7Zip is capable of superior compression levels, but it can also be very slow.

I also tested SharpZipLib - a managed implementation that has been around for some years, and again, the results were not so great. Additionally, SharpZipLib is a pretty big library with a lot of features and class files - too big for simple in-memory compression and decompression. But, I've included it here for the sake of completeness.

Additionally, I worked with DotNetZip, but again, this is a complex library that is really designed to work with Zip files - not necessarily to compress and decompress in-memory data "on the fly". But I did include the Zlib-only project, and it's in the downloadable source zip file in its own folder. I also created a ZlibHelper static class for the tests.

The candidates that I included are:

DotNetZip - This is a full-featured, multipurpose Zip library that supports Zlib in-memory compression/decompression. NOTE: I would be careful with this. Using the static Zlib CompressBuffer and UncompressBuffer methods, I would get a BADMODE check error even though the only parameters are a byte array in and a byte array out. This suggests to me that the library may not be bug-free.

MiniLZOPort - This is a safe port of MiniLZO by Owen Emlen that is used in the AltSerializer project on CodePlex. While it is probably somewhat slower than the original MiniLZO implementation, it can be used in platforms like Silverlight, which makes it a welcome addition.

QuickLZSharp - Written by Lasse Mikkel Reinhold, this is a managed, safe C# port of QuickLZ. Only a subset of the C library has been ported, namely compression level 1 in non-streaming mode.

ManagedQLZ - Written by Shane Bryldt, this is a fully ported C# implementation of QuickLZ, but one that uses unsafe code.

Deflate - This is the standard Microsoft implementation from System.IO.Compression; Deflate is the same underlying algorithm used in Zip files. Friend and fellow MVP Rene Schulte tells me that Microsoft is working on a fully Zip-compatible offering for the next version of the .NET Framework.

SharpZipLib - the original, well-known compression port by Mike Krueger and friends that was developed as part of their open-source SharpDevelop project.
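For reference, the DotNetZip buffer helpers mentioned above are one-liners (a sketch based on the Ionic.Zlib API; as noted, I hit errors with these methods in my own testing, so verify the round trip against your data):

```csharp
using Ionic.Zlib;

// Static helpers on Ionic.Zlib.ZlibStream: byte[] in, byte[] out
byte[] compressed = ZlibStream.CompressBuffer(initialData);
byte[] restored = ZlibStream.UncompressBuffer(compressed);
```

The streaming ZlibStream approach shown in the test code below avoids these helpers entirely.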

Results? Of course you want to see the results first:



NOTE: Times may vary from one test run to another, depending on whether the test is run from within Visual Studio or from a release-build executable.

DotNetZip, which I originally did not include in the tests, is shown here. Decompression is very fast, but it certainly wasn't the winner.

Among the remaining single-class implementations, it is easy to see that QuickLZSharp is the fastest, providing the best combination of competitive compression ratios and speed. Only the managed implementation of QuickLZ is even close. The others are so slow as to be virtually unusable by comparison, except perhaps for MiniLZO which is still in the ballpark and would be a good candidate for a Silverlight or WP7 project.

I think these results are revealing, since a lot of developers use some compression class or library without really putting it to the comparison test. And remember, what I'm comparing here is just being able to compress a byte array and get back a compressed byte array, which then can be stored in a class property of type byte[], saved to the file system, sent over the wire, or stored in a database. We're not talking about full WinZip type files with zip headers and so on.
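To be concrete, the kind of operation being timed is just this in-memory round trip (a minimal sketch using the framework's DeflateStream; no zip headers or file structure involved):

```csharp
using System.IO;
using System.IO.Compression;

public static class ByteCompressor
{
    // byte[] in, compressed byte[] out - suitable for a class property,
    // a database blob, or sending over the wire.
    public static byte[] Compress(byte[] input)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // leaveOpen: true so we can read ms after the Deflate stream is closed
            using (DeflateStream ds = new DeflateStream(ms, CompressionMode.Compress, true))
            {
                ds.Write(input, 0, input.Length);
            }
            return ms.ToArray();
        }
    }
}
```

Each library in the test performs this same conceptual operation; only the algorithm and API shape differ.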

If you are looking for a small, easy-to-use class that provides both reasonably fast compression and excellent decompression speed even on large files, I'd say you would want to go with QuickLZSharp here, although ManagedQLZ, its sister, provides a comparable compression ratio combined with better decompression speed. And if you really want to test for speed and make a choice that fits your particular implementation, don't just test with one file. Run tests with different sources of a size and composition similar to what your application will actually be working with.

NOTE: I originally started out with a PDF file, but an astute tweeter pointed out that PDFs are already compressed. So I switched to a nice big uncompressed Word document, courtesy, of course, of our ever-efficient U.S. Government. I am sure that the Government worker who put this 2.5 MB uncompressed Word sample business plan document up for download probably thought they were doing every taxpayer a big favor (Bandwidth? What's that - we're the Government - whee!).

Here is the test code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using ICSharpCode.SharpZipLib.Zip.Compression;
using Ionic.Zlib;
using QuickLZSharp;
using ManagedQLZ;
using System.IO;
using System.Diagnostics;
using QuickLZ = ManagedQLZ.QuickLZ;
using BrainTechLLC;
using ICSharpCode.SharpZipLib;
using CompressionMode = System.IO.Compression.CompressionMode;
using DeflateStream = System.IO.Compression.DeflateStream;

namespace QuickLzTests
{
    class Program
    {
        // Test file is a 2+MB Word Document
        private static readonly string filePath = "http://www.usbr.gov/recreation/publications/02_SampleBusinessPlan.doc";

        static void Main(string[] args)
        {
            Console.WriteLine("Downloading test doc...");
            WebClient wc = new WebClient();
            byte[] initialData = wc.DownloadData(filePath);
            wc.Dispose();

            Console.WriteLine("Initial size: " + initialData.Length.ToString());
            Console.WriteLine("Times are in ms. Sizes are in bytes.");
            Stopwatch sw = new Stopwatch();

            sw.Start();
            MemoryStream ms1 = new MemoryStream();
            Ionic.Zlib.ZlibStream inputStream = new ZlibStream(ms1, Ionic.Zlib.CompressionMode.Compress, CompressionLevel.Level5);
            inputStream.Write(initialData, 0, initialData.Length);
            // The stream must be closed to flush the final compressed block before reading the buffer
            inputStream.Close();
            byte[] results8 = ms1.ToArray();
            sw.Stop();

            Console.WriteLine("Compress with DotNetZip: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("Compressed size: " + results8.Length.ToString());

            sw.Reset();
            sw.Start();
            // Wrap the compressed bytes in a stream and read the decompressed output from it
            MemoryStream ms2 = new MemoryStream(results8);
            Ionic.Zlib.ZlibStream outputStream = new ZlibStream(ms2, Ionic.Zlib.CompressionMode.Decompress);
            MemoryStream inflated = new MemoryStream();
            byte[] readBuffer = new byte[4096];
            int read;
            while ((read = outputStream.Read(readBuffer, 0, readBuffer.Length)) > 0)
            {
                inflated.Write(readBuffer, 0, read);
            }
            byte[] results9 = inflated.ToArray();
            sw.Stop();

            Console.WriteLine("Decompress with DotNetZip: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("=========================================================");

            sw.Reset();
            sw.Start();
            byte[] result = QuickLZ.Compress(initialData, 0, (UInt32)initialData.Length);
            sw.Stop();
            Console.WriteLine("Compress with ManagedQLZ: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("Compressed size: " + result.Length.ToString());

            sw.Reset();
            sw.Start();
            byte[] decomp = QuickLZ.Decompress(result, 0);
            sw.Stop();
            Console.WriteLine("Decompress with ManagedQLZ: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("=========================================================");

            sw.Reset();
            sw.Start();
            byte[] result2 = QuickLZSharp.QuickLZ.compress(initialData);
            sw.Stop();
            Console.WriteLine("Compress with QuickLZSharp: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("Compressed size: " + result2.Length.ToString());

            sw.Reset();
            sw.Start();
            byte[] decomp2 = QuickLZSharp.QuickLZ.decompress(result2);
            sw.Stop();
            Console.WriteLine("Decompress with QuickLZSharp: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("=========================================================");

            sw.Reset();
            sw.Start();
            byte[] result3 = initialData.Compress();
            sw.Stop();
            Console.WriteLine("Compress with MiniLZO: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("Compressed size: " + result3.Length.ToString());

            sw.Reset();
            sw.Start();
            byte[] decomp3 = result3.Decompress();
            sw.Stop();
            Console.WriteLine("Decompress with MiniLZO: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("=========================================================");

            sw.Reset();
            sw.Start();
            MemoryStream ms = new MemoryStream();
            System.IO.Compression.DeflateStream deflateStream = new DeflateStream(ms, CompressionMode.Compress, true);
            deflateStream.Write(initialData, 0, initialData.Length);
            // Close to flush the final block; ms stays open because leaveOpen is true
            deflateStream.Close();
            byte[] result4 = ms.ToArray();
            sw.Stop();
            Console.WriteLine("Compress with Deflate: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("Compressed size: " + result4.Length.ToString());

            sw.Reset();
            sw.Start();
            ms.Position = 0;
            DeflateStream zipStream = new DeflateStream(ms, CompressionMode.Decompress);
            byte[] bytes = new byte[initialData.Length];
            int numBytesToRead = (int)initialData.Length;
            int numBytesRead = 0;
            while (numBytesToRead > 0)
            {
                // Read at the current offset so earlier chunks are not overwritten
                int n = zipStream.Read(bytes, numBytesRead, Math.Min(1024, numBytesToRead));
                if (n == 0)
                {
                    break;
                }
                numBytesRead += n;
                numBytesToRead -= n;
            }
            zipStream.Close();
            sw.Stop();

            Console.WriteLine("Decompress with Deflate: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("=========================================================");

            sw.Reset();
            sw.Start();
            byte[] results6 = SharpZipLibCompression.Compress(initialData);
            sw.Stop();
            Console.WriteLine("Compress with SharpZipLib: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("Compressed size: " + results6.Length.ToString());

            sw.Reset();
            sw.Start();
            byte[] results7 = SharpZipLibCompression.DeCompress(results6);
            sw.Stop();

            Console.WriteLine("Decompress with SharpZipLib: " + sw.ElapsedMilliseconds.ToString());
            Console.WriteLine("=========================================================");

            Console.WriteLine("Any key to quit.");
            Console.ReadLine();
        }
    }
}

You can download the complete test project (Visual Studio 2010), which includes a self-contained class file for each of the first three algorithms, and a separate copy of just the DotNetZip Zlib library.


By Peter Bromberg