In-Memory Data Compression in .NET, PART II

By Peter A. Bromberg, Ph.D.


In the first article of this series, "In-Memory Data Compression in .NET", I illustrated the use of NZipLib, the 100% C# ZLib port by Mike Krüger of icSharpcode.net, to compress and decompress strings in memory. The next logical question is: "How do I send the stuff over the wire compressed, decompress it, do some work, and then recompress it and send back the response?" If you are not familiar with ZLib or data compression, I suggest you review the first article before reading this one.



As I mentioned in my first article, the big advantage of using a library like this to handle your streaming data compression and decompression is not just that the "price is right" (it's open source), but that it implements the formats defined in the ZLib RFCs. If there's an RFC-compliant ZLib engine on the other end, whether it is COM-based, Java-based, or on any other platform, it should be able to handle your compressed data perfectly well, because the compression headers and stream format should be identical.
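To make "identical headers" a little more concrete: a ZLib stream as defined in RFC 1950 begins with a two-byte header in which the low four bits of the first byte are 8 (meaning deflate) and the two bytes, read as a 16-bit number, are a multiple of 31. The following is only a sketch and is not NZipLib code. It assumes .NET 6 or later and substitutes the built-in System.IO.Compression.ZLibStream class for the NZipLib deflater, and the class name and sample text are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class ZlibHeaderCheck
{
    static void Main()
    {
        // Hypothetical sample payload
        byte[] raw = Encoding.UTF8.GetBytes("<PLAY><TITLE>Hamlet</TITLE></PLAY>");

        // Compress into the ZLib container format (RFC 1950 wrapping RFC 1951 deflate data).
        // ZLibStream is an assumption here: it stands in for NZipLib's DeflaterOutputStream.
        byte[] packed;
        using (var ms = new MemoryStream())
        {
            using (var z = new ZLibStream(ms, CompressionLevel.Optimal, leaveOpen: true))
            {
                z.Write(raw, 0, raw.Length);
            }
            packed = ms.ToArray();
        }

        int cmf = packed[0]; // compression method and flags
        int flg = packed[1]; // flags, including the header check bits

        Console.WriteLine("Compression method is deflate: " + ((cmf & 0x0F) == 8));
        Console.WriteLine("Header check passes: " + (((cmf << 8) | flg) % 31 == 0));
    }
}
```

Any engine that writes and reads this same header and deflate body should be able to exchange data with any other, which is the interoperability argument made above.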

So, let's get into some code. What I've done here is create a "Sender.aspx" page with two text boxes, very much analogous to the WinForm app that we saw in Part I, and a "receiver.aspx" page. I've set up the sender so that you have two buttons. The "Compress/Send" button does the compression and sets a QueryString variable "compressed" to "1". The "Send Uncompressed" button sets the "compressed" QueryString variable to "0". So with the same front-end form, we can either post our data the regular way, or we can post it compressed using an NZipLib deflater at the compression level entered on the form. I've also put in a Status label that shows the original size, the compressed size, and the round-trip elapsed time in milliseconds.

Here's the code for the "Sender.aspx" page:


SENDER.ASPX

<% @ Page Language="C#" %>
<% @Import Namespace="System" %>
<% @Import Namespace="System.Text" %>
<% @Import Namespace="System.IO" %>
<% @Import Namespace="System.Net" %>
<% @Import Namespace="NZlib.Streams" %>
<% @Import Namespace="System.Xml" %>

<script language="C#" runat="server">
private void Page_Load(object sender, System.EventArgs e)
{
}

public void Button1_Click(object sender, System.EventArgs e)
{
// Parse the level once; int.TryParse avoids an exception on non-numeric input
int level;
if(!int.TryParse(txtLevel.Text, out level) || level <1 || level >9)
{
lblStatus.Text = "Compression Level must be an integer between 1 and 9";
return ;
}

lblStatus.Text="Your Data: " +TextBox1.Text.Length.ToString();
System.DateTime sStartTime = System.DateTime.Now;
byte[] crunchedData=Compress(TextBox1.Text, level);
// Compress returns null on failure and has already written the error to lblStatus
if(crunchedData == null) return;
lblStatus.Text += " Compressed: " +crunchedData.Length.ToString();
System.Net.WebClient wc = new System.Net.WebClient();
// POST the compressed bytes; the receiver answers with compressed bytes
byte[] resp= wc.UploadData("http://localhost/compressor/receiver.aspx?compressed=1&level=" + level.ToString(), crunchedData);
string finalstuff= DeCompress(resp);
TimeSpan elapsed =System.DateTime.Now - sStartTime;
TextBox2.Text = finalstuff;
// Report milliseconds, as described for the Status label
lblStatus.Text += " Roundtrip elapsed time: " +elapsed.TotalMilliseconds.ToString() + " ms";
}

public void Button2_Click(object sender, System.EventArgs e)
{

lblStatus.Text="Your Data: " +TextBox1.Text.Length.ToString();
System.DateTime sStartTime = System.DateTime.Now;
System.Net.WebClient wc = new System.Net.WebClient();
byte[] bytData = System.Text.Encoding.ASCII.GetBytes(TextBox1.Text);
byte[] resp= wc.UploadData("http://localhost/compressor/receiver.aspx?compressed=0", bytData);

TimeSpan elapsed =System.DateTime.Now - sStartTime;
TextBox2.Text = System.Text.Encoding.ASCII.GetString(resp);
lblStatus.Text += " Roundtrip elapsed time: " +elapsed.TotalMilliseconds.ToString() + " ms";
}


// Compress Method - encodes the string as UTF-8 and deflates it at the given level (1-9)
private byte[] Compress(string strInput, int iCompLevel)
{
try
{
byte[] bytData = System.Text.Encoding.UTF8.GetBytes(strInput);
MemoryStream ms = new MemoryStream();
NZlib.Compression.Deflater defl = new NZlib.Compression.Deflater(iCompLevel);
Stream s= new DeflaterOutputStream(ms,defl);
//Stream s = new DeflaterOutputStream(ms);
s.Write(bytData, 0, bytData.Length);
s.Flush();
s.Close();
byte[] compressedData = (byte[])ms.ToArray();
return compressedData;
}
catch(Exception e)
{
lblStatus.Text+=" : " +e.Message;
return null;
}
}
// Decompress Method - inflates the bytes and decodes them as UTF-8 to match Compress
private string DeCompress(byte[] bytInput)
{
byte[] writeData = new byte[1024];
Stream s2 = new InflaterInputStream(new MemoryStream(bytInput));
MemoryStream output = new MemoryStream();

try
{
while (true)
{
int size = s2.Read(writeData, 0, writeData.Length);
if (size > 0)
{
// Collect the raw bytes first so a multi-byte character is never split across chunks
output.Write(writeData, 0, size);
}
else
{
break;
}
}
s2.Close();

return System.Text.Encoding.UTF8.GetString(output.ToArray());
}
catch(Exception e)
{
lblStatus.Text+=": " +e.Message;
return null;
}
}
</script>

<HTML>
<body MS_POSITIONING="GridLayout">
<CENTER><h3>Paste xml or text in top TextArea and press Compress/Send Button</h3></CENTER>
<form id="Form1" method="post" runat="server">
<asp:TextBox id=TextBox1 style="Z-INDEX: 101; LEFT: 63px; POSITION: absolute; TOP:86px" runat="server" Width="544px" Height="249px" TextMode="MultiLine"></asp:TextBox>
<asp:TextBox id=TextBox2 style="Z-INDEX: 101; LEFT: 63px; POSITION: absolute; TOP: 420px" runat="server" Width="544px" Height="249px" TextMode="MultiLine"></asp:TextBox>
<asp:Button id=Button1 style="Z-INDEX: 102; LEFT: 69px; POSITION: absolute; TOP: 350px" runat="server" Width="102px" Text="Compress/Send" OnClick="Button1_Click" />
<asp:Button id=Button2 style="Z-INDEX: 102; LEFT: 178px; POSITION: absolute; TOP: 350px" runat="server" Width="152px" Text="Send Uncompressed" OnClick="Button2_Click" />
<asp:TextBox id=txtLevel style="Z-INDEX: 102; LEFT: 338px; POSITION: absolute; TOP: 350px" runat="server" Width="15px" Text="1" />
<asp:Label id=txtLevLbl style="Z-INDEX: 102; LEFT: 358px; POSITION: absolute; TOP: 350px" runat="server" Width="152px" Text="Comp Level(1-9)" />
<asp:Label id=lblStatus style="Z-INDEX: 103; LEFT: 73px; POSITION: absolute; TOP: 380px" runat="server" Width="516px" Height="21px"></asp:Label>
</form>
</body>
</HTML>

As you can see, we are using the System.Net.WebClient class, a simplified class whose UploadData method is very easy to use. It sends a byte array - exactly what we want, and expects back a byte array - also exactly what we want. In this manner, we don't need to fool with streams and a bunch of extra code (I'm a big fan of simplicity). And as you can see, .NET makes converting a byte array to a string as easy as falling off a log.

And now here is the Receiver.aspx page code:

RECEIVER.ASPX

<% @ Page Language="C#" %>
<% @Import Namespace="System" %>
<% @Import Namespace="System.Text" %>
<% @Import Namespace="System.IO" %>
<% @Import Namespace="System.Net" %>
<% @Import Namespace="NZlib.Streams" %>
<% @Import Namespace="System.Xml" %>
<script Language="C#" runat="server">

void Page_Load(){
if("0" !=Request.QueryString["compressed"] )
{
// Use the requested level; fall back to 1 if "level" is missing or outside 1-9
int iCompLevel;
if(!int.TryParse(Request.QueryString["level"], out iCompLevel) || iCompLevel <1 || iCompLevel >9)
{
iCompLevel=1;
}
byte[] bytData = Request.BinaryRead(Request.TotalBytes);
string strDecompressedData=DeCompress(bytData);
// do some work here before recompressing response---
strDecompressedData = "Here is your result: " +strDecompressedData;
byte[] bytResultData=Compress(strDecompressedData, iCompLevel);
Response.BinaryWrite(bytResultData);
}
else
{
// Uncompressed path: echo the posted text back without compression
Response.Write ("Here is your result: " +System.Text.Encoding.ASCII.GetString(Request.BinaryRead(Request.TotalBytes)));
}
}

// Compress Method - encodes the string to bytes and deflates it at the given level (1-9)
private byte[] Compress(string strInput, int iCompLevel)
{
try
{
byte[] bytData = System.Text.Encoding.UTF8.GetBytes(strInput); // UTF-8, the same encoding the sender's Compress uses
MemoryStream ms = new MemoryStream();
NZlib.Compression.Deflater defl = new NZlib.Compression.Deflater(iCompLevel);
Stream s = new DeflaterOutputStream(ms,defl);
//Stream s = new DeflaterOutputStream(ms);

s.Write(bytData, 0, bytData.Length);
s.Flush();
s.Close();
byte[] compressedData = (byte[])ms.ToArray();
return compressedData;
}
catch(Exception e)
{
return null;
}
}

// Decompress Method - inflates the bytes and decodes them as UTF-8, matching the sender's Compress
private string DeCompress(byte[] bytInput)
{
byte[] writeData = new byte[4096];
Stream s2 = new InflaterInputStream(new MemoryStream(bytInput));
MemoryStream output = new MemoryStream();

try
{
while (true)
{
int size = s2.Read(writeData, 0, writeData.Length);
if (size > 0)
{
// Collect the raw bytes first so a multi-byte character is never split across chunks
output.Write(writeData, 0, size);
}
else
{
break;
}
}
s2.Close();
return System.Text.Encoding.UTF8.GetString(output.ToArray());
}
catch(Exception)
{
// No status label on this page; the caller receives null on failure
return null;
}
}
</script>

The code should be pretty much self-explanatory. You will note that I've put in some "conveniences" in the form of the QueryString variables "compressed" and "level". Instead of using the default deflater as in the previous article, I'm creating the Deflater object separately so that you can play around with different compression levels. In most of the production work I've done compressing XML documents, the lower compression levels (1 to 3) give nearly the same compression ratios as the higher ones but are much faster, so your overall roundtrip time will be lower. You'll need to have the NZipLib.dll assembly in the /bin folder directly under the IIS application folder where you place the aspx files.
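If you want to check the speed-versus-ratio tradeoff on your own documents, a rough comparison can be sketched along the following lines. This is not the NZipLib code from the article: it assumes .NET 6 or later and uses the built-in System.IO.Compression.ZLibStream, which offers named levels (Fastest, Optimal, SmallestSize) rather than NZipLib's numeric 1 to 9. The class name and the repetitive sample text are hypothetical; substitute a real XML document to get meaningful numbers.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Text;

static class LevelComparison
{
    // Compress UTF-8 text into the ZLib format at the requested level.
    static byte[] Compress(string text, CompressionLevel level)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var ms = new MemoryStream())
        {
            // Disposing the ZLibStream flushes the final block into ms.
            using (var z = new ZLibStream(ms, level, leaveOpen: true))
            {
                z.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    static void Main()
    {
        // Hypothetical sample: a repetitive XML-like document.
        var sb = new StringBuilder();
        for (int i = 0; i < 20000; i++)
        {
            sb.Append("<LINE n=\"").Append(i).Append("\">To be, or not to be: that is the question</LINE>\n");
        }
        string sample = sb.ToString();
        Console.WriteLine("Original: " + Encoding.UTF8.GetByteCount(sample) + " bytes");

        var levels = new[] { CompressionLevel.Fastest, CompressionLevel.Optimal, CompressionLevel.SmallestSize };
        foreach (CompressionLevel level in levels)
        {
            var sw = Stopwatch.StartNew();
            byte[] packed = Compress(sample, level);
            sw.Stop();
            Console.WriteLine("{0,-12} {1,8} bytes  {2} ms", level, packed.Length, sw.ElapsedMilliseconds);
        }
    }
}
```

Timings from a single run like this are noisy, so treat them as an indication only and repeat the measurement a few times before drawing conclusions.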

One important note: I've confirmed with Mike that there was a bug in the original Java ZLib implementation that he ported to C#: using deflater compression levels less than 5 results in an error. Mike has not only fixed this bug, but has also notified the original Java developers of the error. Until Mike has a chance to update his distribution at icsharpcode.net, the NZipLib.dll that is included in this download is the only way for you to get this "fixed" version!

How much more scalability will Data Compression buy me?

It really depends on what you are doing with the compression. I did some rather unscientific tests by using the Sender.aspx page above sitting on my local machine at my office, and sending and receiving the compressed data from a receiver.aspx page sitting on our server at www.eggheadcafe.com. I sent Jon Bosak's XML rendition of "Hamlet", which is 288,777 bytes long. At the receiver, we just prepend the string "Here is your result: ", recompress, and send it back. Since this is all going out and coming back through a corporate firewall, there are of course a lot of variables in the equation. But the best time I got sending and receiving Hamlet UNCOMPRESSED was 1 minute 28 seconds. The best time I got with compression on both ends was 10.43 seconds, roughly eight times faster! Your mileage may vary, but I would venture a guess that if your SOAP Webservice or other .NET application needs to send fairly large XML or textual documents back and forth, looking into using data compression may be well worth the extra effort.

Using this code, or your own concoction somewhat similar to it, you should be able to compress, send, decompress on the server, do your stuff, compress the result, send it back, and decompress and use it on the client with ASP.NET. If you want, you can even embed the compressed data in SOAP messages and use SOAP extensions to intercept the message stream and do your stuff. If you get any new ideas or have more suggestions on the use of compression in .NET, please feel free to post them on one of our forums here at www.eggheadcafe.com. And don't forget my last installment, "In-Memory Data Compression in .NET, Part III".
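The whole compress, send, decompress, work, recompress, return cycle can also be tried without a web server. The sketch below simulates it in memory. It is an illustration under stated assumptions rather than the article's code: it assumes .NET 6 or later, substitutes the built-in System.IO.Compression.ZLibStream for NZipLib, and uses a hypothetical HandleRequest method to play the part of receiver.aspx.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class RoundTripSketch
{
    // Encode as UTF-8 and compress into the ZLib format.
    static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var ms = new MemoryStream())
        {
            using (var z = new ZLibStream(ms, CompressionLevel.Fastest, leaveOpen: true))
            {
                z.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    // Inflate all bytes first, then decode once as UTF-8.
    static string Decompress(byte[] packed)
    {
        using (var z = new ZLibStream(new MemoryStream(packed), CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            z.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    // Hypothetical stand-in for receiver.aspx: compressed bytes in, compressed bytes out.
    static byte[] HandleRequest(byte[] requestBody)
    {
        string text = Decompress(requestBody);
        string result = "Here is your result: " + text; // the "work" step
        return Compress(result);
    }

    static void Main()
    {
        // Hypothetical sample payload
        string original = "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>Words, words, words.</LINE></SPEECH>";

        byte[] request = Compress(original);      // client compresses
        byte[] response = HandleRequest(request); // server decompresses, works, recompresses
        string reply = Decompress(response);      // client decompresses

        Console.WriteLine(reply);
        Console.WriteLine("Request bytes: " + request.Length + ", response bytes: " + response.Length);
    }
}
```

In a real deployment the call to HandleRequest would be replaced by an HTTP POST, as the UploadData call in Sender.aspx does.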

Download the code that accompanies this article

 

Peter Bromberg is an independent consultant specializing in distributed .NET solutions in Orlando and a co-developer of the NullSkull.com developer website. He can be reached at info@eggheadcafe.com