Lost in Space Forever
[Compressed Binary DataSet Serialization for Compact Framework]
By Peter A. Bromberg, Ph.D.


Catchy title, eh? Dr. Dotnetsky told me to use it, and it's kind of a funny success story. (Frankly, I wish the guy would stop mumbling with those Martini olives rolling around inside his mouth; it's hard enough to understand what he says half the time anyway.)

First, to gain some insight into the process that this project evolved from, I strongly recommend that you read the original piece I did on this back in January 2004, "True Binary Serialization and Compression of DataSets". In it, I name the various players (both programmatic and personal), and you will be able to get a flavor for how this all came about.

This presentation is all about Binary Serialization and Remoting, why they're good, and why they got left out of the Compact Framework, which is bad... It's also about how you can add Data Compression, and put all three of these "back into" the Compact Framework, and that's good! On top of that, every line of code is in the public domain or under the very generous LGPL license, which means you are free to put it into commercial software. Ain't technology grand?

Remoting Infrastructure

Remoting is one of the coolest features of the .NET Platform, and unfortunately when the designers of the Compact Framework did their work, they had to make the (most likely) painful decision to leave it out. Why? Space, probably. Look at MarshalByRefObject and the System.Runtime.Remoting classes and you'll see why. Not only that, but you need the BinaryFormatter in System.Runtime.Serialization.Formatters.Binary, which comprises some 66 different classes and enumerations, takes up quite a bit of additional space, and has so many dependencies on other BCL classes that, well - it just wasn't about to be. So, you get a remarkably complete implementation of ADO.NET in the Compact Framework, but no Remoting and no Binary Serialization.

However, you can still do "remoting", and you can even have Binary Serialization in the Compact Framework! Binary Serialization can be achieved through Angelo Scotto's marvelous creation, the "Compact Formatter". My chief goal in working with Mr. Scotto to help move his project forward was to get his class to perform binary serialization on DataSets, and he rose to the challenge with great success.

Now, developers who've worked with Binary Serialization know that the ADO.NET DataSet, engineering masterpiece that it is (and I say this seriously), describes itself to the Serialization Framework only in XML. DOH! You run a DataSet through the BinaryFormatter and what you get is a very large byte array filled with -- you guessed it -- a whole glop full of textual XML!

Fortunately, there are ways to make our friend the DataSet cooperate! One of the most efficient ways is to use what is referred to as a Surrogate class. The best example out there is Ravinder Vuppula's "DataSetSurrogate" class, described in MS KB 829740. It turns out that with only minor modification, any DataSet (except strongly-typed) can be passed into this type of surrogate class and the whole class becomes BinaryFormatter (and CompactFormatter) compatible. This works because the surrogate class "unwraps" everything in the DataSet into binary serializable types (mostly ArrayLists). There are additional benefits to using a surrogate class: you can add your own properties and methods to it (for an idea of what you could do, take a look at my site partner and fellow MVP Robbe Morris's "Set Class Properties from DataTable with Attributes" article), and of course all your custom "stuff" now becomes remotable through the Binary Channel. By the way, Mr. Vuppula of Microsoft informs me that "We'll be doing true binary serialization of DataSet in the next version so that remoting of the DataSet will be performant right out of the box".
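To make the "unwrapping" idea concrete, here is a minimal sketch. It is written in Python purely as a language-neutral illustration and is not the DataSetSurrogate code itself. The helper names (table_to_xml, table_to_surrogate, surrogate_to_table) and the sample rows are made up for the example, and Python's pickle module stands in for the BinaryFormatter. It serializes the same toy table once as element-per-cell XML and once as plain lists, then compares the sizes.

```python
import pickle
import xml.etree.ElementTree as ET


def table_to_xml(name, columns, rows):
    """Serialize a table the verbose way: one XML element per cell."""
    root = ET.Element("DataSet")
    for row in rows:
        row_el = ET.SubElement(root, name)
        for col, value in zip(columns, row):
            ET.SubElement(row_el, col).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def table_to_surrogate(name, columns, rows):
    """Unwrap the table into plain lists (hypothetical surrogate layout)."""
    return {"name": name, "columns": list(columns), "rows": [list(r) for r in rows]}


def surrogate_to_table(surrogate):
    """Rebuild the (name, columns, rows) triple from the surrogate."""
    rows = [tuple(r) for r in surrogate["rows"]]
    return surrogate["name"], surrogate["columns"], rows


# Made-up sample data, loosely shaped like a customers table.
columns = ["CustomerID", "CompanyName", "City"]
rows = [("C%04d" % i, "Company %d" % i, "City %d" % (i % 10)) for i in range(500)]

xml_bytes = table_to_xml("Customers", columns, rows)
binary_bytes = pickle.dumps(
    table_to_surrogate("Customers", columns, rows),
    protocol=pickle.HIGHEST_PROTOCOL,
)
restored = surrogate_to_table(pickle.loads(binary_bytes))

print("XML bytes:   ", len(xml_bytes))
print("binary bytes:", len(binary_bytes))
```

The exact numbers depend on the data, but the pattern is the point: the column names are stored once instead of being repeated as tags around every cell, so the binary form is smaller before any compression is applied.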

Lost in Space!

Now, I digress for just a moment to explain my catchy title, "Lost in Space Forever". During the testing of the DataSetSurrogate serialization using the CompactFormatter class for the Compact Framework, I kept getting "Platform not supported" and similar exceptions, and the whole process basically went off into LaLa Land (hence, "Lost in Space Forever"). Finally, after some serious headache-provoking debugging sessions, I realized it was the Locale property of the DataSet that was causing the CompactFormatter to choke. I'm still not exactly sure why, but it seems there are some minor inconsistencies between the regular ADO.NET DataSet and the one in the Compact Framework. Not to be daunted, I remained true to my credo of "Less is More", and simply commented out the little boogers! The downside? Not much really, for those of us who speak English! The result of all this work is that my final CompressDataSet class is 100% Compact Framework compatible, and adds in SharpZipLib Zip compression to complete the picture.

So now, you really can have a kind of "quasi Remoting" in the Compact Framework: you simply set up a WebService whose purpose is to receive byte arrays (not DataSets). Web Services are ideal for working with handheld devices such as the PocketPC because the classes to implement Web Service proxy classes through WSDL are present on the Compact Framework, and they use the same standardized XML SOAP messages as are used on the .NET Framework. The Web Service we use is hosted on the ASP.NET runtime under IIS at the server, just like any other Web Service. The compact byte arrays we send and receive are serialized as Base64, which certainly adds a little baggage (Base64 turns every 3 bytes into 4 characters), but since you are passing DataSets over the wire in a highly compressed format that can be as little as 2 percent of their original size, it's a small price to pay. Your WebService simply accepts the byte array, uses the CompressDataSet class to decompress and deserialize it, and you can then hook the result up to a DataAdapter and do your updates the same way that you would have, had a regular DataSet come over the wire! I'm still seeing some issues with sending out a compressed DataSet byte array in response to an incoming SQL query, but if you send the original DataSet out the "old way", your CF application can perform the CompactFormatting and compression locally, and save the resultant compressed DataSet to its local file system, which is the major attraction of this whole exercise.
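The compress-then-Base64 trade-off is easy to check for yourself. The following is a small Python sketch, not the CompressDataSet code: zlib's deflate stands in for SharpZipLib, the pack and unpack helpers are hypothetical names, and the payload is made-up repetitive XML of the kind a serialized DataSet contains.

```python
import base64
import zlib


def pack(payload: bytes) -> str:
    """Compress the payload, then Base64-encode it for transport as text."""
    compressed = zlib.compress(payload, 9)
    return base64.b64encode(compressed).decode("ascii")


def unpack(text: str) -> bytes:
    """Reverse of pack: Base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(text))


# Made-up, highly repetitive payload standing in for a serialized DataSet.
payload = b"".join(
    b"<Customers><City>City %d</City></Customers>" % (i % 10) for i in range(1000)
)

compressed_size = len(zlib.compress(payload, 9))
wire_text = pack(payload)

print("original bytes:  ", len(payload))
print("compressed bytes:", compressed_size)
print("Base64 chars:    ", len(wire_text))
```

Base64 always costs about one third on top of the compressed size (4 characters for every 3 bytes, rounded up), so as long as compression shrinks the data by much more than that, the text-only transport is still a clear win.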

Local ADO.NET Data Storage with no licensing fees

This presents some interesting possibilities for local storage. While SQL Server CE is available and does a lot, it also takes up space on your device and requires a client license for each device. For simple data storage, you can use my CompressDataSet, which has many of the features of a database "built in", and simply load it off the filesystem into memory! You can of course also "Synchronize" with your permanent database through the WebService as described above, by simply calling GetChanges on your modified CF app DataSet and sending the compressed little new guy over the wire to your webservice, which handles the decompression and the DataAdapter "Update". To complete the picture, the webservice can send back the entire DataSet with all the synchronized changes, to be recompressed and stored back on the Pocket PC's filesystem for local storage.
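The synchronization round trip can be sketched in a few lines. This is a toy Python model only, not ADO.NET: get_changes and apply_changes are hypothetical helpers that mimic the roles of DataSet.GetChanges and the DataAdapter update, tables are plain dictionaries keyed by primary key, and the rows are made up.

```python
import copy


def get_changes(original, current):
    """Collect only the rows that were added, modified, or deleted."""
    changes = []
    for key, row in current.items():
        if key not in original:
            changes.append(("added", key, row))
        elif original[key] != row:
            changes.append(("modified", key, row))
    for key in original:
        if key not in current:
            changes.append(("deleted", key, None))
    return changes


def apply_changes(server_rows, changes):
    """Server side: replay the change list against the stored rows."""
    for state, key, row in changes:
        if state == "deleted":
            server_rows.pop(key, None)
        else:
            server_rows[key] = row
    return server_rows


# Made-up rows: the server copy and the copy edited on the device.
server = {
    "CUST1": ("First Company", "Berlin"),
    "CUST2": ("Second Company", "Madrid"),
    "CUST3": ("Third Company", "Lyon"),
}
device = copy.deepcopy(server)
device["CUST1"] = ("First Company", "Hamburg")  # modified row
del device["CUST2"]                             # deleted row
device["CUST4"] = ("Fourth Company", "Oslo")    # added row

changes = get_changes(server, device)
synced = apply_changes(dict(server), changes)
print(len(changes), "changed rows sent instead of", len(device), "total rows")
```

Only the change list needs to be serialized, compressed and sent up; the full table travels back down once the server has applied the updates.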

The Compact Framework Solution you can download at the link below has a complete working copy of the CompressDataSet class, along with a webservice "CFService". You'll need to ensure that the CFService subfolder of the solution is set up as an IIS Virtual Directory and marked as an IIS Application. The CompressDataSet project itself can be set up as a "regular" .NET Framework project and will work in either environment without changes. If you make any enhancements or changes to the classes, please let us know so that we can pass them on!

Finally, when testing this, please do your testing in RELEASE (not debug) mode and preferably on a real device (not the "Emulator") if you want to be able to gauge how fast the process is. When run in debug mode, there is a tremendous amount of extra baggage that really slows everything down. The "Get Remote" option in the CF app gets the DataSet from the remote webservice and saves the compressed dataset locally as "ds.dat". The "Get Local" option simply loads this "ds.dat", decompresses and deserializes it, and populates your DataGrid with it as a "Proof of Concept". This uses the Northwind "Customers" table, but you can change the code to use whatever you want. One final issue, if you haven't discovered it already: when setting a webreference from a Compact Framework app, the device may not know what "Localhost" is (or the name of the PC). Instead, use the IP address of the machine when you create (or modify the proxy class code in) your WebReference (e.g., "").

N.B. Reader Simon Hansman has discovered that there can be problems when adding new columns or changing the Primary Key prior to passing a DataSet in for compression or decompression, and has posted an elegant fix here. The CompressDataSet class I have presented here is already in use in at least one commercial application where bandwidth considerations are key, and they report getting average compression ratios of 92 percent.

Download the Source Code that accompanies this article



Peter Bromberg is a C# MVP, MCP, and .NET consultant who has worked in the banking and financial industry for 20 years. He has architected and developed web-based corporate distributed application solutions since 1995, and focuses exclusively on the .NET Platform.