A Little Background First
I "discovered" Office HTML & XML two summers ago when I had to implement "Download as Word Document" functionality for
a report with customized features, through a web page without using any third
party utility. Among other things, I had to allow users to download a report
as a Word document with PRINT LAYOUT.
In Office 2000 (& above) documents can be converted from Word to HTML (File->Save
As) and vice versa! It is interesting to see how an Office Document renders HTML
& CSS on thus saving as Web page. By understanding and emulating this conversion
technique, we can generate Word or Excel documents through web pages. The various
formatting features in the Office document are defined through XML as explained
in the Microsoft Office HTML and XML Reference. By manipulating these properties exposed through XML programmatically, we can incorporate
custom formatting capability in the Office docs that we offer for downloading.
The document properties (defined as XML elements) common to Office applications are
prefixed with "o:" while Word specific properties are prefixed with
"w:" and Excel properties are prefixed with "x:" specific
properties
The View element specifies a document's view setting. It can have the following
values:
- MasterPages
- Normal
- Print
- Web (default)
By setting the value of the View explicitly from the default "Web" layout
view to "Print" - <w:View>Print</w:View> (as shown below),
the downloadable Word document can be made to open with the Print layout.
<xml>
<w:WordDocument>
<w:View>Print</w:View>
<w:Zoom>90</w:Zoom>
</w:WordDocument>
</xml>
To get a low down on the Office XML Elements dig into the downloadable Office HTML and XML Reference mentioned above.
The (inline) VB.NET code below shows how to allow dynamic content to be downloaded
as a Word document with the required Word formatting features. We first build
the content with the Office style settings. Just to demonstrate that we will
be streaming dynamic content, I will be showing the date & time at which
the page is accessed with DateTime.Now
Finally by setting the MIME type to application/msword using Response.AppendHeader
(instead of AddHeader method which is only used for backward compatibility) and
also specifying the filename with "content-disposition" header, the
HTML content can be forced to be downloaded as a Word .doc file with the desired
file name.
<%@ Page Language="VB" %>
<script runat="server">
Public Sub Page_Load(sender as Object, e as EventArgs)
'build the content for the dynamic Word document
'in HTML alongwith some Office specific style properties.
Dim strBody = New StringBuilder("")
strBody = "<html xmlns:o='urn:schemas-microsoft-com:office:office' " & _
"xmlns:w='urn:schemas-microsoft-com:office:word'" & _
"xmlns='http://www.w3.org/TR/REC-html40'>" & _
"<head><title>Time</title>"
strBody = strBody + _
"<!--[if gte mso 9]>" & _
"<xml>" & _
"<w:WordDocument>" & _
"<w:View>Print</w:View>" & _
"<w:Zoom>90</w:Zoom> " & _
"</w:WordDocument>" & _
"</xml>" & _
"<![endif]-->"
strBody = strBody + _
"<style>" & _
"<!-- /* Style Definitions
" */@page Section1{size:8.5in 11.0in;" & _
"margin:1.0in 1.25in 1.0in " & _
"1.25in;mso-header-margin:.5in; " & _
"mso-footer-margin:.5in; mso-paper-source:0;}" " & _
"div.Section1{page:Section1;}-->" & _
"</style></head>"
strBody = strBody + _
"<body lang=EN-US style='tab-interval:.5in'>" & _
"<div class=Section1><h1>Time and tide wait for none</h1>" & _
"<p style='color:red'><I>" & DateTime.Now & "</I></p>"& _
"</div></body></html>"
'Force this content to be downloaded
'as a Word document with the name of your choice
Response.AppendHeader("Content-Type","application/msword")
Response.AppendHeader("Content-disposition", _
"attachment; filename=myword.doc")
Response.Charset=""
Response.Write(strBody)
End Sub
</script>
Implementing the "Download as Excel" feature
I recently stumbled upon a elegant example that utilizes the same technique to implement
"Download as Excel" functionality for a web page with content from
a data source. I got this from among a bunch of re-usable utility methods that Daniel Olson has shared.
The method DataTable2ExcelString() takes a DataTable & converts its contents into a HTML table. This string containing
the HTML table is then streamed as MIME content to Excel on the client and thus
implementing the "Download as Excel" feature becomes a cinch.
Here's a little C# snippet that illustrates how easy it is to use it -
//for demo purpose, lets create a small datatable & populate it with dummy data
System.Data.DataTable workTable = new System.Data.DataTable();
//The tablename specified here will be set as the worksheet name of the generated
Excel file.
workTable.TableName = "Customers";
workTable.Columns.Add("Id");
workTable.Columns.Add("Name");
System.Data.DataRow workRow;
for (int i = 0; i <= 9; i++)
{
workRow = workTable.NewRow();
workRow[0] = i;
workRow[1] = "CustName" + i.ToString();
workTable.Rows.Add(workRow);
}
//...and lets put DataTable2ExcelString to work
string strBody = DataTable2ExcelString(workTable);
Response.AppendHeader("Content-Type", "application/vnd.ms-excel");
Response.AppendHeader("Content-disposition", "attachment; filename=my.xls");
Response.Write(strBody);
The complete source code can be viewed here. For a more detailed description of the DataTable2ExcelString method, check the author’s documentation.
The same idea can be adapted for generating Word documents with dynamic content from
a web page. The code samples and their output can be viewed from this landing page. This technique can be used with any server side language.
The possibilities with Office HTML & XML programmability are immense. This article
described a lightweight method to generate a Excel document dynamically over
the Web with content from any data source, by tapping the rich formatting properties
offered through Office HTML & XML.
Update (Feb 3, 2007)
Adding a Custom Header & Footer to the dynamically generated Excel sheet
Excel uses CSS and @page definitions to store some page setup settings. If you want
a Header & Footer to appear when the downloaded Excel sheet is printed, use
the mso-header-data & mso-footer-data style attributes like this -
<style>
@page
{
mso-header-data : '&R Date: &D Time: &T';
mso-footer-data : '&L Proprietary & Confidential &R Page &P of
&N';
}
</style>
&L is used for left-aligning text and &R for right-aligning text in the header/footer. The other metacharacters and their
meanings (extracted from the "Headers and Footers" section of the Office HTML & XML Reference) are listed below -
&C - Center-aligned data
&D - Date
&T - Time
&P - Page number
&N - Number of pages
&B - Bold
&I - Italic
&\0022fontname\,fontstyle\0022&fontsize - Font name, style, and size
To see this feature in action, check the working sample and source code. Print preview the downloaded Excel file to notice the Header & Footer.