Working with Microsoft Word 2007 Document Fragments in SharePoint

In this article, we explain how to split a Microsoft Word 2007 document into separate files, one per section, so that team members can work on it simultaneously. We will also see how to merge those sections back into a single document and how to save the section files in a document library on a SharePoint site.

Windows SharePoint Services provides an enhanced storage system that facilitates collaboration. This system improves upon older collaboration techniques of simply emailing documents back and forth or dropping them in a file share. Relying on email is awkward, as team members are never sure they have the most up-to-date version of the document, and consolidating the changes becomes laborious. File shares are also limited in that files dumped there are often difficult to find, and users have no idea whether a file is currently being edited by someone else. SharePoint’s system of Web-enabled content databases provides a rich experience for the team working on the document. The environment provides check-in/check-out functionality, versioning, metadata, search and an entire web site for storing lists of data related to the creation of the document.


However, even this system becomes strained in cases in which the team’s document is really a collection of separate sections. Frequently, these sections are independent and different team members are responsible for different pieces. Under these circumstances, the team members ideally would begin to work simultaneously. Yet this work often takes place serially because the file in a SharePoint library can be checked out by only one user at a time. In this article, we will see how we can enhance the experience by enabling a Word 2007 document to be split into separate files, one for each of its sections. This will allow different team members to work on their sections simultaneously. When completed, the solution will support merging the individual sections back into a single document.


For example, consulting firms often have a sales resource, a project manager and a development lead working on distinct portions of a proposal. The sales resource is focusing on background information on the company, case studies, and pricing. The project manager is responsible for documenting the project lifecycle, creating a high-level project plan, and detailing the change/review mechanisms. Meanwhile, the developer is taking the customer’s functional requirements and outlining a proposed solution. All of these pieces have to be put together to complete the proposal. This problem is rather generic and can be found across many different customer types. In the construction industry, for instance, different team members are often responsible for certain sections of a contract. By allowing the team to divide and conquer the work, the solution in this article provides an efficient process that reduces the amount of time it takes to complete an entire document.


Solution Overview


Our scenario starts with a user who has a document with some boilerplate content. As described earlier, this document is made up of several sections that different users are responsible for completing. The sections’ begin and end points are not easily deduced. Some sections include multiple headings while others may be just a paragraph. For this reason, we will allow the user to highlight each section, recording the sections within an XML structure. When the user has finished marking up the document and has saved it to the SharePoint library, we will display a custom action named Split Document into Sections in that document’s drop-down menu. After choosing this action, the user will be directed to a custom application page that we will construct, where the user will be given the option to create a new document library to hold the section files, or to use an existing one. We will store enough information about the source document in the properties of this document library to support a merge operation later. The team will be able to work with the separate files simultaneously, setting security and maintaining versions. When the sections are complete, we will allow the user to select a merge action from the document library’s toolbar. This will take the user to another custom application page where he can choose to write the merged document over the one that was split, or to save it to a new location.


Both the split and merge operations work on Microsoft Word 2007 documents. Modifying these files is made possible by the new Open XML file format. In both cases, we can perform the operations on the server by working with streams and XML documents, so Microsoft Word does not need to be installed on the server.


To give site administrators the flexibility to turn this functionality on or off, we will develop the solution as a SharePoint feature. Features are a way of packaging a set of configurations as a single unit that can be activated or deactivated.


Walkthrough


In the walkthrough we will see how to create an XML schema and apply it to a Microsoft Word document. We will also see how to package the solution as a feature that can be enabled by any site administrator who desires the functionality. This solution is created using Visual Studio 2008 and the WSPBuilder project template. We will create a custom document library template that distinguishes the libraries of section files from the other document libraries on the site, and construct custom application pages for both splitting and merging. At the end, we will see the manipulation of the XML in the documents, which facilitates the split and merge operations.


So, create a new WSPBuilder project in Visual Studio 2008.


Creating the XML Schema for Word document


An XML schema is a file that defines a specific type of XML document. It provides information that describes the structures that will be present in any XML document constructed to comply with the schema. Schemas are often used to validate that XML document instances conform to the expected structure.


For our example we will construct an XML schema to describe the structure of the Microsoft Word documents that we want to split. The schema is simple: it dictates that the object we are operating on is a document and that the document is made up of a sequence of sections. We could get this information from the Microsoft Word document itself, but such a solution would require us to infer the begin and end points of a section, which could lead us to rely on headings to mark where sections begin and end. Our goal is to be much more flexible and allow the user to select which document content belongs to which section. By providing our own schema, we can allow the author to select any range of the document and tag it as a section.


To create an XML schema file in Visual Studio, right-click the project name and click Add -> Add New Item. Select XML Schema from the templates and give it an appropriate name such as DocumentSections.xsd. The content of the XML schema follows.

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="DocumentSections"
    targetNamespace="http://tempuri.org/DocumentSections.xsd"
    elementFormDefault="qualified"
    xmlns="http://tempuri.org/DocumentSections.xsd"
    xmlns:mstns="http://tempuri.org/DocumentSections.xsd"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
  <xs:element name="Document" type="DocumentType"></xs:element>
  <xs:complexType name="DocumentType" mixed="true">
    <xs:sequence>
      <xs:element name="Section" minOccurs="1" maxOccurs="unbounded"></xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

As you can see in the above XML, we have set the targetNamespace attribute to http://tempuri.org/DocumentSections.xsd. This is important to distinguish our XML schema from others used by the application. The first element defined is Document, which represents the entire contents of the Microsoft Word document. It is the outer container of the rest of our markup. The type of the Document element is DocumentType, which is a custom complex type. The DocumentType definition specifies that a document contains a sequence of sections and that at least one section is required.
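
To make the "at least one section" rule concrete, here is a minimal sketch that validates an XML instance against the schema using the standard System.Xml.Schema classes. The class name DocumentSectionsSchema and the method IsValid are hypothetical names chosen for this illustration, and the schema text is inlined (without the unused mstns prefix) only so the sketch is self-contained; it is not part of the article's solution.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper for illustration only.
public static class DocumentSectionsSchema
{
    public const string Namespace = "http://tempuri.org/DocumentSections.xsd";

    // Same structure as DocumentSections.xsd, inlined for the sketch.
    private const string Xsd = @"<xs:schema id='DocumentSections'
    targetNamespace='http://tempuri.org/DocumentSections.xsd'
    elementFormDefault='qualified'
    xmlns='http://tempuri.org/DocumentSections.xsd'
    xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Document' type='DocumentType' />
  <xs:complexType name='DocumentType' mixed='true'>
    <xs:sequence>
      <xs:element name='Section' minOccurs='1' maxOccurs='unbounded' />
    </xs:sequence>
  </xs:complexType>
</xs:schema>";

    // Returns true when the XML text validates against the schema.
    public static bool IsValid(string xml)
    {
        bool valid = true;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(Namespace, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            // Only schema errors mark the instance as invalid.
            if (e.Severity == XmlSeverityType.Error) valid = false;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Reading to the end drives validation of the whole instance.
            while (reader.Read()) { }
        }
        return valid;
    }
}
```

Under these assumptions, a Document element containing one or more Section elements validates, while a Document element with text but no Section element does not.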


Applying our Schema to a Document


Since Microsoft Word 2003, we have been able to attach custom schemas to Microsoft Word documents. Our first step is to create the Microsoft Word 2007 document that we want to split. For our example, we will use a text-only document. So, create a Word document with a couple of paragraphs separated by headings. The sample document download link is provided at the end of the article.

Use the following steps to attach the schema to the test document.

1. Click the Developer tab in the ribbon. This may be hidden. If it is, then enable it through the Word Options interface. In the Popular section, there is a check box for Show Developer Tab in the Ribbon.
2. Click the Schema button in the XML group of controls.
3. Click the Add Schema button on the XML Schema tab.
4. Browse to where you saved the DocumentSections.xsd schema file. In a production environment, this schema would be in a central location such as a SharePoint library.
5. Give the schema an alias of DocSections and uncheck the option to have these changes affect the current user only.
6. Make sure that DocSections schema is selected when you click Ok to close the Templates and Add-ins dialog.


When you close the Templates and Add-ins dialog, the XML Structure task pane will open on the right side of document. This interface will allow you to select portions of the document and associate them with elements defined in the XML schema. Follow these steps to tag the test document appropriately.
1. Use Ctrl + A to select the entire document.
2. Click on the Document element in the Choose an Element to Apply to Your Current Selection area.
3. Click the Apply to Entire Document button in the dialog prompt. Once that’s done, you should see that the outer element node is displayed at the top of the XML Structure task pane and that markup tags have been inserted in the beginning and end of the document.
4. Now highlight from the Default Content Source heading (including the heading) through the paragraph below the heading.
5. With the section selected, click the Section element in the Choose an Element to Apply to Your Current Selection area.
6. Repeat steps 4 and 5 for the other headings and paragraphs as well.
7. Save the document.
By following these steps we have added custom markup to the Microsoft Word document. The document will look like the image below.


We will use this document as the source file that our code will operate on. We will upload it to a SharePoint site and choose to split it, which will create new files for each section in a new library. When the split is completed, there will be a separate file for each of the Default Content Source, Creating a new content source and Types of content repositories sections. Next we will see how to use this document’s XML structure to perform the necessary manipulations, and how to develop the SharePoint feature that will contain our code. In a production environment, if this document were one that is used repeatedly, it could be made available as a content type. To hide XML tags for production documents, uncheck the Show XML Tags in Document check box in the XML Structure task pane. For our example, leave it checked.


Examining the Document’s XML

Because Microsoft Word 2007 uses the new Open XML file format, we can gain insight into the structure of the file by changing its extension to .zip; the file is really an ordinary zip archive.

The XML file in the root is named [Content_Types].xml and it contains content-type directives for all the parts that appear in the archive. A content type contains metadata about a particular part or group of parts and, more importantly, contains a directive about how the application should render that part. The following is a snippet of the content types file.

<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" />
<Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml" />
<Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml" />

Pay particular attention to the Override element for the part named /word/document.xml. This file contains the document’s contents, and by inspecting it we can see the impact of tagging the document with the custom schema elements. The image below shows the document.xml file in Internet Explorer.
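
The same part can be reached in code without renaming the file. The following is a minimal sketch, assuming the System.IO.Packaging API (WindowsBase.dll on .NET Framework 3.0 and later), that follows the package's officeDocument relationship to the main document part and loads it into an XmlDocument. The class name PackageInspector and the method LoadMainDocument are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.IO.Packaging;   // WindowsBase.dll on .NET Framework 3.0+
using System.Xml;

// Hypothetical helper for illustration only.
public static class PackageInspector
{
    private const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Loads the main document part (normally /word/document.xml) into an XmlDocument.
    public static XmlDocument LoadMainDocument(Stream docxStream)
    {
        using (Package package = Package.Open(docxStream, FileMode.Open, FileAccess.Read))
        {
            // The package-level officeDocument relationship points at the main part.
            foreach (PackageRelationship rel in package.GetRelationshipsByType(OfficeDocumentRelType))
            {
                Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), rel.TargetUri);
                PackagePart part = package.GetPart(partUri);
                XmlDocument xdoc = new XmlDocument();
                using (Stream partStream = part.GetStream(FileMode.Open, FileAccess.Read))
                {
                    xdoc.Load(partStream);
                }
                return xdoc;
            }
        }
        throw new InvalidDataException("No main document part found.");
    }
}
```

Following the relationship rather than hard-coding /word/document.xml is a design choice of this sketch; the part name is conventional, but the relationship is what identifies the main document.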


Building the Document Section Feature Project


Our solution is going to provide the organization with a set of customizations that together enable any site in the site collection to take advantage of this split and merge utility. The important fact here is that we want to provide multiple instances of this capability throughout the site collection. Some sites may want to leverage it while others may not. Due to this, we will package our solution as a SharePoint feature.


A SharePoint feature is a deployable unit that packages site customizations and custom code into a single component that can be activated and deactivated by an administrator. Features can have a defined scope that dictates the breadth of their influence. A feature can be scoped to the entire server farm, a single web application, a site collection or just a site. The scope determines the level of administrator needed to turn it on or off and the level at which its customizations are applied. Features can include customizations such as list templates, list instances or custom menu items and can be used to provision new files such as master-page templates, web-part definitions and page layouts for web-content management. Features can even run custom code in response to their activation or deactivation.


To create the Document Splitter Merger feature, right-click the name of the project we created earlier in Solution Explorer and click Add -> Add New Item. In the Add New Item dialog, select WSPBuilder under Visual C# Items, then select the Feature with Receiver item from the templates. Give it an appropriate name, for example DocumentSplitterMerger, and click Add. This adds a couple of files, including feature.xml and elements.xml, and also creates some folders. Your project structure will now look like the image below.


Defining the Feature


The DocumentSplitterMerger folder represents the definition of the feature. The feature.xml is the primary manifest file that contains information about the feature, references to element files that detail the customizations the feature contains, and activation dependencies for other features that this one relies on. Below is the feature information…

<Feature Id="2b7e9d86-6cb8-4ab5-92b0-d47815901955"
         Title="Document Splitter Merger"
         Description="The feature provides the functionality to split a word document into multiple section as well as provide the functionality to merge the section into one word document."
         Version="12.0.0.0"
         Hidden="FALSE"
         Scope="Web"
         DefaultResourceFile="core"
         ReceiverAssembly="WordDocumentSplitterMergerCS, Version=1.0.0.0, Culture=neutral, PublicKeyToken=3cbedfab274ba150"
         ReceiverClass="WordDocumentSplitterMergerCS.DocumentSplitterMerger"
         xmlns="http://schemas.microsoft.com/sharepoint/">
  <ElementManifests>
    <ElementManifest Location="elements.xml"/>
  </ElementManifests>
</Feature>

The definition of the feature includes a new GUID as its ID. You can use the Create GUID option from the Tools menu of Visual Studio to create a unique GUID. The ReceiverAssembly attribute is the strong name of the assembly. The public key token used here depends on the key used to sign your assembly. To retrieve it, first compile your project to create the DLL and then run the strong name tool (sn.exe) with the -T option against the DLL from the Visual Studio command prompt.

The next portion of feature.xml defines the element manifests. An element manifest contains the settings of the customizations that the feature will contain. We can have any number of manifests, but it makes sense to group them based on customization type. For this example, we will include one manifest for customizations that involve lists and one manifest for custom actions that our feature places in the site.

The elements.xml file is an element manifest file that contains information about the customizations this feature will make to the site. In this file, we will specify the customizations related to adding links to menus. The feature will create a new link in the menu (Edit Control Block) that displays when hovering over any Microsoft Word 2007 document contained within a document library of the site. The custom action will be a link that directs the user to a custom application page that we will develop.

<CustomAction Id="DocumentSection.SplitAction"
              RegistrationType="FileType"
              RegistrationId="docx"
              ImageUrl="/_layouts/images/ICDOC.GIF"
              Location="EditControlBlock"
              Sequence="225"
              Title="Split Document into Sections">
  <UrlAction Url="~site/_layouts/DocumentSectionSplit.aspx?ItemId={ItemId}&amp;ListId={ListId}"/>
</CustomAction>

The most important attributes of this CustomAction element are RegistrationType, RegistrationId, and Location. The RegistrationType attribute details the type of attachment for the custom action. In this case, the action is associated with a particular type of file. Other possible values include ContentType, List, and ProgId. The RegistrationId attribute further clarifies the attachment by specifying the identifier of the list, item, or content type that the attachment is associated with. In this case, the value docx represents the file extension of Microsoft Word 2007 documents. The Location attribute dictates where this action should be displayed. In this case, we want the action to display when the user hovers over any specific document. This location is the Edit Control Block.

When the user clicks on our action, we want to direct him to the custom application page (DocumentSectionSplit.aspx) that we will develop in the “Building a Custom Application Page for Splitting” section of this article. Notice how the URL specified in the XML includes tokens that represent the current site (~site), the item selected ({ItemId}), and the list it is contained within ({ListId}). Passing these values will allow the custom page to retrieve the selected document and the library that contains it.

In addition to an action to split the document, the feature needs an action to allow the user to merge the section files back together. Since this is an action on a set of files, it is not appropriate to add the action to the Edit Control Block. In this case, the action will be located in the library’s toolbar under the Actions menu. Now before looking at the XML for this action, we need to take a step back and realize that we don’t want to put this merge menu option on every document library. We only really need it in libraries that were constructed to contain the section files. For this reason, our solution is going to include a custom document library template to hold the section files. By having a different library template with a different identifier, we will be able to tell apart the site’s normal document libraries from the ones containing section files. We will build this template a bit later, but for now just understand that these libraries will have a unique ID of 10001. This identifier is used to appropriately locate the merge custom-action XML. The remainder of the elements.xml file contains this custom action.

<CustomAction Id="DocumentSection.Merge"
              RegistrationType="List"
              RegistrationId="10001"
              ImageUrl="/_layouts/images/ICDOC.GIF"
              Location="Microsoft.SharePoint.StandardMenu"
              GroupId="ActionsMenu"
              Sequence="225"
              Title="Merge sections">
  <UrlAction Url="~site/_layouts/DocumentSectionMerge.aspx?ListId={ListId}"/>
</CustomAction>

In this custom action, the scope is limited to libraries of type 10001, which we will define as a library that stores section files. The Location and GroupId attributes place the Merge sections action in the appropriate place in the library’s toolbar.

Now turn your attention to the sectionsLibrary.xml element manifest file in the List Template directory of the feature. This file contains the definition of a new list template. This template will be a copy of the document library template with a unique type identifier of 10001. In the XML shown below, the key attributes are the name of the template, its type identifier, the BaseType, and DocumentTemplate, whose value of 101 specifies a blank Word document.

<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
  <ListTemplate Name="SectionsLibrary" Type="10001"
                BaseType="1" SecurityBits="11"
                DisplayName="Document Sections Library"
                OnQuickLaunch="TRUE" Unique="FALSE"
                Image="/_layouts/images/itdl.gif"
                Description="Stores sections of a document as individual files"
                DocumentTemplate="101"></ListTemplate>
</Elements>

Because the custom SectionsLibrary template is almost identical to a document library, we can start by populating the SectionsLibrary folder with the items in the DocumentLibrary feature’s folder. The default location for these files is C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\DocumentLibrary\DocLib.
We then modify the SectionsLibrary schema.xml file to update the List element as shown below.

<?xml version="1.0" encoding="utf-8"?>
<List xmlns:ows="Microsoft SharePoint" Title="SectionsLibrary"
      Direction="$Resources:Direction;" Url="Sections Library" BaseType="1">

Though it may seem like all we are doing is copying the out-of-the-box document library template, we are actually setting ourselves up for having a distinction between libraries that contain section files, and other site document libraries.

Deploying the Feature

At this point, there is enough functionality in this feature that it makes sense to deploy it to a SharePoint site and test that it activates and deactivates successfully. First, build the project. After a successful build, create the .wsp file by right-clicking the project name and selecting WSPBuilder -> Build WSP. Then right-click the project name again and select WSPBuilder -> Deploy to deploy the solution. Now navigate to your test site and go to Site Settings. On the Site Settings page, go to Site Features under the Site Administration group. On the Site Features page, locate our Document Splitter Merger feature and activate it. If nothing has gone wrong, the feature will be activated successfully.
To test whether our custom document library template was deployed successfully, click Create on the Site Actions menu, look for Document Sections Library under the Libraries group, and try to create a library from it.

Building a Custom Application Page for Splitting

When a user selects the Split Document into Sections custom action from a Word 2007 document in the site, the solution will redirect the user to an application page. This page will allow the user to select where the individual section files are to be saved. The target location is specified using one of two options. The first is to create a new instance of our SectionsLibrary document template; the other is to select an existing one in the site. Of course, if the user is opting to create a new library, he must have permission to do so in the site. If not, the application page will return an Access Denied message.


After the user makes a selection, the page will create the new library (if necessary), place a file for each section into the library, and save properties about the source document in the library to support the merge operation. There is a good bit of code in the next few sections. We will focus on the important fragments, but please refer to the code download to get every line.


The DocumentSectionSplit page contains our custom code. This page will be the same regardless of the site or document the user came from. For this reason, we will construct it as an application page. This type of page is termed an application page because it is not a page that is part of a site definition, but rather one that lives in SharePoint’s Layouts directory. This means that the ASPX page can be accessed off of the path of any site through the _layouts virtual folder. Application pages are ASP.NET pages that derive from a SharePoint class called LayoutsPageBase and rely on the application.master master page for their layout, look, and feel.


As an application page, the code running here would be in response to a user clicking on the action from any site that has the feature activated. For this reason, the page must establish its context and get references to the site collection and site from which it is being invoked. With these pieces of information, the application page can then use the query-string parameters that the custom action passed to identify the document library the user was in when he selected the action, as well as the specific document that was operated on. The following code snippet shows the initial operations of the page.

private SPSite _site;
private SPWeb _web;
private Guid _listId;
private int _itemID = 0;

protected override void OnLoad(EventArgs e)
{
    // Establish the context the page was invoked from
    _site = SPContext.Current.Site;
    _web = SPContext.Current.Web;
    // Read the tokens passed by the custom action
    _listId = new Guid(Server.UrlDecode(Request.QueryString["ListId"]));
    _itemID = int.Parse(Request.QueryString["ItemId"]);

During the page load event, some other server controls are also populated. The cancel button is set up to return the user to the previous page where the custom action was selected. The drop-down list is filled with the set of existing document libraries that are instances of our SectionsLibrary template.

SPListCollection libs = _web.GetListsOfType(SPBaseType.DocumentLibrary);
foreach (SPList list in libs)
{
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(list.PropertiesXml);
    // Only libraries created from our SectionsLibrary template (type 10001)
    if (doc.DocumentElement.GetAttribute("ServerTemplate") == "10001")
    {
        string libUrl = doc.DocumentElement.GetAttribute("DefaultViewUrl");
        string webUrl = doc.DocumentElement.GetAttribute("WebFullUrl");
        //libUrl = libUrl.Replace(webUrl, string.Empty);
        // Take the first path segment of the default view URL as the library URL
        libUrl = libUrl.Substring(1, libUrl.IndexOfAny("/".ToCharArray(), 1) - 1);
        ListItem item = new ListItem(list.Title, libUrl);
        lstLibs.Items.Add(item);
    }
}
SPListItem sourceItem = _web.Lists[_listId].GetItemById(_itemID);
if (sourceItem != null)
{
    lblDocumentName.Text = Server.HtmlEncode(sourceItem["Name"].ToString());
    this.btnCancel.OnClientClick = "javascript:document.location.href='" + sourceItem.ParentList.DefaultViewUrl + "'; return false;";
}
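
The Substring call above takes the first path segment of the default view URL, which is only the library name when the site sits at the root of the server. As a hedged sketch, the helper below first strips the web's server-relative URL before taking the first segment. LibraryUrlHelper and GetLibraryFolderName are hypothetical names, and the sketch assumes both arguments are server-relative URLs (for example, a value such as SPWeb.ServerRelativeUrl for the second one); it is an illustration, not the code from the download.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class LibraryUrlHelper
{
    // Returns the web-relative root folder name of a library, given its
    // server-relative default view URL and the server-relative URL of its web.
    public static string GetLibraryFolderName(string defaultViewUrl, string webServerRelativeUrl)
    {
        string relative = defaultViewUrl;
        string webUrl = webServerRelativeUrl.TrimEnd('/');

        // Drop the web's own path so that subsites are handled as well.
        if (webUrl.Length > 0 && relative.StartsWith(webUrl + "/", StringComparison.OrdinalIgnoreCase))
        {
            relative = relative.Substring(webUrl.Length);
        }

        // The first remaining path segment is the library's root folder.
        relative = relative.TrimStart('/');
        int slash = relative.IndexOf('/');
        return slash < 0 ? relative : relative.Substring(0, slash);
    }
}
```

For a root site, GetLibraryFolderName("/Sections Library/Forms/AllItems.aspx", "/") yields the same value as the Substring call; for a site at /sites/team it yields the library folder rather than "sites".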

When the user clicks the Generate button, the first thing to do is determine the target location for section files that will be created as a result of the split. This could be an existing instance of our SectionsLibrary template or the page may have to create a new one if the user requests that.

SPFolder targetLibrary = null;
if (this.rdExistingLib.Checked)
{
    // Use the existing sections library chosen in the drop-down list
    targetLibrary = _web.Folders[this.lstLibs.SelectedItem.Value];
}
else
{
    // Create a new library from the custom Document Sections Library template
    SPListTemplate template = _web.ListTemplates["Document Sections Library"];
    Guid newListId = _web.Lists.Add(this.txtLibName.Text.Trim(), "Sections of " + this.lblDocumentName.Text, template);
    // Hold one reference to the list so the property change is saved by its Update()
    SPList newList = _web.Lists[newListId];
    newList.OnQuickLaunch = true;
    newList.Update();
    SPFolder newLib = _web.Folders.Add(this.txtLibName.Text.Trim());
    targetLibrary = _web.Folders[this.txtLibName.Text.Trim()];
}

With the target library determined, the DocumentSectionSplit page gets a reference to the source Word 2007 document. It then creates an instance of a Splitter class and passes in enough information for the section files to be created. Finally, the page uses the property bag of the target document library to store the list ID of the source document along with the URL and name of the source file. This information will be used to merge the sections back into one document.

SPListItem sourceItem = _web.Lists[_listId].GetItemById(_itemID);
SPFile sourceFile = sourceItem.File;
if (sourceFile != null)
{
    Stream sourceStream = sourceFile.OpenBinaryStream();
    Splitter splitter = new Splitter();
    splitter.SplitDocument(sourceStream, targetLibrary, this.lblDocumentName.Text);
    if (this.rdCreateNewLib.Checked)
    {
        // Record where the source came from to support the merge later
        targetLibrary.Properties.Add("SourceListId", _listId.ToString());
        targetLibrary.Properties.Add("SourceFileUrl", sourceFile.Url);
        targetLibrary.Properties.Add("SourceFileName", this.lblDocumentName.Text.Trim());
        targetLibrary.Update();
    }

The Splitter Class


Add a class file named Splitter.cs to the root of the project. Please download the source code for the complete code of this class. We will use this Splitter class to split the source file into individual section files. The Splitter class receives a binary stream of the source file, a reference to the library where the section files are to be stored, and the source file name as a string. The SplitDocument method first determines the total number of sections in the source file. It obtains this information by querying for the customXml nodes we saw in the document.xml part earlier.

XmlNodeList nodes = xdoc.SelectNodes("//w:customXml[@w:uri='http://tempuri.org/DocumentSections.xsd' and @w:element='Section']", nsManager);
int numSections = nodes.Count;
documentPart.Package.Close();
this.GenerateDocs(numSections, docStream, library, fileName);

The last line of the above code is a call to the GenerateDocs method, which creates the section files and saves them to the library. The algorithm used by this method creates a section file by first creating a copy of the entire source document—which, of course, includes all the sections.

int i = 0;
BinaryReader reader = new BinaryReader(docStream);
for (i = 0; i <= numSections - 1; i++)
{
    //make copies
    Stream instanceStream = new MemoryStream();
    docStream.Position = 0;
    BinaryWriter writer = new BinaryWriter(instanceStream);
    writer.Write(reader.ReadBytes(Convert.ToInt32(docStream.Length)));
    writer.Flush();
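
// The copy step can also be isolated in a small helper. The following is a
// minimal sketch, not the code from the download: StreamCopy and CopyToMemory
// are hypothetical names, and the source stream is assumed to be seekable.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class StreamCopy
{
    // Returns an independent, rewound in-memory copy of a seekable stream.
    public static MemoryStream CopyToMemory(Stream source)
    {
        source.Position = 0;
        MemoryStream copy = new MemoryStream();
        byte[] buffer = new byte[8192];
        int read;
        // Copy in chunks so the whole file never needs one large array.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            copy.Write(buffer, 0, read);
        }
        copy.Position = 0;
        return copy;
    }
}
```

// Each loop iteration could then obtain its working copy with
// StreamCopy.CopyToMemory(docStream) in place of the reader/writer pair.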

With a copy of the entire source document, the algorithm continues by deleting all of the sections that are not needed for this specific section file. This sequence of removals occurs through a nested loop that skips the deletion only when its index matches that of the outer loop. Once the stream has been reduced to a single section, it is saved and added to the target document library.

    XmlNodeList nodes = xdoc.SelectNodes("//w:customXml[@w:uri='http://tempuri.org/DocumentSections.xsd' and @w:element='Section']", nsManager);
    int j = 0;
    XmlNode sectionNode = null;
    foreach (XmlNode sectionNode_loopVariable in nodes)
    {
        sectionNode = sectionNode_loopVariable;
        if ((i != j))
        {
            sectionNode.ParentNode.RemoveChild(sectionNode);
        }
        j += 1;
    }

    //save changes to XML
    xdoc.Save(documentPart.GetStream(FileMode.Create, FileAccess.Write));

    //save this as a document
    instanceStream.Position = 0;
    library.Files.Add(GenerateNum(i) + fileName, instanceStream, true);
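The removal step can be sketched on its own, without the package and SharePoint plumbing. In the example below, SectionIsolator and KeepOnlySection are hypothetical names, the input is the XML text of a document.xml part, and the w prefix is assumed to map to the standard WordprocessingML namespace. As a defensive choice that differs from the excerpt above, the sketch first copies the matched nodes into a list so that removing nodes cannot disturb the enumeration.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: reduces a document to a single section.
public static class SectionIsolator
{
    private const string WordNs =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    private const string SectionXPath =
        "//w:customXml[@w:uri='http://tempuri.org/DocumentSections.xsd' and @w:element='Section']";

    // Removes every Section node except the one at keepIndex (zero-based)
    // and returns the resulting XML.
    public static string KeepOnlySection(string documentXml, int keepIndex)
    {
        XmlDocument xdoc = new XmlDocument();
        xdoc.LoadXml(documentXml);
        XmlNamespaceManager nsManager = new XmlNamespaceManager(xdoc.NameTable);
        nsManager.AddNamespace("w", WordNs);

        // Snapshot the matches before modifying the tree.
        List<XmlNode> sections = new List<XmlNode>();
        foreach (XmlNode node in xdoc.SelectNodes(SectionXPath, nsManager))
        {
            sections.Add(node);
        }

        for (int j = 0; j < sections.Count; j++)
        {
            if (j != keepIndex)
            {
                sections[j].ParentNode.RemoveChild(sections[j]);
            }
        }

        return xdoc.OuterXml;
    }
}
```

Calling KeepOnlySection once for each index from 0 to numSections - 1 yields the XML for each section file, mirroring the nested loop described above.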

When completed, the result is a document library containing individual files for each of the sections of the source document. This means that each of these sections can now be worked on simultaneously. Also, they can be secured and versioned separately.

Building a Custom Application Page for Merging Sections


The custom application page for merging is very similar to its split counterpart. On this page, the user specifies where the merged result should be placed. There are two options: the page can place the resulting file in the same location as the original source, or the user can choose any other document library in the site and specify a filename for the resulting document.



Like the split application page, this one includes code to discover which objects it is operating on and to initialize its controls. The major difference is that this page can use the property bag of the document library that contains the section files to retrieve information that the split operation stored there. This information includes the identifier of the document library that contained the source file as well as its URL and filename.

_targetListId = new Guid(currentLibrary.Properties["SourceListId"].ToString());
_targetFileUrl = currentLibrary.Properties["SourceFileUrl"].ToString();
_targetFileName = currentLibrary.Properties["SourceFileName"].ToString();


When the user clicks the Merge button, the event handler gets a reference to the target library where the merged file should be stored. This library is either the original one that contained the source document, or a library in the site that the user selected from the drop-down list. From there, the merge operation is performed using a Merger class that we will review in the next section. This class includes a method called Merge, which receives a reference to the section library, the destination library, the filename for the merged file, and its URL as parameters.

SPFolder sourceLibrary = _web.Lists[_listId].RootFolder;
SPFolder targetLibrary = null;
if (this.rdOriginal.Checked)
    targetLibrary = _web.Lists[_targetListId].RootFolder;
else
{
    targetLibrary = _web.Folders[this.lstLibs.SelectedItem.Value];
    _targetFileUrl = lstLibs.SelectedValue + "/" + this.txtFileName.Text.Trim();
}
Merger merger = new Merger();
merger.Merge(sourceLibrary, targetLibrary, _targetFileName, _targetFileUrl);
lblMessage.Text = "The sections of your document have been merged in the requested library. Use the link to navigate there.";
lblMessage.Visible = true;
lnkResult.NavigateUrl = targetLibrary.ServerRelativeUrl;
lnkResult.Text = targetLibrary.Name;
lnkResult.Visible = true;

The Merger class


At the root of the project, add a class file named Merger.cs. We will use this Merger class to merge the individual section files into a single document. The Merge method begins by creating a MemoryStream copy of the first section file. It then calls AddSection for every other section file in sequence. The AddSection method is passed the current section number, a reference to the Document element defined in our schema, a reference to the library of sections, and the filename.

// Get the document part from the package.
// Load the XML in the part into an XmlDocument instance:
XmlDocument xdoc = new XmlDocument(nt);
xdoc.Load(documentPart.GetStream());
XmlNode documentNode = xdoc.SelectSingleNode("//w:customXml[@w:uri='http://tempuri.org/DocumentSections.xsd' and @w:element='Document']", nsManager);

//loop through others and append
int i = 0;
for (i = 2; i <= numberSections; i++)
{
    AddSection(i, documentNode, sectionLibrary, fileName);
}

In the AddSection method, the section node of the appropriate section file is located and copied into the MemoryStream that started out as a copy of the first section file. To perform the copy, we must use the ImportNode method of the XML document since we are copying nodes from one context to another. The value of True in this method call tells the import to perform a deep copy, which is necessary to make sure we get all of the contents of the document’s XML within this node. As we are appending the sections in order, the InsertAfter method places the imported section node last within the document.

XmlNode sectionNode = xdoc.SelectSingleNode("//w:customXml[@w:uri='http://tempuri.org/DocumentSections.xsd' and @w:element='Section']", nsManager);
if ((sectionNode != null))
{
    XmlNode newNode = documentNode.OwnerDocument.ImportNode(sectionNode, true);
    documentNode.InsertAfter(newNode, documentNode.LastChild);
}
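The import-and-append step can be tried outside SharePoint with two XML strings. The following is a minimal sketch under a few assumptions: SectionAppender, AppendSection and SelectSchemaNode are hypothetical names, both inputs are the XML text of document.xml parts, and the w prefix maps to the standard WordprocessingML namespace. It uses AppendChild, which has the same effect as the InsertAfter call with LastChild shown above.

```csharp
using System.Xml;

// Hypothetical helper: appends the Section node of one document to another.
public static class SectionAppender
{
    private const string WordNs =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    private const string SchemaUri = "http://tempuri.org/DocumentSections.xsd";

    // Finds the first customXml node bound to the given schema element.
    private static XmlNode SelectSchemaNode(XmlDocument doc, string element)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("w", WordNs);
        string xpath = "//w:customXml[@w:uri='" + SchemaUri +
                       "' and @w:element='" + element + "']";
        return doc.SelectSingleNode(xpath, ns);
    }

    // Copies the Section node of sectionFileXml to the end of the
    // Document node of targetXml and returns the merged XML.
    public static string AppendSection(string targetXml, string sectionFileXml)
    {
        XmlDocument target = new XmlDocument();
        target.LoadXml(targetXml);
        XmlDocument source = new XmlDocument();
        source.LoadXml(sectionFileXml);

        XmlNode documentNode = SelectSchemaNode(target, "Document");
        XmlNode sectionNode = SelectSchemaNode(source, "Section");

        if (documentNode != null && sectionNode != null)
        {
            // A node belongs to the document that created it, so it must be
            // imported (deep copy) before it can be inserted elsewhere.
            XmlNode imported = target.ImportNode(sectionNode, true);
            documentNode.AppendChild(imported);
        }

        return target.OuterXml;
    }
}
```

Calling AppendSection repeatedly, feeding each result back in as targetXml, appends the sections in the order in which they are supplied.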

In this way, we can split and merge a Word 2007 document and store it in a SharePoint document library, so that team members can work on each section simultaneously and each section file can be secured and versioned separately.

I hope this article has been helpful. You can download the sample source code here.
You can also download the test document from here.



By Jatin Prajapati
Biography - Jatin Prajapati
I think most people are interested only in answers, so no biography is provided. If you want to know more, just write to me at jatin.prajapati.er@gmail.com