MS Word templates for generating documents and reports

If you have ever developed technical MS Word documentation for a domain structure or an interface specification for a medium or large system, you may have wished that you could generate the whole document, or a large part of it, from structured models such as a database structure or formal service definitions. Similarly, if you needed to create a report as a Microsoft Word document from a structured data set, you would have had to either purchase specialized reporting software or resort to a manual cut-and-paste exercise.

Before Microsoft Office 2007, all Office documents were stored in a proprietary binary format, so generating them would require using special Office Interop assemblies to produce Word documents. Luckily, starting with version 2007, Microsoft Office moved to the XML-based Office Open XML format, which made it much easier to both read and create Office documents using standard XML technologies such as XSLT.

Generating all the necessary XML for a Microsoft Word document from scratch would be a lot of work, and it would make it hard to customize the way your document is generated. It makes a lot more sense to create a template document that serves as a base for the generated document, and to have the generation process just replace certain placeholders with the values from your structured data. But how can you specify the placeholders in the Word document in a way that gives you maximum flexibility to generate the output you want?

Content placeholders in MS Word

One of the interesting features of Microsoft Word documents is the ability to define Content Controls. Content controls provide a way to add user interface elements to a document, to restrict edits of some of its sections, and to bind document elements to a specific data source. What matters here, though, is that they give you an easy way to select any part of the document, wrap it in a content control, and tag it with meta-information that defines how the contents should be processed to generate the output document.


This allows you to create and edit a document directly in Microsoft Word that will serve as a template, from which you can generate a full report using your data. In that template you can set up new styles, headers, footers, a cover page and any other static content, as well as placeholders for dynamic content using these Content Controls. Content controls let you specify two free-form properties: Title and Tag. The Title is shown in the document itself, so it is best used for the data selector. That way the designer of the template can easily see which data will be displayed without pulling up the content control's properties dialog. The Tag field can then be used to specify the type of the placeholder.
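Inside the document XML, Word stores each content control as a `w:sdt` element, with the Title kept in `w:alias` and the Tag in `w:tag` under `w:sdtPr`, and the wrapped content under `w:sdtContent`. The following is a minimal Python sketch, not part of Xomega, that lists the Tag and Title of every control in a hand-written fragment. The fragment itself and the selectors `$ctx/pc` and `$ctx/@Category` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"  # ElementTree spells qualified names as {namespace}local

# Hand-written fragment (an assumption for illustration): a block-level
# 'loop' control that wraps a paragraph holding an inline 'val' control.
FRAGMENT = f"""
<w:body xmlns:w="{W_NS}">
  <w:sdt>
    <w:sdtPr>
      <w:alias w:val="$ctx/pc"/>
      <w:tag w:val="loop"/>
    </w:sdtPr>
    <w:sdtContent>
      <w:p>
        <w:sdt>
          <w:sdtPr>
            <w:alias w:val="$ctx/@Category"/>
            <w:tag w:val="val"/>
          </w:sdtPr>
          <w:sdtContent><w:r><w:t>Sample category</w:t></w:r></w:sdtContent>
        </w:sdt>
      </w:p>
    </w:sdtContent>
  </w:sdt>
</w:body>
""".strip()


def list_controls(root):
    """Return (tag, title) for every content control, in document order."""
    controls = []
    for sdt in root.iter(f"{W}sdt"):
        tag = sdt.find(f"{W}sdtPr/{W}tag")
        alias = sdt.find(f"{W}sdtPr/{W}alias")
        controls.append((
            tag.get(f"{W}val") if tag is not None else None,
            alias.get(f"{W}val") if alias is not None else None,
        ))
    return controls


controls = list_controls(ET.fromstring(FRAGMENT))
for tag, title in controls:
    print(tag, title)
```

Any generator that works on the template, whatever language it is written in, ultimately has to find these elements and decide what to do based on the two values.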

To illustrate this, let's see how you can define basic constructs, such as loops for outputting repeating contents, conditional contents, and simple values printed from your data source.

Defining repeating contents

This is useful when you have repeating contents and you want to output each item either as a separate section of the document or as a row in a table. In either case, select the section or the first row of the table that you want repeated, open the Developer tab in the Word ribbon and click the bold Aa button to wrap the selection in a Rich Text Content Control.

After that, click the Properties button in the Developer tab and set the Tag value to 'loop' and the Title value to a selector string that identifies the data you want to repeat. If your data is in XML format or can be converted to it, you can use an XPath expression as the selector, with a predefined parameter such as $ctx as the context item. See the picture below for an example.


Unless you want to generate the same static contents over and over again, you will want to use additional content controls inside your repeater control to output simple, conditional, or repeating contents.
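To make the semantics of a 'loop' control concrete, here is a rough Python sketch of the idea: evaluate the selector against the current context, then produce the body once per selected node, with that node as the new context. It is an illustration only, not Xomega's implementation. The helper names are made up, and the selector support is deliberately limited to `$ctx/` followed by a child path that ElementTree understands, rather than full XPath.

```python
import xml.etree.ElementTree as ET

# Trimmed-down data in the shape of a category list (illustrative only).
DATA = ET.fromstring("""
<products>
  <pc Category="Mountain Bikes"><p Name="Mountain-100 Silver, 38"/></pc>
  <pc Category="Road Bikes"><p Name="Road-150 Red, 62"/></pc>
</products>
""".strip())


def select_nodes(ctx, selector):
    """Evaluate a '$ctx/<path>' selector relative to the context element.

    Assumption: only the path subset supported by ElementTree.findall
    is handled; a real generator would use a full XPath engine.
    """
    prefix = "$ctx/"
    if not selector.startswith(prefix):
        raise ValueError(f"unsupported selector: {selector!r}")
    return ctx.findall(selector[len(prefix):])


def expand_loop(ctx, selector, render_body):
    """Render the loop body once per selected node, each as the new context."""
    return [render_body(item) for item in select_nodes(ctx, selector)]


sections = expand_loop(
    DATA, "$ctx/pc", lambda pc: f"Section: {pc.get('Category')}"
)
print(sections)
```

Passing each selected node to the body as its context is what lets nested controls use short relative selectors.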

Defining conditional content

You can use this when you want to output the enclosed contents only under a certain condition, or when you want to output different contents based on different conditions. Setting it up is very similar to defining the repeating contents described above.

Select the contents that need to be conditionally included, wrap them in a Rich Text Content Control by clicking the bold Aa button on the Developer tab, open the Content Control Properties dialog and set the Tag and Title fields as follows.

To output the contents only if a specified condition is met, enter 'if' in the Tag field. Enter 'elseif' instead if the contents should also be skipped whenever any of the preceding if or elseif conditions was met. In the Title field, enter the condition, which must evaluate to true for the contents to be included. Again, if your data source is XML, you can use XPath with $ctx as the context item.


Enter just 'else' in the Tag field without any Title to output the contents only if none of the preceding if or elseif conditions are met.

If you have multiple if / else groups on the same level, you can wrap each group in its own content control and set its Tag to 'group', without specifying a Title, to keep the conditions of each group together.
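The if / elseif / else rules above amount to picking the first branch in a group whose condition holds. A small Python sketch of that selection logic follows. It is an illustration under assumptions, not Xomega code: a condition counts as true when its `$ctx/` path selects at least one node, and the branch contents, the conditions and the empty "Unlisted" category are made up for the example.

```python
import xml.etree.ElementTree as ET


def condition_holds(ctx, condition):
    """Treat a '$ctx/<path>' condition as true if it selects any node.

    Assumption: conditions are limited to what ElementTree.findall supports.
    """
    prefix = "$ctx/"
    if not condition.startswith(prefix):
        raise ValueError(f"unsupported condition: {condition!r}")
    return bool(ctx.findall(condition[len(prefix):]))


def pick_branch(ctx, branches):
    """Return the content of the first applicable branch, or None.

    branches lists (tag, condition, content) in document order, where tag
    is 'if', 'elseif' or 'else', and an 'else' branch has no condition.
    """
    for tag, condition, content in branches:
        if tag == "else" or condition_holds(ctx, condition):
            return content
    return None


# One group of branches, as they might be tagged in a template.
BRANCHES = [
    ("if", "$ctx/p[@ListPrice='$539.99']", "Includes a $539.99 model"),
    ("elseif", "$ctx/p", "Has products"),
    ("else", None, "No products in this category"),
]

mountain = ET.fromstring(
    '<pc Category="Mountain Bikes">'
    '<p Name="Mountain-500 Black, 52" ListPrice="$539.99"/></pc>'
)
road = ET.fromstring(
    '<pc Category="Road Bikes">'
    '<p Name="Road-150 Red, 62" ListPrice="$3,578.27"/></pc>'
)
empty = ET.fromstring('<pc Category="Unlisted"/>')  # hypothetical category

for category in (mountain, road, empty):
    print(category.get("Category"), "->", pick_branch(category, BRANCHES))
```

Because evaluation stops at the first match, two independent chains on the same level need to be kept apart, which is the job of the 'group' control.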

Outputting simple values

This is what you would use to output actual values from your data source, such as the values in each column of a table or the names of your sections. To output a simple value you again add a content control to your document, but this time you click the non-bold Aa button in the Developer tab to insert a Plain Text Content Control, since it will not require any nested controls.

After you open the content control properties, set the Tag to 'val' and the Title to an expression that evaluates to your value, which would be an XPath expression if your data source is XML-based.


Technically, you can put anything or nothing inside your content control, since it will be replaced with the evaluated value. In practice, though, you will want to put some sample data inside the control to help the designer see what it would look like in the actual document.
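As a sketch of what replacing a 'val' placeholder involves at the XML level, the Python snippet below reads the selector from the control's Title (`w:alias`), evaluates it against a data element, and overwrites the sample text. It is not Xomega's implementation. The selector syntax is restricted to the assumed form `$ctx/@attribute`, and whether the real generator keeps or drops the surrounding content control is not something this sketch tries to reproduce.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# A 'val' control holding sample text, as a designer might leave it.
CONTROL = ET.fromstring(f"""
<w:sdt xmlns:w="{W_NS}">
  <w:sdtPr>
    <w:alias w:val="$ctx/@Name"/>
    <w:tag w:val="val"/>
  </w:sdtPr>
  <w:sdtContent><w:r><w:t>Sample product</w:t></w:r></w:sdtContent>
</w:sdt>
""".strip())


def eval_value(ctx, selector):
    """Evaluate a '$ctx/@attr' selector as an attribute lookup (toy subset)."""
    prefix = "$ctx/@"
    if not selector.startswith(prefix):
        raise ValueError(f"unsupported selector: {selector!r}")
    return ctx.get(selector[len(prefix):], "")


def fill_value(control, ctx):
    """Replace the sample text in a 'val' control with the evaluated value."""
    selector = control.find(f"{W}sdtPr/{W}alias").get(f"{W}val")
    text_node = control.find(f"{W}sdtContent/{W}r/{W}t")
    text_node.text = eval_value(ctx, selector)
    return text_node.text


product = ET.fromstring('<p Name="Road-650 Red, 52" ListPrice="$782.99"/>')
filled = fill_value(CONTROL, product)
print(filled)
```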

Template example

To demonstrate this, let's create a template for a simple price list with tables of product names and prices, grouped into sections by product category. We will create a section with a title and a table with one row under the header. That row will be wrapped in a loop content control, and its cells will hold val content controls for the product name and price respectively. The title will also be a val content control, for the product category, and we will wrap the entire section in another loop content control. Here is what it will look like in the Design Mode.

Below is sample XML that conforms to the expected input structure.

<products>
  <pc Category="Mountain Bikes">
    <p Name="Mountain-100 Silver, 38" ListPrice="$3,399.99" />
    <p Name="Mountain-200 Black, 38" ListPrice="$2,294.99" />
    <p Name="Mountain-300 Black, 48" ListPrice="$1,079.99" />
    <p Name="Mountain-500 Black, 52" ListPrice="$539.99" />
  </pc>
  <pc Category="Road Bikes">
    <p Name="Road-150 Red, 62" ListPrice="$3,578.27" />
    <p Name="Road-650 Red, 52" ListPrice="$782.99" />
    <p Name="Road-250 Red, 58" ListPrice="$2,443.35" />
    <p Name="Road-750 Black, 52" ListPrice="$539.99" />
  </pc>
</products>
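To see how the nested controls and this data fit together, here is a small Python sketch that walks a tree of loop and val placeholders and collects the values in the order they would appear in the document. The selectors (`$ctx/pc`, `$ctx/@Category`, `$ctx/p`, `$ctx/@Name`, `$ctx/@ListPrice`) are assumptions about what the template's Title fields would contain, and the tuple-based template is a stand-in for the real content controls.

```python
import xml.etree.ElementTree as ET

PRICE_LIST = ET.fromstring("""
<products>
  <pc Category="Mountain Bikes">
    <p Name="Mountain-100 Silver, 38" ListPrice="$3,399.99" />
    <p Name="Mountain-200 Black, 38" ListPrice="$2,294.99" />
    <p Name="Mountain-300 Black, 48" ListPrice="$1,079.99" />
    <p Name="Mountain-500 Black, 52" ListPrice="$539.99" />
  </pc>
  <pc Category="Road Bikes">
    <p Name="Road-150 Red, 62" ListPrice="$3,578.27" />
    <p Name="Road-650 Red, 52" ListPrice="$782.99" />
    <p Name="Road-250 Red, 58" ListPrice="$2,443.35" />
    <p Name="Road-750 Black, 52" ListPrice="$539.99" />
  </pc>
</products>
""".strip())

# (tag, selector, nested placeholders), mirroring the nesting of the controls.
TEMPLATE = ("loop", "$ctx/pc", [
    ("val", "$ctx/@Category", []),
    ("loop", "$ctx/p", [
        ("val", "$ctx/@Name", []),
        ("val", "$ctx/@ListPrice", []),
    ]),
])


def render(node, ctx):
    """Return the values produced by one placeholder, in document order."""
    tag, selector, children = node
    path = selector[len("$ctx/"):]  # assumes every selector starts with $ctx/
    if tag == "val":
        return [ctx.get(path.lstrip("@"))]  # '@Name' -> attribute 'Name'
    if tag == "loop":
        values = []
        for item in ctx.findall(path):  # each item becomes the new context
            for child in children:
                values.extend(render(child, item))
        return values
    raise ValueError(f"unknown placeholder tag: {tag!r}")


values = render(TEMPLATE, PRICE_LIST)
for value in values:
    print(value)
```

The outer loop yields one section per category, and within it the inner loop yields one name and price pair per product.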

Generating documents with Xomega

To support transformation of such MS Word templates in the Office Open XML format, you can write a generic XSLT stylesheet that reads the XML from the parts of the original Word template, applies your supplied XML data source to it, and packages the resulting XML into the corresponding parts of the target document.

Xomega.Net for Visual Studio comes with such a generic XSLT stylesheet, which it uses to create design documents for your domain and service models. You can, however, use it to generate documents from any custom XML. All you have to do is create a stylesheet that imports doc_common.xsl from Xomega and calls the generate-document template with your XML as the parameter. You need to make sure that the values for the DocumentTemplate and OutputPath parameters are specified either at run time or at design time.
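The read, transform and repackage flow is not specific to XSLT. As a rough illustration, the Python sketch below treats the template as the ZIP package that a .docx file is, rewrites the main part `word/document.xml`, and copies every other part unchanged. It is not Xomega's implementation: the stub package contains placeholder text rather than a valid Word document, and `transform` is a hypothetical callback standing in for the placeholder processing.

```python
import io
import zipfile

MAIN_PART = "word/document.xml"  # main body part of a .docx package


def build_stub_template() -> bytes:
    """Build an in-memory stand-in for a template package.

    Assumption: a real .docx has more parts ([Content_Types].xml,
    relationships, and so on) and real WordprocessingML content.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(MAIN_PART, "<body>template</body>")
        zf.writestr("word/styles.xml", "<styles/>")
    return buf.getvalue()


def generate(template: bytes, transform) -> bytes:
    """Copy the package, passing the main part through transform(str) -> str."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == MAIN_PART:
                data = transform(data.decode("utf-8")).encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()


output = generate(
    build_stub_template(), lambda xml: xml.replace("template", "generated")
)
with zipfile.ZipFile(io.BytesIO(output)) as result:
    print(result.namelist())
    print(result.read(MAIN_PART).decode("utf-8"))
```

Copying the untouched parts as they are is what lets the styles, headers and footers defined in the template carry over to the generated document.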

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- import Xomega base generator -->
  <xsl:import href="doc_common.xsl"/>

  <xsl:param name="DocumentTemplate">PriceListTemplate.docx</xsl:param>
  <xsl:param name="OutputPath">BikePriceList.docx</xsl:param>

  <xsl:template match="/">

    <!-- call Xomega template to convert XML data to the output document -->
    <xsl:call-template name="generate-document">
      <xsl:with-param name="xmlData" select="."/>
    </xsl:call-template>

  </xsl:template>
</xsl:stylesheet>


If you run such a stylesheet with our sample price list data from above then you will get a document generated like this.

Conclusion

We have learned how to create document templates using Content Controls in MS Word, and have seen how to generate documents from those templates and a custom XML data source using the generic Xomega stylesheet.

We would like to hear your thoughts, so please post your comments or contact us with any questions or feedback.
