Archive for January, 2011

The Second Iteration of the Template Document

After great feedback from Svetlin, and after some more contemplation about tables, this post presents the second iteration of a template document to be used in the document generation process.

This post is the third in a series of blog posts. Here is the complete list: Generating Open XML WordprocessingML Documents Blog Post Series

One additional goal that I have for these document templates is that, if necessary, the template designer can specify formatting for a field or for a cell in a table. To facilitate this, I’m going to add the capability to specify the style in a separate nested content control.

In the following template, there are five content controls. The first is a value with a style. The second is a value that uses the style of the containing paragraph. The third generates a table from the query. The table is formatted with the table style of the sample table. The fourth shows conditional content. The last specifies that the user should be asked a question, the answer to which must be shorter than 256 characters.

I am certain that the design for this document template will be refined over the next couple of weeks.

Using a WordprocessingML Document as a Template in the Document Generation Process

In this post, I examine the approaches for building a template document for the document generation process.

This post is the second in a series of blog posts.  Here is the complete list: Generating Open XML WordprocessingML Documents Blog Post Series

In my approach to document generation, a template document is a DOCX document that contains content controls that will control the document generation process.  The document template designer can format this document as desired, and the document generation process will generate documents that have the format of the template document.
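To make the idea concrete, here is a minimal sketch of how a generator might discover the content controls in a template. In WordprocessingML a content control is a w:sdt element, with its tag in w:sdtPr/w:tag and its content in w:sdtContent. The sketch is in Python rather than the C# and LINQ to XML that this series uses, purely to keep it short. The SAMPLE markup, the tag value, the placeholder text, and the function name list_content_controls are all hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, hand-written fragment of a main document part (word/document.xml).
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:sdt>
        <w:sdtPr><w:alias w:val="Value"/><w:tag w:val="Value"/></w:sdtPr>
        <w:sdtContent><w:r><w:t>query goes here</w:t></w:r></w:sdtContent>
      </w:sdt>
    </w:p>
  </w:body>
</w:document>"""


def list_content_controls(document_xml):
    """Return (tag, text) pairs for every content control in the markup."""
    root = ET.fromstring(document_xml)
    controls = []
    for sdt in root.iter(f"{{{W}}}sdt"):
        # The tag lives in the properties element; it may be absent.
        tag = sdt.find("w:sdtPr/w:tag", NS)
        tag_val = tag.get(f"{{{W}}}val") if tag is not None else None
        # Concatenate all text runs inside the control's content.
        content = sdt.find("w:sdtContent", NS)
        text = ""
        if content is not None:
            text = "".join(t.text or "" for t in content.iter(f"{{{W}}}t"))
        controls.append((tag_val, text))
    return controls


controls = list_content_controls(SAMPLE)
```

In a real DOCX, the markup would come from the word/document.xml part inside the package rather than from a string literal.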

When working with content controls, first of all, remember that you need to turn on the developer tab in the ribbon.  Click File => Options => Customize Ribbon, and then turn on the developer tab:

Turning on the Developer Tab

Another point that will make it easier to work with content controls is to turn on design mode.  If design mode is turned off (which is the default), content controls have a square boxed appearance with a tab at the top that contains the title of the content control:

Content Control - not in Design Mode

This is not a problem, except that if the focus is not in a content control, there is no visual indication that the content control is there.  Instead, turn on design mode:

Turning on design mode

With design mode turned on, content controls have blue tags that indicate the beginning and end of the location of a content control.  With design mode turned on, a template document will look something like the following:

Sample template document with content controls

In this document, plain text content controls contain a LINQ query that returns a single value.  Formatting is easy – the value returned by the query takes on the formatting of the containing run or paragraph.

In this document, the rich text content control with Table as its title contains a LINQ query that returns a collection of anonymous types.  The results of the query will be inserted into the document as a WordprocessingML table.  The inserted table will have the formatting of the empty table that is inserted into the rich text content control.
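As a rough illustration of the table case, the following sketch builds a new w:tbl from a collection of rows and copies the table properties from a prototype table, which is one way the generated table could pick up the prototype's formatting. It is a sketch under assumptions, not the implementation from this series: it is written in Python instead of C#, the helper names w and build_table are hypothetical, the sample rows are made up, and a complete table would normally also carry a w:tblGrid and cell properties.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{name}"


def build_table(prototype_tbl, rows):
    """Build a w:tbl holding `rows`, reusing the prototype's table-level properties."""
    tbl = ET.Element(w("tbl"))
    # Copy only table-level formatting, not the prototype's sample rows.
    for child in prototype_tbl:
        if child.tag in (w("tblPr"), w("tblGrid")):
            tbl.append(copy.deepcopy(child))
    for row in rows:
        tr = ET.SubElement(tbl, w("tr"))
        for value in row:
            # Each cell holds one paragraph with one run of text.
            tc = ET.SubElement(tr, w("tc"))
            p = ET.SubElement(tc, w("p"))
            r = ET.SubElement(p, w("r"))
            ET.SubElement(r, w("t")).text = str(value)
    return tbl


# Hypothetical prototype: an empty one-cell table that names a table style.
prototype = ET.fromstring(
    f'<w:tbl xmlns:w="{W}"><w:tblPr><w:tblStyle w:val="TableGrid"/></w:tblPr>'
    f"<w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>"
)
table = build_table(prototype, [("Widget", 3), ("Gadget", 5)])
```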

Other uses of the word ‘Template’ in Microsoft Office

One minor issue around the idea of creating a template WordprocessingML document is that the term ‘template’ is overloaded.  Microsoft Word has the notion of ‘Document Templates’, which are saved with the dotx extension.  These are WordprocessingML documents with one special characteristic – when the user opens one of these documents, the user cannot directly save back to the dotx file – the user must instead supply a new filename, and Word will append docx as the extension.

In addition, related to dotx document templates are ‘document template projects’ in Visual Studio 2010 (and 2008).  These are template-based document-level projects (see Architecture of Document-Level Customizations) that consist of managed code that is attached to a document template instead of a document.  The user opens the template, uses the managed customization to do whatever it does, and then saves as a docx document.  The docx document can have a managed customization, or it can be stripped of the customization, leaving a plain old docx.

For this document generation project, we don’t need to use either of these facilities.  Instead, the template document that the designer creates is, as far as Word is concerned, an ordinary word-processing document.

Using Windows 7 Sticky Notes as a Mini Text Editor

Here is a funny little trick that I didn’t know you could do until about a week ago. Sometimes when refactoring a large chunk of code I need to have multiple snippets of code that I need to paste at various points. Or other times I need to grab some code from some blog post and keep it aside for a bit while I do something else, then go back to it and paste it in the appropriate place. I love using Windows 7 sticky notes for keeping small lists of stuff that have to get done, but by default, sticky notes are in a scripty font. I’m not sure exactly which font it is, but it is certainly not a font that you expect to use to edit code.  There is no obvious way to change the font of a sticky note – if you right click, the only options on the context menu are the background colors of the sticky note.

However, if you format some text using Word, and then copy / paste into a sticky note, the sticky note retains the formatting and font, and you have a mini text editor that you can paste code into.

StickyNotes with fixed font

Generating Open XML WordprocessingML Documents

Generating word-processing documents is perhaps the single most compelling use of Open XML.  The archetypical case is an insurance company or bank that needs to generate tens of thousands of documents per month, archive them, and then make them available online, send them electronically, or print them and send them by post.  But there are about a million variations on this theme.  In this blog series, I am going to examine the various approaches for document generation, and present code that demonstrates them.

This post is the first in a series of blog posts.  Here is the complete list: Generating Open XML WordprocessingML Documents Blog Post Series

I have some goals for the code that I’ll be publishing:

  • First and foremost, I want the document generation process to be data-driven from content controls that you configure in a template document. 
  • The approach that I want to take is that the template designer creates a document, inserts content controls with specific tags, and then inserts specific instructions into each content control.
  • The data that we will supply to the document generation process will be a data-centric XML document.  I’ll place a few constraints on this document.  Some time ago, I wrote about Document-Centric Transforms using LINQ to XML.  That post discusses data-centric vs. document-centric XML documents.  When generating documents from another data source, such as a SQL database or an internal or secure Web service, the task will be to generate a data centric XML document from that source, and then kick off the document generation process.
  • This code should be short and sweet.  I don’t want to create some monolithic code base that would require a design process, formalized coding and testing procedures, and the like.  The question is: how simple and how powerful can such a system be made?  I’m hoping to stay under 1,000 lines of code.  But we have some powerful tools at our disposal, most importantly LINQ to XML used in a functional style.  Also, I will probably code a few recursive functional transforms.
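To illustrate the data-centric XML goal above, here is a minimal sketch of turning rows from some other source into a regular, record-oriented XML document before kicking off generation. It uses Python for brevity rather than C#. The function name rows_to_data_xml, the element names, and the sample order are hypothetical, and the rows stand in for whatever a SQL query or Web service would return.

```python
import xml.etree.ElementTree as ET


def rows_to_data_xml(root_name, row_name, rows):
    """Turn a list of dicts into a data-centric XML tree: one element per row, one child per field."""
    root = ET.Element(root_name)
    for row in rows:
        record = ET.SubElement(root, row_name)
        for field, value in row.items():
            ET.SubElement(record, field).text = str(value)
    return root


# Hypothetical rows, as they might come back from a database query.
orders = [{"Id": 1, "Customer": "Contoso", "Total": 125.5}]
data = rows_to_data_xml("Orders", "Order", orders)
xml_text = ET.tostring(data, encoding="unicode")
# xml_text == "<Orders><Order><Id>1</Id><Customer>Contoso</Customer><Total>125.5</Total></Order></Orders>"
```

The regular shape (every record has the same child elements, with no mixed content) is what makes the document data-centric rather than document-centric.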

I am contemplating four approaches for the instructions that the template designer will place in the content controls.  The content controls could contain:

  • Parameterized XPath expressions: This approach might be the easiest for the template designer to configure.
  • XSLT sequence constructors: This approach might possibly be the easiest to code.  It might be very, very short if you exclude existing code such as transforming OPC back and forth to Flat OPC, OpenXmlCodeTester, and the axes I detailed in Mastering Text in Open XML WordprocessingML Documents.  I am contemplating using XSLT 2.0.
  • .NET code (either VB or C#): This approach reminds me of code that I presented in OpenXmlCodeTester: Validating Code in Open XML Documents.  It might be cool to put a LINQ expression in a content control that projects a collection of rows and columns that become a table in the word-processing document.  There could be some cool and easy ways to supply formatting.
  • Some XML dialect that I invent as I go along.
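To give a feel for the first option, here is one possible reading of a parameterized XPath expression: the content control holds an expression with named placeholders, and the generator fills them in and evaluates the result against the data document. This is only a sketch under that assumption, in Python rather than C#. The placeholder syntax, the evaluate function, and the sample data are all hypothetical, and Python's ElementTree supports only a small subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-centric document supplied to the generation process.
DATA = ET.fromstring(
    "<Orders>"
    "<Order><Id>1</Id><Customer>Contoso</Customer></Order>"
    "<Order><Id>2</Id><Customer>Fabrikam</Customer></Order>"
    "</Orders>"
)


def evaluate(expression, data, **params):
    """Fill the named placeholders in `expression`, then return the text of the first match (or None)."""
    # Naive substitution: parameter values are not escaped, so they must not contain quotes.
    path = expression.format(**params)
    node = data.find(path)
    return node.text if node is not None else None


# The string below plays the role of the text stored in a content control.
customer = evaluate("./Order[Id='{order_id}']/Customer", DATA, order_id=2)
```

Here customer ends up as "Fabrikam", and an expression that matches nothing yields None.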

I’m not sure which approach I’ll take.  I want to play around with all four approaches, and see which one is easiest to use, and which one is easiest to develop.  As I start playing around with these (and posting the code as I go along), I’ll make some design decisions, and list my reasons for the decisions.

By the way, I really love to have discussions about these things.  If you agree or disagree with any of my design decisions, feel free to chime in.  You can register so we can have more of a discussion, or post anonymously, as you like.

In the next post, I’m going to examine template documents, and define exactly what I mean by a template document.

Staying Current with Microsoft Developer Technologies

One of my goals when coming to Microsoft was to get, as much as possible, a high-level view of all Microsoft developer technologies that are interesting to me.  My interests are wide – of course, everything Office/SharePoint, but I am also interested in every data access technology, Silverlight, Windows Phone, Web technologies, graphics design, and on and on.

The other day, a friend of mine at Microsoft asked me what I do to get a broad overview of Microsoft developer technologies, and here is what I told him:

“I listen to every single developer presentation from TechEd, PDC, and MIX.”

Of course, I listen to a number of overview and IT Pro sessions also.  I also make sure I listen to the keynotes – they are interesting not because of technology, but because they give clues into what Microsoft senior management is promoting these days.  Every presentation is available online for download.  This is an incredibly valuable resource.

Note that I said ‘listen’, not watch.  I download the sessions, put them on my Zune, and then start working my way through them.  If I’m doing the dishes, or in my car, or wherever I have dead time, I put on the headphones and continue where I left off.  I listen in the sauna – I can’t believe my old 30GB Zune still works after a hundred trips into the sauna.  Good hardware.  I’ve been doing this for a couple of years, and have listened to many hundreds of sessions.

I don’t necessarily need to examine every line of code in the presentation – I am more interested in listening ‘over the top’ and understanding capabilities, strengths, weaknesses, and overall architecture.  Of course, there are some sessions that are so important that I go back to my PC and watch in high resolution so I can read the code on the screen or see the demos.  And there are others, where the technology is less interesting to me, or the technical content isn’t there, where I quit watching early and move on to the next video.

Which brings me to the point of this post: I’ll be tweeting good sessions as I come across them, with a few words about why I liked them.  Granted, tweeting about sessions presented six months ago is not news; however, Twitter is the best forum for these.  Sometimes they don’t warrant a blog post, particularly a post that is so very far off topic from what I blog about (Open XML / Office / SharePoint / Functional Programming), but sometimes the information is just too cool not to share.  I’m just letting you know what I found interesting, and why.

Also, I have started a Good Microsoft Conference Sessions page where I’ll list the sessions after I watch them, along with my bit of commentary.  I am also going to add a number from past conferences as time permits.

Welcome – First Post

As I mentioned on my MSDN blog, January 21, 2011 will be my last day as a Microsoft employee. I am shutting down the MSDN blog, and am commencing to blog here at ericwhite.com/blog. If you liked my previous blog, I expect that you will like this one. Here are my first projects:

Open XML Document Generation

I also want to write a number of samples around parameterized document generation. While it is fairly straightforward to build applications that can generate documents, I think that there is much more that can be done to make document generation much simpler and more powerful. It should be possible to generate a wide variety of documents from a wide variety of data sources by changing a source template document that contains content controls. There are existing MSDN articles that show how to generate a lot of documents, but I think I’ll have my own special take on this problem.

Open XML SpreadsheetML Formulas

I’m going to complete the blog post series on writing a recursive descent parser for SpreadsheetML formulas using C# and LINQ.

SharePoint

I want to teach myself the nooks and crannies of the SharePoint server object model. In that process, I’m going to write a fair number of snippets that demonstrate how to work with particular parts of the API. I’ll be blogging those snippets. In addition, I’ve wanted to write about a number of areas of SharePoint.

I’m not sure where else my interests will take me, but one thing is certain – I’m going to continue to focus on Open XML. There is much more that can be done with functional programming and Open XML. And I think it is possible to build some amazing applications using Open XML in a SharePoint application. In blog posts on my MSDN blog, I’ve worked out the mechanics of working with Open XML documents in SharePoint; however, I haven’t really started taking advantage of file formats to ‘light up’ documents.

This is going to be fun!

Eric White, software developer

About

I am an independent software developer / writer.  I am particularly passionate about Open XML, SharePoint development, and Office development.  I am a fan of functional programming, and think that using functional programming to transform documents can change the way we think about Open XML documents.
