{"id":69,"date":"2011-01-24T15:39:50","date_gmt":"2011-01-24T15:39:50","guid":{"rendered":"http:\/\/www.ericwhite.com\/home2\/bm8qcmjy\/public_html\/blog\/?p=69"},"modified":"2011-01-26T16:12:03","modified_gmt":"2011-01-26T16:12:03","slug":"generating-open-xml-wordprocessingml-documents","status":"publish","type":"post","link":"https:\/\/www.ericwhite.com\/blog\/2011\/01\/24\/generating-open-xml-wordprocessingml-documents\/","title":{"rendered":"Generating Open XML WordprocessingML Documents"},"content":{"rendered":"<p>Generating word-processing documents is perhaps the single most compelling use of Open XML.\u00a0 The archetypical case is an insurance company or bank that needs to generate tens of thousands of documents per month and archive them and make them available online, send them electronically, or print them and send via post.\u00a0 But there are about a million variations on this theme.\u00a0 In this blog series, I am going to examine the various approaches for document generation.\u00a0 I\u2019m going to present code that demonstrates the various approaches.<\/p>\n<p>This post is the first in a series of blog posts.\u00a0 Here is the complete list: <a href=\"https:\/\/www.ericwhite.com\/blog\/map\/generating-open-xml-wordprocessingml-documents-blog-post-series\/\">Generating Open XML WordprocessingML Documents Blog Post Series<\/a><\/p>\n<p>I have some goals for the code that I\u2019ll be publishing:<\/p>\n<ul>\n<li>First and foremost, I want the document generation process to be data-driven from content controls that you configure in a template document.\u00a0<\/li>\n<li>The approach that I want to take is that the template designer creates a document, inserts content controls with specific tags, and then inserts specific instructions into each content control.<\/li>\n<li>The data that we will supply to the document generation process will be a data-centric XML document.\u00a0 I\u2019ll place a few constraints on this document.\u00a0 Some time 
ago, I wrote about <a href=\"http:\/\/blogs.msdn.com\/b\/ericwhite\/archive\/2009\/07\/09\/document-centric-transforms-using-linq-to-xml.aspx\" class=\"broken_link\">Document-Centric Transforms using LINQ to XML<\/a>.\u00a0 That post discusses data-centric vs. document-centric XML documents.\u00a0 When generating documents from another data source, such as a SQL database or an internal or secure Web service, the task will be to generate a data-centric XML document from that source, and then kick off the document generation process.<\/li>\n<li>This code should be short and sweet.\u00a0 I don\u2019t want to create some monolithic code base that would require a design process, formalized coding and testing procedures, and the like.\u00a0 The question is: how simple and how powerful can such a system be made?\u00a0 I\u2019m hoping to stay under 1,000 lines of code.\u00a0 But we have some powerful tools at our disposal, most importantly <a href=\"http:\/\/blogs.msdn.com\/b\/ericwhite\/archive\/2006\/10\/04\/fp-tutorial.aspx\" class=\"broken_link\">LINQ to XML in a functional style<\/a>.\u00a0 Also, I probably will code a few <a href=\"http:\/\/blogs.msdn.com\/b\/ericwhite\/archive\/2009\/07\/20\/a-tutorial-in-the-recursive-approach-to-pure-functional-transformations-of-xml.aspx\" class=\"broken_link\">recursive functional transforms<\/a>.<\/li>\n<\/ul>\n<p>I am contemplating four approaches for the instructions that the template designer will place in the content controls.\u00a0 The content controls could contain:<\/p>\n<ul>\n<li>Parameterized XPath expressions: This approach might be the easiest for the template designer to configure.<\/li>\n<li>XSLT sequence constructors: This approach possibly might be the easiest to code.\u00a0 It might be very, very short if you exclude existing code such as <a href=\"http:\/\/blogs.msdn.com\/b\/ericwhite\/archive\/2008\/09\/29\/the-flat-opc-format.aspx\" class=\"broken_link\">transforming OPC back and forth to Flat 
OPC<\/a>, <a href=\"http:\/\/blogs.msdn.com\/b\/ericwhite\/archive\/2008\/09\/08\/openxmlcodetester-validating-code-in-open-xml-documents.aspx?wa=wsignin1.0\" class=\"broken_link\">OpenXmlCodeTester<\/a>, and the axes I detailed in <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/ff686712.aspx\">Mastering Text in Open XML WordprocessingML Documents<\/a>.\u00a0 I am contemplating using XSLT 2.0.<\/li>\n<li>.NET code (either VB or C#): This approach reminds me of code that I presented in <a href=\"http:\/\/blogs.msdn.com\/b\/ericwhite\/archive\/2008\/09\/08\/openxmlcodetester-validating-code-in-open-xml-documents.aspx?wa=wsignin1.0\" class=\"broken_link\">OpenXmlCodeTester: Validating Code in Open XML Documents<\/a>.\u00a0 It might be cool to put a LINQ expression in a content control that projects a collection of rows and columns that become a table in the word-processing document.\u00a0 There could be some cool and easy ways to supply formatting.<\/li>\n<li>Some XML dialect that I invent as I go along.<\/li>\n<\/ul>\n<p>I\u2019m not sure which approach I\u2019ll take.\u00a0 I want to play around with all four approaches, and see which one is easiest to use, and which one is easiest to develop.\u00a0 As I start playing around with these (and posting the code as I go along), I\u2019ll make some design decisions, and list my reasons for the decisions.<\/p>\n<p>By the way, I really love to have discussions about these things.\u00a0 If you agree or disagree with any of my design decisions, feel free to chime in.\u00a0 You can register so we can have more of a discussion, or post anonymously, as you like.<\/p>\n<p>In the next post, I\u2019m going to examine template documents, and define exactly what I mean by a template document.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduces this blog post series on generating WordprocessingML documents, outlines the goals of the series, and describes various approaches that I may take as I develop some document 
generation examples.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"_s2mail":"","footnotes":""},"categories":[7,3,5],"tags":[],"class_list":["post-69","post","type-post","status-publish","format-standard","hentry","category-document-generation-series","category-open-xml","category-wordprocessingml"],"_links":{"self":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/posts\/69","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/comments?post=69"}],"version-history":[{"count":9,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/posts\/69\/revisions"}],"predecessor-version":[{"id":128,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/posts\/69\/revisions\/128"}],"wp:attachment":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/media?parent=69"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/categories?post=69"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/tags?post=69"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}