SimplifyMarkup – Saved file corrupted
Tagged: Error, SimplifyMarkup
- This topic has 3 replies, 2 voices, and was last updated 9 years, 3 months ago by
FRCMNS0.
November 28, 2016 at 4:45 pm #3981
FRCMNS0
Participant
As an addendum: the other SimplifyMarkupSettings options (everything besides NormalizeXml) don’t cause errors. The problem occurs only when NormalizeXml is set.
November 28, 2016 at 5:37 pm #3982
Eric White
Keymaster
Hi,
I think that there is something else causing this problem, not MarkupSimplifier.
You are getting a failure parsing the XML in the /word/styles.xml part, not in the main document part, which is what NormalizeXml operates on. It looks as though your styles.xml part may be empty. That could be caused by any of a variety of things, but probably not by MarkupSimplifier. That is not to say that MarkupSimplifier doesn’t modify styles.xml (it might, I can’t recall), but it is not the first place I’d look for this bug. I’d look for what is writing to styles.xml and see why the XML parser is failing on it.
You can also manually examine the styles.xml file using the Open XML Package Editor Add-In for Visual Studio. That may provide a clue as to why the parser is failing on reading the styles.xml part.
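If the add-in is not at hand, a similar check can be done in a few lines of code. The following is only a rough sketch, not part of Open-Xml-PowerTools: it treats the .docx as a plain ZIP archive and reports whether the word/styles.xml entry is missing, empty, or not well-formed. The class name `StylesPartCheck` and the method `Describe` are made up for this illustration, and the default file name is taken from the sample later in this thread. On .NET Framework, `ZipFile` needs a reference to System.IO.Compression.FileSystem.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of Open-Xml-PowerTools.
static class StylesPartCheck
{
    // Returns a short diagnostic for the word/styles.xml entry of a .docx file.
    public static string Describe(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/styles.xml");
            if (entry == null)
                return "missing";
            if (entry.Length == 0)
                return "empty";
            try
            {
                using (Stream s = entry.Open())
                {
                    // Throws XmlException if the part is not well-formed XML.
                    XDocument doc = XDocument.Load(s);
                    return "ok: root=" + doc.Root.Name.LocalName;
                }
            }
            catch (XmlException ex)
            {
                return "invalid: " + ex.Message;
            }
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        // The default file name is an assumption based on the sample in this thread.
        string path = args.Length > 0 ? args[0] : "PORTA_copy.docx";
        Console.WriteLine(StylesPartCheck.Describe(path));
    }
}
```

Running this against the document before and after the processing step should at least show whether the part was already empty or malformed going in.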
Best, Eric
November 28, 2016 at 6:10 pm #3983
FRCMNS0
Participant
In this case, I am sure that MarkupSimplifier is modifying the styles.xml file. Here is the entire sample code used:
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

namespace DocxTest
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                File.Copy("PORTA.docx", "PORTA_copy.docx");
                using (var docMaster = WordprocessingDocument.Open("PORTA_copy.docx", true))
                {
                    SimplifyMarkupSettings settings = new SimplifyMarkupSettings
                    {
                        NormalizeXml = true, // Merges Run's in a paragraph with similar formatting
                        // Additional settings if required
                        RemoveBookmarks = true,
                        RemoveComments = true,
                        RemoveGoBackBookmark = true,
                        RemoveWebHidden = true,
                        RemoveContentControls = true,
                        RemoveEndAndFootNotes = true,
                        //RemoveFieldCodes = true,
                        RemoveLastRenderedPageBreak = true,
                        RemovePermissions = true,
                        RemoveProof = true,
                        RemoveRsidInfo = true,
                        RemoveSmartTags = true,
                        RemoveSoftHyphens = true,
                    };
                    MarkupSimplifier.SimplifyMarkup(docMaster, settings);
                    docMaster.Save();
                }
                Console.WriteLine("Done.");
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error: {0}", ex.ToString());
            }
            Console.ReadLine();
        }
    }
}

There’s nothing else being done to the document.
Most of the differences are an extra space before the close of a tag or reordered attributes. The major change is right at the beginning of the file, mostly additional namespace declarations.
Here is a WinMerge report with the differences highlighted:
https://drive.google.com/file/d/0B0ZNalzpb4uFRjdndWFidTduME0/view?usp=sharing
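Whitespace inside tags and attribute order are not significant to an XML parser, and extra namespace declarations usually are not either, so a text diff can overstate how much actually changed. As a rough sketch only (the `XmlCompare` class and its methods are invented for this illustration and are not part of Open-Xml-PowerTools), two versions of a part can be compared after dropping namespace declarations and sorting attributes. One caveat: attributes whose values refer to prefixes, such as mc:Ignorable, do depend on the declarations, so this comparison can hide a real difference there.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of Open-Xml-PowerTools.
static class XmlCompare
{
    // Rebuilds an element with namespace declarations dropped and the
    // remaining attributes sorted by expanded name. Child elements are
    // normalized recursively; other nodes (text, comments) are copied as-is.
    static XElement Normalize(XElement e)
    {
        return new XElement(e.Name,
            e.Attributes()
             .Where(a => !a.IsNamespaceDeclaration)
             .OrderBy(a => a.Name.ToString(), StringComparer.Ordinal),
            e.Nodes().Select(n => n is XElement ? (XNode)Normalize((XElement)n) : n));
    }

    // True if the two XML strings have the same elements, attributes and text,
    // ignoring attribute order, namespace declarations and insignificant whitespace.
    public static bool SameContent(string xmlA, string xmlB)
    {
        XElement a = Normalize(XDocument.Parse(xmlA).Root);
        XElement b = Normalize(XDocument.Parse(xmlB).Root);
        return XNode.DeepEquals(a, b);
    }
}

class Program
{
    static void Main()
    {
        // Made-up fragments that differ only in the ways described above.
        string before = "<w:style xmlns:w='urn:w' w:type='paragraph' w:styleId='Normal'/>";
        string after = "<w:style xmlns:w='urn:w' xmlns:r='urn:r' w:styleId='Normal' w:type='paragraph' />";
        Console.WriteLine(XmlCompare.SameContent(before, after));
    }
}
```

If a check like this reports the two versions of styles.xml as equal, the reordering and extra declarations are unlikely to be what the parser is failing on.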