koolprasad2003

Forum Replies Created

Viewing 2 posts - 1 through 2 (of 2 total)
  • in reply to: why openxml block document if it has hyperlink #8830

    koolprasad2003
    Participant

    Resolved!
    I got the following code from
    Following is the complete listing of the class UriFixer, as well as the code to use it. The approach that you take when using this class is to first attempt to open the document as usual, catching OpenXmlPackageException. If that exception is thrown, and if the text of that exception contains “Invalid Hyperlink”, then the code calls UriFixer.FixInvalidUri. After calling FixInvalidUri, the code then opens the fixed document (or spreadsheet / presentation) as usual.

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using System.Xml;
    using System.Xml.Linq;
    using DocumentFormat.OpenXml.Packaging;
    
    class Program
    {
        static void Main(string[] args)
        {
            var fileName = @"..\..\..\Test.docx";
            var newFileName = @"..\..\..\Fixed.docx";
            var newFileInfo = new FileInfo(newFileName);
    
            if (newFileInfo.Exists)
                newFileInfo.Delete();
    
            File.Copy(fileName, newFileName);
    
            WordprocessingDocument wDoc;
            try
            {
                using (wDoc = WordprocessingDocument.Open(newFileName, true))
                {
                    ProcessDocument(wDoc);
                }
            }
            catch (OpenXmlPackageException e)
            {
                if (e.ToString().Contains("Invalid Hyperlink"))
                {
                    using (FileStream fs = new FileStream(newFileName, FileMode.OpenOrCreate, FileAccess.ReadWrite))
                    {
                        UriFixer.FixInvalidUri(fs, brokenUri => FixUri(brokenUri));
                    }
                    using (wDoc = WordprocessingDocument.Open(newFileName, true))
                    {
                        ProcessDocument(wDoc);
                    }
                }
            }
        }
    
        private static Uri FixUri(string brokenUri)
        {
            return new Uri("http://broken-link/");
        }
    
        private static void ProcessDocument(WordprocessingDocument wDoc)
        {
            var elementCount = wDoc.MainDocumentPart.Document.Descendants().Count();
            Console.WriteLine(elementCount);
        }
    }
    
    public static class UriFixer
    {
        public static void FixInvalidUri(Stream fs, Func<string, Uri> invalidUriHandler)
        {
            XNamespace relNs = "http://schemas.openxmlformats.org/package/2006/relationships";
            using (ZipArchive za = new ZipArchive(fs, ZipArchiveMode.Update))
            {
                foreach (var entry in za.Entries.ToList())
                {
                    if (!entry.Name.EndsWith(".rels"))
                        continue;
                    bool replaceEntry = false;
                    XDocument entryXDoc = null;
                    using (var entryStream = entry.Open())
                    {
                        try
                        {
                            entryXDoc = XDocument.Load(entryStream);
                            if (entryXDoc.Root != null && entryXDoc.Root.Name.Namespace == relNs)
                            {
                                var urisToCheck = entryXDoc
                                    .Descendants(relNs + "Relationship")
                                    .Where(r => r.Attribute("TargetMode") != null && (string)r.Attribute("TargetMode") == "External");
                                foreach (var rel in urisToCheck)
                                {
                                    var target = (string)rel.Attribute("Target");
                                    if (target != null)
                                    {
                                        try
                                        {
                                            Uri uri = new Uri(target);
                                        }
                                        catch (UriFormatException)
                                        {
                                            Uri newUri = invalidUriHandler(target);
                                            rel.Attribute("Target").Value = newUri.ToString();
                                            replaceEntry = true;
                                        }
                                    }
                                }
                            }
                        }
                        catch (XmlException)
                        {
                            continue;
                        }
                    }
                    if (replaceEntry)
                    {
                        var fullName = entry.FullName;
                        entry.Delete();
                        var newEntry = za.CreateEntry(fullName);
                        using (StreamWriter writer = new StreamWriter(newEntry.Open()))
                        using (XmlWriter xmlWriter = XmlWriter.Create(writer))
                        {
                            entryXDoc.WriteTo(xmlWriter);
                        }
                    }
                }
            }
        }
    }

    We are considering including this method in the Open XML SDK itself. We would make a few overloads of the WordprocessingDocument.Open method, the SpreadsheetDocument.Open method, and the PresentationDocument.Open method. These overloads would take the callback as an argument, just as in the above example. These new methods would first attempt to open the document in the normal way. If the attempt to open is successful, then these methods would return the newly opened document. However, if System.IO.Packaging throws the OpenXmlPackageException, and if the document were opened for writing, then the method would open, modify, and save a fixed document. It would then attempt to open again, and return the newly opened document.

    With this approach, the idiom to open the document would be almost identical to the current approach to opening a document. The only difference would be the inclusion of the callback method as an argument.

    If the document was opened for read-only access, then the various methods would create a copy of the document in memory, fix the broken Uri objects, and then open and return the fixed document (for read-only access).
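    To make the proposed behaviour concrete, here is a minimal sketch of such a wrapper for Word documents. The class `SafeOpen`, the method `OpenWord`, and the `fixPackage` parameter are hypothetical names of my own and are not part of the Open XML SDK. To keep the sketch self-contained, it takes the repair step as a delegate rather than calling `UriFixer` directly.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper; this is not an Open XML SDK API.
public static class SafeOpen
{
    // fixPackage is assumed to repair the package in place, for example
    // s => UriFixer.FixInvalidUri(s, FixUri) from the listing above.
    public static WordprocessingDocument OpenWord(
        string path, bool isEditable, Action<Stream> fixPackage)
    {
        try
        {
            // Normal case: the document opens and is returned as-is.
            return WordprocessingDocument.Open(path, isEditable);
        }
        catch (OpenXmlPackageException e)
        {
            // Only handle the invalid-hyperlink failure; rethrow anything else.
            if (!e.ToString().Contains("Invalid Hyperlink"))
                throw;
        }

        if (isEditable)
        {
            // Opened for writing: fix the file on disk, then open it again.
            using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
            {
                fixPackage(fs);
            }
            return WordprocessingDocument.Open(path, true);
        }

        // Read-only: leave the file untouched and fix a copy in memory.
        var copy = new MemoryStream();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            fs.CopyTo(copy);
        }
        copy.Position = 0;
        fixPackage(copy);

        // ToArray still works if the fixer closed the stream (ZipArchive
        // disposes its stream by default), so reopen from the bytes.
        return WordprocessingDocument.Open(new MemoryStream(copy.ToArray()), false);
    }
}
```

    With the classes from the listing above, a call would look like `SafeOpen.OpenWord(newFileName, true, s => UriFixer.FixInvalidUri(s, FixUri))`, which keeps the calling code close to the usual `Open` idiom.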

    Please feel free to comment about how this approach would work for you. If we have agreement on this approach, then in a month or two, we will make the change to the open source version of the Open XML SDK.


    koolprasad2003
    Participant

    Hello Eric,
    Thanks for the reply.
    Sure. The easiest way to find out is to compare the before and after versions of the document using the Productivity Tool.
    The following code fetches the values of ActiveX controls, but it works only for Word 2007.

    using System;
    using System.Collections.Generic;
    using System.Xml.Linq;
    using System.Xml;
    using System.IO;
    using System.Text;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Wordprocessing;
    using DocumentFormat.OpenXml.Packaging;
    
    namespace OpenXMLTest
    {
        class Program
        {
            const string textBoxId = "{8BD21D10-EC42-11CE-9E0D-00AA006002F3}";
            const string radioButtonId = "{8BD21D50-EC42-11CE-9E0D-00AA006002F3}";
            const string checkBoxId = "{8BD21D40-EC42-11CE-9E0D-00AA006002F3}";
    
            static void Main(string[] args)
            {
                string fileName = @"C:\Users\Andy\Desktop\test_l1demo.docx";
                using (WordprocessingDocument doc = WordprocessingDocument.Open(fileName, false))
                {
                    foreach (Control control in doc.MainDocumentPart.Document.Body.Descendants<Control>())
                    {
                        Console.WriteLine();
                        Console.WriteLine("Control {0}:", control.Name);
                        Console.WriteLine("Id: {0}", control.Id);
    
                        displayControlDetails(doc, control.Id);
                    }
                }
    
                Console.Read();
            }
    
            private static void displayControlDetails(WordprocessingDocument doc, StringValue controlId)
            {
                string classId, type, value;
    
                OpenXmlPart part = doc.MainDocumentPart.GetPartById(controlId);
                OpenXmlReader reader = OpenXmlReader.Create(part.GetStream());
                reader.Read();
                OpenXmlElement controlDetails = reader.LoadCurrentElement();
    
                classId = controlDetails.GetAttribute("classid", controlDetails.NamespaceUri).Value;
    
                switch (classId)
                {
                    case textBoxId:
                        type = "TextBox";
                        break;
                    case radioButtonId:
                        type = "Radio Button";
                        break;
                    case checkBoxId:
                        type = "CheckBox";
                        break;
                    default:
                        type = "Not known";
                        break;
                }
    
                value = "No value attribute"; //displays this if there is no "value" attribute found
                foreach (OpenXmlElement child in controlDetails.Elements())
                {
                    if (child.GetAttribute("name", controlDetails.NamespaceUri).Value == "Value")
                    {
                        //we've found the value typed by the user in this control
                        value = child.GetAttribute("value", controlDetails.NamespaceUri).Value;
                    }
                }
    
                reader.Close();
    
                Console.WriteLine("Class id: {0}", classId);
                Console.WriteLine("Control type: {0}", type);
                Console.WriteLine("Control value: {0}", value);
    
            }
        }
    }
    
    

    I observed that Word 2007 stores ActiveX values differently from Word 2010; I think the tags have changed.
    As a result, the code does not work for documents from Word 2010 and above.
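    One way to check where the two versions differ is to look at the root element of each ActiveX part. The sketch below uses only `System.IO.Compression` and LINQ to XML and reports the `persistence` attribute of every `ax:ocx` root under `word/activeX/`. The class and method names are hypothetical. It also rests on an assumption I have not verified against these files: that Word 2007 writes the properties inline (`persistPropertyBag`, with `ocxPr` children) while later versions may reference a binary part instead (for example `persistStreamInit`).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

// Hypothetical diagnostic helper; not part of the Open XML SDK.
public static class ActiveXInspector
{
    static readonly XNamespace Ax = "http://schemas.microsoft.com/office/2006/activeX";

    // Maps each ActiveX XML part name to the value of its persistence attribute.
    public static Dictionary<string, string> GetPersistenceModes(Stream docx)
    {
        var result = new Dictionary<string, string>();
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (var entry in zip.Entries)
            {
                // Assumption: ActiveX parts live under word/activeX/ as .xml files.
                if (!entry.FullName.StartsWith("word/activeX/", StringComparison.OrdinalIgnoreCase)
                    || !entry.Name.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
                    continue;

                using (var stream = entry.Open())
                {
                    var root = XDocument.Load(stream).Root;
                    if (root == null || root.Name != Ax + "ocx")
                        continue;
                    result[entry.FullName] =
                        (string)root.Attribute(Ax + "persistence") ?? "(none)";
                }
            }
        }
        return result;
    }
}
```

    Running this on the same document saved by Word 2007 and by Word 2010 (open each file with `File.OpenRead` and print the returned pairs) should show whether the value is still stored in the XML part. If it is not, the `child.GetAttribute("name", ...)` loop above has nothing to read.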

    Thanks
    Prasad
