{"id":1978,"date":"2016-02-03T15:52:59","date_gmt":"2016-02-03T15:52:59","guid":{"rendered":"http:\/\/www.ericwhite.com\/home2\/bm8qcmjy\/public_html\/blog\/?page_id=1978"},"modified":"2016-03-22T23:09:51","modified_gmt":"2016-03-22T23:09:51","slug":"search-and-replace-content-in-docx-pptx-using-regular-expressions","status":"publish","type":"page","link":"https:\/\/www.ericwhite.com\/blog\/search-and-replace-content-in-docx-pptx-using-regular-expressions\/","title":{"rendered":"Search and Replace Content in DOCX, PPTX using Regular Expressions"},"content":{"rendered":"<p><span class=\"Back\"><a class=\"Back\" href=\"https:\/\/www.ericwhite.com\/blog\/openxmlregex-developer-center\/\">Return to the<br \/>OpenXmlRegex<br \/>Developer Center<\/a><\/span><span><b>OpenXmlRegex<\/b> is a class in <b>PowerTools for Open XML<\/b> that enables you to search and optionally replace content in DOCX and PPTX using regular expressions. &nbsp;The following screen-cast demonstrates the <b>OpenXmlRegex<\/b> class, and explains some of the more interesting semantics of it.<\/span><\/p>\n<p><iframe loading=\"lazy\" title=\"OpenXmlRegex\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/rDGL-i5zRdk?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<p><span>To use the <b>OpenXmlRegex<\/b> class, you first query for a collection of paragraphs, and then pass that collection into the various methods. &nbsp;This enables you to select a subset of a document and perform search or replace operations just in that subset. 
&nbsp;Further, it enables you to perform search and replace operations on parts other than the main document part; you can query for paragraphs in a header or footer and perform search or replace operations in those parts.<\/span><\/p>\n<p><span>There are no separate methods for operating on paragraphs in <b>PresentationML<\/b>. &nbsp;You can pass in either a collection of <b>WordprocessingML<\/b> paragraphs, or a collection of <b>PresentationML<\/b> paragraphs. &nbsp;Behavior is identical for both XML vocabularies, with the exception that you cannot introduce revision tracking markup into <b>PresentationML<\/b>, as it does not support revision tracking. &nbsp;The various methods in <b>OpenXmlRegex<\/b> detect whether you have passed in <b>WordprocessingML<\/b> markup or <b>PresentationML<\/b> markup, and then take action accordingly.<\/span><\/p>\n<p>To get <b>OpenXmlRegex<\/b>, go to the <a href=\"http:\/\/powertools.codeplex.com\/releases\/view\/74771\">Downloads Tab<\/a> at <a href=\"http:\/\/powertools.codeplex.com\">powertools.codeplex.com<\/a>, and download version 2.7.04 or later.<\/p>\n<p><span>There are four public methods in the <b>OpenXmlRegex<\/b> class:<\/span><\/p>\n<h1 class=\"pt-Heading1\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000000\">OpenXmlRegex.Match Method (IEnumerable&lt;XElement&gt;, Regex)<\/span><\/h1>\n<p><span>Counts the number of times that the regular expression matches text in the specified content.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Syntax<\/span><\/h2>\n<p><span>public static int Match(<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;IEnumerable &lt;XElement&gt; content,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Regex regex<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>)<\/span><\/p>\n<h2 class=\"pt-Heading2\" 
dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Parameters<\/span><\/h2>\n<p><span>content<\/span><\/p>\n<p><span>Type: IEnumerable&lt;XElement&gt;<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The content to search.<\/span><\/p>\n<p><span>regex<\/span><\/p>\n<p><span>Type: System.Text.RegularExpressions.Regex<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The regular expression to match.<\/span><\/p>\n<p><span>Return Value<\/span><\/p>\n<p><span>Type: int<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The number of matches found in the content.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Remarks<\/span><\/h2>\n<p><span>If this method returns 0, then no matches were found. &nbsp;If this method returns a value greater than zero, then matches were found.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Example<\/span><\/h2>\n<pre class=\"prettyprint\">content = xDoc.Descendants(W.p).Take(1);\r\nregex = new Regex(&quot;Video&quot;);\r\ncount = OpenXmlRegex.Match(content, regex);\r\nConsole.WriteLine(&quot;Example #1 Count: {0}&quot;, count);<\/pre>\n<h1 class=\"pt-Heading1\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000000\">OpenXmlRegex.Match Method (IEnumerable&lt;XElement&gt;, Regex, Action&lt;XElement, Match&gt;)<\/span><\/h1>\n<p><span>Counts the number of times that the regular expression matches text in the specified content, calling the specified callback for each instance of matched text.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Syntax<\/span><\/h2>\n<p><span>public static int Match(<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;IEnumerable &lt;XElement&gt; content,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br 
\/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Regex regex,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Action&lt;XElement, Match&gt; found<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>)<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Parameters<\/span><\/h2>\n<p><span>content<\/span><\/p>\n<p><span>Type: IEnumerable&lt;XElement&gt;<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The content to search.<\/span><\/p>\n<p><span>regex<\/span><\/p>\n<p><span>Type: System.Text.RegularExpressions.Regex<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The regular expression to match.<\/span><\/p>\n<p><span>found<\/span><\/p>\n<p><span>Type: Action&lt;XElement, Match&gt;<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The callback to call with each match.<\/span><\/p>\n<p><span>Return Value<\/span><\/p>\n<p><span>Type: int<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The number of matches found in the content.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Remarks<\/span><\/h2>\n<p><span>If this method returns 0, then no matches were found. &nbsp;If this method returns a value greater than zero, then matches were found.<\/span><\/p>\n<p><span>Typically, you write the found callback using a lambda expression. 
&nbsp;In the lambda expression, you can write code to inspect each match.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Example<\/span><\/h2>\n<pre class=\"prettyprint\">content = xDoc.Descendants(W.p).Take(1);\r\nregex = new Regex(&quot;video&quot;, RegexOptions.IgnoreCase);\r\ncount = OpenXmlRegex.Match(content, regex, (element, match) =&gt;\r\n    Console.WriteLine(&quot;Example #3 Found value: &gt;{0}&lt;&quot;, match.Value));<\/pre>\n<h1 class=\"pt-Heading1\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000000\">OpenXmlRegex.Replace Method (IEnumerable&lt;XElement&gt;, Regex, string, Func&lt;XElement, Match, bool&gt;)<\/span><\/h1>\n<p><span>Replaces matched text in the specified content, calling the specified callback for each instance of matched text. &nbsp;If the callback returns true for matched text, then the method replaces the matched text. &nbsp;If the callback returns false, then the method does not replace the matched text.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Syntax<\/span><\/h2>\n<p><span>public static int Replace(<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;IEnumerable &lt;XElement&gt; content,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Regex regex,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;string replacement,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Func&lt;XElement, Match, bool&gt; doReplacement<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>)<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Parameters<\/span><\/h2>\n<p><span>content<\/span><\/p>\n<p><span>Type: IEnumerable&lt;XElement&gt;<\/span><span 
class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The content to search.<\/span><\/p>\n<p><span>regex<\/span><\/p>\n<p><span>Type: System.Text.RegularExpressions.Regex<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The regular expression to match.<\/span><\/p>\n<p><span>replacement<\/span><\/p>\n<p><span>Type: string<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The text that will replace the matched content.<\/span><\/p>\n<p><span>doReplacement<\/span><\/p>\n<p><span>Type: Func&lt;XElement, Match, bool&gt;<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The callback to call with each match. &nbsp;Return true to replace the matched text, or false to leave it unchanged. &nbsp;If you pass null for this argument, then the method replaces all occurrences of matched text.<\/span><\/p>\n<p><span>Return Value<\/span><\/p>\n<p><span>Type: int<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The number of replacements in the content.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Remarks<\/span><\/h2>\n<p><span>Typically, you write the doReplacement callback using a lambda expression. 
&nbsp;In the lambda expression, you can write code to inspect each match.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Example<\/span><\/h2>\n<pre class=\"prettyprint\">content = xDoc.Descendants(W.p).Skip(1).Take(1);\r\nregex = new Regex(&quot;^Video provides&quot;);\r\ncount = OpenXmlRegex.Replace(content, regex, &quot;Audio gives&quot;, null);\r\nConsole.WriteLine(&quot;Example #4 Replaced: {0}&quot;, count);<\/pre>\n<h1 class=\"pt-Heading1\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000000\">OpenXmlRegex.Replace Method (IEnumerable&lt;XElement&gt;, Regex, string, Func&lt;XElement, Match, bool&gt;, bool, string)<\/span><\/h1>\n<p><span>Replaces matched text in the specified content, calling the specified callback for each instance of matched text. &nbsp;If the callback returns true for matched text, then the method replaces the matched text. &nbsp;If the callback returns false, then the method does not replace the matched text.<\/span><\/p>\n<p><span>If you pass true for the trackRevisions argument, then this method introduces tracked revisions for all replacements. &nbsp;In WordprocessingML, each tracked revision contains the name of the author who made the change. 
&nbsp;For tracked revisions that are created by this method, the author of the tracked revisions is set to the value of the&nbsp;<\/span><span class=\"pt-DefaultParagraphFont-000014\"><b>author<\/b><\/span><span>&nbsp;argument.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Syntax<\/span><\/h2>\n<p><span>public static int Replace(<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;IEnumerable &lt;XElement&gt; content,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Regex regex,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;string replacement,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;Func&lt;XElement, Match, bool&gt; doReplacement,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;bool trackRevisions,<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>&nbsp;&nbsp; &nbsp;string author<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>)<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Parameters<\/span><\/h2>\n<p><span>content<\/span><\/p>\n<p><span>Type: IEnumerable&lt;XElement&gt;<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The content to search.<\/span><\/p>\n<p><span>regex<\/span><\/p>\n<p><span>Type: System.Text.RegularExpressions.Regex<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The regular expression to match.<\/span><\/p>\n<p><span>replacement<\/span><\/p>\n<p><span>Type: string<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The text that will replace the matched content.<\/span><\/p>\n<p><span>doReplacement<\/span><\/p>\n<p><span>Type: 
Func&lt;XElement, Match, bool&gt;<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The callback to call with each match. &nbsp;Return true to replace the matched text, or false to leave it unchanged. &nbsp;If you pass null for this argument, then the method replaces all occurrences of matched text.<\/span><\/p>\n<p><span>trackRevisions<\/span><\/p>\n<p><span>Type: bool<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>If true, then this method introduces tracked revisions when replacing content.<\/span><\/p>\n<p><span>author<\/span><\/p>\n<p><span>Type: string<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The author of the tracked revisions.<\/span><\/p>\n<p><span>Return Value<\/span><\/p>\n<p><span>Type: int<\/span><span class=\"pt-DefaultParagraphFont-000002\"><br \/>&lrm;<\/span><span>The number of replacements in the content.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Remarks<\/span><\/h2>\n<p><span>Typically, you write the doReplacement callback using a lambda expression. &nbsp;In the lambda expression, you can write code to inspect each match.<\/span><\/p>\n<h2 class=\"pt-Heading2\" dir=\"ltr\"><span class=\"pt-DefaultParagraphFont-000001\">Example<\/span><\/h2>\n<pre class=\"prettyprint\">content = xDoc.Descendants(W.p).Skip(13).Take(1);\r\nregex = new Regex(&quot;Video provides &quot;);\r\ncount = OpenXmlRegex.Replace(content, regex, &quot;Audio gives &quot;, null, true, &quot;John Doe&quot;);\r\nConsole.WriteLine(&quot;Example #16 Replaced: {0}&quot;, count);<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Return to theOpenXmlRegexDeveloper CenterOpenXmlRegex is a class in PowerTools for Open XML that enables you to search and optionally replace content in DOCX and PPTX using regular expressions. &nbsp;The following screen-cast demonstrates the OpenXmlRegex class, and explains some of the more interesting semantics of it. 
To use the OpenXmlRegex class, you first query for a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"_s2mail":"","footnotes":""},"class_list":["post-1978","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/pages\/1978","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/comments?post=1978"}],"version-history":[{"count":3,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/pages\/1978\/revisions"}],"predecessor-version":[{"id":3246,"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/pages\/1978\/revisions\/3246"}],"wp:attachment":[{"href":"https:\/\/www.ericwhite.com\/blog\/wp-json\/wp\/v2\/media?parent=1978"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}