OpenXML Filter: segmentation quality reduced for some PPTX documents
Please consider the following extraction:
<source xml:lang="en">The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown
fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. <x id="1"
ctype="x-x" equiv-text="<tags1/>"/>The quick brown fox jumps over the lazy dog. The
quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy
dog. <x id="2" ctype="x-x" equiv-text="<tags2/>"/>The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over
the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps
over the lazy dog. <x id="3" ctype="x-x" equiv-text="<tags3/>"/></source>
There is an extra <x id="3"/> code at the end of the segment.
The expected output must not contain it:
<source xml:lang="en">The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown
fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. <x id="1"
ctype="x-x" equiv-text="<tags1/>"/>The quick brown fox jumps over the lazy dog. The
quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy
dog. <x id="2" ctype="x-x" equiv-text="<tags2/>"/>The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over
the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps
over the lazy dog. </source>
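The two extractions differ only in the trailing placeholder. A minimal Python sketch of a check for that condition follows. It is not the filter's implementation, and the names TRAILING_CODES, has_trailing_code and strip_trailing_codes are hypothetical. It works on the text content of <source> with a regular expression rather than an XML parser, on the assumption that equiv-text may contain raw angle brackets as shown above.

```python
import re

# Hypothetical helper for checking extractions; not part of the OpenXML filter.
# Matches one or more <x .../> placeholders (with any whitespace after them)
# sitting at the very end of the segment text. Attribute values are matched as
# quoted strings so that a ">" inside equiv-text does not end the tag early.
TRAILING_CODES = re.compile(r'(?:<x(?:\s+[\w:.-]+="[^"]*")*\s*/>\s*)+$')


def has_trailing_code(segment: str) -> bool:
    """Return True if the segment text ends with an <x/> placeholder."""
    return TRAILING_CODES.search(segment) is not None


def strip_trailing_codes(segment: str) -> str:
    """Return the segment text without placeholders at its end.

    Whitespace before the first trailing placeholder is kept, which is what
    the expected output in this report shows.
    """
    return TRAILING_CODES.sub("", segment)


if __name__ == "__main__":
    actual = (
        'lazy dog. <x id="2" ctype="x-x" equiv-text="<tags2/>"/>The quick brown '
        'fox jumps over the lazy dog. <x id="3" ctype="x-x" equiv-text="<tags3/>"/>'
    )
    print(has_trailing_code(actual))
    print(strip_trailing_codes(actual))
```

Only placeholders at the end of the segment are affected; codes in the middle, such as id="1" and id="2", stay in place.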
For more details please refer to the attached document.