OpenXML Filter: allow ignoring the formatting of whitespace

Sometimes the extracted text contains many inline codes that wrap nothing but whitespace characters. For instance:

<trans-unit id="P244D057-tu1" xml:space="preserve">
<source xml:lang="ar"> <g id="1">إنشاء</g><g id="2"> </g><g id="3">وإنتاج</g><g id="4"> </g><g id="5">المحتوى</g><g id="6"> </g></source>
<target xml:lang="ar-SA"> <g id="1">إنشاء</g><g id="2"> </g><g id="3">وإنتاج</g><g id="4"> </g><g id="5">المحتوى</g><g id="6"> </g></target>
</trans-unit>

This is what the original document looks like in a viewer/editor:

[Image: a-run-with-ignored-whitespace-styles]

The whitespace characters differ slightly from the adjacent characters in font styling. It would therefore be acceptable to ignore such whitespace styling under the bPreferenceAggressiveCleanup - ignoreWhitespaceStyles options.
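
The requested behaviour could be sketched roughly as follows. This is only an illustration in Python, not the filter's actual code: the `Run` type, the `merge_runs` function and the string-valued `style` field are hypothetical stand-ins for the filter's run model, and the sketch assumes that a whitespace-only run simply adopts the style of the run before it, so that adjacent runs with equal styles can be merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    text: str
    style: str  # hypothetical stand-in for a run's formatting properties


def merge_runs(runs, ignore_whitespace_styles=False):
    """Merge adjacent runs that share a style.

    With ignore_whitespace_styles=True, a whitespace-only run is treated
    as if it had the style of the preceding run (an assumption of this
    sketch), so it no longer forces a separate inline code.
    """
    merged = []
    for run in runs:
        style = run.style
        if ignore_whitespace_styles and merged and run.text.isspace():
            # Borrow the style of the run that comes right before.
            style = merged[-1].style
        if merged and merged[-1].style == style:
            merged[-1] = Run(merged[-1].text + run.text, style)
        else:
            merged.append(Run(run.text, style))
    return merged


# Runs modelled on the example above; the style names are made up.
runs = [
    Run("إنشاء", "words"), Run(" ", "spaces"),
    Run("وإنتاج", "words"), Run(" ", "spaces"),
    Run("المحتوى", "words"), Run(" ", "spaces"),
]
strict = merge_runs(runs)
relaxed = merge_runs(runs, ignore_whitespace_styles=True)
```

Under these assumptions, `strict` keeps all six runs (hence six codes), while `relaxed` collapses them into a single run that would need no inline codes at all.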

The document in question is attached: a-run-with-ignored-whitespace-styles.docx

Edited Nov 29, 2024 by Denis Konovalyenko