OpenXML Filter: improve complex fields extraction when more than one text instruction is present

There are cases when complex field extraction is expected but not performed due to the presence of “empty” text instructions after a meaningful one:

      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:instrText xml:space="preserve"> HYPERLINK  \l "_top" </w:instrText>
      </w:r>
      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:instrText xml:space="preserve"> </w:instrText>
      </w:r>
      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
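
To illustrate the expected handling of the “empty” instruction, here is a minimal Python sketch (not the filter's actual code) that collects the text instructions found between the "begin" and "separate" field characters and skips the whitespace-only ones. The function name `instruction_texts`, the `SAMPLE` string and the wrapping `w:p` element are assumptions made for the example, and nested fields are deliberately ignored.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def instruction_texts(runs_xml):
    """Return the non-blank w:instrText values that appear after a
    'begin' field character and before the 'separate' (or 'end') one."""
    # Wrap the runs in a paragraph so the w: prefix is declared.
    root = ET.fromstring(f'<w:p xmlns:w="{W}">{runs_xml}</w:p>')
    texts = []
    inside = False
    for el in root.iter():
        if el.tag == f"{{{W}}}fldChar":
            kind = el.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                inside = True
            elif kind in ("separate", "end"):
                inside = False
        elif el.tag == f"{{{W}}}instrText" and inside:
            text = (el.text or "").strip()
            if text:  # skip "empty" (whitespace-only) instructions
                texts.append(text)
    return texts

# Reduced version of the runs above (run properties omitted).
SAMPLE = r'''
<w:r><w:fldChar w:fldCharType="begin"/></w:r>
<w:r><w:instrText xml:space="preserve"> HYPERLINK  \l "_top" </w:instrText></w:r>
<w:r><w:instrText xml:space="preserve"> </w:instrText></w:r>
<w:r><w:fldChar w:fldCharType="separate"/></w:r>
'''

print(instruction_texts(SAMPLE))
```

With the blank instruction filtered out, only the HYPERLINK instruction remains to drive the extraction decision.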

Also, when a field contains two meaningful text instructions, the extraction decision has to be based on the first one. So, if DATE is followed by HYPERLINK, then DATE has to be considered for extraction, and vice versa.

      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:instrText xml:space="preserve"> DATE </w:instrText>
      </w:r>
      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:instrText xml:space="preserve"> HYPERLINK  \l "_top" </w:instrText>
      </w:r>
      <w:r w:rsidR="00B4756F">
        <w:rPr>
          <w:lang w:val="en-US"/>
        </w:rPr>
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
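
The "first meaningful instruction wins" rule can be sketched in Python as follows. This is only an illustration of the requested behaviour, not the filter's implementation: the helper names and the `EXTRACTABLE` set are hypothetical, and which field types are actually configured for extraction depends on the filter settings.

```python
# Hypothetical set of field types whose content should be extracted;
# the real list comes from the filter configuration.
EXTRACTABLE = {"HYPERLINK"}

def deciding_field_type(instructions):
    """Return the field type named by the first non-blank instruction,
    or None when every instruction is blank."""
    for text in instructions:
        tokens = text.split()
        if tokens:  # first meaningful instruction decides
            return tokens[0].upper()
    return None

def should_extract(instructions, extractable=EXTRACTABLE):
    """Decide on extraction using only the first meaningful instruction."""
    return deciding_field_type(instructions) in extractable

# Instruction texts in document order, as in the two examples above.
hyperlink_then_blank = [' HYPERLINK  \\l "_top" ', ' ']
date_then_hyperlink = [' DATE ', ' HYPERLINK  \\l "_top" ']

print(deciding_field_type(hyperlink_then_blank), should_extract(hyperlink_then_blank))
print(deciding_field_type(date_then_hyperlink), should_extract(date_then_hyperlink))
```

Under these assumptions the first field is decided by HYPERLINK and the second by DATE, regardless of the instructions that follow.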

For more details, please refer to the attached documents.
