OpenXML Filter: XLSX: expose limiting options (minwidth, maxwidth, minheight, maxheight) dynamically on extraction

Following up on Viktor's idea of dynamic limiting options (minwidth, maxwidth, minheight, maxheight), I propose handling them as metadata, as follows.

The document:

[image of the document]

with the worksheet configuration:

worksheetConfigurations.0.namePattern=Sheet1
worksheetConfigurations.0.metadataColumns=C,D,E
worksheetConfigurations.number.i=1

and the metadata format defined as:

COLUMN:limitingoption:VALUE:size-unit-value

  • the limitingoption could be shortened, for example to mnw, mxw, mnh, mxh
  • the last specified size-unit-value takes priority; it may be omitted, in which case "char" is the default
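
To make the format more concrete, here is a minimal Python sketch of how such entries might be interpreted. It is an illustration only, not the filter's implementation. It assumes that COLUMN is the letter of the cell column the option applies to and that values are kept as plain strings. The function names and sample entries ("A:minwidth:1", "A:mxw:2:char") are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Short aliases suggested in the proposal, mapped to the XLIFF attribute names.
ALIASES = {"mnw": "minwidth", "mxw": "maxwidth", "mnh": "minheight", "mxh": "maxheight"}
OPTIONS = set(ALIASES.values())
DEFAULT_SIZE_UNIT = "char"


@dataclass
class LimitingOption:
    column: str               # column letter the option applies to (assumed meaning)
    name: str                 # minwidth, maxwidth, minheight or maxheight
    value: str                # kept as a string, exactly as written in the cell
    size_unit: Optional[str]  # None when the unit was omitted


def parse_limiting_option(entry: str) -> LimitingOption:
    """Parse one COLUMN:limitingoption:VALUE[:size-unit-value] entry."""
    parts = [part.strip() for part in entry.split(":")]
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected metadata entry: {entry!r}")
    column, name, value = parts[:3]
    name = ALIASES.get(name.lower(), name.lower())
    if name not in OPTIONS:
        raise ValueError(f"unknown limiting option in: {entry!r}")
    size_unit = parts[3] if len(parts) == 4 and parts[3] else None
    return LimitingOption(column.upper(), name, value, size_unit)


def trans_unit_attributes(entries: Iterable[str], column: str) -> Dict[str, str]:
    """Collect the attributes for one cell column from its row's metadata entries."""
    attributes: Dict[str, str] = {}
    size_unit: Optional[str] = None
    for entry in entries:
        option = parse_limiting_option(entry)
        if option.column != column.upper():
            continue
        attributes[option.name] = option.value
        if option.size_unit is not None:
            size_unit = option.size_unit  # the last specified unit takes priority
    if attributes:
        attributes["size-unit"] = size_unit or DEFAULT_SIZE_UNIT
    return attributes
```

With these assumptions, `trans_unit_attributes(["A:minwidth:1", "A:mxw:2:char"], "A")` yields minwidth, maxwidth and size-unit values matching the attributes of the Sheet1!A1 unit in the extraction example.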

could be extracted as:

<group id="P76C545-sg1" resname="Sheet1">
<group id="P132303AB-sg1" resname="1">
<context-group name="row-metadata"><context context-type="x-E">general metadata</context></context-group>
<trans-unit id="P147242AB-tu1" resname="Sheet1!A1" xml:space="preserve" minwidth="1" maxwidth="2" size-unit="char">
<source xml:lang="en">A1</source>
<target xml:lang="fr"></target>
</trans-unit>
</group>
<group id="P132303AB-sg2" resname="2">
<context-group name="row-metadata"></context-group>
<trans-unit id="P147242AB-tu2" resname="Sheet1!A2" xml:space="preserve">
<source xml:lang="en">A2</source>
<target xml:lang="fr"></target>
</trans-unit>
<trans-unit id="P147242AB-tu3" resname="Sheet1!B2" xml:space="preserve" minheight="3" size-unit="em">
<source xml:lang="en">B2</source>
<target xml:lang="fr"></target>
</trans-unit>
</group>
</group>
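
For anyone who wants to check the result on the consuming side, the following Python sketch lists the limiting attributes per cell from an extraction like the one above. It uses only the standard library. The embedded fragment is a shortened copy of the example (context groups omitted), and the function name is hypothetical. A complete XLIFF file would normally declare the XLIFF namespace on its elements, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the extraction example (context groups omitted).
XLIFF_FRAGMENT = """
<group id="P76C545-sg1" resname="Sheet1">
<group id="P132303AB-sg1" resname="1">
<trans-unit id="P147242AB-tu1" resname="Sheet1!A1" xml:space="preserve" minwidth="1" maxwidth="2" size-unit="char">
<source xml:lang="en">A1</source>
<target xml:lang="fr"></target>
</trans-unit>
</group>
<group id="P132303AB-sg2" resname="2">
<trans-unit id="P147242AB-tu2" resname="Sheet1!A2" xml:space="preserve">
<source xml:lang="en">A2</source>
<target xml:lang="fr"></target>
</trans-unit>
<trans-unit id="P147242AB-tu3" resname="Sheet1!B2" xml:space="preserve" minheight="3" size-unit="em">
<source xml:lang="en">B2</source>
<target xml:lang="fr"></target>
</trans-unit>
</group>
</group>
"""

LIMITING_ATTRIBUTES = ("minwidth", "maxwidth", "minheight", "maxheight", "size-unit")


def limiting_options_by_cell(xml_text):
    """Map each trans-unit resname to the limiting attributes it carries."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for unit in root.iter("trans-unit"):
        result[unit.get("resname")] = {
            name: unit.get(name)
            for name in LIMITING_ATTRIBUTES
            if unit.get(name) is not None
        }
    return result
```

Cells without limiting metadata, such as Sheet1!A2, map to an empty dictionary.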

An example document can be found attached.