Abbreviation Expansion Improvements

R requested to merge named_structure_positions into containerize

Made changes to make it easier to see expanded labels from generated CDXMLs in ChemDraw.

  • Made all abbreviation graphs visible on expansion, rather than expanding to a small 'error point', which is confusing.
    • Assigned positions, ordered left-to-right or right-to-left depending on where the label sits around the molecule
    • ChemDraw corrects positions automatically in many cases
    • In other cases, ChemDraw operations such as 'Scale' and 'Clean Up Structure', or changing the molecule's font size, make the expansion readable
  • Updated the CDXML template to default to a 6pt font size and the Times New Roman font, so that expansions look 'correct' and do not change the original font on expansion
    • 6pt works with the example document and is unlikely to greatly alter the structure of larger molecules
    • Document font size defaults can easily be configured through ChemDraw's menus if needed
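
The position assignment described above can be illustrated with a minimal sketch. This is not the code in this branch: the function name `assign_expansion_positions`, the `label_side` values, and the fixed `spacing` are all hypothetical, and the sketch assumes atoms are simply laid out in a row moving away from the attachment point.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def assign_expansion_positions(
    anchor: Point, n_atoms: int, label_side: str, spacing: float = 1.0
) -> List[Point]:
    """Lay out n_atoms in a horizontal row next to the anchor point.

    Hypothetical helper: a label on the right side of the molecule is
    expanded left-to-right, a label on the left side right-to-left.
    """
    if label_side not in ("left", "right"):
        raise ValueError("label_side must be 'left' or 'right'")
    # +1 walks away from the molecule to the right, -1 to the left.
    direction = 1.0 if label_side == "right" else -1.0
    x0, y0 = anchor
    return [(x0 + direction * spacing * (i + 1), y0) for i in range(n_atoms)]
```

Giving every atom a distinct coordinate is what keeps the expanded graph from collapsing to a single point; ChemDraw's own cleanup can then adjust the layout.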

Testing

  • Pull this branch (`git checkout named_structure_positions`)
  • Run `make chem-v2-all-test 2> STDERR_temp` (redirects standard error messages from molconvert to a file to make the terminal easier to read)
  • Confirm that the program runs its main steps successfully; output like the following should be visible after SMILES evaluation:
Exact matches: 141 243 	Percent: 0.5802469135802469
Eval Dir: outputs/All/eval_tsv/or100.09.tables_eval.tsv
 [ WRITTEN EVAL METRICS TO: outputs/All/eval_tsv... ]
  • Also check the file `outputs/All/generated_cdxmls/or100.09.tables_full_cdxml/or100.09.tables.cdxml`: try expanding structures (for abbreviations that have definitions) on the left and right sides of molecules. Confirm that expansions occur and that the font is preserved.
    • Also try the 'Scale' and 'Molecule -> Clean up Structure' operations to confirm that 'messy' expansion results can be reformatted.
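
The font check can also be scripted as a rough complement to the manual inspection in ChemDraw. The sketch below assumes the template stores its defaults in the `LabelFont` and `LabelSize` attributes of the root `CDXML` element and resolves the font through a `fonttable` entry. The generated files may be laid out differently, and `SAMPLE_CDXML` is an invented stand-in, not output from this branch.

```python
import xml.etree.ElementTree as ET

# Invented, minimal stand-in for a generated CDXML document.
SAMPLE_CDXML = """<CDXML LabelFont="3" LabelSize="6" CaptionFont="3" CaptionSize="6">
  <fonttable>
    <font id="3" charset="iso-8859-1" name="Times New Roman"/>
  </fonttable>
  <page/>
</CDXML>"""


def label_defaults(cdxml_text):
    """Return (font name, label size) read from the document root."""
    root = ET.fromstring(cdxml_text)
    # Map font ids to names so the LabelFont id can be resolved.
    fonts = {f.get("id"): f.get("name") for f in root.iter("font")}
    size = root.get("LabelSize")
    font_name = fonts.get(root.get("LabelFont"))
    return font_name, (float(size) if size is not None else None)


print(label_defaults(SAMPLE_CDXML))
```

For a file on disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`.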
