
Separate filters for CCs and Symbols

Matt Langsenkamp requested to merge sigir-prep-update into main

TESTING NOTES:

PART 1:

git clone https://gitlab.com/dprl/graphics-extraction.git
cd graphics-extraction
git checkout sigir-prep-update
make
./bin/acl/run-year 1952

If there are errors, scroll up to the first one. Errors often propagate through the system from earlier scripts, so the last message shown is rarely the root cause.

If there are no errors, the test passes.
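
Since the first error is the one that matters, a small helper can locate it in a saved log. This is only a sketch: the function name `first_error_line` and the patterns in `ERROR_PATTERN` are assumptions, not part of the repository, and the pipeline scripts may report failures with different wording.

```python
import re

# Assumed failure markers (hypothetical): a Python traceback header or the
# word "error" in any capitalization. Adjust to what the scripts really print.
ERROR_PATTERN = re.compile(
    r"Traceback \(most recent call last\)|\berror\b", re.IGNORECASE
)


def first_error_line(log_text):
    """Return (line_number, line) for the earliest matching line, or None."""
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            return number, line
    return None
```

One way to use it would be to save the run with `./bin/acl/run-year 1952 2>&1 | tee run.log` and then pass the contents of `run.log` to `first_error_line`.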

PART 2 (specifically for Bryan): Examine the file modules/pipeline/msp_settings.py, particularly the following lines:

# Recognition control flow
# PIPE_SKIP_PREPARSE =   False #True
PIPE_SKIP_PREPARSE =   False
PIPE_COMBINE_IF_PREPARSE = True
PIPE_SKIP_PARSE = False
PIPE_COMBINE_ONLY = False
PIPE_PARSE_ONLY = False

Relate these to where they are invoked in modules/pipeline/msp_settings.py to get an understanding of how they should be used.
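
To trace where each flag is read, a short search over the checkout can list every line that mentions it. The sketch below is an illustration under assumptions: `find_flag_uses` is a hypothetical helper, not something in the repository, and it simply scans every `.py` file under a given root for the flag names quoted above.

```python
import re
from pathlib import Path

# Flag names copied from the settings excerpt above.
FLAGS = [
    "PIPE_SKIP_PREPARSE",
    "PIPE_COMBINE_IF_PREPARSE",
    "PIPE_SKIP_PARSE",
    "PIPE_COMBINE_ONLY",
    "PIPE_PARSE_ONLY",
]


def find_flag_uses(root, flags=FLAGS):
    """Map each flag to a list of (path, line_number, stripped_line) hits."""
    patterns = {f: re.compile(r"\b" + re.escape(f) + r"\b") for f in flags}
    uses = {f: [] for f in flags}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip rather than abort the search
        for number, line in enumerate(text.splitlines(), start=1):
            for flag, pattern in patterns.items():
                if pattern.search(line):
                    uses[flag].append((str(path), number, line.strip()))
    return uses
```

Calling `find_flag_uses(".")` from the repository root and printing the hits for each flag shows both the assignment in the settings file and any conditionals that test it.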

Review the script ./bin/acl/run-year and the scripts it references. If a part is confusing, ask Matt.

Look through the outputs folder. Look at the different files in each folder to get an idea of what each part of the pipeline produces.
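
For a quick overview before opening individual files, the folder can be summarized by file type. This is a rough sketch: `summarize_outputs` is a hypothetical helper, and the location and layout of the outputs folder are assumed rather than taken from the repository.

```python
from collections import Counter
from pathlib import Path


def summarize_outputs(root):
    """Return {folder relative to root: Counter of file suffixes in it}."""
    root = Path(root)
    summary = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            folder = str(path.parent.relative_to(root))
            suffix = path.suffix or "(no extension)"
            summary.setdefault(folder, Counter())[suffix] += 1
    return summary
```

Printing the result of `summarize_outputs("outputs")` (path assumed) gives one line of counts per folder, which makes it easier to see which pipeline stage wrote which kind of file.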
