Separate filters for CCs and Symbols
TESTING NOTES:
PART 1:
git clone https://gitlab.com/dprl/graphics-extraction.git
cd graphics-extraction
git checkout sigir-prep-update
make
./bin/acl/run-year 1952
If there are errors, scroll up to the first one; errors often propagate through the system from earlier scripts.
If there are no errors, the test passes.
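Since later errors are often just fallout from an earlier one, it can help to capture the run output to a file and jump straight to the first suspicious line. A minimal sketch, assuming the output was saved to a file such as run.log (a hypothetical name) and that failures mention "error" or "traceback"; the pattern is a guess and may need adjusting to what the pipeline actually prints:

```python
import re

# Assumed pattern: not taken from the pipeline, tune it to the real output.
ERROR_PATTERN = re.compile(r"error|traceback", re.IGNORECASE)


def first_error_line(log_text):
    """Return (line_number, line) for the first line matching ERROR_PATTERN, or None."""
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            return number, line
    return None


if __name__ == "__main__":
    import sys

    # Usage: python first_error.py run.log   (run.log is a hypothetical capture file)
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            print(first_error_line(handle.read()))
```

A None result only means nothing matched the assumed pattern, so it is not a substitute for reading the output.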
PART 2 (specifically for Bryan):
Examine the file modules/pipeline/msp_settings.py,
particularly the following lines:
# Recognition control flow
# PIPE_SKIP_PREPARSE = False #True
PIPE_SKIP_PREPARSE = False
PIPE_COMBINE_IF_PREPARSE = True
PIPE_SKIP_PARSE = False
PIPE_COMBINE_ONLY = False
PIPE_PARSE_ONLY = False
Relate these to where they are invoked in modules/pipeline/msp_settings.py
to get an understanding of how they should be used.
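One way to find where the flags are used is to list every line that mentions a PIPE_* name. A minimal sketch, assuming it is run from the repository root and that the relevant code lives in .py files under modules/; the helper name find_flag_references is made up for illustration:

```python
import os
import re

# Matches names such as PIPE_SKIP_PREPARSE or PIPE_COMBINE_ONLY.
FLAG_PATTERN = re.compile(r"\bPIPE_[A-Z_]+\b")


def find_flag_references(root):
    """Yield (path, line_number, line) for each .py line under root that mentions a PIPE_* name."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if FLAG_PATTERN.search(line):
                        yield path, number, line.rstrip()


if __name__ == "__main__":
    # "modules" is taken from the path above; change it if the flags are read elsewhere.
    for path, number, line in find_flag_references("modules"):
        print(f"{path}:{number}: {line}")
```

The output includes the definitions themselves, so the lines of interest are the ones outside the settings block.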
Review the script ./bin/acl/run-year
and the scripts it references. If a part is confusing, ask Matt.
Look through the outputs folder. Look at the different files in each folder to get an idea of what each part of the pipeline produces.
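For a quick overview before opening individual files, the folder contents can be tallied by file extension. A minimal sketch, assuming the folder is named outputs and sits in the current directory (the exact location is an assumption; adjust the path as needed):

```python
import os
from collections import Counter


def summarize_outputs(root):
    """Map each folder under root (relative path) to a Counter of the file extensions it contains."""
    summary = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if not filenames:
            continue
        rel = os.path.relpath(dirpath, root)
        summary[rel] = Counter(
            os.path.splitext(name)[1] or "(none)" for name in filenames
        )
    return summary


if __name__ == "__main__":
    # "outputs" is an assumed path relative to the current directory.
    for folder, counts in sorted(summarize_outputs("outputs").items()):
        print(folder, dict(counts))
```

This only shows which kinds of files each stage leaves behind; what they contain still has to be checked by opening them.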