Write yet another ATF parser
Our ATF parser is not up to par, so we need to write a new one, again. It needs to be written in Python 3 and packaged as a standalone tool that we will deploy in the CDLI backend, where we will pipe ATF data to it and get results back.
- should be able to take one or more ATF texts as input
- should provide back:
  - errors and warnings
  - the exact line in the file for each error or warning
  - the text number and text line, if available
  - a clear explanation of the problem (as far as possible)
- should keep running and parse the whole input even when there are errors and warnings
- rules and warning/notice messages should be easy to maintain
- bonus:
  - a fix function to automatically fix obvious problems
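To make the requirements above concrete, here is a minimal sketch of one possible shape for the checker. Everything in it is an assumption for discussion: the names `Diagnostic`, `Rule`, `RULES`, `check` and `to_json` are hypothetical, and the three rules are simplified placeholders, not a statement of the ATF specification. It only illustrates a rule table that is easy to edit, diagnostics that carry the file line plus the text number and text line when known, parsing that continues after a problem, and an optional fix step.

```python
import json
import re
from dataclasses import dataclass, asdict
from typing import Callable, List, Optional, Tuple

# Simplified patterns, assumed for this sketch only.
HEADER = re.compile(r"^&(P\d{6}) = \S")   # e.g. "&P123456 = designation"
TEXT_LINE = re.compile(r"^(\d+'?)\.\s")   # e.g. "1. " or "2'. "


@dataclass
class Diagnostic:
    severity: str              # "error" or "warning"
    file_line: int             # 1-based line number in the input
    text_id: Optional[str]     # text number, if a valid header was seen
    text_line: Optional[str]   # line label, if the line is numbered
    message: str


@dataclass
class Rule:
    severity: str
    message: str
    applies: Callable[[str], bool]               # True when the line breaks the rule
    fix: Optional[Callable[[str], str]] = None   # optional automatic repair


# Placeholder rules: adding or rewording one means editing this table only.
RULES: List[Rule] = [
    Rule("error", "text header should look like '&P123456 = designation'",
         lambda s: s.startswith("&") and not HEADER.match(s)),
    Rule("warning", "trailing whitespace",
         lambda s: s != s.rstrip(), fix=str.rstrip),
    Rule("warning", "unbalanced square brackets",
         lambda s: s.count("[") != s.count("]")),
]


def check(atf: str, apply_fixes: bool = False) -> Tuple[List[Diagnostic], str]:
    """Check every line, never stopping early; return diagnostics and the text."""
    diagnostics: List[Diagnostic] = []
    out_lines: List[str] = []
    text_id: Optional[str] = None
    for number, line in enumerate(atf.split("\n"), start=1):
        if line.startswith("&"):
            header = HEADER.match(line)
            # A malformed header leaves the current text unknown.
            text_id = header.group(1) if header else None
        numbered = TEXT_LINE.match(line)
        text_line = numbered.group(1) if numbered else None
        for rule in RULES:
            if rule.applies(line):
                diagnostics.append(
                    Diagnostic(rule.severity, number, text_id, text_line, rule.message))
                if apply_fixes and rule.fix:
                    line = rule.fix(line)
        out_lines.append(line)
    return diagnostics, "\n".join(out_lines)


def to_json(diagnostics: List[Diagnostic]) -> str:
    """Serialize diagnostics so the backend can consume them."""
    return json.dumps([asdict(d) for d in diagnostics], indent=2)
```

In a deployment along these lines, the backend would pass the piped ATF data to `check` and read back the output of `to_json`. Calling `check(data, apply_fixes=True)` would also return the text with the fixable problems repaired.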
Existing parsers:
- https://github.com/cdli-gh/JTF
- https://github.com/cdli-gh/jtf-lib
- https://github.com/cdli-gh/ATF-Checker
- https://github.com/cdli-gh/pyoracc
- https://github.com/oracc/owi/blob/master/www/rpc.plx.in (and other files)
Documentation on the file format is here:
http://oracc.museum.upenn.edu/doc/help/editinginatf/cdliatf/index.html
Try out this parser to see what errors should look like: http://oracc.museum.upenn.edu/util/atfproc.html (we can't reuse that code: it is written in Lisp and Perl and can only be run on the Oracc server stack)
Edited by Émilie Pagé-Perron