... | @@ -36,7 +36,7 @@ PARSEME provides scripts to convert between CUPT, CoNLL-U and FoLiA (and also th |
|
|
|
|
|
## File format validation
|
|
|
|
|
|
|
|
PARSEME provides a validation script, `parseme_validate.py`, that checks the content of [CUPT](https://gitlab.com/parseme/corpora/-/wikis/PARSEME-tools#file-format-validation) files. It is located in the `st-organizers/release-preparation` directory of the PARSEME [utilities](https://gitlab.com/parseme/utilities) repository. This is the official PARSEME validator, described in more detail below.
|
|
PARSEME provides a validation script, `parseme_validate.py`, that checks the content of [CUPT](https://multiword.sourceforge.net/cupt-format) files. It is located in the `st-organizers/release-preparation` directory of the PARSEME [utilities](https://gitlab.com/parseme/utilities) repository. This is the official PARSEME validator, described in more detail below.
|
|
|
|
|
|
The validation procedure is organized into levels. The validation script can optionally be told to check your data only up to a specific level.
|
|
|
|
* **Level 1** (CUPT backbone): At this level, the validator tests only the order of lines and the newline encoding, and runs core tests to ensure the file's integrity. It invokes the [UD validator](https://universaldependencies.org/validation-rules.html) at level 1 and supplements it with new tests designed for the CUPT format. For instance, one such test ensures that the first line correctly specifies **global.columns** and that the `ID` and `PARSEME:MWE` columns are present.
|
|
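The **global.columns** test mentioned for Level 1 can be illustrated with a minimal sketch. This is not the code of `parseme_validate.py`; the function name `check_global_columns` and the error messages are hypothetical, and the sketch assumes the header is written as `# global.columns = ...` with space-separated column names.

```python
# Illustrative only: a simplified stand-in for one Level 1 test,
# not the actual implementation in parseme_validate.py.
REQUIRED_COLUMNS = ("ID", "PARSEME:MWE")
PREFIX = "# global.columns ="  # assumed exact spelling of the header key


def check_global_columns(first_line):
    """Return a list of problems found in the first line of a CUPT file."""
    line = first_line.rstrip("\r\n")
    if not line.startswith(PREFIX):
        return ["first line does not declare global.columns"]
    # Column names follow the '=' sign, separated by whitespace.
    columns = line[len(PREFIX):].split()
    return [
        f"missing required column {name}"
        for name in REQUIRED_COLUMNS
        if name not in columns
    ]


header = "# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC PARSEME:MWE\n"
print(check_global_columns(header))
```

An empty list means the header passed this particular check. The real validator runs many more tests at this level, including those inherited from the UD validator.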
|
... | | ... | |