Quick links:

* [PARSEME corpora - Home](home)
* [PARSEME Language Leader Guide](parseme-language-leader-guide)
* [Annotating new corpora](annotating-new-corpora)
* [Enhancing existing corpora](enhancing-existing-corpora)
* **Preparing raw corpora**
* [PARSEME tools](parseme-tools)

-----------

## Annotation: FLAT

* [FLAT annotation platform](http://mwe.phil.hhu.de/): the PARSEME instance of [FLAT](https://github.com/proycon/flat), developed by Maarten van Gompel and hosted at the University of Düsseldorf.
* [FLAT user guide](https://docs.google.com/document/d/1zd_VhXQTel_IRVQ_u6s2wvJttwBHdDIk5YtWDMa3QW4/edit#) for PARSEME annotation

## File formats and conversions: utilities

* [PARSEME utilities](https://gitlab.com/parseme/utilities/): a repository of scripts for corpus management, including parsemetsv<->cupt conversion, adjudication, consistency checks, and corpus statistics. Language leaders (LLs) may need to run some of these scripts with the help of the core organizers.
* [PARSEME guidelines](https://gitlab.com/parseme/sharedtask-guidelines): a repository hosting the HTML guidelines and issues page (LLs generally do not need to edit the guidelines directly but they do participate in raising and solving issues)
* [CUPT format](http://multiword.sourceforge.net/cupt-format/): a description of the PARSEME version of the extended [CoNLL-U format](https://universaldependencies.org/format.html), defined jointly with [Universal Dependencies](http://universaldependencies.org/). The generic meta-format extending CoNLL-U is called [CoNLL-U Plus](https://universaldependencies.org/ext-format.html).

## Consistency checks scripts

## Error mining: Grew-match

* [Grew-match](http://match.grew.fr/?corpus=PARSEME-EN): an online tool for querying annotated data.

## Gitlab data repositories

* [Development Gitlab space](https://gitlab.com/parseme/sharedtask-data-dev) (for authorised users): contains development versions of the corpora, double-aligned corpora for inter-annotator agreement (IAA) calculation, system results from previous editions, and various scripts for ST organizers (automating system evaluation, publishing the results, running IAA). In 2020, we would like to experiment with moving the development versions of the language corpora to dedicated GitLab repositories.
* [Description of PARSEME repositories](https://docs.google.com/document/d/1Wkx7bWTR04TXFVypPKy-qYi4ugc_034BtfskDeLDoGU/). This document may require updates; please send us a message if you find any inconsistency.

## Guidelines editions and example editing

* [PARSEME guidelines](https://gitlab.com/parseme/sharedtask-guidelines): a repository hosting the HTML guidelines and issues page (LLs generally do not need to edit the guidelines directly, but they do participate in raising and solving issues).