Commit f11e7928 authored by Arne Köhn

A New And Shiny README File!

parent 42e37049
For the evaluation of tagging results you need SVMTool. You can get it
from [1].
Setup and general Information
If you want to use the train-evaluate UI, you have to copy
train-evaluate.conf.sample to train-evaluate.conf and adjust
everything to your needs.
The subdirectory "taggers" contains taggers that wrap different
external taggers such as TnT. Every tagger contains the configuration
of the tagger it wraps. This makes the interface quite simple and you don't have to
fiddle around with configuration files each time you want to test
something. If you want to test a tagger with a different
configuration, just create a new tagger. Have a look at the existing
interfaces and taggers/README for more information.
You should add every tagger you want to evaluate to the list of
taggers in the configuration file.
All information is stored in a sqlite3 database: which tagger has
been trained on which testset, the results of the tests, and
information about the testsets.
To start the program, type `perl`
Creating new Testsets
Just use "tsets->create tset" in the menu to create a tset.
A testset consists of three files: train, test and gold.
- train contains the sentences used to train a tagger.
- test is the file used to test the tagger. It contains all the
sentences that are not in train.
- gold contains the same sentences as test and the correct tag for
each word.
You will be asked how many sentences you want for training. A new
testset will be created, using randomly chosen sentences for
training. All sentences that are not in the training set go to the
test set.
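The split described above can be sketched roughly as follows. This is
only an illustration in Python, not the code train-evaluate actually
uses: the function name split_testset, the in-memory representation of
a sentence as a list of (word, tag) pairs, and the assumption that
"test" holds the words without their tags are all hypothetical.

```python
import random


def split_testset(sentences, n_train, seed=None):
    """Split tagged sentences into train, test and gold parts.

    Each sentence is assumed to be a list of (word, tag) pairs.
    """
    if not 0 <= n_train <= len(sentences):
        raise ValueError("n_train must be between 0 and the number of sentences")
    rng = random.Random(seed)
    # Pick the indices of the training sentences at random.
    train_ids = set(rng.sample(range(len(sentences)), n_train))
    train = [s for i, s in enumerate(sentences) if i in train_ids]
    # Everything that is not used for training ends up in gold ...
    gold = [s for i, s in enumerate(sentences) if i not in train_ids]
    # ... and test holds the same sentences with the tags stripped
    # (an assumption about the file contents, see above).
    test = [[word for word, _tag in s] for s in gold]
    return train, test, gold
```

Passing a fixed seed makes the random choice reproducible, which is
convenient when a split has to be recreated later.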
Training and testing taggers
Mark every tagger you want to train and every tset you want those
taggers to be trained on and press "train". train-evaluate will
automatically sort out those combinations that have already been trained
and distribute the remaining jobs to the hosts specified in the
configuration file.
You can then use the same process to test the taggers on tsets.
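The bookkeeping described above (skip combinations that are already
trained, spread the rest over the configured hosts) could look like
the following Python sketch. It is not taken from train-evaluate: the
function name plan_jobs, the set of already trained pairs and the
round-robin assignment are assumptions made for illustration, and the
real program may schedule jobs differently.

```python
from itertools import product


def plan_jobs(taggers, tsets, already_trained, hosts):
    """Return {host: [(tagger, tset), ...]} for combinations still to train."""
    if not hosts:
        raise ValueError("at least one host is required")
    # Keep only the (tagger, tset) pairs that have not been trained yet.
    pending = [c for c in product(taggers, tsets) if c not in already_trained]
    plan = {host: [] for host in hosts}
    # Hand out the remaining jobs to the hosts in turn (round-robin).
    for i, combo in enumerate(pending):
        plan[hosts[i % len(hosts)]].append(combo)
    return plan
```

In the real program the already trained pairs would come from the
sqlite3 database; here they are simply passed in as a set of tuples.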
IMPORTANT: svmt-standard has to be trained on each tset you want to
use for testing. The training data created by svmt-standard is used
for the evaluation.
Getting nice graphs
Since nobody really wants to poke around in the sqlite database to get
the results, train-evaluate can show you some nice graphs. Just mark
every tagger you want to be graphed and go to "taggers->make graph".
You will be asked for an output directory. This is the directory in
which you can find all the data to plot the resulting graph by
hand. If you have an X server running, train-evaluate will start
gnuplot for you.
Questions, Comments etc.
If you have questions, send me an e-mail: arne at
This software is licensed under the GPLv3 or later.
You should be able to get it from