YAML parser is not thread-safe
Created by: A
I have encountered an issue when parsing multiple YAML files in parallel: double quotes are intermittently not stripped from some values. For example, the file:
foo:
  "key1": "value1"
  "key2": "value2"
  "key3": "value3"
might be parsed as if the input file were:
foo:
  "key1": "value1"
  "key2": "\"value2\""
  "key3": "value3"
I have traced the error down to the following method in the class net.sf.okapi.filters.yaml.parser.Line:
public static String decode(String encoded) {
    String decoded = encoded;
    try {
        decoded = (String) yaml.load(encoded);
    } catch (Exception e) {
        // case where snakeyaml blows up on surrogates and other "non-printables"
        // just pass through the unicode value and hope for the best
        // FIXME: Dump snakeyaml and use our own decoder
    }
    return decoded;
}
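Because the 'yaml' object in the code above is a static instance, it is shared by every thread that calls decode(), and SnakeYAML's Yaml instances are not safe for concurrent use. When two threads call yaml.load at the same time, one of them can fail and fall into the catch block, which returns the still-quoted input unchanged. I have created a pull request which surrounds the yaml.load line with a synchronized block here.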
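For illustration, here is a minimal sketch of two common ways to guard a decoder that cannot be shared between threads: serialising access with a lock (the approach taken in the pull request) and giving each thread its own instance. This is not the actual patch. The class SafeDecoder, its methods decodeSynchronized and decodePerThread, and the loaderFactory parameter are hypothetical names, and the Function<String, Object> is only a stand-in for a call such as yaml::load, so that the sketch compiles without SnakeYAML on the classpath.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical sketch, not the Okapi code: two ways to call a loader
 * that is not thread-safe. The Function<String, Object> stands in for
 * something like yaml::load.
 */
final class SafeDecoder {
    private final Object lock = new Object();
    private final Function<String, Object> sharedLoader;
    private final ThreadLocal<Function<String, Object>> perThreadLoader;

    SafeDecoder(Supplier<Function<String, Object>> loaderFactory) {
        // One instance shared by all threads, used only under the lock.
        this.sharedLoader = loaderFactory.get();
        // One lazily created instance per thread, used without locking.
        this.perThreadLoader = ThreadLocal.withInitial(loaderFactory);
    }

    /** Option 1: serialise access to the single shared loader. */
    String decodeSynchronized(String encoded) {
        try {
            synchronized (lock) {
                return (String) sharedLoader.apply(encoded);
            }
        } catch (RuntimeException e) {
            // Same pass-through fallback as the original decode() method.
            return encoded;
        }
    }

    /** Option 2: each thread uses its own loader, so no lock is needed. */
    String decodePerThread(String encoded) {
        try {
            return (String) perThreadLoader.get().apply(encoded);
        } catch (RuntimeException e) {
            return encoded;
        }
    }
}
```

The synchronized variant is the smallest change but makes all decoding threads queue behind one lock. The per-thread variant avoids that contention at the cost of one loader instance per thread, which may matter if the instances are expensive to create or the threads are short-lived.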