YAML parser is not thread-safe

Created by: A

I have encountered an issue when parsing multiple YAML files in parallel. The symptom is that the surrounding double quotes are not stripped from some values. For example, the file:

foo:
    "key1": "value1"
    "key2": "value2"
    "key3": "value3"

might be parsed as if the input file were:

foo:
    "key1": "value1"
    "key2": "\"value2\""
    "key3": "value3"

I have traced the error down to the following method in the class net.sf.okapi.filters.yaml.parser.Line:

	public static String decode(String encoded) {
		String decoded = encoded;
		try {
			decoded = (String) yaml.load(encoded);
		} catch(Exception e) {
			// case where snakeyaml blows up on surrogates and other "non-printables"
			// just pass through the unicode value and hope for the best
			// FIXME: Dump snakeyaml and use our own decoder
		}
		return decoded;
	}

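As the 'yaml' object in the code above is a static instance, it is shared by every thread that calls decode. SnakeYAML's Yaml instances are not thread-safe, so concurrent yaml.load calls can interfere with one another and return a value that still carries its quotes. I have created a pull request which surrounds the yaml.load line with a synchronized block here.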
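
For illustration, here is a minimal, self-contained sketch of the synchronized approach. It is not the code from the pull request. To keep it runnable without SnakeYAML on the classpath, the shared parser is replaced by a hypothetical StatefulParser class that reuses a scratch buffer between calls; the names DecodeSketch, StatefulParser, PARSER and countMismatches are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DecodeSketch {

    /** Hypothetical stand-in for a parser that keeps mutable state between calls. */
    static final class StatefulParser {
        // Scratch buffer reused by every call; this is what makes the class unsafe to share.
        private final StringBuilder buffer = new StringBuilder();

        String load(String encoded) {
            buffer.setLength(0);
            buffer.append(encoded);
            int len = buffer.length();
            // Strip one pair of surrounding double quotes, if present.
            if (len >= 2 && buffer.charAt(0) == '"' && buffer.charAt(len - 1) == '"') {
                buffer.deleteCharAt(len - 1).deleteCharAt(0);
            }
            return buffer.toString();
        }
    }

    // Single static instance, shared by all threads (mirrors the static 'yaml' field).
    private static final StatefulParser PARSER = new StatefulParser();

    public static String decode(String encoded) {
        String decoded = encoded;
        try {
            // Only one thread at a time may touch the shared parser.
            synchronized (PARSER) {
                decoded = PARSER.load(encoded);
            }
        } catch (RuntimeException e) {
            // Same fallback as the original method: pass the raw value through.
        }
        return decoded;
    }

    /** Calls decode from several threads and counts results that differ from the expected value. */
    static int countMismatches(int threads, int callsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            results.add(pool.submit(() -> {
                int bad = 0;
                for (int i = 0; i < callsPerThread; i++) {
                    String expected = "value" + id + "-" + i;
                    if (!expected.equals(decode("\"" + expected + "\""))) {
                        bad++;
                    }
                }
                return bad;
            }));
        }
        pool.shutdown();
        int total = 0;
        for (Future<Integer> f : results) {
            try {
                total += f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
        return total;
    }
}
```

The lock serializes all decoding, which is simple but may become a bottleneck when many files are parsed at once. A possible alternative, not what the pull request does, would be to give each thread its own parser instance (for example through a ThreadLocal) so that no locking is needed.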
