
Implement Markup/Formatting for source texts

Currently we allow users to upload plaintext and Wikipedia pages (via their Wiki source code) as documents to translate. We store the plaintext and then attempt to segment it into sentences.

The current approach works fine in 99% of all cases, but has some disadvantages:

  • Translations sometimes contain Wiki sourcecode markup
  • The editor doesn't give any hints about the original formatting, so you have to guess what's a headline or a paragraph
  • You cannot easily change the segmentation or content of a translation.

To address this, we should work on three things:

  1. Decide on a markup language that allows simple formatting while making it easy to perform segmentation
  2. Implement a new translation view that shows the source text in its original formatting and opens a layer with the editor when a segment is clicked
  3. Allow editing of source texts and their segmentation
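
To make the first point more concrete, here is a minimal sketch of how a very small markup could keep segmentation simple. Everything in it is an assumption for illustration, not a decided format: blocks are separated by blank lines, a block starting with `# ` is a headline, and the `Segment` class, the `segment` function and the naive sentence regex are hypothetical names and rules.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    block: int  # index of the block (headline or paragraph) the segment came from
    kind: str   # "headline" or "paragraph"
    text: str   # text of the segment, without markup


# Naive sentence boundary (assumption): whitespace that follows ".", "!" or "?".
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def segment(source: str) -> list[Segment]:
    """Split a source text into headline and sentence segments."""
    segments = []
    # Blocks are separated by one or more blank lines.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", source) if b.strip()]
    for i, block in enumerate(blocks):
        if block.startswith("# "):
            # A headline is always a single segment.
            segments.append(Segment(i, "headline", block[2:].strip()))
            continue
        # Collapse line breaks inside a paragraph, then split into sentences.
        text = " ".join(block.split())
        for sentence in _SENTENCE_END.split(text):
            if sentence:
                segments.append(Segment(i, "paragraph", sentence))
    return segments
```

Because every segment remembers its block and kind, a translation view could render headlines and paragraphs as such, and editing the segmentation would only mean merging or splitting segments within one block. A real implementation would need a better sentence splitter (abbreviations, quotes, and so on).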

We still need to figure out whether to keep the Wiki uploader as it is or to convert the Wiki markup to our custom markup. We also need to ensure that all translations can be downloaded; special care is needed to compile Wiki pages back into Wiki source code.
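
As a rough illustration of the download step, the sketch below compiles translated segments back into Wiki source code. It assumes that each segment is a hypothetical `(block, kind, text)` tuple, that segments arrive in document order, and that only level-2 headlines (`== ... ==`) and plain paragraphs exist. Real Wiki pages contain far more markup (links, templates, tables), so this only shows the general shape.

```python
from itertools import groupby


def to_wikitext(segments):
    """Rebuild Wiki source from (block, kind, text) tuples in document order."""
    blocks = []
    # Consecutive segments with the same block index belong together.
    for _, group in groupby(segments, key=lambda s: s[0]):
        group = list(group)
        kind = group[0][1]
        text = " ".join(s[2] for s in group)
        if kind == "headline":
            blocks.append(f"== {text} ==")
        else:
            blocks.append(text)
    # Blank lines separate blocks in the output.
    return "\n\n".join(blocks) + "\n"


translated = [
    (0, "headline", "Titel"),
    (1, "paragraph", "Erster Satz."),
    (1, "paragraph", "Zweiter Satz."),
    (2, "paragraph", "Letzter Absatz."),
]
print(to_wikitext(translated))
```

Keeping the block index on every segment is what makes this round trip possible; if segmentation discards it, paragraphs cannot be reassembled reliably.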

Edited by Sebastian Utzerath