• Author Owner

    workflow concept (a rough data-model sketch follows the list):

    1. launch game and tool
    2. play until dialogue
    3. push "capture" button/hotkey
    4. advance dialogue
    5. repeat 3 and 4
    6. push "tag" button
    7. captured moment is shown (screenshot or savestate), draw boxes over glyphs
    8. push "next" button/hotkey
    9. repeat 7 and 8
    10. push "map" button
    11. glyph is shown, enter Unicode equivalent (native text)
    12. push "next" button/hotkey
    13. repeat 11 and 12
    14. push "translate" button
    15. play until dialogue
    16. push "capture" button/hotkey
    17. copy transcribed text into Google Translate and hope
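
    Roughly what that implies data-wise, as a sketch only (every type and member name below is hypothetical, nothing the tool actually defines): capture yields a moment, tagging yields boxes over a moment, mapping fills a glyph→text table, and the translate step just concatenates lookups.

    ```csharp
    // Hypothetical data model for the workflow above; names are invented for illustration.
    using System.Collections.Generic;
    using System.Drawing;
    using System.Linq;

    public sealed record CapturedMoment(int Frame, byte[] Screenshot); // "capture" step; could also hold a savestate

    public sealed record GlyphBox(int MomentIndex, Rectangle Bounds);  // one box drawn during "tag"

    public sealed class GlyphMap // filled in during "map"
    {
        // keyed by a hash of the glyph's pixels so the same glyph only has to be entered once
        private readonly Dictionary<string, string> _toText = new();

        public void Map(string glyphHash, string text) => _toText[glyphHash] = text;

        // "translate" step input: transcribe a sequence of tagged glyphs, leaving unmapped ones visible
        public string Transcribe(IEnumerable<string> glyphHashes)
            => string.Concat(glyphHashes.Select(h => _toText.TryGetValue(h, out var t) ? t : "□"));
    }
    ```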

    edit 2023-08-10: I've mostly implemented the first part, glyph capture. The UX is very different from what I initially envisioned. You can see the bright-yellow OSD in one of the Yoshi's Island screenshots below. It should be extensible to variable-width text, though when I come back to this project in a few days I'd prefer to just find another game and get right into the next part, mapping. Seeing as I already spent a whole day making an on-screen keyboard for it...

    edit 2024-07-16: "when I come back to this project in a few days" ...is that so? Anyway, ScHlAuChi expressed interest in the project, which led me to finally come back and start on the latter workflow, mapping glyphs to codepoints. After much ado I have something that half works, with the caveat that the grid is hardcoded for this one textbox since I never finished that: screencap
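
    Concretely, "the grid is hardcoded" means something along these lines, with the textbox sliced into fixed-size cells at fixed offsets; the origin and cell numbers here are invented, not the real ones.

    ```csharp
    // Sketch of a hardcoded textbox grid; origin and cell dimensions are made up for illustration.
    using System.Collections.Generic;
    using System.Drawing;

    public static class TextboxGrid
    {
        private static readonly Point Origin = new(16, 144); // top-left of the textbox, in screen pixels
        private const int CellW = 8, CellH = 16, Cols = 28, Rows = 3;

        // enumerate the screen rectangle of every glyph cell, row by row
        public static IEnumerable<Rectangle> Cells()
        {
            for (var row = 0; row < Rows; row++)
                for (var col = 0; col < Cols; col++)
                    yield return new Rectangle(Origin.X + col * CellW, Origin.Y + row * CellH, CellW, CellH);
        }
    }
    ```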

  • Author Owner

    older games have nice grids (update: AAAAAAAAAAAAA):

    Yoshi's Island screencap

    Yoshi's Island screencap
    Yoshi's Island screencap in GIMP

    but newer games, while still using sprites, have variable-width glyphs:

    Mario and Luigi screencap
    Mario and Luigi screencap in GIMP

    (at first glance it looks like it just needs aligning: the kana seem to all be 8 px wide and the kanji all 12 px, suggesting a 4 px alignment grid, but the small イ is a pixel narrower and throws it off)
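
    One alternative to guessing an alignment grid would be segmenting on blank pixel columns. A sketch, not what the tool currently does; it assumes adjacent glyphs always have at least one fully blank column between them, which tight kerning could break.

    ```csharp
    // Sketch: segment variable-width glyphs in one line of text by splitting on blank columns.
    using System.Collections.Generic;

    public static class GlyphSegmenter
    {
        // ink[x, y] is true where a text pixel is set, for a single line of text
        public static IEnumerable<(int Start, int Width)> Segment(bool[,] ink)
        {
            int width = ink.GetLength(0), height = ink.GetLength(1);
            var start = -1;
            for (var x = 0; x <= width; x++) // one column past the end so a trailing glyph gets closed
            {
                var blank = true;
                if (x < width)
                    for (var y = 0; y < height && blank; y++)
                        if (ink[x, y]) blank = false;

                if (!blank && start < 0) start = x;   // first inked column: a glyph begins
                else if (blank && start >= 0)         // blank column after a glyph: it ends
                {
                    yield return (start, x - start);
                    start = -1;
                }
            }
        }
    }
    ```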

  • Author Owner

    NEVERMIND, GOT IT! https://www.nuget.org/packages/TesserNet

    ...but now I'm seeing that Tesseract doesn't agree with the Yoshi's Island font in the slightest
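
    No idea whether it rescues this particular font, but the usual first step before pointing Tesseract at tiny pixel fonts is an integer nearest-neighbour upscale plus a hard black/white threshold. A library-agnostic sketch over raw grayscale bytes:

    ```csharp
    // Sketch of common Tesseract pre-processing for pixel fonts: integer upscale + binarise.
    public static class OcrPreprocess
    {
        // pixels: 8-bit grayscale, row-major, width*height long; returns the image scaled up by `factor`
        public static byte[] Upscale(byte[] pixels, int width, int height, int factor)
        {
            var outW = width * factor;
            var result = new byte[outW * height * factor];
            for (var y = 0; y < height * factor; y++)
                for (var x = 0; x < outW; x++)
                    result[y * outW + x] = pixels[(y / factor) * width + (x / factor)];
            return result;
        }

        // crush anti-aliasing and textbox background down to pure black/white
        public static byte[] Threshold(byte[] pixels, byte cutoff = 128)
        {
            var result = new byte[pixels.Length];
            for (var i = 0; i < pixels.Length; i++)
                result[i] = pixels[i] >= cutoff ? (byte)255 : (byte)0;
            return result;
        }
    }
    ```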


    https://www.nuget.org/packages/TesseractOCR its API isn't psychotic like the other Tesseract wrapper's, but it needs a tiny patch before I can even try hacking on a POC.

    https://www.nuget.org/packages/Sdcb.PaddleOCR promising, but doesn't ship binaries for Linux

    https://github.com/cyanfish/naps2/tree/master/NAPS2.Sdk hey look, not only is this PDF scanner FOSS, this one is actually modular! unfortunately, while the modules look nice, they're not published to NuGet yet. also it's just an abstraction over Tesseract.

    https://scribeocr.com https://github.com/scribeocr/scribe.js pros: FOSS, not Tesseract; cons: JS, only English and other Latin-script languages atm

  • Author Owner

    need to handle scrolling textboxes
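
    Assuming "handle" means merging overlapping captures as the box scrolls, one sketch: keep a running transcript and only append the lines that haven't been seen yet (lines here being whatever the capture step produces, e.g. strings of glyph hashes or transcribed text).

    ```csharp
    // Sketch: merge captures of a scrolling textbox by dropping the overlapping lines.
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public sealed class ScrollingTranscript
    {
        private readonly List<string> _lines = new();

        public void Add(IReadOnlyList<string> capturedLines)
        {
            // find the longest run where the tail of the transcript matches the head of this capture
            var overlap = 0;
            for (var k = Math.Min(_lines.Count, capturedLines.Count); k > 0; k--)
            {
                if (_lines.Skip(_lines.Count - k).SequenceEqual(capturedLines.Take(k)))
                {
                    overlap = k;
                    break;
                }
            }
            _lines.AddRange(capturedLines.Skip(overlap)); // append only the new lines
        }

        public IReadOnlyList<string> Lines => _lines;
    }
    ```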

  • Author Owner

    Great results! Sadly not FOSS... screencap

  • Author Owner

    Just remembered the electronic jisho (dictionary) on the DS has OCR... The input region is fairly low-resolution; I wonder if its algorithms would work on NN-upscaled sprites?

  • Author Owner

    found this, mainly intended for Chinese<->English from photographs of printed or handwritten text https://paddlepaddle.github.io/PaddleOCR/en/ppocr/overview.html

    some ML solutions:

  • Author Owner

    prior art (heavily reliant on Google) https://gitlab.com/spherebeaker/vgtranslate https://ztranslate.net/docs

    and another existing tool, powered by manual (cutscene detection and?) localisation https://github.com/eadmaster/RetroSubs

  • Author Owner

    FCEUX already has this??? https://fceux.com/web/help/TextHooker.html
