-
🦅 @YoshiRulz (Author, Owner)
workflow concept:
0. launch game and tool
1. play until dialogue
2. push "capture" button/hotkey
3. advance dialogue
4. repeat 2 and 3
5. push "tag" button
6. captured moment is shown (screenshot or savestate), draw boxes over glyphs
7. push "next" button/hotkey
8. repeat 6 and 7
9. push "map" button
10. glyph is shown, enter Unicode equivalent (native text)
11. push "next" button/hotkey
12. repeat 10 and 11
13. push "translate" button
14. play until dialogue
15. push "capture" button/hotkey
16. copy transcribed text into Google Translate and hope
edit 2023-08-10: I've mostly implemented the first part, glyph capture. The UX is very different from what I initially envisioned. You can see the bright-yellow OSD in one of the Yoshi's Island screenshots below. It should be extensible to variable-width text, though when I come back to this project in a few days I'd prefer to just find another game and get right into the next part, mapping. Seeing as I already spent a whole day making an on-screen keyboard for it...
edit 2024-07-16: "when I come back to this project in a few days" ...is that so? Anyway, ScHlAuChi expressed interest in the project, which led me to finally come back and start on the latter workflow, mapping glyphs to codepoints. After much ado I have something that half works, with the caveat that the grid is hardcoded for this one textbox since I never finished that:
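For reference, the mapping step can be sketched roughly as below. This is not the tool's actual code: it is a minimal Python illustration that assumes the textbox has already been reduced to a 2D array of 0/1 pixels and that every glyph sits in a fixed-size cell. The names `slice_grid`, `transcribe`, and `glyph_map` are hypothetical.

```python
from typing import Dict, List, Tuple

# A glyph bitmap as a tuple of pixel rows, so it is hashable and can key a dict.
Bitmap = Tuple[Tuple[int, ...], ...]


def slice_grid(textbox: List[List[int]], cell_w: int, cell_h: int) -> List[Bitmap]:
    """Cut a 2D pixel array into cell_w x cell_h tiles, in row-major order.

    Assumes a fixed grid; partial cells at the right/bottom edge are dropped.
    """
    rows = len(textbox) // cell_h
    cols = len(textbox[0]) // cell_w if textbox else 0
    cells: List[Bitmap] = []
    for r in range(rows):
        for c in range(cols):
            cell = tuple(
                tuple(textbox[r * cell_h + y][c * cell_w:(c + 1) * cell_w])
                for y in range(cell_h)
            )
            cells.append(cell)
    return cells


def transcribe(cells: List[Bitmap], glyph_map: Dict[Bitmap, str],
               unknown: str = "\uFFFD") -> str:
    """Look up each cell in the user-built glyph map; unmapped cells become U+FFFD."""
    return "".join(glyph_map.get(cell, unknown) for cell in cells)


# Toy example with 2x2 "glyphs"; real cells would be something like 8x8 or larger.
textbox = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
cells = slice_grid(textbox, cell_w=2, cell_h=2)
glyph_map = {cells[0]: "あ"}  # the user has only mapped the first glyph so far
text = transcribe(cells, glyph_map)
```

Keying on the exact bitmap only works while the game renders each glyph pixel-identically every time, which is the assumption that makes this approach simpler than general OCR.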
Edited by YoshiRulz
-
🦅 @YoshiRulz (Author, Owner)
older games have nice grids:
update: AAAAAAAAAAAAA
but newer games, while still using sprites, have variable-width glyphs:
(at first glance it looks like it just needs aligning, but while it seems the kana are all 8 px wide and the kanji are all 12 px wide, suggesting a 4 px alignment, the small イ is a pixel narrower and throws it off)
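One common alternative to a fixed grid is to split a line of text wherever there is a run of fully blank pixel columns. The sketch below is only an illustration of that heuristic, not something taken from the tool; `segment_columns` and its parameters are hypothetical names, and the input is assumed to be one line of text as a 2D array where `background` marks empty pixels.

```python
from typing import List, Tuple


def segment_columns(line: List[List[int]], background: int = 0,
                    min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, one per run of inked columns.

    A glyph is closed once at least `min_gap` consecutive blank columns follow it.
    """
    width = len(line[0]) if line else 0
    spans: List[Tuple[int, int]] = []
    start = None     # first inked column of the glyph being built
    last_ink = None  # most recent inked column of that glyph
    for x in range(width):
        blank = all(row[x] == background for row in line)
        if blank:
            if start is not None and x - last_ink >= min_gap:
                spans.append((start, last_ink + 1))
                start = None
        else:
            if start is None:
                start = x
            last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


# Single-row toy line: ink at columns 0-1, 3, and 6-8.
line = [[1, 1, 0, 1, 0, 0, 1, 1, 1]]
narrow = segment_columns(line, min_gap=1)
wide = segment_columns(line, min_gap=2)
```

The obvious weakness is that many kana (い, for example) contain a blank column inside the glyph, so `min_gap` has to exceed the widest internal gap while staying at or below the inter-glyph spacing. Whether such a value exists for a given font is something to check per game.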
Edited by YoshiRulz
-
🦅 @YoshiRulz (Author, Owner)
NEVERMIND, GOT IT! https://www.nuget.org/packages/TesserNet
...but now I'm seeing that Tesseract doesn't agree with the Yoshi's Island font in the slightest
https://www.nuget.org/packages/TesseractOCR API isn't psychotic like the other Tesseract wrapper, but needs a tiny patch before I can even try hacking on a POC.
https://www.nuget.org/packages/Sdcb.PaddleOCR promising, but doesn't ship binaries for Linux
https://github.com/cyanfish/naps2/tree/master/NAPS2.Sdk hey look, not only is this PDF scanner FOSS, this one is actually modular! unfortunately, while the modules look nice, they're not published to NuGet yet. also it's just an abstraction over Tesseract.
https://scribeocr.com https://github.com/scribeocr/scribe.js pros: FOSS, not Tesseract; cons: JS, only English and other Latin-script languages atm
Edited by YoshiRulz
-
🦅 @YoshiRulz (Author, Owner)
need to handle scrolling textboxes
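A possible starting point for scrolling textboxes is to compare two consecutive captures and find how many pixel rows the text moved up. This is a rough sketch under several assumptions: both captures are the same size, the text scrolls upward by whole rows, and nothing else in the box changes. `scroll_offset` is a hypothetical helper, not part of the tool.

```python
from typing import List, Optional


def scroll_offset(prev: List[List[int]], curr: List[List[int]],
                  max_shift: Optional[int] = None) -> Optional[int]:
    """Return the smallest k such that prev shifted up by k rows matches the top of curr.

    Returns 0 when the captures are identical and None when no shift up to
    max_shift lines up.
    """
    height = len(prev)
    if len(curr) != height:
        raise ValueError("captures must have the same height")
    if max_shift is None:
        max_shift = height - 1
    for k in range(0, max_shift + 1):
        # Rows k.. of the old capture should reappear as rows 0.. of the new one.
        if prev[k:] == curr[:height - k]:
            return k
    return None


prev = [[1, 0], [0, 1], [1, 1]]
curr = [[0, 1], [1, 1], [0, 0]]  # same content moved up one row, new blank row below
offset = scroll_offset(prev, curr)
```

With a large `max_shift` the overlap can shrink to a row or two, and blank rows will then match by accident, so in practice it is probably worth capping `max_shift` well below the textbox height.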
-
🦅 @YoshiRulz (Author, Owner)
Just remembered the electronic jisho on the DS has OCR... The input region is fairly low-resolution, I wonder if its algorithms would work on NN-upscaled sprites?
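Assuming "NN" here means nearest-neighbour, the upscaling itself is simple: every source pixel becomes a `factor` x `factor` block, so edges stay hard and no new colours are introduced. A minimal sketch, with `upscale_nearest` as a hypothetical name:

```python
from typing import List


def upscale_nearest(sprite: List[List[int]], factor: int) -> List[List[int]]:
    """Nearest-neighbour upscale by an integer factor."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out: List[List[int]] = []
    for row in sprite:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


big = upscale_nearest([[1, 0], [0, 1]], 2)
```

Whether a handwriting-oriented recogniser copes with the resulting blocky strokes is exactly the open question above; this only shows what its input would look like.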
-
🦅 @YoshiRulz (Author, Owner)
found this, mainly intended for Chinese<->English from photographs of printed or handwritten text https://paddlepaddle.github.io/PaddleOCR/en/ppocr/overview.html
some ML solutions:
Edited by YoshiRulz
-
🦅 @YoshiRulz (Author, Owner)
prior art (heavily reliant on Google) https://gitlab.com/spherebeaker/vgtranslate https://ztranslate.net/docs
and another existing tool, powered by manual (cutscene detection and?) localisation https://github.com/eadmaster/RetroSubs
Edited by YoshiRulz
-
🦅 @YoshiRulz (Author, Owner)
FCEUX already has this??? https://fceux.com/web/help/TextHooker.html