Fixed up the PHP script to be compatible with the latest WiktionaryParser (0.0.97 as of writing).
The new WiktionaryParser fixes quite a few issues but also introduces a small one, so the processed .txt files are not up to date yet. I am waiting for that to be fixed before committing an update for them.
iflogic, allowing me to remove workarounds that were made because of it.
Rewrote inflection detection. This cleaned a lot of garbage from definitions and in turn added a tiny bit of garbage into inflections, but it fixed way more than it broke, including multi-line inflections.
Improved the Amazon FW workaround, which fixes looking up inflections of words that already had a direct definition when accents were involved.
Release 0.15
Release 0.01-0.14 (for changelog history)
- 0.01 - Initial release
  - Contains nouns, pronouns, verbs and lots of bugs.
- 0.02 - Arrival of the adjectives
  - Who knew those are needed!
- 0.03 - Added adverbs and squashed some bugs
  - Turns out adverbs are important too
  - Fixed the dictionary being marked as en-us>en-us instead of nb-no>en-us
  - Squashed an ugly bug which prevented a lot of words (1000+) from being added to the final dictionary
  - Squashed a bug which caused definitions to be overwritten if there was more than one section of them on Wiktionary; now all of them are saved correctly
  - Removed broken entries from the .txt and .inf files, so the final size should be smaller
- 0.04 - Added prepositions, updated all entries
  - This seems to be a recurring theme! Added prepositions
  - Re-scraped Wiktionary. Made the process fully automated so this will be easier to do in the future.
- 0.05 - Bugsquash season
  - Wiktionary can have nested definitions; the parser now accounts for that. ~1300 words now have better/more definitions, and hundreds of words now have additional inflections
  - Hundreds of inflections had garbage input, fixed that
  - 0.04 introduced a bug that killed a handful of words in the wordlist at random; developed a workaround and tests to detect this in the future
- 0.10 - Universe expansion
  - Added determiners
  - Added support for Nynorsk to all scripts and released it here too
  - Parser issues were fixed upstream. ~500 new/better definitions and ~400 new/better inflections.
  - Properly catch garbage inflection input (try #2). What's caught should later be added to the definition as extra info. #TODO
  - After fixing the issue above, the inflections now have entries like "adekvat, adekvate, mer adekvat, mest adekvat" - should I be filtering out mer/mest inflections? No clue.
  - Reverted the fix for correctly marking dictionaries (so from nb-NO>en-us back to en-us>en-us) because Kindle dictionary search is broken: it is impossible to search the dictionary otherwise. See here for reasoning and a more detailed description
  - Fixed an issue with phrase definition inflections. My hacky workaround was actually affecting way more stuff than I thought. ~400 inflections (~1000 for Nynorsk) now have new or better definitions.
- 0.11 - 'Dictionary features are not a priority' edition
  - Added conjunctions
  - Re-scraped Wiktionary to include my typo fixes
  - Word parser script now properly uses the Wiktionary API - it's much faster. Also fixed a bug that was occurring with phrases, adding one new phrase.
  - If Kindle finds a direct definition of a word, it does not bother to search for definitions from inflections. Used a disgusting hack: cloning all the relevant entries as additional direct definitions. This seems to fix about 1K words in each dictionary.
- 0.12 - Formatting fix
- 0.13 - Bugfixes
  - Fixed a handful of remaining inflections that were still broken after the attempted fix in 0.10. There should be no broken inflections now.
  - Turns out Kindle also does not bother to search for an inflection definition if it has already found ANY definition, direct or inflection. Improved upon the fix from 0.11 to really add all affected words.
- 0.14 - Cleanup
  - Added numerals
  - Cleaned up definition entries like "plural of X" or "alternative form of Y" and turned them into inflections instead. This added a lot of garbage inflections, which I need to take care of, and trimmed a handful of definitions of proper entries. To be fixed.
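
For readers curious about the Kindle lookup workaround from 0.11 and 0.13, here is a rough Python sketch of the idea, not the actual script. The data layout (a dict mapping each headword to its `definitions` and `inflections`), the function name `clone_shadowed_inflections` and the `base: definition` prefix format are all assumptions made for this illustration.

```python
from collections import defaultdict


def clone_shadowed_inflections(entries):
    """Return a copy of `entries` with cloned direct definitions added.

    `entries` maps headword -> {"definitions": [...], "inflections": [...]}
    (a hypothetical layout chosen for this sketch).
    """
    # Map every inflected form to the headwords that list it.
    bases_by_form = defaultdict(list)
    for base, entry in entries.items():
        for form in entry["inflections"]:
            if form != base:
                bases_by_form[form].append(base)

    # Work on a copy so the input stays untouched.
    result = {
        word: {
            "definitions": list(entry["definitions"]),
            "inflections": list(entry["inflections"]),
        }
        for word, entry in entries.items()
    }

    for form, bases in bases_by_form.items():
        # A form is "shadowed" when Kindle would stop at the first hit:
        # it is a headword itself (0.11 case) or it is an inflection of
        # several headwords (0.13 case).
        if form not in entries and len(bases) < 2:
            continue
        target = result.setdefault(form, {"definitions": [], "inflections": []})
        for base in bases:
            for definition in entries[base]["definitions"]:
                cloned = f"{base}: {definition}"
                if cloned not in target["definitions"]:
                    target["definitions"].append(cloned)
    return result
```

With this layout, a word such as "får" (a noun, and also the present tense of "få") would end up with both its own definition and a cloned "få: ..." definition under one direct entry.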
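
The 0.14 cleanup of "plural of X" style definitions could look roughly like the Python sketch below. The list of recognised phrasings, the regular expression and the function name `split_form_of` are assumptions for illustration; the real script may match different or additional wordings.

```python
import re

# Hypothetical pattern: only a few phrasings are covered here.
# It is anchored to the whole string, so a longer, proper definition that
# merely starts with "plural of ..." is left alone.
FORM_OF = re.compile(
    r"^(?:plural|alternative form|definite singular) of (?P<base>[^\s,.;()]+)\.?$",
    re.IGNORECASE,
)


def split_form_of(definitions):
    """Separate real definitions from "X of Y" stubs.

    Returns (kept_definitions, base_words). Each base word is a headword
    that should list the current word as one of its inflections.
    """
    kept, bases = [], []
    for definition in definitions:
        match = FORM_OF.match(definition.strip())
        if match:
            bases.append(match.group("base"))
        else:
            kept.append(definition)
    return kept, bases
```

Matching the whole string rather than a prefix is one possible way to limit the side effect mentioned above, where proper definitions get trimmed; whether it would also reduce the garbage inflections is untested here.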