Wrong encoding for OFX data in boobank
When I run history in boobank, to export an OFX file for Gnucash, I sometimes get French labels that contain characters outside of US-ASCII. For example:
[yves@localhost ofx]$ LANG=C grep "[^-[:alnum:]<>/ .:']" Oney_visa.ofx
<NAME>Prélèvement mensualité
<NAME>Prélèvement mensualité
<NAME>Prélèvement mensualité
<NAME>Prélèvement mensualité
<NAME>Prélèvement mensualité
<NAME>Prélèvement mensualité
<NAME>Prélèvement mensualité
<NAME>Avoir crédit effectué par PAYPAL
<NAME>Prélèvement mensualité
<NAME>Avoir crédit effectué par PAYPAL
But the header says ENCODING:USASCII. Thus:
- There are encoding errors during the download (much like this one).
- When imported into Gnucash, the software trusts the encoding declaration, and imports words such as Prélèvement.
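A quick way to confirm the mismatch on an exported file is sketched below. It is only an illustration: the helper names `declared_encoding` and `non_ascii_lines` are hypothetical and are not part of Weboob, and the sample bytes are a made-up miniature of the file above.

```python
import re


def declared_encoding(ofx_bytes):
    """Return the value of the ENCODING: header line, or None if absent."""
    match = re.search(rb"^ENCODING:(\S+)", ofx_bytes, re.MULTILINE)
    return match.group(1).decode("ascii") if match else None


def non_ascii_lines(ofx_bytes):
    """Return the lines that contain at least one byte outside US-ASCII."""
    return [line for line in ofx_bytes.splitlines()
            if any(byte > 0x7F for byte in line)]


# Hypothetical miniature export: the header claims US-ASCII,
# but one label was written as UTF-8.
sample = (
    "OFXHEADER:100\n"
    "ENCODING:USASCII\n"
    "<NAME>Paiement par carte\n"
    "<NAME>Prélèvement mensualité\n"
).encode("utf-8")

if declared_encoding(sample) == "USASCII" and non_ascii_lines(sample):
    print("header says USASCII but the body contains non-ASCII bytes")
```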
It probably wouldn’t hurt to change the default encoding to UTF-8, since:
- UTF-8 is compatible with US-ASCII for the whole US-ASCII range.
- Weboob does not seem to stop on encoding errors, in case a bank returns something else (e.g. ISO-8859-*).
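The compatibility argument can be checked in a few lines of Python. This is a small sketch using only the standard string codecs; the helper `fits_in_ascii` is hypothetical and the labels are just examples.

```python
def fits_in_ascii(label):
    """Return True if the label can be encoded as US-ASCII."""
    try:
        label.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


plain = "Paiement par carte"
accented = "Prélèvement mensualité"

# For pure ASCII text, the UTF-8 bytes are identical to the US-ASCII bytes,
# so declaring UTF-8 changes nothing for labels that already worked.
assert plain.encode("utf-8") == plain.encode("ascii")

# The accented label cannot be represented in US-ASCII at all...
assert not fits_in_ascii(accented)

# ...but it round-trips through UTF-8 without loss.
assert accented.encode("utf-8").decode("utf-8") == accented
```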
Of course, the best way would be to detect the encoding… But this means buffering, and other things to think about :-(
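For what it is worth, a very crude form of detection could look like the sketch below. It is an assumption on my part, not how Weboob handles things: the helper `decode_label` is hypothetical, it needs the complete byte string in memory (hence the buffering), and the ISO-8859-1 fallback is just a guess that never fails, not real detection.

```python
def decode_label(raw, fallback="iso-8859-1"):
    """Decode bytes as UTF-8, falling back to a single-byte charset.

    UTF-8 is strict enough that text in another 8-bit encoding
    usually fails to decode, which is what makes the first try useful.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is treated as
        # ISO-8859-1, which accepts every byte value.
        return raw.decode(fallback)


# The same label, as two different banks might send it.
as_utf8 = "Prélèvement".encode("utf-8")
as_latin1 = "Prélèvement".encode("iso-8859-1")

print(decode_label(as_utf8))
print(decode_label(as_latin1))
```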