Added eszett and euro sign; revised alif in MARC-8 to UTF-8 conversion
Created by: gugek
Pull request for: Issue: https://github.com/edsu/pymarc/issues/84
Added mappings for the eszett and the euro sign. Changed the alif mapping to a different character. Added tests for these changes.
I decided not to deal with the ligatures and double tildes. These are double combining characters that are represented with two code points in MARC-8 and one in Unicode. The alternative is to map them to the left-half and right-half combining characters, which is what the code does right now. Otherwise I'd have to get into the guts of marc8.py and add something to handle all the error conditions: see MARBI Proposal 2004-08 and discussion paper.
From: https://memory.loc.gov/diglib/codetables/45.html
Note 1: The Ligature that spans two characters is constructed of two halves in MARC-8: EB (Ligature, first half) and EC (Ligature, second half). The preferred Unicode/UTF-8 mapping is to the single character Ligature that spans two characters, U+0361. The single character Ligature is encoded between the two characters to be spanned. The two half Ligatures in Unicode, to which the Ligature has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended. It is expected that font support for the single character Ligature mark will be more easily obtained than for the two halves.
Note 2: The Double Tilde that spans two characters is constructed of two halves in MARC-8: FA (Double Tilde, first half) and FB (Double Tilde, second half). The preferred Unicode/UTF-8 mapping is to the single character Double Tilde that spans two characters, U+0360. The single character Double Tilde is encoded between the two characters to be spanned. The two half Double Tildes in Unicode, to which the MARC8 Double Tilde has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended. It is expected that font support for the single character Double Tilde mark will be more easily obtained than for the two halves.
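For reference, here is a rough sketch of what the preferred mapping could look like if it were done as a post-processing step on the already-converted Unicode string, rather than inside marc8.py. It is not part of this pull request and not pymarc code: `merge_half_marks` is a hypothetical helper, and the half-mark code points (U+FE20 to U+FE23) are taken from the Unicode Combining Half Marks block, on the assumption that they are what the current mapping emits. It only handles the simple case of one character under each half, and it leaves unpaired halves untouched, which is exactly the kind of error condition a real implementation would have to think about.

```python
import re

# Unicode half marks (assumed output of the current two-half mapping).
LIGATURE_LEFT, LIGATURE_RIGHT = "\uFE20", "\uFE21"
TILDE_LEFT, TILDE_RIGHT = "\uFE22", "\uFE23"

# Preferred single marks named in the notes quoted above.
SINGLE_LIGATURE = "\u0361"
SINGLE_DOUBLE_TILDE = "\u0360"

_PAIRS = [
    (LIGATURE_LEFT, LIGATURE_RIGHT, SINGLE_LIGATURE),
    (TILDE_LEFT, TILDE_RIGHT, SINGLE_DOUBLE_TILDE),
]


def merge_half_marks(text):
    """Replace 'X<left half>Y<right half>' with 'X<single mark>Y'.

    Hypothetical helper: assumes exactly one character precedes each
    half mark. Unpaired halves are left as they are.
    """
    for left, right, single in _PAIRS:
        pattern = re.compile("(.)" + left + "(.)" + right, re.DOTALL)
        # Put the single mark between the two spanned characters.
        text = pattern.sub(
            lambda m, s=single: m.group(1) + s + m.group(2), text
        )
    return text
```

With this sketch, `merge_half_marks("t\uFE20s\uFE21")` gives `"t\u0361s"`, while a string with only a first half, such as `"t\uFE20s"`, comes back unchanged.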