Initial import. Authored by Kevin J. McCarthy.
Only a few of the files have been cleaned up so far.  I'll continue
cleaning, but want to push this up to the server.
### Umlauts, accents, and other non-ASCII characters are displayed as '?' or '\\123' -- locales
**Short answer:** set `LC_CTYPE=en_US.ISO-8859-1`.
**Long answer:** You have to configure your *locale* settings. This is
done by setting environment variables.
If your system is already configured, you only have to set `$LANG`
and/or some of the `$LC_*` variables. `$LANG` is the default for
the `$LC_*` categories that are unset, and `$LC_ALL` overrides
all other variables, so make sure the latter is unset. Mutt cares
mostly about these categories:

- `LC_CTYPE` is the character set used by your terminal
- `LC_MESSAGES` is the language used by the Mutt menus and printed messages
- `LC_TIME` is used by *strftime(3)*

We will use `$LANG` here. Example settings:

- `export LANG=de_DE.UTF-8` (sh/bash syntax, put that into your .bashrc/.bash_profile)
- `setenv LANG en_US.ISO-8859-1` (csh/tcsh syntax for .cshrc/.login)

The triple stands for *language*\_*country*.*charset*. There are also
variants like *de\_AT@euro*, and aliases like *deutsch*. Check the
output of `locale -a` to see which locale values are supported by
your system. Type `locale` to check the current settings of all
categories.

    $ locale
    LANG=de_DE.UTF-8
    LC_CTYPE="de_DE.UTF-8"
    ...

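As a rough illustration of the *language*\_*country*.*charset* naming scheme, the following shell sketch splits a locale name into its parts. `split_locale` is a hypothetical helper written for this page, not a Mutt or libc tool, and it assumes the name follows the pattern described above (aliases like *deutsch* simply come out as a bare language).

```shell
# split_locale NAME: print the parts of a locale name such as
# de_DE.UTF-8 or de_AT@euro. Hypothetical helper, plain POSIX sh.
split_locale() {
    name=$1
    modifier=
    charset=
    country=
    # Strip the optional @modifier, .charset and _country suffixes in turn.
    case $name in *@*) modifier=${name#*@}; name=${name%%@*} ;; esac
    case $name in *.*) charset=${name#*.}; name=${name%%.*} ;; esac
    case $name in *_*) country=${name#*_}; name=${name%%_*} ;; esac
    printf 'language=%s country=%s charset=%s modifier=%s\n' \
        "$name" "$country" "$charset" "$modifier"
}

split_locale de_DE.UTF-8
split_locale de_AT@euro
```

The charset part is the one Mutt cares about most, since it is what `$charset` is derived from.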
Finally, verify Mutt correctly detects the charset of the locale.
Restart Mutt and type:

    :set &charset ?charset
    charset="utf-8"

Don't forget to empty the Mutt [header
cache](http://www.mutt.org/doc/devel/manual.html#header-cache) when you
change the charset if you are running a Mutt older than 1.5.18.
Also, if you built Mutt yourself, it is critical that you link against a
Unicode-aware ncurses. The package for that is often named ncursesw\*. If
there is no such package, chances are that your system's ncurses already
includes Unicode support.
**Problem:** If the `locale --all-locales` list is empty, or lacks a
suitable value, you have to generate the locale files first. Check the
*localedef(1)* manpage on how to do this. Debian users can simply run
`dpkg-reconfigure locales` (make sure the *locales* package is
installed).
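Checking the list by eye can be misleading, because some systems (glibc, for example) print charset names in a normalized form such as `utf8` instead of `UTF-8`. The sketch below compares names case-insensitively and ignores hyphens. `normalize_locale` and `has_locale` are hypothetical helper names, and the normalization rule is an assumption that may not match every libc.

```shell
# normalize_locale NAME: lowercase the name and drop hyphens, so that
# de_DE.UTF-8 and de_DE.utf8 compare equal. Hypothetical helper.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# has_locale NAME: read a locale list (one name per line) on stdin and
# succeed if NAME is in it after normalization.
has_locale() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r line; do
        [ "$(normalize_locale "$line")" = "$wanted" ] && return 0
    done
    return 1
}

# Demo with a canned list; on a real system pipe in `locale -a` instead.
if printf 'C\nde_DE.utf8\nen_US.iso88591\n' | has_locale de_DE.UTF-8; then
    echo "de_DE.UTF-8 is available"
fi
```

On a real system the call would look like `locale -a | has_locale de_DE.UTF-8`.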
**Further problems:**

- Some systems (Mac OS X before 10.4, NetBSD...) have no `locale` command installed.
  Use something like `ls /usr/share/locale/` or `ls /usr/lib/locale/` to list available values.
- Some systems (libc5...) have no way to tell Mutt the locale's charset.
  You have to set the `$charset` variable in your muttrc yourself.
- Some systems (HP-UX, AIX, OSF1, Irix...) use not entirely standard names for some charsets.
  Use iconv-hooks to alias them to standard names. Example files come with the Mutt tarball.
- Some systems (Cygwin...) have no working locales.
  Use the `--enable-locales-fix` configure option and set `$charset` yourself,
  but be prepared for some limitations in functionality.

### Umlauts, accents, and other non-ASCII characters are displayed fine in some mails, but hidden in others
Make sure the mails have a proper charset declaration in the header. For
example:

    Content-Type: text/plain; charset=iso-8859-15
    Content-Transfer-Encoding: 8bit

If the charset label is missing or wrong, or these headers are absent
entirely, you can still try to make Mutt work around the problem on the fly.
Example for Western users whose broken mails are in fact mostly in the
Latin-1 or CP-1252 charset: declare CP-1252 as the default assumed charset
for broken mails.

    charset-hook ^us-ascii$ cp1252
    charset-hook ^iso-8859-1$ cp1252
    unset strict_mime
    set assumed_charset="cp1252"

or

    charset-hook US-ASCII ISO-8859-1
    charset-hook x-unknown ISO-8859-1
    charset-hook windows-1250 CP1250
    charset-hook windows-1251 CP1251
    charset-hook windows-1252 CP1252
    charset-hook windows-1253 CP1253
    charset-hook windows-1254 CP1254
    charset-hook windows-1255 CP1255
    charset-hook windows-1256 CP1256
    charset-hook windows-1257 CP1257
    charset-hook windows-1258 CP1258

Another example, for Chinese users whose broken mails are in fact mostly in
the GB2312 charset:

    charset-hook ^us-ascii$ gb2312
    unset strict_mime
    set assumed_charset="gb2312"

In more specific cases you can use the `<edit-type>` function to manually
override a wrong label. By default it is bound to the ^E key. From the index
or pager it acts on the body of the mail, while from the attachment menu it
acts on the individual part selected.
See also: [PatchList](PatchList): assumed\_charset
### Umlauts, accents, and other non-ASCII characters are only displayed wrong when using auto\_view
First, imagine a situation where you have to use [MIME
Autoview](http://www.mutt.org/doc/devel/manual.html#auto-view), e.g. to
display `text/html` content in the Mutt pager.
You get a mail with the following header:

    Content-Type: text/html; charset="iso-8859-1"

Your locale settings are:

    $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    (...)

Your mailcap looks something like:

    text/html; w3m -dump %s; copiousoutput

and there is one `auto_view` in your muttrc:

    auto_view text/html

When you open this mail in the Mutt pager, Mutt spawns *w3m* (or any
other text browser defined in mailcap), *w3m* dumps the text generated from
the input HTML file (`%s`) back to Mutt, and the Mutt pager displays it,
unfortunately with the wrong characters.
The problem is that *w3m* does not know anything about the character
encoding of the input file. *w3m* can only guess a (possible) charset
from your locale, but in our example the charsets don't match
(`iso-8859-1 != UTF-8`).
One can get around this with the `%{charset}` variable in `mailcap`:

    text/html; w3m -I %{charset} -T text/html -dump; copiousoutput

(Don't be confused by the missing `%s`: *w3m* can read data from
stdin, so `%s` is basically not needed. See [Advanced mailcap
Usage](http://www.mutt.org/doc/devel/manual.html#id929354) for details.)
The *w3m* help output says:

    $ w3m -h
    (...)
        -I charset       document charset
        -T type          specify content-type
    (...)

With this entry, your Mutt pager will print something like:

    [-- Autoview using w3m -I 'iso-8859-1' -T text/html -dump --]
    (...)
    Here are some Umlauts: äöü ÄÖÜ
    (...)

As you can see, Mutt resolved `%{charset}` correctly into
`iso-8859-1`. Of course, the input-charset options above depend on your
preferred text browser.
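To see why the declared charset matters, the mismatch can be reproduced outside Mutt with *iconv(1)*. This is only an illustrative sketch: it assumes an `iconv` command is installed and that your terminal uses UTF-8, and `latin1_umlauts` is a hypothetical helper name.

```shell
# Emit the three bytes that encode "äöü" in ISO-8859-1 (octal escapes).
latin1_umlauts() {
    printf '\344\366\374'
}

# Interpreted as UTF-8 (which is all a UTF-8 locale can suggest),
# these bytes are not a valid sequence.
if ! latin1_umlauts | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    echo "not valid UTF-8"
fi

# Declaring the real source charset, as `-I %{charset}` does for w3m,
# lets the converter produce proper UTF-8 for the terminal.
latin1_umlauts | iconv -f ISO-8859-1 -t UTF-8
echo
```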
### Characters are replaced by ? when charsets and fonts are correctly set up
The problem here is that some characters in the document's charset are
simply not available in Mutt's current charset. This is particularly
prevalent in documents created by Microsoft agents. Mutt can be instructed
to make a best-effort attempt to replace the missing characters with
something similar by appending `//TRANSLIT` to the charset setting (e.g.
`set charset=iso-8859-1//TRANSLIT`).
**Note:** However nice this "approximation" trick may be, it's only a
workaround. The best solution is to upgrade to a more capable terminal,
with a charset able to directly display all wanted characters. But that is
not always possible or easy.
### How can I check if locales work before I blame Mutt for it?
Perl is sensitive to proper locale settings. On certain distros (e.g.
Debian) it will complain when the locale settings are incorrect. Try:

    perl -e ""

This should do nothing and print nothing. If it gives a loud, ugly warning
about LANG, LC\_CTYPE and LC\_ALL, something's wrong. But if it stays
silent, that may only be because it is configured not to complain (how?).
To test for that, run:

    env LC_ALL=nocharset perl -e ""

and verify that you *do* get an ugly warning with it.
GNU *ls* also uses `$LC_CTYPE`. Simply `touch äöü` to create a file with
non-ASCII characters in its name and check whether `ls` lists the proper
name, or just "???". To test `$LC_MESSAGES`, call GNU *grep* without
arguments:

    $ grep
    Aufruf: grep [OPTION]... MUSTER [DATEI]...
    grep --help gibt Ihnen mehr Informationen.

(Obviously, this method does not work for English locales.)
### UTF-8 chars are displayed fine, but the screen is garbled
Mutt has to be linked against a term library with wide char support. For
ncurses, this is the libncurses**w** library.

    $ mutt -v | grep using
    System: Linux 2.4.25-planck (i686) [using ncurses 5.4]
    $ ldd `which mutt` | grep curses
    libncursesw.so.5 => /usr/lib/libncursesw.so.5 (0x40023000)

To get libncursesw, compile ncurses with `--enable-widec`. Debian users
can install the libncursesw5 package. (On Debian/Woody (stable), install
mutt-utf8. Starting with Debian/Sarge, Mutt is already linked against
libncursesw; try `apt-get build-dep mutt` if you compile your own Mutt.)
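The `ldd` check above can be wrapped in a small filter. This is a sketch only: `curses_flavor` is a hypothetical helper that merely looks for library names in whatever `ldd` output it is given, and the labels it prints are made up for this example.

```shell
# curses_flavor: read `ldd` output on stdin and classify the terminal
# library it mentions. Hypothetical helper; it only matches names.
curses_flavor() {
    libs=$(cat)
    case $libs in
        *libncursesw*)            echo wide ;;     # wide-char ncurses
        *libncurses*|*libcurses*) echo narrow ;;   # no wide-char support
        *libslang*)               echo slang ;;    # S-Lang
        *)                        echo unknown ;;
    esac
}

# Demo with the sample line shown above.
printf 'libncursesw.so.5 => /usr/lib/libncursesw.so.5 (0x40023000)\n' | curses_flavor
```

On a real system the input would come from something like ``ldd `which mutt` | curses_flavor``; anything other than `wide` suggests the garbled-screen problem described in this section.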
The default S-Lang does not seem to work with UTF-8; relink Mutt against
libncursesw instead. (Hello Gentoo users :-)
S-Lang needs the UTF-8 patch to work with UTF-8. Here it is:
<http://www.emaillab.org/mutt/tools/slang-1.4.8-utf8.diff.gz> (This
displays CJK chars more correctly than ncursesw.)
### I tuned all the variables correctly, but my messages are garbled
Miscoded characters can disrupt charset transcoding, or charset
auto-sensing by your `$editor`. Make sure that your signature, aliases,
muttrc, /etc/Muttrc, and any sourced files are written in the right
charset. Make sure that the charset of **$locale** (used to localize
date and time) matches your **$charset**. Make sure that the mail you
quote was displayed cleanly beforehand.
**Tip:** Convert the config files on the fly from their fixed
charset to the current $charset:
Convert all your files once and for all to one charset of your choice,
UTF-8 in this example. From now on, edit them only in this charset.
Then add at the **beginning** of your muttrc:

    set config_charset=utf-8
    set signature="iconv -f utf-8 ~/.signature |"
    set locale=`echo "${LC_ALL:-${LC_TIME:-${LANG}}}"`

**Note:** The $config\_charset feature has been included since Mutt 1.5.7.
The **$editor** used by Mutt to compose messages must be configured to
read and write files in the current locale's charset, without smart
auto-sensing of the file's charset. When used for the \<**edit**\> function
(edit the raw message), auto-sensing can help. When used to edit muttrc,
signature, or aliases, hardcode the charset previously chosen as
**$config\_charset**.
Regarding your editor of choice: some distros change the defaults of the
editor you use, or the defaults are not good enough. For example, some
distributions set the **fileencoding of Vim to UTF-8** no matter which
locale the user chooses. Say the user chooses LANG="de\_DE@euro".
Then displaying received messages containing umlauts or other special
characters is most likely no problem at all. But writing messages
results in a total mess. For instance, sending a string containing
"öäüß@€" results in "öÀÌÃ\\237@â\\202¬". You can fix this by
setting up your own ~/.vimrc containing the following:

    set encoding&       " terminal charset: follows current locale
    set termencoding=
    set fileencodings=  " charset auto-sensing: disabled
    set fileencoding&   " auto-sensed charset of current buffer

These settings in fact reset the options to Vim's sensible defaults. Only
**fileencodings** differs: its default value is very nice, but can
sometimes hurt Mutt. Ideally, it should be unset **only** when Vim is
called from Mutt to compose a message, not in general (how?).
### Attached text files get sent misencoded with wrong charset
By default Mutt assumes the text files you attach are originally in the
same charset as your terminal. Upon sending, Mutt will convert those
files from **$charset** to one of **$send\_charset**. This fails badly
for any file that was **not** originally in **$charset**.
There are two solutions:

- Interactively change the attachment's charset to the file's real charset in the compose menu
  before sending, using the `<edit-type>` function (bound to the ^T key by default) and answering **no** to the "Convert?" question.
  This unfortunately bypasses automatic selection of the best-suited sending charset.
- Activate auto-sensing of the original charset with:

      set file_charset="utf-8:iso-8859-1"

  Mutt then checks each charset in **$file_charset** in turn.
  The first charset in which the text file is entirely valid is assumed to be the file's charset.
  Upon sending, Mutt will convert this file from the auto-sensed charset to one of **$send_charset**.

**Note**: This "auto-sensing" is really educated guessing, and can fail.
Keep an eye on the compose menu, which displays for each attachment the
charset chosen for sending (after the conversion from **$charset** or
**$file\_charset** to **$send\_charset**). In particular, it cannot
distinguish similar 8-bit charsets such as Latin-1 from Latin-2 or from
CP-850. UTF-8 plus **one** 8-bit charset is OK, no more.
Japanese users may use "iso-2022-jp:euc-jp:shift\_jis:utf-8", which works
well because those charsets are encoded very differently and are thus
easily distinguishable.
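The "first charset in which the file is entirely valid" rule can be imitated with *iconv(1)*. The sketch below is not Mutt's actual implementation: `guess_charset` is a hypothetical helper, and it assumes an `iconv` command that exits with an error on invalid input.

```shell
# guess_charset FILE LIST: print the first charset of the colon-separated
# LIST in which FILE decodes without error. Hypothetical helper that only
# mimics the idea behind $file_charset.
guess_charset() {
    file=$1
    list=$2
    old_ifs=$IFS
    IFS=:
    for cs in $list; do
        # iconv fails when the input is not valid in charset $cs.
        if iconv -f "$cs" -t UTF-8 "$file" >/dev/null 2>&1; then
            IFS=$old_ifs
            printf '%s\n' "$cs"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Demo: a file containing "grün" encoded in ISO-8859-1.
sample=$(mktemp)
printf 'gr\374n\n' > "$sample"
guess_charset "$sample" "utf-8:iso-8859-1"
rm -f "$sample"
```

The sketch also shows why the order matters and why two similar 8-bit charsets cannot be told apart: almost any byte sequence is valid in an 8-bit charset, so the first one listed always wins.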
**Note**: **$file\_charset** is one of the numerous features provided by
Takashi Takizawa in his Japanese patch. It is also part of the compat
patch, and of the tt.assumed\_charset patch. See
[PatchList](PatchList) for more information. The feature is integrated in the Debian and Gentoo Mutt packages.
### MIME attachment filenames are displayed as =?iso-8859-1?Q
The filename is encoded with RFC 2047 encoded-words, which are not
permitted in MIME parameter values (RFC 2231 defines the proper encoding
for those); this is commonly produced by Microsoft Outlook and some other
MUAs.
Decode these filenames by setting this parameter:

    set rfc2047_parameters