Extraneous empty line before first part in multipart message
Today I ran into an issue because mutt inserts an extra empty line before the first part of a multipart message. Even though this is, AFAICS, not really wrong according to the MIME spec, it caused a problem for me in combination with an MTA (run by someone else) that strips that extra newline, which broke the DKIM signature and caused my e-mails to be bounced.
Let me emphasize that the real fault for my DKIM issue lies, as far as I am concerned, with that MTA: it should not be stripping newlines from messages. However, it might be good to change this in mutt as well, to reduce the opportunity for such a broken MTA to actually break things.
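To illustrate why a single stripped newline is enough to invalidate the signature, here is a minimal sketch of the DKIM body hash using the "simple" body canonicalization from RFC 6376, section 3.4.3. The function names and the tiny message body are made up for this illustration; they are not taken from mutt or from any DKIM library.

```python
import base64
import hashlib


def simple_body_canon(body: bytes) -> bytes:
    """DKIM "simple" body canonicalization: ignore empty lines at the
    END of the body and make sure the body ends with exactly one CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the canonicalized body (the bh= tag for rsa-sha256)."""
    digest = hashlib.sha256(simple_body_canon(body)).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical body as signed: it starts with the extra empty line.
as_signed = (
    b"\r\n"
    b"--B\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"BODY\r\n"
    b"\r\n"
    b"--B--\r\n"
)
# The same body after a relay has stripped the leading empty line.
as_relayed = as_signed[2:]

print(body_hash(as_signed) == body_hash(as_relayed))  # False
```

Only trailing empty lines are ignored by the canonicalization, so a leading empty line is part of the hashed data. The "relaxed" body canonicalization likewise only ignores empty lines at the end of the body, so it would not help here either.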
As an example, here's a mail (a draft from my postponed maildir) that shows the extra newline:
Date: Tue, 20 Feb 2024 20:41:43 +0100
From: Matthijs Kooijman <matthijs@stdin.nl>
To: test@example.net
Subject: Test
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="THNTSMH6U7w0KHk7"
Content-Disposition: inline


--THNTSMH6U7w0KHk7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

BODY

--THNTSMH6U7w0KHk7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="attach.txt"


--THNTSMH6U7w0KHk7--
Note the two empty lines after the top-level Content-Disposition: inline header. The first, added by e.g. this code, separates the headers from the body and is required.
The second one, however, serves no purpose. It is there because a newline is added before every boundary by this code. That newline is needed for subsequent boundaries (to allow parts that do not end with a newline, such as the empty "attach.txt" attachment in my example), but AFAICS not for the first one (see spec details below).
As for the specs, I think this is all defined in section 5.1.1 of RFC 2046.
About the newline before each boundary, the spec says:
NOTE: The CRLF preceding the boundary delimiter line is conceptually attached to the boundary so that it is possible to have a part that does not end with a CRLF (line break). Body parts that must be considered to end with line breaks, therefore, must have two CRLFs preceding the boundary delimiter line, the first of which is part of the preceding body part, and the second of which is part of the encapsulation boundary.
To me, it is a bit ambiguous whether this also applies to the first boundary. Technically, you could read here that the first boundary also has a newline attached, and assume that that newline must be a different one from the one after the empty line that separates headers from body (which is what mutt is now doing). However, section 5.1.4 shows an example that has only a single empty line, which tells me that an extra newline is not what the spec intends to require (and also that omitting the extra line would be compliant).
Furthermore, also in 5.1.1, the spec says:
There appears to be room for additional information prior to the first boundary delimiter line and following the final boundary delimiter line. These areas should generally be left blank, and implementations must ignore anything that appears before the first boundary delimiter line or after the last one.
NOTE: These "preamble" and "epilogue" areas are generally not used because of the lack of proper typing of these parts and the lack of clear semantics for handling these areas at gateways, particularly X.400 gateways. However, rather than leaving the preamble area blank, many MIME implementations have found this to be a convenient place to insert an explanatory note for recipients who read the message with pre-MIME software, since such notes will be ignored by MIME-compliant software.
So the spec confirms that there is some undefined space at the beginning. It also confirms that mutt is not wrong (it is just outputting a preamble consisting of a single empty line), but it recommends leaving the area entirely blank (though one can argue whether an extra empty line counts as blank...).
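To make the "empty preamble" reading concrete, here is a rough sketch that extracts whatever precedes the first boundary delimiter line, following the `[preamble CRLF] dash-boundary` shape of the multipart body grammar in RFC 2046. The helper name `preamble_of` is made up, and the matching is simplified (for instance, it does not check that the delimiter is followed by the end of the line), so treat it as an illustration rather than a real parser.

```python
from typing import Optional


def preamble_of(body: str, boundary: str) -> Optional[str]:
    """Return the text before the first boundary delimiter line,
    or None when the body starts directly with the delimiter."""
    delimiter = "--" + boundary
    if body.startswith(delimiter):
        return None  # no preamble at all
    index = body.find("\r\n" + delimiter)
    if index < 0:
        raise ValueError("no boundary delimiter line found")
    # The CRLF right before the delimiter belongs to the boundary,
    # not to the preamble.
    return body[:index]


current = "\r\n--B\r\nContent-Type: text/plain\r\n\r\nBODY\r\n\r\n--B--\r\n"
proposed = current[2:]  # same body without the extra empty line

print(repr(preamble_of(current, "B")))   # ''
print(repr(preamble_of(proposed, "B")))  # None
```

Under this reading, the current output carries a zero-length preamble followed by the CRLF that belongs to the boundary, while the proposed output has no preamble at all. A MIME reader has to ignore the preamble in both cases.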
In terms of implementation, if this is to be fixed, the most obvious place looks to be the mutt_write_mime_body() function, which writes the boundaries: it could omit the leading newline for the first part.
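The idea can be modelled as follows. This is not mutt's actual code: `write_multipart` and its `omit_first_newline` flag are hypothetical names used only to show the intended difference in output, and the sketch writes plain "\n" line endings for brevity.

```python
import io


def write_multipart(out, parts, boundary, omit_first_newline=True):
    """Write (headers, body) parts separated by boundary delimiter lines.

    The newline before a delimiter conceptually belongs to the boundary,
    so that a part may end without a newline. Before the FIRST delimiter
    it is not needed, because the header/body separator of the enclosing
    entity already ends the previous line.
    """
    for index, (headers, body) in enumerate(parts):
        if index > 0 or not omit_first_newline:
            out.write("\n")
        out.write(f"--{boundary}\n")
        for header in headers:
            out.write(header + "\n")
        out.write("\n")  # separates this part's headers from its body
        out.write(body)
    out.write(f"\n--{boundary}--\n")


parts = [
    (["Content-Type: text/plain; charset=us-ascii",
      "Content-Disposition: inline"], "BODY\n"),
    (["Content-Type: text/plain; charset=us-ascii",
      'Content-Disposition: attachment; filename="attach.txt"'], ""),
]

old, new = io.StringIO(), io.StringIO()
write_multipart(old, parts, "THNTSMH6U7w0KHk7", omit_first_newline=False)
write_multipart(new, parts, "THNTSMH6U7w0KHk7", omit_first_newline=True)

print(old.getvalue() == "\n" + new.getvalue())  # True
```

With `omit_first_newline=False` the output matches the body of the draft shown above, including the leading empty line. With the flag set, the only difference is that this one leading newline is gone; the newlines before all later boundaries are kept.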