Virus Detection Vulnerability Report: Bypassing Virus Detection by Exploiting Email Content Parsing Ambiguity

In recent tests, we discovered a security vulnerability in amavisd-new that allows an attacker to bypass antivirus scanning and deliver an arbitrary malicious file as an attachment to a target victim. The victim can then download the attachment through certain mail clients without receiving any security warning. Details are described below.

To carry out the tests, we first built a simple email system on Ubuntu 22.04. These are the open source tools we used and their versions:

Postfix 3.6.4

amavisd-new-2.12.2 (20211013)

ClamAV 0.103.8

We chose the ransomware virus Ransom.WannaCryptor as the malicious test payload for our tests, sending test email samples from our own host and delivering them directly to the aforementioned mail server. The virus sample needs to be encoded (Base64 or Quoted-Printable) as an entity in the email and declared as an attachment with header "Content-Disposition: attachment".

Under normal circumstances, amavisd parses each entity of the email, decodes its content, and passes the result to the detection engine (ClamAV) for scanning. ClamAV then detects the WannaCryptor sample and amavisd blocks the email, so users do not receive content that poses a security threat. The detection log is shown below.

One way to bypass the virus detection described above is to set the email's Content-Type header to multipart/mixed and declare several different boundary parameters. The boundary parameter delimits the entities within a multipart message, and each multipart entity should declare it exactly once (RFC 2046, section 5.1.1). We construct the sample as follows:

From: Attacker
To: Victim
Date: xxxx
MIME-Version: 1.0
Subject: multiple_boundary
Content-Type: multipart/mixed; boundary=bar; boundary=foo

--foo
Content-type: text/plain

Email with an attachment.
--bar
Content-type: application/octet-stream; name=b64_sample
Content-Disposition: attachment; filename=b64_sample
Content-Transfer-Encoding: base64

<This is b64_wannacry.>
--bar--
--foo--

Going to amavisd's working directory /var/lib/amavisd/tmp, we can see that amavisd only parsed one mail entity (p001), and it was parsed according to the boundary string "foo", which is shown below:

As a result, ClamAV scans only that part of the email rather than the decoded virus content, so no virus is found. Consequently, amavisd does not block the email, and it is delivered to the victim's inbox.

Users can then connect to the mail server with a variety of client programs to receive mail. In our tests, Thunderbird, Windows Mail, eM Client and Netease Client all presented the correctly parsed virus attachment and let users save it to their own devices, thus exposing them to security threats.

Such results indicate that Amavisd and the clients handle multiple conflicting boundary declarations differently. Amavisd adopts the last boundary parameter (foo) to separate the email entities. Under that reading, everything from "Email with an attachment." up to the closing boundary line "--foo--" is the body of a single text/plain entity, so Amavisd treats it as plain text and does not consider it malicious. In contrast, the several clients mentioned above adopt the first boundary parameter (bar). They treat "--bar" as the opening boundary line and "--bar--" as the closing boundary line, regard the content before "--bar" and after "--bar--" as invalid data, and parse and decode only the content in between, thus extracting the intact virus sample.
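The difference between the two readings can be reproduced with a minimal sketch. This is not amavisd's or any client's actual parser: split_multipart and split_entity are hypothetical helpers written for illustration, the splitting logic is deliberately simplified, and the attachment is a harmless placeholder rather than a real sample.

```python
import base64

# Harmless placeholder standing in for the Base64-encoded sample.
PAYLOAD = base64.b64encode(b"PLACEHOLDER").decode("ascii")

# Body of the test message from this report, with the placeholder payload.
BODY = "\n".join([
    "--foo",
    "Content-type: text/plain",
    "",
    "Email with an attachment.",
    "--bar",
    "Content-type: application/octet-stream; name=b64_sample",
    "Content-Disposition: attachment; filename=b64_sample",
    "Content-Transfer-Encoding: base64",
    "",
    PAYLOAD,
    "--bar--",
    "--foo--",
])


def split_multipart(body: str, boundary: str) -> list[str]:
    """Return the raw entities found between --boundary and --boundary--.

    Simplified on purpose: lines outside the delimiters are ignored and an
    entity without a closing delimiter is dropped.
    """
    open_line = "--" + boundary
    close_line = open_line + "--"
    parts, current = [], None
    for line in body.splitlines():
        if line == close_line:
            if current is not None:
                parts.append("\n".join(current))
            break
        if line == open_line:
            if current is not None:
                parts.append("\n".join(current))
            current = []
        elif current is not None:
            current.append(line)
    return parts


def split_entity(part: str) -> tuple[str, str]:
    """Split one raw entity into its header block and its content."""
    head, _, content = part.partition("\n\n")
    return head, content


# Reading 1: the last declared boundary (foo), as observed for the scanner.
scanner_head, scanner_content = split_entity(split_multipart(BODY, "foo")[0])

# Reading 2: the first declared boundary (bar), as observed for the clients.
client_head, client_content = split_entity(split_multipart(BODY, "bar")[0])
attachment = base64.b64decode(client_content)
```

Under the foo reading there is a single text/plain entity whose content still contains the undecoded "--bar" block, while under the bar reading the only entity is the Base64 attachment, which decodes to the original bytes.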

There are two possible ways to deal with this kind of e-mail content that violates the RFC 2046 format:

Reject emails with illegal formatting. Although many products may worry that such strict handling could hurt interoperability with other products, well-written mail components do not generate such obviously malformed messages, so strict checks need not affect normal mail traffic.
Normalize emails. Amavisd already performs a normalization step on emails and temporarily saves the result as email.txt. It could delete the redundant boundary parameters, keeping only the one it adopts for parsing, and pass the normalized content on to subsequent processing so that downstream components parse the email consistently.
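Both options can be sketched in a few lines. The following is an illustration only, not amavisd code: boundary_params, should_reject and normalize_content_type are hypothetical names, and the regular expression is a rough approximation of the Content-Type parameter syntax that does not handle every legal header form (for example, parameter continuations or comments).

```python
import re

# Rough match for a "boundary" parameter, quoted or bare (an approximation).
_BOUNDARY_RE = re.compile(
    r';\s*boundary\s*=\s*("([^"]*)"|[^;\s]+)', re.IGNORECASE
)


def boundary_params(content_type: str) -> list[str]:
    """Return every boundary value declared in a Content-Type header value."""
    values = []
    for match in _BOUNDARY_RE.finditer(content_type):
        quoted = match.group(2)
        values.append(quoted if quoted is not None else match.group(1))
    return values


def should_reject(content_type: str) -> bool:
    """Option 1: treat more than one boundary declaration as malformed."""
    return len(boundary_params(content_type)) > 1


def normalize_content_type(content_type: str, keep: int = -1) -> str:
    """Option 2: keep only the boundary the scanner itself adopts.

    keep=-1 keeps the last declaration, matching the behavior observed for
    Amavisd in this report; all other boundary parameters are removed.
    """
    values = boundary_params(content_type)
    if len(values) <= 1:
        return content_type
    stripped = _BOUNDARY_RE.sub("", content_type)
    return f'{stripped}; boundary="{values[keep]}"'


header = "multipart/mixed; boundary=bar; boundary=foo"
print(should_reject(header))
print(normalize_content_type(header))
```

Whichever option is chosen, the point is that the message leaving the scanner carries a single boundary declaration, so the scanner and every downstream component split it the same way.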

The root cause of the above vulnerability is that Amavisd and the mail clients parse email content inconsistently. We found more security issues of this kind in our testing, so we are writing this report to inform you of our findings. We look forward to further exchanges with your company on the internal processing logic involved, the assessment of the security threat, possible countermeasures, and other specifics.

Edited Aug 30, 2023 by echo zhang
