Implement allocation limits in decoder (!3437) · Merge requests · NebulousLabs / Sia

Luke Champine requested to merge encoding-fix into unstable Feb 08, 2019

Removing the encoding limits caused out-of-memory errors, because the size of a slice was not checked against any limit before it was allocated. Previously I had assumed that constructs like io.LimitReader were sufficient to prevent this, but I was mistaken: limiting the reader prevents an attacker from sending you unbounded data, but does not prevent them from tricking you into allocating very large objects.

To address this, this MR adds a maxAllocs limit to encoding.NewDecoder. The limit is decremented each time the decoder has to allocate a slice or a pointed-to object. Exceeding the limit results in an error.

This appears to be effective, but the API is not super friendly. For example, initially I had thought that Unmarshal could use the size of the byte slice as the allocation limit. However, this results in a limit that is often too high, and, surprisingly, sometimes too small! Consider an encoded slice with zero elements. The encoding for this object is simply a length prefix of 0, i.e. 8 bytes. But the size of this slice in memory is 24 bytes, since it includes the slice header! So we cannot simply use 8 as our limit, since it would be too small.

Another surprise occurs when decoding into an array, such as a types.UnlockHash or crypto.Signature. I would naively pass 32 and 64 as the decoding limits for these objects, but actually the correct limit is 0! Since the arrays are already in memory, the decoder doesn't need to allocate anything at all; it just copies directly into the array. In short, it can be difficult to figure out what the proper limit is for an object, because "size of encoded object" does not always match "allocations required to decode object."

To alleviate this somewhat, I added a DefaultAllocLimit constant to the encoding package, which is currently 1MB. This limit can be used for most objects, although I recommend that we return to these objects later and derive more precise limits for them. Also, Unmarshal will use len(b)*2 + 4096 as the allocation limit; this was chosen as a reasonable compromise. Even if an attacker can cause you to unmarshal a malicious object, you will allocate at most ~2x the size of the object, which shouldn't bring you anywhere near OOM territory.

Implement allocation limits in decoder

Merge request reports