Strange PES packet handling
Hello,
during testing for #1700 (closed) I found something odd: data from completely fine PES packets can be mixed with data from broken PES packets (which, to my knowledge, should currently be discarded). The symptom is a gap of 32ms (equal to one frame) in an ac3 track from a transport stream. This transport stream puts four ac3 frames into one PES packet, and given that PES packets with an invalid size are supposed to be discarded before they even reach the ac3 packetizer, I assumed that the length of each gap must be an integral multiple of 4*32ms = 128ms. But this is not the case. Here is the relevant part of the output of ffmpeg's* framehash muxer with adler32 checksums for the file produced by the newest pre-build:
0, 2720, 2720, 32, 1792, 26466e3a
0, 2752, 2752, 32, 1792, a2dc62a7
0, 2784, 2784, 32, 1792, 54ad6e41
0, 2816, 2816, 32, 1792, ad2d4fc2
0, 2848, 2848, 32, 1792, f18d81da
0, 2880, 2880, 32, 1792, 4aed884b
0, 2912, 2912, 32, 1792, d77b7f70
0, 3840, 3840, 32, 1792, 7fcb7724
0, 3872, 3872, 32, 1792, 0db37bda
0, 3904, 3904, 32, 1792, 3b11631f
0, 3968, 3968, 32, 1792, a2057d53
0, 4000, 4000, 32, 1792, ff7a809c
0, 4032, 4032, 32, 1792, cc196cd6
0, 4064, 4064, 32, 1792, 73d16183
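Gaps in a listing like this can be found mechanically by checking whether each frame starts where the previous one ended. The following is only a minimal sketch: it assumes the columns are stream index, dts, pts, duration, size and checksum with a millisecond time base, and the helper name `find_gaps` is made up for illustration.

```python
def find_gaps(lines, frame_ms=32, frames_per_pes=4):
    """Return (gap_start, gap_end, missing_frames, whole_pes_packets) per gap."""
    gaps = []
    prev_end = None
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        pts, duration = int(fields[2]), int(fields[3])
        if prev_end is not None and pts > prev_end:
            missing = (pts - prev_end) // frame_ms
            # True if the gap could be explained by whole PES packets being dropped
            gaps.append((prev_end, pts, missing, missing % frames_per_pes == 0))
        prev_end = pts + duration
    return gaps

# A few lines taken from the listing above
listing = [
    "0, 2912, 2912, 32, 1792, d77b7f70",
    "0, 3840, 3840, 32, 1792, 7fcb7724",
    "0, 3872, 3872, 32, 1792, 0db37bda",
    "0, 3904, 3904, 32, 1792, 3b11631f",
    "0, 3968, 3968, 32, 1792, a2057d53",
]
for gap in find_gaps(listing):
    print(gap)
# (2944, 3840, 28, True)
# (3936, 3968, 1, False)
```

The first gap is 28 frames long, which is compatible with whole PES packets having been dropped; the second one is a single frame, which is not.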
The gap I talked about is between the frames at 3904 and 3968. In version 9.8 (before #1864 (closed)) the output was a little bit different:
0, 2752, 2752, 32, 1792, 26466e3a
0, 2784, 2784, 32, 1792, a2dc62a7
0, 2816, 2816, 32, 1792, 54ad6e41
0, 2848, 2848, 32, 1792, ad2d4fc2
0, 2880, 2880, 32, 1792, f18d81da
0, 3840, 3840, 32, 1792, 4aed884b
0, 3872, 3872, 32, 1792, d77b7f70
0, 3904, 3904, 32, 1792, 7fcb7724
0, 3936, 3936, 32, 1792, 0db37bda
0, 3968, 3968, 32, 1792, 3b11631f
0, 4000, 4000, 32, 1792, a2057d53
0, 4032, 4032, 32, 1792, ff7a809c
0, 4064, 4064, 32, 1792, cc196cd6
0, 4096, 4096, 32, 1792, 73d16183
Strangely, the gap here is 3840ms-2912ms = 928ms = 29*32ms long; 29 is not a multiple of four. And here is ffmpeg:
0, 2720, 2720, 32, 1792, 26466e3a
0, 2752, 2752, 32, 1792, a2dc62a7
0, 2784, 2784, 32, 1792, 54ad6e41
0, 2816, 2816, 32, 1792, ad2d4fc2
0, 2848, 2848, 32, 1792, f18d81da
0, 2880, 2880, 32, 1792, 4aed884b
0, 2912, 2912, 32, 1104, d91b19cf
0, 2944, 2944, 32, 1792, 370b76d0
0, 3840, 3840, 32, 1792, 36786a45
0, 3872, 3872, 32, 1792, 7fcb7724
0, 3904, 3904, 32, 1792, 0db37bda
0, 3936, 3936, 32, 1792, 3b11631f
0, 3968, 3968, 32, 1792, a2057d53
0, 4000, 4000, 32, 1792, ff7a809c
0, 4032, 4032, 32, 1792, cc196cd6
0, 4064, 4064, 32, 1792, 73d16183
This time we even have different adler32 checksums (and a different number of frames). After analyzing the files a bit (with projectX, mkvinfo and a hex editor) I found out that a PES packet begins at 4983240. It is followed by 28 ts packets with the relevant PID, then there is a discontinuity in the continuity counter (last packet before the discontinuity at 5060320, first packet after the discontinuity at 5064644), then 16 more packets with the same PID, and then a new PES packet begins at 5124804. Only the last packet before the new PES packet contains padding data; the rest are pure payload. The PES packets have a size of 7176 B (excluding the PES header) and therefore need 40 ts packets (the last packet only contains 6 bytes of payload). None of the first 40 ts packets which make up the PES packet at 4983240 use byte stuffing, so by then we are already past the size of the PES packet, which therefore should have been rejected. But it isn't.
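The packet count can be checked with a little arithmetic. This sketch rests on an assumption: that the 7176 B are the value of the PES_packet_length field, so the six bytes in front of it (start code, stream id and the length field itself) still have to be added, and that a ts packet without an adaptation field carries 184 bytes of payload. That reading reproduces both the 40 packets and the 6 bytes in the last one. The function name is hypothetical.

```python
import math

TS_PAYLOAD = 184   # 188-byte ts packet minus the 4-byte header, no adaptation field
PES_PREFIX = 6     # start code (3) + stream id (1) + PES_packet_length field (2)

def ts_packets_needed(pes_packet_length):
    """Return (number of ts packets, payload bytes in the last one)."""
    total = PES_PREFIX + pes_packet_length
    count = math.ceil(total / TS_PAYLOAD)
    last = total - (count - 1) * TS_PAYLOAD
    return count, last

needed, last_bytes = ts_packets_needed(7176)
observed = 1 + 28 + 16   # first packet, 28 before the discontinuity, 16 after it
print(needed, last_bytes, observed)   # 40 6 45
```

With 45 ts packets observed between the two PES starts and only 40 needed, the data cannot belong to a single intact PES packet.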
The material immediately after the beginning of the PES packet ends up in the frame with adler ad2d4fc2; the frame with adler f18d81da begins in the ts packet beginning at 5029300; the next frame (adler 4aed884b) begins at 5035128. This frame is also the first defective one, as it contains the part with the discontinuity. Afterwards 1104 bytes are discarded until the next syncword is found (by mkvmerge; ffmpeg puts them in a packet). Then there are exactly 1792 bytes of payload left until the beginning of the next PES packet, which is exactly one ac3 frame. The resulting frame would have an adler32 of 370b76d0 (and a correct crc). This seems to be what ffmpeg did (although the timestamp of this frame should probably be 3808ms and not 2944ms, but ffmpeg has no way to know this for sure).
But mkvmerge does things differently. mkvmerge's next frame (adler d77b7f70) has the same first 688 bytes as the frame with adler 370b76d0; the remaining 1104 bytes differ. These numbers give a clue to what might be happening under the hood: 1104 is exactly the number of bytes that were discarded previously until a syncword was found. With these 688 bytes, the transport stream reader has exactly as many payload bytes as the PES packet header claimed. It therefore seems to consider the PES packet valid and finished. The problem is: this is not the end of the PES packet. There are exactly 1104 bytes of payload left. (In fact, only six bytes of the last transport stream packet that contains any of these 688 bytes were used, and said packet contains 178 more. And there are even more ts packets...) They are ignored, and the last 1104 bytes of the frame with adler d77b7f70 are taken from the beginning of the next PES packet.
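To illustrate the suspicion, here is a toy model of a reader that only counts payload bytes, next to one that also checks the continuity counter. This is not mkvmerge's code; the class, the function and the tiny packet sizes are invented for the example, and it only shows why a pure size check cannot notice that part of the payload belongs to a different PES packet.

```python
from dataclasses import dataclass

@dataclass
class TsPacket:
    cc: int          # 4-bit continuity counter
    payload: bytes

def assemble(packets, declared_len, check_cc):
    """Return the PES payload, or None if it is rejected or incomplete."""
    buf = bytearray()
    expected_cc = None
    for pkt in packets:
        if check_cc and expected_cc is not None and pkt.cc != expected_cc:
            return None                        # discontinuity: discard the PES packet
        expected_cc = (pkt.cc + 1) % 16
        buf += pkt.payload
        if len(buf) >= declared_len:
            return bytes(buf[:declared_len])   # size reached: treated as complete
    return None

# Two packets of PES "A", then packets were lost and the tail of PES "B" follows.
packets = [TsPacket(0, b"A" * 4), TsPacket(1, b"A" * 4),
           TsPacket(5, b"B" * 4), TsPacket(6, b"B" * 4), TsPacket(7, b"B" * 4)]

print(assemble(packets, 16, check_cc=False))   # b'AAAAAAAABBBBBBBB'
print(assemble(packets, 16, check_cc=True))    # None
```

In the size-only case the result has the declared length but mixes data from two PES packets, and the trailing bytes of "B" are left over, much like the 1104 bytes described above.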
And the story isn't finished here: because the ac-3 packetizer has already used the first 1104 bytes of the first ac-3 frame of the new PES packet, there are 688 bytes of data (treated as garbage) before the next syncword. The next frame (adler 7fcb7724), which is actually the second frame of the new PES packet, is treated as its first frame and is therefore muxed with the wrong timestamp. Likewise for the remaining two frames in the PES packet. The next PES packet is then parsed normally; given that mkvmerge missed one frame from the second PES packet, we now have a gap equal to one frame in front of the third PES packet. So the reason for this gap (and for the crc errors when testing the track) lies in the code for PES packets, not in the ac-3 packetizer. That is at least my guess based upon the above.
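The timestamp shift can be read directly from the listings by comparing the pts of frames with identical checksums. A small sketch using the values from the pre-build listing and the ffmpeg listing above (the helper name `pts_shift` is made up):

```python
# pts in ms per adler32 checksum, copied from the two listings
mkvmerge = {"7fcb7724": 3840, "0db37bda": 3872, "3b11631f": 3904, "a2057d53": 3968}
ffmpeg   = {"7fcb7724": 3872, "0db37bda": 3904, "3b11631f": 3936, "a2057d53": 3968}

def pts_shift(a, b):
    """Difference a - b for every checksum present in both listings."""
    return {key: a[key] - b[key] for key in a if key in b}

shifts = pts_shift(mkvmerge, ffmpeg)
print(shifts)
# {'7fcb7724': -32, '0db37bda': -32, '3b11631f': -32, 'a2057d53': 0}
```

The three remaining frames of the second PES packet are one frame (32ms) early, and the first frame of the third PES packet is back in sync, which leaves the one-frame gap in front of it.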
I have uploaded a small file (defective.ts) with the error described here and a bigger file with many errors to test any fix.
Regards, Andi
*: I used ffmpeg instead of mkvinfo because it automatically normalizes files (the ac3 tracks produced by mkvmerge start at 19ms, the ones produced by ffmpeg at zero; by using ffmpeg I don't have to care about this).
PS: I'd like to wish you a merry Christmas already.