Various issues with Opus
Hello,
as you might have noticed from the doom9 thread or the ffmpeg ticket you subscribed to, I have found a way to reproduce these errors that were previously untraceable. And this enabled me to find some bugs in the code regarding opus:
-
Your code sums the lengths of the packets encountered so far and compares it to the time where one should be based upon the granule position. But this simply presumes that the initial granule position is zero, an assumption that needn't be true according to the part 4.5 of the specs. They only need to be non-negative. According to the specs, one shouldn't compare the sum of the duration of packets encountered so far and the time derived from the current granule position, but rather the sum of the durations of the packets that end in a certain page (if any) and the difference of the times derived from the granule positions of the current page and the granule position of the last page in which a packet ended. For all but the first (for which there isn't a previous page) and the last page this sum should be equal to the aforementioned difference if the stream is spec-compliant, i.e. a spec compliant stream doesn't have holes. If the initial granule position is strictly positive, then the check for whether there needs to be a DiscardPadding element might not be true even if there needs to be a DiscardPadding element; and even if the condition still evaluates to true, the DiscardPadding length will be too small. Here is a probable scenario for this to happen: If you split the beginning of a file away with mkvmerge, it is common for the audio of the remaining part to start a few milliseconds after the video. For ffmpeg, the first audio packet and also the first audio sample derived from it has a strictly positive timestamp and this timestamp is preserved after encoding with libopus (said libopus needs to be integrated into ffmpeg; given that the wave format (or raw PCM) can't signal an initial offset (or gaps), the initial offset won't survive piping to another tool like a standalone version of the reference encoder). For such files it is common to have a DiscardPadding element that is too low/nonexistent when muxed with mkvmerge. (Given what I have said about lacing on the ffmpeg bug tracker it is best not to do this with a laced input file.)
-
The consequences of comparing the absolute sums with the granule positions are much worse when the file is a bit out of spec as described in my post on the ffmpeg bug tracker. Consider the file "Initial.Granulepos.negative.opus" which I have uploaded to your ftp-server. It is actually the file that fadedfedor has uploaded on the doom9 thread (so such files exist in the wild). See part d) of my ffmpeg post on why such files exist. For us only one thing is important: This were a complete valid file if all the granule position were shifted by a certain amount to make the first (implicit) granule position nonnegative. The current approach by mkvmerge of comparing the absolute positions leads to the situation that the check in line 1207 is true for every page so that there is a DiscardPadding element inserted for every page. As has already been said: The specs don't allow gaps in opus files; only a page with the 'end of stream' flag set can have end trimming at all! mkvmerge currently completely ignores this. The reference decoder is a bit buggy in this regard, too: It also acts as if the first page could have a valid end trimming, but because it isn't based upon absolute positions, all the other pages are decoded just fine. (By the way, actually files with negative initial granule position are completely invalid according to the specs, but they seem to work. See 4. for a proposal on how they can be fixed.)
-
There is also the possibility that lacing (or other timestamp issues) of the input file can lead to a malformed output file (see again the ffmpeg post). Here again mkvmerge creates DiscardPadding for pages that don't have the 'end of stream' flag set. In my tests the actual files were fine except for the granule positions. The only thing that was truly lost was the end trimming at the last page.
-
So the files should be repairable. Here is a suggestion how to do it: Given that logical ogg bitstreams can't have holes (except in case of packet loss), one actually doesn't need the granule position at all (save for two things). One just needs the duration of the packets. The two exceptions are packet loss and end trimming. The first thing can be dealt with by making use of the fact that the ogg page header contains a page_sequence_number that can be used to detect packet loss. So if the page_sequence_number shows that there is packet loss, one uses the difference of the granule positions to derive the timestamps of the packets in the second page from the timestamp of the packets in the first page; if not, it is better to directly use the durations. And the end trimming of course has to be based upon granule positions. That way, these malformed files with "Sample count behind granule" would be automatically fixed. Last but not least, currently mkvmerge ignores whether there is an initial granule position offset for determining timestamps; they always start at zero. I think it would be good to keep it that way. There is just a problem: If opus is muxed in ogg with other streams, then they all need to be shifted by the same amount. This might change the current behaviour for the other tracks. Is this a problem?
-
mkvextract currently tries to preserve DiscardPadding elements, including those in the middle of files; but as has already been said: In ogg, end trimming is only allowed in a packet that ends a logical bitstream. So mkvextract would either have to discard the DiscardPadding elements in the file (except the one at the very end; this would of course be the right approach for the kind of partially broken samples that ffmpeg ticket #4178 or mkvmerge ticket #2099 (closed) was about) or it would have to start new logical bitstreams in the output file (this would probably be the right approach for opus tracks that have been created by joining together several independent opus streams).
-
And given that I have just mentioned joining opus files I should also mention that this currently works suboptimally: Before muxing the first pre-skip/CodecDelay samples of each input track are invalid, but currently nothing ensures the invalidity of these samples for every appended file. The right approach would probably to use DiscardPadding elements that signify to discard the beginning and offset the timestamps of the appended files accordingly. (And this of course presumes that the CodecDelay of all input files is the same; it would be way more complicated if it weren't, but such Opus streams have different CodecPrivate anyway so that appending them is not supported.)
Gruß Andi