An mkvextract bug with consecutive IDR frames
Created by: mkver
Hello,
As the title indicates, I found a bug in mkvextract that can occur under certain circumstances when an H.264 stream contains consecutive IDR frames. First I created a Matroska file named 1.mkv with 10 IDR frames (using x264 and --qpfile). If I extract the video stream with mkvextract and analyze the resulting file 1.264 with h264_parse, I unsurprisingly get this (and a little bit more):
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 0
pic_order_cnt_lsb: 0
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 1
pic_order_cnt_lsb: 0
Nal is new picture
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 0
pic_order_cnt_lsb: 0
Nal is new picture
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 1
pic_order_cnt_lsb: 0
Nal is new picture
As one can see, idr_pic_id alternates between 0 and 1, so consecutive IDR frames differ in this value and any program reading 1.264 can tell that the new NAL unit belongs to a different picture. Without this difference, a reader would have to conclude from section 7.4.1.2.4 of the H.264 specification that these NAL units are merely different slices of the same frame. Note that this value only matters in the Annex B bitstream, because in a container like Matroska all the data of one frame is put into one block.
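To make the rule concrete, here is a rough sketch of the kind of comparison a reader of an Annex B stream performs between two consecutive slice NAL units. It is a simplified subset of the conditions in section 7.4.1.2.4 (field flags and the other picture order count types are left out), and the names SliceHeader, starts_new_picture and idr are made up for this illustration; it is not the code of h264_parse or mkvmerge.

```python
from dataclasses import dataclass


@dataclass
class SliceHeader:
    """The few header fields printed by h264_parse above (hypothetical container)."""
    nal_ref_idc: int
    nal_unit_type: int          # 5 = coded slice of an IDR picture
    frame_num: int
    pic_parameter_set_id: int
    pic_order_cnt_lsb: int
    idr_pic_id: int = 0         # only meaningful for IDR slices


def starts_new_picture(prev: SliceHeader, cur: SliceHeader) -> bool:
    """Simplified subset of the first-VCL-NAL-unit checks of section 7.4.1.2.4."""
    prev_idr = prev.nal_unit_type == 5
    cur_idr = cur.nal_unit_type == 5
    if prev.frame_num != cur.frame_num:
        return True
    if prev.pic_parameter_set_id != cur.pic_parameter_set_id:
        return True
    # nal_ref_idc differs with one of the two values being 0
    if prev.nal_ref_idc != cur.nal_ref_idc and 0 in (prev.nal_ref_idc, cur.nal_ref_idc):
        return True
    if prev.pic_order_cnt_lsb != cur.pic_order_cnt_lsb:
        return True
    if prev_idr != cur_idr:
        return True
    # Both are IDR slices: idr_pic_id is the only thing left to tell them apart.
    if prev_idr and cur_idr and prev.idr_pic_id != cur.idr_pic_id:
        return True
    return False


def idr(idr_pic_id: int) -> SliceHeader:
    """An IDR slice header with the values from the h264_parse dump."""
    return SliceHeader(nal_ref_idc=3, nal_unit_type=5, frame_num=0,
                       pic_parameter_set_id=0, pic_order_cnt_lsb=0,
                       idr_pic_id=idr_pic_id)


# Alternating idr_pic_id, as in 1.264: every NAL unit starts a new picture.
print(starts_new_picture(idr(0), idr(1)))  # True
# Equal idr_pic_id on two IDR slices: treated as slices of the same picture.
print(starts_new_picture(idr(0), idr(0)))  # False
```

With all other fields identical, as in the dump above, idr_pic_id is the only field that separates one IDR picture from the next.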
If I cut frame 2 out of 1.mkv using the command line
"D:/Portable Programme/mkvtoolnix\mkvmerge.exe" --ui-language de --output B:/2.mkv --language 0:eng --default-track 0:yes ^"^(^" B:/1.mkv ^"^)^" --split parts-frames:1-2,+3-
I get a file 2.mkv with one frame fewer. Extracting the video stream and running h264_parse yields the following information:
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 0
pic_order_cnt_lsb: 0
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 0
pic_order_cnt_lsb: 0
Nal is part of last picture
Nal length 246 start code 4 bytes
ref 3 type 5 Coded slice of an IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 0 (4 bits)
idr_pic_id: 1
pic_order_cnt_lsb: 0
Nal is new picture
Now there are two consecutive IDR NAL units with idr_pic_id 0, and according to section 7.4.1.2.4 of the H.264 specification they are considered part of the same frame. It is not only h264_parse that treats them this way, but also mkvmerge (see file 3.mkv muxed from 2.). Given that (as already mentioned) idr_pic_id is effectively meaningless in a container like Matroska, I think the bug is not in mkvmerge (mkvmerge is not obliged to change the idr_pic_id values in the bitstream, and to the best of my knowledge the file 2.mkv does not violate any specification) but in mkvextract, which ignores that these two NAL units will be read as parts of one frame once they are written out. IMO the best solution is to add access unit delimiters at the appropriate places in the bitstream when extracting with mkvextract. I have no idea whether the same problem occurs with H.265.
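What I have in mind could look roughly like the sketch below. It assumes that each Matroska block holds the NAL units of exactly one frame, each prefixed with a big-endian length field (4 bytes here; the real size comes from the track's codec private data), and it writes an access unit delimiter in front of every block that does not already start with one. The function names are hypothetical and this is not how mkvextract is implemented; a real fix would also have to make sure the delimiter comes before any parameter sets written for the same access unit, since a delimiter has to be the first NAL unit of its access unit.

```python
START_CODE = b"\x00\x00\x00\x01"
# NAL header 0x09: nal_ref_idc 0, nal_unit_type 9 (access unit delimiter).
# Payload 0xF0: primary_pic_type 7 (any slice type) followed by the RBSP stop bit.
AUD_NAL = b"\x09\xf0"


def split_length_prefixed(block: bytes, length_size: int = 4) -> list:
    """Split one block of length-prefixed NAL units into a list of NAL units."""
    nals = []
    pos = 0
    while pos < len(block):
        if pos + length_size > len(block):
            raise ValueError("truncated NAL length field")
        size = int.from_bytes(block[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(block):
            raise ValueError("NAL unit exceeds block size")
        nals.append(block[pos:pos + size])
        pos += size
    return nals


def block_to_annex_b(block: bytes, length_size: int = 4) -> bytes:
    """Convert one frame's block to Annex B, starting it with a delimiter."""
    nals = split_length_prefixed(block, length_size)
    out = bytearray()
    # Only add a delimiter if the frame does not already begin with one.
    has_aud = bool(nals) and len(nals[0]) > 0 and (nals[0][0] & 0x1F) == 9
    if not has_aud:
        out += START_CODE + AUD_NAL
    for nal in nals:
        out += START_CODE + nal
    return bytes(out)
```

With a delimiter in front of every frame, a reader no longer depends on idr_pic_id to find the picture boundary between two consecutive IDR frames.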
Regards, Andi
PS: In case you wonder why I report things that probably won't happen in a real-world scenario: it's because you said that you might replace your module for framed H.264 input with the unframed one. If that is done the wrong way, the above could become a lot more likely.