mkvmerge: infloop in/around generic_packetizer_c::force_duration_on_last_packet with v2 external timestamp file
So I'm doing another of my massive video transcodes (denoising a dreadfully noisy BluRay remastering of the original Stargate movie, this time), and when the time comes to mkvmerge the denoised video (recompressed a bit with ffmpeg 4.1.3) back into the original of all the other streams, we get only five minutes in and then infloop in force_duration_on_last_packet()
:
apply_factory_once(): source 0 t 283283000000 tbf 283283000000 at 283276166667
apply_factory_once(): source 0 t 283325000000 tbf 283325000000 at 283317875000
apply_factory_once(): source 0 t 283366000000 tbf 283366000000 at 283359583333
apply_factory_once(): source 0 t 283408000000 tbf 283408000000 at 283401291667
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
'/usr/local/tmp/encoding/job29255/2-unmuxed.mkv' track 0: force_duration_on_last_packet: forcing at 00:04:53.210000000 with 41708.333ms
I observe this with every mkvmerge I have tested: at least 30.1.0 to 33.1.0 are affected.
I'm using an external timestamp file with this mux, generated by vapoursynth, so that when I have to do things that change the framerate everything still syncs up right. In this case, the framerate should be unchanged, and the external timestamp file looks perfectly all right, but if I disable the external timestamp file, everything works and there is no hang. (This means I have a workaround, at least for this case: good thing I'm not changing the framerate!)
The timestamp file does not contain an entry for 41708.333ms: the relevant bit looks like this:
41618.083333
41659.791667
41701.500000
41743.208333
41784.916667
41826.625000
i.e. exactly what you'd expect for a 23.976fps file. Maybe one of the inputs does contain something at a pts not in the timestamp file, and this is throwing something off? (But... handling this sort of thing is what the timestamp file is for, so I'm quite surprised.)
I see this problem if I feed just the unmuxed file through mkvmerge, with no other files at all in the 'merge', only the mkvmerge and the timestamp. The file itself is the result of an "ultrafast" ffmpeg encoding, like the below (this is not much use because of the amount of external state you don't have, but it does at least show that I'm not using too many weird tools: in particular vspipe is just emitting uncompressed video frames, as usual). I'm stuffing things through ffmpeg because this roughly halves the size of this intermediate file at a cost of about 2% of the CPU time spent vapoursynthing, and for large transcodes this can cut hours off the time needed. It is quite possible that ffmpeg is emitting corrupted output, but even then the question of why using --timestamps hangs mkvmerge remains.
The source file is generated from an MKVified input like this (hand-translated from shell script, so may be a bit faulty):
ffmpeg -loglevel fatal -i input.mkv -f ffmetadata foo.ffmetadata
mkvextract input.mkv tracks 0:foo.vid
vspipe denoise.vpy --arg source=foo.vid --timecodes foo-timecodes.txt --y4m - | ffmpeg -i - -i foo.ffmetadata -map 0 -map_metadata 1 -c:v libx264 -preset ultrafast -qp 0 -filter setsar=sar=1/1 -force_key_frames chapters foo-unmuxed.mkv
This (stripped-down to minimal uselessness) do-next-to-nothing command infloops (but so do more elaborate non-do-nothing commands that actually do useful muxing): mkvmerge -o /dev/null --timestamps 0:foo-timecodes.txt foo-unmuxed.mkv
The problem is that I have no obvious way to give you a version of the unmuxed file that's a reasonable length without triggering the same damn bug: the shortest file I have that exhibits this is 4GiB long :( replacing all the image data with nice easy-to-compress blankness using vapoursynth's std.Limiter changes things enough that the failure is no longer observed.
I'm happy to do whatever diagnosis you want on this one.