What are those tags generated by mkvmerge 7.0.0 and newer? How do I get rid of them?
There are many tags like NUMBER_OF_FRAMES and BPS in each output file I create with mkvmerge 7.0.0 or newer. Where do they come from? What are they for? How do I get rid of them?
The Matroska track headers do not offer fields for statistical data (see also e.g. the FAQ entry about meta data being lost during muxing). As adding new Matroska elements has proven problematic for existing players in the past, it was decided to store such information in tags instead.
Starting with v7.0.0, mkvmerge will automatically calculate a number of statistical properties for each track and write tags for them. Those statistics all apply to exactly one track in the file they're present in (meaning that if you're using the splitting feature, those statistics will be re-calculated for each output file created).
Here's a list of all the tags that are generated, their format and their meaning:
NUMBER_OF_BYTES — the total number of encoded bytes this track consists of before any of Matroska's content encoding schemes (e.g. lossless track compression) is applied.
NUMBER_OF_FRAMES — the number of Matroska blocks present in the track. Note that if you have an interlaced video track, then this number usually contains the number of interlaced fields, not the number of full frames.
DURATION — the total duration for this track. This is calculated as the difference between the highest sum of a track's timecode + its duration and the lowest timecode. Mathematically speaking it's MAX(timecode[m] + duration[m]) - MIN(timecode[n]) for all indexes m, n. The format used is HH:MM:SS.nnnnnnnnn (HH = hours, MM = minutes, SS = seconds, nnnnnnnnn = nanoseconds).
BPS — the track's bit rate in bits per second. This is simply the NUMBER_OF_BYTES multiplied by 8 divided by DURATION in seconds.
_STATISTICS_WRITING_APP contains mkvmerge's version information. It can be used to identify which application has created those statistics. If this differs from the segment header field WRITING_APP, the tags may be out of date (meaning another app A was used to remux a file that contains valid statistics but A hasn't bothered to update the statistics tags itself). See the section Determining if the tags are up to date below for details.
_STATISTICS_WRITING_DATE_UTC contains a timestamp of when the statistics tags were created, in the format YYYY-MM-DD HH:MM:SS. Comparing this to the date value in the segment info header allows you to determine whether or not the tags are up to date. See the section Determining if the tags are up to date below for details.
_STATISTICS_TAGS is a space-separated list of the statistics tags that have been written by the app referenced by _STATISTICS_WRITING_APP. For mkvmerge this tag contains BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES.
Note that if you re-mux a file that already contains these statistical tags, mkvmerge will replace them with updated statistics.
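The DURATION and BPS definitions above can be illustrated with a small Python sketch. This is not mkvmerge's code: the function names are made up for this example, block timecodes and durations are assumed to be given in nanoseconds, and the rounding of BPS to the nearest integer is an assumption, as the text above does not say how mkvmerge rounds.

```python
NS_PER_SECOND = 1_000_000_000


def track_duration_ns(blocks):
    """MAX(timecode + duration) - MIN(timecode) over (timecode_ns, duration_ns) pairs."""
    blocks = list(blocks)
    if not blocks:
        return 0
    return max(t + d for t, d in blocks) - min(t for t, _ in blocks)


def format_duration(total_ns):
    """Format nanoseconds as HH:MM:SS.nnnnnnnnn."""
    seconds, nanos = divmod(total_ns, NS_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{nanos:09d}"


def parse_duration(text):
    """Parse HH:MM:SS.nnnnnnnnn back into nanoseconds."""
    hours, minutes, seconds = text.split(":")
    whole, _, fraction = seconds.partition(".")
    # Pad or cut the fractional part to exactly nine digits (nanoseconds).
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return (int(hours) * 3600 + int(minutes) * 60 + int(whole)) * NS_PER_SECOND + nanos


def bits_per_second(number_of_bytes, duration_text):
    """NUMBER_OF_BYTES * 8 / DURATION in seconds (rounding is an assumption)."""
    duration_ns = parse_duration(duration_text)
    if duration_ns <= 0:
        return 0
    return round(number_of_bytes * 8 * NS_PER_SECOND / duration_ns)
```

For example, three blocks of 40 ms each starting at timecode 0 give a DURATION of 00:00:00.120000000, and a track of 1,000,000 bytes lasting 00:01:40.000000000 gives a BPS of 80000.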
Disabling the generation of those tags
You can disable the generation by passing mkvmerge the option --disable-track-statistics-tags. In MKVToolNix GUI you can add that option on the Output tab under Additional options (for the current multiplex job) or in the preferences at Multiplexer → Default values → Default additional command line options (for all newly created multiplex jobs).
Note that if you re-mux a file containing these tags, you have to disable reading tags from the source file in addition to disabling the generation of new statistics tags with --disable-track-statistics-tags.
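As a rough sketch, a script that calls mkvmerge could assemble the command line as follows. The helper name and the file names are made up for this example. Using --no-track-tags to disable reading tags from the source file is an assumption about which option is meant here, and it drops all track tags from that source file, not only the statistics tags.

```python
def build_mkvmerge_args(source, destination, drop_existing_tags=False):
    """Return an argument list for mkvmerge that writes no statistics tags."""
    args = ["mkvmerge", "--disable-track-statistics-tags", "-o", destination]
    if drop_existing_tags:
        # Assumption: --no-track-tags is the option for "disable reading tags
        # from the source file". It applies to the source file that follows it
        # and removes ALL of its track tags.
        args.append("--no-track-tags")
    args.append(source)
    return args


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute it.
    print(build_mkvmerge_args("input.mkv", "output.mkv", drop_existing_tags=True))
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.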
Determining if the tags are up to date
A problem arises when an application copies these statistics tags from an existing file to a new file without being aware of them. In that case the statistics tags apply to the old file but not necessarily to the new file anymore.
A reader (either a human being or an application) can make use of two pieces of information in order to determine whether or not the tags are up to date (and who to blame if they're wrong): a comparison of _STATISTICS_WRITING_APP to the segment info value writing application and a comparison of _STATISTICS_WRITING_DATE_UTC to the segment info value date.
The more important one is the date & time field and its comparison. If an application not aware of those tags simply re-muxes the tags to a new file, then the tag's timestamp will be older than the segment info's date field. This is a clear indicator that the tags are out of date. The segment info writing application then tells you which application it is that doesn't support those tags.
If the statistics are wrong, the comparison of the writing application may yield a hint where to file the bug report. If the tag's timestamp is newer than the segment info's date value, then the _STATISTICS_WRITING_APP is to blame for a miscalculation; otherwise it's likely the case mentioned above: the segment info's writing application doesn't support those tags.
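The decision described in this section might be sketched in Python like this. The function name and the return shape are made up for this example, and the values are assumed to have been read from the file already. Treating equal timestamps as "up to date" and ignoring fractions of a second in the segment info date are assumptions; the text above only covers the "older" and "newer" cases.

```python
from datetime import datetime


def assess_statistics(stats_date_utc, segment_date, stats_app, writing_app):
    """Return (up_to_date, app_to_contact_if_the_values_are_wrong).

    stats_date_utc: value of _STATISTICS_WRITING_DATE_UTC ("YYYY-MM-DD HH:MM:SS")
    segment_date:   segment info date as a naive datetime in UTC
    stats_app:      value of _STATISTICS_WRITING_APP
    writing_app:    segment info writing application
    """
    stats_time = datetime.strptime(stats_date_utc, "%Y-%m-%d %H:%M:%S")
    # Assumption: compare with one-second resolution, as the tag has no fraction.
    segment_time = segment_date.replace(microsecond=0)
    if stats_time < segment_time:
        # The tags were copied by an application that did not update them.
        return (False, writing_app)
    # The tags are at least as recent as the file; wrong values point to stats_app.
    return (True, stats_app)
```

With hypothetical values, a tag dated 2014-06-09 12:00:00 in a file whose segment info date is 2014-06-10 08:30:00 is reported as out of date, and the segment info writing application is the one to contact.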
Matching tags to tracks
In order to determine which tag applies to which track, you have to look at that tag's Targets element. It will contain the following sub-elements:
TARGET_TYPE_VALUE of 50 and TARGET_TYPE of MOVIE — this means that the tag applies to the whole file and not just to a part of it (e.g. to a single chapter)
TrackUID — this will contain the UID of the track that this tag applies to. It will be the UID of one of the tracks found in the track headers.
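As an illustration, the following Python sketch groups tags by TrackUID. It assumes the tags have been exported to XML and that the XML uses the element names Tag, Targets, TrackUID, Simple, Name and String; that layout, the function name and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real files carry more tags and more tracks.
SAMPLE_XML = """
<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
      <TargetType>MOVIE</TargetType>
      <TrackUID>1234567890</TrackUID>
    </Targets>
    <Simple><Name>BPS</Name><String>128000</String></Simple>
    <Simple><Name>NUMBER_OF_FRAMES</Name><String>12345</String></Simple>
  </Tag>
</Tags>
"""


def statistics_by_track(xml_text):
    """Map each TrackUID to a dict of {tag name: tag value}."""
    result = {}
    for tag in ET.fromstring(xml_text).iter("Tag"):
        targets = tag.find("Targets")
        if targets is None:
            continue
        uid = targets.findtext("TrackUID")
        if uid is None:
            # No TrackUID: the tag is not bound to a single track.
            continue
        values = {s.findtext("Name"): s.findtext("String") for s in tag.findall("Simple")}
        result.setdefault(int(uid), {}).update(values)
    return result


if __name__ == "__main__":
    print(statistics_by_track(SAMPLE_XML))
```

The resulting keys can then be compared with the UIDs found in the track headers to find the track each set of statistics belongs to.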
Easy access to those tags for other programs
mkvmerge's verbose identification mode has been extended to make this statistical information easier to access. In this mode track-specific tags are output as well if you also use the option --engage keep_track_statistics_tags. Their format in the output is tag_<tagname>:value with <tagname> being the tag's name in all lower case. For example, the output might include tag_number_of_frames:12345.
Note that mkvmerge's identification output is escaped according to mkvmerge's escaping rules in order to allow reliable and unambiguous parsing.
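A program reading this output could extract the tags roughly as in the Python sketch below. It assumes the caller has already isolated the space-separated key:value properties of one track, and that the escape sequences are \s for a space, \2 for a double quote, \c for a colon, \h for a hash mark, \b and \B for square brackets and \\ for a backslash. The function names and the sample line are made up for this example.

```python
import re

# Escape sequences as assumed in the lead-in above.
_ESCAPES = {"s": " ", "2": '"', "c": ":", "h": "#", "b": "[", "B": "]", "\\": "\\"}


def unescape(text):
    """Undo the escaping; unknown sequences are left untouched."""
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), text)


def track_tags(properties):
    """Return {tagname: value} for every tag_<tagname>:value property."""
    tags = {}
    # Splitting on whitespace is safe because spaces inside values are escaped.
    for item in properties.split():
        key, separator, value = item.partition(":")
        if separator and key.startswith("tag_"):
            tags[key[len("tag_"):]] = unescape(value)
    return tags


if __name__ == "__main__":
    sample = r"number:1 tag_bps:128000 tag_duration:00\c01\c02.000000000 tag_number_of_frames:12345"
    print(track_tags(sample))
```

The DURATION value shows why unescaping matters: its colons arrive as \c and have to be converted back before the value can be parsed.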
Example output for one track (broken up into multiple lines for easier reading; in reality this output would contain only two long lines, one for the container itself, one for the single track this file contains):