[#382] Fix incorrect internal representation of 1970-01-01 (unix epoch)

TLDR

  • Dates less than 1 second away from 1970-01-01 did not have a uniform representation in various formats
  • New test TDTT110 does joins between all date types for dates at and around epoch
  • It showed that we have 3 different issues causing conversion problems. These are now fixed
  • ZHorolog conversion of microseconds less than 100000 was also wrong, and is now fixed, and tested by TDTT111 and TDTT112

Background

  • The TDTT087 subtest failed in one nightly in-house test run with the following diff.

    29c29
    < 1
    ---
    > 0
  • This fancy subtest generates queries with random date values and converts them to various formats (zhorolog, zut etc.) and then stores those transformed values as values in different globals each of which is then mapped to an Octo READONLY table that defines a non-key column of that type (zhorolog, zut etc.).

    create table date_zhorolog_tbl (order_id integer primary key, order_date date(zhorolog)) GLOBAL "^datezhorologtbl" READONLY;
    create table date_zut_tbl (order_id integer primary key, order_date date(zut)) GLOBAL "^datezuttbl" READONLY;
  • The subtest then does an inner join from 2 such tables where the ON clause is specified as t1.order_date = t2.order_date where t1 and t2 correspond to tables with the non-key columns in different date format (in this case, zhorolog and zut).

    select count(*) >= 10 from date_zhorolog_tbl t1 inner join date_zut_tbl t2 on t1.order_date = t2.order_date;
  • It then expects as many number of rows in the output of the inner join query as there are rows in each of those tables (both tables are guaranteed to have the same number of rows by the test).

  • In this case, there were 10 rows in each of those tables and the subtest expected 10 rows of output from the inner join query. But it instead saw 9 rows and that is what caused the failure above.

  • After some analysis, came up with the below simple test case that demonstrates the same issue as seen above.

    yottadb -run %XCMD 'kill ^datezhorologtbl set ^datezhorologtbl(1)="47117,,,"'
    yottadb -run %XCMD 'kill ^datezuttbl set ^datezuttbl(2)=0'
    echo 'create table date_zhorolog_tbl (order_id integer primary key, order_date date(zhorolog)) GLOBAL "^datezhorologtbl" READONLY;' | src/octo
    echo 'create table date_zut_tbl (order_id integer primary key, order_date date(zut)) GLOBAL "^datezuttbl" READONLY;' | src/octo
    echo 'select * from date_zhorolog_tbl;' | src/octo
    echo 'select * from date_zut_tbl;' | src/octo
    echo 'select * from date_zhorolog_tbl t1 inner join date_zut_tbl t2 on t1.order_date = t2.order_date;' | src/octo

    To give some background, the special values in the gvns used in the set commands above were obtained by running the below queries (the zhorolog and zut representation of the unix epoch date 1970-01-01.

    $ echo "select date_to_zhorolog(date '1970-01-01');" | src/octo
    date_to_zhorolog
    47117,,,
    (1 row)
    
    $ echo "select date_to_zut(date '1970-01-01');" | src/octo
    date_to_zut
    0
    (1 row)
  • The output of the inner join query above is the following which is incorrect as we expect 1 row of output (not 0 rows of output).

    order_id|order_date|order_id|order_date
    (0 rows)
  • New subtest TDTT110 revealed that this problem is not just limited to zhorolog dates. Joins involving zut dates and dates with timezones showed similar problems.

  • All date types that can be joined are tested with Unix epoch, epoch - 1 microsecond, epoch + 1 microsecond, + 2 distant "regular" dates in updated TDTT110.

  • Epoch - 1 microsecond always passed. Upon investigation, doesn't fail as that number gets a nine's complement on it and appended to -1 (i.e. -1 second from epoch + a number of milliseconds, which brings it back closer to the epoch), and ends up fully numeric.

Issue

  • There were 3 separate issues:

  • Most datetime conversions go through ydboctoText2InternalFormatC(); ZUT conversions go through ydboctoZutC. For dates less than +1 second from the Epoch, ydboctoText2InternalFormatC() ended up with a string representation of microseconds that can be appended to seconds (e.g. 0000000 (7 digits of 0 each) or 0000001). ydboctoZutC returned a numeric result always (e.g. 0 or 1). Joins between ZUT types and other types therefore always failed.

  • The issue with ZHorolog per se has to do with the fact that the conversion of microseconds less than 100000 was incorrect. This does not affect the epoch date (0 mcs), but it affects the 1 mcs after the epoch.

  • Use of timezone types forced conversion of the string 0000000 to a numeric 0. In some cases, this causes the join to succeed (e.g. a join with a ZUT type); in other cases it would cause the join to fail (e.g. to a timestamp-only type).

Fix

  • The internal representation for doing joins between types relies on the Unix time tm from mktime() which does not include microseconds, and to which we add microseconds after. For example, 2020-02-14 05:14:32.009922 becomes 1581657272 concatenated with 009922, which is 1581657272009922. With no seconds in Unix time, we previously ended up with 0000000

  • The ideal fix is to make sure that all micrseconds less than 1 second of the epoch are just expressed numerically in ydboctoText2InternalFormat just like ydboctoZutC

  • The fix needed to be performed for all times less than 1 second after the epoch.

  • Now we use we use atoi() and sprintf() with a %i in ydboctoText2InternalFormat to strip the extra leading zeros and so 0000000 ends up being 0. The code in ydboctoZutC uses a different algorithm, and always returns a numeric value.

  • Separate from this fix, eposodic failures in TDTT046 showed that the change done to correct ZHOROLOG input processing in ZHorolog2Text^%ydboctoplanhelpers to fix the ZHOROLOG problem also needed to be done for output in PrintDateTimeResultColumnValue^%ydboctoplanhelpers. For example,

    $ echo "datetimeoutputformat = \"zhorolog\"" > octo.conf
    $ $ydb_dist/plugin/bin/octo
    OCTO> select timestamp'2969-3-18 16:31:45.365000';
    ???
    412071,59505,365,
    (1 row)

    365 microseconds is wrong. It should be 365000.

Test

  • A new TDTT110 subtest verifies this fix. It takes about a minute to run, and thus was split between two different files.

  • TDTT110 test showed 53 failures on master:

    • 20 failures due to ZHorolog microsecond conversion.
    • 28 failures due to ZUT using function ydboctoZutC rather than ydboctoText2InternalFormat. The latter always returns a numeric result 0 instead of 0000000 at the epoch, and 1 instead of 0000001 at one second after the epoch.
    • 6 failures due to use of Timezones, which forces any string microseconds to become a number (0000001 -> 1).
  • Added test TDTT111 for PrintDateTimeResultColumnValue^%ydboctoplanhelpers change.

  • Added test TDTT112 for ZHorolog2Text^%ydboctoplanhelpers change.

  • TDTT049 was moved to make test timing uniform

  • ZHorolog fixes changed the ref files for tests TDTT004 and TDTT104_non_primary_key, TDTT104_primary_key. Updated.

Edited by Narayanan Iyer

Merge request reports

Loading