tpch revealed bugs the Tcl tests do not catch
I used the recently added tpch command to produce test DBs (with default scale factor 1). I noticed the DB sizes differ between go-sqlite3 and sqlite:
jnml@e5-1650:~/tpch/testdata/sqlite3/sf1$ ll
total 1510088
-rw-r--r-- 1 jnml jnml 1546330112 Jan 15 15:40 sqlite3.db
jnml@e5-1650:~/tpch/testdata/sqlite3/sf1$
vs
jnml@e5-1650:~/tpch/testdata/sqlite/sf1$ ll
total 4202860
-rw-r--r-- 1 jnml jnml 4303728640 Jan 15 17:03 sqlite.db
jnml@e5-1650:~/tpch/testdata/sqlite/sf1$
While looking into what might cause the difference, I ran:
jnml@e5-1650:~/tpch/testdata/sqlite3/sf1$ sqlite-shell sqlite3.db 'select * from part limit 10'
135263|deep drab floral hot blue|Manufacturer#2|Brand#21|PROMO POLISHED TIN|6|MED PACK|129826|uffy, fluffy pla
14658|dark lemon steel cornsilk ghost|Manufacturer#3|Brand#35|MEDIUM BURNISHED STELL|5|WRAP CASE|157265|h; foxes
72265|sienna maroon medium navy rose|Manufacturer#5|Brand#53|PROMO PLATED BRASS|24|LG PACK|123726|odolites try to hi
186005|deep misty orange indian snow|Manufacturer#5|Brand#52|STANDARD ANODIZED TIN|35|WRAP CASE|109100|y silent dugouts s
166014|lawn burlywood sky linen wheat|Manufacturer#4|Brand#43|SMALL BURNISHED BRASS|14|JUMBO CASE|108001|ve, regular dolph
44087|lawn lemon hot linen coral|Manufacturer#2|Brand#25|STANDARD BRUSHED STELL|18|JUMBO CAN|103108|y past
54705|tomato orange ivory lime snow|Manufacturer#2|Brand#25|PROMO BURNISHED COPPER|38|SM JAR|165970|luffily past the perma
145828|aquamarine lavender lawn mint blanched|Manufacturer#4|Brand#44|STANDARD BURNISHED BRASS|18|SM JAR|187382|l beans shall ha
104088|dim aquamarine drab sky papaya|Manufacturer#3|Brand#35|STANDARD BRUSHED COPPER|20|WRAP CASE|109208|ets should have to
123634|navajo beige medium misty blush|Manufacturer#2|Brand#21|PROMO BRUSHED TIN|5|JUMBO CAN|165763|s doubt carefully f
jnml@e5-1650:~/tpch/testdata/sqlite3/sf1$
vs
jnml@e5-1650:~/tpch/testdata/sqlite/sf1$ sqlite-shell sqlite.db 'select * from part limit 10'
uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla|uffy, fluffy pla
h; foxes|h; foxes|h; foxes|h; foxes|h; foxes|h; foxes|h; foxes|h; foxes|h; foxes
odolites try to hi|odolites try to hi|odolites try to hi|odolites try to hi|odolites try to hi|odolites try to hi|odolites try to hi|odolites try to hi|odolites try to hi
y silent dugouts s|y silent dugouts s|y silent dugouts s|y silent dugouts s|y silent dugouts s|y silent dugouts s|y silent dugouts s|y silent dugouts s|y silent dugouts s
ve, regular dolph|ve, regular dolph|ve, regular dolph|ve, regular dolph|ve, regular dolph|ve, regular dolph|ve, regular dolph|ve, regular dolph|ve, regular dolph
y past |y past |y past |y past |y past |y past |y past |y past |y past
luffily past the perma|luffily past the perma|luffily past the perma|luffily past the perma|luffily past the perma|luffily past the perma|luffily past the perma|luffily past the perma|luffily past the perma
l beans shall ha|l beans shall ha|l beans shall ha|l beans shall ha|l beans shall ha|l beans shall ha|l beans shall ha|l beans shall ha|l beans shall ha
ets should have to|ets should have to|ets should have to|ets should have to|ets should have to|ets should have to|ets should have to|ets should have to|ets should have to
s doubt carefully f|s doubt carefully f|s doubt carefully f|s doubt carefully f|s doubt carefully f|s doubt carefully f|s doubt carefully f|s doubt carefully f|s doubt carefully f
jnml@e5-1650:~/tpch/testdata/sqlite/sf1$
The 9 columns all contain the same value, which is the correct value for the last column only.
I don't know what, but something is very wrong. I hope to look into it this weekend, but additional bug-hunting volunteers are more than welcome.
sqlite-shell was compiled with gcc -o sqlite-shell shell.c sqlite3.c -lpthread -ldl in sqlite-amalgamation-3330000. The point is that this program does not use any Go code, so it proves the sqlite DB is corrupt on disk: the error is in writing the data, not in reading it.
My hypothesis is that we are doing something stupid when going between C strings and Go strings with respect to either the ownership of the result or the lifetime of the object, resulting in some data being incorrectly reused and/or overwritten.
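To make the hypothesis concrete, here is a minimal sketch of how that kind of aliasing could produce exactly the symptom above. It does not use the real driver code or any SQLite API. The names cstr, writeCString and bindRow are made up for illustration, and the assumption is that each bound value keeps a reference to one reused NUL-terminated buffer instead of owning a copy.

```go
package main

import (
	"bytes"
	"fmt"
)

// cstr stands in for a C char*: a view of memory that is read up to the
// first NUL byte. Hypothetical type, for illustration only.
type cstr []byte

func (c cstr) String() string {
	if i := bytes.IndexByte(c, 0); i >= 0 {
		return string(c[:i])
	}
	return string(c)
}

// writeCString overwrites buf with s followed by a terminating NUL.
func writeCString(buf []byte, s string) {
	n := copy(buf[:len(buf)-1], s)
	buf[n] = 0
}

// bindRow "binds" every column through one reused scratch buffer and
// returns the values as they look when the row is finally written.
// With copyOnBind false, every binding aliases the same memory.
func bindRow(cols []string, copyOnBind bool) []string {
	scratch := make([]byte, 64) // reused for every column
	bound := make([]cstr, len(cols))
	for i, c := range cols {
		writeCString(scratch, c)
		if copyOnBind {
			bound[i] = append(cstr(nil), scratch...) // binding owns its bytes
		} else {
			bound[i] = scratch // binding only borrows the buffer
		}
	}
	out := make([]string, len(bound))
	for i, b := range bound {
		out[i] = b.String()
	}
	return out
}

func main() {
	row := []string{"135263", "Manufacturer#2", "uffy, fluffy pla"}
	fmt.Println("aliased:", bindRow(row, false))
	fmt.Println("copied: ", bindRow(row, true))
}
```

In the aliased case every column comes out as the value written last, which matches what sqlite.db shows. Whether the real bug is of this shape is still to be confirmed.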