v1.51.0 to v1.52.0 · cznic / sqlite

Commits on Source (14)

add Backup.Remaining and Backup.PageCount progress wrappers · 5e633702

Ian Chechin authored May 28, 2026 and cznic committed May 28, 2026

Two thin wrappers around the existing sqlite3_backup_remaining and
sqlite3_backup_pagecount C symbols. They expose the backup progress
counters that SQLite already keeps on the sqlite3_backup object but that
Go callers cannot currently read without dropping to lib/* directly.

The motivation is the standard progress-UI use case for online backups:

    for {
        more, err := bck.Step(pagesPerTick)
        if err != nil {
            return err
        }
        ui.Update(bck.PageCount()-bck.Remaining(), bck.PageCount())
        if !more {
            break
        }
    }

Without these wrappers a caller has to either skip the progress display
or fall back to unsafe per-call SQL queries against pragma_page_count.

API shape mirrors !115 (FileControlDataVersion): named after the SQLite
C function with the sqlite3_ prefix stripped and CamelCase applied,
documented inline with a link to the official C API page, and added on
the existing public *Backup receiver so no new interface or escape
hatch is required.

The C functions are zero-arg lookups against the sqlite3_backup
object and cannot fail, so the Go wrappers return int with no error.
Per the SQLite docs, both return 0 before the first Step and Remaining
returns 0 after SQLITE_DONE; the new TestBackupProgress test exercises
all three phases (before any Step, after a partial Step, after DONE)
and asserts the documented relationships hold (Remaining = PageCount -
copied, PageCount stable across the final Step).
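
The three phases and the relationships between the counters can be sketched against a stand-in type. The fakeBackup type and the progress helper below are hypothetical; they only model the counter semantics described above (both counters 0 before the first Step, Remaining 0 once the copy is done) and are not the package's *Backup.

```go
package main

import "fmt"

// fakeBackup is a hypothetical stand-in for an online backup handle.
// It only models the counter semantics described in the commit message.
type fakeBackup struct {
	total   int  // pages in the source database
	copied  int  // pages copied so far
	stepped bool // counters are only populated by Step
}

// Step copies up to n pages and reports whether more pages remain.
func (b *fakeBackup) Step(n int) (more bool, err error) {
	b.stepped = true
	b.copied += n
	if b.copied > b.total {
		b.copied = b.total
	}
	return b.copied < b.total, nil
}

// PageCount returns the total page count as of the last Step (0 before any Step).
func (b *fakeBackup) PageCount() int {
	if !b.stepped {
		return 0
	}
	return b.total
}

// Remaining returns the pages still to copy as of the last Step (0 before any Step).
func (b *fakeBackup) Remaining() int {
	if !b.stepped {
		return 0
	}
	return b.total - b.copied
}

// progress returns (copied, total) in the form a progress display would consume.
func progress(b *fakeBackup) (copied, total int) {
	return b.PageCount() - b.Remaining(), b.PageCount()
}

func main() {
	b := &fakeBackup{total: 10}
	fmt.Println(progress(b)) // before any Step
	b.Step(4)
	fmt.Println(progress(b)) // after a partial Step
	b.Step(100)
	fmt.Println(progress(b)) // after the final Step
}
```

Because both counters read 0 before the first Step, a progress display built this way should treat a zero total as "not started" rather than dividing by it.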

Test suite (go test -count=1 ./...) stays green.


Merge branch 'feat/backup-progress-wrappers' into 'master' · 2cba7d51
cznic authored May 28, 2026
```
add Backup.Remaining and Backup.PageCount progress wrappers

See merge request !122
```

CHANGELOG.md: document #122 · 0c32f40a

cznic authored May 28, 2026



Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


conn: skip the second string copy in columnText · 20ab6ab7

Ian Chechin authored May 29, 2026

(*conn).columnText currently allocates twice per TEXT column per row:
once for the make([]byte, len) buffer that receives the SQLite-owned UTF-8
bytes, and once again inside the string(b) conversion that
runtime.slicebytetostring performs because the compiler must assume the
caller could mutate b.

Here b is local to columnText and is never touched again after the copy
from the C buffer, so the second copy is redundant. Replacing string(b)
with unsafe.String(unsafe.SliceData(b), len) builds the returned string
directly on top of b. The string is immutable from Go's perspective, the
GC keeps b alive for as long as the string is reachable, and no aliasing
is possible because b becomes unreachable as []byte the moment the
function returns. The same pattern is already used in sqlite.go (!120)
for the volatile-args path.
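
A minimal sketch of the two shapes, using only the standard library (unsafe.String and unsafe.SliceData require Go 1.20 or later). The function names copyTwice and copyOnce are illustrative and not taken from the package; the source buffer stands in for the SQLite-owned bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// copyTwice mirrors the old shape: one copy into the buffer and, in
// general, a second copy inside the string(b) conversion.
func copyTwice(src []byte) string {
	b := make([]byte, len(src))
	copy(b, src)
	return string(b)
}

// copyOnce builds the string directly on top of b. This is only sound
// because b is never written to, or otherwise used, after this point.
func copyOnce(src []byte) string {
	if len(src) == 0 {
		return ""
	}
	b := make([]byte, len(src))
	copy(b, src)
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	src := []byte("héllo, wörld 🙂")
	fmt.Println(copyTwice(src) == copyOnce(src))
}
```

The returned string never aliases src, so the caller can reuse or overwrite the source buffer; the only requirement is that nothing retains b as a []byte.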

Benchmark on darwin/arm64 (Apple M3), 1000-row SELECT of a single TEXT
column, -benchtime=2s, before -> after:

  Short  (16-byte TEXT):
    4009 -> 4009 allocs/op  (Go runtime already short-circuits
                             string(b) for slices below the inline
                             threshold; no regression either)
       52348 ->    52348 B/op
      157342 ->   155746 ns/op

  Medium (256-byte TEXT):
    5009 -> 4009 allocs/op  (-1000 allocs/op = -1 per row)
      548351 ->   292350 B/op  (-256 KB/op = the second 256-byte copy)
      226863 ->   204730 ns/op (-10%)

  Long  (4096-byte TEXT):
    5009 -> 4009 allocs/op  (-1000 allocs/op = -1 per row)
     8228510 -> 4132413 B/op  (-4 MB/op = the second 4 KB copy)
     1605640 -> 1135113 ns/op (-29%)

The saving scales linearly with TEXT column length, since the eliminated
work is exactly one memcpy of the column bytes. (*conn).columnBlob is
unchanged: it already returns its make([]byte, len) buffer directly and
pays only one alloc + memcpy per row.

TestColumnTextScan exercises the path under -race over the three branches
of columnText: empty (short-circuit), short (Go-fast-path) and long
(allocating) TEXT, including a multi-byte / emoji payload to confirm
UTF-8 is preserved bit-for-bit. Full go test -count=1 ./... stays green.


Merge branch 'perf/column-text-zero-copy' into 'master' · c80a08fb
cznic authored May 29, 2026
```
conn: skip the second string copy in columnText

See merge request !123
```

CHANGELOG.md: document #123 · b17c0c7f

cznic authored May 29, 2026



Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


rows: cache the column decltype lookup once per result set · f8fb6dd1

Ian Chechin authored May 30, 2026

The Next() hot path calls (*rows).ColumnTypeDatabaseTypeName(i) for
every TEXT column on every row when _texttotime=1, and for every
INTEGER column on every row when intToTime is set. Each call ran:

  return strings.ToUpper(r.c.columnDeclType(r.pstmt, index))

which is one libc.GoString to materialise the C decltype string into Go
memory, plus a (cheap, allocation-free for already-uppercase inputs)
strings.ToUpper. The declared type of a result column is fixed for the
lifetime of a prepared statement, so the libc.GoString cost is paid
N_text_cols * N_rows times for nothing.

Move the lookup to newRows() and cache the uppercased decltype into a
new rows.decltypes []string. ColumnTypeDatabaseTypeName, the Next()
DATETIME branch (which goes through it), and ColumnTypeScanType now
read from the cache instead of redoing the C round-trip per row. The
case-sensitive switch in ColumnTypeScanType is rewritten against the
cached uppercase values to drop a per-column strings.ToLower as well.
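
The caching idea can be sketched in isolation. Everything below is a simplified model: the lookups counter and the columnDeclType function are hypothetical stand-ins for the C round-trip, and the rows type only carries the cache field named above.

```go
package main

import (
	"fmt"
	"strings"
)

// lookups counts calls to the simulated per-column decltype lookup.
var lookups int

// columnDeclType is a hypothetical stand-in for the C-side decltype lookup.
func columnDeclType(decl []string, i int) string {
	lookups++
	return decl[i]
}

// rows caches the uppercased declared type of each result column.
type rows struct {
	decltypes []string
}

// newRows performs the lookup exactly once per column.
func newRows(decl []string) *rows {
	r := &rows{decltypes: make([]string, len(decl))}
	for i := range decl {
		r.decltypes[i] = strings.ToUpper(columnDeclType(decl, i))
	}
	return r
}

// ColumnTypeDatabaseTypeName reads from the cache; there is no per-row lookup.
func (r *rows) ColumnTypeDatabaseTypeName(i int) string {
	return r.decltypes[i]
}

func main() {
	r := newRows([]string{"DateTime", "text"})
	for row := 0; row < 1000; row++ {
		for col := range r.decltypes {
			_ = r.ColumnTypeDatabaseTypeName(col)
		}
	}
	fmt.Println(lookups, r.ColumnTypeDatabaseTypeName(0))
}
```

In this model the lookup count depends only on the number of columns, not on the number of rows scanned.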

Benchmark (darwin/arm64 Apple M3, _texttotime=1, 1000-row SELECT of all
DATETIME columns, -benchtime=2s, before -> after):

  1 column:
    11010 -> 10012 allocs/op  (-1000 = -1 per row, the libc.GoString)
       400354 ->   392393 B/op  (-8 KB = -8 bytes per row, "DATETIME"
                                  string body)
       646068 ->   601121 ns/op (-7%)

  5 columns:
    55014 -> 50020 allocs/op  (-5000 = -5 per row, -1 per col per row)
      2000499 -> 1960654 B/op  (-40 KB, scales linearly with columns)
      2992839 -> 2908393 ns/op (-3%)

The saving scales 1:1 with N_text_cols * N_rows for queries that hit
the time-conversion path. Workloads using the _texttotime, _time_format,
or _inttotime DSN flags benefit; queries without those flags do not
touch ColumnTypeDatabaseTypeName per row and see no behavior change.

TestColumnTypeDatabaseTypeNameCache covers a mixed-case CREATE TABLE
across a range of declared column types (INTEGER / TEXT / BLOB /
DATETIME / DATE / BOOLEAN), reads the cache once at result-set start and
again inside the Next loop for every row, and asserts the values never
drift. The full go test -count=1 ./... suite stays green.


rows: lock down ColumnTypeScanType under the decltype cache · 8a6f33ce

Ian Chechin authored May 31, 2026

Per @cznic on !124: the decltype cache rewrites the lowercase decltype
switch in ColumnTypeScanType to a cached-uppercase switch, but the
existing TestColumnTypeDatabaseTypeNameCache only exercises the
DatabaseTypeName side. Add a table-driven TestColumnTypeScanTypeDecltypeCache
that covers every arm of the cached switch:

  - INTEGER + BOOLEAN (any case)              -> bool
  - INTEGER + DATE/DATETIME/TIME/TIMESTAMP    -> time.Time
  - INTEGER + plain / unrecognised decltype   -> int64
  - TEXT (default)                            -> string
  - TEXT + DATETIME-shaped decltype (no flag) -> string
  - TEXT + DATE/DATETIME/TIME/TIMESTAMP under _texttotime=1 -> time.Time
  - TEXT + unrecognised decltype under _texttotime=1        -> string

Each case uses a mixed-case declared type to keep the case-folding path
covered, and inserts one row before SELECT so sqlite3_column_type sees
the actual storage class instead of SQLITE_NULL (which would short-
circuit ColumnTypeScanType to reflect.TypeOf(nil)).
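
The arms listed above can be modeled as a small table-driven sketch. The storage type, the scanType function and the isTimeDecl helper are hypothetical names; the mapping follows the list in this commit message, not the package source.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"time"
)

// storage is a hypothetical stand-in for the value's actual storage class.
type storage int

const (
	storageInteger storage = iota
	storageText
)

var (
	typeBool   = reflect.TypeOf(false)
	typeInt64  = reflect.TypeOf(int64(0))
	typeString = reflect.TypeOf("")
	typeTime   = reflect.TypeOf(time.Time{})
)

// isTimeDecl reports whether an uppercased decltype names a time-like type.
func isTimeDecl(upper string) bool {
	switch upper {
	case "DATE", "DATETIME", "TIME", "TIMESTAMP":
		return true
	}
	return false
}

// scanType expects upperDecl to be uppercased already, as read from a cache.
func scanType(st storage, upperDecl string, textToTime bool) reflect.Type {
	switch st {
	case storageInteger:
		if upperDecl == "BOOLEAN" {
			return typeBool
		}
		if isTimeDecl(upperDecl) {
			return typeTime
		}
		return typeInt64
	case storageText:
		if textToTime && isTimeDecl(upperDecl) {
			return typeTime
		}
		return typeString
	}
	return nil
}

func main() {
	cases := []struct {
		st         storage
		decl       string
		textToTime bool
	}{
		{storageInteger, "Boolean", false},
		{storageInteger, "TimeStamp", false},
		{storageInteger, "Integer", false},
		{storageText, "DateTime", false},
		{storageText, "DateTime", true},
		{storageText, "VarChar", true},
	}
	for _, c := range cases {
		// Uppercasing happens once here, standing in for the cache fill.
		fmt.Println(c.decl, c.textToTime, scanType(c.st, strings.ToUpper(c.decl), c.textToTime))
	}
}
```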

All 15 sub-cases pass under -race.


Merge branch 'perf/cache-column-decltype' into 'master' · 51e67147
cznic authored Jun 01, 2026
```
rows: cache the column decltype lookup once per result set

See merge request !124
```

CHANGELOG.md: document #124 · 7da793ef

cznic authored Jun 01, 2026



Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


rows: cache the parseTime format index per result column · 3638d17b

Ian Chechin authored Jun 02, 2026

(*conn).parseTime ran on every TEXT-stored DATETIME / DATE / TIMESTAMP
column read in Next(). The function tried (*conn).parseTimeString first
and then walked parseTimeFormats[0..6] sequentially until time.Parse
matched the row's value. For the canonical SQLite TEXT datetime format
("2006-01-02 15:04:05.999999999", index 2) every row paid two failed
time.Parse attempts (indices 0 and 1) before the one successful match.
Each failed Parse allocates a ParseError, so on a steady 1000-row scan
the format search alone cost ~5 allocs per row.

Add a sticky per-column hint cache:

  - rows.parseFmtIdx []int8, sized once at newRows() to the column count,
    initialised to -1 (no match recorded).
  - (*conn).parseTime now takes hintIdx int and returns the index that
    actually matched (or -1 when parseTimeString matched / all formats
    failed). It tries hintIdx first if in range, then walks the list
    skipping the index it just tried.
  - rows.Next() records the first successful index per column and reuses
    it on subsequent rows. The cache is sticky: it is set once and not
    overwritten, so a mixed-format column still pays the original
    fall-through cost on non-matching rows (plus one failed probe of the
    cached format when the matching format precedes it), but a steady
    column wins on every row after the first.
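
The hint-first lookup can be sketched with the standard time package alone. The formats slice below is an assumption modeled on the description (seven layouts, the canonical one at index 2) and may not match the package's real list; the parseTimeString step is omitted, and the extra probes return value exists only to make the cost visible.

```go
package main

import (
	"fmt"
	"time"
)

// formats is an assumed layout list: seven entries, canonical format at index 2.
var formats = []string{
	"2006-01-02 15:04:05.999999999-07:00",
	"2006-01-02T15:04:05.999999999-07:00",
	"2006-01-02 15:04:05.999999999",
	"2006-01-02T15:04:05.999999999",
	"2006-01-02 15:04",
	"2006-01-02T15:04",
	"2006-01-02",
}

// parseTime tries formats[hint] first when hint is in range, then walks the
// list in order, skipping the index already tried. It returns the parsed
// time, the index that matched (-1 if none), and the number of time.Parse
// calls made.
func parseTime(s string, hint int) (t time.Time, idx, probes int) {
	if hint >= 0 && hint < len(formats) {
		probes++
		if t, err := time.Parse(formats[hint], s); err == nil {
			return t, hint, probes
		}
	}
	for i, f := range formats {
		if i == hint {
			continue
		}
		probes++
		if t, err := time.Parse(f, s); err == nil {
			return t, i, probes
		}
	}
	return time.Time{}, -1, probes
}

func main() {
	hint := -1 // per-column cache entry; -1 means no match recorded yet
	values := []string{
		"2026-06-02 10:00:00",
		"2026-06-02 10:00:01",
		"2026-06-02T10:00:02",
		"2026-06-02",
	}
	for _, s := range values {
		_, idx, probes := parseTime(s, hint)
		if hint < 0 && idx >= 0 {
			hint = idx // sticky: set once, never overwritten
		}
		fmt.Println(s, idx, probes)
	}
}
```

In this sketch a steady column needs a single time.Parse call per row once the hint is recorded, while a row in a different format falls back to the ordered walk.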

Benchmark (darwin/arm64 Apple M3, 1000-row SELECT of a DATETIME TEXT
column in the canonical SQLite format, -benchtime=2s, before -> after):

  10013 -> 5019 allocs/op   (-50%, ~5 fewer per row)
   392417 -> 168672 B/op    (-57%, mostly ParseError structs)
   633531 -> 397843 ns/op   (-37%)

TestParseTimeFormatCache covers correctness across the cache transitions:
three steady-format rows followed by one ISO-T format row (different
index) and one date-only row (yet another index), all returning the
expected time.Time. The full go test -count=1 ./... suite stays green.

No API change. The fall-through chain is preserved bit-for-bit so any
row the old code would have parsed still parses to the same value.


Merge branch 'perf/cache-parse-time-format' into 'master' · 44857934
cznic authored Jun 02, 2026
```
rows: cache the parseTime format index per result column

See merge request !125
```

rows: clarify parseFmtIdx mixed-column cost; CHANGELOG.md: document #125 · e3f64ec2

cznic authored Jun 02, 2026



Tighten the parseFmtIdx doc comment: a mixed-format column pays at most one extra format probe (on rows whose matching format precedes the cached index), not just the original fall-through cost. Add the !125 CHANGELOG entry. No code/behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


release v1.52.0, upgrade to SQLite 3.53.2 · 66b4d20f
cznic authored Jun 06, 2026
