Add FunctionImpl.VolatileArgs opt-in for zero-copy TEXT/BLOB args (follow-up to !114)

Follow-up to !114 (merged) (#226).

Background

!114 (merged) eliminated the per-call []driver.Value slice-header allocation for UDF callbacks but explicitly left the per-row TEXT and BLOB body copies out of scope. As @cznic framed it in the !114 (merged) review:

"While the driver's contract technically states that argument values aren't valid past the return of the function, Go's garbage collector trains developers to assume that if they hold a reference to a slice, the underlying array is safe."

A default-on zero-copy path would silently corrupt user code that retains an argument slice across rows. The corruption would also be undetectable by -race, because UDF execution is sequential on one goroutine and there is no concurrent access for the detector to flag.

What this MR does

Adds VolatileArgs bool to FunctionImpl as a strict opt-in. When true:

  • TEXT arguments arrive as unsafe.String views into the SQLite-owned sqlite3_value_text buffer
  • BLOB arguments arrive as unsafe.Slice views into the sqlite3_value_blob buffer

When false (the default for all existing call sites), behavior is byte-for-byte identical to current master.
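The difference between the two paths can be sketched in isolation. This is a minimal illustration, not the driver's code: a Go byte slice stands in for the SQLite-owned buffer, and the helper names textView, blobView and textCopy are hypothetical. Only unsafe.String and unsafe.Slice (Go 1.20+) are real APIs here.

```go
package main

import (
	"fmt"
	"unsafe"
)

// textView returns a string that aliases buf's memory without copying.
// It is only meaningful while buf stays alive and unmodified.
func textView(buf []byte) string {
	if len(buf) == 0 {
		return ""
	}
	return unsafe.String(&buf[0], len(buf))
}

// blobView returns a slice aliasing n bytes starting at p, without copying.
func blobView(p *byte, n int) []byte {
	if p == nil || n == 0 {
		return []byte{}
	}
	return unsafe.Slice(p, n)
}

// textCopy models the default (non-volatile) path: one allocation, one copy.
func textCopy(buf []byte) string { return string(buf) }

func main() {
	buf := []byte("hello") // stand-in for a buffer the callee does not own
	view := textView(buf)
	copied := textCopy(buf)
	blob := blobView(&buf[0], 3)

	copy(buf, "world") // simulates the owner reusing the buffer for the next row

	// view and blob alias buf, so they observe the overwrite; copied does not.
	fmt.Println(view, copied, string(blob))
}
```

The views avoid the allocation, at the cost of being tied to the lifetime of memory the callback does not control.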

The flag is captured at registration and threaded to the trampolines via small wrapper structs keyed in xFuncs.m / xAggregateFactories.m / xAggregateContext.m, so the hot path is one extra field read rather than a second map lookup.
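One way to picture the "wrapper struct keyed in a map" arrangement is the sketch below. All names (scalarFunc, funcEntry, registry, trampoline) are hypothetical stand-ins and do not reproduce the actual xFuncs.m layout; the point is only that a single lookup yields both the callback and the flag.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// scalarFunc is a hypothetical stand-in for the driver's scalar callback type.
type scalarFunc func(args []any) any

// funcEntry pairs a callback with the flag captured at registration time.
type funcEntry struct {
	fn           scalarFunc
	volatileArgs bool
}

// registry is a hypothetical stand-in for a handle-keyed map such as xFuncs.m.
type registry struct {
	mu   sync.RWMutex
	m    map[uintptr]*funcEntry
	next uintptr
}

func newRegistry() *registry { return &registry{m: map[uintptr]*funcEntry{}} }

// register stores the callback and its flag together under a fresh handle.
func (r *registry) register(fn scalarFunc, volatileArgs bool) uintptr {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.next++
	r.m[r.next] = &funcEntry{fn: fn, volatileArgs: volatileArgs}
	return r.next
}

// trampoline does one map lookup, then reads the flag as a plain struct field.
func (r *registry) trampoline(handle uintptr, raw []byte) (any, error) {
	r.mu.RLock()
	e, ok := r.m[handle]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown function handle %d", handle)
	}
	var arg []byte
	if e.volatileArgs && len(raw) > 0 {
		arg = unsafe.Slice(&raw[0], len(raw)) // zero-copy view of raw
	} else {
		arg = append([]byte(nil), raw...) // default path: private copy
	}
	return e.fn([]any{arg}), nil
}

func main() {
	r := newRegistry()
	identity := func(args []any) any { return args[0] }
	vol := r.register(identity, true)
	def := r.register(identity, false)

	buf := []byte("abc")
	a, _ := r.trampoline(vol, buf)
	b, _ := r.trampoline(def, buf)
	buf[0] = 'X' // the volatile result aliases buf; the default result does not
	fmt.Printf("%s %s\n", a, b)
}
```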

Vtab Filter and Update are deliberately out of scope here — those trampolines pass false explicitly. A future MR can extend the opt-in to vtab if there's demand.

Safety contract

The full docstring on FunctionImpl.VolatileArgs covers:

  • The retention rule (no storing the slice/string anywhere past the call, neither directly nor via something that captures it)
  • The failure mode: deterministic data corruption invisible to -race, where every retained value appears to hold the contents of the most recent row
  • Safe-copy idioms for callbacks that must keep values across rows:
    saved := append([]byte(nil), args[0].([]byte)...)
    saved := strings.Clone(args[0].(string))
  • "When in doubt, leave it off" guidance noting that the non-volatile path is already cheap after !114 (merged): it costs one make([]byte) per BLOB column plus one libc.GoString per TEXT column, with no fresh slice header per call
  • The fact that the flag is a no-op for INTEGER / FLOAT / NULL arguments
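The failure mode and the strings.Clone idiom from the list above can be reproduced without SQLite. The sketch below is a simulation under stated assumptions: feedRows and collect are hypothetical helpers, and a single reused Go buffer stands in for memory that the engine may overwrite between rows. Real buffer reuse in SQLite is not guaranteed to follow this exact pattern.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// feedRows simulates a driver that reuses one buffer for every row and hands
// the callback a zero-copy string view. Rows longer than 8 bytes are truncated.
func feedRows(rows []string, step func(arg string)) {
	buf := make([]byte, 8)
	for _, r := range rows {
		n := copy(buf, r)
		step(unsafe.String(&buf[0], n))
	}
}

// collect retains every argument across rows, optionally cloning it first.
func collect(rows []string, clone bool) []string {
	var kept []string
	feedRows(rows, func(arg string) {
		if clone {
			arg = strings.Clone(arg) // safe-copy idiom: detach from the buffer
		}
		kept = append(kept, arg)
	})
	return kept
}

func main() {
	rows := []string{"aaa", "bbb", "ccc"}
	// Without cloning, every retained string aliases the same buffer, so all
	// of them show the contents of the most recent row.
	fmt.Println(collect(rows, false))
	// With cloning, each retained string owns its bytes.
	fmt.Println(collect(rows, true))
}
```

The same reasoning applies to BLOB arguments, with append([]byte(nil), b...) in place of strings.Clone.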

Matching cross-references are added to AggregateFunction.Step and WindowInverse docstrings.

Benchmark

Same 3-arg no-op UDF as BenchmarkUDFArgsAllocation from !114 (merged) (1000 rows × INTEGER + 5-char TEXT + 3-byte BLOB), run on darwin/arm64 (Apple M3) with -count=3:

BenchmarkUDFArgsAllocation-8           208360 ns/op   70368 B/op   5754 allocs/op
BenchmarkUDFArgsAllocationVolatile-8   182444 ns/op   62371 B/op   3754 allocs/op

The drop of 2000 allocs/op (5754 to 3754) matches exactly the 1000 BLOB + 1000 TEXT copies that unsafe.Slice / unsafe.String skip, and ns/op falls by about 12% (208360 to 182444). The remaining 3754 allocs are upstream of the trampoline (statement preparation, row driver, result handling) and out of scope here.

Tests

  • TestVolatileArgsScalar — registers a scalar UDF with VolatileArgs: true, runs it over a 3-row table covering TEXT, BLOB, empty-string and NULL-coerced-to-empty-BLOB cases. The callback uses strings.Clone and append([]byte(nil), ...) to demonstrate the required safe-copy pattern.
  • TestVolatileArgsAggregate — same shape for the Step trampoline path; asserts the assembled per-row sequence matches the input.
  • All pre-existing UDF + aggregate + vtab tests stay green under -race. All non-volatile call sites continue to take the copy path.

Credit

The API shape and the "explicit opt-in, footgun behind a boolean" framing were suggested by @cznic in the !114 (merged) review thread. The decision to keep TEXT/BLOB copies on by default and require explicit opt-in is exactly the design proposed there.
