feat(ci): add cross-compilation and precompiled binaries for Go bindings

What does this MR do and why?

This MR adds cross-compilation for four platforms (Linux and Darwin, each on amd64 and arm64) and includes the resulting precompiled binaries in the release process. It updates the Go bindings to link against these precompiled libraries, so go get works without a Rust compiler installed.

Problem

Go projects need to use the chunker library via go get without installing a Rust compiler. The Go bindings from MR !77 (merged) work, but they currently require users to have Rust installed and to compile the library locally.

Solution

Build cross-platform Rust binaries during release and commit them to the repository, enabling go get to work without Rust dependencies.

Implementation

1. Platform-Specific Go Files

Created architecture-specific Go files with appropriate CGO directives:

  • chunker_darwin_arm64.go → Links to target/release/libparser_c_bindings.a (dev) → lib/darwin_arm64/libparser_c_bindings.a (prod)
  • chunker_darwin_amd64.go → Links to target/release/libparser_c_bindings.a (dev) → lib/darwin_amd64/libparser_c_bindings.a (prod)
  • chunker_linux_amd64.go → Links to target/release/libparser_c_bindings.a (dev) → lib/linux_amd64/libparser_c_bindings.a (prod)
  • chunker_linux_arm64.go → Links to target/release/libparser_c_bindings.a (dev) → lib/linux_arm64/libparser_c_bindings.a (prod)

Note: The platform files currently point at target/release/ paths for development and CI testing. They will be switched to the production lib/ paths after the first release.

2. Cross-Compilation CI Jobs

Added new CI jobs to .gitlab-ci.yml:

  • cross-build-linux-amd64 - Native Linux x86_64 build using Linux runners
  • cross-build-linux-arm64 - Native Linux ARM64 build using ARM64 runners
  • cross-build-darwin-amd64 - Cross-compile from macOS ARM64 to x86_64 using macOS runners
  • cross-build-darwin-arm64 - Native macOS ARM64 build using macOS runners

Technical approach: Uses native runners where possible instead of cross-compilation to avoid toolchain complexity, especially for tree-sitter C dependencies.

3. Enhanced Release Pipeline

Updated publish-release::manual job to:

  • Collect artifacts from all cross-compilation jobs
  • Create lib/ and include/ directory structure
  • Commit pre-built binaries to repository with [skip ci] tag
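
The collection step above can be sketched in Go. This is only an illustration, not the actual CI script: it assumes the job artifacts arrive as artifacts/<platform>/libparser_c_bindings.a plus artifacts/parser-c-bindings.h (the layout used in the local testing steps of this MR), and the helper names stage and runDemo are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

const (
	libName    = "libparser_c_bindings.a"
	headerName = "parser-c-bindings.h"
)

// platforms lists the four lib/ subdirectories named in this MR.
var platforms = []string{"linux_amd64", "linux_arm64", "darwin_amd64", "darwin_arm64"}

// writeFile writes data to path, creating parent directories first.
func writeFile(path string, data []byte) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

// copyFile copies src to dst (whole file in memory, acceptable for a sketch).
func copyFile(src, dst string) error {
	data, err := os.ReadFile(src)
	if err != nil {
		return err
	}
	return writeFile(dst, data)
}

// stage copies every platform library and the header from the assumed
// artifact layout into repoRoot/lib/<platform>/ and repoRoot/include/.
func stage(artifacts, repoRoot string) error {
	for _, p := range platforms {
		src := filepath.Join(artifacts, p, libName)
		dst := filepath.Join(repoRoot, "lib", p, libName)
		if err := copyFile(src, dst); err != nil {
			return fmt.Errorf("stage %s: %w", p, err)
		}
	}
	return copyFile(filepath.Join(artifacts, headerName),
		filepath.Join(repoRoot, "include", headerName))
}

// runDemo stages placeholder files in a temporary directory and returns
// the staged paths relative to the fake repository root.
func runDemo() ([]string, error) {
	tmp, err := os.MkdirTemp("", "stage-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(tmp)
	artifacts, repo := filepath.Join(tmp, "artifacts"), filepath.Join(tmp, "repo")
	inputs := []string{filepath.Join(artifacts, headerName)}
	for _, p := range platforms {
		inputs = append(inputs, filepath.Join(artifacts, p, libName))
	}
	for _, path := range inputs {
		if err := writeFile(path, []byte("placeholder")); err != nil {
			return nil, err
		}
	}
	if err := stage(artifacts, repo); err != nil {
		return nil, err
	}
	var staged []string
	err = filepath.WalkDir(repo, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil || d.IsDir() {
			return walkErr
		}
		rel, relErr := filepath.Rel(repo, path)
		if relErr != nil {
			return relErr
		}
		staged = append(staged, filepath.ToSlash(rel))
		return nil
	})
	return staged, err
}

func main() {
	staged, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "staging failed:", err)
		os.Exit(1)
	}
	for _, path := range staged {
		fmt.Println(path)
	}
}
```

A missing artifact for any platform makes stage return an error, which matches the intent that the release only commits a complete set of binaries.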

4. Semantic Release Configuration

Updated .releaserc.json to:

  • Include lib/**/* and include/**/* in git assets
  • Create .semantic-release-version file for CI coordination

Key Technical Decisions

Static Linking Strategy

  • Rust builds: Creates both .a (static) and .dylib/.so (dynamic) files
  • Go CGO: Uses -Wl,-Bstatic on Linux to force static linking, standard linking on macOS
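  • Platform differences: GNU ld on Linux selects the shared library when both .so and .a are present unless -Bstatic is given; Apple's ld does not support -Bstatic, so macOS relies on standard linking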
  • Result: Consistent static linking across all platforms without runtime dependencies
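
To make the linking difference concrete, here is a small sketch that assembles the kind of LDFLAGS value each platform file might carry. The exact flags in the MR's files are not reproduced here; the -Wl,-Bdynamic reset after the library and the ${SRCDIR}-relative paths are assumptions based on common cgo practice, and linkFlags is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// linkFlags returns a hypothetical cgo LDFLAGS value for one platform.
// libDir is the directory holding libparser_c_bindings.a.
func linkFlags(goos, libDir string) string {
	flags := []string{"-L" + libDir}
	if goos == "linux" {
		// Request the static archive explicitly, then switch back to
		// dynamic mode so the system libraries that follow link normally.
		flags = append(flags, "-Wl,-Bstatic", "-lparser_c_bindings", "-Wl,-Bdynamic")
	} else {
		// Apple's ld has no -Bstatic; the MR uses standard linking here.
		flags = append(flags, "-lparser_c_bindings")
	}
	return strings.Join(flags, " ")
}

func main() {
	fmt.Println("#cgo LDFLAGS:", linkFlags("linux", "${SRCDIR}/../../../lib/linux_amd64"))
	fmt.Println("#cgo LDFLAGS:", linkFlags("darwin", "${SRCDIR}/../../../lib/darwin_arm64"))
}
```

Once only the .a files are committed under lib/, the linker has no shared library to pick in that directory, so the explicit -Bstatic mainly matters for the target/release/ development paths where both kinds of library exist.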

Architecture-Specific Build Tags

  • Uses Go build tags (//go:build darwin && arm64) for precise platform targeting
  • Ensures correct binary is linked for each platform/architecture combination
  • Cleaner than runtime platform detection
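
The selection that the build tags perform can be reproduced with the standard go/build/constraint package. The sketch below is illustrative: only the darwin && arm64 tag is quoted in this MR, and the other three constraint lines are assumed by analogy. Note that the _darwin_arm64.go style file name suffixes already imply the same constraints.

```go
package main

import (
	"fmt"
	"go/build/constraint"
	"sort"
)

// platformFiles maps each platform file to its (partly assumed) build constraint.
var platformFiles = map[string]string{
	"chunker_darwin_amd64.go": "//go:build darwin && amd64",
	"chunker_darwin_arm64.go": "//go:build darwin && arm64",
	"chunker_linux_amd64.go":  "//go:build linux && amd64",
	"chunker_linux_arm64.go":  "//go:build linux && arm64",
}

// selectFiles returns the platform files whose constraint holds for goos/goarch.
func selectFiles(goos, goarch string) ([]string, error) {
	var selected []string
	for name, line := range platformFiles {
		expr, err := constraint.Parse(line)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", name, err)
		}
		// A tag is satisfied only when it names the target OS or architecture.
		matches := expr.Eval(func(tag string) bool {
			return tag == goos || tag == goarch
		})
		if matches {
			selected = append(selected, name)
		}
	}
	sort.Strings(selected)
	return selected, nil
}

func main() {
	targets := [][2]string{{"darwin", "arm64"}, {"linux", "amd64"}, {"windows", "amd64"}}
	for _, t := range targets {
		files, err := selectFiles(t[0], t[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s/%s -> %v\n", t[0], t[1], files)
	}
}
```

Exactly one file is selected for each of the four supported targets, and none for a target such as windows/amd64, so no precompiled library is linked there.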

CI Runner Strategy

  • Native builds: Uses platform-native runners where possible (Linux ARM64, macOS ARM64)
  • Cross-compilation: Only for macOS Intel (ARM64 → x86_64) due to runner availability
  • Reasoning: Avoids complex cross-compilation toolchain setup for tree-sitter C dependencies

Target Repository Structure

gitlab-code-parser/
├── lib/                           # Pre-built binaries (committed)
│   ├── linux_amd64/
│   │   └── libparser_c_bindings.a
│   ├── linux_arm64/
│   │   └── libparser_c_bindings.a
│   ├── darwin_amd64/
│   │   └── libparser_c_bindings.a
│   └── darwin_arm64/
│       └── libparser_c_bindings.a
├── include/                       # Header file (committed)
│   └── parser-c-bindings.h
└── bindings/go/chunker/
    ├── chunker.go                 # Platform-agnostic code
    ├── chunker_darwin_amd64.go    # macOS x86_64 CGO directives
    ├── chunker_darwin_arm64.go    # macOS ARM64 CGO directives
    ├── chunker_linux_amd64.go     # Linux x86_64 CGO directives
    └── chunker_linux_arm64.go     # Linux ARM64 CGO directives
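
The layout follows a <GOOS>_<GOARCH> naming convention that also lines up with the cross-build job names. As a rough sketch (platformInfo is a hypothetical helper, not part of the bindings), the mapping for a given machine looks like this:

```go
package main

import (
	"fmt"
	"runtime"
)

// supported lists the GOOS/GOARCH pairs this MR builds binaries for.
var supported = map[string]bool{
	"linux/amd64":  true,
	"linux/arm64":  true,
	"darwin/amd64": true,
	"darwin/arm64": true,
}

// platformInfo returns the CI job name and the committed library path
// for a platform, or an error when no precompiled library exists.
func platformInfo(goos, goarch string) (job, libPath string, err error) {
	if !supported[goos+"/"+goarch] {
		return "", "", fmt.Errorf("no precompiled library for %s/%s", goos, goarch)
	}
	job = fmt.Sprintf("cross-build-%s-%s", goos, goarch)
	libPath = fmt.Sprintf("lib/%s_%s/libparser_c_bindings.a", goos, goarch)
	return job, libPath, nil
}

func main() {
	job, libPath, err := platformInfo(runtime.GOOS, runtime.GOARCH)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("job: %s\nlibrary: %s\n", job, libPath)
}
```

Running it on a developer machine prints which cross-build job produced the matching artifact and where the library is expected after a release.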

Testing Strategy

What Can Be Tested in MR

  1. Existing CI Pipeline - Verify no regressions in current jobs
  2. Local Go Bindings - Architecture-specific files compile correctly
  3. CI Configuration - YAML syntax validation and job dependencies

What Requires Post-Merge Testing

  1. Cross-Compilation Jobs - New CI jobs only run on main branch
  2. Release Pipeline - Binary commit process needs manual release trigger
  3. End-to-End Workflow - go get testing requires committed binaries

Related Issues

Closes gitlab-org/gitlab#536081 (closed)

Testing

Completed Testing (Pre-Merge)

  1. Cross-Platform CI Jobs

    • All 4 cross-compilation jobs (Linux x64/ARM64, macOS x64/ARM64) passing
    • Native runner builds working correctly
    • Artifact generation and collection verified
  2. Go Unit Tests

    • Fixed static linking issues (-Wl,-Bstatic for Linux)
    • Platform-specific CGO directives working correctly
    • Tests passing with development paths (target/release/)
  3. Rust Compilation Fixes

    • Fixed ARM64 Linux compilation (c_char vs i8/u8 type compatibility)
    • All cross-compilation targets building successfully
  4. CI Pipeline Organization

    • All cross-build jobs correctly placed in build stage
    • Job dependencies and artifacts properly configured
    • Release job ready to collect all platform binaries

Local Testing Available (Pre-Merge)

Complete end-to-end testing can be performed locally before merging:

  1. Set up test environment (from project root):

    mkdir -p local-test/lib local-test/include
  2. Download CI artifacts from your platform's cross-build job:

    • macOS ARM64: cross-build-darwin-arm64 job
    • macOS Intel: cross-build-darwin-amd64 job
    • Linux x86_64: cross-build-linux-amd64 job
    • Linux ARM64: cross-build-linux-arm64 job
  3. Copy artifacts (adjust for your platform):

    # macOS ARM64 example
    mkdir -p local-test/lib/darwin_arm64
    cp ~/Downloads/artifacts/darwin_arm64/libparser_c_bindings.a local-test/lib/darwin_arm64/
    cp ~/Downloads/artifacts/parser-c-bindings.h local-test/include/
    
    # Linux x86_64 example
    mkdir -p local-test/lib/linux_amd64
    cp ~/Downloads/artifacts/linux_amd64/libparser_c_bindings.a local-test/lib/linux_amd64/
    cp ~/Downloads/artifacts/parser-c-bindings.h local-test/include/
    
    # Linux ARM64 example  
    mkdir -p local-test/lib/linux_arm64
    cp ~/Downloads/artifacts/linux_arm64/libparser_c_bindings.a local-test/lib/linux_arm64/
    cp ~/Downloads/artifacts/parser-c-bindings.h local-test/include/
    
    # macOS Intel example
    mkdir -p local-test/lib/darwin_amd64
    cp ~/Downloads/artifacts/darwin_amd64/libparser_c_bindings.a local-test/lib/darwin_amd64/
    cp ~/Downloads/artifacts/parser-c-bindings.h local-test/include/
  4. Create production-style Go module:

    cd local-test
    mkdir chunker
    cp ../bindings/go/chunker/chunker.go chunker/
    cp ../bindings/go/chunker/chunker_test.go chunker/
  5. Create platform-specific CGO file (example for macOS ARM64):

    // chunker/chunker_darwin_arm64.go
    //go:build darwin && arm64

    package chunker

    /*
    #cgo CFLAGS: -I${SRCDIR}/../include
    #cgo LDFLAGS: -L${SRCDIR}/../lib/darwin_arm64 -lparser_c_bindings
    */
    import "C"
  6. Test static library linking:

    go mod init gitlab.com/gitlab-org/rust/gitlab-code-parser/bindings/go
    go mod tidy
    go test -v ./chunker
  7. Simulate an external go get:

    mkdir ../external-test && cd ../external-test
    go mod init test-chunker
    echo 'replace gitlab.com/gitlab-org/rust/gitlab-code-parser/bindings/go => ../local-test' >> go.mod

    Create a main.go file with the following content:

    package main
    
    import (
    	"fmt"
    	"log"
    	
    	"gitlab.com/gitlab-org/rust/gitlab-code-parser/bindings/go/chunker"
    )
    
    func main() {
    	fmt.Println("Testing external Go get simulation...")
    	
    	// Create a size-based chunker
    	c, err := chunker.NewChunkerSize(1024, 0)
    	if err != nil {
    		log.Fatal("Failed to create chunker:", err)
    	}
    	defer c.Close()
    	
    	// Test chunking some simple code
    	testCode := `package main
    
    import "fmt"
    
    func main() {
        fmt.Println("Hello, World!")
    }`
    	
    	c.AddFile("main.go", testCode)
    	
    	err = c.ChunkFiles()
    	if err != nil {
    		log.Fatal("Failed to chunk files:", err)
    	}
    	
    	chunkCount := 0
    	for chunk := range c.Chunks() {
    		chunkCount++
    		fmt.Printf("  Chunk %d: %d-%d bytes (%s)\n", chunkCount, chunk.StartByte, chunk.EndByte, chunk.FilePath)
    	}
    	
    	fmt.Printf("✅ Successfully chunked code into %d chunks\n", chunkCount)
    	fmt.Println("🎉 External Go get simulation successful!")
    }

    Then run the test:

    go mod tidy
    go run main.go

Expected Result: The program chunks the sample code successfully while linking against the production-style lib/ and include/ layout

Post-Merge Testing Required

Phase 1: Validate CI Pipeline

  • Merge MR to main branch
  • Trigger manual release: publish-release::manual
  • Verify all cross-compilation jobs succeed
  • Check that binaries are committed to lib/ and include/

Phase 2: Test Real Go Get Workflow

  • Test from external project:
    go mod init test-chunker
    go get -u gitlab.com/gitlab-org/rust/gitlab-code-parser/bindings/go
  • Verify that it works without a local replace directive

Phase 3: Update Production Paths

  • Update Go platform files to use lib/ instead of target/release/ paths
  • Verify CI tests still pass with production paths

Success Criteria

  • go get gitlab.com/gitlab-org/rust/gitlab-code-parser/bindings/go works without Rust installation
  • Cross-compilation produces working binaries for all 4 platforms
  • Release process automatically commits binaries
  • No impact on existing Rust development workflow

Performance Analysis

This MR primarily affects build-time and distribution, not runtime performance:

Build-Time Impact

  • Cross-compilation jobs: Adds ~10-15 minutes to CI pipeline for 4 additional platform builds
  • Release process: Minimal overhead for binary copying and git operations
  • Binary size: Each .a file is ~60MB, total repository size increase ~240MB
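
The size figure is simple arithmetic (4 platforms × ~60MB ≈ 240MB). The sketch below extends it to several releases under a stated assumption: every release commits a fresh copy of all four archives and git cannot share data between versions. That is an upper bound, since packfile compression may reduce the real growth, and growthMB is a hypothetical helper.

```go
package main

import "fmt"

// growthMB estimates repository growth in MB when every release commits a
// fresh copy of each platform library (upper bound: ignores git compression).
func growthMB(platforms, libSizeMB, releases int) int {
	return platforms * libSizeMB * releases
}

func main() {
	for _, releases := range []int{1, 5, 10} {
		fmt.Printf("%2d release(s): ~%d MB\n", releases, growthMB(4, 60, releases))
	}
}
```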

Runtime Impact

  • No performance regression: Uses the same static libraries as before
  • Memory usage: Same zero-copy design with proper memory pinning
  • Static linking: Eliminates dynamic library loading overhead

Performance Checklist

  • Memory allocations: No changes to existing zero-copy Go ↔️ Rust interface
  • Profiling: Existing cargo bench results remain valid (no runtime changes)
  • Zero-copy operations: Maintained &str usage and slice references
  • Data structures: No changes to core data structures or algorithms
  • Static linking: Improved performance by eliminating dynamic library overhead
  • Build optimizations: Release builds use existing optimization flags
  • Benchmarking: Post-merge validation with existing benchmark suite

Related

Closes #57 (closed)

Edited by Vitali Tatarintev
