@kyleconroy kyleconroy commented Dec 18, 2025

Summary

This PR introduces a database-only analyzer mode that uses the database connection for query analysis instead of sqlc's internal catalog. This provides more accurate type inference by leveraging the actual database schema.

Configuration

The feature is enabled by setting analyzer.database: only and requires the analyzerv2 experiment:

version: "2"
sql:
  - engine: postgresql # or sqlite
    schema: "schema.sql"
    queries: "query.sql"
    database:
      managed: true # or uri: "postgres://..."
    analyzer:
      database: "only"
    gen:
      go:
        package: "db"
        out: "db"
Run with SQLCEXPERIMENT=analyzerv2 to enable:

SQLCEXPERIMENT=analyzerv2 sqlc generate
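
The gating rule above (the config value plus the experiment flag) can be sketched in Go as follows. This is a minimal illustration, not sqlc's implementation: the helpers `experimentEnabled` and `databaseOnlyActive` are hypothetical names, and treating SQLCEXPERIMENT as a comma-separated list is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// experimentEnabled reports whether name appears in a comma-separated
// experiment list such as the value of SQLCEXPERIMENT.
// The comma-separated format is an assumption of this sketch.
func experimentEnabled(envValue, name string) bool {
	for _, part := range strings.Split(envValue, ",") {
		if strings.TrimSpace(part) == name {
			return true
		}
	}
	return false
}

// databaseOnlyActive is a hypothetical helper: the mode is active only when
// the config requests "only" AND the analyzerv2 experiment is switched on.
func databaseOnlyActive(analyzerDatabase, envValue string) bool {
	return analyzerDatabase == "only" && experimentEnabled(envValue, "analyzerv2")
}

func main() {
	env := os.Getenv("SQLCEXPERIMENT")
	fmt.Println("database-only mode active:", databaseOnlyActive("only", env))
}
```

With this shape, setting `analyzer.database: only` without the experiment flag leaves the mode inactive.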

Key Features

- Database-backed type inference: Uses PREPARE statements to get accurate column types
- Star expansion: Expands SELECT * and RETURNING * using the database
- Syntax validation: Schema files are parsed to validate SQL syntax and catch errors early
- PostgreSQL and SQLite support: Works with both database engines
- Experiment-gated: Requires SQLCEXPERIMENT=analyzerv2 to activate

How It Works

1. Schema migrations are parsed for syntax validation, and any syntax errors are reported
2. Schema contents are passed to the database connection
3. The internal catalog remains empty; the database is the source of truth
4. Queries are analyzed by preparing them against the database
5. Star expressions are expanded using column names from prepared statements
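
The last step can be illustrated with a deliberately naive Go sketch. It assumes the column names have already been obtained from a prepared statement and replaces only the first `*` in the query text; the function `expandStar` is hypothetical, and a real expander would work on the parsed query so that cases such as `count(*)`, multiplication, or string literals are handled correctly.

```go
package main

import (
	"fmt"
	"strings"
)

// expandStar replaces the first "*" in query with the given column names.
// This is a text-level illustration only; it does not parse SQL and would
// mis-handle queries where the first "*" is not a star expression.
func expandStar(query string, columns []string) string {
	if len(columns) == 0 {
		return query // nothing to expand with
	}
	return strings.Replace(query, "*", strings.Join(columns, ", "), 1)
}

func main() {
	// Column names as they might be reported for a prepared statement
	// (hypothetical table and columns).
	cols := []string{"id", "name", "bio"}
	fmt.Println(expandStar("SELECT * FROM authors", cols))
	fmt.Println(expandStar("DELETE FROM authors WHERE id = $1 RETURNING *", cols))
}
```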

Changes

- Added analyzer.database config option with true, false, and "only" values
- Added analyzerv2 experiment flag
- Extended Analyzer interface with EnsureConn and GetColumnNames methods
- Added SQLite analyzer support for database-only mode
- Refactored compiler to use unified analyzer interface

Test plan

- PostgreSQL tests pass (accurate_cte, accurate_enum, accurate_star_expansion)
- SQLite tests pass (accurate_sqlite)
- Schema syntax errors are properly reported
- Feature only activates with experiment flag

kyleconroy and others added 7 commits November 30, 2025 20:38
Add an optional `analyzer.accurate: true` mode for PostgreSQL that bypasses
the internal catalog and uses only database-backed analysis.

Key features:
- Uses database PREPARE for all type resolution (columns, parameters)
- Uses expander package for SELECT * and RETURNING * expansion
- Queries pg_catalog to build catalog structures for code generation
- Skips internal catalog building from schema files

Configuration:
```yaml
sql:
  - engine: postgresql
    database:
      uri: "postgres://..."  # or managed: true
    analyzer:
      accurate: true
```

This mode requires a database connection and the schema must exist in the
database. It provides more accurate type information for complex queries.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Add three end-to-end test cases for the accurate analyzer mode:

1. accurate_star_expansion - Tests SELECT *, INSERT RETURNING *, UPDATE RETURNING *, DELETE RETURNING *
2. accurate_enum - Tests enum type introspection from pg_catalog
3. accurate_cte - Tests CTE (Common Table Expression) with star expansion

All tests use the managed-db context which requires Docker to run
PostgreSQL containers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Update expected output files to match actual sqlc generate output:
- Fix parameter naming (Column1, Column2, dollar_1)
- Fix nullability types (sql.NullString, sql.NullInt32)
- Fix CTE formatting (single line)
- Fix query semicolons

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Tests CTE using VALUES clause with column aliasing to verify
accurate analyzer handles inline table expressions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The VALUES clause was incorrectly formatting multiple rows as a single
row with multiple columns. For example:
  VALUES ('A'), ('B'), ('C')
was being formatted as:
  VALUES ('A', 'B', 'C')

This caused the star expander to think the VALUES table had 3 columns
instead of 1, resulting in incorrect SELECT * expansion.

The fix properly iterates over each row in ValuesLists and wraps each
in parentheses.
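
The corrected behavior can be sketched in Go as below. This is an illustration under assumptions, not the code from this commit: rows are modeled as `[][]string` of already-formatted expressions standing in for ValuesLists, and `formatValues` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// formatValues renders a VALUES clause, wrapping each row in its own
// parentheses so that three one-column rows stay three rows instead of
// collapsing into a single three-column row.
func formatValues(rows [][]string) string {
	parts := make([]string, 0, len(rows))
	for _, row := range rows {
		parts = append(parts, "("+strings.Join(row, ", ")+")")
	}
	return "VALUES " + strings.Join(parts, ", ")
}

func main() {
	rows := [][]string{{"'A'"}, {"'B'"}, {"'C'"}}
	fmt.Println(formatValues(rows))
}
```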

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

… experiment

This change refactors the "accurate analyzer mode" feature:

1. Rename config option from `analyzer.accurate: true` to
   `analyzer.database: only` - a third option in addition to true/false

2. Gate the feature behind the `analyzerv2` experiment flag. The feature
   is only enabled when:
   - `analyzer.database: only` is set in the config
   - `SQLCEXPERIMENT=analyzerv2` environment variable is set

3. Update JSON schemas to support boolean or "only" for analyzer.database

4. Add experiment tests for analyzerv2 flag

5. Update end-to-end test configs and expected outputs

The database-only mode skips building the internal catalog from schema
files and instead relies entirely on the database for type resolution
and star expansion.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

@dosubot dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files.) and 🔧 golang labels Dec 18, 2025

…nly)

This extends the database-only analyzer mode to support SQLite in addition
to PostgreSQL:

1. Add EnsureConn, GetColumnNames, and IntrospectSchema methods to the
   SQLite analyzer for database-only mode functionality

2. Update compiler to handle SQLite database-only mode:
   - Add sqliteAnalyzer field to Compiler struct
   - Initialize SQLite analyzer when database-only mode is enabled
   - Build catalog from SQLite database via PRAGMA table_info

3. Add SQLite end-to-end test case for database-only mode

The SQLite database-only mode uses PRAGMA table_info to introspect
tables and columns, and prepares queries to get column names for
star expansion.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

- Add EnsureConn and GetColumnNames methods to Analyzer interface
- Remove engine-specific pgAnalyzer and sqliteAnalyzer fields from compiler
- Use unified analyzer interface for database connection initialization
- Keep parsing schema files to build catalog, only use database for star expansion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

In database-only mode, parse the schema migrations to validate syntax
and collect them for the database connection, but skip updating the
catalog. The database will be the source of truth for schema information.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@kyleconroy kyleconroy changed the title from "feat(postgresql): add accurate analyzer mode for database-only analysis" to "feat(postgresql): add analyzerv2 experiment for database-only analysis" Dec 22, 2025
@kyleconroy kyleconroy merged commit 68b2089 into main Dec 22, 2025
13 checks passed
@kyleconroy kyleconroy deleted the claude/accurate-analyzer-mode-UeNm6 branch December 22, 2025 23:11

3 participants