Capping the Cost of AI-Generated Database Code

AI-paired teams can rack up surprising database bills: bad queries, runaway loops, expensive review pipelines. Here's how to put guardrails in place without slowing the team down.

AI-paired teams in 2026 keep getting surprised by two recurring bills: the database bill (the agent shipped a query that scans 100M rows every minute) and the AI bill (the review pipeline runs on every change, however small). Both are preventable with simple guardrails.

Where the costs actually come from

  • Sequential scans the agent didn't mean to write. A missing index turns a 5ms query into a 5s query, which turns into a 50% CPU spike under load.
  • N+1 patterns inside server actions. The agent generates a route that loads 50 related records in a loop. Each one is a separate query.
  • AI review on every PR, including formatting-only PRs. 5,000-token reviews on a Friday afternoon at the end of the month add up.
  • Runaway agent loops. An agent that gets confused and keeps retrying produces a huge token bill before someone notices.
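
The N+1 pattern in the list above usually goes away once the per-record lookups are replaced by one batched query plus an in-memory grouping step. A minimal sketch, assuming hypothetical `Post` and `Comment` row shapes and a hypothetical `comments` table; the SQL in the comment is illustrative only:

```typescript
// Hypothetical row shapes, for illustration only.
type Post = { id: number; title: string };
type Comment = { id: number; postId: number; body: string };

// Instead of one query per post inside a loop, fetch all comments at once, e.g.
//   SELECT id, post_id, body FROM comments WHERE post_id = ANY($1)
// and attach them in memory with the helper below.
function attachComments(
  posts: Post[],
  comments: Comment[],
): Array<Post & { comments: Comment[] }> {
  // Group comments by parent post id in a single pass.
  const byPost = new Map<number, Comment[]>();
  for (const c of comments) {
    const bucket = byPost.get(c.postId);
    if (bucket) bucket.push(c);
    else byPost.set(c.postId, [c]);
  }
  // Posts without comments get an empty array rather than undefined.
  return posts.map((p) => ({ ...p, comments: byPost.get(p.id) ?? [] }));
}
```

With 50 posts this is two queries (posts, then comments) instead of 51.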

The bad-query tax

The single biggest database cost spike from AI work is missing indexes on new columns. Mitigation: review pg_stat_statements regularly, sort by total time, and look for queries you did not expect to see near the top.
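
One way to make that review repeatable is to compare the current top queries against the set seen at the previous review. A sketch under stated assumptions: the column names are those of pg_stat_statements on PostgreSQL 13 and later, the `StatRow` shape depends on how your driver returns values, and the baseline set of query ids is a hypothetical artifact you would store yourself.

```typescript
// Column names as exposed by pg_stat_statements on PostgreSQL 13+; older
// versions name the timing column total_time instead of total_exec_time.
const TOP_QUERIES_SQL = `
  SELECT queryid, query, total_exec_time
  FROM pg_stat_statements
  ORDER BY total_exec_time DESC
  LIMIT 20`;

// Assumed row shape; queryid is kept as a string because it is a 64-bit value.
type StatRow = { queryid: string; query: string; total_exec_time: number };

// Returns rows, heaviest first, whose queryid was not present at the last review.
function unexpectedQueries(rows: StatRow[], knownIds: Set<string>): StatRow[] {
  return [...rows]
    .sort((a, b) => b.total_exec_time - a.total_exec_time)
    .filter((r) => !knownIds.has(r.queryid));
}
```

Anything this returns is a query that reached the top of the list since the last look, which is usually where a missing index shows up first.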

Add a CI step that runs EXPLAIN on the queries introduced in a PR (via your test suite, or via a static analysis tool that finds db.select calls). Flag anything that comes back with Seq Scan on a large table.
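
The flagging part of that CI step can be a small tree walk over the output of `EXPLAIN (FORMAT JSON)`. This is a sketch, not a complete tool: `PlanNode` models only the fields used here, and `largeTables` is a hypothetical list your team would maintain, since the plan itself does not say which tables are large.

```typescript
// Subset of the node shape produced by EXPLAIN (FORMAT JSON).
interface PlanNode {
  "Node Type": string;
  "Relation Name"?: string;
  Plans?: PlanNode[];
}

// Walks the plan tree and returns the names of large tables read by a Seq Scan.
// `largeTables` is a hypothetical, team-maintained set of table names.
function seqScansOnLargeTables(root: PlanNode, largeTables: Set<string>): string[] {
  const hits: string[] = [];
  const stack: PlanNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    const rel = node["Relation Name"];
    if (node["Node Type"] === "Seq Scan" && rel !== undefined && largeTables.has(rel)) {
      hits.push(rel);
    }
    // Child nodes (joins, subplans) live under the "Plans" key.
    for (const child of node.Plans ?? []) stack.push(child);
  }
  return hits;
}
```

The JSON output is a one-element array whose object carries the tree under a `Plan` key, so the value to pass as `root` is `result[0].Plan`. A non-empty return value fails the check.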

AI review token bills

AI code review is excellent value, but the cost adds up when:

  • Every PR runs review (including dependency bumps).
  • The review prompt is huge (entire codebase as context).
  • The model is large (Claude Opus / GPT-4 Turbo) when a small model would do.

Practical caps:

  • Run review only on PRs that touch src/db/, drizzle/, or src/lib/auth/.
  • Hard token cap per PR (~50k input + 10k output is plenty for one PR).
  • Use a smaller model (Claude Sonnet, GPT-4o-mini) for first-pass review; escalate to a big model only if the small one flags something.
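
The three caps above fit in one small decision function that the pipeline calls before spending any tokens. A minimal sketch, assuming the pipeline already knows the changed file paths and an estimated input token count; the `"small"` and `"large"` tiers are placeholders for whichever models you use, and skipping an over-budget review (rather than trimming its context) is just one possible policy.

```typescript
// Path prefixes and token caps taken from the lists above.
const REVIEW_PREFIXES = ["src/db/", "drizzle/", "src/lib/auth/"];
const MAX_INPUT_TOKENS = 50000;
const MAX_OUTPUT_TOKENS = 10000;

type ReviewPlan =
  | { run: false; reason: string }
  | { run: true; model: "small" | "large"; maxOutputTokens: number };

// Decides whether to review a PR and with which model tier.
// `smallModelFlagged` is undefined until the first pass has run.
function planReview(
  changedFiles: string[],
  inputTokens: number,
  smallModelFlagged?: boolean,
): ReviewPlan {
  const touchesSensitive = changedFiles.some((file) =>
    REVIEW_PREFIXES.some((prefix) => file.startsWith(prefix)),
  );
  if (!touchesSensitive) return { run: false, reason: "no sensitive paths changed" };
  if (inputTokens > MAX_INPUT_TOKENS) return { run: false, reason: "input exceeds token cap" };
  // Escalate to the large tier only after the small one has flagged something.
  return {
    run: true,
    model: smallModelFlagged ? "large" : "small",
    maxOutputTokens: MAX_OUTPUT_TOKENS,
  };
}
```

A dependency bump that only touches `package.json` returns `run: false` and costs nothing.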

Runaway review loops

A specific failure mode: an agent that doesn't know what to do and keeps retrying. We've seen this produce $40 token bills on a single PR. Mitigations:

  • Cap the number of tool calls per agent run. 6-8 is plenty for database operations.
  • Cap the wall-clock time per run. 60 seconds is a reasonable upper bound.
  • Alert on token-bill anomalies. If a single run exceeds your typical p99, page someone.
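
The first two caps can be enforced by a budget object the agent loop consults before each tool call, and the third by a simple percentile check. This is a sketch under stated assumptions: `RunBudget` and `exceedsP99` are hypothetical names, the clock is injectable so the logic can be tested, and the p99 uses the nearest-rank method over whatever history of per-run token counts you keep.

```typescript
// Defaults from the list above: at most 8 tool calls and 60 seconds per run.
class RunBudget {
  private toolCalls = 0;
  private readonly startedAt: number;
  private readonly maxToolCalls: number;
  private readonly maxMillis: number;
  private readonly now: () => number;

  constructor(maxToolCalls = 8, maxMillis = 60000, now: () => number = () => Date.now()) {
    this.maxToolCalls = maxToolCalls;
    this.maxMillis = maxMillis;
    this.now = now;
    this.startedAt = now();
  }

  // Call before every tool invocation; throws once either cap is exceeded.
  beforeToolCall(): void {
    if (this.now() - this.startedAt > this.maxMillis) {
      throw new Error("agent run exceeded wall-clock cap");
    }
    if (this.toolCalls >= this.maxToolCalls) {
      throw new Error("agent run exceeded tool-call cap");
    }
    this.toolCalls += 1;
  }
}

// Nearest-rank p99 of past per-run token counts; true means "worth an alert".
function exceedsP99(history: number[], current: number): boolean {
  if (history.length === 0) return false;
  const sorted = [...history].sort((a, b) => a - b);
  const idx = Math.ceil((99 * sorted.length) / 100) - 1;
  return current > sorted[idx];
}
```

Throwing rather than returning a flag means a confused agent cannot ignore the cap: the run ends at the first call over budget.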

The five guardrails

  1. Statement timeout on the application role. 5 seconds.
  2. pg_stat_statements review on a schedule. Weekly, someone looks at the top 20 queries by total time and asks "is this what we expected?"
  3. AI review gated by path filter. Only run on PRs that touch the database / auth / migrations directories.
  4. Token cap per AI run. Both for the review pipeline and any in-app agent (chat assistant, copilot).
  5. Audit log + Sentry tracing on agent-initiated writes. So when something looks weird, you can prove it was the agent and roll it back.
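
Guardrail 1 is a one-line role setting in Postgres. The helper below is a sketch that only builds the statement; the role name `app_user` in the usage note is a hypothetical example, and how you execute the SQL (migration, admin console, script) is up to you.

```typescript
// Builds the SQL for the statement-timeout guardrail. The default of 5000 ms
// matches the 5-second figure in the list above.
function statementTimeoutSql(role: string, millis: number = 5000): string {
  if (!Number.isInteger(millis) || millis <= 0) {
    throw new Error("timeout must be a positive whole number of milliseconds");
  }
  // Quote the identifier and double any embedded quotes, per SQL identifier rules.
  const quotedRole = `"${role.replace(/"/g, '""')}"`;
  return `ALTER ROLE ${quotedRole} SET statement_timeout = '${millis}ms'`;
}
```

Calling `statementTimeoutSql("app_user")` yields `ALTER ROLE "app_user" SET statement_timeout = '5000ms'`. Role-level settings are picked up at login, so connections that are already open keep their previous timeout until they reconnect.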

None of these slow the team down meaningfully. They cap the worst cases. On the day the agent goes feral and nobody notices until morning, the bill is in the hundreds, not the thousands.
