AI-Native SaaS Stack

A multi-tenant SaaS skeleton with LLM gateway and pgvector wired in by default

The AI-Native SaaS Stack is the scaffold under every AppLiaison product whose user-facing surface is a chat, a copilot, or a generation flow. It treats LLMs as a first-class dependency, not a bolt-on: every model request is logged, cost-tracked, and replayable, and vector retrieval lives in the same Postgres database as the rest of the app, so there is no second write path for embeddings to drift out of sync.

Architecture

Architecture variant: ai-native

Frontend
  • Next.js 14 (App Router, RSC)
  • tRPC client
  • Tailwind + headless primitives

Backend
  • Next.js Route Handlers
  • tRPC server
  • Postgres 16 with pgvector

Data + infra
  • Vercel (web)
  • Neon or Supabase (Postgres + pgvector)
  • Cloudflare R2 (uploads)

Integrations
  • Stripe Billing
  • Clerk (auth + multi-tenant orgs)
  • Resend (transactional email)
  • PostHog (product analytics)

When to choose this stack

  • The product is a chat, copilot, or generation surface end-to-end
  • You need retrieval-augmented generation over customer-private documents
  • You want every model call cost-attributed and replayable
  • LLM vendor lock-in is something you want to avoid from day one
  • Output volume is high enough to justify caching rather than purely on-demand calls

What's NOT included

  • On-device / air-gapped inference
  • Fine-tuning pipelines on customer data
  • Custom model training infrastructure
  • Real-time speech (TTS / STT) — handled by the Realtime Mobile Stack instead

How the pieces fit

A request enters at the Next.js edge, runs through tRPC for type-safe routing, and hits the LLM gateway with a provider-agnostic prompt object. Retrieval happens in pgvector inside the same Postgres database that holds tenant data — no separate vector service to keep in sync.
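
A minimal sketch of what that tenant-scoped retrieval could look like. The `documents` table and its columns are assumptions for illustration, not the stack's actual schema; `<=>` is pgvector's cosine-distance operator:

```typescript
// Builds a tenant-scoped similarity query for pgvector. The `documents`
// table, column names, and vector dimension are illustrative assumptions.
export function buildRetrievalQuery(
  tenantId: string,
  embedding: number[],
  limit = 5,
) {
  return {
    // `<=>` is pgvector's cosine-distance operator; smaller = more similar.
    text: `SELECT id, content, embedding <=> $2::vector AS distance
             FROM documents
            WHERE tenant_id = $1
            ORDER BY embedding <=> $2::vector
            LIMIT $3`,
    // pgvector accepts the '[0.1,0.2,...]' text format for vector input.
    values: [tenantId, JSON.stringify(embedding), limit],
  };
}
```

Because the tenant filter and the vector search run in one query against one database, there is no separate vector service whose contents can lag behind the tenant data.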

Every model call is wrapped in a span carrying tenant ID, prompt hash, model, token counts, and cost, so per-tenant invoicing, model A/B tests, and cache invalidation all work off the same underlying log.
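
A sketch of that span wrapper, assuming hypothetical field names and an illustrative pricing table (neither is the stack's actual schema):

```typescript
import { createHash } from "node:crypto";

// Hypothetical span record; field names are assumptions, not a real schema.
interface ModelCallSpan {
  tenantId: string;
  promptHash: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  durationMs: number;
}

// Illustrative per-1K-token prices; real prices vary by model.
const PRICE_PER_1K = { input: 0.003, output: 0.015 };

// Wraps any model call and emits one span per call to the provided logger.
export async function withSpan(
  tenantId: string,
  model: string,
  prompt: string,
  call: () => Promise<{ text: string; inputTokens: number; outputTokens: number }>,
  log: (span: ModelCallSpan) => void,
): Promise<string> {
  const start = Date.now();
  const result = await call();
  log({
    tenantId,
    promptHash: createHash("sha256").update(prompt).digest("hex"),
    model,
    inputTokens: result.inputTokens,
    outputTokens: result.outputTokens,
    costUsd:
      (result.inputTokens / 1000) * PRICE_PER_1K.input +
      (result.outputTokens / 1000) * PRICE_PER_1K.output,
    durationMs: Date.now() - start,
  });
  return result.text;
}
```

Invoicing, A/B comparison, and cache invalidation can then all be queries over the same span log rather than three separate instrumentation paths.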

Why these choices

Postgres + pgvector instead of a dedicated vector DB: tenant-scoped row-level security and a single backup story matter more than a fractional improvement on recall metrics for the document sizes most customer apps see.
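
A sketch of what that tenant-scoped row-level security could look like as a migration. The table name, policy name, and the `app.tenant_id` session setting are assumptions for illustration:

```typescript
// Hypothetical migration string: enables RLS so every query on the
// connection only sees rows for the tenant set via `app.tenant_id`.
export const tenantRlsMigration = `
  ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

  -- Rows (and their embeddings) are invisible outside the current tenant.
  CREATE POLICY tenant_isolation ON documents
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
`;
```

With RLS on the same table that holds the embeddings, the vector index inherits the tenant boundary for free, which is the isolation story a separate vector DB would have to re-implement.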

LLM gateway in front of provider SDKs: every app on this stack can switch from Claude to OpenAI to a self-hosted model with a config change, and an outage in one provider does not take the whole product down.
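
A minimal sketch of that failover behavior. The `PromptRequest` shape and provider names are assumptions; real adapters would wrap the respective SDKs:

```typescript
// Provider-agnostic prompt object; the field names are illustrative.
export interface PromptRequest {
  system: string;
  user: string;
  maxTokens: number;
}

export type Provider = (req: PromptRequest) => Promise<string>;

// Switching vendors is a config change: point `primary` at a different key
// in the registry. An outage on the primary degrades to the fallback
// instead of taking the product down.
export async function complete(
  providers: Record<string, Provider>,
  primary: string,
  fallback: string,
  req: PromptRequest,
): Promise<string> {
  try {
    return await providers[primary](req);
  } catch {
    return await providers[fallback](req);
  }
}
```

Because every call site depends only on `PromptRequest`, no application code changes when a provider is added, removed, or demoted to fallback.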

tRPC over REST or GraphQL: end-to-end TypeScript inference for a small team is more valuable than the ecosystem benefits of either alternative.

Apps built on this stack