the ai-native dev stack

for people building agents and llm products, not just bolting a chatbot onto a crud app.

$50–200/mo plus llm api spend · 7 tools

most "ai stack" lists are just a model provider and a vague gesture at langchain. that's not a stack, that's a single api call with extra steps. this list is for the point where the llm loops on its own output, calls tools, and you need to see exactly what it did, because by then a single chat completion call isn't enough and you know it.

the load-bearing decision here is mastra over rolling your own orchestration. people resist agent frameworks because the early ones were overengineered, but skipping memory and tool-calling infrastructure just means you rebuild a worse version of it three weeks in, under deadline pressure.
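to make "rolling your own" concrete, here is a minimal sketch of the loop an agent framework hides: call the model, run whatever tool it asks for, append the result to history, repeat. every name below (`runAgent`, `Model`, `Tool`, the stub model and stub tool) is hypothetical and for illustration only. none of it is mastra's api, and the only "memory" is the message array.

```typescript
// hypothetical types for illustration; not mastra's api.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ToolCall = { tool: string; input: string };
type ModelReply = { text: string; toolCall?: ToolCall };
type Model = (history: Message[]) => Promise<ModelReply>;
type Tool = (input: string) => Promise<string>;

async function runAgent(
  model: Model,
  tools: Record<string, Tool>,
  prompt: string,
  maxSteps = 5,
): Promise<Message[]> {
  // "memory" here is just the growing message history.
  const history: Message[] = [{ role: "user", content: prompt }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await model(history);
    history.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) return history; // no tool requested: final answer
    const tool = tools[reply.toolCall.tool];
    // feed unknown-tool errors back to the model instead of crashing the loop.
    const result = tool
      ? await tool(reply.toolCall.input)
      : `error: unknown tool "${reply.toolCall.tool}"`;
    history.push({ role: "tool", content: result });
  }
  return history; // step budget exhausted without a final answer
}

// stub model: asks for the clock once, then answers using the tool result.
const stubModel: Model = async (history) => {
  const last = history[history.length - 1];
  if (last?.role === "tool") return { text: `the time is ${last.content}` };
  return { text: "checking the clock", toolCall: { tool: "clock", input: "" } };
};

// stub tool registry with a single fake tool.
const stubTools: Record<string, Tool> = { clock: async () => "12:00" };
```

calling `runAgent(stubModel, stubTools, "what time is it?")` walks through user, assistant, tool, assistant. what's missing is the expensive part: retries, persistence across sessions, parallel tool calls, and token budgeting.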

observability is the piece teams skip until something goes wrong in production and nobody can explain why the agent did what it did. langfuse before launch, not after the first weird support ticket.
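what "explain why the agent did what it did" needs, at minimum, is a record of every step's input, output, error, and duration. the sketch below shows that idea with a hand-rolled recorder. `Trace`, `Span`, and `step` are hypothetical names for illustration. this is not langfuse's sdk or data model, just the shape of the data a tracing tool collects for you.

```typescript
// hypothetical span shape for illustration; not langfuse's sdk or data model.
type Span = {
  name: string;
  input: unknown;
  output?: unknown;
  error?: string;
  durationMs: number;
};

class Trace {
  readonly spans: Span[] = [];

  // wraps one step (a model call, a tool call) and records what went in and out.
  async step<T>(name: string, input: unknown, fn: () => Promise<T>): Promise<T> {
    const start = Date.now();
    const span: Span = { name, input, durationMs: 0 };
    this.spans.push(span);
    try {
      const output = await fn();
      span.output = output;
      return output;
    } catch (err) {
      span.error = err instanceof Error ? err.message : String(err);
      throw err; // record the failure, but don't swallow it
    } finally {
      span.durationMs = Date.now() - start;
    }
  }
}
```

wrapping each model and tool call in `trace.step(...)` means a failed run still leaves an ordered list of spans behind, which is the thing you want in hand when the weird support ticket arrives.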

the stack — 7 tools

editor · Cursor

ai integration that goes deeper than a bolted-on plugin, and you choose which model it uses.

agent framework · Mastra

memory, tools, and orchestration built in, not duct-taped together.

llm observability · Langfuse

open source and self-hostable, with no lock-in to one llm provider's tracing tool.

runtime · Bun

faster cold starts for agent loops, less tooling to configure.

backend & vector storage · Supabase

pgvector means embeddings live next to your actual data.
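a sketch of what "next to your actual data" buys you: one query can filter on ordinary columns and rank by embedding distance. pgvector's `<=>` operator is cosine distance. the table and column names (`documents`, `owner_id`, `embedding`) are hypothetical, and how you bind the `$1` vector parameter depends on your postgres client. the typescript functions compute the same ranking in memory so the math is visible.

```typescript
// hypothetical table and column names; `<=>` is pgvector's cosine distance operator.
const nearestSql = `
  select id, title
  from documents
  where owner_id = $2                  -- ordinary relational filter, same query
  order by embedding <=> $1::vector    -- cosine distance to the query embedding
  limit 3`;

type Doc = { id: number; title: string; embedding: number[] };

// cosine distance: 1 - (a . b) / (|a| * |b|).
// 0 means same direction, 1 orthogonal, 2 opposite. zero vectors yield NaN.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// in-memory stand-in for the `order by ... limit` part of the query above.
function nearest(docs: Doc[], query: number[], limit: number): Doc[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineDistance(x.embedding, query) - cosineDistance(y.embedding, query),
    )
    .slice(0, limit);
}
```

the in-memory version scans every row. in postgres the same ordering can be backed by an index, and the `where` clause is why keeping vectors in the same database as the rest of your rows is convenient.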

hosting · Vercel

edge functions and streaming are table stakes for ai products.
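for a sense of what streaming means at the handler level, here is a minimal sketch that turns an async iterable of tokens into a streamed `Response` using the standard web streams api (available in bun, recent node, and edge runtimes). `toStreamingResponse` and `fakeModelStream` are hypothetical names. a real handler would iterate over whatever stream your model sdk returns instead.

```typescript
// turns any async iterable of text chunks into a streamed http response body.
function toStreamingResponse(tokens: AsyncIterable<string>): Response {
  const encoder = new TextEncoder();
  const iterator = tokens[Symbol.asyncIterator]();
  const body = new ReadableStream<Uint8Array>({
    // pull runs only when the consumer wants more, so a slow client applies backpressure.
    async pull(controller) {
      const { value, done } = await iterator.next();
      if (done) controller.close();
      else controller.enqueue(encoder.encode(value));
    },
  });
  return new Response(body, {
    headers: { "content-type": "text/plain; charset=utf-8" },
  });
}

// hypothetical stand-in for a model's token stream.
async function* fakeModelStream(): AsyncGenerator<string> {
  for (const token of ["hello", ", ", "world"]) yield token;
}
```

returning `toStreamingResponse(fakeModelStream())` from a request handler lets the client render tokens as they arrive instead of waiting for the full completion.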

error tracking · Sentry

catches your llm app's weird errors before users have to report them.

skip this stack if

  • your product is a thin wrapper around one api call: you don't need an agent framework or dedicated observability for that, just call the api.
  • you're cost-sensitive on inference: this stack doesn't pick a model provider, and that's usually where the real bill shows up.
  • you need full control over model weights or self-hosted inference: that's a different, heavier stack entirely.

one of 5 opinionated stacks.
