is langfuse really open source?

yes. langfuse is MIT licensed. the full server, UI, and SDK are public on github. you can run the entire platform on your own infrastructure. langsmith is closed source, and self-hosting it is only offered on the enterprise plan.

can i self-host langfuse on my own infra?

yes, and it's well-documented. langfuse ships as a docker container and has official kubernetes helm charts. for teams with GDPR concerns, data residency requirements, or who want trace data on-premises, self-hosting is a first-class path.

does langfuse work with langchain?

yes. langfuse integrates with langchain via a callback handler — two lines of code. it also integrates with llamaindex, the vercel ai sdk, mastra, openai directly, and any custom LLM setup via the tracing SDK.

which has better evals?

both have evaluation frameworks. langfuse's evals combine model-based scoring (an LLM grades each trace) with human-in-the-loop annotation. langsmith's evals are tightly integrated with langchain's LCEL chains. for teams not on langchain, langfuse's eval tooling is more accessible.
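to make the two eval styles concrete, here's a minimal sketch of how a model-based score and a human annotation can coexist on one trace. everything in it is hypothetical: `Trace`, `Score`, `model_score`, `final_score`, and the `stub_judge` function are illustrative names, not the langfuse or langsmith API, and the "human overrides model" rule is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Trace:
    trace_id: str
    question: str
    answer: str


@dataclass
class Score:
    trace_id: str
    value: float  # 0.0 (bad) to 1.0 (good)
    source: str   # "model" or "human"


def model_score(trace: Trace, judge: Callable[[str], float]) -> Score:
    """Model-based eval: ask a judge (normally an LLM call) to grade the answer."""
    prompt = f"Question: {trace.question}\nAnswer: {trace.answer}\nGrade 0-1:"
    value = min(1.0, max(0.0, judge(prompt)))  # clamp whatever the judge returns
    return Score(trace.trace_id, value, "model")


def final_score(model: Score, human: Optional[Score]) -> Score:
    """Human-in-the-loop: a human annotation, when present, overrides the model."""
    return human if human is not None else model


def stub_judge(prompt: str) -> float:
    # Stub so the sketch runs offline; a real judge would call an LLM here.
    return 0.9 if "Paris" in prompt else 0.2
```

the point of the split is that the judge is just a callable, so you can swap the stub for a real model call without touching the scoring or override logic.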

what about helicone or arize phoenix?

helicone is simpler and faster to set up — proxy-based, works with any openai-compatible SDK instantly. arize phoenix is python-first and research-oriented, better for ML teams doing offline evaluation. langfuse is the most complete open source option for production use.

is langsmith only for langchain users?

officially no, but practically yes. langsmith technically supports other frameworks but the integrations are afterthoughts compared to the native langchain experience. if you're not using langchain or langgraph, langsmith's value proposition is weaker.

langfuse vs langsmith

winner: langfuse

for: teams who want open source observability they can self-host, with framework-agnostic tracing and a generous free tier

skip if: teams already deep in the LangChain ecosystem who want the official tracing tool with the least integration overhead

AI observability is the new logging. vendor lock-in on your trace data is a real architectural risk. langfuse being open source and self-hostable isn't just philosophical — it's the right call before you have millions of traces somewhere you can't move.

AI observability is the new logging — and the teams that don't set it up before they need it will regret it the same way teams that skipped logging regretted it. the question isn't whether to instrument your LLM calls; it's which tool to trust with that data long-term.

langfuse's answer is open source and self-hostable. you can audit the code, run it on your own infra, and never worry that a pricing change will put your trace data behind a paywall. langsmith's answer is "we built langchain and we built the best tracing tool for it." that's a good answer if you're on langchain.

for teams using a mix of frameworks — vercel ai sdk, mastra, raw openai, llamaindex — langfuse is the clear choice. it works with all of them, costs less at scale, and gives you full ownership of your observability data.

verdict as of mid-2026. these tools move fast — we'll update when things change.

what each one actually is

Langfuse is an open source LLM observability and analytics platform. it provides tracing (see exactly what happened in each LLM call, chain, or agent run), prompt management (version your prompts, A/B test them), evaluations (score outputs with models or humans), and a dashboard for analyzing cost, latency, and quality. it integrates with virtually every LLM framework and runs both as a managed cloud service and on self-hosted infrastructure.

LangSmith is LangChain's official observability and evaluation platform. it's built by the same team as langchain and langgraph, which means the integration is deep and first-class. if you're using LangChain Expression Language (LCEL) chains or langgraph agents, langsmith picks up traces with almost zero configuration. it has evals, prompt management, dataset management, and annotation queues. it's primarily a managed cloud product; self-hosting is only offered on the enterprise plan.

pricing, honestly

langfuse cloud free tier: 50,000 observations/month, 2 users, 30-day data retention. enough to evaluate seriously and run a small production workload. the team plan starts at $59/month for 1M observations and longer retention. for teams with high tracing volume, the self-hosted option eliminates the per-observation cost entirely.

langsmith's pricing shifted upward in 2025. the developer tier is free for limited usage; the plus tier starts at $39/month per seat. for teams with high trace volume and multiple users, langsmith can cost significantly more than langfuse's equivalent tier. and because langsmith's self-hosting is reserved for enterprise contracts, there's no cheap self-host route to escape the cost curve.
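a rough way to see where the two entry-level paid plans cross over is to plug the quoted base prices into a few lines of arithmetic. this sketch assumes the $59 langfuse team price is flat regardless of seat count and ignores any usage-based charges on either side; both are simplifications for illustration, so check the current pricing pages before budgeting. the function names are made up for the example.

```python
LANGFUSE_TEAM_FLAT = 59   # USD/month, base price quoted above (1M observations)
LANGSMITH_PLUS_SEAT = 39  # USD/month per seat, base price quoted above


def langfuse_team_cost(seats: int) -> int:
    """Assumed flat base price: seat count does not enter this simplified model."""
    return LANGFUSE_TEAM_FLAT


def langsmith_plus_cost(seats: int) -> int:
    """Per-seat base price; ignores any usage-based trace charges."""
    return LANGSMITH_PLUS_SEAT * seats


if __name__ == "__main__":
    # Print the base cost of each plan for a few team sizes.
    for seats in (1, 2, 5, 10):
        print(seats, langfuse_team_cost(seats), langsmith_plus_cost(seats))
```

under these assumptions the per-seat plan is cheaper for a single seat ($39 vs $59) and more expensive from two seats up ($78 vs $59), with the gap growing linearly after that.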

what it's actually like to use them

getting started with langfuse is a good benchmark for how LLM tooling should feel: install the SDK, set two environment variables, wrap your LLM call, and traces appear in the dashboard within seconds. the UI shows inputs, outputs, tokens, latency, and cost per trace with clean hierarchical visualization for chain and agent runs. prompt management lets you pull prompts at runtime with version pinning. the dataset and eval tooling is genuinely useful for teams doing systematic quality testing.
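if "wrap your LLM call" sounds abstract, here's roughly what a tracing wrapper does under the hood. this is a hand-rolled, stdlib-only sketch, not the langfuse SDK: the `traced` decorator, the in-memory `TRACES` list, and `fake_llm_call` are all hypothetical stand-ins, and a real SDK would ship the records to a backend instead of a list.

```python
import functools
import time
from typing import Any, Callable

TRACES: list[dict[str, Any]] = []  # stand-in for the tracing backend


def traced(name: str) -> Callable:
    """Record input, output, and latency for each successful call of the wrapped function."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            output = fn(*args, **kwargs)  # exceptions propagate and are not recorded
            TRACES.append({
                "name": name,
                "input": {"args": args, "kwargs": kwargs},
                "output": output,
                "latency_s": time.perf_counter() - start,
            })
            return output
        return wrapper
    return decorator


@traced("summarize")
def fake_llm_call(prompt: str) -> str:
    # Placeholder for a real model call.
    return prompt.upper()
```

calling `fake_llm_call("hi")` appends one record to `TRACES` with the name, arguments, output, and elapsed time, which is the same shape of data the dashboard described above is built on.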

langsmith's integration story is remarkable if you're on langchain. set the tracing flag (LANGCHAIN_TRACING_V2=true) alongside your langsmith API key, and langsmith captures every chain, every tool call, every LLM invocation automatically. the trace visualization is purpose-built for langchain's structure: you see exactly which nodes fired in your chain and what data passed through each. if you're not on langchain, you're using the generic SDK, which is fine but not the experience langsmith is designed around.
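because that tracing flag is the whole switch, a typo in it fails silently: the app runs fine and no traces show up. a tiny pre-flight check at startup can catch that. this is a sketch, not part of any SDK; `langsmith_tracing_enabled` is a hypothetical helper, and comparing against the exact lowercase string `true` is an assumption based on the value shown above.

```python
import os
from typing import Mapping


def langsmith_tracing_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True when the tracing flag named in the text is set to exactly 'true'."""
    return env.get("LANGCHAIN_TRACING_V2", "") == "true"


if __name__ == "__main__":
    if not langsmith_tracing_enabled():
        print("warning: LANGCHAIN_TRACING_V2 is not 'true'; no traces will be sent")
```

passing the environment in as a mapping keeps the check testable without mutating `os.environ`.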

who langfuse is for

teams using multiple LLM frameworks (vercel ai sdk, mastra, llamaindex, raw openai, anthropic SDK)
companies with data residency or compliance requirements who need self-hosted observability
teams that want to audit their observability tool or contribute to it
anyone price-sensitive at production trace volumes who can't absorb langsmith's per-seat costs
teams building evals outside the langchain ecosystem

who langsmith is for

teams using langchain (LCEL, LangGraph) who want zero-configuration tracing
organizations building with langchain that want first-party support and tight integration
teams that don't need self-hosting and want the tightest possible integration with their agent framework

when to avoid each

don't use langfuse if you want zero-config langchain integration and are never going to use another framework. langfuse's langchain callback works but requires one more line of setup than langsmith, which is trivial but worth naming.

don't lock into langsmith if you're mixing frameworks or might want to run your own infrastructure. langsmith's pricing at scale and its closed-source nature make it a risky foundation for teams that expect significant growth in trace volume.

stuff their landing pages won't tell you

langfuse's self-hosted version ships with the same features as the cloud version — no artificial feature gating
langsmith's "free tier" is limited in ways that become apparent at real workloads — check the specific limits for your use case
both tools calculate cost automatically from token counts and model pricing — useful but requires keeping model pricing tables updated
langfuse has an active community in discord and github; langchain/langsmith support is official but slower for non-enterprise users
both platforms support human annotation queues for labeling trace outputs — useful for building ground-truth eval datasets
langfuse's prompt management works at runtime: you update a prompt in the UI and your production app picks it up without a redeploy. langsmith has similar capability.
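the automatic cost calculation mentioned in the list above boils down to token counts multiplied by a per-model price table. here's a minimal sketch of that arithmetic. the model names and prices are invented for illustration and are not real vendor rates; `ModelPrice`, `PRICES`, and `trace_cost` are hypothetical names, not either tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens


# Hypothetical prices for illustration only; not real vendor rates.
PRICES = {
    "example-small": ModelPrice(input_per_1m=0.50, output_per_1m=1.50),
    "example-large": ModelPrice(input_per_1m=5.00, output_per_1m=15.00),
}


def trace_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one call; raises KeyError if the model has no price entry."""
    price = PRICES[model]
    return (input_tokens * price.input_per_1m
            + output_tokens * price.output_per_1m) / 1_000_000
```

the `KeyError` path is the practical catch: when a new model ships and the table hasn't been updated, there's no price to multiply by, which is why the pricing tables need to stay current.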

the call

langfuse. the open source + self-hosting story is the most important consideration for teams that expect significant scale. locking your trace data into a closed-source cloud product before you know your volume is the same mistake teams made with logging before they discovered they couldn't afford the managed provider.

langsmith is a perfectly valid choice if and only if you're committed to langchain as your primary LLM framework. the zero-config integration is a real advantage and the tooling quality is high. but the moment your stack diversifies, langfuse is the more durable foundation.

for a new AI product in 2026: start with langfuse, self-host if your data is sensitive, and revisit langsmith only if you go all-in on langgraph.

some links on this page are affiliate links. we earn a small commission if you sign up, at no extra cost to you. we don't change verdicts for affiliate money — see how this site makes money.

last updated: june 14, 2026
