What it actually costs to run an AI automation in production
A real cost breakdown for running an AI automation in production in 2026. LLM API costs, infrastructure, monitoring, vector storage, and the hidden ops costs nobody puts in their pitch deck. With realistic ranges by automation shape.
— TL;DR
A moderate-volume B2B AI automation in 2026 costs $200 to $800 per month all-in. LLM API spend is 40 to 60% of that; infrastructure, monitoring, and vector storage make up the rest. Output tokens are the biggest line. Routing to cheap models, caching prefixes, and constraining length cuts 50 to 70% without hurting quality.
A realistic monthly cost for a moderate-volume B2B SaaS AI automation in production in 2026 is $200–$800: LLM API spend, infrastructure, monitoring, and vector storage combined. That's the all-in number, including the boring ops costs that don't make it into pitch decks. The high end of the range is what you pay when you've shipped without cost discipline; the low end is what you pay when the team has built in routing, caching, and monitoring from week one.
This piece breaks down the line items by category, gives realistic ranges by automation shape, and walks through the levers that actually move the cost.
#The shape of the bill
A typical production B2B AI automation cost stack:
| Line item | % of bill | Range (moderate volume) |
|---|---|---|
| LLM API calls | 40–60% | $80–$480/mo |
| Compute (containers / serverless) | 10–20% | $20–$160/mo |
| Vector storage (if RAG) | 5–15% | $10–$120/mo |
| Monitoring + observability | 10–15% | $20–$120/mo |
| Audit log storage | 5–10% | $10–$80/mo |
| Workflow platform fees (if n8n cloud / Zapier / Make) | 0–20% | $0–$160/mo |
| Email / notification services | 1–5% | $2–$40/mo |
| Total all-in | (varies) | $200–$800/mo |
These are real numbers for production B2B automations we've shipped, with "moderate volume" meaning ~10,000 LLM calls per day and monitoring, fallbacks, and audit logging in place. Below that volume the cost falls; above it the cost rises, but typically sub-linearly, because you start hitting volume discounts and amortizing fixed costs.
#LLM API costs
The biggest line item. Within it, two sub-lines that account for most of the variance:
Input vs output tokens. Both OpenAI and Anthropic charge 3–5x more for output tokens than input tokens. A prompt that has 1,000 input tokens and produces 2,000 output tokens costs more than a prompt with 5,000 input tokens that produces 500 output tokens.
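To make the asymmetry concrete, here's the arithmetic behind that comparison. The $3 / $15 per-million-token prices below are illustrative assumptions, not any vendor's actual 2026 rates:

```python
# Illustrative per-token prices (assumed, not real vendor rates):
# $3 per million input tokens, $15 per million output tokens (5x).
PRICE_IN = 3.00 / 1_000_000
PRICE_OUT = 15.00 / 1_000_000

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for a single LLM call."""
    return input_tokens * PRICE_IN + output_tokens * PRICE_OUT

# Short prompt, long response: 1,000 in / 2,000 out -> $0.033
chatty = call_cost(1_000, 2_000)
# Long prompt, short response: 5,000 in / 500 out -> $0.0225
terse = call_cost(5_000, 500)

assert chatty > terse  # the output-heavy call costs more despite 5x less input
```

Five times the input tokens, still cheaper, because output tokens carry the multiplier.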
The single biggest cost-cutting move is constraining output length. Use stop sequences. Set max_tokens. Ask for terse responses in the system prompt. Use structured outputs (JSON schema constraints) which often produce shorter responses than free-form text.
Model tier. Flagship models (GPT-5, Claude Opus 4.7) cost 10–30x what cheap-fast models (gpt-5-nano, Claude Haiku 4.5) cost per token. Most production workloads have a heavy long tail of routine calls that don't need flagship reasoning. Those should route to the cheap tier.
The pattern: ~80% of production LLM calls go to a cheap-fast model; ~20% go to a flagship model when the cheap one fails an evaluation gate or the workload needs flagship-grade reasoning. Teams that route 100% of calls to flagship models are spending 5–10x what they need to.
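A minimal sketch of that routing pattern. `cheap_model`, `flagship_model`, and `passes_eval_gate` are stand-ins for your own client calls and quality check, not real APIs:

```python
from typing import Callable

def route_call(
    prompt: str,
    cheap_model: Callable[[str], str],
    flagship_model: Callable[[str], str],
    passes_eval_gate: Callable[[str], bool],
) -> tuple[str, str]:
    """Try the cheap-fast tier first; escalate only on gate failure.

    Returns (response, tier_used) so the tier mix can be tracked in
    monitoring alongside cost-per-execution.
    """
    response = cheap_model(prompt)
    if passes_eval_gate(response):
        return response, "cheap"
    return flagship_model(prompt), "flagship"

# Toy usage: a gate that rejects empty responses.
resp, tier = route_call(
    "Summarise this ticket",
    cheap_model=lambda p: "summary: billing issue",
    flagship_model=lambda p: "detailed summary",
    passes_eval_gate=lambda r: len(r) > 0,
)
```

The returned tier label is the important part: if the flagship share creeps above ~20%, either the gate is too strict or the cheap model is the wrong fit for the workload.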
Prompt caching. Both vendors offer prompt prefix caching in 2026. If your prompts have a stable system prompt or shared context (RAG, few-shot examples), caching cuts input token costs by 50–90% for the cached portion. The pricing mechanics differ by vendor (Anthropic charges a cache-write premium with deeply discounted reads; OpenAI discounts cached input automatically); both are worth wiring up.
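A back-of-envelope model of the caching win, assuming a flat 90% discount on cached input tokens and ignoring cache-write premiums (both assumptions; check your vendor's current cache pricing):

```python
# Assumed price: $3 per million input tokens. Not a real vendor rate.
PRICE_IN = 3.00 / 1_000_000

def monthly_input_cost(prefix_tokens: int, suffix_tokens: int,
                       calls: int, cache_discount: float = 0.90) -> float:
    """Monthly input-token cost with a cached shared prefix.

    The first call pays full price for the prefix; later calls pay the
    discounted cached rate. Cache-write premiums are ignored.
    """
    first_prefix = prefix_tokens * PRICE_IN
    cached_prefix = prefix_tokens * PRICE_IN * (1 - cache_discount)
    suffix = suffix_tokens * PRICE_IN
    return first_prefix + (calls - 1) * cached_prefix + calls * suffix

# 4k-token shared prefix + 500 fresh tokens/call, 300k calls/month.
with_cache = monthly_input_cost(4_000, 500, 300_000)              # ~$810
without_cache = monthly_input_cost(4_000, 500, 300_000, 0.0)      # ~$4,050
```

Roughly an 80% cut on the input side in this scenario, for a one-time wiring change.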
For more detail on the model-by-model cost trade-offs, see OpenAI vs Anthropic for B2B SaaS automation.
#Compute costs
Where the automation runs. Three common shapes:
- Serverless functions (Vercel, AWS Lambda, Cloudflare Workers): pay-per-execution, no idle cost. Best for low-to-moderate volume automations with bursty traffic. Typical monthly cost: $5–$80 for moderate-volume automations.
- Long-running containers (Fly.io, Railway, AWS Fargate, plain VPS): flat monthly cost regardless of execution count. Best for sustained-traffic automations or workloads that don't fit serverless cold-start tolerance. Typical monthly cost: $20–$200 for a small container.
- Workflow platforms (n8n cloud, Zapier, Make): bundled with execution fees. n8n cloud starts at $24/month; self-hosted n8n costs only the hosting ($10–$30/month for a small server). Zapier / Make per-execution fees can dominate at volume.
For most B2B automations under 100k executions/day, serverless is the right shape. For workflows that need to maintain long-lived connections (websockets, polling-heavy integrations), a container is cleaner.
#Vector storage (for RAG workloads)
If your automation does retrieval-augmented generation, you need a vector store. Options in 2026:
- pgvector in your existing Postgres (Supabase, Neon): $0–$25/month additional cost. Fine for under ~1M vectors. Becomes slow above that without careful indexing.
- Pinecone: $0 free tier, $70/month starter, more at scale. Fully managed, fast, opinionated about index types.
- Qdrant: $0 free tier on Qdrant Cloud, $25–$50/month starter. Open source if you self-host.
- Weaviate: $25–$100/month for managed; self-host on a $5–$20/month VPS.
- Chroma: usually self-hosted, ~$5–$20/month for the VPS.
For most B2B SaaS automations: pgvector if you're already on Postgres and your vector count is under 1M. Pinecone or Qdrant if you cross that threshold or have specific perf requirements.
#Monitoring + observability
The line item teams skip in prototype and pay for in production. Realistic stack:
- Sentry for error monitoring: $0 free tier (5k events/month), $26/month for Team tier
- PostHog or Datadog for usage analytics + dashboards: $0–$45/month at moderate volume
- LangSmith or Langfuse for LLM-specific observability (prompt/response logging, eval traces): $0 free tier, $39–$199/month for paid tiers
- Better Uptime or equivalent for cron / endpoint monitoring: $0–$29/month
For a production-grade automation, monitoring lands at $20–$120/month all-in. That's the cost of being able to diagnose what went wrong when the automation produces weird output at 3 AM. Skipping it is the canonical false economy.
#Audit log storage
Required for any automation that touches customer data, financial records, or regulated workloads. Even when not legally required, it's required operationally. When a customer complains that the automation did the wrong thing, you need to be able to prove what it did and why.
Realistic costs:
- Postgres audit log table (in your existing DB): $0 incremental cost; storage cost is negligible at moderate volume
- S3 / R2 for long-term retention of full prompt + response payloads: $0.02/GB/month; usually $5–$20/month for moderate-volume automations
- Datadog logs or equivalent if you want unified search across application + audit logs: $30–$100/month at moderate volume
For B2B SaaS the right pattern is usually: structured audit log row in Postgres (timestamp, user, action, summary) + full payloads in S3 / R2 for retrieval when needed. Cheap and adequate for most regulatory contexts.
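A minimal sketch of that split. The field names and object-storage key layout are illustrative, not a schema recommendation, and the actual Postgres insert / S3 put are left out:

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class AuditRow:
    """Structured row for the Postgres audit table (summary only)."""
    timestamp: str
    user_id: str
    action: str
    summary: str
    payload_key: str  # points at the full payload in S3 / R2

def build_audit_record(user_id: str, action: str, summary: str,
                       prompt: str, response: str) -> tuple[AuditRow, bytes]:
    """Return the DB row plus the full payload destined for object storage."""
    payload = json.dumps({"prompt": prompt, "response": response}).encode()
    digest = hashlib.sha256(payload).hexdigest()
    ts = datetime.now(timezone.utc)
    # Date-partitioned key keeps retention policies and lookups simple.
    key = f"audit/{ts:%Y/%m/%d}/{digest}.json"
    row = AuditRow(ts.isoformat(), user_id, action, summary, key)
    return row, payload
```

The content hash in the key also gives you free deduplication and a tamper-evidence check when you need to prove what the automation actually saw.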
#Hidden costs
The ones founders don't budget for:
Prompt iteration cost. When a prompt isn't working, you iterate. Each iteration involves running the prompt against an evaluation set (often 50–500 test cases) and comparing outputs. That's not free; running 500 evaluation cases on a flagship model can be $5–$20 per iteration. Across 20–50 iterations during an automation's lifetime, that's $100–$1,000 in eval-only LLM spend.
Budget: ~10% of expected production LLM spend, allocated to evaluation runs.
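The arithmetic behind that range, using the figures above. The per-case flagship costs are assumptions chosen to match the $5–$20-per-iteration estimate:

```python
# Eval-run budget: cases per iteration x iterations x cost per case.
# Per-case costs of $0.01-$0.04 are assumptions for illustration.
def eval_spend(cases: int, iterations: int, cost_per_case: float) -> float:
    return cases * iterations * cost_per_case

low = eval_spend(cases=500, iterations=20, cost_per_case=0.01)   # ~$100
high = eval_spend(cases=500, iterations=50, cost_per_case=0.04)  # ~$1,000
```

If your eval set or iteration count is bigger, scale accordingly before committing to a flagship-only eval loop.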
Fallback model spend. When the primary vendor hits a rate limit or returns 5xx, your automation falls over to the secondary vendor. Most of the time this is fine and rare. Occasionally the primary has a multi-hour outage and 100% of traffic flips to the secondary, which charges different prices. Budget for occasional spikes.
Compliance / audit prep. If your customer is enterprise and you'll be audited (SOC 2, HIPAA, etc.), the audit logging and reporting infrastructure adds engineering time and tooling cost. Not a monthly LLM spend line item, but a one-time $5–25k workstream for the first audit, smaller for subsequent ones.
Drift management. LLM models change. The prompt that worked perfectly on Claude Sonnet 4.6 in April 2026 may behave differently when Anthropic releases Sonnet 4.8 in August 2026. Production automations need an ongoing eval discipline to catch drift; that's engineering time, not a vendor cost line item.
Plan for ~10–20% of LLM API spend going to these hidden costs over the lifetime of an automation.
#Realistic ranges by automation shape
| Shape | Monthly LLM spend | All-in monthly cost |
|---|---|---|
| Daily summary email (1–10 calls/day) | $1–$10 | $20–$50 (mostly monitoring) |
| Customer support triage (1k tickets/day) | $30–$150 | $80–$300 |
| Content moderation (100k items/day) | $200–$800 | $300–$1,000 |
| Sales enrichment (10k leads/day) | $50–$300 | $150–$500 |
| RAG over docs (5k queries/day) | $80–$400 | $150–$600 |
| Multi-step agent workflow (1k runs/day, 10 steps each) | $200–$1,500 | $400–$2,000 |
| High-volume real-time agent (50k+/day) | $1,000–$8,000 | $1,500–$12,000 |
These ranges assume reasonable cost discipline: model routing, prompt caching, output constraints, and monitoring in place. Without those, double the LLM spend. Without monitoring, expect production incidents that cost more than the monitoring would have.
#What we ship for cost discipline
For our AI Automation Sprint engagements, the default cost-control posture in 2026:
- Model routing layer that defaults to the cheap-fast model and escalates to flagship only when needed
- Prompt caching wired up from day 1 (free win)
- Output constraints in every prompt (max_tokens, structured outputs, stop sequences)
- Daily LLM spend dashboard with alerts at 50%, 80%, 100% of expected daily spend
- Hard kill-switch that can pause the automation if cost runs away (typically: a feature flag we can toggle in 30 seconds)
- Cost-per-execution metric tracked in monitoring so we can see if a prompt change spiked the per-call cost
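The spend-alert logic is simple enough to sketch. The thresholds mirror the 50% / 80% / 100% levels above; `expected_daily_spend` is whatever baseline your dashboard tracks:

```python
def spend_alerts(spend_today: float, expected_daily_spend: float,
                 thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)) -> list[str]:
    """Return the alert levels crossed so far today.

    In production this runs on each cost-metric update and notifies once
    per newly crossed threshold (deduplication not shown).
    """
    ratio = spend_today / expected_daily_spend
    return [f"{int(t * 100)}%" for t in thresholds if ratio >= t]

# $45 spent against a $50/day budget: the 50% and 80% alerts have fired.
assert spend_alerts(45.0, 50.0) == ["50%", "80%"]
```

The 100% alert is the trigger for the kill-switch decision; the earlier ones buy you time to diagnose before it comes to that.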
That setup costs ~2–3 days of week-1 build and saves 30–60% of LLM spend over the automation's lifetime. The math is decisive. This is the work that separates "we shipped an automation" from "we shipped an automation that's economically sustainable."
#Bottom line
Production AI automations in 2026 cost real money, but the costs are predictable when you build with discipline. The cost goes up when teams skip the boring infrastructure work (routing, caching, monitoring) in favor of speed. The cost stays bounded when the boring infrastructure work gets shipped in week 1.
Plan for $200–$800/month for a moderate-volume B2B automation. Plan for the ramp from prototype ($5–50/month) to production (10–20x prototype) before shipping. Build the kill-switch before you need it.
— Want this for your SaaS?
AI Automation Sprints, shipped fortnightly ↗
Two-week cycles to ship internal-tool automations that actually save hours. n8n, LangChain, custom code. Opinionated stack, full handoff, paid for by the time it gives back.