Post-Credit Burn Plan

AWS AI/ML Token Burn Plan

3-month runway after the $100K AWS Activate credit expires. Budget: $45,000 for AI/ML services alone (months 13-15).

3-Month Post-Credit AI/ML Spend

Month 13 (transition): $11,200
Month 14 (optimise): $14,500
Month 15 (scale): $19,300
Quarter Total: $45,000
Avg Monthly: $15,000
Year-2 Steady State: $10-14K/mo
Why this is aggressive: AWS gave us $100K in credits to prove production load. The minute the credits end, they expect to see $10-20K/mo in real PAYG spend. Dropping to $2K/mo once the credits expire would look like we were just freeloading. The plan below assumes we keep building and accept the burn as customer acquisition cost.

3-Month Burn Trajectory

MONTH 13 (Credit +0): $11,200
Switch to PAYG on day 1. Keep Bedrock Provisioned Throughput (commit month-to-month). Activate 1-yr Savings Plans to lock rates. Aggressive cache warmup.
MONTH 14 (Crawl): $14,500
Customers signing up. Token volume spikes 40%. Add 2nd Bedrock model unit. SageMaker endpoints scaled up. No Prometheus data yet to optimise - just absorb.
MONTH 15 (Walk): $19,300
First renewal cohort + new customers. Production traffic at 80% of credit-era peak. Add Kinesis + SageMaker Batch for scale. Lock in 1-yr RIs.

Bedrock Token Budget Breakdown

Month 15 Token Allocation: ~3.8B input + 1.2B output tokens

Claude Sonnet 35%
Claude Haiku 30%
Nova/Llama 15%
Embeddings 10%
Batch API 10%
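To make the split concrete, the sketch below turns the percentages into token counts. It assumes the shares apply to the combined input + output volume (~5.0B tokens); the plan does not say whether the percentages are by tokens or by spend, so treat the result as illustrative. The names `TOTAL_TOKENS`, `SHARES` and `allocate` are made up for this example.

```python
# Month 15 token budget from the plan: ~3.8B input + ~1.2B output tokens.
TOTAL_TOKENS = 3_800_000_000 + 1_200_000_000

# Percentage shares as listed in the allocation above.
# Assumption: shares are by token volume, not by dollar spend.
SHARES = {
    "Claude Sonnet": 35,
    "Claude Haiku": 30,
    "Nova/Llama": 15,
    "Embeddings": 10,
    "Batch API": 10,
}


def allocate(total_tokens: int, shares: dict[str, int]) -> dict[str, int]:
    """Split a token budget by integer percentage shares."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100%")
    return {name: total_tokens * pct // 100 for name, pct in shares.items()}


allocation = allocate(TOTAL_TOKENS, SHARES)
for name, tokens in allocation.items():
    print(f"{name:<14} {tokens / 1e9:.2f}B tokens")
```

Under that assumption Sonnet gets about 1.75B tokens, Haiku 1.50B, Nova/Llama 0.75B, and Embeddings and Batch API 0.50B each.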

Mix designed to push hot-path queries through cheap models first, escalate to Sonnet only when confidence is low.
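The cheap-first, escalate-on-low-confidence idea could be sketched roughly as follows. This is not the platform's actual router: the `ModelResult` type, the `route` function, the 0.75 threshold and the stub models are all hypothetical, and how confidence is scored is left to the application.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelResult:
    text: str
    confidence: float  # 0.0-1.0; the scoring method is an application choice


# Hypothetical cut-off. The plan only says "when confidence is low".
ESCALATION_THRESHOLD = 0.75

Model = Callable[[str], ModelResult]


def route(prompt: str, cheap_model: Model, strong_model: Model,
          threshold: float = ESCALATION_THRESHOLD) -> tuple[str, ModelResult]:
    """Try the cheap model first; call the strong model only if needed."""
    first = cheap_model(prompt)
    if first.confidence >= threshold:
        return "cheap", first
    # Low confidence: pay for the stronger model on this request only.
    return "strong", strong_model(prompt)


# Stub models standing in for Haiku and Sonnet calls.
def fake_haiku(prompt: str) -> ModelResult:
    # Pretend short prompts are easy and long ones are not.
    return ModelResult("haiku answer", 0.9 if len(prompt) < 40 else 0.4)


def fake_sonnet(prompt: str) -> ModelResult:
    return ModelResult("sonnet answer", 0.95)


easy = route("Categorise this alert", fake_haiku, fake_sonnet)
hard = route("Explain the combined fire and flood exposure for this portfolio",
             fake_haiku, fake_sonnet)
print(easy[0], hard[0])
```

The escalated request pays for both calls, so the saving depends on how often the cheap model is confident enough to answer alone.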

Service-by-Service Plan

1. Amazon Bedrock (Foundation Models)

| Service / Model | Configuration | Monthly Cost | Purpose |
| --- | --- | --- | --- |
| Bedrock / Claude Sonnet 4.5 (on-demand) | 800M input + 200M output tokens/mo | ~$3,000 | Production risk analysis, report generation, complex reasoning |
| Bedrock / Claude Sonnet 4.5 (1-yr Provisioned) | 1 model unit, no upfront, 1-yr commit | ~$1,950 | Baseline customer-facing workload, predictable latency |
| Bedrock / Claude Haiku 4.5 | 1.2B input + 400M output tokens/mo | ~$2,100 | High-volume: alert categorisation, claim summaries, notifications |
| Bedrock / Amazon Nova Pro / Lite | 600M tokens/mo (open-weight alternative) | ~$450 | Bulk classification, embedding auxiliary, multi-language |
| Bedrock / Llama 3.1 70B / Mistral Large | 300M tokens/mo (cost-sensitive workloads) | ~$300 | Open-weight tasks, batch processing, fine-tuned variants |
| Bedrock / Titan Text Embeddings v2 | 200M tokens/mo (RAG + semantic search) | ~$50 | Vector embeddings for evidence retrieval (replaces sentence-transformers) |
| Bedrock / Bedrock Batch API | 300M tokens/mo (50% off) | ~$200 | Nightly portfolio risk recompute, bulk evidence processing |
| Bedrock / Custom Model Import (2 models) | Custom inference, 50M tokens/mo each | ~$400 | Industry-specific fine-tuned: commercial property, fleet risk |
| Bedrock / Guardrails for Bedrock | 20M policy units/mo (PII, hallucination, topic filters) | ~$200 | Compliance + insurance regulatory content filtering |
Bedrock Subtotal: ~$8,650/mo. This is the single largest AI line item. Intelligent prompt routing (Haiku -> Sonnet escalation) saves 30-40% versus using Sonnet for everything.

2. Amazon SageMaker (Custom ML)

| Service | Configuration | Monthly Cost | Purpose |
| --- | --- | --- | --- |
| SageMaker / Real-Time Endpoints | 4x ml.m5.xlarge with auto-scaling (1-8 instances) | ~$1,600 | Custom claims classifier, anomaly detector, risk scoring ensemble |
| SageMaker / Serverless Inference | 10M invocations/mo, 4GB mem, 5s max latency | ~$400 | Spiky workloads: occasional inference, no idle cost |
| SageMaker / Batch Transform | ml.m5.4xlarge, 300 hours/mo | ~$900 | Nightly portfolio risk recompute, model scoring at scale, bulk reports |
| SageMaker / Training Jobs (GPU) | ml.p3.2xlarge (V100), 150 hours/mo | ~$1,950 | Weekly model retraining, drift adaptation, A/B challenger training |
| SageMaker / Processing Jobs | ml.m5.2xlarge, 200 hours/mo | ~$300 | Feature engineering, data prep, label generation |
| SageMaker / SageMaker Studio | 5 users, ml.m5.4xlarge notebooks | ~$200 | Data scientist workspace, experiment tracking |
| SageMaker / Feature Store | 10M records, 100K reads/mo | ~$150 | Online + offline feature store for risk models |
| SageMaker / Model Registry + Pipelines | 50 model versions, 100 pipeline runs/mo | ~$100 | MLOps: model versioning, A/B testing, automated retraining |

3. Other AI/ML Services

| Service | Configuration | Monthly Cost | Purpose |
| --- | --- | --- | --- |
| Comprehend / Comprehend (NER + Sentiment) | 10M characters/mo (claims + incident text) | ~$30 | Extract entities, sentiment from claim descriptions |
| Comprehend / Comprehend Medical | 8M characters/mo (medical claims) | ~$400 | PHI extraction, medical entity recognition for injury claims |
| Rekognition / Image Analysis | 3M images/mo (property damage, incident photos) | ~$300 | Visual damage triage, label detection, PPE compliance |
| Rekognition / Custom Labels | 5M inference units/mo (custom damage classes) | ~$200 | Industry-specific visual damage classification |
| Forecast / Time-Series Forecasting | 20M predictions/mo (telemetry forecasting) | ~$180 | Sensor drift prediction, anomaly forecasting, capacity planning |
| Textract / Document Extraction | 50K pages/mo (insurance docs, certificates) | ~$75 | OCR + structured extraction from policy documents, certificates of insurance |
| Translate / Real-Time Translation | 5M characters/mo (multi-region support) | ~$75 | Multi-language evidence, customer comms in EU/Asia markets |
| Polly / Text-to-Speech | 20M characters/mo (alert voice calls) | ~$60 | Voice alert synthesis for emergency callouts |
| Lex / Conversational AI Bot | 5K text requests + 2K speech requests/mo | ~$50 | Customer support chatbot for risk queries |
| Kendra / Intelligent Search | Developer Edition, 5M queries/mo | ~$810 | Enterprise search across evidence, reports, policies (replaces OpenSearch for RAG) |

4. AI-Support Storage & Data

| Service | Configuration | Monthly Cost | Purpose |
| --- | --- | --- | --- |
| S3 / S3 Standard (AI training data) | 3TB active training + model artefacts | ~$70 | Training datasets, model checkpoints, evaluation results |
| S3 / S3 Intelligent-Tiering | 10TB (prompt logs, completions, eval sets) | ~$260 | Long-term LLM conversation logs for fine-tuning + audit |
| S3 / Vector Store (OpenSearch Serverless) | 100GB vectors, 10M queries/mo | ~$700 | Managed vector DB for RAG (replaces self-hosted pgvector) |
| Kinesis / Kinesis Data Streams (token events) | 20 shards, prompt + completion event stream | ~$440 | Real-time token usage metering + anomaly detection |

3-Month Totals

AI/ML Burn by Month

Bedrock (models): $8,650
SageMaker (custom ML): $5,600
Comprehend / Rekognition / Forecast: $1,370
Kendra / Lex / Polly / Translate / Textract: $1,070
AI Storage (S3 + Vector + Kinesis): $1,470
Monthly Subtotal (steady): $18,160
Month 13 (transition dip): $11,200
Quarter Total: $45,000
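The roll-up arithmetic can be reproduced with a few lines of Python. The figures are copied from the category totals and monthly targets stated in this plan; the variable names are only for illustration.

```python
# Category totals as stated in the summary above (USD per month).
CATEGORY_MONTHLY = {
    "Bedrock (models)": 8_650,
    "SageMaker (custom ML)": 5_600,
    "Comprehend / Rekognition / Forecast": 1_370,
    "Kendra / Lex / Polly / Translate / Textract": 1_070,
    "AI Storage (S3 + Vector + Kinesis)": 1_470,
}

# Monthly spend targets for the first post-credit quarter (USD).
MONTHLY_PLAN = {13: 11_200, 14: 14_500, 15: 19_300}

steady_subtotal = sum(CATEGORY_MONTHLY.values())
quarter_total = sum(MONTHLY_PLAN.values())
avg_monthly = quarter_total / len(MONTHLY_PLAN)

print(f"Steady-state subtotal: ${steady_subtotal:,}/mo")  # $18,160/mo
print(f"Quarter total:         ${quarter_total:,}")       # $45,000
print(f"Average monthly:       ${avg_monthly:,.0f}")      # $15,000
```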

Token Optimisation Playbook (Post-Credit)

Tenant Token Quota Strategy

Per the platform's rate limiting implementation, each tenant gets a tiered token budget. The default tiers (overridable per-tenant):

| Tier | RPM | Tokens/min | Target Customer |
| --- | --- | --- | --- |
| Free | 10 | 50K | Trial signups, sandbox tenants |
| Starter | 60 | 500K | SMB insurers, < 100 assets |
| Pro | 300 | 5M | Mid-market, 100-1K assets |
| Enterprise | Unlimited | Negotiated | Large insurers, > 1K assets, custom SLAs |

This is enforced via a Redis-backed token bucket (key: `tquota:{tenant_id}:{model}:{window}`) and surfaced as 429 responses with `Retry-After` headers when a tenant exceeds its quota.
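A minimal sketch of how such a per-tenant token bucket might behave is shown below. It is not the platform's implementation: a plain dict stands in for Redis so the example runs on its own, the `TokenQuota` class and the `1m` window label are hypothetical, and only the key format, the tier limits and the 429 + `Retry-After` behaviour come from the text above. Enterprise is omitted because its limits are negotiated.

```python
import math
import time
from dataclasses import dataclass

# Tokens-per-minute limits from the tier table.
TIER_TOKENS_PER_MIN = {"free": 50_000, "starter": 500_000, "pro": 5_000_000}


@dataclass
class Bucket:
    tokens: float   # tokens currently available
    updated: float  # clock reading at the last refill


class TokenQuota:
    """In-memory stand-in for a Redis-backed token bucket (assumption)."""

    def __init__(self, now=time.monotonic):
        self._buckets: dict[str, Bucket] = {}  # would be Redis keys in production
        self._now = now

    @staticmethod
    def key(tenant_id: str, model: str, window: str = "1m") -> str:
        return f"tquota:{tenant_id}:{model}:{window}"

    def check(self, tenant_id: str, model: str, tier: str, cost: int):
        """Return (status, headers) for a request that needs `cost` tokens."""
        capacity = TIER_TOKENS_PER_MIN[tier]
        if cost > capacity:
            raise ValueError("request is larger than the whole per-minute budget")
        now = self._now()
        bucket = self._buckets.setdefault(
            self.key(tenant_id, model), Bucket(capacity, now))
        # Refill in proportion to elapsed time, capped at the bucket size.
        refill = (now - bucket.updated) * capacity / 60.0
        bucket.tokens = min(capacity, bucket.tokens + refill)
        bucket.updated = now
        if cost <= bucket.tokens:
            bucket.tokens -= cost
            return 200, {}
        # Seconds until enough tokens have refilled to cover the shortfall.
        wait = math.ceil((cost - bucket.tokens) * 60.0 / capacity)
        return 429, {"Retry-After": str(wait)}


# Demo with a fake clock so the result does not depend on wall time.
clock = [0.0]
quota = TokenQuota(now=lambda: clock[0])
first = quota.check("tenant-a", "haiku", "free", 40_000)   # fits the budget
second = quota.check("tenant-a", "haiku", "free", 20_000)  # only 10K left
clock[0] = 12.0                                            # 12 s later
third = quota.check("tenant-a", "haiku", "free", 20_000)
print(first, second, third)
```

In the demo the second request is rejected with `Retry-After: 12`, because the Free tier refills 50K tokens per minute and the request is 10K tokens short. A real Redis version would need the refill and decrement to happen atomically, for example inside a Lua script.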

Why Burn $15K/mo After Credits?

Three reasons:
  1. Stickiness = future EDP negotiation power. $180K/yr post-credit burn puts us in the top decile of InsurTech AWS customers. That earns a custom Enterprise Discount Program contract at 15-25% off list, saving $27-45K/yr going forward.
  2. Customers expect production-grade AI. If we cut Sonnet access to save $2K/mo, our risk analysis drops in quality, customers churn, and the whole burn plan collapses. $15K/mo is the cost of being taken seriously.
  3. ML training compounds value. Weekly retraining on SageMaker is what makes the platform smarter. Cut that and we lose the "predictive" differentiator that justifies the price tag.
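The annualised figures in the first reason follow from simple arithmetic, sketched here. The 15-25% discount range is the plan's own estimate rather than a published AWS rate, and the variable names are illustrative.

```python
AVG_MONTHLY_BURN = 15_000              # USD, average from the quarter plan
EDP_DISCOUNT_RANGE_PCT = (15, 25)      # plan's assumed EDP discount off list

annual_burn = AVG_MONTHLY_BURN * 12
savings_low = annual_burn * EDP_DISCOUNT_RANGE_PCT[0] // 100
savings_high = annual_burn * EDP_DISCOUNT_RANGE_PCT[1] // 100

print(f"Annual post-credit burn: ${annual_burn:,}")                  # $180,000
print(f"EDP savings range: ${savings_low:,} to ${savings_high:,}")   # $27,000 to $45,000
```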

Funding Source for Post-Credit Burn

Bottom line: We need to show AWS that we can sustain $10-20K/mo in real spend. The $100K credit was the down payment on proving production load. The next $45K is us putting our own capital where our roadmap is. If we can't afford that, we shouldn't be building this.