Architecture, 12-month investment, and post-credit AI/ML burn planning for the SafeGuard Predictive Risk Intelligence Platform.

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Compute | EC2 m6i.xlarge | 4 vCPU, 16GB RAM - FastAPI backend (Multi-AZ x2) | ~$280 | Production API tier with N+1 redundancy |
| Compute | EC2 m6i.large | 2 vCPU, 8GB RAM - Temporal + Camunda workers x3 | ~$240 | Workflow engine fleet |
| Compute | EC2 c6i.2xlarge | 8 vCPU, 16GB RAM - ML inference fleet | ~$280 | Bedrock batching, custom model serving |
| Compute | ECS Fargate | 50 vCPU, 100GB - autoscaling async workers | ~$450 | Webhook delivery, telemetry polling, batch jobs |
| Compute | EKS | 1 control plane + 3 m5.large nodes | ~$320 | Container orchestration for microservices growth path |
| Compute | Lambda | 50M requests/mo, 400K GB-seconds | ~$200 | Event glue, API Gateway backend, Step Functions triggers |
| Compute | 1-Year Savings Plans | $1,200/mo commitment, ~40% off on-demand | -$480 effective | Lock in baseline compute from month 6 |
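
The compute rows can be rolled up into a single monthly figure. The following is a minimal sketch, assuming the "-$480 effective" Savings Plan row acts as a flat monthly reduction that starts in month 6; `compute_cost_for_month` and the constant names are hypothetical, and the dollar values are the approximate figures from the table.

```python
# Approximate monthly on-demand figures (USD) copied from the compute table.
COMPUTE_ON_DEMAND = {
    "EC2 m6i.xlarge (FastAPI backend)": 280,
    "EC2 m6i.large (workflow workers)": 240,
    "EC2 c6i.2xlarge (ML inference)": 280,
    "ECS Fargate (async workers)": 450,
    "EKS (control plane + nodes)": 320,
    "Lambda": 200,
}

# Assumption: the "-$480 effective" row is a flat monthly reduction.
SAVINGS_PLAN_REDUCTION = 480
# Assumption: "from month 6" means month 6 is the first discounted month.
SAVINGS_PLAN_START_MONTH = 6


def compute_cost_for_month(month: int) -> int:
    """Return the estimated compute bill (USD) for a 1-based plan month."""
    if month < 1:
        raise ValueError("month must be >= 1")
    total = sum(COMPUTE_ON_DEMAND.values())
    if month >= SAVINGS_PLAN_START_MONTH:
        total -= SAVINGS_PLAN_REDUCTION
    return total
```

Under these assumptions the compute subtotal is about $1,770 per month before the Savings Plan starts and about $1,290 per month from month 6 onward.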

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Database | Aurora PostgreSQL Serverless v2 | 2-16 ACU, Multi-AZ, 200GB | ~$750 | Primary OLTP - assets, alerts, users, audit log (replaces self-hosted TimescaleDB/pgvector) |
| Database | RDS Read Replica (Cross-AZ) | db.r6g.large for analytics queries | ~$210 | Offload reporting queries from primary |
| Database | ElastiCache Redis (cluster mode) | 3x cache.r6g.large, Multi-AZ | ~$480 | Session, pub/sub, rate limiting, feature store cache |
| Database | DynamoDB | On-demand, 500GB, 100K WCU/RCU peak | ~$400 | Real-time state, hot IoT metadata, websocket session map |
| Database | Neptune (Graph DB) | db.r6g.large, Multi-AZ, 100GB | ~$380 | Replaces Apache AGE for risk correlation graphs |
| Database | Timestream (time-series) | 1B writes, 10B queries/mo, 500GB memory store | ~$350 | Raw IoT telemetry (replaces TimescaleDB hypertable) |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Storage | S3 Standard | 2TB active, 50M requests | ~$80 | Reports, evidence, alert media, ML training data (replaces MinIO) |
| Storage | S3 Intelligent-Tiering | 5TB with auto-classification | ~$130 | Long-term logs, model artefacts, backups |
| Storage | S3 Glacier Instant Retrieval | 10TB compliance archive | ~$50 | 7+ year insurance evidence retention (regulatory) |
| Storage | S3 Glacier Deep Archive | 20TB cold storage | ~$12 | 10+ year audit trail, regulatory archive |
| Storage | EBS gp3 (EC2 volumes) | 2TB aggregated across fleet | ~$160 | Boot volumes, hot scratch space, log buffers |
| Storage | EBS io2 Block Express | 500GB for Aurora temporary tables | ~$130 | High-IOPS workloads, batch analytics |
| Storage | FSx for Lustre | 1.2TB, 100 MB/s/TiB throughput | ~$180 | High-speed scratch for ML training jobs |
| Storage | AWS Backup | 10TB warm backup across RDS, EBS, S3 | ~$200 | Cross-region DR, 35-day retention, point-in-time recovery |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Network | Route 53 | 5 hosted zones, 100M queries | ~$50 | DNS, health checks, failover routing (replaces self-hosted) |
| Network | CloudFront | 10TB egress, 500M HTTP/HTTPS requests | ~$950 | Global CDN for frontend, API edge caching, static assets |
| Network | Application Load Balancer | 2 ALBs (web + API) + LCU | ~$80 | SSL termination, path routing, WAF integration (replaces Traefik) |
| Network | Network Load Balancer | 2 NLBs for WebSocket + IoT MQTT | ~$70 | TCP/L4 load balancing for long-lived connections |
| Network | API Gateway | 200M API calls, 500M connection-minutes | ~$250 | Public REST API entrypoint, throttling, API keys |
| Network | NAT Gateway | 3 NAT GWs across 3 AZs + processing | ~$140 | Egress for private subnets (compute, workers) |
| Network | VPC Endpoints (Gateway + Interface) | S3, DynamoDB, Secrets Manager, ECR, Bedrock | ~$110 | Private connectivity, reduce NAT costs |
| Network | Transit Gateway | Multi-region peering for DR | ~$80 | Hub for VPC-to-VPC and on-prem connectivity |
| Network | Direct Connect (hosted) | 1Gbps port, 50% utilisation | ~$220 | Dedicated link for insurance partner integrations |
| Network | Data Transfer (Internet Out) | 5TB/mo to customers/partners | ~$450 | API responses, report downloads, evidence exports |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| ML | Bedrock - Claude Sonnet 4.5 | 5M input + 2M output tokens/mo (production reasoning) | ~$200 | Risk analysis, evidence synthesis, report generation |
| ML | Bedrock - Claude Haiku 4.5 | 20M input + 10M output tokens/mo (high-volume) | ~$120 | Alert categorisation, claim summarisation, notifications |
| ML | Bedrock - Nova Pro / Llama 3 70B | 10M tokens/mo (open-weight workloads) | ~$80 | Bulk scoring, embeddings, classification |
| ML | Bedrock Provisioned Throughput | 1 model unit (Claude Sonnet 4.5), 1-month commitment | ~$1,950 | Guaranteed latency for customer-facing risk analysis (months 6-12) |
| ML | Bedrock Custom Model Import | 2 fine-tuned models, inference only | ~$400 | Industry-specific risk models (commercial property, fleet) |
| ML | Titan Embeddings v2 | 50M tokens/mo (RAG + semantic search) | ~$50 | Vector embeddings (replaces self-hosted sentence-transformers) |
| ML | SageMaker Real-Time Endpoints | 3x ml.m5.xlarge with auto-scaling | ~$1,200 | Custom claims classifier, anomaly detector, risk scoring ensemble |
| ML | SageMaker Batch Transform | ml.m5.4xlarge, 200 hours/mo | ~$600 | Nightly portfolio risk recompute, model scoring at scale |
| ML | SageMaker Training | ml.p3.2xlarge GPU, 100 hours/mo | ~$1,300 | Weekly model retraining with new data |
| ML | SageMaker Pipelines + Experiments | MLOps orchestration | ~$150 | Model versioning, A/B testing, drift monitoring |
| ML | Comprehend Medical | 5M characters/mo (claims & incident text) | ~$250 | Extract medical/injury entities from claim descriptions |
| ML | Rekognition (Image) | 2M images/mo (property damage assessment) | ~$200 | Visual damage triage from incident photos |
| ML | Forecast (time-series) | 10M predictions/mo (telemetry forecasting) | ~$90 | Sensor drift & anomaly prediction |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| IoT | AWS IoT Core | 1B messages, 500K devices registered | ~$800 | MQTT broker, device shadows, registry (replaces EMQX) |
| IoT | IoT Device Defender | Continuous audit on 500K devices | ~$150 | Security monitoring, anomaly detection, compliance |
| IoT | IoT Analytics | Pipeline running 24/7, 100GB processed/day | ~$300 | Time-series SQL queries on raw sensor data |
| IoT | IoT SiteWise | 100K assets monitored, 10M datapoints/day | ~$500 | Industrial asset modelling, asset hierarchy, KPI computation |
| IoT | IoT Events | 50M detector events/mo | ~$100 | Stateful event detection, complex pattern matching |
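
The plan's scaling section assumes the IoT line items grow linearly with device count. The following is a rough sketch of that assumption applied to the IoT table, taking the 500K registered devices as the baseline. `scaled_iot_cost` is a hypothetical helper. Treating every row (including SiteWise and IoT Analytics, which are sized by assets and data volume) as proportional to device count is a simplification, not a pricing model.

```python
# Approximate monthly figures (USD) copied from the IoT table,
# sized for the 500K registered devices listed there.
IOT_BASELINE_DEVICES = 500_000
IOT_BASELINE_MONTHLY = {
    "AWS IoT Core": 800,
    "IoT Device Defender": 150,
    "IoT Analytics": 300,
    "IoT SiteWise": 500,
    "IoT Events": 100,
}


def scaled_iot_cost(devices: int) -> float:
    """Estimate the monthly IoT bill (USD) for a given device count.

    Assumption: every IoT line item scales in direct proportion
    to the number of registered devices.
    """
    if devices < 0:
        raise ValueError("devices must be non-negative")
    baseline_total = sum(IOT_BASELINE_MONTHLY.values())
    return baseline_total * devices / IOT_BASELINE_DEVICES
```

With these inputs the baseline is about $1,850 per month at 500K devices, so doubling the fleet to 1M devices would roughly double the IoT subtotal to about $3,700 per month.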

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Security | Secrets Manager | 100 secrets, 5M API calls | ~$60 | DB creds, API keys, JWT signing, encryption keys |
| Security | WAF (Web Application Firewall) | 1 Web ACL, 20M requests, 30 managed rules | ~$130 | OWASP top-10, bot mitigation, rate limiting |
| Security | GuardDuty | Full account + S3 + EKS protection | ~$450 | Continuous threat detection across all services |
| Security | Security Hub | Standard tier, all findings aggregated | ~$50 | Centralised security posture, compliance dashboards |
| Security | Inspector (Vulnerability Scanning) | Continuous ECR + EC2 scanning | ~$80 | CVE detection in container images & VMs |
| Security | Macie (Data Privacy) | 1TB S3 scanned/mo for PII | ~$100 | GDPR/SOX compliance for customer data in S3 |
| Security | Certificate Manager (ACM) | Public + private certs | $0 | TLS for ALB, CloudFront, API Gateway (replaces Let's Encrypt) |
| Security | KMS (Key Management) | 100 CMKs, 50K API calls | ~$30 | Encryption-at-rest for RDS, S3, EBS, Secrets Manager |
| Security | Cognito | 100K MAU, advanced security features | ~$275 | End-user identity (replaces self-hosted Keycloak for B2C portal) |
| Security | IAM Identity Center (SSO) | 50 engineering users | ~$25 | AWS console SSO, cross-account access for prod/staging |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Monitor | CloudWatch (Logs + Metrics) | 500GB logs, 100K custom metrics, 50 dashboards | ~$800 | Logs, metrics, alarms, dashboards (replaces Prometheus + Grafana) |
| Monitor | CloudWatch Logs Insights Queries | 20TB scanned/mo for ad-hoc analysis | ~$200 | Application debugging, security investigations |
| Monitor | CloudWatch Anomaly Detection | 200 metrics with ML-based alerting | ~$100 | Auto-baselined SLO monitoring, capacity planning |
| Monitor | AWS X-Ray | 50M traces, full app instrumentation | ~$250 | Distributed tracing, latency analysis, dependency maps |
| Monitor | Managed Grafana | 5 editors, 50 viewers, 50 dashboards | ~$180 | Customer-facing observability, exec dashboards |
| Monitor | Managed Prometheus | 10M samples/sec ingestion, 100K active series | ~$650 | EKS/microservice metrics, PromQL compatibility |
| Monitor | CloudWatch RUM + Synthetics | 10M events, 50 canary runs/mo | ~$150 | Frontend performance, synthetic uptime checks |
| Monitor | SNS (Alerting) | 10M publishes, SMS + email + PagerDuty | ~$80 | On-call paging, SLO breach notifications |
| Monitor | DevOps Guru | Full application + infrastructure analysis | ~$300 | ML-powered anomaly detection, root cause suggestions |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Network | Kinesis Data Streams | 50 shards, 1TB/hr ingestion | ~$1,100 | Real-time alert + telemetry pipeline (replaces Redpanda) |
| Network | Kinesis Data Firehose | 10TB delivered to S3 + Splunk | ~$290 | Durable streaming to data lake + analytics |
| Network | MSK (Kafka) | 3x kafka.m5.large brokers, Multi-AZ | ~$950 | Kafka-compatible stream for partner integrations |
| Network | SQS | 500M requests/mo | ~$200 | Webhook delivery, background jobs, fan-out |
| Network | SNS | 100M publishes/mo | ~$50 | Pub/sub for fan-out, SMS/email alerts to end users |
| Network | EventBridge | 500M events/mo, 50 event buses | ~$225 | Service decoupling, scheduled jobs, SaaS integrations |
| Network | Step Functions | 10M state transitions/mo (replaces Camunda) | ~$250 | Visual workflow orchestration for claim processes |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Analytics | Athena | 50TB scanned/mo, S3 data lake | ~$250 | Ad-hoc SQL on telemetry, audit, evidence data |
| Analytics | Redshift Serverless | 32 RPU baseline, 5TB compressed | ~$960 | Executive analytics, portfolio risk dashboards |
| Analytics | QuickSight (Enterprise) | 100 authors, 500 readers, dashboards | ~$750 | Customer-facing embedded BI, exec reporting |
| Analytics | OpenSearch (Elasticsearch) | 3x t3.small.search, 100GB | ~$180 | Full-text search across evidence & reports |
| Analytics | Glue (ETL) | 10 DPUs, 100 hours/mo | ~$440 | Data transformation, schema discovery, catalog |
| Analytics | Lake Formation | 1TB data lake with fine-grained access | ~$110 | Governance, row/column-level security on data lake |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Storage | ECR (Container Registry) | 500GB images, 10M pulls | ~$50 | Docker images for backend/frontend/workers |
| Compute | CodeBuild | 2,000 build-minutes, 10 concurrent builds | ~$80 | CI for tests, lint, typecheck, image build |
| Compute | CodePipeline | 10 active pipelines | ~$10 | Multi-stage deploy: dev -> staging -> prod |
| Compute | CodeDeploy | 20 EC2 + 20 Lambda deployments/mo | ~$5 | Blue/green + canary deployments, rollback |
| ML | CodeCatalyst (dev environments) | 20 dev environments (4 vCPU, 16GB) | ~$800 | Cloud dev environments, no local Docker pain |
| Storage | CodeArtifact | 100GB Python + npm packages | ~$5 | Private package registry, supply chain security |
| Security | Inspector + ECR scanning | Continuous on 500 images | ~$30 | CVE detection in build pipeline |

| Category | Service | Configuration | Monthly Cost | Purpose |
|---|---|---|---|---|
| Support | AWS Business Support | 1-hour response for production-down, 24/7 chat | ~$1,000 or 10% of monthly spend | Required by most enterprise customers for insurance-grade SLA |
| Support | AWS IQ / Professional Services | 40 hours/mo of architect time | ~$5,000 | Migration assistance, Well-Architected reviews, training |
| Support | Training & Certification | 5 engineers, 3 courses each | ~$3,000 | SA-A, SAP, Security Specialty certs |
| Support | Marketplace Subscriptions | 3 third-party SaaS listings | ~$500 | Datadog Pro, Snowflake data sharing, PagerDuty |

Realistically, the first 6 months won't burn at the full target rate. Adoption curves look like this:
Y1 cumulative: ~$102K, which almost exactly absorbs the $100K in credits.
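
One way to sanity-check a cumulative figure like this is to accumulate a month-by-month spend ramp and compare it with the credit balance. The sketch below is illustrative only: `credit_runway` is a hypothetical helper, and it takes the ramp as an input because the plan's month-by-month figures are not reproduced here.

```python
from itertools import accumulate
from typing import Optional


def credit_runway(
    monthly_spend: list[float], credits: float
) -> tuple[list[float], Optional[int]]:
    """Return (running totals, first 1-based month that exceeds credits).

    The second element is None if cumulative spend never exceeds
    the credit balance within the supplied months.
    """
    running = list(accumulate(monthly_spend))
    exhausted = next(
        (month for month, total in enumerate(running, start=1) if total > credits),
        None,
    )
    return running, exhausted
```

For example, `credit_runway([10, 20, 30], credits=25)` returns `([10, 30, 60], 2)`: the running total first exceeds the balance in month 2. Feeding in the real 12-month ramp with `credits=100_000` would show in which month the credits run out.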
After Y1 we lock in 1-year and 3-year Reserved Instances / Savings Plans for known baseline, and right-size the rest. Target:
When customers reach 100K+ devices, the IoT + ML line items scale linearly. Plan for:
InsurTech infrastructure benchmarks (CoverWallet, Lemonade, Hippo, Root):
$5-7K/mo steady-state with a $100K Y1 burn is the sweet spot: expensive enough to be taken seriously, lean enough to show a path to profitability.
Apply at aws.amazon.com/activate:
For $100K credits, partner with the AWS Startup Loft team or your VC's AWS account team directly. The credits cover compute, ML, data transfer, and Business Support.