# Technology Selection Framework Structured decision framework for backend and full-stack technology choices. Prevents analysis paralysis while ensuring rigorous evaluation. **Iron Law: NO TECHNOLOGY CHOICE WITHOUT EXPLICIT TRADE-OFF ANALYSIS.** "I like it" and "it's trending" are not engineering arguments. --- ## Phase 1: Requirements Before Technology ### Non-Functional Requirements (Quantify!) | Dimension | Question | Bad Answer | Good Answer | |-----------|----------|-----------|-------------| | Scale | How many concurrent users? | "Lots" | "1K concurrent, 500 RPS peak" | | Latency | Acceptable p99 response time? | "Fast" | "< 200ms API, < 2s reports" | | Availability | Required uptime? | "Always up" | "99.9% (8.7h downtime/year)" | | Data volume | Expected storage growth? | "A lot" | "100GB/year, 10M rows" | | Consistency | Strong vs eventual? | "Consistent" | "Strong for payments, eventual for feeds" | | Compliance | Regulatory? | "Some" | "GDPR data residency EU, SOC 2 Type II" | ### Team Constraints - Team size and seniority level - What the team already knows well - Can you hire for this stack? (check job market) - Timeline pressure (days vs months to production) - Budget for licenses, infrastructure, training --- ## Phase 2: Evaluation Matrix Score each option 1-5 on weighted criteria: | Criterion | Weight | Option A | Option B | Option C | |-----------|--------|----------|----------|----------| | Meets functional requirements | 5× | _ | _ | _ | | Meets non-functional requirements | 5× | _ | _ | _ | | Team expertise / learning curve | 4× | _ | _ | _ | | Ecosystem maturity (libs, tools) | 3× | _ | _ | _ | | Community & long-term viability | 3× | _ | _ | _ | | Operational complexity | 3× | _ | _ | _ | | Hiring pool availability | 2× | _ | _ | _ | | Cost (license + infra + training) | 2× | _ | _ | _ | | **Weighted Total** | | _ | _ | _ | **Rules:** - Any option scoring **1 on a 5× criterion** → automatically disqualified - Options within **10%** of each other → choose what team knows best - Options within **15%** → run a **time-boxed PoC** (2-5 days max) --- ## Phase 3: Decision Trees ### Backend Language / Framework ``` What type of project? │ ├─ REST/GraphQL API, rapid development │ ├─ Team knows TypeScript → Node.js │ │ ├─ Full-featured, enterprise patterns → NestJS │ │ ├─ Lightweight, flexible → Fastify / Hono / Express │ │ └─ Full-stack with React → Next.js API routes │ ├─ Team knows Python │ │ ├─ High-perf async API → FastAPI │ │ ├─ Full-stack, admin-heavy → Django │ │ └─ Lightweight → Flask / Litestar │ └─ Team knows Java/Kotlin │ ├─ Enterprise, large team → Spring Boot │ └─ Lightweight, fast startup → Quarkus / Ktor │ ├─ High concurrency, systems-level │ ├─ Microservices, network → Go │ ├─ Extreme perf, safety → Rust (Axum / Actix) │ └─ Fault tolerance → Elixir (Phoenix) │ ├─ Real-time (WebSocket, streaming) │ ├─ Node.js ecosystem → Socket.io / ws │ ├─ Scalable pub/sub → Elixir Phoenix │ └─ Low-latency → Go / Rust │ └─ ML / data-intensive └─ Python (FastAPI + ML libs) ``` ### Database ``` What data model? │ ├─ Structured, relational, ACID │ ├─ General purpose → PostgreSQL ← DEFAULT CHOICE │ ├─ Read-heavy, MySQL ecosystem → MySQL / MariaDB │ └─ Embedded / serverless edge → SQLite / Turso / D1 │ ├─ Semi-structured, flexible schema │ ├─ Document-oriented → MongoDB │ ├─ Serverless document → DynamoDB / Firestore │ └─ Search-heavy → Elasticsearch / OpenSearch │ ├─ Key-value / cache │ ├─ In-memory + data structures → Redis / Valkey │ └─ Planet-scale KV → DynamoDB / Cassandra │ ├─ Time-series → TimescaleDB / ClickHouse / InfluxDB ├─ Graph → Neo4j / Apache AGE (Postgres extension) └─ Vector (AI embeddings) → pgvector / Pinecone / Qdrant ``` **Default:** Start with PostgreSQL. It handles 80% of use cases. ### Caching Strategy | Pattern | Technology | When | |---------|-----------|------| | Application cache | Redis / Valkey | Sessions, frequent reads, rate limiting | | HTTP cache | CDN (Cloudflare/Vercel) | Static assets, public API responses | | Query cache | Materialized views | Complex aggregations, dashboards | | In-process cache | LRU (in-memory) | Config, small lookup tables | | Edge cache | Cloudflare KV / Vercel KV | Global low-latency reads | ### Message Queue / Event Streaming | Pattern | Technology | When | |---------|-----------|------| | Task queue (background jobs) | BullMQ / Celery / SQS | Email, exports, payments | | Event streaming (replay, audit) | Kafka / Redpanda | Event sourcing, real-time pipelines | | Lightweight pub/sub | Redis Streams / NATS | Simple notifications, broadcasting | | Request-reply (sync over async) | NATS / RabbitMQ RPC | Internal service calls | ### Hosting / Deployment | Model | Technology | When | |-------|-----------|------| | Serverless (auto-scale) | Vercel / Cloudflare Workers / Lambda | Variable traffic, pay-per-use | | Container (predictable) | Cloud Run / Render / Railway / Fly.io | Steady traffic, simple ops | | Kubernetes (large scale) | EKS / GKE / AKS | 10+ services, team has K8s expertise | | VPS (full control) | DigitalOcean / Hetzner / EC2 | Predictable workload, cost-sensitive | --- ## Phase 4: Decision Documentation ### ADR (Architecture Decision Record) Template ```markdown # ADR-{NNN}: {Title} ## Status: Proposed | Accepted | Deprecated | Superseded by ADR-{NNN} ## Context What problem are we solving? What forces are at play? ## Decision What did we choose and why? ## Evaluation | Criterion | Weight | Chosen | Runner-up | |-----------|--------|--------|-----------| ## Consequences - Positive: ... - Negative: ... - Risks: ... ## Alternatives Rejected - Option B: rejected because... - Option C: rejected because... ``` --- ## Common Stack Templates ### A: Startup / MVP (Speed) | Layer | Choice | Why | |-------|--------|-----| | Language | TypeScript | One language front + back | | Framework | Next.js (full-stack) or NestJS (API) | Fast iteration | | Database | PostgreSQL (Supabase / Neon) | Managed, generous free tier | | Auth | Better Auth / Clerk | No auth code to maintain | | Cache | Redis (Upstash) | Serverless-friendly | | Hosting | Vercel / Railway | Zero-config deploys | ### B: SaaS / Business App (Balance) | Layer | Choice | Why | |-------|--------|-----| | Language | TypeScript or Python | Team preference | | Framework | NestJS or FastAPI | Structured, testable | | Database | PostgreSQL | Reliable, feature-rich | | Queue | BullMQ (Redis) | Simple background jobs | | Auth | OAuth 2.0 + JWT | Standard, flexible | | Hosting | AWS ECS / Cloud Run | Scalable containers | | Monitoring | Datadog / Grafana + Prometheus | Full observability | ### C: High-Performance (Scale) | Layer | Choice | Why | |-------|--------|-----| | Language | Go or Rust | Max throughput, low latency | | Database | PostgreSQL + Redis + ClickHouse | OLTP + cache + analytics | | Queue | Kafka / Redpanda | High-throughput streaming | | Hosting | Kubernetes (EKS/GKE) | Fine-grained scaling | | Monitoring | Prometheus + Grafana + Jaeger | Metrics + tracing | ### D: AI / ML Application | Layer | Choice | Why | |-------|--------|-----| | Language | Python (API) + TypeScript (frontend) | ML libs + modern UI | | Framework | FastAPI + Next.js | Async + SSR | | Database | PostgreSQL + pgvector | Relational + embeddings | | Queue | Celery + Redis | ML job processing | | Hosting | Modal / AWS GPU / Replicate | GPU access | --- ## Anti-Patterns | # | ❌ Don't | ✅ Do Instead | |---|---------|--------------| | 1 | "X is trending on HN" | Evaluate against YOUR requirements | | 2 | Resume-Driven Development | Choose what team can maintain | | 3 | "Must scale to 1M users" (day 1) | Build for 10× current need, not 1000× | | 4 | Evaluate for weeks | Time-box to 3-5 days, then decide | | 5 | No decision documentation | Write ADR for every major choice | | 6 | Ignore operational cost | Include deploy, monitor, debug cost | | 7 | "We'll rewrite later" | Assume you won't. Choose carefully. | | 8 | Microservices by default | Start monolith, extract when needed | | 9 | Different DB per service (day 1) | One database, split when justified | | 10 | "It worked at Google" | You're not Google. Scale to YOUR context. | --- ## Common Issues ### Issue 1: "Team can't agree on a framework" **Fix:** Time-box to 3 days. Fill the evaluation matrix. If scores within 10%, pick what the majority knows. Document in ADR. Move on. ### Issue 2: "We picked X but it doesn't fit" **Fix:** Sunk cost fallacy check. If < 2 weeks invested, switch now. If > 2 weeks, document pain points and plan phased migration. ### Issue 3: "Do we need microservices?" **Fix:** Almost certainly no. Start with a well-structured monolith. Extract to services only when: (a) different scaling needs, (b) different team ownership, (c) different deployment cadence.