skills/fullstack-dev/references/technology-selection.md
shihao 6487becf60 Initial commit: add all skills files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:52:49 +08:00

# Technology Selection Framework
Structured decision framework for backend and full-stack technology choices. Prevents analysis paralysis while ensuring rigorous evaluation.
**Iron Law: NO TECHNOLOGY CHOICE WITHOUT EXPLICIT TRADE-OFF ANALYSIS.**
"I like it" and "it's trending" are not engineering arguments.
---
## Phase 1: Requirements Before Technology
### Non-Functional Requirements (Quantify!)
| Dimension | Question | Bad Answer | Good Answer |
|-----------|----------|-----------|-------------|
| Scale | How many concurrent users? | "Lots" | "1K concurrent, 500 RPS peak" |
| Latency | Acceptable p99 response time? | "Fast" | "< 200ms API, < 2s reports" |
| Availability | Required uptime? | "Always up" | "99.9% (8.7h downtime/year)" |
| Data volume | Expected storage growth? | "A lot" | "100GB/year, 10M rows" |
| Consistency | Strong vs eventual? | "Consistent" | "Strong for payments, eventual for feeds" |
| Compliance | Regulatory? | "Some" | "GDPR data residency EU, SOC 2 Type II" |
### Team Constraints
- Team size and seniority level
- What the team already knows well
- Can you hire for this stack? (check job market)
- Timeline pressure (days vs months to production)
- Budget for licenses, infrastructure, training
---
## Phase 2: Evaluation Matrix
Score each option 1-5 on weighted criteria:
| Criterion | Weight | Option A | Option B | Option C |
|-----------|--------|----------|----------|----------|
| Meets functional requirements | 5× | _ | _ | _ |
| Meets non-functional requirements | 5× | _ | _ | _ |
| Team expertise / learning curve | 4× | _ | _ | _ |
| Ecosystem maturity (libs, tools) | 3× | _ | _ | _ |
| Community & long-term viability | 3× | _ | _ | _ |
| Operational complexity | 3× | _ | _ | _ |
| Hiring pool availability | 2× | _ | _ | _ |
| Cost (license + infra + training) | 2× | _ | _ | _ |
| **Weighted Total** | | _ | _ | _ |
**Rules:**
- Any option scoring **1 on a 5× criterion** → automatically disqualified
- Options within **10%** of each other → choose what the team knows best
- Options **10-15%** apart → run a **time-boxed PoC** (2-5 days max)
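The matrix above can be sketched in code. This is an illustrative sketch, not a prescribed tool: the criterion keys and weights mirror the table, while the option scores and the `within` helper are hypothetical.

```typescript
// Criterion -> score (1..5). Keys and weights mirror the evaluation matrix.
type Scores = Record<string, number>;

const weights: Scores = {
  functional: 5,      // meets functional requirements
  nonFunctional: 5,   // meets non-functional requirements
  teamExpertise: 4,   // team expertise / learning curve
  ecosystem: 3,       // ecosystem maturity (libs, tools)
  viability: 3,       // community & long-term viability
  opsComplexity: 3,   // operational complexity
  hiring: 2,          // hiring pool availability
  cost: 2,            // license + infra + training
};

// Returns null if the option is disqualified (score of 1 on a 5x criterion),
// otherwise the weighted total.
function weightedTotal(scores: Scores): number | null {
  let total = 0;
  for (const [criterion, weight] of Object.entries(weights)) {
    const score = scores[criterion];
    if (weight === 5 && score === 1) return null; // hard disqualification
    total += weight * score;
  }
  return total;
}

// "Within X%" check for the tie-breaking rules above.
function within(a: number, b: number, pct: number): boolean {
  return Math.abs(a - b) / Math.max(a, b) <= pct;
}
```

With all criteria scored 4, the weighted total is 4 × 27 = 108; two options at 108 and 100 are within 10%, so the team-familiarity tie-breaker applies.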
---
## Phase 3: Decision Trees
### Backend Language / Framework
```
What type of project?
├─ REST/GraphQL API, rapid development
│ ├─ Team knows TypeScript → Node.js
│ │ ├─ Full-featured, enterprise patterns → NestJS
│ │ ├─ Lightweight, flexible → Fastify / Hono / Express
│ │ └─ Full-stack with React → Next.js API routes
│ ├─ Team knows Python
│ │ ├─ High-perf async API → FastAPI
│ │ ├─ Full-stack, admin-heavy → Django
│ │ └─ Lightweight → Flask / Litestar
│ └─ Team knows Java/Kotlin
│ ├─ Enterprise, large team → Spring Boot
│ └─ Lightweight, fast startup → Quarkus / Ktor
├─ High concurrency, systems-level
│ ├─ Microservices, network → Go
│ ├─ Extreme perf, safety → Rust (Axum / Actix)
│ └─ Fault tolerance → Elixir (Phoenix)
├─ Real-time (WebSocket, streaming)
│ ├─ Node.js ecosystem → Socket.io / ws
│ ├─ Scalable pub/sub → Elixir Phoenix
│ └─ Low-latency → Go / Rust
└─ ML / data-intensive
└─ Python (FastAPI + ML libs)
```
### Database
```
What data model?
├─ Structured, relational, ACID
│ ├─ General purpose → PostgreSQL ← DEFAULT CHOICE
│ ├─ Read-heavy, MySQL ecosystem → MySQL / MariaDB
│ └─ Embedded / serverless edge → SQLite / Turso / D1
├─ Semi-structured, flexible schema
│ ├─ Document-oriented → MongoDB
│ ├─ Serverless document → DynamoDB / Firestore
│ └─ Search-heavy → Elasticsearch / OpenSearch
├─ Key-value / cache
│ ├─ In-memory + data structures → Redis / Valkey
│ └─ Planet-scale KV → DynamoDB / Cassandra
├─ Time-series → TimescaleDB / ClickHouse / InfluxDB
├─ Graph → Neo4j / Apache AGE (Postgres extension)
└─ Vector (AI embeddings) → pgvector / Pinecone / Qdrant
```
**Default:** Start with PostgreSQL. It handles 80% of use cases.
### Caching Strategy
| Pattern | Technology | When |
|---------|-----------|------|
| Application cache | Redis / Valkey | Sessions, frequent reads, rate limiting |
| HTTP cache | CDN (Cloudflare/Vercel) | Static assets, public API responses |
| Query cache | Materialized views | Complex aggregations, dashboards |
| In-process cache | LRU (in-memory) | Config, small lookup tables |
| Edge cache | Cloudflare KV / Vercel KV | Global low-latency reads |
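The "in-process cache" row can be sketched as a tiny Map-backed LRU, relying on the insertion-order iteration that JavaScript's `Map` guarantees. Capacity and types here are illustrative; a production codebase would more likely reach for an established package such as `lru-cache`.

```typescript
// Minimal in-process LRU cache for config and small lookup tables.
class LruCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark as most recently used (Map keeps insertion order).
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Evict the least recently used entry: the first key in iteration order.
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }
}
```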
### Message Queue / Event Streaming
| Pattern | Technology | When |
|---------|-----------|------|
| Task queue (background jobs) | BullMQ / Celery / SQS | Email, exports, payments |
| Event streaming (replay, audit) | Kafka / Redpanda | Event sourcing, real-time pipelines |
| Lightweight pub/sub | Redis Streams / NATS | Simple notifications, broadcasting |
| Request-reply (sync over async) | NATS / RabbitMQ RPC | Internal service calls |
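The task-queue row above follows a simple enqueue/process/retry shape. The sketch below is an in-process illustration of that pattern only; a real queue (BullMQ, Celery, SQS) adds persistence, concurrency, backoff, and dead-letter handling, and the names here are hypothetical.

```typescript
// Minimal in-process task queue illustrating the background-job pattern.
class TaskQueue<T> {
  private jobs: T[] = [];
  constructor(private handler: (payload: T) => Promise<void>) {}

  enqueue(payload: T): void {
    this.jobs.push(payload);
  }

  // Drain the queue, retrying each failed job once. Returns the number of
  // successfully processed jobs; real queues would add exponential backoff
  // and route repeated failures to a dead-letter queue.
  async drain(): Promise<number> {
    let processed = 0;
    while (this.jobs.length > 0) {
      const payload = this.jobs.shift()!;
      try {
        await this.handler(payload);
        processed++;
      } catch {
        try {
          await this.handler(payload); // single retry
          processed++;
        } catch {
          // dead-letter in a real system; dropped here
        }
      }
    }
    return processed;
  }
}
```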
### Hosting / Deployment
| Model | Technology | When |
|-------|-----------|------|
| Serverless (auto-scale) | Vercel / Cloudflare Workers / Lambda | Variable traffic, pay-per-use |
| Container (predictable) | Cloud Run / Render / Railway / Fly.io | Steady traffic, simple ops |
| Kubernetes (large scale) | EKS / GKE / AKS | 10+ services, team has K8s expertise |
| VPS (full control) | DigitalOcean / Hetzner / EC2 | Predictable workload, cost-sensitive |
---
## Phase 4: Decision Documentation
### ADR (Architecture Decision Record) Template
```markdown
# ADR-{NNN}: {Title}
## Status: Proposed | Accepted | Deprecated | Superseded by ADR-{NNN}
## Context
What problem are we solving? What forces are at play?
## Decision
What did we choose and why?
## Evaluation
| Criterion | Weight | Chosen | Runner-up |
|-----------|--------|--------|-----------|
## Consequences
- Positive: ...
- Negative: ...
- Risks: ...
## Alternatives Rejected
- Option B: rejected because...
- Option C: rejected because...
```
---
## Common Stack Templates
### A: Startup / MVP (Speed)
| Layer | Choice | Why |
|-------|--------|-----|
| Language | TypeScript | One language front + back |
| Framework | Next.js (full-stack) or NestJS (API) | Fast iteration |
| Database | PostgreSQL (Supabase / Neon) | Managed, generous free tier |
| Auth | Better Auth / Clerk | No auth code to maintain |
| Cache | Redis (Upstash) | Serverless-friendly |
| Hosting | Vercel / Railway | Zero-config deploys |
### B: SaaS / Business App (Balance)
| Layer | Choice | Why |
|-------|--------|-----|
| Language | TypeScript or Python | Team preference |
| Framework | NestJS or FastAPI | Structured, testable |
| Database | PostgreSQL | Reliable, feature-rich |
| Queue | BullMQ (Redis) | Simple background jobs |
| Auth | OAuth 2.0 + JWT | Standard, flexible |
| Hosting | AWS ECS / Cloud Run | Scalable containers |
| Monitoring | Datadog / Grafana + Prometheus | Full observability |
### C: High-Performance (Scale)
| Layer | Choice | Why |
|-------|--------|-----|
| Language | Go or Rust | Max throughput, low latency |
| Database | PostgreSQL + Redis + ClickHouse | OLTP + cache + analytics |
| Queue | Kafka / Redpanda | High-throughput streaming |
| Hosting | Kubernetes (EKS/GKE) | Fine-grained scaling |
| Monitoring | Prometheus + Grafana + Jaeger | Metrics + tracing |
### D: AI / ML Application
| Layer | Choice | Why |
|-------|--------|-----|
| Language | Python (API) + TypeScript (frontend) | ML libs + modern UI |
| Framework | FastAPI + Next.js | Async + SSR |
| Database | PostgreSQL + pgvector | Relational + embeddings |
| Queue | Celery + Redis | ML job processing |
| Hosting | Modal / AWS GPU / Replicate | GPU access |
---
## Anti-Patterns
| # | ❌ Don't | ✅ Do Instead |
|---|---------|--------------|
| 1 | "X is trending on HN" | Evaluate against YOUR requirements |
| 2 | Resume-Driven Development | Choose what team can maintain |
| 3 | "Must scale to 1M users" (day 1) | Build for 10× current need, not 1000× |
| 4 | Evaluate for weeks | Time-box to 3-5 days, then decide |
| 5 | No decision documentation | Write ADR for every major choice |
| 6 | Ignore operational cost | Include deploy, monitor, debug cost |
| 7 | "We'll rewrite later" | Assume you won't. Choose carefully. |
| 8 | Microservices by default | Start monolith, extract when needed |
| 9 | Different DB per service (day 1) | One database, split when justified |
| 10 | "It worked at Google" | You're not Google. Scale to YOUR context. |
---
## Common Issues
### Issue 1: "Team can't agree on a framework"
**Fix:** Time-box to 3 days. Fill the evaluation matrix. If scores within 10%, pick what the majority knows. Document in ADR. Move on.
### Issue 2: "We picked X but it doesn't fit"
**Fix:** Sunk cost fallacy check. If < 2 weeks invested, switch now. If > 2 weeks, document pain points and plan phased migration.
### Issue 3: "Do we need microservices?"
**Fix:** Almost certainly no. Start with a well-structured monolith. Extract to services only when: (a) different scaling needs, (b) different team ownership, (c) different deployment cadence.