Initial commit: add all skills files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:52:49 +08:00
commit 6487becf60
396 changed files with 108871 additions and 0 deletions
--- a/fullstack-dev/references/api-design.md
+++ b/fullstack-dev/references/api-design.md
@@ -0,0 +1,444 @@
+---
+name: fullstack-dev-api-design
+description: "API design patterns and best practices. Use when creating endpoints, choosing methods/status codes, implementing pagination, or writing OpenAPI specs. Prevents common REST/GraphQL/gRPC mistakes."
+license: MIT
+metadata:
+  version: "2.0.0"
+  sources:
+    - Microsoft REST API Guidelines
+    - Google API Design Guide
+    - Zalando RESTful API Guidelines
+    - JSON:API Specification
+    - RFC 9457 (Problem Details for HTTP APIs)
+    - RFC 9110 (HTTP Semantics)
+---
+
+# API Design Guidelines
+
+Framework-agnostic API design guide for backend and full-stack engineers. 50+ rules across 10 categories, prioritized by impact. Covers REST, GraphQL, and gRPC.
+
+## Scope
+
+**USE this skill when:**
+- Designing a new API or adding endpoints
+- Reviewing API pull requests
+- Choosing between REST / GraphQL / gRPC
+- Writing OpenAPI specifications
+- Migrating or versioning an existing API
+
+**NOT for:**
+- Framework-specific implementation details (use your framework's own skill/docs)
+- Frontend data fetching patterns (use React Query / SWR docs)
+- Authentication implementation details (use your auth library's docs)
+- Database schema design (→ `database-schema-design`)
+
+## Context Required
+
+Before applying this skill, gather:
+
+| Required | Optional |
+|----------|----------|
+| Target consumers (browser, mobile, service) | Existing API conventions in the project |
+| Expected request volume (RPS estimate) | Current OpenAPI / Swagger spec |
+| Authentication method (JWT, API key, OAuth) | Rate limiting requirements |
+| Data model / domain entities | Caching strategy |
+
+---
+
+## Quick Start Checklist
+
+New API endpoint? Run through this before writing code:
+
+- [ ] Resource named as **plural noun** (`/orders`, not `/getOrders`)
+- [ ] URL in **kebab-case**, body fields in **camelCase**
+- [ ] Correct **HTTP method** (GET=read, POST=create, PUT=replace, PATCH=partial, DELETE=remove)
+- [ ] Correct **status code** (201 Created, 422 Validation, 404 Not Found…)
+- [ ] Error response follows **RFC 9457** envelope
+- [ ] **Pagination** on all list endpoints (default 20, max 100)
+- [ ] **Authentication** required (Bearer token, not query param)
+- [ ] **Request ID** in response header (`X-Request-Id`)
+- [ ] **Rate limit** headers included
+- [ ] Endpoint documented in **OpenAPI spec**
+
+---
+
+## Quick Navigation
+
+| Need to… | Jump to |
+|----------|---------|
+| Name a resource URL | [1. Resource Modeling](#1-resource-modeling-critical) |
+| Pick HTTP method + status code | [3. HTTP Methods & Status Codes](#3-http-methods--status-codes-critical) |
+| Format error responses | [4. Error Handling](#4-error-handling-high) |
+| Add pagination or filtering | [6. Pagination & Filtering](#6-pagination--filtering-high) |
+| Choose API style (REST vs GraphQL vs gRPC) | [10. API Style Decision](#10-api-style-decision-tree) |
+| Version an existing API | [7. Versioning](#7-versioning-medium-high) |
+| Avoid common mistakes | [Anti-Patterns](#anti-patterns-checklist) |
+
+---
+
+## 1. Resource Modeling (CRITICAL)
+
+### Core Rules
+
+```
+✅ /users                         — plural noun
+✅ /users/{id}/orders              — 1 level nesting
+✅ /reviews?orderId={oid}          — flatten deep nesting with query params
+
+❌ /getUsers                       — verb in URL
+❌ /user                           — singular
+❌ /users/{uid}/orders/{oid}/items/{iid}/reviews  — 3+ levels deep
+```
+
+**Max nesting: 2 levels.** Beyond that, promote to top-level resource with filters.
+
+### Domain Alignment
+
+Resources map to **domain concepts**, not database tables:
+
+```
+✅ /checkout-sessions       (domain aggregate)
+✅ /shipping-labels          (domain concept)
+
+❌ /tbl_order_header          (database table leak)
+❌ /join_user_role            (internal schema leak)
+```
+
+---
+
+## 2. URL & Naming (CRITICAL)
+
+| Context | Convention | Example |
+|---------|-----------|---------|
+| URL path | kebab-case | `/order-items` |
+| JSON body fields | camelCase | `{ "firstName": "Jane" }` |
+| Query params | camelCase or snake_case (be consistent) | `?sortBy=createdAt` |
+| Headers | Train-Case | `X-Request-Id` |
+
+**Python exception:** If your entire stack is Python/snake_case, you MAY use `snake_case` in JSON — but be **consistent across all endpoints**.
+
+```
+✅ GET /users          ❌ GET /users/
+✅ GET /reports/annual  ❌ GET /reports/annual.json
+✅ POST /users          ❌ POST /users/create
+```
+
+---
+
+## 3. HTTP Methods & Status Codes (CRITICAL)
+
+### Method Semantics
+
+| Method | Semantics | Idempotent | Safe | Request Body |
+|--------|-----------|-----------|------|-------------|
+| GET | Read | ✅ | ✅ | ❌ Never |
+| POST | Create / Action | ❌ | ❌ | ✅ Always |
+| PUT | Full replace | ✅ | ❌ | ✅ Always |
+| PATCH | Partial update | ❌ (not guaranteed) | ❌ | ✅ Always |
+| DELETE | Remove | ✅ | ❌ | ❌ Rarely |
+
+### Status Code Quick Reference
+
+**Success:**
+
+| Code | When | Response Body |
+|------|------|--------------|
+| 200 OK | GET, PUT, PATCH success | Resource / result |
+| 201 Created | POST created resource | Created resource + `Location` header |
+| 202 Accepted | Async operation started | Job ID / status URL |
+| 204 No Content | DELETE success, PUT with no body | None |
+
+**Client Errors:**
+
+| Code | When | Key Distinction |
+|------|------|-----------------|
+| 400 Bad Request | Malformed syntax | Can't even parse |
+| 401 Unauthorized | Missing / invalid auth | "Who are you?" |
+| 403 Forbidden | Authenticated, no permission | "I know you, but no" |
+| 404 Not Found | Resource doesn't exist | Also use to hide 403 |
+| 409 Conflict | Duplicate, version mismatch | State conflict |
+| 422 Unprocessable | Valid syntax, failed validation | Semantic errors |
+| 429 Too Many Requests | Rate limit hit | Include `Retry-After` |
+
+**Server Errors:** 500 (unexpected), 502 (upstream fail), 503 (overloaded), 504 (upstream timeout)
+
+---
+
+## 4. Error Handling (HIGH)
+
+### Standard Error Envelope (RFC 9457)
+
+Every error response uses this format:
+
+```json
+{
+  "type": "https://api.example.com/errors/insufficient-funds",
+  "title": "Insufficient Funds",
+  "status": 422,
+  "detail": "Account balance $10.00 is less than withdrawal $50.00.",
+  "instance": "/transactions/txn_abc123",
+  "request_id": "req_7f3a8b2c",
+  "errors": [
+    { "field": "amount", "message": "Exceeds balance", "code": "INSUFFICIENT_BALANCE" }
+  ]
+}
+```
+
+### Multi-Language Implementation
+
+**TypeScript (Express):**
+```typescript
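+import type { NextFunction, Request, Response } from 'express';
+
+class AppError extends Error {
+  constructor(
+    public readonly title: string,
+    public readonly status: number,
+    public readonly detail: string,
+    public readonly code: string,
+  ) { super(detail); }
+}
+
+// Error-handling middleware: register after all routes.
+// `req.id` is assumed to be set by an earlier request-ID middleware.
+app.use((err: Error, req: Request & { id?: string }, res: Response, _next: NextFunction) => {
+  if (err instanceof AppError) {
+    // RFC 9457 problem details use the application/problem+json media type
+    return res.status(err.status).type('application/problem+json').json({
+      type: `https://api.example.com/errors/${err.code}`,
+      title: err.title, status: err.status,
+      detail: err.detail, request_id: req.id,
+    });
+  }
+  // Unknown error: return a generic body and keep internals out of the response
+  res.status(500).type('application/problem+json').json({ title: 'Internal Error', status: 500, request_id: req.id });
+});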
+```
+
+**Python (FastAPI):**
+```python
+from fastapi import Request
+from fastapi.responses import JSONResponse
+
+class AppError(Exception):
+    def __init__(self, title: str, status: int, detail: str, code: str):
+        self.title, self.status, self.detail, self.code = title, status, detail, code
+
+@app.exception_handler(AppError)
+async def app_error_handler(request: Request, exc: AppError):
+    return JSONResponse(status_code=exc.status, media_type="application/problem+json", content={
+        "type": f"https://api.example.com/errors/{exc.code}",
+        "title": exc.title, "status": exc.status,
+        "detail": exc.detail, "request_id": request.state.request_id,
+    })
+```
+
+### Iron Rules
+
+```
+✅ Return RFC 9457 error envelope for ALL errors
+✅ Include request_id in every error response
+✅ Return per-field validation errors in `errors` array
+
+❌ Never expose stack traces in production
+❌ Never return 200 for errors
+❌ Never swallow errors silently
+```
+
+---
+
+## 5. Authentication & Authorization (HIGH)
+
+```
+✅ Authorization: Bearer eyJhbGci...      (header)
+❌ GET /users?token=eyJhbGci...            (URL — appears in logs)
+
+✅ 401 → "Who are you?"  (missing/invalid credentials)
+✅ 403 → "You can't do this"  (authenticated, no permission)
+✅ 404 → Hide resource existence  (use instead of 403 when needed)
+```
+
+**Rate Limit Headers (always include; `Retry-After` only on 429 responses):**
+```
+X-RateLimit-Limit: 100
+X-RateLimit-Remaining: 42
+X-RateLimit-Reset: 1625097600
+Retry-After: 30
+```
+
+---
+
+## 6. Pagination & Filtering (HIGH)
+
+### Cursor vs Offset
+
+| Strategy | When | Pros | Cons |
+|----------|------|------|------|
+| **Cursor** (preferred) | Large/dynamic datasets | Consistent, no skips | Can't jump to page N |
+| **Offset** | Small/stable datasets, admin UIs | Simple, page jumps | Drift on insert/delete |
+
+**Cursor pagination response:**
+```json
+{
+  "data": [...],
+  "pagination": { "next_cursor": "eyJpZCI6MTIwfQ", "has_more": true }
+}
+```
+
+**Offset pagination response:**
+```json
+{
+  "data": [...],
+  "pagination": { "page": 3, "per_page": 20, "total": 256, "total_pages": 13 }
+}
+```
+
+**Always enforce:** Default 20 items, max 100 items.
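+
+The limit rule and the cursor response above can be sketched as follows. This is a minimal illustration, assuming a Node.js runtime (for `Buffer`) and a cursor that carries only the last-seen numeric `id`. The helper names `clampLimit`, `encodeCursor`, `decodeCursor`, and `buildPage` are hypothetical, not part of any library.
+
+```typescript
+// Hypothetical helpers; the cursor payload shape ({ id }) is an assumption.
+const DEFAULT_LIMIT = 20;
+const MAX_LIMIT = 100;
+
+interface Cursor { id: number }
+
+// Clamp a client-supplied limit to [1, MAX_LIMIT], falling back to the default.
+function clampLimit(raw: string | undefined): number {
+  const n = Number.parseInt(raw ?? "", 10);
+  if (Number.isNaN(n) || n < 1) return DEFAULT_LIMIT;
+  return Math.min(n, MAX_LIMIT);
+}
+
+// Encode the last-seen row as an opaque, URL-safe cursor.
+function encodeCursor(cursor: Cursor): string {
+  return Buffer.from(JSON.stringify(cursor)).toString("base64url");
+}
+
+// Decode a cursor; returns null for malformed input so the caller can reject it.
+function decodeCursor(value: string): Cursor | null {
+  try {
+    const parsed: unknown = JSON.parse(Buffer.from(value, "base64url").toString("utf8"));
+    if (typeof parsed === "object" && parsed !== null && typeof (parsed as Cursor).id === "number") {
+      return { id: (parsed as Cursor).id };
+    }
+    return null;
+  } catch {
+    return null;
+  }
+}
+
+// Build the response envelope from rows fetched with "limit + 1".
+function buildPage<T extends { id: number }>(rows: T[], limit: number) {
+  const hasMore = rows.length > limit;
+  const data = hasMore ? rows.slice(0, limit) : rows;
+  const last = data[data.length - 1];
+  return {
+    data,
+    pagination: {
+      next_cursor: hasMore && last ? encodeCursor({ id: last.id }) : null,
+      has_more: hasMore,
+    },
+  };
+}
+```
+
+Fetching one row more than the limit lets the server set `has_more` without a separate count query.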
+
+### Standard Filter Patterns
+
+```
+GET /orders?status=shipped&created_after=2025-01-01&sort=-created_at&fields=id,status
+```
+
+| Pattern | Convention |
+|---------|-----------|
+| Exact match | `?status=shipped` |
+| Range | `?price_gte=10&price_lte=100` |
+| Date range | `?created_after=2025-01-01&created_before=2025-12-31` |
+| Sort | `?sort=field` (asc), `?sort=-field` (desc) |
+| Sparse fields | `?fields=id,name,email` |
+| Search | `?q=search+term` |
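+
+The sort convention in the table can be parsed with a few lines. The sketch below handles a single sort field; the allow-list parameter and the names `parseSort` and `SortSpec` are assumptions added for illustration, so that clients cannot sort on arbitrary columns.
+
+```typescript
+type SortDirection = "asc" | "desc";
+interface SortSpec { field: string; direction: SortDirection }
+
+// Parse "?sort=field" (ascending) or "?sort=-field" (descending).
+// Throws for fields outside the allow-list; the caller decides the status code.
+function parseSort(raw: string, allowed: ReadonlySet<string>): SortSpec {
+  const value = raw.trim();
+  const descending = value.startsWith("-");
+  const field = descending ? value.slice(1) : value;
+  if (!allowed.has(field)) {
+    throw new Error(`Unsupported sort field: ${field}`);
+  }
+  return { field, direction: descending ? "desc" : "asc" };
+}
+```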
+
+---
+
+## 7. Versioning (MEDIUM-HIGH)
+
+| Strategy | Format | Best For |
+|----------|--------|----------|
+| **URL path** (recommended) | `/v1/users` | Public APIs |
+| **Header** | `Api-Version: 2` | Internal APIs |
+| **Query param** | `?version=2` | Legacy (avoid) |
+
+**Non-breaking changes (no version bump):** New optional response fields, new endpoints, new optional params.
+
+**Breaking changes (new version required):** Removing/renaming fields, changing types, stricter validation, removing endpoints.
+
+**Deprecation headers:**
+```
+Sunset: Sun, 01 Mar 2026 00:00:00 GMT
+Deprecation: @1767225600
+Link: <https://api.example.com/v2/users>; rel="successor-version"
+```
+
+---
+
+## 8. Request / Response Design (MEDIUM)
+
+### Consistent Envelope
+
+```json
+{
+  "data": { "id": "ord_123", "status": "pending", "total": 99.50 },
+  "meta": { "request_id": "req_abc123", "timestamp": "2025-06-15T10:30:00Z" }
+}
+```
+
+### Key Rules
+
+| Rule | Correct | Wrong |
+|------|---------|-------|
+| Timestamps | `"2025-06-15T10:30:00Z"` (ISO 8601) | `"06/15/2025"` or `1749983400` |
+| Public IDs | UUID `"550e8400-..."` | Auto-increment `42` |
+| Null vs absent (PATCH) | `{ "nickname": null }` clears the field; an absent field is left unchanged | Treating an absent field as `null` |
+| HATEOAS (public APIs) | `"links": { "cancel": "/orders/123/cancel" }` | No discoverability |
+
+---
+
+## 9. Documentation — OpenAPI (MEDIUM)
+
+**Design-first workflow:**
+
+```
+1. Write OpenAPI 3.1 spec
+2. Review spec with stakeholders
+3. Generate server stubs + client SDKs
+4. Implement handlers
+5. Validate responses against spec in CI
+```
+
+Every endpoint documents: summary, all parameters, request body + examples, all response codes + schemas, auth requirements.
+
+---
+
+## 10. API Style Decision Tree
+
+```
+What kind of API?
+│
+├─ Browser + mobile clients, flexible queries
+│   └─ GraphQL
+│       Rules: DataLoader (no N+1), depth limit ≤7, Relay pagination
+│
+├─ Standard CRUD, public consumers, caching important
+│   └─ REST (this guide)
+│       Rules: Resources, HTTP methods, status codes, OpenAPI
+│
+├─ Service-to-service, high throughput, strong typing
+│   └─ gRPC
+│       Rules: Protobuf schemas, streaming for large data, deadlines
+│
+├─ Full-stack TypeScript, same team owns client + server
+│   └─ tRPC
+│       Rules: Shared types, no code generation needed
+│
+└─ Real-time bidirectional
+    └─ WebSocket / SSE
+        Rules: Heartbeat, reconnection, message ordering
+```
+
+---
+
+## Anti-Patterns Checklist
+
+| # | ❌ Don't | ✅ Do Instead |
+|---|---------|--------------|
+| 1 | Verbs in URLs (`/getUser`) | HTTP methods + noun resources |
+| 2 | Return 200 for errors | Correct 4xx/5xx status codes |
+| 3 | Mix naming styles | One convention per context |
+| 4 | Expose database IDs | UUIDs for public identifiers |
+| 5 | No pagination on lists | Always paginate (default 20) |
+| 6 | Swallow errors silently | Structured RFC 9457 errors |
+| 7 | Token in URL query | Authorization header |
+| 8 | Deep nesting (3+ levels) | Flatten with query params |
+| 9 | Break changes without version | Maintain compatibility or version |
+| 10 | No rate limiting | Implement + communicate via headers |
+| 11 | No request ID | `X-Request-Id` on every response |
+| 12 | Stack traces in production | Safe error message + internal log |
+
+---
+
+## Common Issues
+
+### Issue 1: "Should this be a new resource or a sub-resource?"
+
+**Symptom:** URL path keeps growing (`/users/{id}/orders/{id}/items/{id}/reviews`)
+
+**Rule:** If the child entity makes sense on its own, promote it. If it only exists within the parent context, keep it nested (max 2 levels).
+
+```
+/reviews?orderId=123      ✅  (reviews exist independently)
+/orders/{id}/items         ✅  (items belong to orders, 1 level)
+```
+
+### Issue 2: "PUT or PATCH?"
+
+**Symptom:** Team can't agree on update semantics.
+
+**Rule:**
+- PUT = client sends **complete** resource (missing fields → set to default/null)
+- PATCH = client sends **only changed fields** (missing fields → unchanged)
+- When unsure → **PATCH** (safer, less surprising)
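+
+The difference can be made concrete with a small sketch. The `UserProfile` shape and both function names are hypothetical. The point is only that PATCH skips absent fields while PUT resets them.
+
+```typescript
+// Hypothetical resource shape, for illustration only.
+interface UserProfile { name: string; nickname: string | null; bio: string | null }
+
+// PATCH: absent field means unchanged, null means clear, any other value replaces.
+function applyPatch(current: UserProfile, patch: Partial<UserProfile>): UserProfile {
+  const next: UserProfile = { ...current };
+  if (patch.name !== undefined) next.name = patch.name;
+  if (patch.nickname !== undefined) next.nickname = patch.nickname;
+  if (patch.bio !== undefined) next.bio = patch.bio;
+  return next;
+}
+
+// PUT: the body is the complete resource, so missing optional fields reset to null.
+function applyPut(body: { name: string; nickname?: string | null; bio?: string | null }): UserProfile {
+  return { name: body.name, nickname: body.nickname ?? null, bio: body.bio ?? null };
+}
+```
+
+With a stored profile of `{ name: "Jane", nickname: "JJ", bio: "Hi" }`, a PATCH of `{ "nickname": null }` clears only the nickname, whereas a PUT of `{ "name": "Jane" }` clears both nickname and bio.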
+
+### Issue 3: "400 or 422?"
+
+**Symptom:** Inconsistent validation error codes.
+
+**Rule:**
+- 400 = can't parse request at all (malformed JSON, truncated body); an unsupported content type is 415
+- 422 = parsed OK, but values fail validation (invalid email, negative quantity)
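+
+A minimal sketch of that split, assuming a hypothetical create-order body with `email` and `quantity` fields. The names `parseCreateOrder` and `Problem` are illustrative, and the email check is deliberately crude.
+
+```typescript
+// Hypothetical request shape and helper names, for illustration only.
+interface FieldError { field: string; message: string }
+interface Problem { title: string; status: 400 | 422; detail: string; errors?: FieldError[] }
+interface CreateOrderInput { email: string; quantity: number }
+type ParseResult = { ok: true; value: CreateOrderInput } | { ok: false; problem: Problem };
+
+function parseCreateOrder(rawBody: string): ParseResult {
+  let body: unknown;
+  try {
+    body = JSON.parse(rawBody);
+  } catch {
+    // The body cannot be parsed at all: 400
+    return { ok: false, problem: { title: "Bad Request", status: 400, detail: "Request body is not valid JSON." } };
+  }
+
+  // Parsed fine; from here on, every failure is a validation error: 422
+  const fields = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;
+  const { email, quantity } = fields;
+  const errors: FieldError[] = [];
+  if (typeof email !== "string" || !email.includes("@")) {
+    errors.push({ field: "email", message: "Must be a valid email address" });
+  }
+  if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity <= 0) {
+    errors.push({ field: "quantity", message: "Must be a positive integer" });
+  }
+  if (errors.length > 0) {
+    return {
+      ok: false,
+      problem: { title: "Validation Failed", status: 422, detail: "One or more fields are invalid.", errors },
+    };
+  }
+  return { ok: true, value: { email: email as string, quantity: quantity as number } };
+}
+```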
--- a/fullstack-dev/references/auth-flow.md
+++ b/fullstack-dev/references/auth-flow.md
@@ -0,0 +1,165 @@
+# Authentication Flow Patterns
+
+Complete auth flow across frontend and backend. Covers JWT bearer flow, automatic token refresh, Next.js server-side auth, RBAC, and backend middleware order.
+
+---
+
+## JWT Bearer Flow (Most Common)
+
+```
+1. Login
+   Client → POST /api/auth/login { email, password }
+   Server → { accessToken (15min), refreshToken (7d, httpOnly cookie) }
+
+2. Authenticated Requests
+   Client → GET /api/orders  Authorization: Bearer <accessToken>
+   Server → validates JWT → returns data
+
+3. Token Refresh (transparent)
+   Client → 401 received → POST /api/auth/refresh (cookie auto-sent)
+   Server → new accessToken
+   Client → retry original request with new token
+
+4. Logout
+   Client → POST /api/auth/logout
+   Server → invalidate refresh token → clear cookie
+```
+
+---
+
+## Frontend: Automatic Token Refresh
+
+```typescript
+// lib/api-client.ts — add to existing fetch wrapper
+async function apiWithRefresh<T>(path: string, options: RequestInit = {}): Promise<T> {
+  try {
+    return await api<T>(path, options);
+  } catch (err) {
+    if (err instanceof ApiError && err.status === 401) {
+      // Try refresh
+      const refreshed = await api<{ accessToken: string }>('/api/auth/refresh', {
+        method: 'POST',
+        credentials: 'include',  // send httpOnly cookie
+      });
+      setAuthToken(refreshed.accessToken);
+      // Retry original request
+      return api<T>(path, options);
+    }
+    throw err;
+  }
+}
+```
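+
+One gap in the wrapper above: if several requests receive a 401 at the same moment, each one calls the refresh endpoint. A possible remedy is to share a single in-flight refresh. The `singleFlight` helper below is a hypothetical sketch, not part of any library; in the wrapper it would wrap the refresh call.
+
+```typescript
+// Wrap an async task so that concurrent callers share one in-flight promise.
+function singleFlight<T>(task: () => Promise<T>): () => Promise<T> {
+  let inFlight: Promise<T> | null = null;
+  return () => {
+    if (inFlight === null) {
+      // Clear the slot once the task settles so a later call starts a new run
+      inFlight = task().finally(() => {
+        inFlight = null;
+      });
+    }
+    return inFlight;
+  };
+}
+```
+
+A caller would create the shared function once, for example `const refreshOnce = singleFlight(() => refreshAccessToken())`, where `refreshAccessToken` stands in for the POST to `/api/auth/refresh`.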
+
+---
+
+## Next.js: Server-Side Auth (App Router)
+
+```typescript
+// middleware.ts — protect routes server-side
+import { NextResponse } from 'next/server';
+import type { NextRequest } from 'next/server';
+
+export function middleware(request: NextRequest) {
+  const token = request.cookies.get('session')?.value;
+  if (!token && request.nextUrl.pathname.startsWith('/dashboard')) {
+    return NextResponse.redirect(new URL('/login', request.url));
+  }
+  return NextResponse.next();
+}
+
+// app/dashboard/page.tsx — server component with auth
+import { cookies } from 'next/headers';
+import { redirect } from 'next/navigation';
+
+export default async function Dashboard() {
+  const token = (await cookies()).get('session')?.value;
+  if (!token) redirect('/login');  // never send "Bearer undefined"
+
+  const res = await fetch(`${process.env.API_URL}/api/me`, {
+    headers: { Authorization: `Bearer ${token}` },
+  });
+  if (!res.ok) redirect('/login');  // expired or rejected session
+  const user = await res.json();
+
+  return <DashboardContent user={user} />;
+}
+```
+
+---
+
+## Backend: Standard Middleware Order
+
+```
+Request → 1.RequestID → 2.Logging → 3.CORS → 4.RateLimit → 5.BodyParse
+       → 6.Auth → 7.Authz → 8.Validation → 9.Handler → 10.ErrorHandler → Response
+```
+
+---
+
+## Backend: JWT Rules
+
+```
+✅ Short expiry access token (15min) + refresh token (server-stored)
+✅ Minimal claims: userId, roles (not entire user object)
+✅ Rotate signing keys periodically
+
+❌ Never store tokens in localStorage (XSS risk)
+❌ Never pass tokens in URL query params
+```
+
+---
+
+## Backend: RBAC Pattern
+
+```typescript
+function authorize(...roles: Role[]) {
+  return (req, res, next) => {
+    if (!req.user) throw new UnauthorizedError();
+    if (!roles.some(r => req.user.roles.includes(r))) throw new ForbiddenError();
+    next();
+  };
+}
+router.delete('/users/:id', authenticate, authorize('admin'), deleteUser);
+```
+
+---
+
+## Auth Decision Table
+
+| Method | When | Frontend |
+|--------|------|----------|
+| Session | Same-domain, SSR, Django templates | Django templates / htmx |
+| JWT | Different domain, SPA, mobile | React, Vue, mobile apps |
+| OAuth2 | Third-party login, API consumers | Any |
+
+---
+
+## Iron Rules
+
+```
+✅ Access token: short-lived (15min), in memory
+✅ Refresh token: httpOnly cookie (XSS-safe)
+✅ Automatic transparent refresh on 401
+✅ Redirect to login when refresh fails
+
+❌ Never store tokens in localStorage (XSS risk)
+❌ Never send tokens in URL query params (logged)
+❌ Never trust client-side auth checks alone (server must validate)
+```
+
+---
+
+## Common Issues
+
+### Issue 1: "Auth works on page load but breaks on navigation"
+
+**Cause:** Token stored in component state (lost on unmount).
+
+**Fix:** Keep the access token somewhere that outlives a single component:
+- React Context (survives navigation, lost on refresh; recover it via the refresh cookie)
+- Cookie (survives refresh)
+- React Query cache with `staleTime: Infinity` for session
+
+### Issue 2: "CORS error with auth requests"
+
+**Cause:** Missing `credentials: 'include'` on frontend or `credentials: true` on backend CORS config.
+
+**Fix:**
+1. Frontend: `fetch(url, { credentials: 'include' })`
+2. Backend: `cors({ origin: 'https://your-frontend.com', credentials: true })`
+3. Backend: explicit origin (not `*`) when using credentials
--- a/fullstack-dev/references/db-schema.md
+++ b/fullstack-dev/references/db-schema.md
@@ -0,0 +1,706 @@
+---
+name: fullstack-dev-db-schema
+description: "Database schema design and migrations. Use when creating tables, defining ORM models, adding indexes, or designing relationships. Covers zero-downtime migrations and multi-tenancy."
+license: MIT
+metadata:
+  version: "1.0.0"
+  sources:
+    - PostgreSQL official documentation
+    - Use The Index, Luke (use-the-index-luke.com)
+    - Designing Data-Intensive Applications (Martin Kleppmann)
+    - Database Reliability Engineering (Laine Campbell & Charity Majors)
+---
+
+# Database Schema Design
+
+ORM-agnostic guide for relational database schema design. Covers data modeling, normalization, indexing, migrations, multi-tenancy, and common application patterns. Primarily PostgreSQL-focused but principles apply to MySQL/MariaDB.
+
+## Scope
+
+**USE this skill when:**
+- Designing a schema for a new project or feature
+- Deciding between normalization and denormalization
+- Choosing which indexes to create
+- Planning a zero-downtime migration on a live database
+- Implementing multi-tenant data isolation
+- Adding audit trails, soft delete, or versioning
+- Diagnosing slow queries caused by schema problems
+
+**NOT for:**
+- Choosing which database technology to use (→ `technology-selection`)
+- PostgreSQL-specific query tuning (use PostgreSQL performance docs)
+- ORM-specific configuration (→ `django-best-practices` or your ORM's docs)
+- Application-layer caching (→ `fullstack-dev-practices`)
+
+## Context Required
+
+| Required | Optional |
+|----------|----------|
+| Database engine (PostgreSQL / MySQL) | Expected data volume (rows, growth rate) |
+| Domain entities and relationships | Read/write ratio |
+| Key access patterns (queries) | Multi-tenant requirements |
+
+---
+
+## Quick Start Checklist
+
+Designing a new schema:
+
+- [ ] **Domain entities identified** — map 1 entity = 1 table (not 1 class = 1 table)
+- [ ] **Primary keys**: UUID for public IDs, serial/bigserial for internal-only
+- [ ] **Foreign keys** with explicit `ON DELETE` behavior
+- [ ] **NOT NULL** by default — nullable only when business logic requires it
+- [ ] **Timestamps**: `created_at` + `updated_at` on every table
+- [ ] **Indexes** created for columns used in WHERE, JOIN, ORDER BY on frequent queries (see "When NOT to Index")
+- [ ] **No premature denormalization** — start normalized, denormalize when measured
+- [ ] **Naming convention** consistent: `snake_case`, plural table names
+
+---
+
+## Quick Navigation
+
+| Need to… | Jump to |
+|----------|---------|
+| Model entities and relationships | [1. Data Modeling](#1-data-modeling-critical) |
+| Decide normalize vs denormalize | [2. Normalization](#2-normalization-vs-denormalization-critical) |
+| Choose the right index | [3. Indexing](#3-indexing-strategy-critical) |
+| Run migrations safely on live DB | [4. Migrations](#4-zero-downtime-migrations-high) |
+| Design multi-tenant schema | [5. Multi-Tenancy](#5-multi-tenant-design-high) |
+| Add soft delete / audit trails | [6. Common Patterns](#6-common-schema-patterns-medium) |
+| Partition large tables | [7. Partitioning](#7-table-partitioning-medium) |
+| See anti-patterns | [Anti-Patterns](#anti-patterns) |
+
+---
+
+## Core Principles (7 Rules)
+
+```
+1. ✅ Start normalized (3NF) — denormalize only when you have measured evidence
+2. ✅ Every table has a primary key, created_at, updated_at
+3. ✅ UUID for public-facing IDs, serial for internal join keys
+4. ✅ NOT NULL by default — null is a business decision, not a lazy default
+5. ✅ Index columns used in WHERE, JOIN, ORDER BY, except where "When NOT to Index" applies
+6. ✅ Foreign keys enforced in database (not just application code)
+7. ✅ Migrations are additive — never drop/rename in production without a multi-step plan
+```
+
+---
+
+## 1. Data Modeling (CRITICAL)
+
+### Table Naming
+
+```sql
+-- ✅ Plural, snake_case
+CREATE TABLE orders (...);
+CREATE TABLE order_items (...);
+CREATE TABLE user_profiles (...);
+
+-- ❌ Singular, mixed case
+CREATE TABLE Order (...);
+CREATE TABLE OrderItem (...);
+CREATE TABLE tbl_usr_prof (...);    -- cryptic abbreviation
+```
+
+### Primary Keys
+
+| Strategy | When | Pros | Cons |
+|----------|------|------|------|
+| `bigserial` (auto-increment) | Internal tables, FK joins | Compact, fast joins | Enumerable, not safe for public IDs |
+| `uuid` (v4 random) | Public-facing resources | Non-guessable, globally unique | Larger (16 bytes), random I/O on B-Tree |
+| `uuid` v7 (time-sorted) | Public + needs ordering | Non-guessable + insert-friendly | Newer, less ecosystem support |
+| `text` slug | URL-friendly resources | Human-readable | Must enforce uniqueness, updates expensive |
+
+**Recommended default:**
+
+```sql
+CREATE TABLE orders (
+    id          bigserial PRIMARY KEY,             -- internal FK target
+    public_id   uuid NOT NULL DEFAULT gen_random_uuid() UNIQUE,  -- API-facing
+    -- ...
+    created_at  timestamptz NOT NULL DEFAULT now(),
+    updated_at  timestamptz NOT NULL DEFAULT now()
+);
+```
+
+### Relationships
+
+```sql
+-- One-to-Many: user → orders
+CREATE TABLE orders (
+    id         bigserial PRIMARY KEY,
+    user_id    bigint NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+    -- ...
+);
+CREATE INDEX idx_orders_user_id ON orders(user_id);
+
+-- Many-to-Many: orders ↔ products (via junction table)
+CREATE TABLE order_items (
+    id         bigserial PRIMARY KEY,
+    order_id   bigint NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
+    product_id bigint NOT NULL REFERENCES products(id) ON DELETE RESTRICT,
+    quantity   int NOT NULL CHECK (quantity > 0),
+    unit_price numeric(10,2) NOT NULL,
+    UNIQUE (order_id, product_id)  -- prevent duplicate line items
+);
+
+-- One-to-One: user → profile
+CREATE TABLE user_profiles (
+    user_id    bigint PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
+    bio        text,
+    avatar_url text,
+    -- ...
+);
+```
+
+### ON DELETE Behavior
+
+| Behavior | When | Example |
+|----------|------|---------|
+| `CASCADE` | Child meaningless without parent | order_items when order deleted |
+| `RESTRICT` | Prevent accidental deletion | products referenced by order_items |
+| `SET NULL` | Preserve child, clear reference | orders.assigned_to when employee leaves |
+| `SET DEFAULT` | Fallback to default value | Rare, for status columns |
+
+---
+
+## 2. Normalization vs Denormalization (CRITICAL)
+
+### Start Normalized (3NF)
+
+**Normal forms in practice:**
+
+| Form | Rule | Example Violation |
+|------|------|-------------------|
+| 1NF | No repeating groups, atomic values | `tags = "go,python,rust"` in one column |
+| 2NF | No partial dependencies (composite keys) | `order_items.product_name` depends on `product_id` alone |
+| 3NF | No transitive dependencies | `orders.customer_city` depends on `customer_id`, not `order_id` |
+
+**1NF violation fix:**
+```sql
+-- ❌ Tags as comma-separated string
+CREATE TABLE posts (id serial, tags text);  -- tags = "go,python"
+
+-- ✅ Separate table (or array/JSONB if simple)
+CREATE TABLE post_tags (
+    post_id bigint REFERENCES posts(id) ON DELETE CASCADE,
+    tag_id  bigint REFERENCES tags(id) ON DELETE CASCADE,
+    PRIMARY KEY (post_id, tag_id)
+);
+
+-- ✅ Alternative: PostgreSQL array (if tags are just strings, no metadata)
+CREATE TABLE posts (id serial, tags text[] NOT NULL DEFAULT '{}');
+CREATE INDEX idx_posts_tags ON posts USING GIN(tags);
+```
+
+### When to Denormalize
+
+**Denormalize ONLY when:**
+1. You have **measured** a performance problem (EXPLAIN ANALYZE, not "I think it's slow")
+2. The denormalized data is **read-heavy** (read:write ratio > 100:1)
+3. You accept the **consistency maintenance cost** (triggers, application logic, or materialized views)
+
+**Safe denormalization patterns:**
+
+```sql
+-- Pattern 1: Materialized view (computed, refreshable)
+CREATE MATERIALIZED VIEW order_summary AS
+SELECT o.id, o.user_id, o.total,
+       COUNT(oi.id) AS item_count,
+       u.email AS user_email
+FROM orders o
+JOIN order_items oi ON oi.order_id = o.id
+JOIN users u ON u.id = o.user_id
+GROUP BY o.id, u.email;
+
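+-- CONCURRENTLY requires a unique index on the materialized view
+CREATE UNIQUE INDEX idx_order_summary_id ON order_summary(id);
+REFRESH MATERIALIZED VIEW CONCURRENTLY order_summary;  -- does not block reads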
+
+-- Pattern 2: Cached aggregate column (application-maintained)
+ALTER TABLE orders ADD COLUMN item_count int NOT NULL DEFAULT 0;
+-- Update via trigger or application code on order_item insert/delete
+
+-- Pattern 3: JSONB snapshot (freeze-at-write-time)
+-- Store a copy of the product details at the time of purchase
+CREATE TABLE order_items (
+    id          bigserial PRIMARY KEY,
+    order_id    bigint NOT NULL REFERENCES orders(id),
+    product_id  bigint REFERENCES products(id),
+    quantity    int NOT NULL,
+    unit_price  numeric(10,2) NOT NULL,      -- frozen price
+    product_snapshot jsonb NOT NULL           -- frozen name, description, image
+);
+```
+
+---
+
+## 3. Indexing Strategy (CRITICAL)
+
+### Index Types (PostgreSQL)
+
+| Type | When | Example |
+|------|------|---------|
+| **B-Tree** (default) | Equality, range, ORDER BY | `WHERE status = 'active'`, `WHERE created_at > '2025-01-01'` |
+| **Hash** | Equality only (rare, B-Tree usually better) | `WHERE id = 123` (large tables, Postgres 10+) |
+| **GIN** | Arrays, JSONB, full-text search | `WHERE tags @> '{go}'`, `WHERE data @> '{"key": "val"}'` |
+| **GiST** | Geometry, ranges, nearest-neighbor | PostGIS, tsrange, ltree |
+| **BRIN** | Very large tables with natural ordering | Time-series data sorted by timestamp |
+
+### Index Decision Rules
+
+```
+Rule 1: Index every column in WHERE clauses
+Rule 2: Index every column used in JOIN ON conditions
+Rule 3: Index every column in ORDER BY (if queried with LIMIT)
+Rule 4: Composite index for multi-column WHERE (leftmost prefix rule)
+Rule 5: Partial index when filtering a subset (e.g., only active records)
+Rule 6: Covering index (INCLUDE) to avoid table lookup
+Rule 7: DON'T index low-cardinality columns alone (e.g., boolean)
+```
+
+### Composite Index: Column Order Matters
+
+```sql
+-- Query: WHERE user_id = ? AND status = ? ORDER BY created_at DESC
+-- ✅ Optimal: matches query pattern left-to-right
+CREATE INDEX idx_orders_user_status_created
+ON orders(user_id, status, created_at DESC);
+
+-- ❌ Wrong order: can't use for this query efficiently
+CREATE INDEX idx_orders_created_user_status
+ON orders(created_at DESC, user_id, status);
+```
+
+**Leftmost prefix rule:** Index on `(A, B, C)` supports queries on `(A)`, `(A, B)`, `(A, B, C)` but NOT `(B)`, `(C)`, or `(B, C)`.
+
+### Partial Index (Index Only What Matters)
+
+```sql
+-- Only 5% of orders are 'pending', but queried frequently
+CREATE INDEX idx_orders_pending
+ON orders(created_at DESC)
+WHERE status = 'pending';
+
+-- Only active users matter for login
+CREATE INDEX idx_users_active_email
+ON users(email)
+WHERE is_active = true;
+```
+
+### Covering Index (Avoid Table Lookup)
+
+```sql
+-- Query only needs status and total, so the table row need not be read
+CREATE INDEX idx_orders_user_covering
+ON orders(user_id) INCLUDE (status, total);
+
+-- Now this query is index-only:
+SELECT status, total FROM orders WHERE user_id = 123;
+```
+
+### When NOT to Index
+
+```
+❌ Columns rarely used in WHERE/JOIN/ORDER BY
+❌ Tables with < 1,000 rows (sequential scan is faster)
+❌ Columns with very low cardinality alone (e.g., boolean is_active)
+❌ Write-heavy tables where index maintenance cost > read benefit
+❌ Duplicate indexes (check pg_stat_user_indexes for unused indexes)
+```
+
+---
+
+## 4. Zero-Downtime Migrations (HIGH)
+
+### The Golden Rule
+
+```
+NEVER make destructive changes in one step.
+Always: ADD → MIGRATE DATA → REMOVE OLD (in separate deploys).
+```
+
+### Safe Migration Patterns
+
+**Rename a column (3 deploys):**
+
+```
+Deploy 1: Add new column
+  ALTER TABLE users ADD COLUMN full_name text;
+  UPDATE users SET full_name = name;           -- backfill (in batches on large tables)
+  -- App writes to BOTH name and full_name
+
+Deploy 2: Switch reads to new column
+  -- App reads from full_name, still writes to both
+
+Deploy 3: Drop old column
+  ALTER TABLE users DROP COLUMN name;
+  -- App only uses full_name
+```
+
+**Add a NOT NULL column (2 deploys):**
+
+```sql
+-- Deploy 1: Add nullable column, set a default for new rows, backfill
+ALTER TABLE orders ADD COLUMN currency text;                -- nullable first
+ALTER TABLE orders ALTER COLUMN currency SET DEFAULT 'USD'; -- new rows get a value
+UPDATE orders SET currency = 'USD' WHERE currency IS NULL;  -- backfill existing rows
+
+-- Deploy 2: Add constraint (after all rows backfilled)
+ALTER TABLE orders ALTER COLUMN currency SET NOT NULL;
+```
+
+**Add an index without locking:**
+
+```sql
+-- ✅ CONCURRENTLY: does not block writes; must run outside a transaction block
+CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
+
+-- ❌ Without CONCURRENTLY: locks table for writes during build
+CREATE INDEX idx_orders_status ON orders(status);
+```
+
+### Migration Safety Checklist
+
+```
+✅ Migration runs in < 30 seconds on production data size
+✅ No exclusive table locks (use CONCURRENTLY for indexes)
+✅ Rollback plan documented and tested
+✅ Backfill runs in batches (not one giant UPDATE)
+✅ New column added as nullable first, constraint added later
+✅ Old column kept until all code references removed
+
+❌ Never rename/drop columns in one deploy
+❌ Never ALTER TYPE on large tables without testing timing
+❌ Never run a data backfill in a single transaction (long-held row locks and table bloat on large tables)
+```
+
+### Batch Backfill Template
+
+```sql
+-- Backfill in batches of 10,000, committing after each batch.
+-- COMMIT inside DO needs PostgreSQL 11+ and must not run inside an outer transaction block.
+DO $$
+DECLARE
+  batch_size int := 10000;
+  affected int;
+BEGIN
+  LOOP
+    UPDATE orders
+    SET currency = 'USD'
+    WHERE id IN (
+      SELECT id FROM orders WHERE currency IS NULL LIMIT batch_size
+    );
+    GET DIAGNOSTICS affected = ROW_COUNT;
+    COMMIT;                 -- end the transaction so each batch releases its locks
+    RAISE NOTICE 'Updated % rows', affected;
+    EXIT WHEN affected = 0;
+    PERFORM pg_sleep(0.1);  -- brief pause to reduce load
+  END LOOP;
+END $$;
+```
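The same commit-per-batch idea can be driven from application code. The following is a minimal sketch, not a production script: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the real database, and the function name `backfill_in_batches` is hypothetical.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 10_000) -> int:
    """Set currency = 'USD' on rows where it is NULL, committing after every batch."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE orders SET currency = 'USD' "
            "WHERE id IN (SELECT id FROM orders WHERE currency IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # end the transaction so each batch releases its locks
        if cur.rowcount == 0:
            break  # nothing left to backfill
        total += cur.rowcount
    return total


# Demo: 25 rows that need a value, processed 10 at a time (SQLite stand-in)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, currency TEXT)")
conn.executemany("INSERT INTO orders (currency) VALUES (?)", [(None,)] * 25)
conn.commit()

updated = backfill_in_batches(conn, batch_size=10)
remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE currency IS NULL"
).fetchone()[0]
```

Because every batch is its own transaction, an interrupted run can simply be restarted: the `WHERE currency IS NULL` filter skips rows that were already updated.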
+
+---
+
+## 5. Multi-Tenant Design (HIGH)
+
+### Three Approaches
+
+| Approach | Isolation | Complexity | When |
+|----------|-----------|------------|------|
+| **Row-level** (shared tables + `tenant_id`) | Low | Low | SaaS MVP, < 1,000 tenants |
+| **Schema-per-tenant** | Medium | Medium | Regulated industries, moderate scale |
+| **Database-per-tenant** | High | High | Enterprise, strict data isolation |
+
+### Row-Level Tenancy (Most Common)
+
+```sql
+-- Every table has tenant_id
+CREATE TABLE orders (
+    id         bigserial PRIMARY KEY,
+    tenant_id  bigint NOT NULL REFERENCES tenants(id),
+    user_id    bigint NOT NULL REFERENCES users(id),
+    total      numeric(10,2) NOT NULL,
+    -- ...
+);
+
+-- Composite index: tenant first (most queries filter by tenant)
+CREATE INDEX idx_orders_tenant_user ON orders(tenant_id, user_id);
+CREATE INDEX idx_orders_tenant_status ON orders(tenant_id, status);
+
+-- Row-Level Security (PostgreSQL)
+ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
+CREATE POLICY tenant_isolation ON orders
+  USING (tenant_id = current_setting('app.tenant_id')::bigint);
+```
+
+**Application-level enforcement:**
+
+```typescript
+// Middleware: set tenant context on every request.
+// Note: a raw header is client-controlled. In production, derive the tenant
+// from the authenticated session or token instead of trusting this header.
+app.use((req, res, next) => {
+  const tenantId = req.headers['x-tenant-id'];
+  if (typeof tenantId !== 'string' || tenantId === '') {
+    return res.status(400).json({ error: 'Missing tenant' });
+  }
+  req.tenantId = tenantId;
+  next();
+});
+
+// Repository: ALWAYS filter by tenant
+async function findOrders(tenantId: string, userId: string) {
+  return db.order.findMany({
+    where: { tenantId, userId },  // ← tenant_id in EVERY query
+  });
+}
+```
+
+### Rules
+
+```
+✅ tenant_id in EVERY table that holds tenant data
+✅ tenant_id as FIRST column in every composite index
+✅ Application middleware enforces tenant context
+✅ Use RLS (PostgreSQL) as defense-in-depth, not sole protection
+✅ Test with 2+ tenants to verify isolation
+
+❌ Never allow cross-tenant queries in application code
+❌ Never skip tenant_id in WHERE clauses (even in admin tools)
+```
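One way to make the "always filter by tenant" rule hard to forget is to bind the tenant when the repository is constructed, so callers never pass it per query. This is a minimal in-memory sketch of that idea; the `Order` dataclass and `TenantScopedOrders` class are hypothetical names, and a real implementation would issue SQL instead of filtering a list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Order:
    id: int
    tenant_id: int
    user_id: int


class TenantScopedOrders:
    """Hypothetical repository bound to exactly one tenant at construction time."""

    def __init__(self, rows: List[Order], tenant_id: int) -> None:
        self._rows = rows
        self._tenant_id = tenant_id

    def find_by_user(self, user_id: int) -> List[Order]:
        # The tenant filter is applied here, so callers cannot omit it
        return [
            row for row in self._rows
            if row.tenant_id == self._tenant_id and row.user_id == user_id
        ]


# Isolation check with two tenants that share a user_id value
rows = [
    Order(id=1, tenant_id=1, user_id=7),
    Order(id=2, tenant_id=2, user_id=7),
    Order(id=3, tenant_id=1, user_id=8),
]
tenant_1 = TenantScopedOrders(rows, tenant_id=1)
tenant_2 = TenantScopedOrders(rows, tenant_id=2)
```

The two-tenant setup mirrors the "test with 2+ tenants" rule: the same `user_id` exists under both tenants, and each repository only sees its own row.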
+
+---
+
+## 6. Common Schema Patterns (MEDIUM)
+
+### Soft Delete
+
+```sql
+ALTER TABLE orders ADD COLUMN deleted_at timestamptz;
+
+-- All queries filter deleted records
+CREATE VIEW active_orders AS
+SELECT * FROM orders WHERE deleted_at IS NULL;
+
+-- Partial index: only index non-deleted rows
+CREATE INDEX idx_orders_active_status
+ON orders(status, created_at DESC)
+WHERE deleted_at IS NULL;
+```
+
+**ORM integration:**
+
+```typescript
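+// Prisma middleware: auto-filter soft-deleted records.
+// Note: $use is deprecated in newer Prisma releases in favor of client extensions ($extends).
+prisma.$use(async (params, next) => {
+  if (params.action === 'findMany' || params.action === 'findFirst') {
+    params.args = params.args ?? {};  // args is undefined when findMany() is called with no arguments
+    params.args.where = { ...params.args.where, deletedAt: null };
+  }
+  return next(params);
+});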
+```
+
+### Audit Trail
+
+```sql
+-- Option A: Audit columns on every table
+ALTER TABLE orders ADD COLUMN created_by bigint REFERENCES users(id);
+ALTER TABLE orders ADD COLUMN updated_by bigint REFERENCES users(id);
+
+-- Option B: Separate audit log table (more detail)
+CREATE TABLE audit_log (
+    id          bigserial PRIMARY KEY,
+    table_name  text NOT NULL,
+    record_id   bigint NOT NULL,
+    action      text NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
+    old_data    jsonb,
+    new_data    jsonb,
+    changed_by  bigint REFERENCES users(id),
+    changed_at  timestamptz NOT NULL DEFAULT now()
+);
+CREATE INDEX idx_audit_table_record ON audit_log(table_name, record_id);
+CREATE INDEX idx_audit_changed_at ON audit_log(changed_at DESC);
+```
+
+### Enum Columns
+
+```sql
+-- Option A: PostgreSQL enum type (strict, but ALTER TYPE is painful)
+CREATE TYPE order_status AS ENUM ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled');
+ALTER TABLE orders ADD COLUMN status order_status NOT NULL DEFAULT 'pending';
+
+-- Option B: Text + CHECK constraint (easier to migrate)
+ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'pending'
+  CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled'));
+
+-- Option C: Lookup table (most flexible, best for UI-driven lists)
+CREATE TABLE order_statuses (
+    id    serial PRIMARY KEY,
+    name  text UNIQUE NOT NULL,
+    label text NOT NULL      -- display name
+);
+```
+
+**Recommendation:** Option B (text + CHECK) for most cases. Option C if statuses are managed by non-developers.
+
+### Polymorphic Associations
+
+```sql
+-- ❌ Anti-pattern: polymorphic FK (no referential integrity)
+CREATE TABLE comments (
+    id             bigserial PRIMARY KEY,
+    commentable_type text,    -- 'Post' or 'Photo'
+    commentable_id   bigint,  -- no FK constraint possible!
+    body           text
+);
+
+-- ✅ Pattern A: Separate FK columns (nullable)
+CREATE TABLE comments (
+    id       bigserial PRIMARY KEY,
+    post_id  bigint REFERENCES posts(id) ON DELETE CASCADE,
+    photo_id bigint REFERENCES photos(id) ON DELETE CASCADE,
+    body     text NOT NULL,
+    CHECK (
+      (post_id IS NOT NULL AND photo_id IS NULL) OR
+      (post_id IS NULL AND photo_id IS NOT NULL)
+    )
+);
+
+-- ✅ Pattern B: Separate tables (cleanest, best for different schemas)
+CREATE TABLE post_comments (..., post_id bigint REFERENCES posts(id));
+CREATE TABLE photo_comments (..., photo_id bigint REFERENCES photos(id));
+```
+
+### JSONB Columns (Semi-Structured Data)
+
+```sql
+-- Good uses: metadata, settings, flexible attributes
+CREATE TABLE products (
+    id         bigserial PRIMARY KEY,
+    name       text NOT NULL,
+    price      numeric(10,2) NOT NULL,
+    attributes jsonb NOT NULL DEFAULT '{}'  -- color, size, weight...
+);
+
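+-- GIN index: supports containment (@>) and key-existence (?) operators
+CREATE INDEX idx_products_attrs ON products USING GIN(attributes);
+
+-- Query
+SELECT * FROM products WHERE attributes @> '{"size": "XL"}';  -- can use the GIN index
+SELECT * FROM products WHERE attributes->>'color' = 'red';    -- cannot use it; needs an expression index on (attributes->>'color')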
+```
+
+```
+✅ Use JSONB for truly flexible/optional data (metadata, settings, preferences)
+✅ Index JSONB columns with GIN when queried
+
+❌ Never use JSONB for data that should be columns (email, status, price)
+❌ Never use JSONB to avoid schema design (it's not MongoDB-in-Postgres)
+```
+
+---
+
+## 7. Table Partitioning (MEDIUM)
+
+### When to Partition
+
+```
+✅ Table > 100M rows AND growing
+✅ Most queries filter on the partition key (date range, tenant)
+✅ Old data can be dropped/archived by partition (efficient DELETE)
+
+❌ Table < 10M rows (overhead not worth it)
+❌ Queries don't filter on partition key (scans all partitions)
+```
+
+### Range Partitioning (Time-Series)
+
+```sql
+CREATE TABLE events (
+    id         bigserial,   -- a PRIMARY KEY or UNIQUE constraint here must include created_at (the partition key)
+    tenant_id  bigint NOT NULL,
+    event_type text NOT NULL,
+    payload    jsonb,
+    created_at timestamptz NOT NULL DEFAULT now()
+) PARTITION BY RANGE (created_at);
+
+-- Monthly partitions
+CREATE TABLE events_2025_01 PARTITION OF events
+  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
+CREATE TABLE events_2025_02 PARTITION OF events
+  FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
+
+-- Automate partition creation with pg_partman or cron
+```
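If partitions are created from a scheduled job rather than `pg_partman`, the DDL for each month can be generated ahead of time. The helper below is a sketch under that assumption; `monthly_partition_ddl` is a hypothetical name, and the table name is interpolated directly, so it must come from trusted code, never from user input.

```python
from datetime import date


def monthly_partition_ddl(table: str, year: int, month: int) -> str:
    """Build the CREATE TABLE statement for one monthly range partition."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{table}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {table}\n"
        f"  FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


ddl = monthly_partition_ddl("events", 2025, 12)
```

Running the job a month or more ahead of the current date keeps inserts from failing because no matching partition exists yet.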
+
+### List Partitioning (Multi-Tenant)
+
+```sql
+CREATE TABLE orders (
+    id        bigserial,
+    tenant_id bigint NOT NULL,
+    total     numeric(10,2)
+) PARTITION BY LIST (tenant_id);
+
+CREATE TABLE orders_tenant_1 PARTITION OF orders FOR VALUES IN (1);
+CREATE TABLE orders_tenant_2 PARTITION OF orders FOR VALUES IN (2);
+```
+
+---
+
+## Anti-Patterns
+
+| # | ❌ Don't | ✅ Do Instead |
+|---|---------|--------------|
+| 1 | Premature denormalization | Start 3NF, denormalize when measured |
+| 2 | Auto-increment IDs as public API identifiers | UUID for public, serial for internal |
+| 3 | No foreign key constraints | FK enforced in database, always |
+| 4 | Nullable by default | NOT NULL by default, nullable when required |
+| 5 | No indexes on FK columns | Index every FK column |
+| 6 | Single-step destructive migration | ADD → MIGRATE → REMOVE in separate deploys |
+| 7 | `CREATE INDEX` without `CONCURRENTLY` | Always `CONCURRENTLY` on live tables |
+| 8 | Polymorphic FK (`commentable_type + commentable_id`) | Separate FK columns or separate tables |
+| 9 | JSONB for everything | JSONB for flexible data only, columns for structured |
+| 10 | No `created_at` / `updated_at` | Timestamp pair on every table |
+| 11 | Comma-separated values in one column | Separate table or PostgreSQL array |
+| 12 | `text` without length validation | CHECK constraint or application validation |
+
+---
+
+## Common Issues
+
+### Issue 1: "Query is slow but I already have an index"
+
+**Symptom:** `EXPLAIN ANALYZE` shows Sequential Scan despite existing index.
+
+**Causes:**
+1. **Wrong index column order** — composite index `(A, B)` won't help `WHERE B = ?`
+2. **Low selectivity** — index on boolean column (50% of rows match), planner prefers seq scan
+3. **Stale statistics** — run `ANALYZE table_name;`
+4. **Type mismatch** — comparing `varchar` column with `integer` parameter → no index use
+
+**Fix:** Check `EXPLAIN (ANALYZE, BUFFERS)`, verify index matches query pattern, run `ANALYZE`.
+
+### Issue 2: "Migration locks the table for minutes"
+
+**Symptom:** `ALTER TABLE` blocks all writes during execution.
+
+**Cause:** Adding NOT NULL constraint, changing column type, or creating index without `CONCURRENTLY`.
+
+**Fix:**
+```sql
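+-- Add index without blocking writes
+CREATE INDEX CONCURRENTLY idx_name ON table_name(col);
+
+-- Add NOT NULL without a long write-blocking lock (PostgreSQL 12+)
+ALTER TABLE t ADD CONSTRAINT t_col_nn CHECK (col IS NOT NULL) NOT VALID;
+ALTER TABLE t VALIDATE CONSTRAINT t_col_nn;   -- scans the table without blocking writes
+ALTER TABLE t ALTER COLUMN col SET NOT NULL;  -- reuses the validated CHECK, so no full scan
+ALTER TABLE t DROP CONSTRAINT t_col_nn;       -- the CHECK is now redundant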
+```
+
+### Issue 3: "How many indexes is too many?"
+
+**Rule of thumb:**
+- Read-heavy table (reports, product catalog): 5-10 indexes is fine
+- Write-heavy table (events, logs): 2-3 indexes max
+- Monitor with `pg_stat_user_indexes`: indexes with `idx_scan = 0` are candidates to drop (keep any that back a unique or primary-key constraint)
+
+```sql
+-- Find unused indexes
+SELECT schemaname, relname, indexrelname, idx_scan
+FROM pg_stat_user_indexes
+WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%'
+ORDER BY pg_relation_size(indexrelid) DESC;
+```
--- a/fullstack-dev/references/django-best-practices.md
+++ b/fullstack-dev/references/django-best-practices.md
@@ -0,0 +1,466 @@
+# Django Best Practices
+
+Production-grade guide for Django 5.x and Django REST Framework. 40+ rules across 8 categories.
+
+## Core Principles (7 Rules)
+
+```
+1. ✅ Custom User model BEFORE first migration (very hard to change later)
+2. ✅ One Django app per domain concept (users, orders, payments)
+3. ✅ Fat models, thin views — business logic in models/managers, not views
+4. ✅ Always use select_related/prefetch_related (prevent N+1)
+5. ✅ Settings split by environment (base + dev + prod)
+6. ✅ Test with pytest-django + factory_boy (not fixtures)
+7. ✅ Never use runserver in production (Gunicorn + Nginx)
+```
+
+---
+
+## 1. Project Structure (CRITICAL)
+
+### App-Per-Domain
+
+```
+myproject/
+├── config/                     # Project config
+│   ├── __init__.py
+│   ├── settings/
+│   │   ├── base.py             # Shared settings
+│   │   ├── dev.py              # DEBUG=True, SQLite ok
+│   │   └── prod.py             # DEBUG=False, Postgres, HTTPS
+│   ├── urls.py
+│   ├── wsgi.py
+│   └── asgi.py
+├── apps/
+│   ├── users/                  # Custom User model
+│   │   ├── models.py
+│   │   ├── serializers.py
+│   │   ├── views.py
+│   │   ├── urls.py
+│   │   ├── admin.py
+│   │   ├── services.py         # Business logic
+│   │   ├── selectors.py        # Complex queries
+│   │   └── tests/
+│   │       ├── test_models.py
+│   │       ├── test_views.py
+│   │       └── factories.py
+│   ├── orders/
+│   └── payments/
+├── manage.py
+├── requirements/
+│   ├── base.txt
+│   ├── dev.txt
+│   └── prod.txt
+└── docker-compose.yml
+```
+
+### Rules
+
+```
+✅ One app = one bounded context (users, orders, payments)
+✅ Business logic in services.py / selectors.py, not views
+✅ Each app has its own urls.py, admin.py, tests/
+
+❌ Never put everything in one app
+❌ Never import across app boundaries at the model level (use IDs)
+❌ Never put business logic in views or serializers
+```
+
+---
+
+## 2. Models & Migrations (CRITICAL)
+
+### Custom User Model (Day 1!)
+
+```python
+# apps/users/models.py
+from django.contrib.auth.models import AbstractUser
+from django.db import models
+import uuid
+
+class User(AbstractUser):
+    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
+    email = models.EmailField(unique=True)
+
+    USERNAME_FIELD = 'email'
+    REQUIRED_FIELDS = ['username']
+
+    class Meta:
+        db_table = 'users'
+
+# config/settings/base.py
+AUTH_USER_MODEL = 'users.User'
+```
+
+**Do this before the first `migrate`. Changing `AUTH_USER_MODEL` after tables exist is possible but requires a difficult manual migration.**
+
+### Model Best Practices
+
+```python
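+import uuid
+
+from django.conf import settings
+from django.db import models
+
+# Assumed definition: the snippet below references OrderStatus without showing it
+class OrderStatus(models.TextChoices):
+    PENDING = 'pending', 'Pending'
+    CONFIRMED = 'confirmed', 'Confirmed'
+    SHIPPED = 'shipped', 'Shipped'
+    DELIVERED = 'delivered', 'Delivered'
+    CANCELLED = 'cancelled', 'Cancelled'
+
+class TimeStampedModel(models.Model):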
+    created_at = models.DateTimeField(auto_now_add=True)
+    updated_at = models.DateTimeField(auto_now=True)
+    class Meta:
+        abstract = True
+
+class Order(TimeStampedModel):
+    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
+    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, related_name='orders')
+    status = models.CharField(max_length=20, choices=OrderStatus.choices, default=OrderStatus.PENDING, db_index=True)
+    total = models.DecimalField(max_digits=10, decimal_places=2)
+
+    class Meta:
+        db_table = 'orders'
+        ordering = ['-created_at']
+        indexes = [
+            models.Index(fields=['user', 'status']),
+        ]
+
+    def can_cancel(self) -> bool:
+        return self.status in [OrderStatus.PENDING, OrderStatus.CONFIRMED]
+
+    def cancel(self):
+        if not self.can_cancel():
+            raise ValueError(f"Cannot cancel order in {self.status} status")
+        self.status = OrderStatus.CANCELLED
+        self.save(update_fields=['status', 'updated_at'])
+```
+
+### Migration Rules
+
+```
+✅ Review migration SQL: python manage.py sqlmigrate app_name 0001
+✅ Name migrations descriptively: --name add_status_index_to_orders
+✅ Separate data migrations from schema migrations
+✅ Non-destructive first: add column → backfill → remove old column
+
+❌ Never edit or delete applied migrations
+❌ Never use RunPython without reverse function
+```
+
+---
+
+## 3. Views & Serializers — DRF (HIGH)
+
+### Service Layer Pattern
+
+```python
+# apps/orders/services.py
+from django.db import transaction
+
+from .models import Order, OrderItem  # assumes both models live in apps/orders/models.py
+
+class OrderService:
+    @staticmethod
+    @transaction.atomic
+    def create_order(user, items_data: list[dict]) -> Order:
+        total = sum(item['price'] * item['quantity'] for item in items_data)
+        order = Order.objects.create(user=user, total=total)
+        OrderItem.objects.bulk_create([
+            OrderItem(order=order, **item) for item in items_data
+        ])
+        return order
+
+    @staticmethod
+    @transaction.atomic  # select_for_update() must run inside a transaction
+    def cancel_order(order_id: str, user) -> Order:
+        order = Order.objects.select_for_update().get(id=order_id, user=user)
+        order.cancel()
+        return order
+```
+
+### Serializers
+
+```python
+from rest_framework import serializers
+
+class OrderSerializer(serializers.ModelSerializer):
+    items = OrderItemSerializer(many=True, read_only=True)
+    class Meta:
+        model = Order
+        fields = ['id', 'status', 'total', 'items', 'created_at']
+        read_only_fields = ['id', 'total', 'created_at']
+
+class CreateOrderSerializer(serializers.Serializer):
+    """Input-only serializer — separate from output."""
+    items = serializers.ListField(
+        child=serializers.DictField(), min_length=1, max_length=50,
+    )
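+    def validate_items(self, items):
+        for item in items:
+            quantity = item.get('quantity')
+            # DictField values are not type-checked, so reject non-integers explicitly
+            if not isinstance(quantity, int) or quantity < 1:
+                raise serializers.ValidationError("Quantity must be at least 1")
+        return items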
+```
+
+### Views (Thin!)
+
+```python
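+from rest_framework import status
+from rest_framework.decorators import api_view, permission_classes
+from rest_framework.permissions import IsAuthenticated
+from rest_framework.response import Response
+
+@api_view(['POST'])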
+@permission_classes([IsAuthenticated])
+def create_order(request):
+    serializer = CreateOrderSerializer(data=request.data)
+    serializer.is_valid(raise_exception=True)
+    order = OrderService.create_order(request.user, serializer.validated_data['items'])
+    return Response({'data': OrderSerializer(order).data}, status=status.HTTP_201_CREATED)
+```
+
+### Rules
+
+```
+✅ Separate input serializers from output serializers
+✅ Views only: validate → call service → serialize → respond
+✅ Use @transaction.atomic for multi-model writes
+
+❌ Never put business logic in views or serializers
+❌ Never use ModelSerializer for write operations (too implicit)
+```
+
+---
+
+## 4. Authentication (HIGH)
+
+| Method | When | Frontend |
+|--------|------|----------|
+| Session | Same-domain, SSR, Django templates | Django templates / htmx |
+| JWT | Different domain, SPA, mobile | React, Vue, mobile apps |
+| OAuth2 | Third-party login, API consumers | Any |
+
+### JWT Config (djangorestframework-simplejwt)
+
+```python
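+from datetime import timedelta
+
+SIMPLE_JWT = {
+    'ACCESS_TOKEN_LIFETIME': timedelta(minutes=15),
+    'REFRESH_TOKEN_LIFETIME': timedelta(days=7),
+    'ROTATE_REFRESH_TOKENS': True,
+    # Requires 'rest_framework_simplejwt.token_blacklist' in INSTALLED_APPS
+    'BLACKLIST_AFTER_ROTATION': True,
+}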
+```
+
+---
+
+## 5. Performance Optimization (HIGH)
+
+### N+1 Query Prevention
+
+```python
+# ❌ N+1: 1 query for orders + N queries for users
+orders = Order.objects.all()
+for o in orders:
+    print(o.user.email)     # hits DB each iteration
+
+# ✅ select_related (FK/OneToOne — JOIN)
+orders = Order.objects.select_related('user').all()
+
+# ✅ prefetch_related (ManyToMany/reverse FK — 2 queries)
+orders = Order.objects.prefetch_related('items').all()
+
+# ✅ Combined
+orders = Order.objects.select_related('user').prefetch_related('items').all()
+```
+
+### Query Optimization Toolkit
+
+```python
+# Only fetch needed columns
+User.objects.values('id', 'email')
+User.objects.values_list('email', flat=True)
+
+# Annotate instead of Python loops
+from django.db.models import Count, Sum
+Order.objects.annotate(item_count=Count('items'), revenue=Sum('items__price'))
+
+# Bulk operations
+OrderItem.objects.bulk_create([...])
+Order.objects.filter(status='pending').update(status='cancelled')
+
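+# Database indexes
+from django.db.models import Q
+
+class Meta:
+    indexes = [
+        models.Index(fields=['user', 'status']),
+        models.Index(fields=['-created_at']),
+        # Partial (conditional) indexes require an explicit name
+        models.Index(fields=['email'], condition=Q(is_active=True), name='idx_active_email'),
+    ]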
+
+# Pagination
+from rest_framework.pagination import CursorPagination
+class OrderPagination(CursorPagination):
+    page_size = 20
+    ordering = '-created_at'
+```
+
+### Caching
+
+```python
+from django.core.cache import cache
+
+def get_product(product_id: str):
+    cache_key = f'product:{product_id}'
+    product = cache.get(cache_key)
+    if product is None:
+        product = Product.objects.get(id=product_id)
+        cache.set(cache_key, product, timeout=300)
+    return product
+```
+
+---
+
+## 6. Testing (MEDIUM-HIGH)
+
+### pytest-django + factory_boy
+
+```python
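+# conftest.py
+import pytest
+from rest_framework.test import APIClient
+
+@pytest.fixture
+def api_client():
+    return APIClient()
+
+@pytest.fixture
+def authenticated_client(api_client, user_factory):
+    # user_factory is assumed to be a fixture, e.g. registered from UserFactory via pytest-factoryboy
+    user = user_factory()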
+    api_client.force_authenticate(user=user)
+    return api_client
+```
+
+```python
+# factories.py
+import factory
+from django.contrib.auth import get_user_model
+
+User = get_user_model()
+
+class UserFactory(factory.django.DjangoModelFactory):
+    class Meta:
+        model = User
+    email = factory.Sequence(lambda n: f'user{n}@example.com')
+    username = factory.Sequence(lambda n: f'user{n}')
+
+class OrderFactory(factory.django.DjangoModelFactory):
+    class Meta:
+        model = 'orders.Order'
+    user = factory.SubFactory(UserFactory)
+    total = factory.Faker('pydecimal', left_digits=3, right_digits=2, positive=True)
+```
+
+```python
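+# test_views.py
+import pytest
+
+from .factories import OrderFactory, UserFactory
+
+@pytest.mark.django_db
+class TestListOrders:
+    def test_returns_user_orders(self, api_client):
+        # Authenticate explicitly instead of reading the client's private _force_user attribute
+        user = UserFactory()
+        api_client.force_authenticate(user=user)
+        OrderFactory.create_batch(3, user=user)
+        response = api_client.get('/api/orders/')
+        assert response.status_code == 200
+        assert len(response.data['data']) == 3
+
+    def test_requires_authentication(self, api_client):
+        response = api_client.get('/api/orders/')
+        # 401 assumes token/JWT authentication; DRF's default SessionAuthentication returns 403
+        assert response.status_code == 401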
+```
+
+---
+
+## 7. Admin Customization (MEDIUM)
+
+```python
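+from django.contrib import admin
+
+from .models import Order, OrderItem  # assumes both models live in this app
+
+class OrderItemInline(admin.TabularInline):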
+    model = OrderItem
+    extra = 0
+    readonly_fields = ['price']
+
+@admin.register(Order)
+class OrderAdmin(admin.ModelAdmin):
+    list_display = ['id', 'user', 'status', 'total', 'created_at']
+    list_filter = ['status', 'created_at']
+    search_fields = ['user__email', 'id']
+    readonly_fields = ['id', 'created_at', 'updated_at']
+    inlines = [OrderItemInline]
+    date_hierarchy = 'created_at'
+
+    def get_queryset(self, request):
+        return super().get_queryset(request).select_related('user')
+```
+
+---
+
+## 8. Production Deployment (MEDIUM)
+
+### Security Settings
+
+```python
+# settings/prod.py
+DEBUG = False
+ALLOWED_HOSTS = ['example.com', 'www.example.com']
+CSRF_TRUSTED_ORIGINS = ['https://example.com']
+SECURE_SSL_REDIRECT = True
+SESSION_COOKIE_SECURE = True
+CSRF_COOKIE_SECURE = True
+SECURE_HSTS_SECONDS = 31536000
+```
+
+### Deployment Stack
+
+```
+Nginx → Gunicorn → Django
+         ↕
+      PostgreSQL + Redis (cache)
+         ↕
+      Celery (background tasks)
+```
+
+```bash
+gunicorn config.wsgi:application \
+  --bind 0.0.0.0:8000 \
+  --workers 4 \
+  --timeout 120 \
+  --access-logfile -
+```
+
+### WhiteNoise for Static Files
+
+```python
+MIDDLEWARE = [
+    'django.middleware.security.SecurityMiddleware',
+    'whitenoise.middleware.WhiteNoiseMiddleware',  # right after Security
+    ...
+]
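+# Django 4.2+: use STORAGES (the STATICFILES_STORAGE setting was removed in Django 5.1)
+STORAGES = {
+    'default': {'BACKEND': 'django.core.files.storage.FileSystemStorage'},
+    'staticfiles': {'BACKEND': 'whitenoise.storage.CompressedManifestStaticFilesStorage'},
+}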
+```
+
+### Rules
+
+```
+✅ Gunicorn + Nginx (or Cloud Run / Railway)
+✅ PostgreSQL (not SQLite)
+✅ python manage.py check --deploy
+✅ Sentry for error tracking
+
+❌ Never use runserver in production
+❌ Never use DEBUG=True in production
+❌ Never use SQLite in production
+```
+
+---
+
+## Anti-Patterns
+
+| # | ❌ Don't | ✅ Do Instead |
+|---|---------|--------------|
+| 1 | Business logic in views | Service layer (`services.py`) |
+| 2 | One giant app | App-per-domain |
+| 3 | Default User model | Custom User before first migrate |
+| 4 | No `select_related` | Always eager-load related objects |
+| 5 | Django fixtures for tests | `factory_boy` factories |
+| 6 | `settings.py` single file | Split: base + dev + prod |
+| 7 | `runserver` in production | Gunicorn + Nginx |
+| 8 | SQLite in production | PostgreSQL |
+| 9 | `ModelSerializer` for writes | Explicit input serializer |
+| 10 | Raw SQL in views | ORM querysets + `selectors.py` |
+
+---
+
+## Common Issues
+
+### Issue 1: "Can't change User model after first migration"
+
+**Fix:** If starting fresh: delete all migrations + DB, set custom User, re-migrate. If data exists: plan a staged, manual migration (see the Django documentation on changing to a custom user model mid-project) and rehearse it on a copy of production data first.
+
+### Issue 2: "Serializer is too slow on large querysets"
+
+**Fix:** Missing `select_related` / `prefetch_related` → N+1 queries.
+```python
+queryset = Order.objects.select_related('user').prefetch_related('items')
+```
+
+### Issue 3: "Circular import between apps"
+
+**Fix:** Use string references: `models.ForeignKey('orders.Order', ...)` instead of importing the model class. For services, import inside the function.
--- a/fullstack-dev/references/environment-management.md
+++ b/fullstack-dev/references/environment-management.md
@@ -0,0 +1,78 @@
+# Environment & CORS Management
+
+Patterns for managing environment variables, API URLs, and CORS configuration across frontend and backend stacks.
+
+---
+
+## Standard Environment Pattern
+
+```
+# .env.local (gitignored, for local dev)
+NEXT_PUBLIC_API_URL=http://localhost:3001
+NEXT_PUBLIC_WS_URL=ws://localhost:3001
+
+# Staging (set in Vercel/CI)
+NEXT_PUBLIC_API_URL=https://api-staging.example.com
+
+# Production (set in Vercel/CI)
+NEXT_PUBLIC_API_URL=https://api.example.com
+```
+
+---
+
+## Environment Variable Rules
+
+```
+✅ API base URL from environment variable — NEVER hardcoded
+✅ Prefix client-side vars with NEXT_PUBLIC_ (Next.js) or VITE_ (Vite)
+✅ Backend URL = server-only env var (for SSR calls, not exposed to browser)
+✅ CORS on backend: explicit list of allowed origins per environment
+
+❌ Never use localhost URLs in production builds
+❌ Never expose backend-only secrets with NEXT_PUBLIC_ prefix
+❌ Never commit .env.local (commit .env.example with placeholders)
+```
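These rules are easier to enforce when the backend validates its environment once at startup and refuses to boot on a bad configuration. The following is a minimal sketch of such a check; the variable names `API_URL`, `ALLOWED_ORIGINS`, and `APP_ENV` are assumptions for illustration, not names taken from this guide.

```python
from typing import Dict, Mapping


def load_config(env: Mapping[str, str]) -> Dict[str, str]:
    """Fail fast on missing or obviously wrong environment variables."""
    required = ["API_URL", "ALLOWED_ORIGINS"]  # hypothetical variable names
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Guard against a localhost URL leaking into a production deployment
    if env.get("APP_ENV") == "production" and "localhost" in env["API_URL"]:
        raise RuntimeError("API_URL points at localhost in production")
    return {key: env[key] for key in required}


config = load_config({
    "APP_ENV": "production",
    "API_URL": "https://api.example.com",
    "ALLOWED_ORIGINS": "https://example.com",
})
```

In a real service the argument would be `os.environ`; passing a plain mapping keeps the check easy to unit test.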
+
+---
+
+## CORS Configuration
+
+```typescript
+// Backend: environment-aware CORS
+const ALLOWED_ORIGINS: Record<string, string[]> = {
+  development: ['http://localhost:3000', 'http://localhost:5173'],
+  staging: ['https://staging.example.com'],
+  production: ['https://example.com', 'https://www.example.com'],
+};
+
+// Fall back to an empty allowlist if NODE_ENV holds an unexpected value (e.g. 'test')
+const env = process.env.NODE_ENV ?? 'development';
+
+app.use(cors({
+  origin: ALLOWED_ORIGINS[env] ?? [],
+  credentials: true,  // needed for cookies (auth)
+  methods: ['GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
+}));
+```
+
+---
+
+## Common Issues
+
+### Issue 1: "CORS error in browser but works in Postman"
+
+**Cause:** CORS is a browser security feature. Postman/curl skip it.
+
+**Fix:**
+1. Backend must return `Access-Control-Allow-Origin: https://your-frontend.com`
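+2. For cookies/auth: `credentials: true` on the backend and `credentials: 'include'` (fetch) or `withCredentials: true` (XHR/axios) on the frontend; the allowed origin cannot be `*` in this case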
+3. Check that preflight `OPTIONS` request returns correct headers
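The header logic behind the steps above can be sketched as a small, framework-independent function: echo the request's `Origin` back only when it is on the allowlist, and otherwise send no CORS headers at all. The function name `cors_headers` is hypothetical; real projects would normally rely on their framework's CORS middleware.

```python
from typing import Dict, Optional, Set


def cors_headers(origin: Optional[str], allowed: Set[str]) -> Dict[str, str]:
    """Return the CORS response headers for a request from the given Origin."""
    if origin is None or origin not in allowed:
        return {}  # no CORS headers: the browser blocks the cross-origin read
    return {
        # Echo the specific origin; "*" is not permitted together with credentials
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response differs per Origin
        "Vary": "Origin",
    }


allowed_origins = {"https://example.com", "https://www.example.com"}
ok_headers = cors_headers("https://example.com", allowed_origins)
blocked_headers = cors_headers("https://evil.example", allowed_origins)
```

This also shows why Postman and curl are unaffected: the server still answers the request, and only a browser enforces the missing header.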
+
+### Issue 2: "Environment variable undefined in browser"
+
+**Cause:** Missing `NEXT_PUBLIC_` or `VITE_` prefix for client-side access.
+
+**Fix:** Client-side vars MUST have the framework prefix. Rebuild after adding new env vars (they are embedded at build time).
+
+### Issue 3: "Works locally, fails in staging"
+
+**Cause:** Different origins, missing CORS config for staging domain.
+
+**Fix:** Add staging origin to `ALLOWED_ORIGINS`, verify env vars are set in deployment platform.
--- a/fullstack-dev/references/release-checklist.md
+++ b/fullstack-dev/references/release-checklist.md
@@ -0,0 +1,278 @@
+# Release & Acceptance Checklist
+
+6-gate release checklist for backend and full-stack applications. Prevents "it works on my machine" and "we forgot to check X" failures.
+
+**Iron Law: NO RELEASE WITHOUT ALL GATES PASSING.**
+
+---
+
+## Release Gates Overview
+
+```
+Feature Complete
+    ↓
+Gate 1: Functional Acceptance        → Does it do what it should?
+    ↓
+Gate 2: Non-Functional Acceptance    → Is it fast, reliable, observable?
+    ↓
+Gate 3: Security Review              → Is it safe?
+    ↓
+Gate 4: Deployment Readiness         → Can we deploy and rollback safely?
+    ↓
+Gate 5: Release Execution            → Deploy with canary + monitoring
+    ↓
+Gate 6: Post-Release Validation      → Did it actually work in production?
+```
+
+---
+
+## Gate 1: Functional Acceptance
+
+**Question: Does it do what the requirements say?**
+
+- [ ] All acceptance criteria from ticket/PRD have passing tests
+- [ ] Happy path works end-to-end
+- [ ] Edge cases tested (empty inputs, max lengths, Unicode)
+- [ ] Error cases tested (invalid input, not found, timeout)
+- [ ] Data integrity verified (CRUD cycle produces correct state)
+- [ ] Backward compatibility confirmed (existing clients not broken)
+- [ ] API contract matches OpenAPI spec
+- [ ] Idempotency verified (retries don't create duplicates)
+
+### Evidence Template
+
+| Requirement | Test | Status | Notes |
+|-------------|------|--------|-------|
+| User can create order | `orders.api.test:creates order` | ✅ PASS | |
+| Empty cart → error | `orders.api.test:rejects empty` | ✅ PASS | |
+| Payment failure handled | `payments.test:handles decline` | ✅ PASS | |
+
+---
+
+## Gate 2: Non-Functional Acceptance
+
+**Question: Is it fast, reliable, and observable?**
+
+### Performance
+
+- [ ] Response time within budget (p95 < ___ms) — measured, not assumed
+- [ ] No N+1 queries (checked with query logging)
+- [ ] New queries use indexes (`EXPLAIN ANALYZE`)
+- [ ] Pagination works on large datasets
+- [ ] Caching effective (hit rate > 80%)
+- [ ] Connection pool healthy under load
+
+### Reliability
+
+- [ ] Graceful degradation when dependencies fail (circuit breaker)
+- [ ] Retry logic works for transient failures
+- [ ] All external calls have timeouts
+- [ ] Rate limiting returns 429 correctly
+- [ ] Health check endpoints verified (`/health`, `/ready`)
+
+### Observability
+
+- [ ] Structured logging with request ID (not `console.log`)
+- [ ] Metrics exposed (request count, latency, error rate)
+- [ ] Alerts configured (error spike, latency spike)
+- [ ] Request tracing works end-to-end
+- [ ] Dashboard updated for new feature
+
+### Evidence
+
+| Metric | Target | Actual | Status |
+|--------|--------|--------|--------|
+| p95 response | < 500ms | ___ms | ✅/❌ |
+| p99 response | < 1000ms | ___ms | ✅/❌ |
+| Error rate (load) | < 0.1% | ___% | ✅/❌ |
+| Throughput | > ___ RPS | ___ RPS | ✅/❌ |
+
+---
+
+## Gate 3: Security Review
+
+**Question: Does this introduce vulnerabilities?**
+
+### Input & Output
+
+- [ ] All input validated server-side (never trust client)
+- [ ] SQL injection prevented (parameterized queries only)
+- [ ] XSS prevented (output encoding)
+- [ ] File upload validated (type, size, name sanitized)
+- [ ] Rate limiting on sensitive endpoints (login, reset, APIs)
+
+### Auth & Data
+
+- [ ] Protected endpoints require valid credentials
+- [ ] Users can only access their own resources
+- [ ] Admin routes require admin role
+- [ ] Tokens expire (short-lived access + refresh)
+- [ ] Passwords hashed (bcrypt/argon2, not MD5/SHA)
+- [ ] Sensitive data not logged (passwords, tokens, PII)
+- [ ] Secrets in env vars (not hardcoded)
+- [ ] Error messages don't leak internals
+
+### Dependencies
+
+- [ ] No known vulnerabilities (`npm audit` / `pip-audit` / `govulncheck`)
+- [ ] Dependencies pinned in lockfile
+- [ ] Unused dependencies removed
+
+---
+
+## Gate 4: Deployment Readiness
+
+**Question: Can we deploy safely and roll back if needed?**
+
+### Code
+
+- [ ] All tests pass in CI (not "it passed locally")
+- [ ] Linter clean, build succeeds
+- [ ] Code reviewed and approved
+- [ ] No unresolved TODO/FIXME/HACK
+
+### Database
+
+- [ ] Migration tested on staging with production-like data
+- [ ] Down migration works (tested!)
+- [ ] Migration is non-destructive (additive only)
+- [ ] Migration timing estimated on production data size
+- [ ] Backfill plan documented (if needed)
+
+### Configuration
+
+- [ ] New env vars documented in `.env.example`
+- [ ] Env vars set in staging and verified
+- [ ] Env vars set in production
+- [ ] Feature flags configured (if applicable)
+
+### Rollback Plan Template
+
+```markdown
+## Rollback Plan: [Feature]
+
+### When to rollback
+- Error rate > 1% sustained 5 minutes
+- p99 latency > 3000ms sustained 10 minutes
+- Critical business function broken
+
+### Steps
+1. Revert deploy: [command]
+2. Rollback migration (if applied): [command]
+3. Invalidate cache: [command]
+4. Notify team: #incidents channel
+5. Verify rollback: [verification steps]
+
+### Estimated time: [X minutes]
+### Data recovery: [procedure if data was modified]
+```
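+
+The "sustained N minutes" triggers in the template can be checked mechanically. The sketch below assumes one metric sample per minute and a hypothetical helper name, `sustained_breach`; a real alerting system would normally express this as an alert rule instead.
+
+```python
+def sustained_breach(samples: list[float], threshold: float, window: int) -> bool:
+    """Return True when the most recent `window` samples all exceed `threshold`."""
+    recent = samples[-window:]
+    # Require a full window so a short history never triggers a rollback.
+    return len(recent) == window and all(value > threshold for value in recent)
+
+
+# Error rate in percent, one sample per minute (illustrative numbers only).
+breaching = sustained_breach([0.2, 1.2, 1.3, 1.5, 1.1, 1.4], threshold=1.0, window=5)
+recovered = sustained_breach([1.2, 1.3, 0.4, 1.5, 1.1], threshold=1.0, window=5)
+```
+
+Requiring every sample in the window to breach avoids rolling back on a single spike, at the cost of reacting one window later.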
+
+---
+
+## Gate 5: Release Execution
+
+### Deployment Sequence
+
+```
+1. 📢 ANNOUNCE in release channel
+
+2. 🗄️ DATABASE — Apply migration
+   - Run migration
+   - Verify completion
+   - Check data integrity
+
+3. 🚀 DEPLOY — Roll out code
+   - Canary first (10% traffic)
+   - Monitor 5 minutes
+   - If OK → 50% → monitor → 100%
+   - If NOT OK → STOP immediately
+
+4. 🔍 SMOKE TEST
+   - Health check → 200
+   - Login works
+   - Core operation works
+   - No error spikes
+
+5. ✅ ANNOUNCE "Release complete. Monitoring 30 min."
+```
+
+### Canary Decision Table
+
+| Metric | Baseline | Canary OK | STOP | ROLLBACK |
+|--------|----------|-----------|------|----------|
+| Error rate | 0.05% | < 0.1% | ≥ 0.5% | > 1% |
+| p95 latency | 300ms | < 500ms | ≥ 700ms | > 1000ms |
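+
+One way to apply the table during a rollout is a small decision function. This is a sketch under two assumptions that the table does not state: the STOP column is read as "at or above", and values between "Canary OK" and STOP map to a hypothetical `HOLD` outcome (keep the current traffic share and keep monitoring).
+
+```python
+def canary_decision(error_rate_pct: float, p95_ms: float) -> str:
+    """Map canary metrics to an action using the thresholds from the table."""
+    if error_rate_pct > 1.0 or p95_ms > 1000:
+        return "ROLLBACK"
+    if error_rate_pct >= 0.5 or p95_ms >= 700:
+        return "STOP"
+    if error_rate_pct < 0.1 and p95_ms < 500:
+        return "PROCEED"
+    # Assumption: the grey zone between OK and STOP is not defined in the table.
+    return "HOLD"
+
+
+healthy = canary_decision(error_rate_pct=0.03, p95_ms=310)
+degraded = canary_decision(error_rate_pct=0.6, p95_ms=310)
+```
+
+The worst metric decides the outcome, so a latency regression triggers a rollback even when the error rate looks healthy.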
+
+---
+
+## Gate 6: Post-Release Validation
+
+### Immediate (0-30 min)
+
+- [ ] Health checks green on all instances
+- [ ] Error rate within normal range
+- [ ] Latency normal (p95, p99)
+- [ ] Core user journey manually tested
+- [ ] Logs clean — no unexpected errors
+- [ ] Alerts silent
+
+### Short-term (1-24 hours)
+
+- [ ] No customer complaints
+- [ ] Business metrics stable (conversion, revenue, signups)
+- [ ] Memory/CPU stable (no creeping usage)
+- [ ] Queue backlogs clear
+- [ ] Database performance stable
+
+### Post-Release Report Template
+
+```markdown
+## Release Report: [Feature]
+- Deployed: [timestamp] by @[engineer]
+- Duration: [minutes]
+
+| Check | Status | Notes |
+|-------|--------|-------|
+| Health checks | ✅ | All healthy |
+| Error rate | ✅ | 0.03% (baseline: 0.05%) |
+| p95 latency | ✅ | 310ms (baseline: 300ms) |
+| Core flow | ✅ | Order creation verified |
+
+Issues found: None / [details]
+Rollback used: No / Yes: [reason]
+```
+
+---
+
+## Release Readiness Score
+
+Score each gate **0-2** (0 = not checked, 1 = partially verified, 2 = fully verified with evidence):
+
+| Gate | Score |
+|------|-------|
+| 1. Functional Acceptance | /2 |
+| 2. Non-Functional Acceptance | /2 |
+| 3. Security Review | /2 |
+| 4. Deployment Readiness | /2 |
+| 5. Release Execution Plan | /2 |
+| 6. Post-Release Validation Plan | /2 |
+| **Total** | **/12** |
+
+**Decision:**
+- **12/12** → Ship it ✅
+- **10-11** → Ship with documented exceptions + owner assigned
+- **< 10** → Do NOT release. Fix gaps first.
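+
+The scoring rule is simple enough to encode, for example in a release script. The gate keys and the `release_decision` name below are hypothetical; the thresholds come from the decision list above.
+
+```python
+GATES = (
+    "functional", "non_functional", "security",
+    "deployment_readiness", "release_execution", "post_release",
+)
+
+
+def release_decision(scores: dict[str, int]) -> str:
+    """Turn six gate scores (0-2 each) into the release decision."""
+    missing = [gate for gate in GATES if gate not in scores]
+    if missing:
+        raise ValueError(f"unscored gates: {missing}")
+    if any(scores[gate] not in (0, 1, 2) for gate in GATES):
+        raise ValueError("each gate must be scored 0, 1, or 2")
+    total = sum(scores[gate] for gate in GATES)
+    if total == 12:
+        return "ship"
+    if total >= 10:
+        return "ship with documented exceptions"
+    return "do not release"
+
+
+all_verified = dict.fromkeys(GATES, 2)
+one_partial = {**all_verified, "security": 1}  # total 11
+```
+
+Rejecting unscored gates keeps a forgotten gate from silently counting as zero or, worse, being left out of the total.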
+
+---
+
+## Common Rationalizations
+
+| ❌ Excuse | ✅ Reality |
+|----------|-----------|
+| "It's a small change" | Small changes cause outages every day |
+| "We tested locally" | Local ≠ production |
+| "We'll fix it if it breaks" | You'll fix it at 3 AM. Prevent now. |
+| "Deadline is today" | Broken code costs more than late code |
+| "CI passed" | CI doesn't check everything. Run the checklist. |
+| "We can always rollback" | Only if you planned and tested rollback |
+| "We did this last time fine" | Survivorship bias. Checklist every time. |
--- a/fullstack-dev/references/technology-selection.md
+++ b/fullstack-dev/references/technology-selection.md
@@ -0,0 +1,254 @@
+# Technology Selection Framework
+
+Structured decision framework for backend and full-stack technology choices. Prevents analysis paralysis while ensuring rigorous evaluation.
+
+**Iron Law: NO TECHNOLOGY CHOICE WITHOUT EXPLICIT TRADE-OFF ANALYSIS.**
+
+"I like it" and "it's trending" are not engineering arguments.
+
+---
+
+## Phase 1: Requirements Before Technology
+
+### Non-Functional Requirements (Quantify!)
+
+| Dimension | Question | Bad Answer | Good Answer |
+|-----------|----------|-----------|-------------|
+| Scale | How many concurrent users? | "Lots" | "1K concurrent, 500 RPS peak" |
+| Latency | Acceptable p99 response time? | "Fast" | "< 200ms API, < 2s reports" |
+| Availability | Required uptime? | "Always up" | "99.9% (≈ 8.76h downtime/year)" |
+| Data volume | Expected storage growth? | "A lot" | "100GB/year, 10M rows" |
+| Consistency | Strong vs eventual? | "Consistent" | "Strong for payments, eventual for feeds" |
+| Compliance | Regulatory? | "Some" | "GDPR data residency EU, SOC 2 Type II" |
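+
+An availability target is easier to discuss once it is converted into a downtime budget. A small sketch of that arithmetic, assuming a 365-day year (the function name is illustrative):
+
+```python
+HOURS_PER_YEAR = 365 * 24  # 8760
+
+
+def downtime_hours_per_year(availability_pct: float) -> float:
+    """Allowed downtime per year for a given availability percentage."""
+    return HOURS_PER_YEAR * (1 - availability_pct / 100)
+
+
+three_nines = round(downtime_hours_per_year(99.9), 2)   # 8.76 hours
+four_nines = round(downtime_hours_per_year(99.99), 3)   # 0.876 hours, about 53 minutes
+```
+
+Each extra nine cuts the budget by a factor of ten, which is usually the point at which cost and operational complexity need to enter the discussion.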
+
+### Team Constraints
+
+- Team size and seniority level
+- What the team already knows well
+- Can you hire for this stack? (check job market)
+- Timeline pressure (days vs months to production)
+- Budget for licenses, infrastructure, training
+
+---
+
+## Phase 2: Evaluation Matrix
+
+Score each option 1-5 on weighted criteria:
+
+| Criterion | Weight | Option A | Option B | Option C |
+|-----------|--------|----------|----------|----------|
+| Meets functional requirements | 5× | _ | _ | _ |
+| Meets non-functional requirements | 5× | _ | _ | _ |
+| Team expertise / learning curve | 4× | _ | _ | _ |
+| Ecosystem maturity (libs, tools) | 3× | _ | _ | _ |
+| Community & long-term viability | 3× | _ | _ | _ |
+| Operational complexity | 3× | _ | _ | _ |
+| Hiring pool availability | 2× | _ | _ | _ |
+| Cost (license + infra + training) | 2× | _ | _ | _ |
+| **Weighted Total** | | _ | _ | _ |
+
+**Rules:**
+- Any option scoring **1 on a 5× criterion** → automatically disqualified
+- Top options within **10%** of each other → choose what the team knows best
+- Top options **10-15%** apart → run a **time-boxed PoC** (2-5 days max)
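+
+A worked example of the matrix arithmetic, as a sketch. The criterion keys and the sample scores are made up for illustration, and the relative gap is measured against the higher total, which is an assumption since the rules do not say what the percentage is relative to.
+
+```python
+# Weights copied from the evaluation matrix above.
+WEIGHTS = {
+    "functional": 5, "non_functional": 5, "team_expertise": 4, "ecosystem": 3,
+    "community": 3, "operations": 3, "hiring": 2, "cost": 2,
+}
+
+
+def weighted_total(scores: dict[str, int]) -> int:
+    return sum(weight * scores[name] for name, weight in WEIGHTS.items())
+
+
+def is_disqualified(scores: dict[str, int]) -> bool:
+    # A score of 1 on any 5x criterion removes the option outright.
+    return any(scores[name] == 1 for name, weight in WEIGHTS.items() if weight == 5)
+
+
+def relative_gap(total_a: int, total_b: int) -> float:
+    return abs(total_a - total_b) / max(total_a, total_b)
+
+
+option_a = dict(zip(WEIGHTS, [5, 4, 4, 4, 4, 3, 3, 3]))  # total 106
+option_b = dict(zip(WEIGHTS, [4, 4, 3, 5, 5, 4, 4, 4]))  # total 110
+option_c = dict(zip(WEIGHTS, [1, 5, 5, 5, 5, 5, 5, 5]))  # fails a 5x criterion
+
+gap = relative_gap(weighted_total(option_a), weighted_total(option_b))
+```
+
+Here the gap is under 4%, so the scores alone do not separate A and B and team familiarity becomes the tie-breaker.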
+
+---
+
+## Phase 3: Decision Trees
+
+### Backend Language / Framework
+
+```
+What type of project?
+│
+├─ REST/GraphQL API, rapid development
+│   ├─ Team knows TypeScript → Node.js
+│   │   ├─ Full-featured, enterprise patterns → NestJS
+│   │   ├─ Lightweight, flexible → Fastify / Hono / Express
+│   │   └─ Full-stack with React → Next.js API routes
+│   ├─ Team knows Python
+│   │   ├─ High-perf async API → FastAPI
+│   │   ├─ Full-stack, admin-heavy → Django
+│   │   └─ Lightweight → Flask / Litestar
+│   └─ Team knows Java/Kotlin
+│       ├─ Enterprise, large team → Spring Boot
+│       └─ Lightweight, fast startup → Quarkus / Ktor
+│
+├─ High concurrency, systems-level
+│   ├─ Microservices, network → Go
+│   ├─ Extreme perf, safety → Rust (Axum / Actix)
+│   └─ Fault tolerance → Elixir (Phoenix)
+│
+├─ Real-time (WebSocket, streaming)
+│   ├─ Node.js ecosystem → Socket.io / ws
+│   ├─ Scalable pub/sub → Elixir Phoenix
+│   └─ Low-latency → Go / Rust
+│
+└─ ML / data-intensive
+    └─ Python (FastAPI + ML libs)
+```
+
+### Database
+
+```
+What data model?
+│
+├─ Structured, relational, ACID
+│   ├─ General purpose → PostgreSQL ← DEFAULT CHOICE
+│   ├─ Read-heavy, MySQL ecosystem → MySQL / MariaDB
+│   └─ Embedded / serverless edge → SQLite / Turso / D1
+│
+├─ Semi-structured, flexible schema
+│   ├─ Document-oriented → MongoDB
+│   ├─ Serverless document → DynamoDB / Firestore
+│   └─ Search-heavy → Elasticsearch / OpenSearch
+│
+├─ Key-value / cache
+│   ├─ In-memory + data structures → Redis / Valkey
+│   └─ Planet-scale KV → DynamoDB / Cassandra
+│
+├─ Time-series → TimescaleDB / ClickHouse / InfluxDB
+├─ Graph → Neo4j / Apache AGE (Postgres extension)
+└─ Vector (AI embeddings) → pgvector / Pinecone / Qdrant
+```
+
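+**Default:** Start with PostgreSQL. It covers most of the branches above (relational data, JSON documents via JSONB, full-text search, vectors via pgvector) well enough that a specialized store can wait until a measured need appears.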
+
+### Caching Strategy
+
+| Pattern | Technology | When |
+|---------|-----------|------|
+| Application cache | Redis / Valkey | Sessions, frequent reads, rate limiting |
+| HTTP cache | CDN (Cloudflare/Vercel) | Static assets, public API responses |
+| Query cache | Materialized views | Complex aggregations, dashboards |
+| In-process cache | LRU (in-memory) | Config, small lookup tables |
+| Edge cache | Cloudflare KV / Vercel KV | Global low-latency reads |
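+
+The in-process row is the cheapest one to adopt. A minimal sketch using Python's standard-library `functools.lru_cache`; the lookup function and its data are hypothetical stand-ins for a slow config or database read.
+
+```python
+from functools import lru_cache
+
+backend_calls = 0  # counts how often the slow path actually runs
+
+
+@lru_cache(maxsize=128)
+def country_name(code: str) -> str:
+    global backend_calls
+    backend_calls += 1
+    # Stand-in for a slow lookup (database, config service, file read).
+    return {"DE": "Germany", "FR": "France"}.get(code, "Unknown")
+
+
+first = country_name("DE")
+second = country_name("DE")  # served from the cache, backend not touched
+```
+
+Note that `lru_cache` has no time-based expiry and is local to one process, so it suits data that rarely changes; anything shared across instances belongs in the application-cache row instead.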
+
+### Message Queue / Event Streaming
+
+| Pattern | Technology | When |
+|---------|-----------|------|
+| Task queue (background jobs) | BullMQ / Celery / SQS | Email, exports, payments |
+| Event streaming (replay, audit) | Kafka / Redpanda | Event sourcing, real-time pipelines |
+| Lightweight pub/sub | Redis Streams / NATS | Simple notifications, broadcasting |
+| Request-reply (sync over async) | NATS / RabbitMQ RPC | Internal service calls |
+
+### Hosting / Deployment
+
+| Model | Technology | When |
+|-------|-----------|------|
+| Serverless (auto-scale) | Vercel / Cloudflare Workers / Lambda | Variable traffic, pay-per-use |
+| Container (predictable) | Cloud Run / Render / Railway / Fly.io | Steady traffic, simple ops |
+| Kubernetes (large scale) | EKS / GKE / AKS | 10+ services, team has K8s expertise |
+| VPS (full control) | DigitalOcean / Hetzner / EC2 | Predictable workload, cost-sensitive |
+
+---
+
+## Phase 4: Decision Documentation
+
+### ADR (Architecture Decision Record) Template
+
+```markdown
+# ADR-{NNN}: {Title}
+
+## Status: Proposed | Accepted | Deprecated | Superseded by ADR-{NNN}
+
+## Context
+What problem are we solving? What forces are at play?
+
+## Decision
+What did we choose and why?
+
+## Evaluation
+| Criterion | Weight | Chosen | Runner-up |
+|-----------|--------|--------|-----------|
+
+## Consequences
+- Positive: ...
+- Negative: ...
+- Risks: ...
+
+## Alternatives Rejected
+- Option B: rejected because...
+- Option C: rejected because...
+```
+
+---
+
+## Common Stack Templates
+
+### A: Startup / MVP (Speed)
+
+| Layer | Choice | Why |
+|-------|--------|-----|
+| Language | TypeScript | One language front + back |
+| Framework | Next.js (full-stack) or NestJS (API) | Fast iteration |
+| Database | PostgreSQL (Supabase / Neon) | Managed, generous free tier |
+| Auth | Better Auth / Clerk | No auth code to maintain |
+| Cache | Redis (Upstash) | Serverless-friendly |
+| Hosting | Vercel / Railway | Zero-config deploys |
+
+### B: SaaS / Business App (Balance)
+
+| Layer | Choice | Why |
+|-------|--------|-----|
+| Language | TypeScript or Python | Team preference |
+| Framework | NestJS or FastAPI | Structured, testable |
+| Database | PostgreSQL | Reliable, feature-rich |
+| Queue | BullMQ (Redis) | Simple background jobs |
+| Auth | OAuth 2.0 + JWT | Standard, flexible |
+| Hosting | AWS ECS / Cloud Run | Scalable containers |
+| Monitoring | Datadog / Grafana + Prometheus | Full observability |
+
+### C: High-Performance (Scale)
+
+| Layer | Choice | Why |
+|-------|--------|-----|
+| Language | Go or Rust | Max throughput, low latency |
+| Database | PostgreSQL + Redis + ClickHouse | OLTP + cache + analytics |
+| Queue | Kafka / Redpanda | High-throughput streaming |
+| Hosting | Kubernetes (EKS/GKE) | Fine-grained scaling |
+| Monitoring | Prometheus + Grafana + Jaeger | Metrics + tracing |
+
+### D: AI / ML Application
+
+| Layer | Choice | Why |
+|-------|--------|-----|
+| Language | Python (API) + TypeScript (frontend) | ML libs + modern UI |
+| Framework | FastAPI + Next.js | Async + SSR |
+| Database | PostgreSQL + pgvector | Relational + embeddings |
+| Queue | Celery + Redis | ML job processing |
+| Hosting | Modal / AWS GPU / Replicate | GPU access |
+
+---
+
+## Anti-Patterns
+
+| # | ❌ Don't | ✅ Do Instead |
+|---|---------|--------------|
+| 1 | "X is trending on HN" | Evaluate against YOUR requirements |
+| 2 | Resume-Driven Development | Choose what team can maintain |
+| 3 | "Must scale to 1M users" (day 1) | Build for 10× current need, not 1000× |
+| 4 | Evaluate for weeks | Time-box to 3-5 days, then decide |
+| 5 | No decision documentation | Write ADR for every major choice |
+| 6 | Ignore operational cost | Include deploy, monitor, debug cost |
+| 7 | "We'll rewrite later" | Assume you won't. Choose carefully. |
+| 8 | Microservices by default | Start monolith, extract when needed |
+| 9 | Different DB per service (day 1) | One database, split when justified |
+| 10 | "It worked at Google" | You're not Google. Scale to YOUR context. |
+
+---
+
+## Common Issues
+
+### Issue 1: "Team can't agree on a framework"
+
+**Fix:** Time-box to 3 days. Fill the evaluation matrix. If the scores are within 10%, pick what the majority knows. Document in an ADR. Move on.
+
+### Issue 2: "We picked X but it doesn't fit"
+
+**Fix:** Sunk cost fallacy check. If < 2 weeks invested, switch now. If > 2 weeks, document pain points and plan phased migration.
+
+### Issue 3: "Do we need microservices?"
+
+**Fix:** Almost certainly no. Start with a well-structured monolith. Extract to services only when: (a) different scaling needs, (b) different team ownership, (c) different deployment cadence.
--- a/fullstack-dev/references/testing-strategy.md
+++ b/fullstack-dev/references/testing-strategy.md
@@ -0,0 +1,404 @@
+# Backend Testing Strategy
+
+Comprehensive testing guide for backend and full-stack applications. Covers the full testing pyramid with deep focus on API integration tests, database testing, contract testing, and performance testing.
+
+## Quick Start Checklist
+
+- [ ] **Test runner configured** (Jest/Vitest, Pytest, Go test)
+- [ ] **Test database** ready (Docker container or in-memory)
+- [ ] **Database isolation** per test (transaction rollback or truncation)
+- [ ] **Test factories** for common entities (user, order, product)
+- [ ] **Auth helper** to generate tokens for tests
+- [ ] **CI pipeline** runs tests with real database service
+- [ ] **Coverage threshold** enforced (≥ 80%)
+
+---
+
+## The Testing Pyramid
+
+```
+         ╱╲        E2E (few, slow) — full flows across services
+        ╱  ╲
+       ╱────╲       Integration (moderate) — API + DB + external
+      ╱      ╲
+     ╱────────╲      Unit (many, fast) — pure business logic
+    ╱__________╲
+```
+
+| Level | What | Speed | Count |
+|-------|------|-------|-------|
+| Unit | Pure functions, business logic, no I/O | < 10ms | 70%+ of tests |
+| Integration | API routes + real database + mocked externals | 50-500ms | ~20% |
+| E2E | Full user flow across deployed services | 1-30s | ~10% |
+| Contract | API compatibility between services | < 100ms | Per API boundary |
+| Performance | Load, stress, soak | Minutes | Per critical path |
+
+---
+
+## 1. API Integration Testing (CRITICAL)
+
+### What to Test for Every Endpoint
+
+| Aspect | Tests to Write |
+|--------|---------------|
+| Happy path | Correct input → expected response + correct DB state |
+| Auth | No token → 401, bad token → 401, expired → 401 |
+| Authorization | Wrong role → 403, not owner → 403 |
+| Validation | Missing fields → 422, bad types → 422, boundary values |
+| Not found | Invalid ID → 404, deleted resource → 404 |
+| Conflict | Duplicate create → 409, stale update → 409 |
+| Idempotency | Same request twice → same result |
+| Side effects | DB state changed, events emitted, cache invalidated |
+| Error format | All errors match RFC 9457 envelope |
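+
+The error-format row is easiest to enforce with one shared assertion helper. The sketch below checks the standard RFC 9457 members; the helper name, the example `type` URL, and the `errors` extension member are assumptions made for illustration, and the strictness (requiring `title` and `status`) is a project convention rather than something the RFC mandates.
+
+```python
+def assert_problem_details(status_code: int, content_type: str, body: dict) -> None:
+    """Assert that an error response looks like an RFC 9457 problem document."""
+    media_type = content_type.split(";")[0].strip().lower()
+    assert media_type == "application/problem+json", media_type
+    assert body.get("status") == status_code, "body status must mirror the HTTP status"
+    assert isinstance(body.get("title"), str) and body["title"], "title missing"
+    # `type` defaults to "about:blank" when absent.
+    assert isinstance(body.get("type", "about:blank"), str), "type must be a URI string"
+
+
+sample = {
+    "type": "https://example.com/problems/validation-error",  # hypothetical URI
+    "title": "Validation failed",
+    "status": 422,
+    "detail": "items must contain at least one entry",
+    "errors": [{"field": "items", "message": "must not be empty"}],  # extension member
+}
+
+assert_problem_details(422, "application/problem+json; charset=utf-8", sample)
+```
+
+Calling the helper from every negative-path test keeps the envelope consistent without repeating the same four assertions per endpoint.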
+
+### TypeScript (Jest + Supertest)
+
+```typescript
+describe('POST /api/orders', () => {
+  let token: string;
+  let product: Product;
+
+  beforeAll(async () => {
+    await resetDatabase();
+    const user = await createTestUser({ role: 'customer' });
+    token = await getAuthToken(user);
+    product = await createTestProduct({ price: 29.99, stock: 10 });
+  });
+
+  it('creates order → 201 + correct DB state', async () => {
+    const res = await request(app)
+      .post('/api/orders')
+      .set('Authorization', `Bearer ${token}`)
+      .send({ items: [{ productId: product.id, quantity: 2 }] });
+
+    expect(res.status).toBe(201);
+    expect(res.body.data.total).toBe(59.98);
+
+    const updated = await db.product.findUnique({ where: { id: product.id } });
+    expect(updated!.stock).toBe(8);
+  });
+
+  it('rejects without auth → 401', async () => {
+    const res = await request(app).post('/api/orders').send({ items: [] });
+    expect(res.status).toBe(401);
+  });
+
+  it('rejects empty items → 422', async () => {
+    const res = await request(app)
+      .post('/api/orders')
+      .set('Authorization', `Bearer ${token}`)
+      .send({ items: [] });
+    expect(res.status).toBe(422);
+    expect(res.body.errors[0].field).toBe('items');
+  });
+});
+```
+
+### Python (Pytest + FastAPI TestClient)
+
+```python
+@pytest.fixture
+def client(db_session):
+    def override_get_db():
+        yield db_session
+    app.dependency_overrides[get_db] = override_get_db
+    yield TestClient(app)
+    app.dependency_overrides.clear()
+
+def test_create_order_success(client, auth_headers, test_product):
+    response = client.post("/api/orders", json={
+        "items": [{"product_id": test_product.id, "quantity": 2}]
+    }, headers=auth_headers)
+    assert response.status_code == 201
+    assert response.json()["data"]["total"] == 59.98
+
+def test_create_order_no_auth(client):
+    response = client.post("/api/orders", json={"items": []})
+    assert response.status_code == 401
+
+def test_create_order_empty_items(client, auth_headers):
+    response = client.post("/api/orders", json={"items": []}, headers=auth_headers)
+    assert response.status_code == 422
+```
+
+---
+
+## 2. Database Testing (HIGH)
+
+### Test Isolation Strategies
+
+| Strategy | Speed | Realism | When |
+|----------|-------|---------|------|
+| **Transaction rollback** | ⚡ Fastest | Medium | Default for unit + integration |
+| **Truncation** | Fast | High | When rollback isn't possible |
+| **Test containers** | Slow startup | Highest | CI pipeline, full integration |
+
+**Transaction rollback (recommended default):**
+```typescript
+let tx: Transaction;
+beforeEach(async () => { tx = await db.beginTransaction(); });
+afterEach(async () => { await tx.rollback(); });
+```
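+
+The same idea, shown as a runnable sketch with Python's standard-library `sqlite3` so the effect is visible end to end. The `orders` table and the `rolled_back` helper are hypothetical; in a real suite this would typically live in a fixture around the project's own database session.
+
+```python
+import sqlite3
+from contextlib import contextmanager
+
+
+@contextmanager
+def rolled_back(conn: sqlite3.Connection):
+    """Run a block against `conn`, then discard every write it made."""
+    try:
+        yield conn
+    finally:
+        conn.rollback()
+
+
+conn = sqlite3.connect(":memory:")
+conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
+conn.commit()  # the schema is committed once, outside any test
+
+with rolled_back(conn) as tx:
+    tx.execute("INSERT INTO orders (total) VALUES (?)", (59.98,))
+    rows_inside = tx.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
+
+rows_after = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
+```
+
+The test sees its own write (`rows_inside` is 1) while the next test starts from an empty table (`rows_after` is 0). The approach breaks down when the code under test commits on its own connection, which is when truncation becomes the fallback.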
+
+**Docker test containers (CI):**
+```yaml
+# docker-compose.test.yml
+services:
+  test-db:
+    image: postgres:16-alpine
+    tmpfs: /var/lib/postgresql/data   # RAM disk for speed
+    environment:
+      POSTGRES_DB: myapp_test
+      POSTGRES_PASSWORD: test          # required by the official image; test-only value
+```
+
+### Test Factories (Not Raw SQL)
+
+```typescript
+// factories/user.factory.ts
+import { faker } from '@faker-js/faker';
+
+export function buildUser(overrides: Partial<User> = {}): CreateUserDTO {
+  return {
+    email: faker.internet.email(),
+    firstName: faker.person.firstName(),
+    role: 'customer',
+    ...overrides,
+  };
+}
+export async function createUser(overrides = {}) {
+  return db.user.create({ data: buildUser(overrides) });
+}
+```
+
+```python
+# factories/user_factory.py
+import factory
+
+class UserFactory(factory.Factory):
+    class Meta:
+        model = User  # the project's own User class
+    # factory.Faker draws from factory_boy's shared Faker instance
+    email = factory.Faker("email")
+    first_name = factory.Faker("first_name")
+    role = "customer"
+```
+
+---
+
+## 3. External Service Testing (HIGH)
+
+### HTTP-Level Mocking (Not Function Mocking)
+
+**TypeScript (nock):**
+```typescript
+import nock from 'nock';
+
+it('processes payment successfully', async () => {
+  nock('https://api.stripe.com')
+    .post('/v1/charges')
+    .reply(200, { id: 'ch_123', status: 'succeeded', amount: 5000 });
+
+  const result = await paymentService.charge({ amount: 50.00, currency: 'usd' });
+  expect(result.status).toBe('succeeded');
+});
+
+it('handles payment timeout', async () => {
+  nock('https://api.stripe.com').post('/v1/charges').delay(10000).reply(200);
+  await expect(paymentService.charge({ amount: 50, currency: 'usd' }))
+    .rejects.toThrow('timeout');
+});
+```
+
+**Python (responses):**
+```python
+import responses
+
+@responses.activate
+def test_payment_success():
+    responses.post("https://api.stripe.com/v1/charges",
+                   json={"id": "ch_123", "status": "succeeded"}, status=200)
+    result = payment_service.charge(amount=50.00, currency="usd")
+    assert result.status == "succeeded"
+```
+
+### Test Containers for Infrastructure
+
+```typescript
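+import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql';
+
+let pg: StartedPostgreSqlContainer;
+
+beforeAll(async () => {
+  pg = await new PostgreSqlContainer('postgres:16').start();
+  process.env.DATABASE_URL = pg.getConnectionUri();
+  await runMigrations();
+}, 60000); // generous timeout: the first run may need to pull the image
+
+// Stop the container so CI runners do not leak it
+afterAll(async () => { await pg.stop(); });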
+```
+
+---
+
+## 4. Contract Testing (MEDIUM-HIGH)
+
+### Consumer-Driven Contracts (Pact)
+
+**Consumer (OrderService calls UserService):**
+```typescript
+it('can fetch user by ID', async () => {
+  await pact.addInteraction()
+    .given('user usr_123 exists')
+    .uponReceiving('GET /users/usr_123')
+    .withRequest('GET', '/api/users/usr_123')
+    .willRespondWith(200, (b) => {
+      b.jsonBody({ data: { id: MatchersV3.string(), email: MatchersV3.email() } });
+    })
+    .executeTest(async (mockserver) => {
+      const user = await new UserClient(mockserver.url).getUser('usr_123');
+      expect(user.id).toBeDefined();
+    });
+});
+```
+
+**Provider verifies in CI:**
+```typescript
+await new Verifier({
+  providerBaseUrl: 'http://localhost:3001',
+  pactBrokerUrl: process.env.PACT_BROKER_URL,
+  provider: 'UserService',
+}).verifyProvider();
+```
+
+---
+
+## 5. Performance Testing (MEDIUM)
+
+### k6 Load Test
+
+```javascript
+import http from 'k6/http';
+import { check, sleep } from 'k6';
+
+export const options = {
+  stages: [
+    { duration: '30s', target: 20 },    // ramp up to 20 VUs
+    { duration: '1m',  target: 100 },   // ramp to a peak of 100 VUs
+    { duration: '30s', target: 0 },     // ramp down
+  ],
+  thresholds: {
+    http_req_duration: ['p(95)<500', 'p(99)<1000'],
+    http_req_failed: ['rate<0.001'],    // < 0.1%, matching the error-rate budget
+  },
+};
+
+export default function () {
+  const res = http.get(`${__ENV.BASE_URL}/api/orders`);
+  check(res, { 'status 200': (r) => r.status === 200 });
+  sleep(1);
+}
+```
+
+### Performance Budgets
+
+| Metric | Target | Action if Exceeded |
+|--------|--------|--------------------|
+| p95 response time | < 500ms | Optimize queries/caching |
+| p99 response time | < 1000ms | Check outlier queries |
+| Error rate | < 0.1% | Investigate spikes |
+| DB query time | < 100ms each | Add indexes |
+
+### When to Run
+
+| Trigger | Test Type |
+|---------|-----------|
+| Before major release | Full load test |
+| New DB query/index | Query benchmark |
+| Infrastructure change | Baseline comparison |
+| Weekly (CI) | Smoke load test |
+
+---
+
+## Test File Organization
+
+```
+tests/
+  unit/                      # Pure logic, mocked dependencies
+    order.service.test.ts
+  integration/               # API + real DB
+    orders.api.test.ts
+    auth.api.test.ts
+  contracts/                 # Consumer-driven contracts
+    user-service.consumer.pact.ts
+  performance/               # Load tests
+    load-test.js
+  fixtures/
+    factories/               # Test data factories
+      user.factory.ts
+    seeds/
+      test-data.ts
+  helpers/
+    setup.ts                 # Global test config
+    auth.helper.ts           # Token generation
+    db.helper.ts             # DB cleanup
+```
+
+---
+
+## Anti-Patterns
+
+| # | ❌ Don't | ✅ Do Instead |
+|---|---------|--------------|
+| 1 | Test only happy paths | Test errors, auth, validation, edge cases |
+| 2 | Mock everything (no real DB) | Use test containers or test DB |
+| 3 | Tests depend on execution order | Each test sets up / tears down own state |
+| 4 | Hardcode test data | Use factories (faker + overrides) |
+| 5 | Test implementation details | Test behavior: input → output |
+| 6 | Share mutable state | Isolate per test (transaction rollback) |
+| 7 | Skip migration testing in CI | Run migrations from scratch in CI |
+| 8 | No performance test before release | Load test every major release |
+| 9 | Test against production data | Generated test data only |
+| 10 | Test suite > 10 minutes | Parallelize, RAM disk, optimize setup |
+
+---
+
+## Common Issues
+
+### Issue 1: "Tests pass alone but fail together"
+
+**Cause:** Shared database state between tests. Missing cleanup.
+
+**Fix:**
+```typescript
+beforeEach(async () => { await db.raw('TRUNCATE orders, users CASCADE'); });
+// OR use transaction rollback per test
+```
+
+### Issue 2: "Jest did not exit one second after the test run has completed"
+
+**Cause:** Unclosed database connections or HTTP servers.
+
+**Fix:**
+```typescript
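+afterAll(async () => {
+  await db.destroy();
+  // Node's http.Server#close takes a callback instead of returning a promise
+  await new Promise((resolve) => server.close(resolve));
+});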
+```
+
+### Issue 3: "Async callback was not invoked within timeout"
+
+**Cause:** Missing `async/await`, a `done` callback that is never called, or a promise that never settles.
+
+**Fix:**
+```typescript
+// ❌ Promise not awaited
+it('should work', () => { request(app).get('/users'); });
+
+// ✅ Properly awaited
+it('should work', async () => { await request(app).get('/users'); });
+```
+
+### Issue 4: "Integration tests too slow in CI"
+
+**Fix:**
+1. Use `tmpfs` for PostgreSQL data dir (RAM disk)
+2. Run migrations once in `beforeAll`, truncate in `beforeEach`
+3. Parallelize test suites with `--maxWorkers`
+4. Skip performance tests on feature branches (only main)