Initial commit: add all skills files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
444
fullstack-dev/references/api-design.md
Normal file
@@ -0,0 +1,444 @@
---
name: fullstack-dev-api-design
description: "API design patterns and best practices. Use when creating endpoints, choosing methods/status codes, implementing pagination, or writing OpenAPI specs. Prevents common REST/GraphQL/gRPC mistakes."
license: MIT
metadata:
  version: "2.0.0"
  sources:
    - Microsoft REST API Guidelines
    - Google API Design Guide
    - Zalando RESTful API Guidelines
    - JSON:API Specification
    - RFC 9457 (Problem Details for HTTP APIs)
    - RFC 9110 (HTTP Semantics)
---
|
||||
|
||||
# API Design Guidelines
|
||||
|
||||
Framework-agnostic API design guide for backend and full-stack engineers. 50+ rules across 10 categories, prioritized by impact. Covers REST, GraphQL, and gRPC.
|
||||
|
||||
## Scope
|
||||
|
||||
**USE this skill when:**
|
||||
- Designing a new API or adding endpoints
|
||||
- Reviewing API pull requests
|
||||
- Choosing between REST / GraphQL / gRPC
|
||||
- Writing OpenAPI specifications
|
||||
- Migrating or versioning an existing API
|
||||
|
||||
**NOT for:**
|
||||
- Framework-specific implementation details (use your framework's own skill/docs)
|
||||
- Frontend data fetching patterns (use React Query / SWR docs)
|
||||
- Authentication implementation details (use your auth library's docs)
|
||||
- Database schema design (→ `database-schema-design`)
|
||||
|
||||
## Context Required
|
||||
|
||||
Before applying this skill, gather:
|
||||
|
||||
| Required | Optional |
|
||||
|----------|----------|
|
||||
| Target consumers (browser, mobile, service) | Existing API conventions in the project |
|
||||
| Expected request volume (RPS estimate) | Current OpenAPI / Swagger spec |
|
||||
| Authentication method (JWT, API key, OAuth) | Rate limiting requirements |
|
||||
| Data model / domain entities | Caching strategy |
|
||||
|
||||
---
|
||||
|
||||
## Quick Start Checklist
|
||||
|
||||
New API endpoint? Run through this before writing code:
|
||||
|
||||
- [ ] Resource named as **plural noun** (`/orders`, not `/getOrders`)
|
||||
- [ ] URL in **kebab-case**, body fields in **camelCase**
|
||||
- [ ] Correct **HTTP method** (GET=read, POST=create, PUT=replace, PATCH=partial, DELETE=remove)
|
||||
- [ ] Correct **status code** (201 Created, 422 Validation, 404 Not Found…)
|
||||
- [ ] Error response follows **RFC 9457** envelope
|
||||
- [ ] **Pagination** on all list endpoints (default 20, max 100)
|
||||
- [ ] **Authentication** required (Bearer token, not query param)
|
||||
- [ ] **Request ID** in response header (`X-Request-Id`)
|
||||
- [ ] **Rate limit** headers included
|
||||
- [ ] Endpoint documented in **OpenAPI spec**
|
||||
|
||||
---
|
||||
|
||||
## Quick Navigation
|
||||
|
||||
| Need to… | Jump to |
|
||||
|----------|---------|
|
||||
| Name a resource URL | [1. Resource Modeling](#1-resource-modeling-critical) |
|
||||
| Pick HTTP method + status code | [3. HTTP Methods & Status Codes](#3-http-methods--status-codes-critical) |
|
||||
| Format error responses | [4. Error Handling](#4-error-handling-high) |
|
||||
| Add pagination or filtering | [6. Pagination & Filtering](#6-pagination--filtering-high) |
|
||||
| Choose API style (REST vs GraphQL vs gRPC) | [10. API Style Decision](#10-api-style-decision-tree) |
|
||||
| Version an existing API | [7. Versioning](#7-versioning-medium-high) |
|
||||
| Avoid common mistakes | [Anti-Patterns](#anti-patterns-checklist) |
|
||||
|
||||
---
|
||||
|
||||
## 1. Resource Modeling (CRITICAL)
|
||||
|
||||
### Core Rules
|
||||
|
||||
```
|
||||
✅ /users — plural noun
|
||||
✅ /users/{id}/orders — 1 level nesting
|
||||
✅ /reviews?orderId={oid} — flatten deep nesting with query params
|
||||
|
||||
❌ /getUsers — verb in URL
|
||||
❌ /user — singular
|
||||
❌ /users/{uid}/orders/{oid}/items/{iid}/reviews — 3+ levels deep
|
||||
```
|
||||
|
||||
**Max nesting: 2 levels.** Beyond that, promote to top-level resource with filters.
|
||||
|
||||
### Domain Alignment
|
||||
|
||||
Resources map to **domain concepts**, not database tables:
|
||||
|
||||
```
|
||||
✅ /checkout-sessions (domain aggregate)
|
||||
✅ /shipping-labels (domain concept)
|
||||
|
||||
❌ /tbl_order_header (database table leak)
|
||||
❌ /join_user_role (internal schema leak)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. URL & Naming (CRITICAL)
|
||||
|
||||
| Context | Convention | Example |
|
||||
|---------|-----------|---------|
|
||||
| URL path | kebab-case | `/order-items` |
|
||||
| JSON body fields | camelCase | `{ "firstName": "Jane" }` |
|
||||
| Query params | camelCase or snake_case (be consistent) | `?sortBy=createdAt` |
|
||||
| Headers | Train-Case | `X-Request-Id` |
|
||||
|
||||
**Python exception:** If your entire stack is Python/snake_case, you MAY use `snake_case` in JSON — but be **consistent across all endpoints**.
|
||||
|
||||
```
|
||||
✅ GET /users ❌ GET /users/
|
||||
✅ GET /reports/annual ❌ GET /reports/annual.json
|
||||
✅ POST /users ❌ POST /users/create
|
||||
```
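The naming rules above can be checked mechanically. The following is a minimal sketch of a path linter, assuming path templates use `{param}` placeholders; `lint_path` and the verb list are hypothetical names chosen for illustration, not part of any framework.

```python
import re

# Kebab-case segment: lowercase letters/digits separated by single hyphens.
SEGMENT_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
# Hypothetical verb list; a verb counts only as a whole word or camelCase prefix.
VERB_RE = re.compile(r"^(get|create|update|delete|list|fetch)([A-Z_-]|$)")


def lint_path(path: str) -> list[str]:
    """Return the naming problems found in a URL path template."""
    problems = []
    if len(path) > 1 and path.endswith("/"):
        problems.append("trailing slash")
    for seg in (s for s in path.strip("/").split("/") if s):
        if seg.startswith("{") and seg.endswith("}"):
            continue  # path parameter such as {id}
        if "." in seg:
            problems.append(f"file extension in '{seg}'")
        elif not SEGMENT_RE.match(seg):
            problems.append(f"'{seg}' is not kebab-case")
        if VERB_RE.match(seg):
            problems.append(f"'{seg}' looks like a verb")
    return problems
```

A check like this could run in CI against the paths listed in an OpenAPI spec. It cannot tell singular from plural nouns, so that rule still needs human review.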
|
||||
|
||||
---
|
||||
|
||||
## 3. HTTP Methods & Status Codes (CRITICAL)
|
||||
|
||||
### Method Semantics
|
||||
|
||||
| Method | Semantics | Idempotent | Safe | Request Body |
|
||||
|--------|-----------|-----------|------|-------------|
|
||||
| GET | Read | ✅ | ✅ | ❌ Never |
|
||||
| POST | Create / Action | ❌ | ❌ | ✅ Always |
|
||||
| PUT | Full replace | ✅ | ❌ | ✅ Always |
|
||||
| PATCH | Partial update | ❌* | ❌ | ✅ Always |
|
||||
| DELETE | Remove | ✅ | ❌ | ❌ Rarely |
|
||||
\* PATCH is not guaranteed to be idempotent (RFC 5789), although a patch that only sets fields to fixed values, such as a JSON Merge Patch, is idempotent in practice.
|
||||
### Status Code Quick Reference
|
||||
|
||||
**Success:**
|
||||
|
||||
| Code | When | Response Body |
|
||||
|------|------|--------------|
|
||||
| 200 OK | GET, PUT, PATCH success | Resource / result |
|
||||
| 201 Created | POST created resource | Created resource + `Location` header |
|
||||
| 202 Accepted | Async operation started | Job ID / status URL |
|
||||
| 204 No Content | DELETE success, PUT with no body | None |
|
||||
|
||||
**Client Errors:**
|
||||
|
||||
| Code | When | Key Distinction |
|
||||
|------|------|-----------------|
|
||||
| 400 Bad Request | Malformed syntax | Can't even parse |
|
||||
| 401 Unauthorized | Missing / invalid auth | "Who are you?" |
|
||||
| 403 Forbidden | Authenticated, no permission | "I know you, but no" |
|
||||
| 404 Not Found | Resource doesn't exist | Also use to hide 403 |
|
||||
| 409 Conflict | Duplicate, version mismatch | State conflict |
|
||||
| 422 Unprocessable | Valid syntax, failed validation | Semantic errors |
|
||||
| 429 Too Many Requests | Rate limit hit | Include `Retry-After` |
|
||||
|
||||
**Server Errors:** 500 (unexpected), 502 (upstream fail), 503 (overloaded), 504 (upstream timeout)
|
||||
|
||||
---
|
||||
|
||||
## 4. Error Handling (HIGH)
|
||||
|
||||
### Standard Error Envelope (RFC 9457)
|
||||
|
||||
Every error response uses this format:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "https://api.example.com/errors/insufficient-funds",
|
||||
"title": "Insufficient Funds",
|
||||
"status": 422,
|
||||
"detail": "Account balance $10.00 is less than withdrawal $50.00.",
|
||||
"instance": "/transactions/txn_abc123",
|
||||
"request_id": "req_7f3a8b2c",
|
||||
"errors": [
|
||||
{ "field": "amount", "message": "Exceeds balance", "code": "INSUFFICIENT_BALANCE" }
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Multi-Language Implementation
|
||||
|
||||
**TypeScript (Express):**
|
||||
```typescript
|
||||
class AppError extends Error {
  constructor(
    public readonly title: string,
    public readonly status: number,
    public readonly detail: string,
    public readonly code: string,
  ) { super(detail); }
}

// Error-handling middleware: register it after all routes.
// Assumes a request-ID middleware has already set req.id.
app.use((err, req, res, next) => {
  if (err instanceof AppError) {
    return res.status(err.status).type('application/problem+json').json({
      type: `https://api.example.com/errors/${err.code}`,
      title: err.title, status: err.status,
      detail: err.detail, request_id: req.id,
    });
  }
  // Unknown error: log it internally, return a generic problem document
  console.error(err);
  res.status(500).type('application/problem+json').json({
    type: 'about:blank', title: 'Internal Error', status: 500, request_id: req.id,
  });
});
|
||||
```
|
||||
|
||||
**Python (FastAPI):**
|
||||
```python
|
||||
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()


class AppError(Exception):
    def __init__(self, title: str, status: int, detail: str, code: str):
        super().__init__(detail)
        self.title, self.status, self.detail, self.code = title, status, detail, code


@app.exception_handler(AppError)
async def app_error_handler(request: Request, exc: AppError):
    return JSONResponse(
        status_code=exc.status,
        media_type="application/problem+json",  # RFC 9457 media type
        content={
            "type": f"https://api.example.com/errors/{exc.code}",
            "title": exc.title, "status": exc.status,
            "detail": exc.detail,
            # Assumes a request-ID middleware has set request.state.request_id
            "request_id": request.state.request_id,
        },
    )
|
||||
```
|
||||
|
||||
### Iron Rules
|
||||
|
||||
```
|
||||
✅ Return RFC 9457 error envelope for ALL errors
|
||||
✅ Include request_id in every error response
|
||||
✅ Return per-field validation errors in `errors` array
|
||||
|
||||
❌ Never expose stack traces in production
|
||||
❌ Never return 200 for errors
|
||||
❌ Never swallow errors silently
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Authentication & Authorization (HIGH)
|
||||
|
||||
```
|
||||
✅ Authorization: Bearer eyJhbGci... (header)
|
||||
❌ GET /users?token=eyJhbGci... (URL — appears in logs)
|
||||
|
||||
✅ 401 → "Who are you?" (missing/invalid credentials)
|
||||
✅ 403 → "You can't do this" (authenticated, no permission)
|
||||
✅ 404 → Hide resource existence (use instead of 403 when needed)
|
||||
```
|
||||
|
||||
**Rate Limit Headers (always include):**
|
||||
```
|
||||
X-RateLimit-Limit: 100
|
||||
X-RateLimit-Remaining: 42
|
||||
X-RateLimit-Reset: 1625097600
|
||||
Retry-After: 30
|
||||
```
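One way to produce these headers is a fixed-window counter. The sketch below is a single-process, in-memory illustration only; `FixedWindowLimiter` is a hypothetical name, and a real deployment would normally keep the counters in a shared store and expire old windows.

```python
import time
from typing import Optional


class FixedWindowLimiter:
    """Fixed-window request counter per client key (in-memory sketch)."""

    def __init__(self, limit: int = 100, window_seconds: int = 60):
        self.limit = limit
        self.window = window_seconds
        self._counts: dict[tuple[str, int], int] = {}  # old windows are never evicted here

    def check(self, key: str, now: Optional[float] = None) -> tuple[bool, dict[str, str]]:
        """Return (allowed, headers) for one request from `key`."""
        now = time.time() if now is None else now
        window_start = int(now // self.window) * self.window
        reset_at = window_start + self.window
        used = self._counts.get((key, window_start), 0)
        allowed = used < self.limit
        if allowed:
            used += 1
            self._counts[(key, window_start)] = used
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(max(self.limit - used, 0)),
            "X-RateLimit-Reset": str(reset_at),  # Unix timestamp of the next window
        }
        if not allowed:
            # Seconds until the window resets; send together with a 429 response
            headers["Retry-After"] = str(max(int(reset_at - now), 1))
        return allowed, headers
```

When `allowed` is false the handler would respond with 429 and the returned headers. Otherwise it attaches the same headers to the normal response.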
|
||||
|
||||
---
|
||||
|
||||
## 6. Pagination & Filtering (HIGH)
|
||||
|
||||
### Cursor vs Offset
|
||||
|
||||
| Strategy | When | Pros | Cons |
|
||||
|----------|------|------|------|
|
||||
| **Cursor** (preferred) | Large/dynamic datasets | Consistent, no skips | Can't jump to page N |
|
||||
| **Offset** | Small/stable datasets, admin UIs | Simple, page jumps | Drift on insert/delete |
|
||||
|
||||
**Cursor pagination response:**
|
||||
```json
|
||||
{
|
||||
"data": [...],
|
||||
"pagination": { "next_cursor": "eyJpZCI6MTIwfQ", "has_more": true }
|
||||
}
|
||||
```
|
||||
|
||||
**Offset pagination response:**
|
||||
```json
|
||||
{
|
||||
"data": [...],
|
||||
"pagination": { "page": 3, "per_page": 20, "total": 256, "total_pages": 13 }
|
||||
}
|
||||
```
|
||||
|
||||
**Always enforce:** Default 20 items, max 100 items.
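As an illustration of the rules above, here is a small sketch of limit clamping and opaque cursors. It assumes rows are sorted by a positive integer `id` and that the cursor is unpadded base64url JSON, which is one reading of the `eyJpZCI6MTIwfQ` example; the function names are hypothetical.

```python
import base64
import json
from typing import Optional

DEFAULT_LIMIT = 20
MAX_LIMIT = 100


def clamp_limit(raw: Optional[str]) -> int:
    """Parse a `limit` query value, applying the default and the ceiling."""
    if raw is None:
        return DEFAULT_LIMIT
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_LIMIT
    return max(1, min(value, MAX_LIMIT))


def encode_cursor(last_id: int) -> str:
    payload = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(payload).decode().rstrip("=")


def decode_cursor(cursor: str) -> int:
    padded = cursor + "=" * (-len(cursor) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))["id"]


def paginate(rows: list[dict], cursor: Optional[str], limit: int) -> dict:
    """Keyset-paginate `rows`, which must be sorted by ascending `id`."""
    after_id = decode_cursor(cursor) if cursor else 0
    # Fetch one extra row to learn whether another page exists
    page = [r for r in rows if r["id"] > after_id][: limit + 1]
    has_more = len(page) > limit
    page = page[:limit]
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "pagination": {"next_cursor": next_cursor, "has_more": has_more}}
```

In a real handler the list comprehension would be a `WHERE id > :after_id ORDER BY id LIMIT :limit + 1` query. The in-memory list only keeps the sketch runnable.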
|
||||
|
||||
### Standard Filter Patterns
|
||||
|
||||
```
|
||||
GET /orders?status=shipped&created_after=2025-01-01&sort=-created_at&fields=id,status
|
||||
```
|
||||
|
||||
| Pattern | Convention |
|
||||
|---------|-----------|
|
||||
| Exact match | `?status=shipped` |
|
||||
| Range | `?price_gte=10&price_lte=100` |
|
||||
| Date range | `?created_after=2025-01-01&created_before=2025-12-31` |
|
||||
| Sort | `?sort=field` (asc), `?sort=-field` (desc) |
|
||||
| Sparse fields | `?fields=id,name,email` |
|
||||
| Search | `?q=search+term` |
|
||||
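The sort convention can be parsed in a few lines. This sketch assumes comma-separated multi-field sorting, which the table does not specify, and takes an allowlist of sortable fields so clients cannot sort on arbitrary columns; `parse_sort` is a hypothetical helper.

```python
def parse_sort(raw: str, allowed: set[str]) -> list[tuple[str, str]]:
    """Turn a value such as '-created_at,id' into (field, direction) pairs."""
    result = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        descending = part.startswith("-")
        field = part[1:] if descending else part
        if field not in allowed:
            # The caller would map this to a 422 validation error
            raise ValueError(f"cannot sort by '{field}'")
        result.append((field, "desc" if descending else "asc"))
    return result
```

For example, `parse_sort("-created_at", {"created_at", "id"})` yields one descending sort key, which the handler can translate into an `ORDER BY` clause.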
|
||||
---
|
||||
|
||||
## 7. Versioning (MEDIUM-HIGH)
|
||||
|
||||
| Strategy | Format | Best For |
|
||||
|----------|--------|----------|
|
||||
| **URL path** (recommended) | `/v1/users` | Public APIs |
|
||||
| **Header** | `Api-Version: 2` | Internal APIs |
|
||||
| **Query param** | `?version=2` | Legacy (avoid) |
|
||||
|
||||
**Non-breaking changes (no version bump):** New optional response fields, new endpoints, new optional params.
|
||||
|
||||
**Breaking changes (new version required):** Removing/renaming fields, changing types, stricter validation, removing endpoints.
|
||||
|
||||
**Deprecation headers:**
|
||||
```
|
||||
Sunset: Sun, 01 Mar 2026 00:00:00 GMT
|
||||
Deprecation: true
|
||||
Link: <https://api.example.com/v2/users>; rel="successor-version"
|
||||
```
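These headers are easy to get subtly wrong by hand, in particular the weekday in the HTTP-date. A minimal sketch that derives them from a `datetime`, using only the standard library; `deprecation_headers` is a hypothetical helper name and the header values mirror the example above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(sunset: datetime, successor_url: str) -> dict[str, str]:
    """Build headers for a deprecated endpoint; `sunset` must be timezone-aware UTC."""
    return {
        # format_datetime(..., usegmt=True) emits the HTTP-date form ending in "GMT"
        "Sunset": format_datetime(sunset, usegmt=True),
        "Deprecation": "true",
        "Link": f'<{successor_url}>; rel="successor-version"',
    }


headers = deprecation_headers(
    datetime(2026, 3, 1, tzinfo=timezone.utc),
    "https://api.example.com/v2/users",
)
```

Computing the date string this way keeps the day name and the date consistent with each other.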
|
||||
|
||||
---
|
||||
|
||||
## 8. Request / Response Design (MEDIUM)
|
||||
|
||||
### Consistent Envelope
|
||||
|
||||
```json
|
||||
{
|
||||
"data": { "id": "ord_123", "status": "pending", "total": 99.50 },
|
||||
"meta": { "request_id": "req_abc123", "timestamp": "2025-06-15T10:30:00Z" }
|
||||
}
|
||||
```
|
||||
|
||||
### Key Rules
|
||||
|
||||
| Rule | Correct | Wrong |
|
||||
|------|---------|-------|
|
||||
| Timestamps | `"2025-06-15T10:30:00Z"` (ISO 8601) | `"06/15/2025"` or `1749983400` |
|
||||
| Public IDs | UUID `"550e8400-..."` | Auto-increment `42` |
|
||||
| Null vs absent (PATCH) | `{ "nickname": null }` clears the field; an absent field is left unchanged | Treating `null` and absent fields the same |
|
||||
| HATEOAS (public APIs) | `"links": { "cancel": "/orders/123/cancel" }` | No discoverability |
|
||||
|
||||
---
|
||||
|
||||
## 9. Documentation — OpenAPI (MEDIUM)
|
||||
|
||||
**Design-first workflow:**
|
||||
|
||||
```
|
||||
1. Write OpenAPI 3.1 spec
|
||||
2. Review spec with stakeholders
|
||||
3. Generate server stubs + client SDKs
|
||||
4. Implement handlers
|
||||
5. Validate responses against spec in CI
|
||||
```
|
||||
|
||||
Every endpoint documents: summary, all parameters, request body + examples, all response codes + schemas, auth requirements.
|
||||
|
||||
---
|
||||
|
||||
## 10. API Style Decision Tree
|
||||
|
||||
```
|
||||
What kind of API?
│
├─ Browser + mobile clients, flexible queries
│   └─ GraphQL
│       Rules: DataLoader (no N+1), depth limit ≤7, Relay pagination
│
├─ Standard CRUD, public consumers, caching important
│   └─ REST (this guide)
│       Rules: Resources, HTTP methods, status codes, OpenAPI
│
├─ Service-to-service, high throughput, strong typing
│   └─ gRPC
│       Rules: Protobuf schemas, streaming for large data, deadlines
│
├─ Full-stack TypeScript, same team owns client + server
│   └─ tRPC
│       Rules: Shared types, no code generation needed
│
└─ Real-time bidirectional
    └─ WebSocket / SSE
        Rules: Heartbeat, reconnection, message ordering
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns Checklist
|
||||
|
||||
| # | ❌ Don't | ✅ Do Instead |
|
||||
|---|---------|--------------|
|
||||
| 1 | Verbs in URLs (`/getUser`) | HTTP methods + noun resources |
|
||||
| 2 | Return 200 for errors | Correct 4xx/5xx status codes |
|
||||
| 3 | Mix naming styles | One convention per context |
|
||||
| 4 | Expose database IDs | UUIDs for public identifiers |
|
||||
| 5 | No pagination on lists | Always paginate (default 20) |
|
||||
| 6 | Swallow errors silently | Structured RFC 9457 errors |
|
||||
| 7 | Token in URL query | Authorization header |
|
||||
| 8 | Deep nesting (3+ levels) | Flatten with query params |
|
||||
| 9 | Break changes without version | Maintain compatibility or version |
|
||||
| 10 | No rate limiting | Implement + communicate via headers |
|
||||
| 11 | No request ID | `X-Request-Id` on every response |
|
||||
| 12 | Stack traces in production | Safe error message + internal log |
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "Should this be a new resource or a sub-resource?"
|
||||
|
||||
**Symptom:** URL path keeps growing (`/users/{id}/orders/{id}/items/{id}/reviews`)
|
||||
|
||||
**Rule:** If the child entity makes sense on its own, promote it. If it only exists within the parent context, keep it nested (max 2 levels).
|
||||
|
||||
```
|
||||
/reviews?orderId=123 ✅ (reviews exist independently)
|
||||
/orders/{id}/items ✅ (items belong to orders, 1 level)
|
||||
```
|
||||
|
||||
### Issue 2: "PUT or PATCH?"
|
||||
|
||||
**Symptom:** Team can't agree on update semantics.
|
||||
|
||||
**Rule:**
|
||||
- PUT = client sends **complete** resource (missing fields → set to default/null)
|
||||
- PATCH = client sends **only changed fields** (missing fields → unchanged)
|
||||
- When unsure → **PATCH** (safer, less surprising)
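The difference is easiest to see on a concrete resource. The sketch below models a resource as a plain dict; the `DEFAULTS` mapping and field names are hypothetical and stand in for whatever your schema defines.

```python
# Hypothetical defaults for a user resource; not taken from any real schema.
DEFAULTS = {"name": None, "nickname": None, "newsletter": False}


def apply_put(current: dict, body: dict) -> dict:
    """PUT: the body is the complete new state; missing fields fall back to defaults."""
    return {**DEFAULTS, **body, "id": current["id"]}  # the ID is never client-controlled


def apply_patch(current: dict, body: dict) -> dict:
    """PATCH: only fields present in the body change; an explicit null clears a field."""
    updated = dict(current)
    updated.update(body)  # absent keys stay untouched, None overwrites
    return updated


user = {"id": 1, "name": "Jane", "nickname": "JJ", "newsletter": True}
after_put = apply_put(user, {"name": "Jane"})
after_patch = apply_patch(user, {"nickname": None})
```

Here `after_put` has lost the nickname and the newsletter opt-in because the client omitted them, while `after_patch` changed only the nickname. That loss is the surprise PATCH avoids.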
|
||||
|
||||
### Issue 3: "400 or 422?"
|
||||
|
||||
**Symptom:** Inconsistent validation error codes.
|
||||
|
||||
**Rule:**
|
||||
- 400 = can't parse the request at all (malformed JSON, invalid syntax); an unsupported `Content-Type` is better reported as 415
|
||||
- 422 = parsed OK, but values fail validation (invalid email, negative quantity)
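A minimal sketch of that split, assuming a hypothetical "create order" payload with `email` and `quantity` fields; the function name and error shape are illustrative only.

```python
import json


def validate_order_request(raw_body: bytes) -> tuple[int, list[dict]]:
    """Return (status, errors): 400 if unparseable, 422 if values are invalid."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers JSONDecodeError and invalid UTF-8
        return 400, [{"message": "Body is not valid JSON"}]

    if not isinstance(payload, dict):
        return 422, [{"field": "(body)", "message": "Expected a JSON object"}]

    errors = []
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append({"field": "email", "message": "Must be a valid email address"})
    quantity = payload.get("quantity")
    if not isinstance(quantity, int) or isinstance(quantity, bool) or quantity <= 0:
        errors.append({"field": "quantity", "message": "Must be a positive integer"})

    # 200 here only means "validation passed"; the handler picks the real success code
    return (422, errors) if errors else (200, [])
```

The per-field entries map directly onto the `errors` array of the error envelope.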
|
||||
165
fullstack-dev/references/auth-flow.md
Normal file
@@ -0,0 +1,165 @@
|
||||
# Authentication Flow Patterns
|
||||
|
||||
Complete auth flow across frontend and backend. Covers JWT bearer flow, automatic token refresh, Next.js server-side auth, RBAC, and backend middleware order.
|
||||
|
||||
---
|
||||
|
||||
## JWT Bearer Flow (Most Common)
|
||||
|
||||
```
|
||||
1. Login
|
||||
Client → POST /api/auth/login { email, password }
|
||||
Server → { accessToken (15min), refreshToken (7d, httpOnly cookie) }
|
||||
|
||||
2. Authenticated Requests
|
||||
Client → GET /api/orders Authorization: Bearer <accessToken>
|
||||
Server → validates JWT → returns data
|
||||
|
||||
3. Token Refresh (transparent)
|
||||
Client → 401 received → POST /api/auth/refresh (cookie auto-sent)
|
||||
Server → new accessToken
|
||||
Client → retry original request with new token
|
||||
|
||||
4. Logout
|
||||
Client → POST /api/auth/logout
|
||||
Server → invalidate refresh token → clear cookie
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Frontend: Automatic Token Refresh
|
||||
|
||||
```typescript
|
||||
// lib/api-client.ts — add to existing fetch wrapper
|
||||
async function apiWithRefresh<T>(path: string, options: RequestInit = {}): Promise<T> {
|
||||
try {
|
||||
return await api<T>(path, options);
|
||||
} catch (err) {
|
||||
if (err instanceof ApiError && err.status === 401) {
|
||||
// Try refresh
|
||||
const refreshed = await api<{ accessToken: string }>('/api/auth/refresh', {
|
||||
method: 'POST',
|
||||
credentials: 'include', // send httpOnly cookie
|
||||
});
|
||||
setAuthToken(refreshed.accessToken);
|
||||
// Retry original request
|
||||
return api<T>(path, options);
|
||||
}
|
||||
throw err;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next.js: Server-Side Auth (App Router)
|
||||
|
||||
```typescript
|
||||
// middleware.ts — protect routes server-side
|
||||
import { NextResponse } from 'next/server';
|
||||
import type { NextRequest } from 'next/server';
|
||||
|
||||
export function middleware(request: NextRequest) {
|
||||
const token = request.cookies.get('session')?.value;
|
||||
if (!token && request.nextUrl.pathname.startsWith('/dashboard')) {
|
||||
return NextResponse.redirect(new URL('/login', request.url));
|
||||
}
|
||||
return NextResponse.next();
|
||||
}
|
||||
|
||||
// app/dashboard/page.tsx — server component with auth
|
||||
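import { cookies } from 'next/headers';
import { redirect } from 'next/navigation';

export default async function Dashboard() {
  const token = (await cookies()).get('session')?.value;
  if (!token) redirect('/login'); // never send "Bearer undefined"

  const res = await fetch(`${process.env.API_URL}/api/me`, {
    headers: { Authorization: `Bearer ${token}` },
    cache: 'no-store', // per-user data must not be cached across requests
  });
  if (!res.ok) redirect('/login'); // expired or rejected session
  const user = await res.json();

  return <DashboardContent user={user} />;
}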
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Backend: Standard Middleware Order
|
||||
|
||||
```
|
||||
Request → 1.RequestID → 2.Logging → 3.CORS → 4.RateLimit → 5.BodyParse
|
||||
→ 6.Auth → 7.Authz → 8.Validation → 9.Handler → 10.ErrorHandler → Response
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Backend: JWT Rules
|
||||
|
||||
```
|
||||
✅ Short expiry access token (15min) + refresh token (server-stored)
|
||||
✅ Minimal claims: userId, roles (not entire user object)
|
||||
✅ Rotate signing keys periodically
|
||||
|
||||
❌ Never store tokens in localStorage (XSS risk)
|
||||
❌ Never pass tokens in URL query params
|
||||
```
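To make the "short expiry, minimal claims" rules concrete, here is a standard-library sketch of issuing and verifying an HS256 token. It is for illustration only: production code would normally use a maintained JWT library, the function names are hypothetical, and the user ID is carried in the standard `sub` claim as an assumption.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_access_token(user_id: str, roles: list[str], secret: bytes,
                       ttl_seconds: int = 15 * 60, now: Optional[int] = None) -> str:
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    # Minimal claims only: subject, roles, issued-at, expiry
    claims = {"sub": user_id, "roles": roles, "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_access_token(token: str, secret: bytes, now: Optional[int] = None) -> dict:
    now = int(time.time()) if now is None else now
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(claims_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:  # wrong segment count, bad base64, bad JSON
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":  # never trust the token's own algorithm choice
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{claims_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        raise ValueError("bad signature")
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims
```

Key rotation would extend this with a `kid` header and a lookup of the matching secret, which is omitted here.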
|
||||
|
||||
---
|
||||
|
||||
## Backend: RBAC Pattern
|
||||
|
||||
```typescript
|
||||
function authorize(...roles: Role[]) {
|
||||
return (req, res, next) => {
|
||||
if (!req.user) throw new UnauthorizedError();
|
||||
if (!roles.some(r => req.user.roles.includes(r))) throw new ForbiddenError();
|
||||
next();
|
||||
};
|
||||
}
|
||||
router.delete('/users/:id', authenticate, authorize('admin'), deleteUser);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Auth Decision Table
|
||||
|
||||
| Method | When | Frontend |
|
||||
|--------|------|----------|
|
||||
| Session | Same domain, server-rendered pages | Server-rendered templates (e.g. Django) / htmx |
|
||||
| JWT | Different domain, SPA, mobile | React, Vue, mobile apps |
|
||||
| OAuth2 | Third-party login, API consumers | Any |
|
||||
|
||||
---
|
||||
|
||||
## Iron Rules
|
||||
|
||||
```
|
||||
✅ Access token: short-lived (15min), in memory
|
||||
✅ Refresh token: httpOnly cookie (XSS-safe)
|
||||
✅ Automatic transparent refresh on 401
|
||||
✅ Redirect to login when refresh fails
|
||||
|
||||
❌ Never store tokens in localStorage (XSS risk)
|
||||
❌ Never send tokens in URL query params (logged)
|
||||
❌ Never trust client-side auth checks alone (server must validate)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "Auth works on page load but breaks on navigation"
|
||||
|
||||
**Cause:** Token stored in component state (lost on unmount).
|
||||
|
||||
**Fix:** Keep the access token somewhere that outlives a single component:
|
||||
- React Context (survives navigation, lost on refresh)
|
||||
- Cookie (survives refresh)
|
||||
- React Query cache with `staleTime: Infinity` for session
|
||||
|
||||
### Issue 2: "CORS error with auth requests"
|
||||
|
||||
**Cause:** Missing `credentials: 'include'` on frontend or `credentials: true` on backend CORS config.
|
||||
|
||||
**Fix:**
|
||||
1. Frontend: `fetch(url, { credentials: 'include' })`
|
||||
2. Backend: `cors({ origin: 'https://your-frontend.com', credentials: true })`
|
||||
3. Backend: explicit origin (not `*`) when using credentials
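What the backend has to send for a credentialed request can be sketched without any framework. The allowlist and `cors_headers` function below are hypothetical, and most CORS middleware does the equivalent when configured with an explicit origin and credentials enabled.

```python
from typing import Optional

# Explicit allowlist; "*" is not valid together with credentials
ALLOWED_ORIGINS = {"https://your-frontend.com"}


def cors_headers(request_origin: Optional[str]) -> dict[str, str]:
    """Return CORS response headers for a credentialed request, or {} if not allowed."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}  # no CORS headers: the browser will block the response
    return {
        "Access-Control-Allow-Origin": request_origin,  # echo the exact origin
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",  # caches must not reuse this response for other origins
    }
```

Preflight (`OPTIONS`) responses need the same two `Access-Control-Allow-*` headers plus the allowed methods and headers, which this sketch leaves out.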
|
||||
706
fullstack-dev/references/db-schema.md
Normal file
@@ -0,0 +1,706 @@
---
name: fullstack-dev-db-schema
description: "Database schema design and migrations. Use when creating tables, defining ORM models, adding indexes, or designing relationships. Covers zero-downtime migrations and multi-tenancy."
license: MIT
metadata:
  version: "1.0.0"
  sources:
    - PostgreSQL official documentation
    - Use The Index, Luke (use-the-index-luke.com)
    - Designing Data-Intensive Applications (Martin Kleppmann)
    - Database Reliability Engineering (Laine Campbell & Charity Majors)
---
|
||||
|
||||
# Database Schema Design
|
||||
|
||||
ORM-agnostic guide for relational database schema design. Covers data modeling, normalization, indexing, migrations, multi-tenancy, and common application patterns. Primarily PostgreSQL-focused but principles apply to MySQL/MariaDB.
|
||||
|
||||
## Scope
|
||||
|
||||
**USE this skill when:**
|
||||
- Designing a schema for a new project or feature
|
||||
- Deciding between normalization and denormalization
|
||||
- Choosing which indexes to create
|
||||
- Planning a zero-downtime migration on a live database
|
||||
- Implementing multi-tenant data isolation
|
||||
- Adding audit trails, soft delete, or versioning
|
||||
- Diagnosing slow queries caused by schema problems
|
||||
|
||||
**NOT for:**
|
||||
- Choosing which database technology to use (→ `technology-selection`)
|
||||
- PostgreSQL-specific query tuning (use PostgreSQL performance docs)
|
||||
- ORM-specific configuration (→ `django-best-practices` or your ORM's docs)
|
||||
- Application-layer caching (→ `fullstack-dev-practices`)
|
||||
|
||||
## Context Required
|
||||
|
||||
| Required | Optional |
|
||||
|----------|----------|
|
||||
| Database engine (PostgreSQL / MySQL) | Expected data volume (rows, growth rate) |
|
||||
| Domain entities and relationships | Read/write ratio |
|
||||
| Key access patterns (queries) | Multi-tenant requirements |
|
||||
|
||||
---
|
||||
|
||||
## Quick Start Checklist
|
||||
|
||||
Designing a new schema:
|
||||
|
||||
- [ ] **Domain entities identified** — map 1 entity = 1 table (not 1 class = 1 table)
|
||||
- [ ] **Primary keys**: UUID for public IDs, serial/bigserial for internal-only
|
||||
- [ ] **Foreign keys** with explicit `ON DELETE` behavior
|
||||
- [ ] **NOT NULL** by default — nullable only when business logic requires it
|
||||
- [ ] **Timestamps**: `created_at` + `updated_at` on every table
|
||||
- [ ] **Indexes** created for columns used in WHERE, JOIN, and ORDER BY (low-cardinality columns only as part of a composite or partial index)
|
||||
- [ ] **No premature denormalization** — start normalized, denormalize when measured
|
||||
- [ ] **Naming convention** consistent: `snake_case`, plural table names
|
||||
|
||||
---
|
||||
|
||||
## Quick Navigation
|
||||
|
||||
| Need to… | Jump to |
|
||||
|----------|---------|
|
||||
| Model entities and relationships | [1. Data Modeling](#1-data-modeling-critical) |
|
||||
| Decide normalize vs denormalize | [2. Normalization](#2-normalization-vs-denormalization-critical) |
|
||||
| Choose the right index | [3. Indexing](#3-indexing-strategy-critical) |
|
||||
| Run migrations safely on live DB | [4. Migrations](#4-zero-downtime-migrations-high) |
|
||||
| Design multi-tenant schema | [5. Multi-Tenancy](#5-multi-tenant-design-high) |
|
||||
| Add soft delete / audit trails | [6. Common Patterns](#6-common-schema-patterns-medium) |
|
||||
| Partition large tables | [7. Partitioning](#7-table-partitioning-medium) |
|
||||
| See anti-patterns | [Anti-Patterns](#anti-patterns) |
|
||||
|
||||
---
|
||||
|
||||
## Core Principles (7 Rules)
|
||||
|
||||
```
|
||||
1. ✅ Start normalized (3NF) — denormalize only when you have measured evidence
|
||||
2. ✅ Every table has a primary key, created_at, updated_at
|
||||
3. ✅ UUID for public-facing IDs, serial for internal join keys
|
||||
4. ✅ NOT NULL by default — null is a business decision, not a lazy default
|
||||
5. ✅ Index every column used in WHERE, JOIN, ORDER BY
|
||||
6. ✅ Foreign keys enforced in database (not just application code)
|
||||
7. ✅ Migrations are additive — never drop/rename in production without a multi-step plan
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 1. Data Modeling (CRITICAL)
|
||||
|
||||
### Table Naming
|
||||
|
||||
```sql
|
||||
-- ✅ Plural, snake_case
|
||||
CREATE TABLE orders (...);
|
||||
CREATE TABLE order_items (...);
|
||||
CREATE TABLE user_profiles (...);
|
||||
|
||||
-- ❌ Singular, mixed case
|
||||
CREATE TABLE Order (...);
|
||||
CREATE TABLE OrderItem (...);
|
||||
CREATE TABLE tbl_usr_prof (...); -- cryptic abbreviation
|
||||
```
|
||||
|
||||
### Primary Keys
|
||||
|
||||
| Strategy | When | Pros | Cons |
|
||||
|----------|------|------|------|
|
||||
| `bigserial` (auto-increment) | Internal tables, FK joins | Compact, fast joins | Enumerable, not safe for public IDs |
|
||||
| `uuid` (v4 random) | Public-facing resources | Non-guessable, globally unique | Larger (16 bytes), random I/O on B-Tree |
|
||||
| `uuid` v7 (time-sorted) | Public + needs ordering | Non-guessable + insert-friendly | Newer, less ecosystem support |
|
||||
| `text` slug | URL-friendly resources | Human-readable | Must enforce uniqueness, updates expensive |
|
||||
|
||||
**Recommended default:**
|
||||
|
||||
```sql
|
||||
CREATE TABLE orders (
|
||||
id bigserial PRIMARY KEY, -- internal FK target
|
||||
public_id uuid NOT NULL DEFAULT gen_random_uuid() UNIQUE, -- API-facing
|
||||
-- ...
|
||||
created_at timestamptz NOT NULL DEFAULT now(),
|
||||
updated_at timestamptz NOT NULL DEFAULT now()
|
||||
);
|
||||
```
|
||||
|
||||
### Relationships
|
||||
|
||||
```sql
|
||||
-- One-to-Many: user → orders
|
||||
CREATE TABLE orders (
|
||||
id bigserial PRIMARY KEY,
|
||||
user_id bigint NOT NULL REFERENCES users(id) ON DELETE CASCADE,
|
||||
-- ...
|
||||
);
|
||||
CREATE INDEX idx_orders_user_id ON orders(user_id);
|
||||
|
||||
-- Many-to-Many: orders ↔ products (via junction table)
|
||||
CREATE TABLE order_items (
|
||||
id bigserial PRIMARY KEY,
|
||||
order_id bigint NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
|
||||
product_id bigint NOT NULL REFERENCES products(id) ON DELETE RESTRICT,
|
||||
quantity int NOT NULL CHECK (quantity > 0),
|
||||
unit_price numeric(10,2) NOT NULL,
|
||||
UNIQUE (order_id, product_id) -- prevent duplicate line items
|
||||
);
|
||||
|
||||
-- One-to-One: user → profile
|
||||
CREATE TABLE user_profiles (
|
||||
user_id bigint PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
|
||||
bio text,
|
||||
avatar_url text,
|
||||
-- ...
|
||||
);
|
||||
```
|
||||
|
||||
### ON DELETE Behavior
|
||||
|
||||
| Behavior | When | Example |
|
||||
|----------|------|---------|
|
||||
| `CASCADE` | Child meaningless without parent | order_items when order deleted |
|
||||
| `RESTRICT` | Prevent accidental deletion | products referenced by order_items |
|
||||
| `SET NULL` | Preserve child, clear reference | orders.assigned_to when employee leaves |
|
||||
| `SET DEFAULT` | Fallback to default value | Rare, for status columns |
|
||||
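The effect of `CASCADE` versus `RESTRICT` can be tried out locally. The sketch below uses Python's built-in SQLite driver purely so it runs without a server; this is an assumption of convenience, and although PostgreSQL syntax and behavior for these two actions are analogous, the table definitions here are simplified.

```python
import sqlite3


def demo() -> tuple[bool, int, bool]:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders   (id INTEGER PRIMARY KEY);
        CREATE TABLE order_items (
            id         INTEGER PRIMARY KEY,
            order_id   INTEGER NOT NULL REFERENCES orders(id)   ON DELETE CASCADE,
            product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE RESTRICT
        );
        INSERT INTO products VALUES (1, 'Widget');
        INSERT INTO orders VALUES (10);
        INSERT INTO order_items VALUES (100, 10, 1);
    """)

    def delete_product(product_id: int) -> bool:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("DELETE FROM products WHERE id = ?", (product_id,))
            return True
        except sqlite3.IntegrityError:
            return False

    blocked = not delete_product(1)  # RESTRICT: an order_items row still references it
    with conn:
        conn.execute("DELETE FROM orders WHERE id = 10")  # CASCADE removes its items
    remaining = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
    deleted_later = delete_product(1)  # nothing references the product any more
    conn.close()
    return blocked, remaining, deleted_later
```

Deleting the order silently removes its line items, while deleting a product that is still referenced fails loudly. That asymmetry is what the table above is describing.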
|
||||
---
|
||||
|
||||
## 2. Normalization vs Denormalization (CRITICAL)
|
||||
|
||||
### Start Normalized (3NF)
|
||||
|
||||
**Normal forms in practice:**
|
||||
|
||||
| Form | Rule | Example Violation |
|
||||
|------|------|-------------------|
|
||||
| 1NF | No repeating groups, atomic values | `tags = "go,python,rust"` in one column |
|
||||
| 2NF | No partial dependencies (composite keys) | `order_items.product_name` depends on `product_id` alone |
|
||||
| 3NF | No transitive dependencies | `orders.customer_city` depends on `customer_id`, not `order_id` |
|
||||
|
||||
**1NF violation fix:**
|
||||
```sql
|
||||
-- ❌ Tags as comma-separated string
|
||||
CREATE TABLE posts (id bigserial PRIMARY KEY, tags text); -- tags = "go,python"
|
||||
|
||||
-- ✅ Separate table (or array/JSONB if simple)
|
||||
CREATE TABLE post_tags (
|
||||
post_id bigint REFERENCES posts(id) ON DELETE CASCADE,
|
||||
tag_id bigint REFERENCES tags(id) ON DELETE CASCADE,
|
||||
PRIMARY KEY (post_id, tag_id)
|
||||
);
|
||||
|
||||
-- ✅ Alternative: PostgreSQL array (if tags are just strings, no metadata)
|
||||
CREATE TABLE posts (id bigserial PRIMARY KEY, tags text[] NOT NULL DEFAULT '{}');
|
||||
CREATE INDEX idx_posts_tags ON posts USING GIN(tags);
|
||||
```
|
||||
|
||||
### When to Denormalize
|
||||
|
||||
**Denormalize ONLY when:**
|
||||
1. You have **measured** a performance problem (EXPLAIN ANALYZE, not "I think it's slow")
|
||||
2. The denormalized data is **read-heavy** (read:write ratio > 100:1)
|
||||
3. You accept the **consistency maintenance cost** (triggers, application logic, or materialized views)
|
||||
|
||||
**Safe denormalization patterns:**
|
||||
|
||||
```sql
|
||||
-- Pattern 1: Materialized view (computed, refreshable)
|
||||
CREATE MATERIALIZED VIEW order_summary AS
|
||||
SELECT o.id, o.user_id, o.total,
|
||||
COUNT(oi.id) AS item_count,
|
||||
u.email AS user_email
|
||||
FROM orders o
|
||||
JOIN order_items oi ON oi.order_id = o.id
|
||||
JOIN users u ON u.id = o.user_id
|
||||
GROUP BY o.id, u.email;
|
||||
|
||||
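-- CONCURRENTLY requires a unique index on the materialized view
CREATE UNIQUE INDEX idx_order_summary_id ON order_summary(id);
REFRESH MATERIALIZED VIEW CONCURRENTLY order_summary; -- does not block reads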
|
||||
|
||||
-- Pattern 2: Cached aggregate column (application-maintained)
|
||||
ALTER TABLE orders ADD COLUMN item_count int NOT NULL DEFAULT 0;
|
||||
-- Update via trigger or application code on order_item insert/delete
|
||||
|
||||
-- Pattern 3: JSONB snapshot (freeze-at-write-time)
|
||||
-- Store a copy of the product details at the time of purchase
|
||||
CREATE TABLE order_items (
|
||||
id bigserial PRIMARY KEY,
|
||||
order_id bigint NOT NULL REFERENCES orders(id),
|
||||
product_id bigint REFERENCES products(id),
|
||||
quantity int NOT NULL,
|
||||
unit_price numeric(10,2) NOT NULL, -- frozen price
|
||||
product_snapshot jsonb NOT NULL -- frozen name, description, image
|
||||
);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Indexing Strategy (CRITICAL)
|
||||
|
||||
### Index Types (PostgreSQL)
|
||||
|
||||
| Type | When | Example |
|
||||
|------|------|---------|
|
||||
| **B-Tree** (default) | Equality, range, ORDER BY | `WHERE status = 'active'`, `WHERE created_at > '2025-01-01'` |
|
||||
| **Hash** | Equality only (rare, B-Tree usually better) | `WHERE id = 123` (large tables, Postgres 10+) |
|
||||
| **GIN** | Arrays, JSONB containment, full-text search | `WHERE tags @> '{go}'`, `WHERE data @> '{"key": "val"}'` |
|
||||
| **GiST** | Geometry, ranges, nearest-neighbor | PostGIS, tsrange, ltree |
|
||||
| **BRIN** | Very large tables with natural ordering | Time-series data sorted by timestamp |
|
||||
|
||||
### Index Decision Rules
|
||||
|
||||
```
|
||||
Rule 1: Index every column in WHERE clauses
|
||||
Rule 2: Index every column used in JOIN ON conditions
|
||||
Rule 3: Index every column in ORDER BY (if queried with LIMIT)
|
||||
Rule 4: Composite index for multi-column WHERE (leftmost prefix rule)
|
||||
Rule 5: Partial index when filtering a subset (e.g., only active records)
|
||||
Rule 6: Covering index (INCLUDE) to avoid table lookup
|
||||
Rule 7: DON'T index low-cardinality columns alone (e.g., boolean)
|
||||
```
|
||||
|
||||
### Composite Index: Column Order Matters
|
||||
|
||||
```sql
|
||||
-- Query: WHERE user_id = ? AND status = ? ORDER BY created_at DESC
|
||||
-- ✅ Optimal: matches query pattern left-to-right
|
||||
CREATE INDEX idx_orders_user_status_created
|
||||
ON orders(user_id, status, created_at DESC);
|
||||
|
||||
-- ❌ Wrong order: can't use for this query efficiently
|
||||
CREATE INDEX idx_orders_created_user_status
|
||||
ON orders(created_at DESC, user_id, status);
|
||||
```
|
||||
|
||||
**Leftmost prefix rule:** Index on `(A, B, C)` supports queries on `(A)`, `(A, B)`, `(A, B, C)` but NOT `(B)`, `(C)`, or `(B, C)`.
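
The rule can be sketched as a small helper. This is a simplified model that assumes equality filters only; `usable_prefix` and its parameters are hypothetical names for illustration, not a PostgreSQL API, and the real planner also weighs range conditions and table statistics.

```python
def usable_prefix(index_columns, filter_columns):
    """Return the leading index columns that equality filters can use.

    Simplified model: walk the index left to right and stop at the
    first column the query does not filter on.
    """
    prefix = []
    for column in index_columns:
        if column not in filter_columns:
            break
        prefix.append(column)
    return prefix


index = ("user_id", "status", "created_at")
print(usable_prefix(index, {"user_id", "status"}))     # the two leading columns
print(usable_prefix(index, {"status", "created_at"}))  # nothing usable: user_id is missing
```

A filter that skips the first index column gets an empty prefix, which is why the "wrong order" index above does not serve the query efficiently.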
|
||||
|
||||
### Partial Index (Index Only What Matters)
|
||||
|
||||
```sql
|
||||
-- Only 5% of orders are 'pending', but queried frequently
|
||||
CREATE INDEX idx_orders_pending
|
||||
ON orders(created_at DESC)
|
||||
WHERE status = 'pending';
|
||||
|
||||
-- Only active users matter for login
|
||||
CREATE INDEX idx_users_active_email
|
||||
ON users(email)
|
||||
WHERE is_active = true;
|
||||
```
|
||||
|
||||
### Covering Index (Avoid Table Lookup)
|
||||
|
||||
```sql
|
||||
-- Query only needs status and total, so there is no need to read the table row
|
||||
CREATE INDEX idx_orders_user_covering
|
||||
ON orders(user_id) INCLUDE (status, total);
|
||||
|
||||
-- Now this query is index-only:
|
||||
SELECT status, total FROM orders WHERE user_id = 123;
|
||||
```
|
||||
|
||||
### When NOT to Index
|
||||
|
||||
```
|
||||
❌ Columns rarely used in WHERE/JOIN/ORDER BY
|
||||
❌ Tables with < 1,000 rows (sequential scan is faster)
|
||||
❌ Columns with very low cardinality alone (e.g., boolean is_active)
|
||||
❌ Write-heavy tables where index maintenance cost > read benefit
|
||||
❌ Duplicate indexes (check pg_stat_user_indexes for unused indexes)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Zero-Downtime Migrations (HIGH)
|
||||
|
||||
### The Golden Rule
|
||||
|
||||
```
|
||||
NEVER make destructive changes in one step.
|
||||
Always: ADD → MIGRATE DATA → REMOVE OLD (in separate deploys).
|
||||
```
|
||||
|
||||
### Safe Migration Patterns
|
||||
|
||||
**Rename a column (3 deploys):**
|
||||
|
||||
```
|
||||
Deploy 1: Add new column
|
||||
ALTER TABLE users ADD COLUMN full_name text;
|
||||
UPDATE users SET full_name = name; -- backfill
|
||||
-- App writes to BOTH name and full_name
|
||||
|
||||
Deploy 2: Switch reads to new column
|
||||
-- App reads from full_name, still writes to both
|
||||
|
||||
Deploy 3: Drop old column
|
||||
ALTER TABLE users DROP COLUMN name;
|
||||
-- App only uses full_name
|
||||
```
|
||||
|
||||
**Add a NOT NULL column (2 deploys):**
|
||||
|
||||
```sql
|
||||
-- Deploy 1: Add nullable column, backfill
|
||||
ALTER TABLE orders ADD COLUMN currency text; -- nullable first
|
||||
UPDATE orders SET currency = 'USD' WHERE currency IS NULL; -- backfill
|
||||
|
||||
-- Deploy 2: Add constraint (after all rows backfilled)
|
||||
ALTER TABLE orders ALTER COLUMN currency SET NOT NULL;
|
||||
ALTER TABLE orders ALTER COLUMN currency SET DEFAULT 'USD';
|
||||
```
|
||||
|
||||
**Add an index without locking:**
|
||||
|
||||
```sql
|
||||
-- ✅ CONCURRENTLY: does not block writes, can run on a live DB (but not inside a transaction block)
|
||||
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
|
||||
|
||||
-- ❌ Without CONCURRENTLY: locks table for writes during build
|
||||
CREATE INDEX idx_orders_status ON orders(status);
|
||||
```
|
||||
|
||||
### Migration Safety Checklist
|
||||
|
||||
```
|
||||
✅ Migration runs in < 30 seconds on production data size
|
||||
✅ No exclusive table locks (use CONCURRENTLY for indexes)
|
||||
✅ Rollback plan documented and tested
|
||||
✅ Backfill runs in batches (not one giant UPDATE)
|
||||
✅ New column added as nullable first, constraint added later
|
||||
✅ Old column kept until all code references removed
|
||||
|
||||
❌ Never rename/drop columns in one deploy
|
||||
❌ Never ALTER TYPE on large tables without testing timing
|
||||
❌ Never run a data backfill in one giant transaction (long-held locks and table bloat on large tables)
|
||||
```
|
||||
|
||||
### Batch Backfill Template
|
||||
|
||||
```sql
|
||||
-- Backfill in batches of 10,000 (avoids long-running transactions)
|
||||
DO $$
|
||||
DECLARE
|
||||
batch_size int := 10000;
|
||||
affected int;
|
||||
BEGIN
|
||||
LOOP
|
||||
UPDATE orders
|
||||
SET currency = 'USD'
|
||||
WHERE id IN (
|
||||
SELECT id FROM orders WHERE currency IS NULL LIMIT batch_size
|
||||
);
|
||||
GET DIAGNOSTICS affected = ROW_COUNT;
|
||||
RAISE NOTICE 'Updated % rows', affected;
|
||||
EXIT WHEN affected = 0;
|
||||
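COMMIT; -- end this batch's transaction (PostgreSQL 11+; the DO block must not run inside an outer transaction)
PERFORM pg_sleep(0.1); -- brief pause to reduce load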
|
||||
END LOOP;
|
||||
END $$;
|
||||
```
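
The same batching idea can be driven from application code, committing after every batch so that no single transaction stays open for the whole backfill. A minimal sketch, assuming Python's built-in `sqlite3` as a stand-in for the real database driver; `backfill_currency` and the two-column table are illustrative, not part of the original schema.

```python
import sqlite3


def backfill_currency(conn, batch_size=10_000):
    """Set currency = 'USD' on NULL rows, one committed batch at a time."""
    total = 0
    while True:
        cursor = conn.execute(
            "UPDATE orders SET currency = 'USD' "
            "WHERE id IN (SELECT id FROM orders WHERE currency IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # each batch is its own short transaction
        if cursor.rowcount == 0:
            return total
        total += cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, currency TEXT)")
conn.executemany("INSERT INTO orders (currency) VALUES (?)", [(None,)] * 25)
conn.commit()
print(backfill_currency(conn, batch_size=10))  # total rows updated across all batches
```

Because the loop only touches rows that are still NULL, it can be stopped and restarted safely.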
|
||||
|
||||
---
|
||||
|
||||
## 5. Multi-Tenant Design (HIGH)
|
||||
|
||||
### Three Approaches
|
||||
|
||||
| Approach | Isolation | Complexity | When |
|
||||
|----------|-----------|------------|------|
|
||||
| **Row-level** (shared tables + `tenant_id`) | Low | Low | SaaS MVP, < 1,000 tenants |
|
||||
| **Schema-per-tenant** | Medium | Medium | Regulated industries, moderate scale |
|
||||
| **Database-per-tenant** | High | High | Enterprise, strict data isolation |
|
||||
|
||||
### Row-Level Tenancy (Most Common)
|
||||
|
||||
```sql
|
||||
-- Every table has tenant_id
|
||||
CREATE TABLE orders (
|
||||
id bigserial PRIMARY KEY,
|
||||
tenant_id bigint NOT NULL REFERENCES tenants(id),
|
||||
user_id bigint NOT NULL REFERENCES users(id),
|
||||
total numeric(10,2) NOT NULL,
|
||||
-- ...
|
||||
);
|
||||
|
||||
-- Composite index: tenant first (most queries filter by tenant)
|
||||
CREATE INDEX idx_orders_tenant_user ON orders(tenant_id, user_id);
|
||||
CREATE INDEX idx_orders_tenant_status ON orders(tenant_id, status);
|
||||
|
||||
-- Row-Level Security (PostgreSQL)
|
||||
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY tenant_isolation ON orders
|
||||
USING (tenant_id = current_setting('app.tenant_id')::bigint);
|
||||
```
|
||||
|
||||
**Application-level enforcement:**
|
||||
|
||||
```typescript
|
||||
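// Middleware: set tenant context on every request
// (in production, derive the tenant from the authenticated session or token instead of trusting a client-supplied header)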
|
||||
app.use((req, res, next) => {
|
||||
const tenantId = req.headers['x-tenant-id'];
|
||||
if (!tenantId) return res.status(400).json({ error: 'Missing tenant' });
|
||||
req.tenantId = tenantId;
|
||||
next();
|
||||
});
|
||||
|
||||
// Repository: ALWAYS filter by tenant
|
||||
async findOrders(tenantId: string, userId: string) {
|
||||
return db.order.findMany({
|
||||
where: { tenantId, userId }, // ← tenant_id in EVERY query
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
### Rules
|
||||
|
||||
```
|
||||
✅ tenant_id in EVERY table that holds tenant data
|
||||
✅ tenant_id as FIRST column in every composite index
|
||||
✅ Application middleware enforces tenant context
|
||||
✅ Use RLS (PostgreSQL) as defense-in-depth, not sole protection
|
||||
✅ Test with 2+ tenants to verify isolation
|
||||
|
||||
❌ Never allow cross-tenant queries in application code
|
||||
❌ Never skip tenant_id in WHERE clauses (even in admin tools)
|
||||
```
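
The "test with 2+ tenants" rule can be sketched as a tiny isolation check. This assumes Python's built-in `sqlite3` as a stand-in database; `OrderRepository` and its method names are hypothetical, and the table is reduced to the columns the check needs.

```python
import sqlite3


class OrderRepository:
    """Every query is scoped to the tenant given at construction time."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id

    def find_orders(self, user_id):
        rows = self._conn.execute(
            "SELECT id, total FROM orders "
            "WHERE tenant_id = ? AND user_id = ? ORDER BY id",
            (self._tenant_id, user_id),
        )
        return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, "
    "user_id INTEGER NOT NULL, total REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (tenant_id, user_id, total) VALUES (?, ?, ?)",
    [(1, 7, 10.0), (2, 7, 99.0)],
)

# The same user_id exists in both tenants; each repository sees only its own row.
assert OrderRepository(conn, 1).find_orders(7) == [(1, 10.0)]
assert OrderRepository(conn, 2).find_orders(7) == [(2, 99.0)]
```

Binding the tenant once in the constructor means an individual query method cannot forget the filter.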
|
||||
|
||||
---
|
||||
|
||||
## 6. Common Schema Patterns (MEDIUM)
|
||||
|
||||
### Soft Delete
|
||||
|
||||
```sql
|
||||
ALTER TABLE orders ADD COLUMN deleted_at timestamptz;
|
||||
|
||||
-- All queries filter deleted records
|
||||
CREATE VIEW active_orders AS
|
||||
SELECT * FROM orders WHERE deleted_at IS NULL;
|
||||
|
||||
-- Partial index: only index non-deleted rows
|
||||
CREATE INDEX idx_orders_active_status
|
||||
ON orders(status, created_at DESC)
|
||||
WHERE deleted_at IS NULL;
|
||||
```
|
||||
|
||||
**ORM integration:**
|
||||
|
||||
```typescript
|
||||
// Prisma middleware: auto-filter soft-deleted records
// ($use is deprecated in newer Prisma releases in favor of client extensions; check your version)
|
||||
prisma.$use(async (params, next) => {
|
||||
if (params.action === 'findMany' || params.action === 'findFirst') {
|
||||
params.args.where = { ...params.args.where, deletedAt: null };
|
||||
}
|
||||
return next(params);
|
||||
});
|
||||
```
|
||||
|
||||
### Audit Trail
|
||||
|
||||
```sql
|
||||
-- Option A: Audit columns on every table
|
||||
ALTER TABLE orders ADD COLUMN created_by bigint REFERENCES users(id);
|
||||
ALTER TABLE orders ADD COLUMN updated_by bigint REFERENCES users(id);
|
||||
|
||||
-- Option B: Separate audit log table (more detail)
|
||||
CREATE TABLE audit_log (
|
||||
id bigserial PRIMARY KEY,
|
||||
table_name text NOT NULL,
|
||||
record_id bigint NOT NULL,
|
||||
action text NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
|
||||
old_data jsonb,
|
||||
new_data jsonb,
|
||||
changed_by bigint REFERENCES users(id),
|
||||
changed_at timestamptz NOT NULL DEFAULT now()
|
||||
);
|
||||
CREATE INDEX idx_audit_table_record ON audit_log(table_name, record_id);
|
||||
CREATE INDEX idx_audit_changed_at ON audit_log(changed_at DESC);
|
||||
```
|
||||
|
||||
### Enum Columns
|
||||
|
||||
```sql
|
||||
-- Option A: PostgreSQL enum type (strict, but ALTER TYPE is painful)
|
||||
CREATE TYPE order_status AS ENUM ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled');
|
||||
ALTER TABLE orders ADD COLUMN status order_status NOT NULL DEFAULT 'pending';
|
||||
|
||||
-- Option B: Text + CHECK constraint (easier to migrate)
|
||||
ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'pending'
|
||||
CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled'));
|
||||
|
||||
-- Option C: Lookup table (most flexible, best for UI-driven lists)
|
||||
CREATE TABLE order_statuses (
|
||||
id serial PRIMARY KEY,
|
||||
name text UNIQUE NOT NULL,
|
||||
label text NOT NULL -- display name
|
||||
);
|
||||
```
|
||||
|
||||
**Recommendation:** Option B (text + CHECK) for most cases. Option C if statuses are managed by non-developers.
|
||||
|
||||
### Polymorphic Associations
|
||||
|
||||
```sql
|
||||
-- ❌ Anti-pattern: polymorphic FK (no referential integrity)
|
||||
CREATE TABLE comments (
|
||||
id bigserial PRIMARY KEY,
|
||||
commentable_type text, -- 'Post' or 'Photo'
|
||||
commentable_id bigint, -- no FK constraint possible!
|
||||
body text
|
||||
);
|
||||
|
||||
-- ✅ Pattern A: Separate FK columns (nullable)
|
||||
CREATE TABLE comments (
|
||||
id bigserial PRIMARY KEY,
|
||||
post_id bigint REFERENCES posts(id) ON DELETE CASCADE,
|
||||
photo_id bigint REFERENCES photos(id) ON DELETE CASCADE,
|
||||
body text NOT NULL,
|
||||
CHECK (
|
||||
(post_id IS NOT NULL AND photo_id IS NULL) OR
|
||||
(post_id IS NULL AND photo_id IS NOT NULL)
|
||||
)
|
||||
);
|
||||
|
||||
-- ✅ Pattern B: Separate tables (cleanest, best for different schemas)
|
||||
CREATE TABLE post_comments (..., post_id bigint REFERENCES posts(id));
|
||||
CREATE TABLE photo_comments (..., photo_id bigint REFERENCES photos(id));
|
||||
```
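
Pattern A's CHECK constraint can be exercised quickly to confirm it accepts exactly one parent. A minimal sketch, assuming Python's built-in `sqlite3` as a stand-in database; the foreign keys are omitted for brevity and `try_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER,
        photo_id INTEGER,
        body TEXT NOT NULL,
        CHECK ((post_id IS NOT NULL AND photo_id IS NULL) OR
               (post_id IS NULL AND photo_id IS NOT NULL))
    )
""")


def try_insert(post_id, photo_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO comments (post_id, photo_id, body) VALUES (?, ?, 'hi')",
            (post_id, photo_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(1, None))     # one parent: accepted
print(try_insert(1, 2))        # two parents: rejected
print(try_insert(None, None))  # no parent: rejected
```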
|
||||
|
||||
### JSONB Columns (Semi-Structured Data)
|
||||
|
||||
```sql
|
||||
-- Good uses: metadata, settings, flexible attributes
|
||||
CREATE TABLE products (
|
||||
id bigserial PRIMARY KEY,
|
||||
name text NOT NULL,
|
||||
price numeric(10,2) NOT NULL,
|
||||
attributes jsonb NOT NULL DEFAULT '{}' -- color, size, weight...
|
||||
);
|
||||
|
||||
-- Index for JSONB queries
|
||||
CREATE INDEX idx_products_attrs ON products USING GIN(attributes);
|
||||
|
||||
-- Query
|
||||
SELECT * FROM products WHERE attributes->>'color' = 'red'; -- not served by the GIN index; needs an expression index on (attributes->>'color')
|
||||
SELECT * FROM products WHERE attributes @> '{"size": "XL"}';
|
||||
```
|
||||
|
||||
```
|
||||
✅ Use JSONB for truly flexible/optional data (metadata, settings, preferences)
|
||||
✅ Index JSONB columns with GIN when queried
|
||||
|
||||
❌ Never use JSONB for data that should be columns (email, status, price)
|
||||
❌ Never use JSONB to avoid schema design (it's not MongoDB-in-Postgres)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Table Partitioning (MEDIUM)
|
||||
|
||||
### When to Partition
|
||||
|
||||
```
|
||||
✅ Table > 100M rows AND growing
|
||||
✅ Most queries filter on the partition key (date range, tenant)
|
||||
✅ Old data can be dropped/archived by partition (efficient DELETE)
|
||||
|
||||
❌ Table < 10M rows (overhead not worth it)
|
||||
❌ Queries don't filter on partition key (scans all partitions)
|
||||
```
|
||||
|
||||
### Range Partitioning (Time-Series)
|
||||
|
||||
```sql
|
||||
CREATE TABLE events (
|
||||
id bigserial,
|
||||
tenant_id bigint NOT NULL,
|
||||
event_type text NOT NULL,
|
||||
payload jsonb,
|
||||
created_at timestamptz NOT NULL DEFAULT now()
|
||||
) PARTITION BY RANGE (created_at);
|
||||
|
||||
-- Monthly partitions
|
||||
CREATE TABLE events_2025_01 PARTITION OF events
|
||||
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
|
||||
CREATE TABLE events_2025_02 PARTITION OF events
|
||||
FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
|
||||
|
||||
-- Automate partition creation with pg_partman or cron
|
||||
```
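
If partitions are created from a scheduled job rather than pg_partman, the DDL can be generated from a date. A minimal sketch; `monthly_partition_ddl` is a hypothetical helper, and it interpolates the table name directly, so it assumes a trusted, hard-coded name.

```python
from datetime import date


def monthly_partition_ddl(table, year, month):
    """Build the CREATE TABLE statement for one monthly range partition."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{table}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {table}\n"
        f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


print(monthly_partition_ddl("events", 2025, 3))
```

Running the job a month or more ahead of time keeps inserts from failing because the target partition does not exist yet.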
|
||||
|
||||
### List Partitioning (Multi-Tenant)
|
||||
|
||||
```sql
|
||||
CREATE TABLE orders (
|
||||
id bigserial,
|
||||
tenant_id bigint NOT NULL,
|
||||
total numeric(10,2)
|
||||
) PARTITION BY LIST (tenant_id);
|
||||
|
||||
CREATE TABLE orders_tenant_1 PARTITION OF orders FOR VALUES IN (1);
|
||||
CREATE TABLE orders_tenant_2 PARTITION OF orders FOR VALUES IN (2);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
| # | ❌ Don't | ✅ Do Instead |
|
||||
|---|---------|--------------|
|
||||
| 1 | Premature denormalization | Start 3NF, denormalize when measured |
|
||||
| 2 | Auto-increment IDs as public API identifiers | UUID for public, serial for internal |
|
||||
| 3 | No foreign key constraints | FK enforced in database, always |
|
||||
| 4 | Nullable by default | NOT NULL by default, nullable when required |
|
||||
| 5 | No indexes on FK columns | Index every FK column |
|
||||
| 6 | Single-step destructive migration | ADD → MIGRATE → REMOVE in separate deploys |
|
||||
| 7 | `CREATE INDEX` without `CONCURRENTLY` | Always `CONCURRENTLY` on live tables |
|
||||
| 8 | Polymorphic FK (`commentable_type + commentable_id`) | Separate FK columns or separate tables |
|
||||
| 9 | JSONB for everything | JSONB for flexible data only, columns for structured |
|
||||
| 10 | No `created_at` / `updated_at` | Timestamp pair on every table |
|
||||
| 11 | Comma-separated values in one column | Separate table or PostgreSQL array |
|
||||
| 12 | `text` without length validation | CHECK constraint or application validation |
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "Query is slow but I already have an index"
|
||||
|
||||
**Symptom:** `EXPLAIN ANALYZE` shows Sequential Scan despite existing index.
|
||||
|
||||
**Causes:**
|
||||
1. **Wrong index column order** — composite index `(A, B)` won't help `WHERE B = ?`
|
||||
2. **Low selectivity** — index on boolean column (50% of rows match), planner prefers seq scan
|
||||
3. **Stale statistics** — run `ANALYZE table_name;`
|
||||
4. **Type mismatch** — comparing `varchar` column with `integer` parameter → no index use
|
||||
|
||||
**Fix:** Check `EXPLAIN (ANALYZE, BUFFERS)`, verify index matches query pattern, run `ANALYZE`.
|
||||
|
||||
### Issue 2: "Migration locks the table for minutes"
|
||||
|
||||
**Symptom:** `ALTER TABLE` blocks all writes during execution.
|
||||
|
||||
**Cause:** Adding NOT NULL constraint, changing column type, or creating index without `CONCURRENTLY`.
|
||||
|
||||
**Fix:**
|
||||
```sql
|
||||
-- Add index without lock
|
||||
CREATE INDEX CONCURRENTLY idx_name ON table(col);
|
||||
|
||||
-- Add NOT NULL constraint without lock (Postgres 12+)
|
||||
ALTER TABLE t ADD CONSTRAINT t_col_nn CHECK (col IS NOT NULL) NOT VALID;
|
||||
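ALTER TABLE t VALIDATE CONSTRAINT t_col_nn; -- non-blocking validation
ALTER TABLE t ALTER COLUMN col SET NOT NULL; -- Postgres 12+: reuses the validated CHECK, skips the full-table scan
ALTER TABLE t DROP CONSTRAINT t_col_nn; -- the CHECK is now redundant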
|
||||
```
|
||||
|
||||
### Issue 3: "How many indexes is too many?"
|
||||
|
||||
**Rule of thumb:**
|
||||
- Read-heavy table (reports, product catalog): 5-10 indexes is fine
|
||||
- Write-heavy table (events, logs): 2-3 indexes max
|
||||
- Monitor with `pg_stat_user_indexes` — drop indexes with `idx_scan = 0`
|
||||
|
||||
```sql
|
||||
-- Find unused indexes
|
||||
SELECT schemaname, relname, indexrelname, idx_scan
|
||||
FROM pg_stat_user_indexes
|
||||
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%'
|
||||
ORDER BY pg_relation_size(indexrelid) DESC;
|
||||
```
|
||||
466
fullstack-dev/references/django-best-practices.md
Normal file
@@ -0,0 +1,466 @@
|
||||
# Django Best Practices
|
||||
|
||||
Production-grade guide for Django 5.x and Django REST Framework. 40+ rules across 8 categories.
|
||||
|
||||
## Core Principles (7 Rules)
|
||||
|
||||
```
|
||||
1. ✅ Custom User model BEFORE first migration (can't change later)
|
||||
2. ✅ One Django app per domain concept (users, orders, payments)
|
||||
3. ✅ Fat models, thin views — business logic in models/managers, not views
|
||||
4. ✅ Always use select_related/prefetch_related (prevent N+1)
|
||||
5. ✅ Settings split by environment (base + dev + prod)
|
||||
6. ✅ Test with pytest-django + factory_boy (not fixtures)
|
||||
7. ✅ Never use runserver in production (Gunicorn + Nginx)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 1. Project Structure (CRITICAL)
|
||||
|
||||
### App-Per-Domain
|
||||
|
||||
```
|
||||
myproject/
|
||||
├── config/ # Project config
|
||||
│ ├── __init__.py
|
||||
│ ├── settings/
|
||||
│ │ ├── base.py # Shared settings
|
||||
│ │ ├── dev.py # DEBUG=True, SQLite ok
|
||||
│ │ └── prod.py # DEBUG=False, Postgres, HTTPS
|
||||
│ ├── urls.py
|
||||
│ ├── wsgi.py
|
||||
│ └── asgi.py
|
||||
├── apps/
|
||||
│ ├── users/ # Custom User model
|
||||
│ │ ├── models.py
|
||||
│ │ ├── serializers.py
|
||||
│ │ ├── views.py
|
||||
│ │ ├── urls.py
|
||||
│ │ ├── admin.py
|
||||
│ │ ├── services.py # Business logic
|
||||
│ │ ├── selectors.py # Complex queries
|
||||
│ │ └── tests/
|
||||
│ │ ├── test_models.py
|
||||
│ │ ├── test_views.py
|
||||
│ │ └── factories.py
|
||||
│ ├── orders/
|
||||
│ └── payments/
|
||||
├── manage.py
|
||||
├── requirements/
|
||||
│ ├── base.txt
|
||||
│ ├── dev.txt
|
||||
│ └── prod.txt
|
||||
└── docker-compose.yml
|
||||
```
|
||||
|
||||
### Rules
|
||||
|
||||
```
|
||||
✅ One app = one bounded context (users, orders, payments)
|
||||
✅ Business logic in services.py / selectors.py, not views
|
||||
✅ Each app has its own urls.py, admin.py, tests/
|
||||
|
||||
❌ Never put everything in one app
|
||||
❌ Never import across app boundaries at the model level (use IDs)
|
||||
❌ Never put business logic in views or serializers
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Models & Migrations (CRITICAL)
|
||||
|
||||
### Custom User Model (Day 1!)
|
||||
|
||||
```python
|
||||
# apps/users/models.py
|
||||
from django.contrib.auth.models import AbstractUser
|
||||
from django.db import models
|
||||
import uuid
|
||||
|
||||
class User(AbstractUser):
|
||||
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
|
||||
email = models.EmailField(unique=True)
|
||||
|
||||
USERNAME_FIELD = 'email'
|
||||
REQUIRED_FIELDS = ['username']
|
||||
|
||||
class Meta:
|
||||
db_table = 'users'
|
||||
|
||||
# config/settings/base.py
|
||||
AUTH_USER_MODEL = 'users.User'
|
||||
```
|
||||
|
||||
**This MUST be done before `migrate`. Cannot change after.**
|
||||
|
||||
### Model Best Practices
|
||||
|
||||
```python
|
||||
import uuid

from django.conf import settings
from django.db import models

# OrderStatus is assumed to be a models.TextChoices enum defined elsewhere in this app.


class TimeStampedModel(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        abstract = True


class Order(TimeStampedModel):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, related_name='orders')
    status = models.CharField(max_length=20, choices=OrderStatus.choices, default=OrderStatus.PENDING, db_index=True)
    total = models.DecimalField(max_digits=10, decimal_places=2)

    class Meta:
        db_table = 'orders'
        ordering = ['-created_at']
        indexes = [
            models.Index(fields=['user', 'status']),
        ]

    def can_cancel(self) -> bool:
        return self.status in [OrderStatus.PENDING, OrderStatus.CONFIRMED]

    def cancel(self):
        if not self.can_cancel():
            raise ValueError(f"Cannot cancel order in {self.status} status")
        self.status = OrderStatus.CANCELLED
        # Include updated_at so auto_now is persisted when update_fields is used
        self.save(update_fields=['status', 'updated_at'])
|
||||
```
|
||||
|
||||
### Migration Rules
|
||||
|
||||
```
|
||||
✅ Review migration SQL: python manage.py sqlmigrate app_name 0001
|
||||
✅ Name migrations descriptively: --name add_status_index_to_orders
|
||||
✅ Separate data migrations from schema migrations
|
||||
✅ Non-destructive first: add column → backfill → remove old column
|
||||
|
||||
❌ Never edit or delete applied migrations
|
||||
❌ Never use RunPython without reverse function
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Views & Serializers — DRF (HIGH)
|
||||
|
||||
### Service Layer Pattern
|
||||
|
||||
```python
|
||||
# apps/orders/services.py
from django.db import transaction

from .models import Order, OrderItem  # assumes both models live in apps/orders/models.py


class OrderService:
    @staticmethod
    @transaction.atomic
    def create_order(user, items_data: list[dict]) -> Order:
        total = sum(item['price'] * item['quantity'] for item in items_data)
        order = Order.objects.create(user=user, total=total)
        OrderItem.objects.bulk_create([
            OrderItem(order=order, **item) for item in items_data
        ])
        return order

    @staticmethod
    @transaction.atomic  # select_for_update() raises an error outside a transaction
    def cancel_order(order_id: str, user) -> Order:
        order = Order.objects.select_for_update().get(id=order_id, user=user)
        order.cancel()
        return order
|
||||
```
|
||||
|
||||
### Serializers
|
||||
|
||||
```python
|
||||
class OrderSerializer(serializers.ModelSerializer):
|
||||
items = OrderItemSerializer(many=True, read_only=True)
|
||||
class Meta:
|
||||
model = Order
|
||||
fields = ['id', 'status', 'total', 'items', 'created_at']
|
||||
read_only_fields = ['id', 'total', 'created_at']
|
||||
|
||||
class CreateOrderSerializer(serializers.Serializer):
|
||||
"""Input-only serializer — separate from output."""
|
||||
items = serializers.ListField(
|
||||
child=serializers.DictField(), min_length=1, max_length=50,
|
||||
)
|
||||
def validate_items(self, items):
|
||||
for item in items:
|
||||
if item.get('quantity', 0) < 1:
|
||||
raise serializers.ValidationError("Quantity must be at least 1")
|
||||
return items
|
||||
```
|
||||
|
||||
### Views (Thin!)
|
||||
|
||||
```python
|
||||
@api_view(['POST'])
|
||||
@permission_classes([IsAuthenticated])
|
||||
def create_order(request):
|
||||
serializer = CreateOrderSerializer(data=request.data)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
order = OrderService.create_order(request.user, serializer.validated_data['items'])
|
||||
return Response({'data': OrderSerializer(order).data}, status=status.HTTP_201_CREATED)
|
||||
```
|
||||
|
||||
### Rules
|
||||
|
||||
```
|
||||
✅ Separate input serializers from output serializers
|
||||
✅ Views only: validate → call service → serialize → respond
|
||||
✅ Use @transaction.atomic for multi-model writes
|
||||
|
||||
❌ Never put business logic in views or serializers
|
||||
❌ Never use ModelSerializer for write operations (too implicit)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Authentication (HIGH)
|
||||
|
||||
| Method | When | Frontend |
|
||||
|--------|------|----------|
|
||||
| Session | Same-domain, SSR, Django templates | Django templates / htmx |
|
||||
| JWT | Different domain, SPA, mobile | React, Vue, mobile apps |
|
||||
| OAuth2 | Third-party login, API consumers | Any |
|
||||
|
||||
### JWT Config (djangorestframework-simplejwt)
|
||||
|
||||
```python
|
||||
from datetime import timedelta

SIMPLE_JWT = {
    'ACCESS_TOKEN_LIFETIME': timedelta(minutes=15),
    'REFRESH_TOKEN_LIFETIME': timedelta(days=7),
    'ROTATE_REFRESH_TOKENS': True,
    # Requires 'rest_framework_simplejwt.token_blacklist' in INSTALLED_APPS
    'BLACKLIST_AFTER_ROTATION': True,
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Performance Optimization (HIGH)
|
||||
|
||||
### N+1 Query Prevention
|
||||
|
||||
```python
|
||||
# ❌ N+1: 1 query for orders + N queries for users
|
||||
orders = Order.objects.all()
|
||||
for o in orders:
|
||||
print(o.user.email) # hits DB each iteration
|
||||
|
||||
# ✅ select_related (FK/OneToOne — JOIN)
|
||||
orders = Order.objects.select_related('user').all()
|
||||
|
||||
# ✅ prefetch_related (ManyToMany/reverse FK — 2 queries)
|
||||
orders = Order.objects.prefetch_related('items').all()
|
||||
|
||||
# ✅ Combined
|
||||
orders = Order.objects.select_related('user').prefetch_related('items').all()
|
||||
```
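
The difference in query count can be observed without Django. A minimal sketch, assuming Python's built-in `sqlite3` and a two-table stand-in schema; `emails_n_plus_one` and `emails_joined` are hypothetical helpers that mimic what the ORM does without and with `select_related`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL REFERENCES users(id));
    INSERT INTO users (id, email) VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO orders (user_id) VALUES (1), (2), (1);
""")

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement from here on


def emails_n_plus_one():
    """One query for the orders, then one more query per order."""
    orders = conn.execute("SELECT id, user_id FROM orders ORDER BY id").fetchall()
    return [
        conn.execute("SELECT email FROM users WHERE id = ?", (user_id,)).fetchone()[0]
        for _, user_id in orders
    ]


def emails_joined():
    """A single JOIN, which is roughly what select_related generates."""
    rows = conn.execute(
        "SELECT u.email FROM orders o JOIN users u ON u.id = o.user_id ORDER BY o.id"
    )
    return [email for (email,) in rows]


emails_n_plus_one()
print(len(statements), "statements for the loop version")
statements.clear()
emails_joined()
print(len(statements), "statement(s) for the JOIN version")
```

The loop version grows by one statement per order, while the JOIN version stays constant as the table grows.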
|
||||
|
||||
### Query Optimization Toolkit
|
||||
|
||||
```python
|
||||
# Only fetch needed columns
|
||||
User.objects.values('id', 'email')
|
||||
User.objects.values_list('email', flat=True)
|
||||
|
||||
# Annotate instead of Python loops
|
||||
from django.db.models import Count, Sum
|
||||
Order.objects.annotate(item_count=Count('items'), revenue=Sum('items__price'))
|
||||
|
||||
# Bulk operations
|
||||
OrderItem.objects.bulk_create([...])
|
||||
Order.objects.filter(status='pending').update(status='cancelled')
|
||||
|
||||
# Database indexes
|
||||
class Meta:
|
||||
indexes = [
|
||||
models.Index(fields=['user', 'status']),
|
||||
models.Index(fields=['-created_at']),
|
||||
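models.Index(fields=['email'], condition=Q(is_active=True), name='users_active_email_idx'), # partial indexes need an explicit name; Q comes from django.db.models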
|
||||
]
|
||||
|
||||
# Pagination
|
||||
from rest_framework.pagination import CursorPagination
|
||||
class OrderPagination(CursorPagination):
|
||||
page_size = 20
|
||||
ordering = '-created_at'
|
||||
```
|
||||
|
||||
### Caching
|
||||
|
||||
```python
|
||||
from django.core.cache import cache
|
||||
|
||||
def get_product(product_id: str):
|
||||
cache_key = f'product:{product_id}'
|
||||
product = cache.get(cache_key)
|
||||
if product is None:
|
||||
product = Product.objects.get(id=product_id)
|
||||
cache.set(cache_key, product, timeout=300)
|
||||
return product
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Testing (MEDIUM-HIGH)
|
||||
|
||||
### pytest-django + factory_boy
|
||||
|
||||
```python
|
||||
# conftest.py
|
||||
@pytest.fixture
|
||||
def api_client():
|
||||
return APIClient()
|
||||
|
||||
@pytest.fixture
|
||||
# user_factory is assumed to be a fixture registered via pytest-factoryboy: register(UserFactory)
def authenticated_client(api_client, user_factory):
|
||||
user = user_factory()
|
||||
api_client.force_authenticate(user=user)
|
||||
return api_client
|
||||
```
|
||||
|
||||
```python
|
||||
# factories.py
|
||||
class UserFactory(factory.django.DjangoModelFactory):
|
||||
class Meta:
|
||||
model = User
|
||||
email = factory.Sequence(lambda n: f'user{n}@example.com')
|
||||
username = factory.Sequence(lambda n: f'user{n}')
|
||||
|
||||
class OrderFactory(factory.django.DjangoModelFactory):
|
||||
class Meta:
|
||||
model = 'orders.Order'
|
||||
user = factory.SubFactory(UserFactory)
|
||||
total = factory.Faker('pydecimal', left_digits=3, right_digits=2, positive=True)
|
||||
```
|
||||
|
||||
```python
|
||||
# test_views.py
|
||||
@pytest.mark.django_db
|
||||
class TestListOrders:
|
||||
def test_returns_user_orders(self, authenticated_client):
|
||||
OrderFactory.create_batch(3, user=authenticated_client.handler._force_user)
|
||||
response = authenticated_client.get('/api/orders/')
|
||||
assert response.status_code == 200
|
||||
assert len(response.data['data']) == 3
|
||||
|
||||
def test_requires_authentication(self, api_client):
|
||||
response = api_client.get('/api/orders/')
|
||||
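assert response.status_code == 401 # DRF returns 403 instead if the first authentication class sends no WWW-Authenticate header (e.g. SessionAuthentication)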
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Admin Customization (MEDIUM)
|
||||
|
||||
```python
|
||||
class OrderItemInline(admin.TabularInline):
|
||||
model = OrderItem
|
||||
extra = 0
|
||||
readonly_fields = ['price']
|
||||
|
||||
@admin.register(Order)
|
||||
class OrderAdmin(admin.ModelAdmin):
|
||||
list_display = ['id', 'user', 'status', 'total', 'created_at']
|
||||
list_filter = ['status', 'created_at']
|
||||
search_fields = ['user__email', 'id']
|
||||
readonly_fields = ['id', 'created_at', 'updated_at']
|
||||
inlines = [OrderItemInline]
|
||||
date_hierarchy = 'created_at'
|
||||
|
||||
def get_queryset(self, request):
|
||||
return super().get_queryset(request).select_related('user')
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Production Deployment (MEDIUM)
|
||||
|
||||
### Security Settings
|
||||
|
||||
```python
|
||||
# settings/prod.py
|
||||
DEBUG = False
|
||||
ALLOWED_HOSTS = ['example.com', 'www.example.com']
|
||||
CSRF_TRUSTED_ORIGINS = ['https://example.com']
|
||||
SECURE_SSL_REDIRECT = True
|
||||
SESSION_COOKIE_SECURE = True
|
||||
CSRF_COOKIE_SECURE = True
|
||||
SECURE_HSTS_SECONDS = 31536000
|
||||
```
|
||||
|
||||
### Deployment Stack
|
||||
|
||||
```
|
||||
Nginx → Gunicorn → Django
|
||||
↕
|
||||
PostgreSQL + Redis (cache)
|
||||
↕
|
||||
Celery (background tasks)
|
||||
```
|
||||
|
||||
```bash
|
||||
gunicorn config.wsgi:application \
|
||||
--bind 0.0.0.0:8000 \
|
||||
--workers 4 \
|
||||
--timeout 120 \
|
||||
--access-logfile -
|
||||
```
|
||||
|
||||
### WhiteNoise for Static Files
|
||||
|
||||
```python
|
||||
MIDDLEWARE = [
|
||||
'django.middleware.security.SecurityMiddleware',
|
||||
'whitenoise.middleware.WhiteNoiseMiddleware', # right after Security
|
||||
...
|
||||
]
|
||||
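# STATICFILES_STORAGE was deprecated in Django 4.2 and removed in 5.1; use STORAGES instead
STORAGES = {
    'default': {'BACKEND': 'django.core.files.storage.FileSystemStorage'},
    'staticfiles': {'BACKEND': 'whitenoise.storage.CompressedManifestStaticFilesStorage'},
}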
|
||||
```
|
||||
|
||||
### Rules
|
||||
|
||||
```
|
||||
✅ Gunicorn + Nginx (or Cloud Run / Railway)
|
||||
✅ PostgreSQL (not SQLite)
|
||||
✅ python manage.py check --deploy
|
||||
✅ Sentry for error tracking
|
||||
|
||||
❌ Never use runserver in production
|
||||
❌ Never use DEBUG=True in production
|
||||
❌ Never use SQLite in production
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
| # | ❌ Don't | ✅ Do Instead |
|
||||
|---|---------|--------------|
|
||||
| 1 | Business logic in views | Service layer (`services.py`) |
|
||||
| 2 | One giant app | App-per-domain |
|
||||
| 3 | Default User model | Custom User before first migrate |
|
||||
| 4 | No `select_related` | Always eager-load related objects |
|
||||
| 5 | Django fixtures for tests | `factory_boy` factories |
|
||||
| 6 | `settings.py` single file | Split: base + dev + prod |
|
||||
| 7 | `runserver` in production | Gunicorn + Nginx |
|
||||
| 8 | SQLite in production | PostgreSQL |
|
||||
| 9 | `ModelSerializer` for writes | Explicit input serializer |
|
||||
| 10 | Raw SQL in views | ORM querysets + `selectors.py` |
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "Can't change User model after first migration"
|
||||
|
||||
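**Fix:** If starting fresh: delete all migrations + DB, set custom User, re-migrate. If data exists: this needs a manual, multi-step migration (Django ticket #25313 outlines the steps) or an incremental field-by-field migration; `django-allauth` handles authentication flows, not the model swap.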
|
||||
|
||||
### Issue 2: "Serializer is too slow on large querysets"
|
||||
|
||||
**Fix:** Missing `select_related` / `prefetch_related` → N+1 queries.
|
||||
```python
|
||||
queryset = Order.objects.select_related('user').prefetch_related('items')
|
||||
```
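To make the N+1 cost concrete, here is a minimal sketch that uses the standard-library `sqlite3` module as a stand-in for the ORM. The `users`/`orders` tables and the helper names are hypothetical and not part of Django; the point is only to count statements for a per-row lookup versus a single join, which is roughly what `select_related` produces.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory schema with two users and three orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
        INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
        """
    )
    return conn


def names_n_plus_one(conn):
    """One query for the orders, then one more per order: 1 + N statements."""
    queries = 1
    rows = conn.execute("SELECT id, user_id FROM orders ORDER BY id").fetchall()
    names = []
    for _order_id, user_id in rows:
        queries += 1
        row = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
        names.append(row[0])
    return names, queries


def names_joined(conn):
    """A single JOIN fetches the same data in one statement."""
    rows = conn.execute(
        "SELECT u.name FROM orders o JOIN users u ON u.id = o.user_id ORDER BY o.id"
    ).fetchall()
    return [r[0] for r in rows], 1
```

Both helpers return the same names; only the statement count differs, and the gap grows linearly with the number of rows.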
|
||||
|
||||
### Issue 3: "Circular import between apps"
|
||||
|
||||
**Fix:** Use string references: `models.ForeignKey('orders.Order', ...)` instead of importing the model class. For services, import inside the function.
|
||||
78
fullstack-dev/references/environment-management.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# Environment & CORS Management
|
||||
|
||||
Patterns for managing environment variables, API URLs, and CORS configuration across frontend and backend stacks.
|
||||
|
||||
---
|
||||
|
||||
## Standard Environment Pattern
|
||||
|
||||
```
|
||||
# .env.local (gitignored, for local dev)
|
||||
NEXT_PUBLIC_API_URL=http://localhost:3001
|
||||
NEXT_PUBLIC_WS_URL=ws://localhost:3001
|
||||
|
||||
# Staging (set in Vercel/CI)
|
||||
NEXT_PUBLIC_API_URL=https://api-staging.example.com
|
||||
|
||||
# Production (set in Vercel/CI)
|
||||
NEXT_PUBLIC_API_URL=https://api.example.com
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Environment Variable Rules
|
||||
|
||||
```
|
||||
✅ API base URL from environment variable — NEVER hardcoded
|
||||
✅ Prefix client-side vars with NEXT_PUBLIC_ (Next.js) or VITE_ (Vite)
|
||||
✅ Backend URL = server-only env var (for SSR calls, not exposed to browser)
|
||||
✅ CORS on backend: explicit list of allowed origins per environment
|
||||
|
||||
❌ Never use localhost URLs in production builds
|
||||
❌ Never expose backend-only secrets with NEXT_PUBLIC_ prefix
|
||||
❌ Never commit .env.local (commit .env.example with placeholders)
|
||||
```
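The rules above can be enforced at startup rather than by convention. A minimal sketch, assuming a server-side variable named `API_URL` and an `APP_ENV` variable that names the deployment (both names are illustrative, not taken from any framework):

```python
import os
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def load_api_url(env=None, var="API_URL"):
    """Read the API base URL from the environment and fail fast on bad values."""
    env = os.environ if env is None else env
    url = env.get(var)
    if not url:
        raise RuntimeError(f"{var} is not set")
    host = urlparse(url).hostname
    # Assumption: APP_ENV == "production" marks a production build.
    if env.get("APP_ENV") == "production" and host in LOCAL_HOSTS:
        raise RuntimeError(f"{var} points at {host} in a production build")
    return url.rstrip("/")
```

Passing a plain dict as `env` keeps the check easy to unit test; in the application, call `load_api_url()` once during startup so a misconfigured deploy fails immediately.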
|
||||
|
||||
---
|
||||
|
||||
## CORS Configuration
|
||||
|
||||
```typescript
// Backend: environment-aware CORS
const ALLOWED_ORIGINS: Record<string, string[]> = {
  development: ['http://localhost:3000', 'http://localhost:5173'],
  staging: ['https://staging.example.com'],
  production: ['https://example.com', 'https://www.example.com'],
};

// APP_ENV is an assumed variable name: NODE_ENV is usually 'production' on
// staging too, so it cannot select the staging list on its own.
const env = process.env.APP_ENV ?? process.env.NODE_ENV ?? 'development';

app.use(cors({
  origin: ALLOWED_ORIGINS[env] ?? [],  // unknown environment: allow no origins
  credentials: true,  // needed for cookies (auth)
  methods: ['GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
}));
```
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "CORS error in browser but works in Postman"
|
||||
|
||||
**Cause:** CORS is enforced by the browser. Postman and curl don't apply the same-origin policy, so they never block the response.
|
||||
|
||||
**Fix:**
|
||||
1. Backend must return `Access-Control-Allow-Origin: https://your-frontend.com`
|
||||
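2. For cookies/auth: `credentials: true` on the server and `credentials: 'include'` on the client `fetch` (or `withCredentials: true` for XHR/axios); the allowed origin must then be an explicit value, not `*`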
|
||||
3. Check that preflight `OPTIONS` request returns correct headers
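The fix list above boils down to one decision per request: is this origin on the list for this environment? A framework-neutral sketch of that check follows; the `cors_headers` helper is hypothetical, and the origin lists mirror the configuration shown earlier in this file.

```python
ALLOWED_ORIGINS = {
    "development": {"http://localhost:3000", "http://localhost:5173"},
    "staging": {"https://staging.example.com"},
    "production": {"https://example.com", "https://www.example.com"},
}


def cors_headers(origin, environment):
    """Return the CORS response headers for a request, or {} if not allowed."""
    # Unknown environments get an empty allow-list, so the check fails closed.
    allowed = ALLOWED_ORIGINS.get(environment, set())
    if origin not in allowed:
        return {}
    return {
        # Echo the exact origin: "*" is not valid together with credentials.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response differs per Origin header.
        "Vary": "Origin",
    }
```

When debugging, compare the `Origin` header the browser sends with the list for the running environment; a mismatch in scheme, host, or port is enough to fail the check.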
|
||||
|
||||
### Issue 2: "Environment variable undefined in browser"
|
||||
|
||||
**Cause:** Missing `NEXT_PUBLIC_` or `VITE_` prefix for client-side access.
|
||||
|
||||
**Fix:** Client-side vars MUST have the framework prefix. Rebuild after adding new env vars (they are embedded at build time).
|
||||
|
||||
### Issue 3: "Works locally, fails in staging"
|
||||
|
||||
**Cause:** Different origins, missing CORS config for staging domain.
|
||||
|
||||
**Fix:** Add staging origin to `ALLOWED_ORIGINS`, verify env vars are set in deployment platform.
|
||||
278
fullstack-dev/references/release-checklist.md
Normal file
@@ -0,0 +1,278 @@
|
||||
# Release & Acceptance Checklist
|
||||
|
||||
6-gate release checklist for backend and full-stack applications. Prevents "it works on my machine" and "we forgot to check X" failures.
|
||||
|
||||
**Iron Law: NO RELEASE WITHOUT ALL GATES PASSING.**
|
||||
|
||||
---
|
||||
|
||||
## Release Gates Overview
|
||||
|
||||
```
|
||||
Feature Complete
|
||||
↓
|
||||
Gate 1: Functional Acceptance → Does it do what it should?
|
||||
↓
|
||||
Gate 2: Non-Functional Acceptance → Is it fast, reliable, observable?
|
||||
↓
|
||||
Gate 3: Security Review → Is it safe?
|
||||
↓
|
||||
Gate 4: Deployment Readiness → Can we deploy and rollback safely?
|
||||
↓
|
||||
Gate 5: Release Execution → Deploy with canary + monitoring
|
||||
↓
|
||||
Gate 6: Post-Release Validation → Did it actually work in production?
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Gate 1: Functional Acceptance
|
||||
|
||||
**Question: Does it do what the requirements say?**
|
||||
|
||||
- [ ] All acceptance criteria from ticket/PRD have passing tests
|
||||
- [ ] Happy path works end-to-end
|
||||
- [ ] Edge cases tested (empty inputs, max lengths, Unicode)
|
||||
- [ ] Error cases tested (invalid input, not found, timeout)
|
||||
- [ ] Data integrity verified (CRUD cycle produces correct state)
|
||||
- [ ] Backward compatibility confirmed (existing clients not broken)
|
||||
- [ ] API contract matches OpenAPI spec
|
||||
- [ ] Idempotency verified (retries don't create duplicates)
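For the idempotency item, one common approach is a client-supplied idempotency key. The sketch below is an in-memory illustration under that assumption; `OrderService` and its fields are hypothetical, and a real service would persist the keys alongside the orders.

```python
class OrderService:
    """Toy order service that deduplicates retries by idempotency key."""

    def __init__(self):
        self.orders = []
        self._by_key = {}

    def create_order(self, idempotency_key, items):
        # A retry with the same key returns the stored order instead of
        # creating a duplicate.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        order = {"id": len(self.orders) + 1, "items": list(items)}
        self.orders.append(order)
        self._by_key[idempotency_key] = order
        return order
```

A test for this checklist item then sends the same request twice and asserts that exactly one order exists afterwards.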
|
||||
|
||||
### Evidence Template
|
||||
|
||||
| Requirement | Test | Status | Notes |
|
||||
|-------------|------|--------|-------|
|
||||
| User can create order | `orders.api.test:creates order` | ✅ PASS | |
|
||||
| Empty cart → error | `orders.api.test:rejects empty` | ✅ PASS | |
|
||||
| Payment failure handled | `payments.test:handles decline` | ✅ PASS | |
|
||||
|
||||
---
|
||||
|
||||
## Gate 2: Non-Functional Acceptance
|
||||
|
||||
**Question: Is it fast, reliable, and observable?**
|
||||
|
||||
### Performance
|
||||
|
||||
- [ ] Response time within budget (p95 < ___ms) — measured, not assumed
|
||||
- [ ] No N+1 queries (checked with query logging)
|
||||
- [ ] New queries use indexes (`EXPLAIN ANALYZE`)
|
||||
- [ ] Pagination works on large datasets
|
||||
- [ ] Caching effective (hit rate > 80%)
|
||||
- [ ] Connection pool healthy under load
|
||||
|
||||
### Reliability
|
||||
|
||||
- [ ] Graceful degradation when dependencies fail (circuit breaker)
|
||||
- [ ] Retry logic works for transient failures
|
||||
- [ ] All external calls have timeouts
|
||||
- [ ] Rate limiting returns 429 correctly
|
||||
- [ ] Health check endpoints verified (`/health`, `/ready`)
|
||||
|
||||
### Observability
|
||||
|
||||
- [ ] Structured logging with request ID (not `console.log`)
|
||||
- [ ] Metrics exposed (request count, latency, error rate)
|
||||
- [ ] Alerts configured (error spike, latency spike)
|
||||
- [ ] Request tracing works end-to-end
|
||||
- [ ] Dashboard updated for new feature
|
||||
|
||||
### Evidence
|
||||
|
||||
| Metric | Target | Actual | Status |
|
||||
|--------|--------|--------|--------|
|
||||
| p95 response | < 500ms | ___ms | ✅/❌ |
|
||||
| p99 response | < 1000ms | ___ms | ✅/❌ |
|
||||
| Error rate (load) | < 0.1% | ___% | ✅/❌ |
|
||||
| Throughput | > ___ RPS | ___ RPS | ✅/❌ |
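The "Actual" column should come from measured samples. A minimal sketch of the nearest-rank percentile, assuming latencies are collected as a plain list of milliseconds (the function names are illustrative):

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def latency_report(samples_ms):
    """Summarize the two percentiles used in the evidence table."""
    return {"p95": percentile(samples_ms, 95), "p99": percentile(samples_ms, 99)}
```

Load-testing tools report these numbers directly; the sketch is mainly useful for checking raw access-log data against the same targets.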
|
||||
|
||||
---
|
||||
|
||||
## Gate 3: Security Review
|
||||
|
||||
**Question: Does this introduce vulnerabilities?**
|
||||
|
||||
### Input & Output
|
||||
|
||||
- [ ] All input validated server-side (never trust client)
|
||||
- [ ] SQL injection prevented (parameterized queries only)
|
||||
- [ ] XSS prevented (output encoding)
|
||||
- [ ] File upload validated (type, size, name sanitized)
|
||||
- [ ] Rate limiting on sensitive endpoints (login, reset, APIs)
|
||||
|
||||
### Auth & Data
|
||||
|
||||
- [ ] Protected endpoints require valid credentials
|
||||
- [ ] Users can only access their own resources
|
||||
- [ ] Admin routes require admin role
|
||||
- [ ] Tokens expire (short-lived access + refresh)
|
||||
- [ ] Passwords hashed (bcrypt/argon2, not MD5/SHA)
|
||||
- [ ] Sensitive data not logged (passwords, tokens, PII)
|
||||
- [ ] Secrets in env vars (not hardcoded)
|
||||
- [ ] Error messages don't leak internals
|
||||
|
||||
### Dependencies
|
||||
|
||||
- [ ] No known vulnerabilities (`npm audit` / `pip-audit` / `govulncheck`)
|
||||
- [ ] Dependencies pinned in lockfile
|
||||
- [ ] Unused dependencies removed
|
||||
|
||||
---
|
||||
|
||||
## Gate 4: Deployment Readiness
|
||||
|
||||
**Question: Can we deploy safely and roll back if needed?**
|
||||
|
||||
### Code
|
||||
|
||||
- [ ] All tests pass in CI (not "it passed locally")
|
||||
- [ ] Linter clean, build succeeds
|
||||
- [ ] Code reviewed and approved
|
||||
- [ ] No unresolved TODO/FIXME/HACK
|
||||
|
||||
### Database
|
||||
|
||||
- [ ] Migration tested on staging with production-like data
|
||||
- [ ] Down migration works (tested!)
|
||||
- [ ] Migration is non-destructive (additive only)
|
||||
- [ ] Migration timing estimated on production data size
|
||||
- [ ] Backfill plan documented (if needed)
|
||||
|
||||
### Configuration
|
||||
|
||||
- [ ] New env vars documented in `.env.example`
|
||||
- [ ] Env vars set in staging and verified
|
||||
- [ ] Env vars set in production
|
||||
- [ ] Feature flags configured (if applicable)
|
||||
|
||||
### Rollback Plan Template
|
||||
|
||||
```markdown
|
||||
## Rollback Plan: [Feature]
|
||||
|
||||
### When to rollback
|
||||
- Error rate > 1% sustained 5 minutes
|
||||
- p99 latency > 3000ms sustained 10 minutes
|
||||
- Critical business function broken
|
||||
|
||||
### Steps
|
||||
1. Revert deploy: [command]
|
||||
2. Rollback migration (if applied): [command]
|
||||
3. Invalidate cache: [command]
|
||||
4. Notify team: #incidents channel
|
||||
5. Verify rollback: [verification steps]
|
||||
|
||||
### Estimated time: [X minutes]
|
||||
### Data recovery: [procedure if data was modified]
|
||||
```
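The "sustained" wording in the rollback triggers matters: a single bad sample should not trigger a rollback. A sketch of that check, assuming one metric sample per minute (the function names and the sampling interval are assumptions, not part of the template):

```python
def sustained_breach(samples, threshold, window):
    """True if the most recent `window` samples all exceed `threshold`."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(samples) < window:
        return False  # not enough data yet to call it sustained
    return all(value > threshold for value in samples[-window:])


def should_roll_back(error_rates_pct, p99_latencies_ms):
    """Apply the two numeric triggers from the template (one sample per minute)."""
    return (
        sustained_breach(error_rates_pct, threshold=1.0, window=5)
        or sustained_breach(p99_latencies_ms, threshold=3000, window=10)
    )
```

The third trigger, a broken critical business function, is a human judgment and is deliberately left out of the sketch.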
|
||||
|
||||
---
|
||||
|
||||
## Gate 5: Release Execution
|
||||
|
||||
### Deployment Sequence
|
||||
|
||||
```
|
||||
1. 📢 ANNOUNCE in release channel
|
||||
|
||||
2. 🗄️ DATABASE — Apply migration
|
||||
- Run migration
|
||||
- Verify completion
|
||||
- Check data integrity
|
||||
|
||||
3. 🚀 DEPLOY — Roll out code
|
||||
- Canary first (10% traffic)
|
||||
- Monitor 5 minutes
|
||||
- If OK → 50% → monitor → 100%
|
||||
- If NOT OK → STOP immediately
|
||||
|
||||
4. 🔍 SMOKE TEST
|
||||
- Health check → 200
|
||||
- Login works
|
||||
- Core operation works
|
||||
- No error spikes
|
||||
|
||||
5. ✅ ANNOUNCE "Release complete. Monitoring 30 min."
|
||||
```
|
||||
|
||||
### Canary Decision Table
|
||||
|
||||
| Metric | Baseline | Canary OK | STOP | ROLLBACK |
|
||||
|--------|----------|-----------|------|----------|
|
||||
| Error rate | 0.05% | < 0.1% | 0.5% | > 1% |
|
||||
| p95 latency | 300ms | < 500ms | 700ms | > 1000ms |
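The decision table can be encoded so the canary verdict is not left to interpretation under pressure. The sketch below makes one assumption the table does not state: values between the "Canary OK" and "STOP" columns are treated as `HOLD` (keep the current traffic share and keep watching). All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    ok_below: float        # "Canary OK" column
    stop_at: float         # "STOP" column
    rollback_above: float  # "ROLLBACK" column


ERROR_RATE_PCT = Thresholds(ok_below=0.1, stop_at=0.5, rollback_above=1.0)
P95_LATENCY_MS = Thresholds(ok_below=500, stop_at=700, rollback_above=1000)

SEVERITY = ["OK", "HOLD", "STOP", "ROLLBACK"]


def classify(value, t):
    """Map one metric value to a verdict."""
    if value > t.rollback_above:
        return "ROLLBACK"
    if value >= t.stop_at:
        return "STOP"
    if value < t.ok_below:
        return "OK"
    return "HOLD"  # assumption: the gap between OK and STOP means "do not widen"


def canary_decision(error_rate_pct, p95_ms):
    """The overall verdict is the most severe verdict of any metric."""
    verdicts = [classify(error_rate_pct, ERROR_RATE_PCT), classify(p95_ms, P95_LATENCY_MS)]
    return max(verdicts, key=SEVERITY.index)
```

Taking the worst verdict across metrics means a healthy error rate cannot mask a latency regression.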
|
||||
|
||||
---
|
||||
|
||||
## Gate 6: Post-Release Validation
|
||||
|
||||
### Immediate (0-30 min)
|
||||
|
||||
- [ ] Health checks green on all instances
|
||||
- [ ] Error rate within normal range
|
||||
- [ ] Latency normal (p95, p99)
|
||||
- [ ] Core user journey manually tested
|
||||
- [ ] Logs clean — no unexpected errors
|
||||
- [ ] Alerts silent
|
||||
|
||||
### Short-term (1-24 hours)
|
||||
|
||||
- [ ] No customer complaints
|
||||
- [ ] Business metrics stable (conversion, revenue, signups)
|
||||
- [ ] Memory/CPU stable (no creeping usage)
|
||||
- [ ] Queue backlogs clear
|
||||
- [ ] Database performance stable
|
||||
|
||||
### Post-Release Report Template
|
||||
|
||||
```markdown
|
||||
## Release Report: [Feature]
|
||||
- Deployed: [timestamp] by @[engineer]
|
||||
- Duration: [minutes]
|
||||
|
||||
| Check | Status | Notes |
|
||||
|-------|--------|-------|
|
||||
| Health checks | ✅ | All healthy |
|
||||
| Error rate | ✅ | 0.03% (baseline: 0.05%) |
|
||||
| p95 latency | ✅ | 310ms (baseline: 300ms) |
|
||||
| Core flow | ✅ | Order creation verified |
|
||||
|
||||
Issues found: None / [details]
|
||||
Rollback used: No / Yes: [reason]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Release Readiness Score
|
||||
|
||||
Score each gate **0-2**: (0 = not checked, 1 = partially, 2 = fully verified with evidence)
|
||||
|
||||
| Gate | Score |
|
||||
|------|-------|
|
||||
| 1. Functional Acceptance | /2 |
|
||||
| 2. Non-Functional Acceptance | /2 |
|
||||
| 3. Security Review | /2 |
|
||||
| 4. Deployment Readiness | /2 |
|
||||
| 5. Release Execution Plan | /2 |
|
||||
| 6. Post-Release Validation Plan | /2 |
|
||||
| **Total** | **/12** |
|
||||
|
||||
**Decision:**
|
||||
- **12/12** → Ship it ✅
|
||||
- **10-11** → Ship with documented exceptions + owner assigned
|
||||
- **< 10** → Do NOT release. Fix gaps first.
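The scoring rule is simple enough to automate, for example as a CI step that reads the gate scores from a release ticket. A minimal sketch with hypothetical gate keys:

```python
GATES = (
    "functional",
    "non_functional",
    "security",
    "deployment_readiness",
    "release_execution",
    "post_release_validation",
)


def release_decision(scores):
    """Return (total, decision) for a mapping of gate name to a 0-2 score."""
    if set(scores) != set(GATES):
        raise ValueError("score every gate exactly once")
    if any(score not in (0, 1, 2) for score in scores.values()):
        raise ValueError("each gate is scored 0, 1, or 2")
    total = sum(scores.values())
    if total == 12:
        return total, "ship"
    if total >= 10:
        return total, "ship with documented exceptions"
    return total, "do not release"
```

The sketch only applies the thresholds; the documented exceptions and their owners still have to be written down by a person.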
|
||||
|
||||
---
|
||||
|
||||
## Common Rationalizations
|
||||
|
||||
| ❌ Excuse | ✅ Reality |
|
||||
|----------|-----------|
|
||||
| "It's a small change" | Small changes cause outages every day |
|
||||
| "We tested locally" | Local ≠ production |
|
||||
| "We'll fix it if it breaks" | You'll fix it at 3 AM. Prevent now. |
|
||||
| "Deadline is today" | Broken code costs more than late code |
|
||||
| "CI passed" | CI doesn't check everything. Run the checklist. |
|
||||
| "We can always rollback" | Only if you planned and tested rollback |
|
||||
| "We did this last time fine" | Survivorship bias. Checklist every time. |
|
||||
254
fullstack-dev/references/technology-selection.md
Normal file
@@ -0,0 +1,254 @@
|
||||
# Technology Selection Framework
|
||||
|
||||
Structured decision framework for backend and full-stack technology choices. Prevents analysis paralysis while ensuring rigorous evaluation.
|
||||
|
||||
**Iron Law: NO TECHNOLOGY CHOICE WITHOUT EXPLICIT TRADE-OFF ANALYSIS.**
|
||||
|
||||
"I like it" and "it's trending" are not engineering arguments.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Requirements Before Technology
|
||||
|
||||
### Non-Functional Requirements (Quantify!)
|
||||
|
||||
| Dimension | Question | Bad Answer | Good Answer |
|
||||
|-----------|----------|-----------|-------------|
|
||||
| Scale | How many concurrent users? | "Lots" | "1K concurrent, 500 RPS peak" |
|
||||
| Latency | Acceptable p99 response time? | "Fast" | "< 200ms API, < 2s reports" |
|
||||
| Availability | Required uptime? | "Always up" | "99.9% (8.76h downtime/year)" |
|
||||
| Data volume | Expected storage growth? | "A lot" | "100GB/year, 10M rows" |
|
||||
| Consistency | Strong vs eventual? | "Consistent" | "Strong for payments, eventual for feeds" |
|
||||
| Compliance | Regulatory? | "Some" | "GDPR data residency EU, SOC 2 Type II" |
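Quantifying availability is plain arithmetic, and writing it down avoids arguing about what "three nines" allows. A small sketch, assuming a 365-day year:

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_pct):
    """Allowed downtime per year for a given availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage")
    return (100 - availability_pct) / 100 * HOURS_PER_YEAR
```

For example, `downtime_hours_per_year(99.9)` is about 8.76 hours, while 99.99% leaves under an hour per year.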
|
||||
|
||||
### Team Constraints
|
||||
|
||||
- Team size and seniority level
|
||||
- What the team already knows well
|
||||
- Can you hire for this stack? (check job market)
|
||||
- Timeline pressure (days vs months to production)
|
||||
- Budget for licenses, infrastructure, training
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Evaluation Matrix
|
||||
|
||||
Score each option 1-5 on weighted criteria:
|
||||
|
||||
| Criterion | Weight | Option A | Option B | Option C |
|
||||
|-----------|--------|----------|----------|----------|
|
||||
| Meets functional requirements | 5× | _ | _ | _ |
|
||||
| Meets non-functional requirements | 5× | _ | _ | _ |
|
||||
| Team expertise / learning curve | 4× | _ | _ | _ |
|
||||
| Ecosystem maturity (libs, tools) | 3× | _ | _ | _ |
|
||||
| Community & long-term viability | 3× | _ | _ | _ |
|
||||
| Operational complexity | 3× | _ | _ | _ |
|
||||
| Hiring pool availability | 2× | _ | _ | _ |
|
||||
| Cost (license + infra + training) | 2× | _ | _ | _ |
|
||||
| **Weighted Total** | | _ | _ | _ |
|
||||
|
||||
**Rules:**
|
||||
- Any option scoring **1 on a 5× criterion** → automatically disqualified
|
||||
- Options within **10%** of each other → choose what team knows best
|
||||
- Options **10-15%** apart → run a **time-boxed PoC** (2-5 days max)
|
||||
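The matrix and its disqualification rule can be computed mechanically. A sketch with hypothetical criterion keys matching the table rows; the "within X%" comparison is interpreted here as the gap relative to the higher total, which is an assumption:

```python
WEIGHTS = {
    "functional": 5,
    "non_functional": 5,
    "team_expertise": 4,
    "ecosystem": 3,
    "community": 3,
    "operations": 3,
    "hiring": 2,
    "cost": 2,
}


def weighted_total(scores):
    """Return the weighted total, or None if the option is disqualified."""
    for criterion, weight in WEIGHTS.items():
        score = scores[criterion]
        if not 1 <= score <= 5:
            raise ValueError(f"{criterion} must be scored 1-5")
        # A score of 1 on a 5x criterion disqualifies the option outright.
        if weight == 5 and score == 1:
            return None
    return sum(weight * scores[criterion] for criterion, weight in WEIGHTS.items())


def relative_gap(total_a, total_b):
    """Gap between two totals as a fraction of the higher one (assumed definition)."""
    return abs(total_a - total_b) / max(total_a, total_b)
```

With these weights the maximum total is 135, so two options at 100 and 92 are 8% apart and fall under the "choose what the team knows best" rule.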
|
||||
---
|
||||
|
||||
## Phase 3: Decision Trees
|
||||
|
||||
### Backend Language / Framework
|
||||
|
||||
```
|
||||
What type of project?
|
||||
│
|
||||
├─ REST/GraphQL API, rapid development
|
||||
│ ├─ Team knows TypeScript → Node.js
|
||||
│ │ ├─ Full-featured, enterprise patterns → NestJS
|
||||
│ │ ├─ Lightweight, flexible → Fastify / Hono / Express
|
||||
│ │ └─ Full-stack with React → Next.js API routes
|
||||
│ ├─ Team knows Python
|
||||
│ │ ├─ High-perf async API → FastAPI
|
||||
│ │ ├─ Full-stack, admin-heavy → Django
|
||||
│ │ └─ Lightweight → Flask / Litestar
|
||||
│ └─ Team knows Java/Kotlin
|
||||
│ ├─ Enterprise, large team → Spring Boot
|
||||
│ └─ Lightweight, fast startup → Quarkus / Ktor
|
||||
│
|
||||
├─ High concurrency, systems-level
|
||||
│ ├─ Microservices, network → Go
|
||||
│ ├─ Extreme perf, safety → Rust (Axum / Actix)
|
||||
│ └─ Fault tolerance → Elixir (Phoenix)
|
||||
│
|
||||
├─ Real-time (WebSocket, streaming)
|
||||
│ ├─ Node.js ecosystem → Socket.io / ws
|
||||
│ ├─ Scalable pub/sub → Elixir Phoenix
|
||||
│ └─ Low-latency → Go / Rust
|
||||
│
|
||||
└─ ML / data-intensive
|
||||
└─ Python (FastAPI + ML libs)
|
||||
```
|
||||
|
||||
### Database
|
||||
|
||||
```
|
||||
What data model?
|
||||
│
|
||||
├─ Structured, relational, ACID
|
||||
│ ├─ General purpose → PostgreSQL ← DEFAULT CHOICE
|
||||
│ ├─ Read-heavy, MySQL ecosystem → MySQL / MariaDB
|
||||
│ └─ Embedded / serverless edge → SQLite / Turso / D1
|
||||
│
|
||||
├─ Semi-structured, flexible schema
|
||||
│ ├─ Document-oriented → MongoDB
|
||||
│ ├─ Serverless document → DynamoDB / Firestore
|
||||
│ └─ Search-heavy → Elasticsearch / OpenSearch
|
||||
│
|
||||
├─ Key-value / cache
|
||||
│ ├─ In-memory + data structures → Redis / Valkey
|
||||
│ └─ Planet-scale KV → DynamoDB / Cassandra
|
||||
│
|
||||
├─ Time-series → TimescaleDB / ClickHouse / InfluxDB
|
||||
├─ Graph → Neo4j / Apache AGE (Postgres extension)
|
||||
└─ Vector (AI embeddings) → pgvector / Pinecone / Qdrant
|
||||
```
|
||||
|
||||
**Default:** Start with PostgreSQL. It handles 80% of use cases.
|
||||
|
||||
### Caching Strategy
|
||||
|
||||
| Pattern | Technology | When |
|
||||
|---------|-----------|------|
|
||||
| Application cache | Redis / Valkey | Sessions, frequent reads, rate limiting |
|
||||
| HTTP cache | CDN (Cloudflare/Vercel) | Static assets, public API responses |
|
||||
| Query cache | Materialized views | Complex aggregations, dashboards |
|
||||
| In-process cache | LRU (in-memory) | Config, small lookup tables |
|
||||
| Edge cache | Cloudflare KV / Vercel KV | Global low-latency reads |
|
||||
|
||||
### Message Queue / Event Streaming
|
||||
|
||||
| Pattern | Technology | When |
|
||||
|---------|-----------|------|
|
||||
| Task queue (background jobs) | BullMQ / Celery / SQS | Email, exports, payments |
|
||||
| Event streaming (replay, audit) | Kafka / Redpanda | Event sourcing, real-time pipelines |
|
||||
| Lightweight pub/sub | Redis Streams / NATS | Simple notifications, broadcasting |
|
||||
| Request-reply (sync over async) | NATS / RabbitMQ RPC | Internal service calls |
|
||||
|
||||
### Hosting / Deployment
|
||||
|
||||
| Model | Technology | When |
|
||||
|-------|-----------|------|
|
||||
| Serverless (auto-scale) | Vercel / Cloudflare Workers / Lambda | Variable traffic, pay-per-use |
|
||||
| Container (predictable) | Cloud Run / Render / Railway / Fly.io | Steady traffic, simple ops |
|
||||
| Kubernetes (large scale) | EKS / GKE / AKS | 10+ services, team has K8s expertise |
|
||||
| VPS (full control) | DigitalOcean / Hetzner / EC2 | Predictable workload, cost-sensitive |
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Decision Documentation
|
||||
|
||||
### ADR (Architecture Decision Record) Template
|
||||
|
||||
```markdown
|
||||
# ADR-{NNN}: {Title}
|
||||
|
||||
## Status: Proposed | Accepted | Deprecated | Superseded by ADR-{NNN}
|
||||
|
||||
## Context
|
||||
What problem are we solving? What forces are at play?
|
||||
|
||||
## Decision
|
||||
What did we choose and why?
|
||||
|
||||
## Evaluation
|
||||
| Criterion | Weight | Chosen | Runner-up |
|
||||
|-----------|--------|--------|-----------|
|
||||
|
||||
## Consequences
|
||||
- Positive: ...
|
||||
- Negative: ...
|
||||
- Risks: ...
|
||||
|
||||
## Alternatives Rejected
|
||||
- Option B: rejected because...
|
||||
- Option C: rejected because...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Stack Templates
|
||||
|
||||
### A: Startup / MVP (Speed)
|
||||
|
||||
| Layer | Choice | Why |
|
||||
|-------|--------|-----|
|
||||
| Language | TypeScript | One language front + back |
|
||||
| Framework | Next.js (full-stack) or NestJS (API) | Fast iteration |
|
||||
| Database | PostgreSQL (Supabase / Neon) | Managed, generous free tier |
|
||||
| Auth | Better Auth / Clerk | No auth code to maintain |
|
||||
| Cache | Redis (Upstash) | Serverless-friendly |
|
||||
| Hosting | Vercel / Railway | Zero-config deploys |
|
||||
|
||||
### B: SaaS / Business App (Balance)
|
||||
|
||||
| Layer | Choice | Why |
|
||||
|-------|--------|-----|
|
||||
| Language | TypeScript or Python | Team preference |
|
||||
| Framework | NestJS or FastAPI | Structured, testable |
|
||||
| Database | PostgreSQL | Reliable, feature-rich |
|
||||
| Queue | BullMQ (Redis) | Simple background jobs |
|
||||
| Auth | OAuth 2.0 + JWT | Standard, flexible |
|
||||
| Hosting | AWS ECS / Cloud Run | Scalable containers |
|
||||
| Monitoring | Datadog / Grafana + Prometheus | Full observability |
|
||||
|
||||
### C: High-Performance (Scale)
|
||||
|
||||
| Layer | Choice | Why |
|
||||
|-------|--------|-----|
|
||||
| Language | Go or Rust | Max throughput, low latency |
|
||||
| Database | PostgreSQL + Redis + ClickHouse | OLTP + cache + analytics |
|
||||
| Queue | Kafka / Redpanda | High-throughput streaming |
|
||||
| Hosting | Kubernetes (EKS/GKE) | Fine-grained scaling |
|
||||
| Monitoring | Prometheus + Grafana + Jaeger | Metrics + tracing |
|
||||
|
||||
### D: AI / ML Application
|
||||
|
||||
| Layer | Choice | Why |
|
||||
|-------|--------|-----|
|
||||
| Language | Python (API) + TypeScript (frontend) | ML libs + modern UI |
|
||||
| Framework | FastAPI + Next.js | Async + SSR |
|
||||
| Database | PostgreSQL + pgvector | Relational + embeddings |
|
||||
| Queue | Celery + Redis | ML job processing |
|
||||
| Hosting | Modal / AWS GPU / Replicate | GPU access |
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
| # | ❌ Don't | ✅ Do Instead |
|
||||
|---|---------|--------------|
|
||||
| 1 | "X is trending on HN" | Evaluate against YOUR requirements |
|
||||
| 2 | Resume-Driven Development | Choose what team can maintain |
|
||||
| 3 | "Must scale to 1M users" (day 1) | Build for 10× current need, not 1000× |
|
||||
| 4 | Evaluate for weeks | Time-box to 3-5 days, then decide |
|
||||
| 5 | No decision documentation | Write ADR for every major choice |
|
||||
| 6 | Ignore operational cost | Include deploy, monitor, debug cost |
|
||||
| 7 | "We'll rewrite later" | Assume you won't. Choose carefully. |
|
||||
| 8 | Microservices by default | Start monolith, extract when needed |
|
||||
| 9 | Different DB per service (day 1) | One database, split when justified |
|
||||
| 10 | "It worked at Google" | You're not Google. Scale to YOUR context. |
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "Team can't agree on a framework"
|
||||
|
||||
**Fix:** Time-box to 3 days. Fill the evaluation matrix. If scores within 10%, pick what the majority knows. Document in ADR. Move on.
|
||||
|
||||
### Issue 2: "We picked X but it doesn't fit"
|
||||
|
||||
**Fix:** Sunk cost fallacy check. If < 2 weeks invested, switch now. If > 2 weeks, document pain points and plan phased migration.
|
||||
|
||||
### Issue 3: "Do we need microservices?"
|
||||
|
||||
**Fix:** Almost certainly no. Start with a well-structured monolith. Extract to services only when: (a) different scaling needs, (b) different team ownership, (c) different deployment cadence.
|
||||
404
fullstack-dev/references/testing-strategy.md
Normal file
@@ -0,0 +1,404 @@
|
||||
# Backend Testing Strategy
|
||||
|
||||
Comprehensive testing guide for backend and full-stack applications. Covers the full testing pyramid with deep focus on API integration tests, database testing, contract testing, and performance testing.
|
||||
|
||||
## Quick Start Checklist
|
||||
|
||||
- [ ] **Test runner configured** (Jest/Vitest, Pytest, Go test)
|
||||
- [ ] **Test database** ready (Docker container or in-memory)
|
||||
- [ ] **Database isolation** per test (transaction rollback or truncation)
|
||||
- [ ] **Test factories** for common entities (user, order, product)
|
||||
- [ ] **Auth helper** to generate tokens for tests
|
||||
- [ ] **CI pipeline** runs tests with real database service
|
||||
- [ ] **Coverage threshold** enforced (≥ 80%)
|
||||
|
||||
---
|
||||
|
||||
## The Testing Pyramid
|
||||
|
||||
```
|
||||
╱╲ E2E (few, slow) — full flows across services
|
||||
╱ ╲
|
||||
╱────╲ Integration (moderate) — API + DB + external
|
||||
╱ ╲
|
||||
╱────────╲ Unit (many, fast) — pure business logic
|
||||
╱__________╲
|
||||
```
|
||||
|
||||
| Level | What | Speed | Count |
|
||||
|-------|------|-------|-------|
|
||||
| Unit | Pure functions, business logic, no I/O | < 10ms | 70%+ of tests |
|
||||
| Integration | API routes + real database + mocked externals | 50-500ms | ~20% |
|
||||
| E2E | Full user flow across deployed services | 1-30s | ~10% |
|
||||
| Contract | API compatibility between services | < 100ms | Per API boundary |
|
||||
| Performance | Load, stress, soak | Minutes | Per critical path |
|
||||
|
||||
---
|
||||
|
||||
## 1. API Integration Testing (CRITICAL)
|
||||
|
||||
### What to Test for Every Endpoint
|
||||
|
||||
| Aspect | Tests to Write |
|
||||
|--------|---------------|
|
||||
| Happy path | Correct input → expected response + correct DB state |
|
||||
| Auth | No token → 401, bad token → 401, expired → 401 |
|
||||
| Authorization | Wrong role → 403, not owner → 403 |
|
||||
| Validation | Missing fields → 422, bad types → 422, boundary values |
|
||||
| Not found | Invalid ID → 404, deleted resource → 404 |
|
||||
| Conflict | Duplicate create → 409, stale update → 409 |
|
||||
| Idempotency | Same request twice → same result |
|
||||
| Side effects | DB state changed, events emitted, cache invalidated |
|
||||
| Error format | All errors match RFC 9457 envelope |
|
||||
|
||||
### TypeScript (Jest + Supertest)
|
||||
|
||||
```typescript
|
||||
describe('POST /api/orders', () => {
|
||||
let token: string;
|
||||
let product: Product;
|
||||
|
||||
beforeAll(async () => {
|
||||
await resetDatabase();
|
||||
const user = await createTestUser({ role: 'customer' });
|
||||
token = await getAuthToken(user);
|
||||
product = await createTestProduct({ price: 29.99, stock: 10 });
|
||||
});
|
||||
|
||||
it('creates order → 201 + correct DB state', async () => {
|
||||
const res = await request(app)
|
||||
.post('/api/orders')
|
||||
.set('Authorization', `Bearer ${token}`)
|
||||
.send({ items: [{ productId: product.id, quantity: 2 }] });
|
||||
|
||||
expect(res.status).toBe(201);
|
||||
expect(res.body.data.total).toBe(59.98);
|
||||
|
||||
const updated = await db.product.findUnique({ where: { id: product.id } });
|
||||
expect(updated!.stock).toBe(8);
|
||||
});
|
||||
|
||||
it('rejects without auth → 401', async () => {
|
||||
const res = await request(app).post('/api/orders').send({ items: [] });
|
||||
expect(res.status).toBe(401);
|
||||
});
|
||||
|
||||
it('rejects empty items → 422', async () => {
|
||||
const res = await request(app)
|
||||
.post('/api/orders')
|
||||
.set('Authorization', `Bearer ${token}`)
|
||||
.send({ items: [] });
|
||||
expect(res.status).toBe(422);
|
||||
expect(res.body.errors[0].field).toBe('items');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
### Python (Pytest + FastAPI TestClient)
|
||||
|
||||
```python
@pytest.fixture
def client(db_session):
    def override_get_db():
        yield db_session
    app.dependency_overrides[get_db] = override_get_db
    yield TestClient(app)
    app.dependency_overrides.clear()

def test_create_order_success(client, auth_headers, test_product):
    response = client.post("/api/orders", json={
        "items": [{"product_id": test_product.id, "quantity": 2}]
    }, headers=auth_headers)
    assert response.status_code == 201
    assert response.json()["data"]["total"] == 59.98

def test_create_order_no_auth(client):
    response = client.post("/api/orders", json={"items": []})
    assert response.status_code == 401

def test_create_order_empty_items(client, auth_headers):
    response = client.post("/api/orders", json={"items": []}, headers=auth_headers)
    assert response.status_code == 422
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Database Testing (HIGH)
|
||||
|
||||
### Test Isolation Strategies
|
||||
|
||||
| Strategy | Speed | Realism | When |
|
||||
|----------|-------|---------|------|
|
||||
| **Transaction rollback** | ⚡ Fastest | Medium | Default for unit + integration |
|
||||
| **Truncation** | Fast | High | When rollback isn't possible |
|
||||
| **Test containers** | Slow startup | Highest | CI pipeline, full integration |
|
||||
|
||||
**Transaction rollback (recommended default):**
|
||||
```typescript
|
||||
let tx: Transaction;
|
||||
beforeEach(async () => { tx = await db.beginTransaction(); });
|
||||
afterEach(async () => { await tx.rollback(); });
|
||||
```
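The same rollback pattern in Python, as a minimal sketch that uses the standard-library `sqlite3` module in place of a real test database. The `rolled_back` helper and the `users` table are hypothetical; with pytest, the context manager body would typically live in a fixture.

```python
import sqlite3
from contextlib import contextmanager


def make_test_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so transactions
    # are controlled only by the explicit BEGIN / ROLLBACK below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    return conn


@contextmanager
def rolled_back(conn):
    """Run a test body inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        conn.execute("ROLLBACK")


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Each test sees its own writes inside the `with` block, and the table is empty again afterwards, so tests cannot leak state into each other.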
|
||||
|
||||
**Docker test containers (CI):**
|
||||
```yaml
# docker-compose.test.yml
services:
  test-db:
    image: postgres:16-alpine
    tmpfs: /var/lib/postgresql/data  # RAM disk for speed
    environment:
      POSTGRES_DB: myapp_test
      POSTGRES_PASSWORD: test  # throwaway value; the official image will not start without a password
```
|
||||
|
||||
### Test Factories (Not Raw SQL)
|
||||
|
||||
```typescript
|
||||
// factories/user.factory.ts
|
||||
import { faker } from '@faker-js/faker';
|
||||
|
||||
export function buildUser(overrides: Partial<User> = {}): CreateUserDTO {
|
||||
return {
|
||||
email: faker.internet.email(),
|
||||
firstName: faker.person.firstName(),
|
||||
role: 'customer',
|
||||
...overrides,
|
||||
};
|
||||
}
|
||||
export async function createUser(overrides = {}) {
|
||||
return db.user.create({ data: buildUser(overrides) });
|
||||
}
|
||||
```
|
||||
|
||||
```python
# factories/user_factory.py
import factory

class UserFactory(factory.Factory):
    class Meta:
        model = User  # the application's user model, imported from your app

    # factory.Faker is factory_boy's built-in wrapper around the faker library
    email = factory.Faker("email")
    first_name = factory.Faker("first_name")
    role = "customer"
```
|
||||
|
||||
---
|
||||
|
||||
## 3. External Service Testing (HIGH)
|
||||
|
||||
### HTTP-Level Mocking (Not Function Mocking)
|
||||
|
||||
**TypeScript (nock):**
|
||||
```typescript
|
||||
import nock from 'nock';
|
||||
|
||||
it('processes payment successfully', async () => {
|
||||
nock('https://api.stripe.com')
|
||||
.post('/v1/charges')
|
||||
.reply(200, { id: 'ch_123', status: 'succeeded', amount: 5000 });
|
||||
|
||||
const result = await paymentService.charge({ amount: 50.00, currency: 'usd' });
|
||||
expect(result.status).toBe('succeeded');
|
||||
});
|
||||
|
||||
it('handles payment timeout', async () => {
|
||||
nock('https://api.stripe.com').post('/v1/charges').delay(10000).reply(200);
|
||||
await expect(paymentService.charge({ amount: 50, currency: 'usd' }))
|
||||
.rejects.toThrow('timeout');
|
||||
});
|
||||
```
|
||||
|
||||
**Python (responses):**
|
||||
```python
import responses

@responses.activate
def test_payment_success():
    responses.post("https://api.stripe.com/v1/charges",
                   json={"id": "ch_123", "status": "succeeded"}, status=200)
    result = payment_service.charge(amount=50.00, currency="usd")
    assert result.status == "succeeded"
```
|
||||
|
||||
### Test Containers for Infrastructure
|
||||
|
||||
```typescript
|
||||
import { PostgreSqlContainer } from '@testcontainers/postgresql';
|
||||
import { RedisContainer } from '@testcontainers/redis';
|
||||
|
||||
beforeAll(async () => {
|
||||
const pg = await new PostgreSqlContainer('postgres:16').start();
|
||||
process.env.DATABASE_URL = pg.getConnectionUri();
|
||||
await runMigrations();
|
||||
}, 60000);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Contract Testing (MEDIUM-HIGH)
|
||||
|
||||
### Consumer-Driven Contracts (Pact)
|
||||
|
||||
**Consumer (OrderService calls UserService):**
|
||||
```typescript
|
||||
it('can fetch user by ID', async () => {
|
||||
await pact.addInteraction()
|
||||
.given('user usr_123 exists')
|
||||
.uponReceiving('GET /users/usr_123')
|
||||
.withRequest('GET', '/api/users/usr_123')
|
||||
.willRespondWith(200, (b) => {
|
||||
b.jsonBody({ data: { id: MatchersV3.string(), email: MatchersV3.email() } });
|
||||
})
|
||||
.executeTest(async (mockserver) => {
|
||||
const user = await new UserClient(mockserver.url).getUser('usr_123');
|
||||
expect(user.id).toBeDefined();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Provider verifies in CI:**
|
||||
```typescript
|
||||
await new Verifier({
|
||||
providerBaseUrl: 'http://localhost:3001',
|
||||
pactBrokerUrl: process.env.PACT_BROKER_URL,
|
||||
provider: 'UserService',
|
||||
}).verifyProvider();
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Performance Testing (MEDIUM)
|
||||
|
||||
### k6 Load Test
|
||||
|
||||
```javascript
|
||||
import http from 'k6/http';
|
||||
import { check, sleep } from 'k6';
|
||||
|
||||
export const options = {
|
||||
stages: [
|
||||
{ duration: '30s', target: 20 }, // ramp up to 20 VUs
|
||||
{ duration: '1m', target: 100 }, // keep ramping to 100 VUs (each stage ramps linearly to its target)
|
||||
{ duration: '30s', target: 0 }, // ramp down
|
||||
],
|
||||
thresholds: {
|
||||
http_req_duration: ['p(95)<500', 'p(99)<1000'],
|
||||
http_req_failed: ['rate<0.001'], // 0.1%, matching the error-rate budget
|
||||
},
|
||||
};
|
||||
|
||||
export default function () {
|
||||
const res = http.get(`${__ENV.BASE_URL}/api/orders`);
|
||||
check(res, { 'status 200': (r) => r.status === 200 });
|
||||
sleep(1);
|
||||
}
|
||||
```
|
||||
|
||||
### Performance Budgets
|
||||
|
||||
| Metric | Target | Action if Exceeded |
|
||||
|--------|--------|--------------------|
|
||||
| p95 response time | < 500ms | Optimize queries/caching |
|
||||
| p99 response time | < 1000ms | Check outlier queries |
|
||||
| Error rate | < 0.1% | Investigate spikes |
|
||||
| DB query time | < 100ms each | Add indexes |
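
The budgets in the table can be checked mechanically once raw request durations are available. The following is a minimal sketch, not part of any load-testing tool: `percentile` and `checkBudgets` are hypothetical helper names, the nearest-rank percentile definition is an assumption (load-testing tools may interpolate differently), and the limits are copied from the table.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

interface BudgetResult {
  metric: string;
  value: number;
  limit: number;
  ok: boolean;
}

// Limits mirror the budget table; adjust them to your own targets.
function checkBudgets(durationsMs: number[], failedRequests: number): BudgetResult[] {
  const errorRatePct = (failedRequests / durationsMs.length) * 100;
  const rows: Array<[string, number, number]> = [
    ['p95 response time (ms)', percentile(durationsMs, 95), 500],
    ['p99 response time (ms)', percentile(durationsMs, 99), 1000],
    ['error rate (%)', errorRatePct, 0.1],
  ];
  return rows.map(([metric, value, limit]) => ({ metric, value, limit, ok: value < limit }));
}
```

A CI step could fail the build when any returned row has `ok: false`, which keeps the "Action if Exceeded" column from depending on someone reading a dashboard.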
|
||||
|
||||
### When to Run
|
||||
|
||||
| Trigger | Test Type |
|
||||
|---------|-----------|
|
||||
| Before major release | Full load test |
|
||||
| New DB query/index | Query benchmark |
|
||||
| Infrastructure change | Baseline comparison |
|
||||
| Weekly (CI) | Smoke load test |
|
||||
|
||||
---
|
||||
|
||||
## Test File Organization
|
||||
|
||||
```
|
||||
tests/
|
||||
unit/ # Pure logic, mocked dependencies
|
||||
order.service.test.ts
|
||||
integration/ # API + real DB
|
||||
orders.api.test.ts
|
||||
auth.api.test.ts
|
||||
contracts/ # Consumer-driven contracts
|
||||
user-service.consumer.pact.ts
|
||||
performance/ # Load tests
|
||||
load-test.js
|
||||
fixtures/
|
||||
factories/ # Test data factories
|
||||
user.factory.ts
|
||||
seeds/
|
||||
test-data.ts
|
||||
helpers/
|
||||
setup.ts # Global test config
|
||||
auth.helper.ts # Token generation
|
||||
db.helper.ts # DB cleanup
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
| # | ❌ Don't | ✅ Do Instead |
|
||||
|---|---------|--------------|
|
||||
| 1 | Test only happy paths | Test errors, auth, validation, edge cases |
|
||||
| 2 | Mock everything (no real DB) | Use test containers or test DB |
|
||||
| 3 | Tests depend on execution order | Each test sets up / tears down own state |
|
||||
| 4 | Hardcode test data | Use factories (faker + overrides) |
|
||||
| 5 | Test implementation details | Test behavior: input → output |
|
||||
| 6 | Share mutable state | Isolate per test (transaction rollback) |
|
||||
| 7 | Skip migration testing in CI | Run migrations from scratch in CI |
|
||||
| 8 | No performance test before release | Load test every major release |
|
||||
| 9 | Test against production data | Generated test data only |
|
||||
| 10 | Test suite > 10 minutes | Parallelize, RAM disk, optimize setup |
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue 1: "Tests pass alone but fail together"
|
||||
|
||||
**Cause:** Tests share database state and nothing cleans it up between them, so one test's rows leak into the next.
|
||||
|
||||
**Fix:**
|
||||
```typescript
|
||||
beforeEach(async () => { await db.raw('TRUNCATE orders, users CASCADE'); });
|
||||
// OR use transaction rollback per test
|
||||
```
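
The transaction-rollback alternative can be sketched as follows. This is an illustration under stated assumptions, not a specific library's API: `Transactional`, `InMemoryUsers`, and `withRollback` are hypothetical names, and the in-memory class only stands in for a real database client so the sketch runs on its own.

```typescript
// Minimal transactional surface; a real client (knex, pg, Prisma) exposes
// its own equivalent of begin/rollback.
interface Transactional {
  begin(): Promise<void>;
  rollback(): Promise<void>;
}

// Hypothetical in-memory table, used only to make the sketch runnable.
class InMemoryUsers implements Transactional {
  private rows: string[] = [];
  private snapshot: string[] | null = null;

  async begin(): Promise<void> {
    this.snapshot = [...this.rows];
  }

  async rollback(): Promise<void> {
    if (this.snapshot === null) throw new Error('no open transaction');
    this.rows = this.snapshot;
    this.snapshot = null;
  }

  async insert(email: string): Promise<void> {
    this.rows.push(email);
  }

  count(): number {
    return this.rows.length;
  }
}

// Runs a test body inside a transaction and always rolls back,
// even when the body throws, so no state leaks into the next test.
async function withRollback<T extends Transactional, R>(
  db: T,
  body: (db: T) => Promise<R>,
): Promise<R> {
  await db.begin();
  try {
    return await body(db);
  } finally {
    await db.rollback();
  }
}
```

In a Jest suite the same idea is usually split across hooks: open the transaction in `beforeEach` and roll it back in `afterEach`. Rollback tends to be faster than `TRUNCATE`, but it only works when the code under test uses the same connection as the test.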
|
||||
|
||||
### Issue 2: "Jest did not exit one second after the test run has completed"
|
||||
|
||||
**Cause:** Unclosed database connections or HTTP servers.
|
||||
|
||||
**Fix:**
|
||||
```typescript
|
||||
afterAll(async () => {
|
||||
await db.destroy();
|
||||
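await new Promise((resolve) => server.close(resolve)); // http.Server#close takes a callback; it does not return a promise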
|
||||
});
|
||||
```
|
||||
|
||||
### Issue 3: "Async callback was not invoked within timeout"
|
||||
|
||||
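**Cause:** The test never signals completion: a `done` callback is never called, or an awaited promise never settles. A promise that is neither awaited nor returned is a related bug: the test ends before the request completes, so failures go unnoticed.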
|
||||
|
||||
**Fix:**
|
||||
```typescript
|
||||
// ❌ Promise not awaited
|
||||
it('should work', () => { request(app).get('/users'); });
|
||||
|
||||
// ✅ Properly awaited
|
||||
it('should work', async () => { await request(app).get('/users'); });
|
||||
```
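
When the timeout message does not reveal which call is hanging, one option is to wrap suspect promises so they fail fast with a descriptive error. A minimal sketch, assuming a hypothetical helper named `withTimeout` (not part of Jest or supertest):

```typescript
// Rejects with a labeled error if `promise` has not settled within `ms`.
// The timer is always cleared so it cannot keep the process alive.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms} ms`)),
      ms,
    );
  });
  return Promise.race([promise, deadline]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Inside a test this might look like `await withTimeout(request(app).get('/users'), 2000, 'GET /users')`, with a limit below the runner's own timeout so the labeled error surfaces first.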
|
||||
|
||||
### Issue 4: "Integration tests too slow in CI"
|
||||
|
||||
**Fix:**
|
||||
1. Use `tmpfs` for PostgreSQL data dir (RAM disk)
|
||||
2. Run migrations once in `beforeAll`, truncate in `beforeEach`
|
||||
3. Parallelize test suites with `--maxWorkers`
|
||||
4. Run performance tests only on `main`; skip them on feature branches
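
Steps 2 and 3 interact: workers that run in parallel and truncate one shared database can wipe each other's rows. One way around this is a separate database per worker, named after the `JEST_WORKER_ID` environment variable that Jest sets for each worker. The sketch below is an assumption-laden illustration: `workerDatabaseUrl` is a hypothetical helper, the `_w<id>` suffix is an arbitrary naming choice, and each per-worker database still has to be created and migrated.

```typescript
// Appends a per-worker suffix to the database name in a connection URL,
// leaving any query string (e.g. ?sslmode=disable) in place.
function workerDatabaseUrl(baseUrl: string, workerId: string): string {
  const queryStart = baseUrl.indexOf('?');
  const base = queryStart === -1 ? baseUrl : baseUrl.slice(0, queryStart);
  const query = queryStart === -1 ? '' : baseUrl.slice(queryStart);
  return `${base}_w${workerId}${query}`;
}
```

A setup file could then assign the result of `workerDatabaseUrl(baseUrl, process.env.JEST_WORKER_ID ?? '1')` to `process.env.DATABASE_URL` before any connection is opened.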
|
||||