Node.js Super Senior · Phase 7 — DevOps & Deployment
Phase 7: take it to production — 12-factor config, multi-stage Docker images and PID 1 signals, Docker Compose, PM2 vs Kubernetes, health checks and zero-downtime deploys, a CI/CD pipeline with GitHub Actions, and nginx + TLS.
This is Phase 7 of the 10-phase Super Senior path {Đây là Phase 7 của lộ trình Super Senior 10 phase}. “It works on my machine” is where juniors stop and seniors begin {“Chạy trên máy tôi” là nơi junior dừng và senior bắt đầu}. Now we package the app so it runs identically anywhere, ships automatically, and survives restarts, deploys, and failures {Giờ ta đóng gói app để nó chạy y hệt mọi nơi, ship tự động, và sống sót qua restart, deploy, và sự cố}.
7.1 Environment & config management {Quản lý env & config}
The Twelve-Factor rule: config that varies between environments lives in the environment, never in code {Quy tắc Twelve-Factor: config khác nhau giữa môi trường nằm trong environment, không bao giờ trong code}.
# .env (gitignored — NEVER commit secrets)
NODE_ENV=production
PORT=3000
DATABASE_URL=postgresql://user:pass@localhost/dbname
JWT_SECRET=change_me_to_a_random_string_of_at_least_32_chars
REDIS_URL=redis://localhost:6379
Centralize and validate config at boot — fail fast if a required variable is missing or weak {Tập trung và xác thực config khi khởi động — fail nhanh nếu thiếu hoặc yếu}:
import { z } from 'zod';

const envSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  PORT: z.coerce.number().default(3000),
  DATABASE_URL: z.string().url(),
  JWT_SECRET: z.string().min(32), // refuse to start with a weak secret
});

export const config = envSchema.parse(process.env); // crash loudly at startup, not at 3am
Node 24 reads .env natively: node --env-file=.env app.js, so no dotenv is needed (the flag has been available since Node 20.6) {Node 24 đọc .env sẵn, không cần dotenv}. For real secrets, prefer a secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager) that injects them as env vars at deploy time, not a committed file {Với secret thật, ưu tiên secrets manager tiêm vào env lúc deploy}.
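To make the fail-fast behaviour concrete, here is a dependency-free sketch of roughly what the schema above enforces. The function name loadConfig and the error wording are illustrative assumptions, not zod's API; in a real project the zod version is shorter and stricter.

```javascript
// Dependency-free sketch of fail-fast config validation.
// loadConfig and its error format are hypothetical, for illustration only.
function loadConfig(env) {
  const problems = [];

  const nodeEnv = env.NODE_ENV ?? 'development';
  if (!['development', 'test', 'production'].includes(nodeEnv)) {
    problems.push(`NODE_ENV must be development, test, or production (got "${nodeEnv}")`);
  }

  const port = Number(env.PORT ?? 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    problems.push(`PORT must be an integer between 1 and 65535 (got "${env.PORT}")`);
  }

  try {
    new URL(env.DATABASE_URL); // throws on a missing or malformed URL
  } catch {
    problems.push('DATABASE_URL must be a valid URL');
  }

  const jwtSecret = env.JWT_SECRET ?? '';
  if (jwtSecret.length < 32) {
    problems.push('JWT_SECRET must be at least 32 characters');
  }

  // Report every problem at once so one restart is enough to fix them all.
  if (problems.length > 0) {
    throw new Error(`Invalid configuration:\n- ${problems.join('\n- ')}`);
  }
  return Object.freeze({ nodeEnv, port, databaseUrl: env.DATABASE_URL, jwtSecret });
}

module.exports = { loadConfig };
```

Calling loadConfig(process.env) once in the entry file, before anything listens on a port, gives the same crash-at-boot guarantee as the schema parse.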
7.2 Docker — build small, run safe {Docker — build nhỏ, chạy an toàn}
A Docker image is built in layers; each instruction is a cached layer {Image Docker dựng theo layer; mỗi lệnh là một layer được cache}. Use a multi-stage build so build tools never reach the final image {Dùng multi-stage build để công cụ build không lọt vào image cuối}:
# ---- build stage ----
FROM node:24-alpine AS build
WORKDIR /app
COPY package*.json ./
# Cached until package*.json changes
RUN npm ci
COPY . .
# Compile TypeScript → dist/
RUN npm run build

# ---- runtime stage ----
FROM node:24-alpine
WORKDIR /app
ENV NODE_ENV=production
# A real PID 1 that forwards signals (see 7.5)
RUN apk add --no-cache tini
COPY package*.json ./
# Production deps only
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
# Never run as root. Comments sit on their own lines because a Dockerfile
# treats a trailing "# ..." as part of the instruction's arguments.
USER node
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s \
  CMD node -e "fetch('http://localhost:3000/health').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/index.js"]
Senior reflexes {Phản xạ senior}:
- .dockerignore (node_modules, .env, .git, tests) → smaller, faster, safer builds {nhỏ, nhanh, an toàn hơn}.
- Copy package*.json before the source so npm ci is cached until deps change → fast rebuilds {cache npm ci đến khi deps đổi}.
- Pin a digest, run as non-root, scan the image (docker scout / trivy) before shipping {ghim digest, chạy non-root, quét image}.
- Alpine or distroless for a tiny attack surface {Alpine hoặc distroless cho bề mặt tấn công nhỏ}.
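The inline HEALTHCHECK one-liner in the Dockerfile above is compact but hard to read and has no request timeout of its own. One option is to move it into a small script. The sketch below is an assumption about how that could look: the file name healthcheck.js and the function name probe are hypothetical, and it relies only on the fetch and AbortSignal.timeout globals that ship with current Node releases.

```javascript
// healthcheck.js (hypothetical file name): resolve to 0 when healthy, 1 otherwise.
async function probe(url, timeoutMs = 2000) {
  try {
    // AbortSignal.timeout aborts the request if the app hangs instead of answering.
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok ? 0 : 1;
  } catch {
    return 1; // connection refused, timeout, or DNS failure
  }
}

// Usage in the script's entry point (left as a comment so the sketch has no side effects):
// probe('http://localhost:3000/health').then((code) => process.exit(code));
```

With the script copied into the image, the Dockerfile instruction would shrink to something like HEALTHCHECK CMD node healthcheck.js, assuming the file sits in the working directory.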
Docker Compose — the whole stack {Docker Compose — cả stack}
Compose runs your app with its dependencies in one command {Compose chạy app cùng phụ thuộc trong một lệnh}:
services:
  app:
    build: .
    ports: ['3000:3000']
    environment:
      DATABASE_URL: postgresql://user:pass@postgres:5432/mydb
      REDIS_URL: redis://redis:6379
    depends_on:
      postgres: { condition: service_healthy } # wait until DB is actually ready
      redis: { condition: service_started }
  postgres:
    image: postgres:17-alpine
    environment: { POSTGRES_USER: user, POSTGRES_PASSWORD: pass, POSTGRES_DB: mydb }
    volumes: ['pgdata:/var/lib/postgresql/data'] # persist data across restarts
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U user']
      interval: 10s
      retries: 5
  redis: { image: redis:7-alpine }
volumes:
  pgdata:
Service names become hostnames (postgres, redis) on the Compose network; a named volume persists DB data {Tên service thành hostname; named volume giữ dữ liệu DB}.
7.3 Health checks & graceful shutdown {Health check & tắt êm}
Orchestrators decide whether to send you traffic and whether to restart you based on probes — so expose them {Orchestrator quyết định gửi traffic hay restart dựa trên probe — nên hãy phơi chúng}:
- Liveness — “am I alive?” If it fails, the orchestrator restarts the container {liveness — nếu fail, orchestrator restart}.
- Readiness — “can I serve traffic now?” If it fails, traffic is withheld but the container is not killed (e.g. during warm-up or a lost DB) {readiness — nếu fail, ngưng gửi traffic nhưng không giết container}.
app.get('/health', (_req, res) => res.json({ status: 'ok' })); // liveness: cheap

app.get('/ready', async (_req, res) => { // readiness: checks dependencies
  try {
    await pool.query('SELECT 1');
    res.json({ ready: true });
  } catch {
    res.status(503).json({ ready: false });
  }
});
On deploy the orchestrator sends SIGTERM — drain in-flight work, then exit (Phase 2/3). The full sequence for zero-downtime {Khi deploy, orchestrator gửi SIGTERM — xả việc đang chạy rồi thoát. Trình tự đầy đủ để không downtime}:
SIGTERM ─▶ stop readiness (drop from LB) ─▶ server.close() (finish in-flight)
─▶ close DB pool / queue workers ─▶ exit 0 (force-exit after a grace timeout)
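The sequence above can be sketched with Node's built-in http module. This is a minimal illustration under stated assumptions, not a drop-in implementation: createApp, closeResources, and graceMs are hypothetical names, and closeResources stands in for whatever closes your DB pool or queue workers.

```javascript
const http = require('node:http');

function createApp({ closeResources = async () => {}, graceMs = 10_000 } = {}) {
  let ready = true;

  const server = http.createServer((req, res) => {
    const isReadiness = req.url === '/ready';
    const status = isReadiness && !ready ? 503 : 200;
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(isReadiness ? { ready } : { status: 'ok' }));
  });

  // Resolves to the exit code the process should use.
  function shutdown() {
    ready = false; // stop readiness so the load balancer drops this instance

    const drained = new Promise((resolve) => server.close(resolve)) // finish in-flight requests
      .then(() => closeResources()) // then close the DB pool / queue workers
      .then(() => 0);

    const timedOut = new Promise((resolve) => {
      setTimeout(() => resolve(1), graceMs).unref(); // force-exit after the grace timeout
    });

    return Promise.race([drained, timedOut]);
  }

  return { server, shutdown };
}
```

A SIGTERM handler would call shutdown() and pass the resolved code to process.exit. In practice many teams also wait a few seconds after failing readiness before calling server.close(), so the load balancer has time to notice; that delay is left out here to keep the sketch short.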
7.4 PM2 vs container orchestration {PM2 vs điều phối container}
A bare node app.js dies on an unhandled crash and uses only one CPU core {node app.js trần chết khi crash và chỉ dùng một lõi CPU}. There are two worlds {Có hai thế giới}.
On a classic VPS — PM2 keeps it alive and runs cluster mode across all cores {Trên VPS cổ điển — PM2 giữ sống và chạy cluster trên mọi lõi}:
// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'api',
    script: './dist/index.js',
    instances: 'max',
    exec_mode: 'cluster', // one worker per core, load-balanced
    max_memory_restart: '500M',
    env_production: { NODE_ENV: 'production', PORT: 3000 },
  }],
};
pm2 start ecosystem.config.js --env production
pm2 reload api # zero-downtime: workers cycle one at a time
In containers — the orchestrator (Kubernetes, ECS) owns restarts, scaling, and load-balancing, so you run one Node process per container and skip PM2 cluster mode {Trong container — orchestrator lo restart, scale, cân bằng tải, nên chạy một tiến trình Node mỗi container, bỏ PM2 cluster}. The Kubernetes vocabulary to recognize {Từ vựng Kubernetes cần nhận ra}:
# A Deployment runs N replicas; probes use the endpoints from 7.3
livenessProbe: { httpGet: { path: /health, port: 3000 }, periodSeconds: 10 }
readinessProbe: { httpGet: { path: /ready, port: 3000 }, periodSeconds: 5 }
# A HorizontalPodAutoscaler adds replicas when CPU/RPS rises.
Scale by adding stateless replicas (horizontal), not bigger boxes (vertical) — which is why sessions/cache live in Redis, not in process memory {Mở rộng bằng thêm replica stateless (ngang), không phải máy to hơn (dọc) — vì thế session/cache nằm ở Redis, không trong bộ nhớ tiến trình}.
7.5 PID 1 & signals — the silent container bug {PID 1 & tín hiệu — bug container thầm lặng}
In a container your process runs as PID 1, which has special signal rules: if Node is PID 1 and you didn’t wire handlers correctly, SIGTERM can be ignored, so deploys wait the full grace period then hard-kill you mid-request {Trong container tiến trình chạy là PID 1, có quy tắc tín hiệu đặc biệt: nếu Node là PID 1 mà không gắn handler đúng, SIGTERM có thể bị bỏ qua, nên deploy chờ hết grace rồi giết cứng giữa request}. Two fixes {Hai cách sửa}: use an init like tini (shown in 7.2) as PID 1, and always handle SIGTERM/SIGINT yourself {dùng init như tini làm PID 1, và luôn tự xử lý SIGTERM/SIGINT}.
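Handling the signals yourself can be as small as the sketch below. The helper name onSignals and its options are hypothetical; cleanup stands for whatever drain logic your app has, and the injectable exit function exists only so the behaviour can be exercised without killing the process.

```javascript
// Sketch: explicit SIGTERM/SIGINT handling with a hard deadline.
// onSignals, cleanup, graceMs, and exit are illustrative names.
function onSignals(cleanup, { graceMs = 10_000, exit = process.exit } = {}) {
  let shuttingDown = false;

  const handler = (signal) => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;

    // If cleanup hangs, exit non-zero before the orchestrator sends SIGKILL.
    const timer = setTimeout(() => exit(1), graceMs);
    timer.unref();

    Promise.resolve()
      .then(() => cleanup(signal))
      .then(
        () => exit(0),
        (err) => {
          console.error(err);
          exit(1);
        },
      )
      .finally(() => clearTimeout(timer));
  };

  process.on('SIGTERM', handler);
  process.on('SIGINT', handler);
  return handler;
}
```

Pick a graceMs shorter than the orchestrator's own grace period (Kubernetes defaults to 30 seconds), so the process exits on its own terms rather than being hard-killed.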
7.6 CI/CD pipeline (GitHub Actions) {Pipeline CI/CD (GitHub Actions)}
CI/CD automates the path from git push to production: lint → test → build → migrate → deploy {CI/CD tự động hóa đường từ git push tới production: lint → test → build → migrate → deploy}.
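# .github/workflows/deploy.yml
name: Deploy
on: { push: { branches: [main] } }
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:17-alpine
        env: { POSTGRES_PASSWORD: postgres }
        ports: ['5432:5432'] # expose the service to steps running on the runner
        options: >-
          --health-cmd pg_isready --health-interval 10s
          --health-timeout 5s --health-retries 5
    env:
      # Assumption: the test suite reads DATABASE_URL, as the app config in 7.1 does
      DATABASE_URL: postgresql://postgres:postgres@localhost:5432/postgres
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '24', cache: 'npm' }
      - run: npm ci
      - run: npm run lint && npm test && npm run build
  build-image:
    needs: test
    runs-on: ubuntu-latest
    permissions: { contents: read, packages: write } # lets GITHUB_TOKEN push to GHCR
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3 # without a login the push is rejected
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6 # BuildKit + layer cache
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max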
Deployment strategies a senior names {Chiến lược deploy senior gọi tên được}: rolling (replace replicas gradually — the default), blue-green (stand up a parallel version, flip traffic, instant rollback), and canary (route 5% first, watch metrics, then ramp) {rolling, blue-green, canary}. Run DB migrations as an explicit, backward-compatible step before the new code goes live {Chạy migration như bước riêng, tương thích ngược, trước khi code mới lên}.
The senior principle {Nguyên tắc senior}: the pipeline is the gate — code that fails lint/tests never reaches production, and deploys are repeatable, logged, and reversible {pipeline là cổng — code fail không bao giờ tới production; deploy lặp lại được, có log, đảo ngược được}.
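To make the canary idea concrete, the sketch below shows one common way to pick which requests see the new version: hash a stable key into a bucket from 0 to 99 and compare it with the current percentage. The function names and the choice of SHA-256 are assumptions for illustration; in practice the split usually lives in the load balancer or service mesh rather than in application code.

```javascript
const { createHash } = require('node:crypto');

// Maps a stable key (user id, session id) to a bucket in [0, 100).
function bucketOf(key) {
  const digest = createHash('sha256').update(String(key)).digest();
  return digest.readUInt32BE(0) % 100;
}

// The same key always lands in the same bucket, so a user does not
// flip between versions from one request to the next.
function pickVersion(key, canaryPercent) {
  return bucketOf(key) < canaryPercent ? 'canary' : 'stable';
}
```

Because the bucket is fixed per key, ramping from 5% to 25% only adds users to the canary group; nobody who was already on the new version is moved back.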
7.7 nginx & TLS {nginx & TLS}
In production, put nginx in front of Node as a reverse proxy — it terminates TLS, serves static files, load-balances, and shields Node {Ở production, đặt nginx trước Node làm reverse proxy — kết thúc TLS, phục vụ file tĩnh, cân bằng tải, che chắn Node}:
server {
    listen 443 ssl;
    server_name api.example.com;
    ssl_certificate /etc/letsencrypt/live/api.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; # for trust proxy (Phase 3)
    }
}
Get free, auto-renewing certs with Certbot (Let’s Encrypt): certbot --nginx -d api.example.com {Lấy chứng chỉ miễn phí, tự gia hạn bằng Certbot}.
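Behind the proxy, Node sees every connection as coming from nginx, so the real client address has to be read from X-Forwarded-For. nginx's $proxy_add_x_forwarded_for appends the address it saw to whatever the client sent, which means only the rightmost entries can be trusted. The sketch below illustrates that rule; clientIp and trustedHops are hypothetical names, and this is a simplified model of the idea, not Express's implementation of trust proxy.

```javascript
// Returns the first address that was NOT added by a trusted proxy.
// xForwardedFor: raw header value (may be undefined)
// socketAddress: the direct peer, e.g. req.socket.remoteAddress
// trustedHops: how many proxies you operate in front of Node (1 for a single nginx)
function clientIp(xForwardedFor, socketAddress, trustedHops = 1) {
  const forwarded = (xForwardedFor ?? '')
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean);

  // Leftmost = what the client claims, rightmost = the direct peer.
  const chain = [...forwarded, socketAddress];
  const index = Math.max(0, chain.length - 1 - trustedHops);
  return chain[index];
}
```

With one nginx in front, a spoofed header such as "1.2.3.4, 203.0.113.7" still yields 203.0.113.7, because that entry is the one nginx itself appended.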
7.8 Observability — the operability triad {Khả năng quan sát — bộ ba vận hành}
You can’t fix what you can’t see {Không sửa được cái không thấy}. Production needs three signals {Production cần ba tín hiệu}: logs (structured, with request id — Phase 6), metrics (RPS, p95 latency, error rate, event-loop lag via prom-client), and traces (a request’s path across services via OpenTelemetry) {log (có cấu trúc), metric (RPS, p95, lỗi, event-loop lag), và trace (đường đi của request qua các service)}.
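Event-loop lag, one of the metrics named above, can be sampled with Node's built-in perf_hooks module before any metrics library is involved. The sketch below is one possible wrapper: startLagMonitor and the shape of its snapshot are assumptions for illustration, while monitorEventLoopDelay and its histogram methods come from node:perf_hooks.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

function startLagMonitor({ resolutionMs = 10 } = {}) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();

  const toMs = (nanoseconds) => nanoseconds / 1e6; // histogram values are in nanoseconds

  return {
    // Read the current window, then reset so each snapshot covers a fresh interval.
    snapshot() {
      const result = {
        meanMs: toMs(histogram.mean),
        p95Ms: toMs(histogram.percentile(95)),
        maxMs: toMs(histogram.max),
      };
      histogram.reset();
      return result;
    },
    stop() {
      histogram.disable();
    },
  };
}
```

A typical use would be to call snapshot() on an interval and export the numbers as gauges, for example through prom-client, so a blocked event loop shows up on a dashboard rather than as unexplained latency.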
7.9 Hands-on projects {Dự án thực hành}
- Dockerize properly {Đóng gói Docker đúng cách}: multi-stage Dockerfile + .dockerignore + tini + non-root + a HEALTHCHECK; build, run, and confirm the image is small and scans clean {multi-stage + .dockerignore + tini + non-root + HEALTHCHECK; xác nhận image nhỏ và quét sạch}.
- Compose the stack {Dựng stack bằng Compose}: app + Postgres + Redis with service-name hosts, healthcheck-gated depends_on, and a persistent volume {app + Postgres + Redis, hostname theo service, depends_on chờ healthcheck, volume bền}.
- Health + graceful shutdown {Health + tắt êm}: add /health and /ready; on SIGTERM stop readiness, drain, close pools, and exit; then prove zero dropped requests during a restart {thêm /health và /ready; khi SIGTERM xả rồi thoát, chứng minh không rớt request}.
- PM2 cluster {PM2 cluster}: write ecosystem.config.js, then prove pm2 reload is zero-downtime by curling in a loop during the restart {viết ecosystem.config.js, chứng minh pm2 reload không downtime}.
- CI/CD pipeline {Pipeline CI/CD}: GitHub Actions that lints + tests (with a Postgres service) + builds + pushes an image tagged by SHA; run a migration step {lint + test + build + push image gắn SHA; chạy bước migration}.
- Deploy with TLS {Deploy có TLS}: ship to a VPS behind nginx + Certbot; wire a main push to deploy automatically {ship lên VPS sau nginx + Certbot; push main tự deploy}.
What’s next {Phần tiếp theo}
Your app is production-grade in operations: validated config, a small secure multi-stage image with correct PID 1 signals, the full stack via Compose, health probes and graceful shutdown, PM2 or an orchestrator, an automated CI/CD pipeline with deploy strategies, nginx + TLS, and the observability triad {App của bạn đạt chuẩn production về vận hành: config đã xác thực, image multi-stage nhỏ và an toàn với tín hiệu PID 1 đúng, cả stack qua Compose, probe và tắt êm, PM2 hoặc orchestrator, pipeline CI/CD, nginx + TLS, và bộ ba quan sát}.
In Phase 8, we make it fast: profiling with perf_hooks and clinic.js, finding event-loop lag, fixing slow queries (eager loading, DataLoader, indexes), caching strategies, clustering, and API response optimization (pagination, field selection, compression) — all measured with real load tests {Ở Phase 8, ta làm nó nhanh: profiling, tìm event-loop lag, sửa query chậm, chiến lược caching, clustering, và tối ưu response — đo bằng load test thật}.