Node.js Super Senior · Phase 7 — DevOps & Deployment

Phase 7: take it to production — 12-factor config, multi-stage Docker images and PID 1 signals, Docker Compose, PM2 vs Kubernetes, health checks and zero-downtime deploys, a CI/CD pipeline with GitHub Actions, and nginx + TLS.

JUN 9, 2026 10 MIN READ

This is Phase 7 of the 10-phase Super Senior path. “It works on my machine” is where juniors stop and seniors begin. Now we package the app so it runs identically anywhere, ships automatically, and survives restarts, deploys, and failures.

7.1 Environment & config management

The Twelve-Factor rule: config that varies between environments lives in the environment, never in code.

# .env  (gitignored — NEVER commit secrets)
NODE_ENV=production
PORT=3000
DATABASE_URL=postgresql://user:pass@localhost/dbname
JWT_SECRET=replace_with_at_least_32_random_characters
REDIS_URL=redis://localhost:6379

Centralize and validate config at boot, and fail fast if a required variable is missing or weak:

import { z } from 'zod';

const envSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  PORT: z.coerce.number().default(3000),
  DATABASE_URL: z.string().url(),
  JWT_SECRET: z.string().min(32), // refuse to start with a weak secret
});
export const config = envSchema.parse(process.env); // crash loudly at startup, not at 3am
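To see what the schema is checking, here is a rough dependency-free sketch of the same fail-fast idea. It is not how zod works internally: loadConfig and the returned field names are made up for illustration, and the PORT range check is an extra assumption that the schema above does not enforce.

```javascript
// Dependency-free sketch of fail-fast config validation.
// `loadConfig` and its return shape are hypothetical names for illustration.
function loadConfig(env) {
  const errors = [];

  const nodeEnv = env.NODE_ENV ?? 'development';
  if (!['development', 'test', 'production'].includes(nodeEnv)) {
    errors.push(`NODE_ENV must be development, test, or production (got "${nodeEnv}")`);
  }

  // Assumption: we also reject ports outside the valid TCP range.
  const port = Number(env.PORT ?? 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    errors.push('PORT must be an integer between 1 and 65535');
  }

  if (!env.DATABASE_URL || !URL.canParse(env.DATABASE_URL)) {
    errors.push('DATABASE_URL must be a valid URL');
  }

  if (!env.JWT_SECRET || env.JWT_SECRET.length < 32) {
    errors.push('JWT_SECRET must be at least 32 characters');
  }

  if (errors.length > 0) {
    // Report every problem at once, and never echo the secret values themselves.
    throw new Error(`Invalid configuration:\n- ${errors.join('\n- ')}`);
  }

  return Object.freeze({
    nodeEnv,
    port,
    databaseUrl: env.DATABASE_URL,
    jwtSecret: env.JWT_SECRET,
  });
}
```

Calling loadConfig(process.env) once at the top of the entry file gives the same behavior as the schema: a bad deploy crashes immediately with a readable list of problems instead of failing on the first request that touches the missing value.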

Node has read .env files natively since v20.6, so on Node 24 you can run node --env-file=.env app.js and drop dotenv. For real secrets, prefer a secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager) that injects values as environment variables at deploy time, not a committed file.

7.2 Docker: build small, run safe

A Docker image is built in layers. Each instruction is a cacheable build step, and a changed step invalidates the cache for every step after it. Use a multi-stage build so build tools never reach the final image:

# ---- build stage ----
FROM node:24-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci                      # cached until package*.json changes
COPY . .
RUN npm run build               # compile TypeScript → dist/

# ---- runtime stage ----
FROM node:24-alpine
WORKDIR /app
ENV NODE_ENV=production
RUN apk add --no-cache tini     # a real PID 1 that forwards signals (see 7.5)
COPY package*.json ./
RUN npm ci --omit=dev           # production deps only
COPY --from=build /app/dist ./dist
# never run as root (comment kept on its own line: USER would read a trailing "# ..." as part of the user name)
USER node
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s \
  CMD node -e "fetch('http://localhost:3000/health').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/index.js"]

Senior reflexes:

Add a .dockerignore (node_modules, .env, .git, tests) for smaller, faster, safer builds.
Copy package*.json before the source so the npm ci layer stays cached until dependencies change, which keeps rebuilds fast.
Pin the base image by digest, run as non-root, and scan the image (docker scout or trivy) before shipping.
Use Alpine or distroless for a smaller attack surface.

Docker Compose: the whole stack

Compose runs your app together with its dependencies in one command:

services:
  app:
    build: .
    ports: ['3000:3000']
    environment:
      DATABASE_URL: postgresql://user:pass@postgres:5432/mydb
      REDIS_URL: redis://redis:6379
    depends_on:
      postgres: { condition: service_healthy }   # wait until DB is actually ready
      redis: { condition: service_started }

  postgres:
    image: postgres:17-alpine
    environment: { POSTGRES_USER: user, POSTGRES_PASSWORD: pass, POSTGRES_DB: mydb }
    volumes: ['pgdata:/var/lib/postgresql/data'] # persist data across restarts
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U user -d mydb']
      interval: 10s
      retries: 5

  redis: { image: redis:7-alpine }

volumes:
  pgdata:

Service names become hostnames (postgres, redis) on the Compose network, and a named volume persists DB data across container restarts.
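One caveat worth sketching: depends_on only orders startup. A dependency can still restart later, and service_started (used for Redis above) does not wait for the service to accept connections. A common answer is to retry the first connection with exponential backoff. The sketch below is a minimal version under that assumption; connectWithRetry, backoffDelay, and the connect callback are hypothetical names, not part of any client library.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelay(attempt, baseMs = 200, capMs = 5000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `connect` until it resolves, or rethrows once `retries` retries are used up.
// `connect` is a hypothetical async function, e.g. () => pool.query('SELECT 1').
async function connectWithRetry(connect, { retries = 5, baseMs = 200, capMs = 5000 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await connect();
    } catch (err) {
      // Out of retries: rethrow so the process crashes and the restart policy takes over.
      if (attempt >= retries) throw err;
      await sleep(backoffDelay(attempt, baseMs, capMs));
    }
  }
}
```

With the defaults this waits 200 ms, 400 ms, 800 ms, 1600 ms, and 3200 ms between attempts, which is usually enough to ride out a database that is still starting without hiding a real outage for long.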

7.3 Health checks & graceful shutdown

Orchestrators decide whether to send you traffic and whether to restart you based on probes, so expose both kinds:

Liveness: “am I alive?” If it fails, the orchestrator restarts the container.
Readiness: “can I serve traffic now?” If it fails, traffic is withheld but the container is not killed (e.g. during warm-up or while the DB is unreachable).

app.get('/health', (_req, res) => res.json({ status: 'ok' }));         // liveness: cheap
app.get('/ready', async (_req, res) => {                              // readiness: deps
  try { await pool.query('SELECT 1'); res.json({ ready: true }); }
  catch { res.status(503).json({ ready: false }); }
});

On deploy the orchestrator sends SIGTERM: drain in-flight work, then exit (Phase 2/3). The full sequence for a zero-downtime shutdown:

SIGTERM ─▶ stop readiness (drop from LB) ─▶ server.close() (finish in-flight)
        ─▶ close DB pool / queue workers ─▶ exit 0   (force-exit after a grace timeout)
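The sequence above can be sketched as one small function. This is an illustration under stated assumptions, not a drop-in implementation: createShutdown, closers, graceMs, and the injectable exit are hypothetical names, and server is anything with a close(callback) method, such as the http.Server that Express returns from app.listen().

```javascript
// Returns { shutdown, isReady }.
// `server`  : anything with close(cb), e.g. an http.Server.
// `closers` : hypothetical async cleanup functions, e.g. () => pool.end().
// `exit`    : injectable so the sequence can be tested without killing the process.
function createShutdown({ server, closers = [], graceMs = 10_000, exit = process.exit }) {
  let ready = true;
  let started = false;

  async function shutdown() {
    if (started) return; // a second signal must not restart the sequence
    started = true;

    // 1. Fail readiness so the load balancer stops routing new traffic here.
    ready = false;

    // Safety net: force-exit if draining hangs. unref() stops the timer
    // from keeping the process alive on its own.
    const force = setTimeout(() => exit(1), graceMs);
    force.unref();

    // 2. Stop accepting connections; the callback fires once in-flight requests finish.
    await new Promise((resolve) => server.close(resolve));

    // 3. Release pools and queue workers, one after another.
    for (const close of closers) await close();

    // 4. Clean exit.
    clearTimeout(force);
    exit(0);
  }

  return { shutdown, isReady: () => ready };
}

// Wiring (not executed here):
//   const server = app.listen(3000);
//   const { shutdown, isReady } = createShutdown({ server, closers: [() => pool.end()] });
//   process.on('SIGTERM', shutdown);
//   // in the /ready route: respond 503 when isReady() is false
```

In practice many teams also wait a few seconds between failing readiness and calling server.close(), so the load balancer has time to notice the failed probe. That delay depends on your probe interval, so it is left out of the sketch.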

7.4 PM2 vs container orchestration

A bare node app.js stays down after an uncaught exception and runs your JavaScript on a single thread, so it uses roughly one CPU core. There are two worlds.

On a classic VPS, PM2 restarts the process when it crashes and runs it in cluster mode across all cores:

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'api', script: './dist/index.js',
    instances: 'max', exec_mode: 'cluster',   // one worker per core, load-balanced
    max_memory_restart: '500M',
    env_production: { NODE_ENV: 'production', PORT: 3000 },
  }],
};

pm2 start ecosystem.config.js --env production
pm2 reload api      # zero-downtime: workers cycle one at a time
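Roughly what cluster mode does for you can be sketched with the built-in node:cluster module. This is an illustration of the idea, not PM2's actual code: resolveInstances and start are hypothetical helpers, and only 'max' and positive integers are handled here.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Resolve an `instances` setting to a worker count.
// Simplification: only 'max' and positive integers are supported in this sketch.
function resolveInstances(instances, cores = os.availableParallelism()) {
  if (instances === 'max') return cores;
  const n = Number(instances);
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError(`invalid instances: ${instances}`);
  }
  return n;
}

function start({ instances = 'max', port = 3000 } = {}) {
  if (cluster.isPrimary) {
    const count = resolveInstances(instances);
    for (let i = 0; i < count; i += 1) cluster.fork();

    // Replace a worker that dies: the "keeps it alive" part.
    cluster.on('exit', (worker, code, signal) => {
      console.error(`worker ${worker.process.pid} exited (${signal ?? code}), forking a replacement`);
      cluster.fork();
    });
  } else {
    // Every worker listens on the same port; the primary distributes connections.
    http
      .createServer((_req, res) => res.end(`handled by ${process.pid}\n`))
      .listen(port);
  }
}
```

Calling start() at the bottom of the entry file forks one worker per core. Note that this sketch re-forks immediately with no backoff, so a worker that crashes on boot would loop forever; a process manager adds restart limits and delays on top of this.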

In containers, the orchestrator (Kubernetes, ECS) owns restarts, scaling, and load balancing, so you run one Node process per container and skip PM2 cluster mode. The Kubernetes vocabulary to recognize:

# A Deployment runs N replicas; probes use the endpoints from 7.3
livenessProbe:  { httpGet: { path: /health, port: 3000 }, periodSeconds: 10 }
readinessProbe: { httpGet: { path: /ready,  port: 3000 }, periodSeconds: 5 }
# A HorizontalPodAutoscaler adds replicas when CPU/RPS rises.

Scale by adding stateless replicas (horizontal), not bigger boxes (vertical), which is why sessions and cache live in Redis, not in process memory.

7.5 PID 1 & signals: the silent container bug

In a container your process typically runs as PID 1, which has special signal rules: the kernel does not apply a signal's default action to PID 1, so if Node is PID 1 and has no SIGTERM handler, the signal is simply ignored. Deploys then wait the full grace period and hard-kill you mid-request. Two fixes, best used together: run an init like tini (shown in 7.2) as PID 1, and always handle SIGTERM/SIGINT yourself.
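Handling the signals yourself can be as small as the sketch below. It assumes the app supplies an async cleanup function (called onShutdown here); installSignalHandlers and the injectable exit are hypothetical names for illustration.

```javascript
// Registers one idempotent handler for both SIGTERM and SIGINT.
// `onShutdown` is a hypothetical async cleanup function supplied by the app.
function installSignalHandlers(onShutdown, { exit = process.exit } = {}) {
  let handling = false;

  const handle = async (signal) => {
    if (handling) return; // ignore repeated signals while cleanup is running
    handling = true;
    try {
      await onShutdown(signal);
      exit(0);
    } catch (err) {
      console.error(`shutdown after ${signal} failed:`, err);
      exit(1);
    }
  };

  // Node passes the signal name to the listener as its first argument.
  for (const signal of ['SIGTERM', 'SIGINT']) process.on(signal, handle);
  return handle;
}
```

Handlers only help if the signal reaches Node. The exec-form CMD ["node", "dist/index.js"] in 7.2 does that; a shell-form CMD or a wrapper script in front of Node may not forward SIGTERM at all.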

7.6 CI/CD pipeline (GitHub Actions)

CI/CD automates the path from git push to production: lint → test → build → migrate → deploy. The workflow below covers lint, test, build, and the image push; the migrate and deploy steps depend on where you host.

# .github/workflows/deploy.yml
name: Deploy
on: { push: { branches: [main] } }

jobs:
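  test:
    runs-on: ubuntu-latest
    env:
      # Hypothetical connection string for the service container below; adjust to your test setup.
      DATABASE_URL: postgresql://postgres:postgres@localhost:5432/postgres
    services:
      postgres:
        image: postgres:17-alpine
        env: { POSTGRES_PASSWORD: postgres }
        ports: ['5432:5432']                  # publish to the runner so tests can reach localhost:5432
        options: >-
          --health-cmd pg_isready --health-interval 10s
          --health-timeout 5s --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '24', cache: 'npm' }
      - run: npm ci
      - run: npm run lint && npm test && npm run build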

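  build-image:
    needs: test
    runs-on: ubuntu-latest
    permissions: { contents: read, packages: write }   # lets GITHUB_TOKEN push to GHCR
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3   # BuildKit builder, required for the gha cache backend
      - uses: docker/login-action@v3          # push fails without a registry login
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6     # BuildKit + layer cache
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max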

Deployment strategies a senior names: rolling (replace replicas gradually, the default for a Kubernetes Deployment), blue-green (stand up a parallel version, flip traffic, instant rollback), and canary (route 5% first, watch metrics, then ramp). Run DB migrations as an explicit, backward-compatible step before the new code goes live, so the old and new versions can both run against the migrated schema during the rollout.
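Canary routing normally lives in the load balancer, ingress, or service mesh rather than in app code, but the core idea fits in a few lines: hash a stable key into a bucket so the same user always lands on the same side, then widen the percentage as confidence grows. The sketch below only illustrates that bucketing; bucketOf and inCanary are hypothetical names.

```javascript
const { createHash } = require('node:crypto');

// Maps a stable key (user id, session id) to a bucket in [0, 100).
function bucketOf(key) {
  const digest = createHash('sha256').update(String(key)).digest();
  return digest.readUInt32BE(0) % 100;
}

// True when the key falls inside the first `percent` buckets.
// Raising `percent` only ever adds keys, so users never flip back and forth.
function inCanary(key, percent) {
  return bucketOf(key) < percent;
}
```

Because the bucket depends only on the key, ramping from 5 to 25 keeps every user who was already on the canary there, which makes the metrics comparable between steps.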

The senior principle: the pipeline is the gate. Code that fails lint or tests never reaches production, and deploys are repeatable, logged, and reversible.

7.7 nginx & TLS

In production, put nginx in front of Node as a reverse proxy: it terminates TLS, serves static files, load-balances, and shields Node from direct exposure:

server {
  listen 443 ssl;
  server_name api.example.com;
  ssl_certificate     /etc/letsencrypt/live/api.example.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

  location / {
    proxy_pass http://localhost:3000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; # for trust proxy (Phase 3)
    proxy_set_header X-Forwarded-Proto $scheme;                  # tells the app the original request was HTTPS
  }
}
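Why the app must be told how many proxies to trust: $proxy_add_x_forwarded_for appends the address nginx saw to whatever X-Forwarded-For the client sent, so the left side of the header is attacker-controlled. The sketch below shows the hop-counting idea behind a numeric trust proxy setting. It is a simplified illustration, not Express's implementation, and clientIp is a hypothetical helper.

```javascript
// Picks the client address from X-Forwarded-For, trusting only the last
// `trustedHops` proxies (1 = the single nginx in front of Node).
function clientIp(xForwardedFor, remoteAddress, trustedHops = 1) {
  const chain = String(xForwardedFor ?? '')
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean);

  // The socket peer is the final hop (nginx itself when proxied).
  chain.push(remoteAddress);

  // Step back over the trusted proxies; the entry before them is the client.
  const index = Math.max(0, chain.length - 1 - trustedHops);
  return chain[index];
}
```

For a request where the client forged "6.6.6.6" and nginx appended the real peer 203.0.113.7, clientIp('6.6.6.6, 203.0.113.7', '127.0.0.1') returns 203.0.113.7: the forged entry sits to the left of the trusted hop and is never consulted.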

Get free, auto-renewing certs with Certbot (Let’s Encrypt): certbot --nginx -d api.example.com

7.8 Observability: the operability triad

You can’t fix what you can’t see. Production needs three signals: logs (structured, with a request id, see Phase 6), metrics (RPS, p95 latency, error rate, event-loop lag via prom-client), and traces (a request’s path across services via OpenTelemetry).
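To make two of those metrics concrete, here is a minimal sketch that uses only Node built-ins instead of prom-client. percentile and startLagMonitor are hypothetical helpers, and the nearest-rank method is one of several valid percentile definitions; a metrics library would normally compute these from histograms for you.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Nearest-rank percentile of a list of numbers, with p in (0, 100].
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Samples event-loop delay. The histogram records nanoseconds,
// so readP95Ms() converts to milliseconds.
function startLagMonitor(resolutionMs = 20) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  return {
    readP95Ms: () => histogram.percentile(95) / 1e6,
    stop: () => histogram.disable(),
  };
}
```

Feeding request durations into percentile(durations, 95) gives the latency that 95% of requests beat, which is a far better alerting signal than the average because it exposes the slow tail.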

8. Hands-on projects

Dockerize properly: multi-stage Dockerfile + .dockerignore + tini + non-root + a HEALTHCHECK; build, run, and confirm the image is small and scans clean.
Compose the stack: app + Postgres + Redis with service-name hosts, healthcheck-gated depends_on, and a persistent volume.
Health + graceful shutdown: add /health and /ready; on SIGTERM stop readiness, drain, close pools, exit, and prove zero dropped requests during a restart.
PM2 cluster: write ecosystem.config.js, then prove pm2 reload is zero-downtime by curling in a loop during the restart.
CI/CD pipeline: GitHub Actions that lints + tests (with a Postgres service) + builds + pushes an image tagged by SHA; run a migration step.
Deploy with TLS: ship to a VPS behind nginx + Certbot; wire a push to main to deploy automatically.

What’s next

Your app is production-grade in operations: validated config, a small secure multi-stage image with correct PID 1 signals, the full stack via Compose, health probes and graceful shutdown, PM2 or an orchestrator, an automated CI/CD pipeline with deploy strategies, nginx + TLS, and the observability triad.

In Phase 8, we make it fast: profiling with perf_hooks and clinic.js, finding event-loop lag, fixing slow queries (eager loading, DataLoader, indexes), caching strategies, clustering, and API response optimization (pagination, field selection, compression), all measured with real load tests.