#llm

17 entries

2026-04-30 ai agent architecture

Agent Architecture: What Is Inside an AI Agent

Dissecting the internal architecture of an AI agent: the agent loop, tool use mechanics, 3 memory tiers, planning patterns (ReAct, Plan-and-Execute, Tree of Thoughts), multi-agent systems, and 7 common pitfalls.
2026-04-30 ai llm hallucination

AI Hallucination: Why LLMs "Make Things Up" and 6 Layers of Defense for Devs

Why hallucination is inherent to next-token prediction rather than a bug: 4 mechanisms, a taxonomy of 6 forms devs run into, a 6-layer Swiss-cheese defense model, real cases (the lawyers, Air Canada, slopsquatting), and how to scale defenses to risk.
2026-04-30 ai llm tokens

Tokens: The Billing Unit and the Thinking Unit of LLMs

Tokens are the unit of compression, the unit of thinking, and the unit of billing for LLMs. The post goes deep on BPE tokenization, Vietnamese being 2-3x more expensive, thinking tokens, and a framework for choosing subscription vs API.
2026-04-30 ai llm fine-tuning

Fine-tuning LLMs: When You Need It, When You Don't, and How to Actually Do It

Fine-tuning vs prompting vs RAG: a decision framework. 4 types of fine-tuning (full, LoRA, QLoRA, instruction tuning), data preparation, cost analysis, and 6 common pitfalls (including overfitting and catastrophic forgetting).
2026-04-30 ai history machine-learning

History of AI: From the 1958 Perceptron to 2026 Coding Agents

AI's 70-year journey through 7 eras: symbolic AI, expert systems, statistical learning, deep learning, transformers, LLMs, and reasoning + agentic. Each era had its winter and its breakthrough, with lessons for today's devs.
2026-04-30 ai llm cost-optimization

LLM Cost Optimization: 10 Patterns to Cut the Bill 50-95% Without Losing Quality

Every LLM cost lever comes down to 3 groups: don't call the model, call a lighter one, or pay less per token. 10 battle-tested patterns (prompt cache, semantic cache, routing, distillation, batch, quantization, self-host), cost math, and an ROI framework.
2026-04-30 ai llm models

LLM Models Comparison: Claude, GPT, Gemini, Llama, and Which to Use for Which Task

7 dimensions for evaluating LLMs, a hands-on comparison of the Claude/GPT/Gemini/Llama families in early 2026, thinking models vs regular models, open-source vs proprietary, and a decision framework for choosing the right model for a task based on cost + quality.
2026-04-30 ai llm prompt-engineering

Prompting Fundamentals: From Vague Questions to Instructions an LLM Actually Understands

3 layers of a prompt (system/user/assistant), 6 principles for writing effective prompts, sampling parameters (temperature, top-p, top-k, stopping criteria), personalization via the system prompt, multi-turn strategy, and reusable templates for devs.
2026-04-30 ai llm tokenization

Tokenization, Temperature, Top-p, Top-k: The Mechanics Beneath Every LLM

4 technical mechanisms that every dev using LLMs should understand deeply: BPE tokenization step by step, the math of temperature scaling, top-p (nucleus) vs top-k sampling, the complete sampling pipeline, and a parameter cheatsheet.
2026-04-30 ai llm open-source

Open-source LLM Ecosystem 2026: Llama, Mistral, DeepSeek, Qwen, and How to Run Them Locally

An overview of the 2026 open-weight LLM ecosystem: 5 main families, tools for running locally (Ollama, LM Studio, vLLM, llama.cpp), quantization formats (GGUF/GPTQ/AWQ), license gotchas, and hardware budgets from laptop to cluster.
2026-04-30 ai rag vector-database

RAG: Retrieval Augmented Generation from A to Z for Devs

Dissecting RAG: the indexing pipeline, embeddings, vector DBs, chunking strategies, retrieval (dense/sparse/hybrid), reranking, 8 common failure modes, and deciding when to use RAG vs long context vs fine-tuning.
2026-03-03 ai llm models

Choosing an LLM for Agents: A Durable Framework Beyond Leaderboards

A senior engineer's framework for model selection: capability tiers, context, modality, cost, privacy, and tool use, plus routing, cascades, and why benchmarks lie.
2026-02-23 ai llm agents

Evaluating LLMs & Agents: Golden Sets, Metrics, LLM-as-Judge, and Regression in CI

Why eval is the hardest part of shipping agents — golden datasets, offline vs online metrics, LLM-as-judge rubrics, human agreement, and regression in CI.
2026-02-15 ai llm fine-tuning

Fine-tuning vs Prompting vs RAG: A Decision Framework for Adapting LLMs

When to prompt, retrieve, or fine-tune: knowledge vs behavior, data needs, cost, privacy, SFT/LoRA/DPO — and why most teams start with prompt + RAG.
2026-01-30 ai llm agents

Stopping Criteria & Output Control — When Generation Ends and What to Do About It

EOS tokens, max_tokens, stop sequences, and finish_reason handling for production LLM agents — streaming, truncation, and runaway cost guards.
2026-01-22 ai agents prompt-engineering

Prompt Engineering for Agents — Messages, Personas, Few-Shot & Structured Output

Agent prompt design: messages/roles, personas, few-shot trade-offs, CoT vs reasoning models, JSON schemas, templates, injection guards, iteration.
2026-01-14 ai llm sampling

Sampling for Agents: Temperature, Top-p, Top-k — When Randomness Helps or Hurts

How LLMs turn logits into tokens — temperature, top_p, top_k, penalties, seeds — and why agent builders tune sampling differently for tool calls vs brainstorming.