Field notes on building AI systems

Hi, I'm Anubhav Anand — I build production AI systems and write about how they actually work.

Full Stack AI/ML Engineer focused on RAG pipelines, LLM fine-tuning, and agent frameworks that hold up at scale. Currently building at Publicis Sapient; previously at Gesund.ai and Spritle.

Outside of work, I also…

Write about RAG, agents, MCP, fine-tuning, and the fundamentals of generative AI — that's this blog.
Build the full pipeline end-to-end — from ingestion, chunking, and embeddings to retrieval, reranking, and autoscaling Kubernetes serving.
Take applied AI products from prototype to production — agentic assistants, RAG systems, and multi-agent workflows.
Contribute to open source — merged work across promptfoo, deepset-ai/Haystack, WordPress/ai, Supabase, LibreChat, and Arize Phoenix.
Share notes on LangGraph, LlamaIndex, and whatever I'm figuring out in the field.

58 posts

Jun 10, 2026 · 6 min · llm · deeplearning

Cost and Latency Engineering

The cheapest, fastest token is the one you never generate.

May 30, 2026 · 5 min · llm · deeplearning

LLM-as-Judge

Using one language model to grade another feels like asking the fox to audit the henhouse and trust

May 26, 2026 · 5 min · llm · deeplearning · eval

Custom Evals Are the Moat

Your model scores 89 on MMLU.

May 18, 2026 · 5 min · llm · deeplearning

Multimodal by Default

"A picture is worth a thousand words" is wrong by about an order of magnitude.

May 12, 2026 · 5 min · llm · deeplearning

Why LLMs Hallucinate

Hallucination is not the model malfunctioning.

May 7, 2026 · 5 min · llm · deeplearning

Structured Output and Constrained Decoding

There are two ways to get JSON out of a language model.

May 3, 2026 · 5 min · llm · deeplearning

Context Windows and Lost-in-the-Middle

You bought a model with a giant context window.

Apr 27, 2026 · 5 min · llm · deeplearning

The Transformer, Intuitively

Most explanations of the transformer open with a wall of matrices.

Apr 21, 2026 · 5 min · llm · deeplearning

How LLMs Actually Generate Text

A language model does not write a sentence.

Apr 15, 2026 · 5 min · finetuning · llm · deeplearning

Efficient Training with Unsloth

Same GPU, same model, same LoRA config — and the run finishes in a third of the time using most of t

Apr 10, 2026 · 5 min · finetuning · llm · deeplearning

Data Preparation for Fine-Tuning

Nobody demos the data cleaning.

Apr 3, 2026 · 5 min · finetuning · llm · deeplearning · graph

Knowledge and Chain-of-Thought Distillation

You proved the task is solvable.

Mar 30, 2026 · 5 min · finetuning · llm · deeplearning

GRPO, PPO, and KTO with TRL

DPO answered the common case.

Mar 20, 2026 · 5 min · finetuning · llm · deeplearning

DPO vs RLHF

For a couple of years, teaching a model to prefer good answers over bad ones meant running three mod

Mar 12, 2026 · 4 min · finetuning · llm · deeplearning

LoRA vs QLoRA vs DoRA vs Full Fine-Tuning

Four methods, one question: when you sit down to fine-tune, which do you reach for?

Mar 7, 2026 · 5 min · finetuning · llm · deeplearning

QLoRA: Fine-Tuning on One GPU

Try to full-fine-tune an 8B model on a single 24 GB consumer card and you won't get to the first tra

Mar 3, 2026 · 5 min · finetuning · llm · deeplearning

LoRA, Explained

A 7-billion-parameter model has 7 billion knobs.

Feb 23, 2026 · 5 min · finetuning · llm · deeplearning

The Ladder: Prompt, RAG, Fine-tune, Distill

Most fine-tuning projects should have stayed a prompt.

Feb 18, 2026 · 5 min · mcp · llm · engineering

A2A and MCP Together

If you only learn one distinction about the protocol landscape, make it this one: MCP connects an

Feb 11, 2026 · 5 min · mcp · llm · engineering

MCP Roots and File Access

Give a filesystem server your home directory and you've handed a language model your SSH keys, your

Feb 10, 2026 · 5 min · mcp · llm · engineering

Scaling MCP

A server that runs as a subprocess on your laptop never has a scaling problem.

Feb 5, 2026 · 5 min · mcp · llm · engineering

MCP Sampling, Progress, and Logging

Most introductions to MCP stop at tools, resources, and prompts and call it a day.

Jan 28, 2026 · 5 min · mcp · llm · engineering · security

MCP Security

A protocol that lets a language model run tools on your machine is a loaded gun pointed at your file

Jan 20, 2026 · 5 min · mcp · llm · engineering

MCP vs Function Calling

"Should I use MCP or function calling?

Jan 15, 2026 · 5 min · mcp · llm · engineering

MCP Clients and Host Integration

We built a server last time.

Jan 9, 2026 · 5 min · mcp · llm · engineering

Building an MCP Server from Scratch

The fastest way to understand a protocol is to make something speak it.

Jan 5, 2026 · 5 min · mcp · llm · engineering

What MCP Actually Is

Every couple of years something shows up wearing the word "standard" and promising to end integratio

Dec 31, 2025 · 5 min · agents · llm

Exploration and Discovery

Most agents take the obvious path every time. That'…

Dec 18, 2025 · 5 min · agents · llm

Prioritization

Give an agent one task and it does it. Give it three, and you…

Dec 11, 2025 · 5 min · agents · llm · eval

Evaluation and Monitoring

"It works." On whose machine, against which inputs,…

Dec 4, 2025 · 5 min · agents · llm · security

Guardrails and Safety Patterns

A support agent at a company I won't name had…

Nov 30, 2025 · 5 min · agents · llm

Reasoning Techniques

Three years ago, getting a model to reason meant trickin…

Nov 23, 2025 · 5 min · agents · llm

Resource-Aware Optimization

The first month, the agent cost $40 to run. It wa…

Nov 17, 2025 · 5 min · agents · llm

Inter-Agent Communication (A2A)

For a while in 2025, every team building a "m…

Nov 12, 2025 · 5 min · agents · llm

Human-in-the-Loop

Autonomy isn't a switch. It's a dial, and the whole craft i…

Nov 11, 2025 · 5 min · agents · llm

Exception Handling and Recovery

The agent booked the flight. Then the hotel A…

Nov 4, 2025 · 5 min · agents · llm

Goal Setting and Monitoring

"Be helpful." Try writing a test for that. You c…

Oct 30, 2025 · 5 min · agents · llm

Learning and Adaptation

Here's a thing that sounds like heresy: most agents p…

Oct 22, 2025 · 6 min · agents · llm

Memory Management

You tell the agent your name on Monday. By Wednesday it ask…

Oct 18, 2025 · 5 min · agents · llm · engineering

Multi-Agent Systems

Most multi-agent systems are a meeting that should have been an email.

Oct 11, 2025 · 5 min · agents · llm · engineering

Planning

You hand a planning agent the *what* — "organize the team offsite, budget's $8k, twelve people, some

Oct 7, 2025 · 5 min · agents · llm · engineering

Tool Use

A language model is a brain in a sealed jar.

Sep 29, 2025 · 5 min · agents · llm · engineering

Reflection

There's a demo that always lands.

Sep 20, 2025 · 5 min · agents · llm · engineering

Parallelization

Do the arithmetic on a slow agent and the answer is almost always the same: it's slow because it's w

Sep 15, 2025 · 4 min · agents · llm · engineering

Routing

A chain assumes you already know the path.

Sep 5, 2025 · 5 min · agents · llm · engineering

Prompt Chaining

Give one prompt five jobs and it will quietly do four.

Sep 2, 2025 · 6 min · rag · llm · production

RAG in Production

A support agent escalated a ticket because the assistant kept telling customers about a discount tha

Aug 28, 2025 · 5 min · rag · llm

Long Context vs RAG

Every time a model ships with a bigger context window, the same headline returns: RAG is dead.

Aug 20, 2025 · 6 min · rag · llm · eval

Evaluating RAG: RAGAS and Beyond

A team swaps in a new reranker and declares the RAG system "better.

Aug 13, 2025 · 6 min · rag · llm

Adaptive RAG: Matching Pipeline to Query

Eight posts in, we've built up an arsenal: hybrid search, reranking, HyDE, multi-query, the agentic

Aug 10, 2025 · 6 min · rag · llm · eval · search · graph

GraphRAG: Retrieval Over Knowledge Graphs

Ask a normal RAG system "what are the major themes across these 400 board meeting transcripts?

Aug 6, 2025 · 6 min · rag · llm · eval · search

Agentic RAG: When Retrieval Starts Thinking

Every pipeline in this series so far has been a conveyor belt.

Jul 30, 2025 · 5 min · rag · llm

Query Transformation: HyDE and Multi-Query

"why is it slow" That's a real query a real user typed into a real RAG system.…

Jul 24, 2025 · 5 min · rag · llm

Reranking: The Cheap Accuracy Win

Most accuracy improvements in RAG cost you something painful — a new index, a bigger model, a re-arc

Jul 19, 2025 · 6 min · rag · llm · search

Hybrid Search: BM25 Meets Dense Vectors

Dense vector search, the thing this whole series has been building on, has a stupid failure mode: it

Jul 11, 2025 · 6 min · rag · llm

Embeddings and Vector Stores, Demystified

An embedding is not a summary of meaning.

Jul 8, 2025 · 6 min · rag · llm

Chunking Strategies That Actually Matter

Pick a chunk size of 500 tokens and you've made a decision worth more than your choice of embedding

Jul 1, 2025 · 6 min · rag · llm

Naive RAG and Why It Disappoints

The first RAG demo always works.