AI Engineering Glossary
The definitive dictionary for modern AI architecture, training, and inference terminology.
Prompting
Chain of Thought (CoT)
A prompting strategy that elicits the model's intermediate reasoning steps before it commits to a final answer.
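A minimal sketch of what this looks like in practice: the helper below appends a reasoning cue to a question. The `build_cot_prompt` name and the specific cue phrase are illustrative; any LLM client can consume the resulting string.

```python
def build_cot_prompt(question: str) -> str:
    """Append a reasoning cue so the model emits intermediate steps
    before its final answer (zero-shot Chain of Thought)."""
    return (
        f"Q: {question}\n"
        "A: Let's think step by step."
    )

prompt = build_cot_prompt(
    "If a train travels 60 km in 40 minutes, what is its speed in km/h?"
)
print(prompt)
```

Few-shot variants instead prepend worked examples whose answers spell out their reasoning; the cue-phrase version above is the simplest zero-shot form.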
Inference
KV Cache (Key-Value Cache)
A mechanism used during autoregressive generation to store previously computed Keys and Values, preventing redundant calculations.
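A toy sketch of the idea for a single attention head, assuming random projection matrices `Wk`/`Wv` purely for illustration: each decoding step projects only the new token, while earlier keys and values are reused from the cache rather than recomputed.

```python
import numpy as np

d = 8
rng = np.random.default_rng(0)
Wk = rng.normal(size=(d, d))  # key projection (illustrative weights)
Wv = rng.normal(size=(d, d))  # value projection

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, x):
        # Project only the NEW token; earlier K/V stay cached.
        self.keys.append(x @ Wk)
        self.values.append(x @ Wv)

    def stacked(self):
        # Full K and V matrices for the attention computation.
        return np.stack(self.keys), np.stack(self.values)

cache = KVCache()
for _ in range(5):                    # five decoding steps
    cache.append(rng.normal(size=d))  # embedding of the new token
K, V = cache.stacked()
print(K.shape, V.shape)  # (5, 8) (5, 8)
```

Without the cache, step *t* would re-project all *t* tokens, making generation quadratic in sequence length; with it, each step does constant work per layer at the cost of storing K and V.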
Training
LoRA (Low-Rank Adaptation)
A highly efficient fine-tuning technique that freezes the base model weights and trains a small set of injected low-rank matrices.
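A sketch of a single LoRA layer, with illustrative shapes and scaling: the base weight `W` is frozen, and only the low-rank factors `A` and `B` would receive gradients. Zero-initializing `B` makes the adapter a no-op at the start of training.

```python
import numpy as np

d_in, d_out, r, alpha = 16, 16, 4, 8  # rank r << d (illustrative values)
rng = np.random.default_rng(0)

W = rng.normal(size=(d_in, d_out))    # frozen base weights
A = rng.normal(size=(d_in, r)) * 0.01 # trainable down-projection
B = np.zeros((r, d_out))              # trainable up-projection, zero-init

def lora_forward(x):
    # Frozen base path plus scaled low-rank adapter path.
    return x @ W + (alpha / r) * (x @ A @ B)

x = rng.normal(size=(2, d_in))
# With B zero-initialized, the layer starts out identical to the base model:
assert np.allclose(lora_forward(x), x @ W)
```

The payoff is parameter count: `A` and `B` together hold `r * (d_in + d_out)` values versus `d_in * d_out` for `W`, so only a small fraction of weights are trained and stored per task.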
Architecture
MoE (Mixture of Experts)
A neural network architecture that utilizes multiple specialized sub-networks ("experts"), routing tokens only to the most relevant ones.
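A toy top-1 MoE layer under simplifying assumptions: a learned gate scores the experts per token, and each token runs through only its highest-scoring expert. Production systems typically route to the top 2 experts and add load-balancing losses, which this sketch omits.

```python
import numpy as np

n_experts, d = 4, 8
rng = np.random.default_rng(0)
gate = rng.normal(size=(d, n_experts))               # router weights
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_forward(tokens):
    logits = tokens @ gate              # (n_tokens, n_experts) router scores
    choice = logits.argmax(axis=-1)     # top-1 expert per token
    out = np.empty_like(tokens)
    for e in range(n_experts):
        mask = choice == e
        if mask.any():
            # Only the selected expert's weights are used for these tokens.
            out[mask] = tokens[mask] @ experts[e]
    return out, choice

tokens = rng.normal(size=(6, d))
out, choice = moe_forward(tokens)
print(out.shape, choice)
```

Because each token activates one expert rather than all four, compute per token stays roughly constant while total parameter count scales with the number of experts.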
Architecture
RAG (Retrieval-Augmented Generation)
A framework that grounds an LLM by fetching external, up-to-date information from a database before generating a response.
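A minimal sketch of the retrieve-then-generate flow. The `docs` corpus and prompt template are invented for illustration, and word-overlap scoring stands in for the embedding-based retrieval a real system would use; generation itself is left to any LLM call.

```python
docs = [
    "LoRA freezes base weights and trains low-rank adapters.",
    "A KV cache stores keys and values from earlier decoding steps.",
    "MoE layers route each token to a few expert sub-networks.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank documents by word overlap with the query
    # (a stand-in for vector similarity search).
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str) -> str:
    # Ground the model by placing retrieved text ahead of the question.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What does a KV cache store?"))
```

Because the context is fetched at query time, the generator can cite information newer than its training data, and updating the knowledge base requires no retraining.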