The AI Dictionary: Technical Terms in Plain English
27 AI and ML terms explained for developers and everyone else.
Technology moves fast, and the jargon moves faster. If you're building with AI, you're probably hearing terms thrown around—some explained well, most not at all. This glossary cuts through the noise and explains what people actually mean when they use these words.
Core Concepts
Context Window — fixed-size buffer that holds all input to the model, like the AI's short-term memory. Similar to working memory in cognitive architecture, it has limited capacity that forces prioritization—if you give it too much, old stuff falls off the end.
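The "old stuff falls off the end" behavior can be sketched in a few lines. This is a minimal illustration, not any real SDK's API: the token counter here is a stand-in that just counts words, where a real system would use the model's tokenizer.

```python
def fit_context(messages, max_tokens, count=lambda m: len(m.split())):
    """Keep the most recent messages that fit the budget; older ones fall off."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = count(msg)
        if used + cost > max_tokens:
            break                           # the budget is full; drop the rest
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

# The oldest message is evicted once the budget (4 "tokens") is exceeded:
history = ["a b", "c d e", "f"]
print(fit_context(history, 4))  # ['c d e', 'f']
```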
Embeddings — semantic compression, like summarizing a book's meaning into a fixed-size fingerprint where similar books have similar fingerprints—neural networks pack meaning into dense vectors so the model can calculate which concepts are closest. Learn more about vector embeddings →
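"Similar fingerprints" usually means high cosine similarity between the vectors. A minimal sketch with toy 2-D vectors (real embeddings have hundreds or thousands of dimensions, produced by a model rather than hand-written):

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal ones.
    Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: "cat" and "kitten" point roughly the same way; "car" doesn't.
print(cosine([1.0, 0.1], [0.9, 0.2]))  # close to 1.0
print(cosine([1.0, 0.1], [0.1, 1.0]))  # much lower
```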
Hallucination — spurious confidence, like a person confidently describing a movie they've never seen and genuinely believing their invented plot—a fundamental LLM limitation where models generate plausible but false information. Recent research shows layered mitigation strategies can reduce hallucinations by 25-45%.
Grounding — source-anchored output, like citing your sources in an essay so readers can verify every claim against the documents you actually read—connecting generated text to retrieved documents reduces hallucination.
Retrieval & Search
RAG (Retrieval-Augmented Generation) — context injection pattern, like asking an LLM to answer from a specific book instead of its training data—you retrieve relevant passages first, then feed them to the model so it answers grounded in fresh facts. The 2025 Guide to RAG covers advanced patterns like SELF-RAG and Graph RAG.
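The retrieve-then-feed pattern can be sketched without any LLM at all. This toy retriever scores passages by word overlap purely for illustration; a production RAG pipeline would use vector similarity (see Vector Search below) and send the assembled prompt to a model.

```python
def retrieve(query, docs, k=2):
    """Rank documents by crude word overlap with the query (stand-in for vector search)."""
    qwords = set(query.lower().split())
    return sorted(docs, key=lambda d: len(qwords & set(d.lower().split())), reverse=True)[:k]

def build_rag_prompt(query, docs):
    """Inject the top passages into the prompt so the model answers from them."""
    context = "\n".join(f"- {p}" for p in retrieve(query, docs))
    return f"Answer using only these passages:\n{context}\n\nQuestion: {query}"

docs = ["cats purr when happy", "the sky is blue", "dogs bark loudly"]
print(build_rag_prompt("why do cats purr", docs))
```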
Vector Search — similarity lookup, like finding songs that "sound like" a reference track instead of matching exact title keywords—you convert text to coordinates in semantic space and find nearest neighbors. PostgreSQL benchmarking guide shows practical implementation.
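"Find nearest neighbors" is, at its simplest, a brute-force distance scan. A minimal sketch with hand-made 2-D coordinates (real systems use model-generated embeddings and approximate indexes like HNSW for speed):

```python
def nearest(query, index, k=1):
    """Return the k entries whose vectors are closest to the query (Euclidean)."""
    dist = lambda v: sum((q - x) ** 2 for q, x in zip(query, v))
    return sorted(index, key=lambda item: dist(item[1]))[:k]

# Toy semantic space: "cat" and "dog" live near each other, "car" far away.
index = [("cat", [1.0, 0.0]), ("dog", [0.9, 0.1]), ("car", [0.0, 1.0])]
print(nearest([1.0, 0.05], index, k=2))  # cat and dog, in that order
```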
Graphs (Knowledge Graphs) — relationship mapping, like a social network where people are nodes and friendships are connections—enables reasoning about "who knows whom" patterns that isolated profiles can't express.
LEANN — graph-based selective recomputation, like a smart vector index that only recalculates the parts of a knowledge graph that actually changed instead of rebuilding everything from scratch—cutting storage down to 3% of the original size.
Prompting & Learning
Prompt Engineering — input optimization, like the difference between asking a tutor vague questions versus specific ones with examples—crafting system prompts, examples, thinking steps, and output schemas to reliably steer LLM behavior toward your goal. See 2025 best practices.
Chain of Thought (CoT) — reasoning decomposition, like showing your work on a math problem before writing the final answer—model makes intermediate steps explicit, improving accuracy on complex problems. Advanced CoT techniques include self-consistency and tree-of-thoughts.
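Self-consistency, mentioned above, is simple at its core: sample several reasoning chains and take a majority vote on the final answers. A minimal sketch of the voting step (the sampled answers here are hard-coded; in practice they come from multiple model calls at non-zero temperature):

```python
from collections import Counter

def self_consistent_answer(sampled_answers):
    """Majority vote across final answers from several sampled reasoning chains."""
    return Counter(sampled_answers).most_common(1)[0][0]

# Three chains of thought; two converge on the same answer, so it wins.
print(self_consistent_answer(["42", "41", "42"]))  # 42
```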
Few-shot / Zero-shot Learning — example-based adaptation, like learning to write poetry either with sample poems (few-shot) or just a genre description (zero-shot)—model generalizes patterns without updating weights.
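"Sample poems" in practice just means worked examples placed before the real input in the prompt. A minimal few-shot prompt builder (the Input/Output template is one common convention, not a required format):

```python
def few_shot_prompt(examples, query):
    """Prepend worked input/output pairs so the model can generalize the pattern."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\n\nInput: {query}\nOutput:"

examples = [("happy", "HAPPY"), ("sad", "SAD")]
print(few_shot_prompt(examples, "calm"))
# Zero-shot would skip the examples and describe the task in words instead.
```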
Fine-tuning vs In-context Learning — weight adjustment versus prompt engineering, like permanently rewiring a neural network versus temporarily coaching it through conversation.
Curriculum Learning — training data ordering strategy, like learning math, where addition comes before calculus. Start with simple examples and progressively increase complexity so the model masters the basics before tackling hard patterns.
Reinforcement Learning — trial-and-error optimization, like learning by doing instead of being told—an agent explores, a reward signal guides its policy updates, and effective behavior emerges from experience rather than explicit programming.
Agent Architecture
Tool Use / Function Calling — LLM-invoked actions, like giving an AI permission to use a calculator—the model outputs structured requests (function name + arguments) that execute real operations like searches, API calls, or file writes instead of just generating text.
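The "structured request" is typically JSON with a function name and arguments, which the host application parses and dispatches. A minimal sketch of the dispatch side (the `{"name": ..., "arguments": ...}` shape is a common convention, not a universal standard, and the model call that produces it is omitted):

```python
import json

# Registry of tools the model is allowed to invoke.
TOOLS = {"add": lambda a, b: a + b}

def execute(call_json):
    """Parse a model-emitted function call and run the matching tool."""
    call = json.loads(call_json)
    return TOOLS[call["name"]](**call["arguments"])

# The model generated this string instead of a plain-text answer:
print(execute('{"name": "add", "arguments": {"a": 2, "b": 3}}'))  # 5
```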
Agent Orchestration — multi-agent routing, like coordinating a team where the project manager assigns specialized tasks, tracks handoffs, and merges results—routes work to specialized agents, manages dependencies, and handles failures across collaborative AI workers. Azure's design patterns guide covers hierarchical, graph-based, and parallel fan-out patterns.
Swarm Orchestration — distributed system coordination, like many AI workers tackling a big problem together where each handles their piece and shares what they find. Multiple independent agents work on shared goals with message-passing for coordination—similar to MapReduce but for reasoning tasks.
Agent Harness — container for AI agent execution, like a runtime sandbox that wraps an agent with tool access, context memory, execution guardrails, and state management so the agent can think and act safely within defined boundaries.
Cognitive Dataflow (CDO) — DAG (Directed Acyclic Graph) execution combined with the Actor Model, like breaking down thinking into steps that can happen at the same time when they don't depend on each other. Nodes are reasoning steps and edges are data dependencies, enabling parallel execution where dependencies allow. See our post on YAML to Agentic Runners for implementation patterns.
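The nodes-and-dependencies idea can be sketched with the standard library's topological sorter. This sequential sketch shows the dependency-ordered execution; a real runtime would run independent nodes concurrently (and the node/edge names here are invented for illustration):

```python
from graphlib import TopologicalSorter

def run_dataflow(nodes, deps):
    """nodes: name -> fn(inputs dict). deps: name -> upstream names.
    Executes each node after its dependencies, feeding it their results."""
    results = {}
    for name in TopologicalSorter(deps).static_order():
        inputs = {d: results[d] for d in deps.get(name, [])}
        results[name] = nodes[name](inputs)
    return results

# "b" and "c" both depend only on "a", so a parallel runtime could run them together.
nodes = {"a": lambda i: 1, "b": lambda i: i["a"] + 1,
         "c": lambda i: i["a"] * 10, "d": lambda i: i["b"] + i["c"]}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(run_dataflow(nodes, deps)["d"])  # 12
```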
Feedback Loop — closed-loop control system where output influences subsequent input, like learning from what just happened to do better next time. In AI context, model output affects context for next inference, enabling iterative refinement.
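Iterative refinement is a loop where each round's critique feeds the next attempt. A minimal sketch with the critic and improver injected as plain functions; in an AI system both would be model calls.

```python
def refine(draft, critique, improve, max_rounds=3):
    """Closed loop: critique the draft, feed the issues back, repeat until clean."""
    for _ in range(max_rounds):
        issues = critique(draft)
        if not issues:
            break                      # output satisfied the critic; stop early
        draft = improve(draft, issues)  # output becomes input for the next round
    return draft

# Toy critic: a draft is "done" once it ends with a period.
critique = lambda d: [] if d.endswith(".") else ["missing period"]
improve = lambda d, issues: d + "."
print(refine("hello", critique, improve))  # hello.
```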
Context & Memory
Context Injection — middleware pattern applied to AI, like automatically adding helpful information to what the AI sees based on what's happening right now. Intercept the context before it reaches the model and enrich it dynamically—like a helpful assistant who hands you the right file before you ask.
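The middleware shape is a pipeline of enrichers that each get a chance to add to the request before it reaches the model. A minimal sketch (the enricher signature and request dict are illustrative, not any particular framework's API):

```python
def with_context(enrichers):
    """Build a middleware pipeline: each enricher may add fields to the request."""
    def inject(request):
        for enrich in enrichers:
            request = enrich(request)
        return request
    return inject

# One enricher stamps the current task; another attaches a relevant file.
inject = with_context([
    lambda r: {**r, "task": "code review"},
    lambda r: {**r, "file": "auth.py"},
])
print(inject({"prompt": "what's wrong here?"}))
```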
Memory Federation — Facade pattern over multiple storage backends, like one way to ask for memories regardless of where they're stored. Query interface abstracts whether data comes from vector DB, graph DB, key-value store, or file system.
Semantic Conditions — predicate evaluation via LLM inference, like letting the AI decide if something is true using common sense instead of just checking boxes. Instead of boolean expressions, natural language conditions are evaluated through reasoning.
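The pattern is a branch whose condition is judged by a model instead of a boolean expression. In this sketch the LLM is injected as a plain `judge` function so the example runs without one; the prompt template and yes/no convention are illustrative assumptions.

```python
def semantic_if(condition, context, judge):
    """Evaluate a natural-language predicate via an LLM judge (injected as `judge`)."""
    prompt = (f"Context: {context}\n"
              f"Is the following true? {condition}\n"
              f"Answer yes or no.")
    return judge(prompt).strip().lower().startswith("yes")

# Stub judge standing in for a real model call:
stub = lambda prompt: "Yes, the tone is clearly angry."
if semantic_if("the user is frustrated", "WHY IS THIS STILL BROKEN?!", stub):
    print("escalate to a human")
```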
Communication & Integration
Bi-directional Communication — Observer + Pub/Sub hybrid, like a two-way conversation instead of asking questions and waiting for answers. Instead of request-response, both parties can initiate messages—think WebSocket vs HTTP, where either side can push updates.
Skills — Claude Code extension pattern, like markdown instruction packages that drop specialized knowledge, workflows, or custom tool integrations into Claude's context when you invoke them—teaching the AI how to solve domain-specific problems without modifying the core system.
Hooks — lifecycle injection points, like event listeners that trigger custom logic at specific moments—before you commit code, after you install dependencies, when an error happens—letting you enforce patterns without touching the core codebase.
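The listener registry behind this pattern fits in a few lines. A minimal sketch (the event names are invented; real hook systems also handle ordering, errors, and the ability to veto an action):

```python
class Hooks:
    """Register callbacks on named lifecycle events and fire them in order."""

    def __init__(self):
        self.listeners = {}

    def on(self, event, fn):
        self.listeners.setdefault(event, []).append(fn)

    def fire(self, event, payload):
        for fn in self.listeners.get(event, []):
            fn(payload)

hooks = Hooks()
hooks.on("pre-commit", lambda p: print(f"linting {p}"))
hooks.fire("pre-commit", "auth.py")  # custom logic runs, core code untouched
```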
Polyglot Runners (Poly) — language-agnostic daemon orchestration, like a process supervisor that runs Python, Node, Rust, and other language services as managed background workers with automatic restarts, health monitoring, and centralized logging.
Further Reading
- Eden AI: The 2025 Guide to RAG
- Azure: AI Agent Design Patterns
- arXiv: Cutting-Edge Techniques to Reduce LLM Hallucinations
- Galileo: Chain-of-Thought Prompting Techniques
- Instaclustr: Vector Search with PostgreSQL
- SciPy 2025: RAG Tutorial
A living resource for anyone building or thinking about AI. We update this as the field evolves.
Related Posts
The Architecture of Autonomous Flight
How we built a neural-symbolic hybrid system to control manned aircraft in real time.
From YAML to Deterministic + Agentic Runners
Why disk-based orchestration beats fancy state management for multi-agent systems.
The Growth Architect: Psychology-Driven Marketing for AI Products
A framework for explosive growth combining behavioral psychology, viral mechanics, and data-driven optimization.