AI Glossary

Plain-English definitions of 62 AI terms — tailored for PE investors and manufacturing operators.

Categories: Core AI, Architecture, Operations, Safety

A

AI Agent [Core AI]

A software program that uses an AI model to perceive its environment, make decisions, and take actions autonomously to achieve a goal. Agents can browse the web, write code, call APIs, and chain together multi-step tasks without step-by-step human guidance.

Agentic Workflow [Operations]

A multi-step process in which an AI model acts autonomously across several stages — planning, tool use, reflection, and iteration — rather than simply responding to a single prompt. Common in complex automation pipelines such as document review or deal sourcing.

Alignment [Safety]

The process of ensuring an AI model's behavior matches human intent and safety standards. In a business context, this ensures the model acts consistently with company policy and does not generate harmful or biased output.

Anthropic [Core AI]

The AI safety company that created Claude. Founded in 2021 by former OpenAI researchers, Anthropic focuses on building reliable, interpretable, and steerable AI systems.

Attention Mechanism [Architecture]

The core innovation behind the Transformer architecture. It allows the model to weigh the importance of different words in a sentence relative to each other, regardless of their distance. This enables the model to understand context and nuance effectively.
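
As a rough illustration of that weighting step, here is a minimal sketch of scaled dot-product attention in plain Python. The three 2-dimensional token vectors are made up for the example, and real models use learned projections and much larger dimensions, so treat this as a toy under those assumptions rather than how any particular model is implemented.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this query against every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical vectors for three tokens; tokens 0 and 2 are identical.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = attention(tokens, tokens, tokens)
print(weights[0])  # token 0 weights tokens 0 and 2 equally, and more than token 1
```

Note that token 0 attends to token 2 as strongly as to itself even though they are not adjacent, which is the "regardless of distance" property in miniature.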

Auto-Regressive [Architecture]

A property of models that generate output one token at a time, using the previously generated tokens as context for the next one. This explains why LLMs "stream" text and why total generation time grows with output length.
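
The loop below is a toy sketch of that one-token-at-a-time process. The `NEXT_TOKEN` table is a hypothetical stand-in for a trained model, and it only looks at the most recent token, whereas a real LLM conditions on the whole context.

```python
# Hypothetical lookup table standing in for a trained model:
# it maps the most recent token to the next token.
NEXT_TOKEN = {
    "<start>": "revenue",
    "revenue": "grew",
    "grew": "last",
    "last": "year",
    "year": "<end>",
}

def generate(max_tokens=10):
    """Generate tokens one at a time, feeding each output back in as context."""
    context = ["<start>"]
    steps = 0
    while steps < max_tokens:
        next_token = NEXT_TOKEN[context[-1]]  # one "model call" per token
        steps += 1
        if next_token == "<end>":
            break
        context.append(next_token)
    return context[1:], steps

tokens, steps = generate()
print(" ".join(tokens), "| model calls:", steps)
```

Each output token costs one pass through the model, so a longer answer means more passes.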

B

Benchmarks [Operations]

Standardized tests used to evaluate and compare the performance of AI models (e.g., MMLU for general knowledge, HumanEval for coding). For enterprise selection, benchmarks provide a baseline for comparing open-source vs. closed-source options.

C

Chain of Thought (CoT) [Operations]

A prompting technique where the model is encouraged to "show its work" or reason step-by-step before providing a final answer. This often improves accuracy on complex logic, math, and strategic reasoning tasks.

Chunking [Architecture]

The process of splitting large documents into smaller segments before embedding them and storing the results in a vector database. Chunk size trades context richness against retrieval precision, which makes it a key tuning parameter in RAG pipelines.
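
A minimal sketch of one common approach, fixed-size chunks with a small overlap, might look like the following. The function name, the word-based splitting, and the default sizes are assumptions chosen for illustration; production pipelines often split on tokens, sentences, or document sections instead.

```python
def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks

# Tiny made-up document of ten words so the overlap is easy to see.
document = " ".join(f"word{i}" for i in range(10))
chunks = chunk_words(document, chunk_size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary is less likely to be cut off from its context.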

Claude [Core AI]

Anthropic's family of large language models, including Claude 3 Haiku, Sonnet, and Opus. Claude is designed with a focus on safety, long context handling (up to 200K tokens), and following nuanced instructions — making it well-suited for document-heavy PE workflows.

Closed-Source Models [Architecture]

Proprietary models (like GPT-4 or Claude 3.5) where the weights and training data are not public. They typically offer higher performance and ease of use but come with vendor lock-in and data privacy considerations.

Coding Agents [Architecture]

Specialized AI agents capable of writing, debugging, and executing code autonomously. Tools like Cursor or Devin use these to automate software development tasks, allowing non-technical operators to build internal tools.

Confidence Score [Operations]

A numeric value (typically 0–1) that indicates how certain an AI model is about a given output or classification. In manufacturing quality control, confidence scores help operators decide when to route items for human review versus automated pass-through.
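
The routing decision described above can be sketched as a simple threshold check. The 0.9 threshold, the part IDs, and the scores are hypothetical; in practice the threshold would be tuned against the cost of a missed defect.

```python
def route(confidence, threshold=0.9):
    """Return where an item should go based on the model's confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return "auto_pass" if confidence >= threshold else "human_review"

# Hypothetical inspection results: (part ID, model confidence).
inspections = [("part-001", 0.97), ("part-002", 0.62), ("part-003", 0.90)]
decisions = {part: route(score) for part, score in inspections}
print(decisions)
```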

Constitutional AI [Safety]

A training method developed by Anthropic where a model is trained to follow a set of high-level principles (a "constitution") rather than just human feedback. This aims to make models more helpful, harmless, and honest.

Context Window [Architecture]

The maximum amount of text (measured in tokens) that a language model can process in a single interaction. A 200K-token context window can hold roughly 150,000 words — enough to fit a full CIM, financial model, and management presentation simultaneously.
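
The arithmetic behind that estimate can be sketched as follows, assuming the common rule of thumb of about 0.75 English words per token (200,000 tokens x 0.75 = 150,000 words). The document word counts are hypothetical, and real token counts depend on the model's tokenizer.

```python
WORDS_PER_TOKEN = 0.75  # rough English average; actual ratios vary by tokenizer

def estimate_tokens(word_count):
    """Estimate how many tokens a document of `word_count` words uses."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_in_context(word_counts, context_window=200_000):
    """Check whether a set of documents fits in one context window."""
    total = sum(estimate_tokens(count) for count in word_counts)
    return total, total <= context_window

# Hypothetical word counts for a CIM, a financial model export, and a deck.
total_tokens, fits = fits_in_context([60_000, 25_000, 15_000])
print(total_tokens, fits)
```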

Copilot [Operations]

An AI assistant embedded within a specific workflow or application (e.g., Microsoft 365 Copilot, GitHub Copilot). It assists the human user rather than taking full control, increasing productivity in drafting, coding, or analysis.

CUDA [Architecture]

A parallel computing platform and programming model created by NVIDIA. It is the industry standard for running AI workloads on GPUs. Understanding CUDA dependencies is important when managing on-premise AI infrastructure.

D

Data Sovereignty [Safety]

The principle that data is subject to the laws and governance structures of the nation where it is collected. For global firms, this dictates where AI models can process data and where vector databases must be hosted.

E

Embeddings [Architecture]

Mathematical representations of text as high-dimensional numerical vectors. Similar concepts cluster near each other in vector space, enabling semantic search — finding relevant passages even when the exact words differ.
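
To make "cluster near each other" concrete, here is a small sketch that compares vectors with cosine similarity, a standard closeness measure for embeddings. The 3-dimensional vectors are invented for the example; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up "embeddings" for illustration only.
EMBEDDINGS = {
    "purchase agreement": [0.9, 0.1, 0.0],
    "acquisition contract": [0.8, 0.2, 0.1],
    "conveyor belt maintenance": [0.0, 0.2, 0.9],
}

def most_similar(query, embeddings):
    """Return the (phrase, similarity) pair closest to `query`, excluding itself."""
    query_vec = embeddings[query]
    others = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    return max(others, key=lambda pair: pair[1])

name, similarity = most_similar("purchase agreement", EMBEDDINGS)
print(name, round(similarity, 3))
```

The two contract phrases share no words, yet their vectors point in nearly the same direction, which is what lets semantic search match them.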

Evaluations (Evals) [Operations]

A systematic process to test an AI application's performance on specific business use cases. Unlike generic benchmarks, custom evals measure accuracy, tone, and safety on your proprietary data and workflows.
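
A very small eval harness, sketched under the assumption that exact-match accuracy is good enough for the task, could look like this. `fake_model` is a hypothetical stand-in for a real model call, and the test cases are invented; real evals usually need fuzzier scoring for tone or long-form answers.

```python
def run_eval(model_fn, test_cases):
    """Return the fraction of cases where the model output matches the expected answer."""
    passed = 0
    for prompt, expected in test_cases:
        answer = model_fn(prompt)
        if answer.strip().lower() == expected.strip().lower():
            passed += 1
    return passed / len(test_cases)

def fake_model(prompt):
    """Hypothetical stand-in for a real model call, with canned answers."""
    canned = {
        "Is EBITDA a GAAP measure?": "No",
        "What does CIM stand for?": "Confidential Information Memorandum",
    }
    return canned.get(prompt, "I don't know")

TEST_CASES = [
    ("Is EBITDA a GAAP measure?", "no"),
    ("What does CIM stand for?", "confidential information memorandum"),
    ("What does LOI stand for?", "letter of intent"),
]

accuracy = run_eval(fake_model, TEST_CASES)
print(f"accuracy: {accuracy:.2f}")
```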

F

Fine-tuning [Architecture]

A training process that adapts a pre-trained foundation model to a specific domain or task by exposing it to additional labeled examples. Fine-tuned models can match a firm's terminology, style, and analytical frameworks more precisely than off-the-shelf models.

Foundation Model [Core AI]

A large AI model trained on broad internet-scale data that can be adapted for many downstream tasks. GPT-4o, Claude, Gemini, and Llama are foundation models — they serve as the base layer on top of which specialized applications are built.

G

GPT-4o [Core AI]

OpenAI's flagship multimodal language model capable of processing text, images, and audio. The "o" stands for "omni," reflecting its ability to handle multiple input types within a single interaction.

GPU (Graphics Processing Unit) [Architecture]

Specialized hardware originally designed for graphics but now essential for AI due to its ability to perform massive parallel calculations. The NVIDIA H100 is currently the standard for training and running state-of-the-art models.

Grounding [Safety]

The technique of connecting model outputs to verifiable sources or data to reduce hallucination. RAG is a primary method of grounding: it steers the model to rely on retrieved documents rather than its training memory.

H

H100 / H200 [Architecture]

NVIDIA's high-performance data center GPUs designed specifically for AI workloads. Access to these chips (compute availability) is often a bottleneck for companies training their own models or running high-throughput inference.

Hallucination [Safety]

When an AI model confidently generates information that is factually incorrect or fabricated. Hallucinations are a key reason human review remains essential in high-stakes PE workflows such as IC memos, legal review, and financial modeling.

Human-in-the-Loop [Safety]

A workflow design pattern where human judgment is embedded at critical decision points in an AI-powered process. For example, an AI drafts an investment memo, but a partner reviews and approves before it is shared with the IC.

I

Inference [Core AI]

The process of running a trained AI model on new input to generate an output — as opposed to training, which is the process of building the model. When you send a prompt to Claude, you are performing inference.

Instruction Tuning [Architecture]

A training phase where the model is taught to follow specific instructions (e.g., "Summarize this," "Write code for X"). This bridges the gap between raw text prediction and a helpful assistant.

K

Knowledge Graph [Architecture]

A structured representation of information connecting entities (people, companies, concepts) and their relationships. Combining Knowledge Graphs with LLMs (GraphRAG) reduces hallucination and improves reasoning on complex data.

L

LangChain [Architecture]

An open-source orchestration framework used to build AI applications. It simplifies connecting LLMs to other data sources, memory, and tools, making it a popular choice for building internal business apps.

Latency [Operations]

The time delay between sending a request to an AI model and receiving the first token of the response. For real-time applications (like customer service voice agents), low latency is critical.

Latent Space [Architecture]

A mathematical concept where similar data points are mapped close together in a multi-dimensional space. "Moving" through latent space allows models to find relationships between concepts that aren't explicitly linked in the text.

LLM [Core AI]

Large Language Model — an AI system trained on massive text corpora to understand and generate human language. LLMs like GPT-4o and Claude power most of today's generative AI applications, from chatbots to automated document analysis.

M

MCP [Architecture]

Model Context Protocol — an open standard that allows AI models to securely connect to external tools, databases, and APIs. MCP enables agents to pull live CRM data, write to spreadsheets, or query internal knowledge bases without custom integrations for each system.

Mixture of Experts (MoE) [Architecture]

A model architecture that uses multiple specialized sub-models ("experts") and a gating mechanism to choose which experts to use for each token. Because only a few experts run per token, very large models (such as Mixtral and, reportedly, GPT-4) can stay efficient at inference time.
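
The gating step can be sketched as top-k routing: pick the highest-scoring experts, renormalize their weights, and blend their outputs. Everything here is a toy assumption: the four "experts" are plain functions, and the gate scores are passed in by hand, whereas a real model computes them with a learned router for each token.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical experts: each is a simple function instead of a neural sub-network.
EXPERTS = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x * x,
    lambda x: -x,
]

def moe_forward(x, gate_scores, top_k=2):
    """Route input x to the top_k experts and blend their outputs."""
    # Indices of the top_k highest gate scores (ties keep their original order).
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the gate weights over the chosen experts only.
    weights = softmax([gate_scores[i] for i in chosen])
    output = sum(w * EXPERTS[i](x) for w, i in zip(weights, chosen))
    return output, chosen

output, chosen = moe_forward(3.0, gate_scores=[2.0, 2.0, -1.0, 0.5])
print(output, chosen)
```

Only two of the four experts are evaluated for this input, which is where the inference-time savings come from.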

Multi-agent System [Architecture]

A design pattern where multiple specialized AI agents collaborate on a complex task — one agent might screen deals, another analyzes financials, a third drafts the IC memo. Coordination between agents enables parallel processing and specialization.

Multimodal [Core AI]

The ability of an AI model to process and generate multiple types of media simultaneously, such as text, images, audio, and video. Examples include GPT-4o and Gemini 1.5.

O

Orchestration [Architecture]

The coordination of multiple AI models, data sources, and tools to complete a workflow. It involves managing the logic, error handling, and data flow between the user and the underlying AI systems.

P

Parameters [Core AI]

The internal variables (weights) learned by the model during training. The number of parameters (e.g., 70B, 400B) generally correlates with the model's reasoning capability and complexity.

Pre-training [Core AI]

The initial, computationally expensive phase of training a model on a massive dataset to learn general language patterns. This creates the "base model" before any fine-tuning or alignment occurs.

Prompt [Core AI]

The input text you provide to an AI model. A well-structured prompt includes context (who you are and the situation), specificity (concrete details), scope (what to include or exclude), and format (how to structure the output).

Prompt Engineering [Operations]

The practice of designing, testing, and refining prompts to reliably produce high-quality outputs from AI models. Effective prompt engineering is a core skill for PE professionals who want consistent, accurate results from AI tools.

Prompt Injection [Safety]

A security vulnerability where malicious inputs manipulate the model into ignoring its instructions and performing unauthorized actions. This is a primary security concern for enterprise AI deployment.

Q

Quantization [Architecture]

The process of reducing the precision of a model's parameters (e.g., from 16-bit to 4-bit) to decrease memory usage and increase speed with minimal loss in accuracy. This enables running powerful models on consumer hardware.
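
As a rough sketch of the idea, the code below applies simple symmetric quantization: it maps floating-point weights to small signed integers with one shared scale factor, then maps them back. The weight values are made up, and production schemes (per-group scales, calibration data, and so on) are more elaborate than this.

```python
def quantize(weights, bits=4):
    """Map floats to signed integers in [-(2**(bits-1) - 1), 2**(bits-1) - 1]."""
    max_int = 2 ** (bits - 1) - 1  # 7 for 4-bit
    scale = max(abs(w) for w in weights) / max_int
    if scale == 0:
        scale = 1.0  # all-zero weights: any scale works
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]

# Hypothetical weights for illustration.
weights = [0.70, -0.30, 0.10, 0.02]
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)
print(quantized)
print([round(r, 2) for r in restored])
```

The smallest weight rounds to zero, which is the kind of precision loss that quantization trades for a smaller memory footprint.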

R

RAG [Architecture]

Retrieval-Augmented Generation — an architecture that combines a language model with a search system over a private knowledge base. When you ask a question, relevant documents are retrieved first and then given to the model as context, reducing hallucinations and grounding responses in your actual data.
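
The retrieve-then-prompt flow can be sketched as below. To stay self-contained, the example scores passages by shared words, which is an assumed stand-in for the embedding-based search a real RAG system would use. The knowledge base sentences and the prompt wording are invented, and the final step of sending the prompt to a model is left out.

```python
import re

def words(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query, passage):
    """Count shared words; a simple stand-in for embedding similarity."""
    return len(words(query) & words(passage))

def retrieve(query, passages, top_k=1):
    """Return the top_k passages most relevant to the query."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:top_k]

def build_prompt(query, passages, top_k=1):
    """Put the retrieved passages in front of the question as context."""
    context = "\n".join(retrieve(query, passages, top_k))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical private knowledge base.
KNOWLEDGE_BASE = [
    "The largest customer accounts for 24 percent of revenue.",
    "The plant runs two shifts and has three CNC lines.",
    "Management expects capex of 4 million next year.",
]

question = "What share of revenue comes from the largest customer?"
prompt = build_prompt(question, KNOWLEDGE_BASE)
print(prompt)
```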

Reinforcement Learning [Core AI]

A training paradigm in which an AI model learns by receiving rewards for correct behavior and penalties for incorrect behavior. RLHF (Reinforcement Learning from Human Feedback) is a widely used technique for aligning LLMs like Claude and ChatGPT with human preferences.

RPA [Operations]

Robotic Process Automation — software that mimics rule-based human actions in a user interface (clicking, copying, pasting). RPA handles deterministic workflows, while AI handles judgment-heavy tasks. Combined, they cover most back-office automation use cases.

S

Semantic Search [Architecture]

A search approach that finds conceptually similar content rather than exact keyword matches. Powered by embeddings, semantic search can surface a relevant contract clause even if the search query uses different terminology than the original document.

SLMs (Small Language Models) [Core AI]

Compact AI models (typically <10B parameters) designed to run locally or on edge devices. They are cost-effective for specific, narrow tasks where a massive model is unnecessary.

Structured Output [Architecture]

Constraining an AI model to generate data in a specific machine-readable format, such as JSON or XML. This is critical for integrating AI responses directly into automated workflows and software systems.
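
On the consuming side, a workflow typically parses and validates the model's output before passing it on. The sketch below uses Python's standard `json` module; the field names, the expected types, and the sample response are hypothetical.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"company": str, "revenue_musd": (int, float), "flags": list}

def parse_model_output(raw_text):
    """Parse model output as JSON and check it has the expected fields and types."""
    data = json.loads(raw_text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data

# Made-up model response for illustration.
raw = '{"company": "Acme Castings", "revenue_musd": 42.5, "flags": ["customer concentration"]}'
record = parse_model_output(raw)
print(record["company"], record["revenue_musd"])
```

Rejecting malformed output at this boundary keeps a bad response from propagating into downstream systems.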

Synthetic Data [Architecture]

Artificially generated data used to train models when real-world data is scarce, expensive, or sensitive. It is increasingly used to train reasoning models and overcome data privacy hurdles.

System Prompt [Operations]

A hidden set of instructions sent to an AI model before the user's message. System prompts define the AI's persona, constraints, and behavioral guidelines — for example, "You are a PE analyst focused on manufacturing deals. Always flag customer concentration above 20%."

T

Temperature [Operations]

A parameter (typically 0 to 1, with some providers allowing higher values) that controls the randomness of an AI model's output. Low temperature (near 0) produces near-deterministic, consistent responses suited for financial analysis. High temperature produces more creative, varied outputs suited for brainstorming.
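
Under the hood, temperature usually works by dividing the model's raw scores (logits) before they are turned into probabilities. The sketch below shows that effect with three made-up logits; the function name is an assumption, not any provider's API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as 'always pick the top token'")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.1)
hot = softmax_with_temperature(logits, 2.0)
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

At low temperature nearly all probability lands on the top-scoring token, while at high temperature the alternatives get a realistic chance of being sampled.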

Throughput [Operations]

The rate at which a system processes requests, usually measured in tokens per second. High throughput is essential for batch processing tasks like summarizing thousands of documents.
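
Throughput and latency are easy to confuse, so here is a small sketch that computes both from the same streamed response. It assumes latency means time to first token and throughput means tokens divided by total elapsed time; the timestamps are invented.

```python
def stream_metrics(request_sent_at, token_arrival_times):
    """Compute latency (time to first token) and throughput (tokens per second)."""
    if not token_arrival_times:
        raise ValueError("need at least one token arrival time")
    latency = token_arrival_times[0] - request_sent_at
    total_time = token_arrival_times[-1] - request_sent_at
    throughput = len(token_arrival_times) / total_time
    return latency, throughput

# Hypothetical timestamps in seconds: first token at 0.5 s, then one every 0.25 s.
arrivals = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5]
latency, throughput = stream_metrics(0.0, arrivals)
print(f"latency: {latency:.2f} s, throughput: {throughput:.1f} tokens/s")
```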

Token [Core AI]

The basic unit of text that AI models process, roughly 4 characters or 0.75 words in English. Model pricing, context window limits, and throughput are all measured in tokens. A typical CIM might be 50,000–150,000 tokens.

Training Data [Core AI]

The dataset used to train an AI model. Foundation models are trained on trillions of tokens of internet text, books, and code. The composition and quality of training data heavily influences a model's strengths, biases, and knowledge cutoff.

Transformer Architecture [Architecture]

The deep learning architecture introduced by Google researchers in 2017 that underpins nearly all modern LLMs. Its ability to process data in parallel (unlike previous sequential models) enabled the current AI boom.

V

Vector Database [Architecture]

A specialized database that stores and queries embeddings (vector representations of text). Vector databases like Pinecone, Weaviate, and Chroma power the retrieval step in RAG systems, enabling fast semantic search over large document collections.

W

Workflow Automation [Operations]

The use of software to automatically execute a series of tasks that would otherwise require manual effort. In PE and portco contexts, workflow automation tools like Make.com, Zapier, and n8n connect AI models with CRMs, spreadsheets, email, and internal systems.

Z

Zero-shot Learning [Core AI]

The ability of an AI model to perform a task it was never explicitly trained on, relying solely on instructions in the prompt. Zero-shot capability is a hallmark of large foundation models — you can ask Claude to draft an IC memo without providing any examples.