AI Glossary
Plain-English definitions of 62 AI terms — tailored for PE investors and manufacturing operators.
A
A software program that uses an AI model to perceive its environment, make decisions, and take actions autonomously to achieve a goal. Agents can browse the web, write code, call APIs, and chain together multi-step tasks without step-by-step human guidance.
A multi-step process in which an AI model acts autonomously across several stages — planning, tool use, reflection, and iteration — rather than simply responding to a single prompt. Common in complex automation pipelines such as document review or deal sourcing.
The process of ensuring an AI model's behavior matches human intent and safety standards. In a business context, this ensures the model acts consistently with company policy and does not generate harmful or biased output.
The AI safety company that created Claude. Founded in 2021 by former OpenAI researchers, Anthropic focuses on building reliable, interpretable, and steerable AI systems.
The core innovation behind the Transformer architecture. It allows the model to weigh the importance of different words in a sentence relative to each other, regardless of their distance. This enables the model to understand context and nuance effectively.
A property of models that generate output one token at a time, using the previously generated tokens as context for the next one. This explains why LLMs "stream" text and why generation speed (latency) depends on output length.
B
Standardized tests used to evaluate and compare the performance of AI models (e.g., MMLU for general knowledge, HumanEval for coding). For enterprise selection, benchmarks provide a baseline for comparing open-source vs. closed-source options.
C
A prompting technique where the model is encouraged to "show its work" or reason step-by-step before providing a final answer. This significantly improves accuracy on complex logic, math, and strategic reasoning tasks.
The process of splitting large documents into smaller segments before embedding them in a vector database. Proper chunk size balances context richness with retrieval precision — a key tuning parameter in RAG pipelines.
Anthropic's family of large language models, including Claude 3 Haiku, Sonnet, and Opus. Claude is designed with a focus on safety, long context handling (up to 200K tokens), and following nuanced instructions — making it well-suited for document-heavy PE workflows.
Proprietary models (like GPT-4 or Claude 3.5) where the weights and training data are not public. They typically offer higher performance and ease of use but come with vendor lock-in and data privacy considerations.
Specialized AI agents capable of writing, debugging, and executing code autonomously. Tools like Cursor or Devin use these to automate software development tasks, allowing non-technical operators to build internal tools.
A numeric value (typically 0–1) that indicates how certain an AI model is about a given output or classification. In manufacturing quality control, confidence scores help operators decide when to route items for human review versus automated pass-through.
A training method developed by Anthropic where a model is trained to follow a set of high-level principles (a "constitution") rather than just human feedback. This aims to make models more helpful, harmless, and honest.
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. A 200K-token context window can hold roughly 150,000 words — enough to fit a full CIM, financial model, and management presentation simultaneously.
An AI assistant embedded within a specific workflow or application (e.g., Microsoft 365 Copilot, GitHub Copilot). It assists the human user rather than taking full control, increasing productivity in drafting, coding, or analysis.
A parallel computing platform and programming model created by NVIDIA. It is the industry standard for running AI workloads on GPUs. Understanding CUDA dependencies is important when managing on-premise AI infrastructure.
D
The concept that data is subject to the laws and governance structures within the nation it is collected. For global firms, this dictates where AI models can process data and where vector databases must be hosted.
E
Mathematical representations of text as high-dimensional numerical vectors. Similar concepts cluster near each other in vector space, enabling semantic search — finding relevant passages even when the exact words differ.
A systematic process to test an AI application's performance on specific business use cases. Unlike generic benchmarks, custom evals measure accuracy, tone, and safety on your proprietary data and workflows.
F
A training process that adapts a pre-trained foundation model to a specific domain or task by exposing it to additional labeled examples. Fine-tuned models can match a firm's terminology, style, and analytical frameworks more precisely than off-the-shelf models.
A large AI model trained on broad internet-scale data that can be adapted for many downstream tasks. GPT-4o, Claude, Gemini, and Llama are foundation models — they serve as the base layer on top of which specialized applications are built.
G
OpenAI's flagship multimodal language model capable of processing text, images, and audio. The "o" stands for "omni," reflecting its ability to handle multiple input types within a single interaction.
Specialized hardware originally designed for graphics but now essential for AI due to its ability to perform massive parallel calculations. The NVIDIA H100 is currently the standard for training and running state-of-the-art models.
The technique of connecting model outputs to verifiable sources or data to prevent hallucination. RAG is a primary method of grounding, ensuring the AI relies on retrieved documents rather than its training memory.
H
NVIDIA's high-performance data center GPUs designed specifically for AI workloads. Access to these chips (compute availability) is often a bottleneck for companies training their own models or running high-throughput inference.
When an AI model confidently generates information that is factually incorrect or fabricated. Hallucinations are a key reason human review remains essential in high-stakes PE workflows such as IC memos, legal review, and financial modeling.
A workflow design pattern where human judgment is embedded at critical decision points in an AI-powered process. For example, an AI drafts an investment memo, but a partner reviews and approves before it is shared with the IC.
I
The process of running a trained AI model on new input to generate an output — as opposed to training, which is the process of building the model. When you send a prompt to Claude, you are performing inference.
A training phase where the model is taught to follow specific instructions (e.g., "Summarize this," "Write code for X"). This bridges the gap between raw text prediction and a helpful assistant.
K
A structured representation of information connecting entities (people, companies, concepts) and their relationships. Combining Knowledge Graphs with LLMs (GraphRAG) reduces hallucination and improves reasoning on complex data.
L
An open-source orchestration framework used to build AI applications. It simplifies connecting LLMs to other data sources, memory, and tools, making it a standard tool for building internal business apps.
The time delay between sending a request to an AI model and receiving the first token of the response. For real-time applications (like customer service voice agents), low latency is critical.
A mathematical concept where similar data points are mapped close together in a multi-dimensional space. "Moving" through latent space allows models to find relationships between concepts that aren't explicitly linked in the text.
Large Language Model — an AI system trained on massive text corpora to understand and generate human language. LLMs like GPT-4o and Claude power most of today's generative AI applications, from chatbots to automated document analysis.
M
Model Context Protocol — an open standard that allows AI models to securely connect to external tools, databases, and APIs. MCP enables agents to pull live CRM data, write to spreadsheets, or query internal knowledge bases without custom integrations for each system.
A model architecture that uses multiple specialized sub-models ("experts") and a gating mechanism to choose which experts to use for each token. This allows for massive models (like GPT-4 or Mixtral) to be efficient at inference time.
A design pattern where multiple specialized AI agents collaborate on a complex task — one agent might screen deals, another analyzes financials, a third drafts the IC memo. Coordination between agents enables parallel processing and specialization.
The ability of an AI model to process and generate multiple types of media simultaneously, such as text, images, audio, and video. Examples include GPT-4o and Gemini 1.5.
O
The coordination of multiple AI models, data sources, and tools to complete a workflow. It involves managing the logic, error handling, and data flow between the user and the underlying AI systems.
P
The internal variables (weights) learned by the model during training. The number of parameters (e.g., 70B, 400B) generally correlates with the model's reasoning capability and complexity.
The initial, computationally expensive phase of training a model on a massive dataset to learn general language patterns. This creates the "base model" before any fine-tuning or alignment occurs.
The input text you provide to an AI model. A well-structured prompt includes context (who you are and the situation), specificity (concrete details), scope (what to include or exclude), and format (how to structure the output).
The practice of designing, testing, and refining prompts to reliably produce high-quality outputs from AI models. Effective prompt engineering is a core skill for PE professionals who want consistent, accurate results from AI tools.
A security vulnerability where malicious inputs manipulate the model into ignoring its instructions and performing unauthorized actions. This is a primary security concern for enterprise AI deployment.
Q
The process of reducing the precision of a model's parameters (e.g., from 16-bit to 4-bit) to decrease memory usage and increase speed with minimal loss in accuracy. This enables running powerful models on consumer hardware.
R
Retrieval-Augmented Generation — an architecture that combines a language model with a search system over a private knowledge base. When you ask a question, relevant documents are retrieved first and then given to the model as context, reducing hallucinations and grounding responses in your actual data.
A training paradigm in which an AI model learns by receiving rewards for correct behavior and penalties for incorrect behavior. RLHF (Reinforcement Learning from Human Feedback) is the technique used to align LLMs like Claude and ChatGPT with human preferences.
Robotic Process Automation — software that mimics rule-based human actions in a user interface (clicking, copying, pasting). RPA handles deterministic workflows, while AI handles judgment-heavy tasks. Combined, they cover most back-office automation use cases.
S
A search approach that finds conceptually similar content rather than exact keyword matches. Powered by embeddings, semantic search can surface a relevant contract clause even if the search query uses different terminology than the original document.
Compact AI models (typically <10B parameters) designed to run locally or on edge devices. They are cost-effective for specific, narrow tasks where a massive model is unnecessary.
Forcing an AI model to generate data in a specific machine-readable format, such as JSON or XML. This is critical for integrating AI responses directly into automated workflows and software systems.
Artificially generated data used to train models when real-world data is scarce, expensive, or sensitive. It is increasingly used to train reasoning models and overcome data privacy hurdles.
A hidden set of instructions sent to an AI model before the user's message. System prompts define the AI's persona, constraints, and behavioral guidelines — for example, "You are a PE analyst focused on manufacturing deals. Always flag customer concentration above 20%."
T
A parameter (0–1+) that controls the randomness of an AI model's output. Low temperature (near 0) produces deterministic, consistent responses suited for financial analysis. High temperature produces more creative, varied outputs suited for brainstorming.
The rate at which a system processes requests, usually measured in tokens per second. High throughput is essential for batch processing tasks like summarizing thousands of documents.
The basic unit of text that AI models process — roughly 4 characters or 0.75 words in English. Model pricing, context window limits, and latency are all measured in tokens. A typical CIM might be 50,000–150,000 tokens.
The dataset used to train an AI model. Foundation models are trained on trillions of tokens of internet text, books, and code. The composition and quality of training data heavily influences a model's strengths, biases, and knowledge cutoff.
The deep learning architecture introduced by Google in 2017 that underpins all modern LLMs. Its ability to process data in parallel (unlike previous sequential models) enabled the current AI boom.
V
A specialized database that stores and queries embeddings (vector representations of text). Vector databases like Pinecone, Weaviate, and Chroma power the retrieval step in RAG systems, enabling fast semantic search over large document collections.
W
The use of software to automatically execute a series of tasks that would otherwise require manual effort. In PE and portco contexts, workflow automation tools like Make.com, Zapier, and N8N connect AI models with CRMs, spreadsheets, email, and internal systems.
Z
The ability of an AI model to perform a task it was never explicitly trained on, relying solely on instructions in the prompt. Zero-shot capability is a hallmark of large foundation models — you can ask Claude to draft an IC memo without providing any examples.