Lexicon
189 items
- Lexicon entry: Model Evaluation & Benchmarking
Evaluation (Evals)
Learn how enterprise teams use AI evaluation (evals) to measure model accuracy, safety, and regression before deployment. Explore eval frameworks, toolchains, and LLMOps best practices.
- Lexicon entry: Model Evaluation & Benchmarking
LLM-as-a-Judge
Understand LLM-as-a-Judge — using a powerful language model to automatically evaluate AI outputs at scale. Explore rubric design, bias mitigation, and enterprise eval patterns.
- Lexicon entry: Model Evaluation & Benchmarking
Benchmarking (AI Models)
Learn how to benchmark AI models for enterprise selection and performance comparison. Understand standard benchmarks, custom task evaluation, and the metrics that predict production success.
- Lexicon entry: MLOps & Model Deployment
CI/CD for Machine Learning
Learn how to implement CI/CD for machine learning — automated pipelines for training, evaluating, and deploying AI models. Explore MLOps toolchains and enterprise best practices.
- Lexicon entry: MLOps & Model Deployment
Model Compression
Understand model compression techniques — quantization, pruning, distillation — that reduce AI model size and inference cost for enterprise deployment at scale.
- Lexicon entry: MLOps & Model Deployment
Quantization
Learn how quantization reduces AI model memory and inference cost by lowering weight precision. Explore INT8, INT4, GPTQ, and AWQ techniques with enterprise deployment guidance.
- Lexicon entry: Foundation Models
Knowledge Distillation
Understand knowledge distillation — training compact student models to replicate the behavior of large teacher models. Explore enterprise use cases, toolchains, and deployment benefits.
- Lexicon entry: MLOps & Model Deployment
Pruning
Learn how pruning removes redundant weights from AI models to reduce inference cost and memory footprint. Explore structured vs. unstructured pruning, toolchains, and enterprise applications.
- Lexicon entry: MLOps & Model Deployment
ONNX (Open Neural Network Exchange)
Learn how ONNX enables AI model interoperability across frameworks and hardware. Explore ONNX Runtime, deployment targets, and enterprise strategies for avoiding infrastructure lock-in.
- Lexicon entry: MLOps & Model Deployment
TensorRT (Inference Optimization Compiler)
Learn how NVIDIA TensorRT compiles and optimizes AI models for maximum GPU inference performance. Explore TensorRT-LLM, enterprise deployment patterns, and latency benchmarks.
- Lexicon entry: Agentic AI Frameworks
AI Agent
Understand AI agents for the enterprise — autonomous systems that reason, plan, and execute multi-step tasks. Explore agentic frameworks, tool use, and governance.
- Lexicon entry: Agentic AI Frameworks
Agentic Workflow
Learn how agentic workflows replace rigid automation scripts with adaptive AI systems that plan, execute, and self-correct. Explore enterprise toolchains and governance.
- Lexicon entry: Agentic AI Frameworks
Multi-Agent System
Understand multi-agent systems for the enterprise — coordinating teams of specialized AI agents for complex workflows. Explore architectures, frameworks, and governance.
- Lexicon entry: Agentic AI Frameworks
Function Calling / Tool Use
Learn how LLM function calling and tool use enable AI models to interact with enterprise APIs, databases, and services. Explore implementation patterns and governance.
- Lexicon entry: Agentic AI Frameworks
Computer Use / Vision-Controlled Agent
Learn how computer use and vision-controlled agents enable AI to operate software interfaces visually. Explore enterprise use cases, toolchains, and security controls.
- Lexicon entry: Agentic AI Frameworks
Browser Automation Agent
Understand browser automation agents for the enterprise — AI systems that navigate websites, extract data, and complete web-based tasks. Explore tools and security controls.
- Lexicon entry: Agentic AI Frameworks
Task Decomposition
Understand task decomposition in AI agents — how LLMs break complex goals into executable sub-tasks. Explore enterprise patterns, frameworks, and planning strategies.
- Lexicon entry: Agentic AI Frameworks
Self-Reflection / Critique
Learn how AI self-reflection and critique loops improve agent output quality. Explore enterprise implementation patterns, frameworks, and quality control strategies.
- Lexicon entry: Agentic AI Frameworks
Planning & Reasoning
Understand AI planning and reasoning for enterprise agents — how models think ahead, evaluate options, and select actions. Explore chain-of-thought, ReAct, and tree search.
- Lexicon entry: Agentic AI Frameworks
Orchestrator Agent
Learn how orchestrator agents coordinate multi-agent systems for enterprise workflows. Explore delegation patterns, toolchains, governance, and production architecture.
- Lexicon entry: Agentic AI Frameworks
Workflow Automation
Understand AI-powered workflow automation for the enterprise — orchestrating multi-step business processes with LLMs, agents, and integration platforms. Explore tools and ROI.
- Lexicon entry: Agentic AI Frameworks
Robotic Process Automation (RPA) Integration
Learn how RPA integration with AI agents automates legacy workflows at enterprise scale. Explore toolchains, governance considerations, and leading platforms.
- Lexicon entry: Agentic AI Frameworks
Human-in-the-Loop (Agentic)
Understand human-in-the-loop design for agentic AI: when and how to insert human review into autonomous workflows. Explore governance patterns, toolchains, and enterprise best practices.
- Lexicon entry: Agentic AI Frameworks
Agent Memory (Short-term / Long-term)
Understand agent memory architectures, including short-term context windows, long-term vector storage, and episodic memory. Explore how enterprise AI agents remember and learn across sessions.