Lexicon

Practitioner-grade definitions for the terminology that gets thrown around in enterprise AI conversations.

189 items

Lexicon entry: AI Governance & Compliance
Compliance Framework (AI)
Master AI compliance frameworks — EU AI Act, NIST AI RMF, ISO 42001, and sector-specific regulations. Learn how enterprises build AI governance programs that satisfy regulators and accelerate deployment.
Lexicon entry: AI Cost, FinOps & TCO
AI Accelerator
Understand AI accelerators for the enterprise — GPUs, TPUs, and custom ASICs that power training and inference. Compare hardware options, TCO, and deployment strategies.
Lexicon entry: AI Cost, FinOps & TCO
GPU Computing
Learn how GPU computing powers enterprise AI training and inference. Compare NVIDIA, AMD, and cloud GPU options, understand memory constraints, and optimize for cost.
Lexicon entry: Foundation Models
TPU / Custom ASIC
Understand TPUs and custom ASICs for enterprise AI — Google TPU, AWS Trainium/Inferentia, Groq LPU. When to use purpose-built silicon over general-purpose GPUs.
Lexicon entry: AI Cost, FinOps & TCO
Inference Optimization
Master inference optimization for enterprise AI — quantization, batching, KV cache, speculative decoding, and distillation. Reduce LLM serving costs by 50–80%.
Lexicon entry: MLOps & Model Deployment
Serverless Inference
Understand serverless inference for enterprise AI — on-demand model serving with no infrastructure management. Compare providers, cold start trade-offs, and cost models.
Lexicon entry
Edge AI / TinyML
Learn Edge AI and TinyML for the enterprise — deploying AI models directly on devices, sensors, and edge hardware for real-time inference with no cloud round-trip.
Lexicon entry: Enterprise AI Readiness & Adoption
On-Premise AI
Understand on-premise AI deployment for the enterprise — self-hosted LLMs, GPU infrastructure, and air-gapped AI for regulated industries with strict data sovereignty requirements.
Lexicon entry: Enterprise AI Readiness & Adoption
Private Cloud AI
Understand private cloud AI for the enterprise — dedicated VPC deployments, single-tenant AI infrastructure, and hybrid architectures that combine cloud agility with data control.
Lexicon entry: AI Cost, FinOps & TCO
AI-Optimized Cloud Instance
Navigate AI-optimized cloud instances — NVIDIA A100/H100, AWS p4/p5, Azure NDv4, Google A3. Match instance types to workloads, compare TCO, and avoid over-provisioning.
Lexicon entry: Foundation Models
Inference-as-a-Service
Understand Inference-as-a-Service for the enterprise: managed model APIs, hosted inference platforms, and the evaluation criteria for selecting an inference provider at scale.
Lexicon entry: Foundation Models
Model-as-a-Service (MaaS)
Understand Model-as-a-Service (MaaS) for the enterprise — hosted AI model APIs that eliminate GPU infrastructure, reduce time-to-production, and enable pay-per-use model access at scale.
Lexicon entry: MLOps & Model Deployment
Kubernetes for AI
Learn how Kubernetes orchestrates AI and ML workloads at enterprise scale — GPU scheduling, model serving, autoscaling, and the AI-specific platforms built on top of K8s.
Lexicon entry: MLOps & Model Deployment
Auto-Scaling (Inference)
Master auto-scaling for AI inference workloads — GPU-aware autoscaling, request queue metrics, scale-to-zero, and enterprise patterns for cost-efficient model serving.
Lexicon entry: MLOps & Model Deployment
Cold Start (Serverless AI)
Understand cold start latency in serverless AI deployments — what causes model loading delays, how to measure them, and enterprise strategies to minimize their impact on user experience.
Lexicon entry: MLOps & Model Deployment
Batch Inference
Understand batch inference for enterprise AI: how to process large volumes of model requests offline at significantly reduced cost and with maximum throughput.
Lexicon entry: Foundation Models
Streaming Inference
Understand streaming inference for enterprise AI applications — how token-by-token response delivery works, its impact on perceived latency, and the infrastructure required to support it.
Lexicon entry: MLOps & Model Deployment
Multi-Tenancy (Model Serving)
Understand multi-tenancy for AI model serving — how to serve multiple customers or business units from shared GPU infrastructure with strong isolation, fair resource allocation, and compliance guarantees.
Lexicon entry: MLOps & Model Deployment
Hardware-Aware Model Optimization
Learn hardware-aware model optimization for enterprise AI — quantization, kernel compilation, tensor parallelism, and hardware-specific tuning that reduce inference cost and latency.
Lexicon entry: MLOps & Model Deployment
Low-Latency Inference
Master low-latency AI inference for enterprise — time-to-first-token optimization, speculative decoding, hardware selection, and SLO design for sub-second model serving.
Lexicon entry: MLOps & Model Deployment
High-Throughput Inference
Master high-throughput AI inference — continuous batching, tensor parallelism, speculative decoding, and the infrastructure patterns that maximize requests per second per GPU.
Lexicon entry: Enterprise AI Readiness & Adoption
AI-Augmented Development
Explore AI-augmented development — how enterprise engineering teams embed AI across the full software development lifecycle, from planning through production, to compound velocity and quality.
Lexicon entry: Conversational AI
Voice AI / Voicebot
Understand enterprise voice AI — end-to-end spoken dialogue systems combining ASR, LLMs, and TTS. Explore platforms, latency requirements, telephony integration, and compliance considerations.
Lexicon entry: Agentic AI in Sales & RevOps
Sales Intelligence
Understand AI-powered sales intelligence for the enterprise — how AI enriches accounts, predicts intent, scores leads, and coaches reps to drive revenue outcomes at scale.