Lexicon
189 items
- Lexicon entry | AI Governance & Compliance
Compliance Framework (AI)
Master AI compliance frameworks — EU AI Act, NIST AI RMF, ISO 42001, and sector-specific regulations. Learn how enterprises build AI governance programs that satisfy regulators and accelerate deployment.
- Lexicon entry | AI Cost, FinOps & TCO
AI Accelerator
Understand AI accelerators for the enterprise — GPUs, TPUs, and custom ASICs that power training and inference. Compare hardware options, TCO, and deployment strategies.
- Lexicon entry | AI Cost, FinOps & TCO
GPU Computing
Learn how GPU computing powers enterprise AI training and inference. Compare NVIDIA, AMD, and cloud GPU options, understand memory constraints, and optimize for cost.
- Lexicon entry | Foundation Models
TPU / Custom ASIC
Understand TPUs and custom ASICs for enterprise AI — Google TPU, AWS Trainium/Inferentia, Groq LPU. When to use purpose-built silicon over general-purpose GPUs.
- Lexicon entry | AI Cost, FinOps & TCO
Inference Optimization
Master inference optimization for enterprise AI — quantization, batching, KV cache, speculative decoding, and distillation. Reduce LLM serving costs by 50–80%.
- Lexicon entry | MLOps & Model Deployment
Serverless Inference
Understand serverless inference for enterprise AI — on-demand model serving with no infrastructure management. Compare providers, cold start trade-offs, and cost models.
- Lexicon entry
Edge AI / TinyML
Learn Edge AI and TinyML for the enterprise — deploying AI models directly on devices, sensors, and edge hardware for real-time inference with no cloud round-trip.
- Lexicon entry | Enterprise AI Readiness & Adoption
On-Premise AI
Understand on-premise AI deployment for the enterprise — self-hosted LLMs, GPU infrastructure, and air-gapped AI for regulated industries with strict data sovereignty requirements.
- Lexicon entry | Enterprise AI Readiness & Adoption
Private Cloud AI
Understand private cloud AI for the enterprise — dedicated VPC deployments, single-tenant AI infrastructure, and hybrid architectures that combine cloud agility with data control.
- Lexicon entry | AI Cost, FinOps & TCO
AI-Optimized Cloud Instance
Navigate AI-optimized cloud instances — NVIDIA A100/H100, AWS p4/p5, Azure NDv4, Google A3. Match instance types to workloads, compare TCO, and avoid over-provisioning.
- Lexicon entry | Foundation Models
Inference-as-a-Service
Understand Inference-as-a-Service for the enterprise: managed model APIs, hosted inference platforms, and the evaluation criteria for selecting an inference provider at scale.
- Lexicon entry | Foundation Models
Model-as-a-Service (MaaS)
Understand Model-as-a-Service (MaaS) for the enterprise — hosted AI model APIs that eliminate GPU infrastructure, reduce time-to-production, and enable pay-per-use model access at scale.
- Lexicon entry | MLOps & Model Deployment
Kubernetes for AI
Learn how Kubernetes orchestrates AI and ML workloads at enterprise scale — GPU scheduling, model serving, autoscaling, and the AI-specific platforms built on top of K8s.
- Lexicon entry | MLOps & Model Deployment
Auto-Scaling (Inference)
Master auto-scaling for AI inference workloads — GPU-aware autoscaling, request queue metrics, scale-to-zero, and enterprise patterns for cost-efficient model serving.
- Lexicon entry | MLOps & Model Deployment
Cold Start (Serverless AI)
Understand cold start latency in serverless AI deployments — what causes model loading delays, how to measure them, and enterprise strategies to minimize their impact on user experience.
- Lexicon entry | MLOps & Model Deployment
Batch Inference
Understand batch inference for enterprise AI: how to process large volumes of model requests offline at significantly reduced cost and with maximum throughput.
- Lexicon entry | Foundation Models
Streaming Inference
Understand streaming inference for enterprise AI applications — how token-by-token response delivery works, its impact on perceived latency, and the infrastructure required to support it.
- Lexicon entry | MLOps & Model Deployment
Multi-Tenancy (Model Serving)
Understand multi-tenancy for AI model serving — how to serve multiple customers or business units from shared GPU infrastructure with strong isolation, fair resource allocation, and compliance guarantees.
- Lexicon entry | MLOps & Model Deployment
Hardware-Aware Model Optimization
Learn hardware-aware model optimization for enterprise AI — quantization, kernel compilation, tensor parallelism, and hardware-specific tuning that reduce inference cost and latency.
- Lexicon entry | MLOps & Model Deployment
Low-Latency Inference
Master low-latency AI inference for the enterprise: time-to-first-token optimization, speculative decoding, hardware selection, and SLO design for sub-second model serving.
- Lexicon entry | MLOps & Model Deployment
High-Throughput Inference
Master high-throughput AI inference — continuous batching, tensor parallelism, speculative decoding, and the infrastructure patterns that maximize requests per second per GPU.
- Lexicon entry | Enterprise AI Readiness & Adoption
AI-Augmented Development
Explore AI-augmented development — how enterprise engineering teams embed AI across the full software development lifecycle, from planning through production, to compound velocity and quality.
- Lexicon entry | Conversational AI
Voice AI / Voicebot
Understand enterprise voice AI — end-to-end spoken dialogue systems combining ASR, LLMs, and TTS. Explore platforms, latency requirements, telephony integration, and compliance considerations.
- Lexicon entry | Agentic AI in Sales & RevOps
Sales Intelligence
Understand AI-powered sales intelligence for the enterprise — how AI enriches accounts, predicts intent, scores leads, and coaches reps to drive revenue outcomes at scale.