Quantization and Fast Inference: A practitioner's guide to efficient AI

Paperback
$59.99
This item will be released on Dec 29, 2026
Get the eBook free when you register your print book at Manning.

Today's AI models demand a lot of memory, compute, and server horsepower, which quickly translates into cost. This book shows you how to optimize AI models without architectural redesigns or task-specific compression. It presents practical techniques for quantization, which systematically reduces numerical precision to achieve faster inference, lower memory usage, and cheaper deployment, all with minimal accuracy loss.

From quantiz...