Reliable Evaluations for LLMs and AI Agents: End-to-End Evaluation Frameworks for LLMs and Autonomous AI Agents | Barnes & Noble®

By Alexei Robsky, Liliya Lavitas, Yueqing Wang

This item will be released on Jul 31, 2026

Overview

This book gives practitioners a concrete, systematic framework for designing evals that make AI systems safe, robust, and customer-ready before they reach production. Drawing on real-world failures, from chatbots that went off the rails to shopping assistants that hallucinated product information, it shows how seemingly small evaluation gaps can cascade into legal, financial, and reputational crises, and how to close those gaps with disciplined testing.

Moving from foundational...

The Five-Year Century: Bold Leadership and Accelerated Outcomes in the Age of AI

Mihir Shukla, Nancy Hauge

Hardcover

$30.00

Internet Password Keeper

Hardcover

$10.99

Python Crash Course, 3rd Edition: A Hands-On, Project-Based Introduction to Programming

Paperback

$49.99

Python All-in-One For Dummies

John C. Shovic, Alan Simpson

Paperback

$44.99

iPhone For Seniors For Dummies, 2026 Edition

Paperback

$29.99

Windows 11 For Dummies, 2nd Edition

Paperback

$24.99

CompTIA Security+ Study Guide with over 500 Practice Test Questions: Exam SY0-701

Mike Chapple, David Seidl

Sybex Study Guide

Paperback

$55.00

ChatGPT For Dummies

Paperback

$24.99

Microsoft 365 Excel All-in-One For Dummies

David H. Ringstrom, Michael Alexander, Dick Kusleika, Paul McFedries, Ken Bluttman

Paperback

$44.99

Automate the Boring Stuff with Python, 3rd Edition

Paperback

$59.99

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Martin Kleppmann, Chris Riccomini

Paperback

$69.99

Microsoft 365 Excel For Dummies

David H. Ringstrom

Paperback

$29.99

Windows 11 For Seniors For Dummies, 2nd Edition

Paperback

$24.99

The Ultimate ChatGPT Prompt Book: 750+ Expert Prompts to Boost Productivity, Unlock Creative Potential, and Simplify Tasks

Paperback

$16.00

Generative AI For Dummies

Paperback

$29.99

CompTIA A+ Certification All-in-One Exam Guide, Eleventh Edition (Exams 220-1101 & 220-1102)

Travis A. Everett, Andrew Hutz

Hardcover

$60.00

Linux Basics for Hackers, 2nd Edition: Getting Started with Networking, Scripting, and Security in Kali

Paperback

$39.99

Teach Yourself VISUALLY iPhone 17 and iPhone Air

Teach Yourself VISUALLY (Tech)

Paperback

$33.00

Practical SQL, 2nd Edition: A Beginner's Guide to Storytelling with Data

Anthony DeBarros

Paperback

$39.99

Computers For Seniors For Dummies

Paperback

$24.99