This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents. Visit https://agntcy.org/ and add your support.

How is Coxwave Redefining AI Evaluation?
In this episode of Eye on AI, host Craig Smith is joined by Yeop Lee, Head of Product at Coxwave. Together they explore how teams move beyond accuracy-only metrics to outcome-focused evaluation with Coxwave's Align. We look at how Align measures satisfaction, trust, and task completion across chat, email, and voice; how LLM-as-judge pairs with human review; and how product teams search conversations to find hidden failure patterns that block adoption.
Learn how leading companies design an evaluation stack that guides prompts, agents, and UX, which pitfalls to avoid when shipping updates, and which metrics matter most for success, including completion rate, CSAT, retention, and cost per resolution. You will also hear how to run experiment tracking with model and prompt change logs, set up governance that prevents regressions, and choose between SaaS and on-premises deployments that meet security and compliance needs.
Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI