The podcast explores the challenges enterprises face in deploying AI models: inherent risks like bias, hallucinations, and security vulnerabilities often stall projects before they deliver ROI. Ben Luria, CEO of Hirundo, introduces machine unlearning as a way to make AI trustworthy by enabling models to "forget" undesired data and behaviors. He argues that current mitigations such as context engineering, fine-tuning, and guardrails fall short because they wrap external fixes around the model rather than changing what it actually knows. Hirundo's approach is a kind of "neurosurgery" on the model's internal representations: identifying and removing problematic behaviors and information, such as PII, susceptibility to prompt injection, and bias. The discussion also covers the distinction between behavioral unlearning and data unlearning, and the importance of benchmarks for measuring how effective unlearning actually is.
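For readers new to the concept, the sketch below shows one common textbook recipe for machine unlearning (gradient ascent on a "forget set"). It is a generic illustration of the idea discussed in the episode, not Hirundo's actual method; the model, data, and hyperparameters are all placeholders.

```python
# Minimal machine-unlearning sketch: gradient ascent on a "forget set".
# Generic illustration only; model, data, and hyperparameters are hypothetical.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Hypothetical "forget" batch: examples whose influence we want removed.
forget_x = torch.randn(32, 8)
forget_y = torch.randint(0, 2, (32,))

for _ in range(10):
    optimizer.zero_grad()
    # Negate the loss so the optimizer ascends it on the forget set, degrading
    # the model's fit to that data. In practice this is balanced against a
    # retain-set term so overall model utility is preserved.
    loss = -loss_fn(model(forget_x), forget_y)
    loss.backward()
    optimizer.step()
```

Behavioral unlearning targets unwanted behaviors (e.g., responding to prompt injections), while data unlearning targets the influence of specific training examples (e.g., PII); benchmarks then check both that the target is forgotten and that unrelated capabilities remain intact.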