
The talk centers on the robustness and security challenges facing large multimodal AI models, particularly their susceptibility to adversarial attacks. It begins by tracing AI's ambitious origins and noting how surprisingly little deep insight the creation of capable digital minds has required: vast computational resources and data, rather than theoretical breakthroughs. Despite these models' strong average-case robustness and generalization, the speaker demonstrates how easily classifiers can be deceived by carefully crafted perturbations, even in the largest models such as GPT-4, using examples like a Rickroll-encoded Stephen Hawking image. The speaker then contrasts human and machine vision, suggesting that micro- and macro-saccades contribute to human visual robustness. Arguing that standard defenses like adversarial training do not scale, the talk concludes by encouraging new approaches to AI's vulnerabilities, with an emphasis on the role of high-dimensional geometry.
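To make the "crafted perturbations" concrete, here is a minimal sketch of a fast-gradient-sign-style attack on a toy linear classifier. This is an illustration of the general technique only, not the attack shown in the talk; all names, dimensions, and the epsilon budget below are invented for the example.

```python
import numpy as np

# Toy linear classifier: score = w @ x; positive score -> class 1.
rng = np.random.default_rng(0)
w = rng.normal(size=100)

# A clean input that the classifier labels as class 1 with a small margin.
x = 0.03 * np.sign(w) + 0.005 * rng.normal(size=100)
clean_score = w @ x  # positive

# FGSM-style perturbation: step against the gradient of the score
# (which is just w for a linear model), bounded in L-infinity norm
# by eps. For a linear model this is the worst-case perturbation
# of that size.
eps = 0.05
x_adv = x - eps * np.sign(w)
adv_score = w @ x_adv  # driven negative: the predicted label flips

print(clean_score > 0, adv_score < 0)  # True True
```

The point the talk makes generalizes from this toy: in high-dimensional input spaces, many tiny per-coordinate changes can accumulate into a large change in the model's output, which is one reason imperceptible perturbations can flip predictions.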