In this episode of the a16z Podcast, Oliver Wang and Nicole Brichtova of Google DeepMind discuss Gemini 2.5 Flash Image, also known as Nano Banana. They cover the model's architecture, how image generation and editing are integrated into Gemini's multimodal framework, and the challenges of achieving character consistency, compositional control, and conversational editing at scale. They also touch on open questions in model evaluation, safety, and latency optimization, and on how visual reasoning connects to broader advances in multimodal systems. The conversation then turns to the potential impact of AI on the creative arts, the evolution of user interfaces, the future of image representation, and the balance between control and intent in AI-driven art creation.