Yuandong Tian: Inside-out interpretability: training dynamics in multi-layer transformer | Berkeley RDI Center on Decentralization & AI