LW - Memory bandwidth constraints imply economies of scale in AI inference by Ege Erdil | The Nonlinear Library | Podwise