09 Aug 2023

1h 20m

Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference

Generally Intelligent
