Episode 33: Tri Dao, Stanford: On FlashAttention and sparsity, quantization, and efficient inference

Generally Intelligent
