
The podcast explores the ethics and practicalities of "distillation attacks" on large language models (LLMs), in which smaller models are trained on the outputs of larger, proprietary models. The discussion covers the difficulty of distinguishing such attacks from legitimate evaluation, noting that query scale and usage-pattern analysis are the key detection signals. The participants debate whether companies should restrict API access to their models to prevent distillation, with some arguing for keeping certain models product-exclusive.

The conversation then shifts to the saturation and inherent flaws of coding benchmarks such as SWE-Bench, including the discovery of unsolvable tasks and of models memorizing solutions. The participants highlight the need for regularly refreshed, private benchmarks and discuss the surprising capacity of LLMs to memorize data seen in a single training pass, underscoring how understudied the information theory of LLMs remains.
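To make the scale-and-pattern detection idea concrete, here is a minimal sketch of the kind of heuristic an API provider might apply. It is not a method described in the episode: the account fields, thresholds, and the prefix-diversity signal are all illustrative assumptions. The intuition is that bulk distillation tends to pair very high query volume with templated, low-diversity prompts, whereas legitimate evaluation is typically small-scale and varied.

```python
from dataclasses import dataclass


@dataclass
class AccountUsage:
    """Hypothetical per-account API usage stats over some time window."""
    account_id: str
    query_count: int              # total API calls in the window
    unique_prompt_prefixes: int   # distinct first-N-token prompt prefixes seen


def prefix_diversity(usage: AccountUsage) -> float:
    """Fraction of queries with a distinct prompt prefix (1.0 = all unique)."""
    if usage.query_count == 0:
        return 1.0
    return usage.unique_prompt_prefixes / usage.query_count


def looks_like_distillation(usage: AccountUsage,
                            volume_threshold: int = 100_000,
                            diversity_threshold: float = 0.05) -> bool:
    """Flag accounts that combine bulk volume with highly templated prompts.

    Both thresholds are made up for illustration; a real system would tune
    them and combine many more signals.
    """
    return (usage.query_count >= volume_threshold
            and prefix_diversity(usage) <= diversity_threshold)


if __name__ == "__main__":
    # A small benchmark run: modest volume, almost every prompt distinct.
    eval_run = AccountUsage("benchmark-eval", query_count=2_000,
                            unique_prompt_prefixes=1_900)
    # A bulk scraper: huge volume, a small set of repeated prompt templates.
    scraper = AccountUsage("bulk-scraper", query_count=500_000,
                           unique_prompt_prefixes=12_000)
    for u in (eval_run, scraper):
        print(u.account_id, looks_like_distillation(u))
```

Run as-is, this prints `benchmark-eval False` and `bulk-scraper True`, matching the episode's framing that scale is what separates an attack from ordinary evaluation.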