arXiv preprint - tinyBenchmarks: evaluating LLMs with fewer examples | AI Breakdown | Podwise