Sample Efficient Preference Alignment in LLMs via Active Exploration | Best AI papers explained | Podwise