arXiv Preprint - Contrastive Preference Learning: Learning from Human Feedback without RL | AI Breakdown | Podwise