Group Robust Preference Optimization in Reward-free RLHF | Xiaol.x | Podwise