Reward Shaping from Confounded Offline Data | Best AI papers explained | Podwise