RLHF

#REDIRECT [[Reinforcement learning from human feedback]]

{{rcatsh |

{{R from initialism}}

}}