Reward modeling

#REDIRECT [[AI alignment#Reward modeling and iterated amplification]]

{{rcat shell|

{{r to section}}

}}