Tech report
https://arxiv.org/abs/2307.15217
APA
Casper, S., Davies, X., Shi, C., Gilbert, T. K., Scheurer, J., Rando, J., et al. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. https://arxiv.org/abs/2307.15217
Chicago/Turabian
Casper, Stephen, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jeremy Scheurer, Javier Rando, Rachel Freedman, et al. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. https://arxiv.org/abs/2307.15217, n.d.
MLA
Casper, Stephen, et al. “Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.” https://arxiv.org/abs/2307.15217.
BibTeX
@techreport{stephen-a,
title = {Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback},
url = {https://arxiv.org/abs/2307.15217},
author = {Casper, Stephen and Davies, Xander and Shi, Claudia and Gilbert, Thomas Krendl and Scheurer, Jeremy and Rando, Javier and Freedman, Rachel and Korbak, Tomasz and Lindner, David and Freire, Pedro and others}
}