Claudia Shi

CS PhD student at Columbia University


Claudia.j.shi AT gmail.com



Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback


Tech report


Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jeremy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, others
https://arxiv.org/abs/2307.15217

Cite


APA
Casper, S., Davies, X., Shi, C., Gilbert, T. K., Scheurer, J., Rando, J., … others. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. https://arxiv.org/abs/2307.15217.


Chicago/Turabian
Casper, Stephen, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jeremy Scheurer, Javier Rando, Rachel Freedman, et al. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. https://arxiv.org/abs/2307.15217, n.d.


MLA
Casper, Stephen, et al. “Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.” https://arxiv.org/abs/2307.15217.


BibTeX

@techreport{stephen-a,
  title = {Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback},
  url = {https://arxiv.org/abs/2307.15217},
  author = {Casper, Stephen and Davies, Xander and Shi, Claudia and Gilbert, Thomas Krendl and Scheurer, Jeremy and Rando, Javier and Freedman, Rachel and Korbak, Tomasz and Lindner, David and Freire, Pedro and others}
}
