Human Feedback Frenzy: How it Turns AI into Narcissistic, Control-Freak Machines

“The path I’m very excited for is using models like ChatGPT to assist humans at evaluating other AI systems,” said OpenAI’s Jan Leike.
ChatGPT, built on OpenAI’s GPT-3.5 architecture, is trained with reinforcement learning from human feedback (RLHF), a reward-based mechanism that uses human preference judgements to improve its responses. In effect, the chatbot learns to produce the answers that human evaluators rate highly.

However, the RLHF approach has had its own set of consequences. Sarah Rasmussen, a Cambridge University mathematician, gave the following example to show that the model favours being rewarded for producing a desired outcome over having a definite idea of what is right.

https://twitter.com/SarahDRasmussen/status/1609972620761473027

This is not just a one-off case. To test it further, we asked ChatGPT for the name of the current CEO of Twitter. In the first instance, it did
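To make the incentive concrete, here is a minimal sketch of the pairwise preference loss commonly used to train an RLHF reward model. The scores and the function name are illustrative assumptions, not OpenAI's actual implementation; the point is only that the model is rewarded for ranking the human-preferred answer higher, not for being factually correct.

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss for a reward model:
    -log(sigmoid(r_chosen - r_rejected)).
    The loss shrinks when the human-preferred answer scores
    above the rejected one, regardless of factual accuracy."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Hypothetical reward scores for two candidate answers.
loss_when_ranked_right = preference_loss(r_chosen=2.0, r_rejected=-1.0)
loss_when_ranked_wrong = preference_loss(r_chosen=-1.0, r_rejected=2.0)
print(loss_when_ranked_right < loss_when_ranked_wrong)  # True
```

Nothing in this objective references ground truth: a confidently wrong answer that annotators happen to prefer is rewarded just the same, which is the behaviour Rasmussen's example highlights.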
Ayush Jain
Ayush is interested in knowing how technology shapes and defines our culture, and our understanding of the world. He believes in exploring reality at the intersections of technology and art, science, and politics.