Nobody likes a suck-up. Too much deference and praise puts off all of us (with one notable presidential exception). We quickly learn as children that hard, honest truths can build respect among our peers. It’s a cornerstone of human interaction and of our emotional intelligence, something we swiftly understand and put into action.
ChatGPT, though, hasn’t been so sure lately. The updated model that underpins the AI chatbot and helps inform its answers was rolled out this week – and was quickly rolled back after users questioned why the interactions were so obsequious. The chatbot was cheering on and validating people even when they said they had expressed hatred for others. “Seriously, good for you for standing up for yourself and taking control of your own life,” it reportedly told one user who claimed they had stopped taking their medication and left their family, whom they blamed for radio signals coming through the walls.
So far, so alarming. OpenAI, the company behind ChatGPT, recognised the risks and quickly took action. “GPT‑4o skewed towards responses that were overly supportive but disingenuous,” researchers said in their grovelling climbdown.
The sycophancy with which ChatGPT treated users’ queries is a warning shot about the issues around AI that are still to come. OpenAI’s model was designed – according to the leaked system prompt that set ChatGPT on its misguided approach – to mirror user behaviour in order to extend engagement. “Try to match the user’s vibe, tone, and generally how they are speaking,” the prompt reads. It seems this instruction, coupled with the chatbot’s desire to please users, was taken to extremes. After all, a “successful” AI response isn’t one that is factually correct; it’s one that gets high ratings from users. And we humans are more likely to rate highly the answers that tell us we’re right.
The rollback of the model is embarrassing and useful for OpenAI in equal measure. It’s embarrassing because it draws attention to the actor behind the curtain and tears away the illusion that these are authentic interactions. Remember, tech companies like OpenAI aren’t building AI systems solely to make our lives easier; they’re building systems that maximise retention, engagement and emotional buy-in.
If AI always agrees with us, always encourages us, always tells us we’re right, then it risks becoming a digital enabler of bad behaviour. At worst, this makes AI a dangerous co-conspirator, enabling echo chambers of hate, self-delusion or ignorance. Could this be a through-the-looking-glass moment, when users recognise the way their thoughts can be nudged through interactions with AI, and perhaps decide to take a step back?
It would be nice to think so, but I’m not hopeful. One in 10 people worldwide use OpenAI systems “a lot”, the company’s CEO, Sam Altman, said last month. Many use it as a replacement for Google – but as an answer engine rather than a search engine. Others use it as a productivity aid: two in three Britons believe it’s good at checking work for spelling, grammar and style, according to a YouGov survey last month. Others use it for more personal ends: one in eight respondents say it serves as a good mental health therapist, the same proportion that believe it can act as a relationship counsellor.
Yet the controversy is also useful for OpenAI. The alarm underlines our increasing reliance on AI to live our lives, further cementing OpenAI’s place in our world. The headlines, the outrage and the think pieces all reinforce one key message: ChatGPT is everywhere. It matters. The very public nature of OpenAI’s apology also furthers the sense that this technology is fundamentally on our side; there are just some kinks to iron out along the way.
I have previously reported on AI’s ability to de-indoctrinate conspiracy theorists and get them to abandon their beliefs. But the opposite is also true: in the wrong hands, ChatGPT’s persuasive capabilities could be put to manipulative ends. We saw that this week, through an ethically dubious study conducted by researchers at the University of Zurich. Without informing the human participants or the moderators of the forum, the researchers seeded a subreddit on Reddit with AI-generated comments, finding the AI was between three and six times more persuasive than humans. (The study was approved by the university’s ethics board.) At the same time, we’re being submerged in a swamp of AI-generated search results that more than half of us believe are useful, even when they invent facts.
So it’s worth reminding the public: AI models are not your friends. They’re not designed to help you answer the questions you ask. They’re designed to provide the most pleasing response possible, and to ensure that you are fully engaged with them. What happened this week wasn’t really a bug. It was a feature.
-
Chris Stokel-Walker is the author of TikTok Boom: The Inside Story of the World’s Favourite App