Understanding OpenAI’s Advanced Voice Mode: The Evolution of Intimacy with Chatbots

If you’re a paid subscriber to ChatGPT, you may have noticed the artificial intelligence (AI) large language model has recently started to sound more human when you are having audio interactions with it.

That’s because the company behind the language model-cum-chatbot, OpenAI, is currently running a limited pilot of a new feature known as “advanced voice mode”.

OpenAI says this new mode “features more natural, real-time conversations that pick up on and respond with emotion and non-verbal cues”. It plans for all paid ChatGPT subscribers to have access to the advanced voice mode in coming months.

Advanced voice mode sounds strikingly human. There aren’t the awkward gaps we are used to with voice assistants; instead it seems to take breaths like a human would. It is also unfazed by interruption, conveys appropriate emotional cues and seems to infer the user’s emotional state from voice cues.

But at the same time as making ChatGPT seem more human, OpenAI has expressed concern that users might respond to the chatbot as if it were human – by developing an intimate relationship with it.

This is not a hypothetical. For example, a social media influencer named Lisa Li coded ChatGPT to be her “boyfriend”. But why exactly do some people develop intimate relationships with a chatbot?

The evolution of intimacy

Humans have a remarkable capacity for friendship and intimacy. This is an extension of the way primates physically groom one another to build alliances that can be called upon in times of strife.

But our ancestors also evolved a remarkable capacity to “groom” one another verbally. This drove the evolutionary cycle in which the language centres in our brains became larger and what we did with language became more complex.

More complex language in turn enabled more complex socialising with larger networks of relatives, friends and allies. It also enlarged the social parts of our brains.

Language evolved alongside human social behaviour. The way we draw an acquaintance into friendship or a friend into intimacy is largely through conversation.

Experiments in the 1990s revealed that conversational back-and-forth, especially when it involves disclosing personal details, builds the intimate sense our conversation partner is somehow part of us.

So I’m not surprised that attempts to replicate this process of “escalating self-disclosure” between humans and chatbots result in humans feeling intimate with the chatbots.

And that’s just with text input. When the main sensory experience of conversation – voice – gets involved, the effect is amplified. Even voice-based assistants that don’t sound human, such as Siri and Alexa, still get an avalanche of marriage proposals.

The writing was on the lab chalkboard

If OpenAI were to ask me how to ensure users don’t form social relationships with ChatGPT, I would have a few simple recommendations.

First, don’t give it a voice. Second, don’t make it capable of holding up one end of an apparent conversation. Basically, don’t make the product you made.

The product is so powerful precisely because it does such an excellent job of mimicking the traits we use to form social relationships.

Image: Close-up of GPT-4o displayed on a smartphone screen. OpenAI should have known the risks of creating a human-like chatbot. (QubixStudio/Shutterstock)

The writing has been on the laboratory chalkboard since the first chatbots flickered on nearly 60 years ago. Computers have been recognised as social actors for at least 30 years. The advanced voice mode of ChatGPT is merely the next impressive increment, not what the tech industry would gushingly call a “game changer”.

That users not only form relationships with chatbots but develop very close personal feelings became clear early last year when users of the virtual friend platform Replika AI found themselves unexpectedly cut off from the most advanced functions of their chatbots.

Replika was less advanced than the new version of ChatGPT. And yet the interactions were of such a quality that users formed surprisingly deep attachments.

The risks are real

Many people, starved for the kind of company that listens in a non-judgmental way, will get a lot out of this new generation of chatbots. They may feel less lonely and isolated. These kinds of benefits of technology should never be overlooked.

But the potential dangers of ChatGPT’s advanced voice mode are also very real.

Time spent chatting with any bot is time that can’t be spent interacting with friends and family. And people who spend a lot of time with technology are at greatest risk of displacing relationships with other humans.

As OpenAI identifies, chatting with bots can also contaminate existing relationships people have with other people. They may come to expect their partners or friends to behave like polite, submissive, deferential chatbots.

These bigger effects of machines on culture are going to become more prominent. On the upside, they may also provide deep insights into how culture works.

Rob Brooks, Scientia Professor of Evolutionary Ecology; UNSW Sydney

This article is republished from The Conversation under a Creative Commons license. Read the original article, which was titled “The latest version of ChatGPT has a feature you’ll fall in love with. And that’s a worry.” (I am messing around with WordPress’ AI-generated titles.)
