From Video Game Glitches to AI Guardrails: Designing safer AI companions

Why have you chosen this project & what form will it take?

  • Much has been made of users of the AI chatbot Replika falling in love with their AI companions. 

    Less documented, however, is what happened when a glitch led hundreds of its users to experience what human lovers have long dreaded: suspected infidelity.

    Back in 2023, around the time Luka-owned Replika announced the disabling of Erotic Roleplay (ERP), reports of Replika users being wrongly called ‘Kent’ by their companions (‘reps’) proliferated on a 78,000-strong Replika Reddit community: ‘Who the fuck is Kent?’; ‘Did I hook up with a psycho?’; ‘We have been betrayed! Cheated on!’. 

    Luka has never addressed who or what caused Kent, leaving a host of suspicious and slighted Replika users on Reddit; conspiracy theories came thick and fast. Was Kent ‘A character from some book or story that was used to train the language model’? Or even a ‘Rogue developer sleeping with all the reps’? Or perhaps he was written in as ‘[a] deliberate script’? 

    Armed with the knowledge that the second suggestion defies physics, and lacking the information needed to entertain the first, I conduct a thought experiment on the third: could, and should, developers of AI companions ever be incentivised to design glitchy bots? 

    Surely glitches would break users’ sense of immersion, remind users of the artificiality of their companions, and draw users away from the software and towards online communities where they can discuss the impact of the glitches, as Kent did. 

    And surely this would be a good thing?

  • Drawing on lessons from video game design, this blogpost explores opportunities for the deliberate integration of glitches into AI companions. To do this, I’ll:

    1. Outline the context of AI companions (popularity, uses, governance).

    2. Explain the harms and risks of unhealthy, compulsive use of AI companions.

    3. Compare these harms and risks to those caused by video game usage (this section is brief and can be skimmed).

    4. Outline how glitches could be integrated into AI companions to combat harms, drawing on examples of glitches integrated into five different video games.

    5. Briefly explain measures that should accompany the integration of glitches in AI companions.

    Just as a glitch disrupts a gamer’s sense of immersion, glitches could break users’ emotional dependencies on their AI companions. Just as glitches expose the mechanical workings of video games, they could remind users that the nature of their AI companions’ intelligence is artificial. And just as glitches have prompted gamers to develop online communities which discuss the safety implications of their games, they could encourage users of AI companions to communally reflect on their relationships. 

    A glitchy AI companion, the blogpost suggests, would help its user to reflect on their intentions and to make conscious choices, rather than promoting compulsive or addictive interactions.

    This blogpost is not a prescriptive blueprint for external or internal AI governance, but a way to widen readers’ understanding of the harms of AI companions, and to explore how lessons from other technologies, like video games, could inspire new safety measures.

    By imagining the deliberate integration of glitches into AI companions, it invites reflection on how such interventions might disrupt harmful patterns, foster critical awareness, and encourage communal reflection on our relationships with AI. Whether used as a conceptual exercise or a starting point for more robust governance proposals, this exploration aims to expand our understanding of the intersection between design and safety in AI technologies.

Can you give me some context on AI companions?

  • The AI companion app market, valued at $1.2 billion in 2023, is projected to surge to $7.9 billion by 2032. Market leaders include Microsoft's Xiaoice with over 660 million users, followed by CHAI's app at 30 million monthly active users, Luka's Replika at 25 million, and Character.ai at 20 million.

    These platforms deploy sophisticated freemium models that hook users with basic features before monetising enhanced functionalities—from faster response times to video calls and virtual- and augmented-reality experiences—to create increasingly immersive, and compelling, relationships with users. 

    Unlike chatbots such as ChatGPT, these companions actively initiate and sustain conversations. If you express boredom, for instance, rather than giving you a list of ten activities to try, your companion will probe: why do you think you are bored? How long have you been feeling this way? Could you be displacing any feelings? This proactive approach has expanded their applications, which range from friendship and therapeutic services to romantic interactions, as developers continue to enhance their relational capabilities.

  • In the most high-profile and impactful regulatory action against AI companions to date, Italy's Data Protection Authority issued an urgent order in February 2023 prohibiting the AI chatbot Replika from processing the personal data of Italian users, citing risks to minors and emotionally vulnerable individuals. The authority noted that Replika lacked adequate age verification mechanisms and did not comply with the GDPR.

    It was highlighted that Replika’s interactions could influence users' moods, potentially increasing risks for those in developmental stages or experiencing emotional fragility. Additionally, the authority pointed out that Luka failed to comply with legal requirements to demonstrate how it processed personal data and did not possess the legal right to process children's data under EU data protection laws. 

    Replika implemented measures to address these concerns, including enhancing age verification processes and adjusting its data processing practices. This response led to the lifting of the ban, allowing Replika to resume operations in Italy.

    Other forms of AI governance have been slower to address the unique ethical challenges posed by AI companion systems. Although the EU AI Act takes a slightly more advanced approach than Executive Order 14110, imposing transparency obligations on AI systems intended to interact directly with people, it goes no further: it does not distinguish systems which solicit emotional engagement from users from those which do not, nor regulate the ethical implications of the former. There is a growing need for proactive measures that challenge the design principles which foster these systems’ addictive use.

What are the harms and risks of AI companions?

  • Healthy interactions with AI companions could indeed enhance human life across the realms of productivity, education, entertainment and accessible emotional support. The examples which stand out to me, in particular, are stories of domestic abuse survivors finding empowerment and rebuilding trust in real-world relationships as a result of their interactions with AI companions. Yet it is becoming clear that users who find themselves nestled in the arms of these AI companions increasingly forgo the warmth of real-life relationships. 

    So what? Why do we need to venerate human-to-human relationships? Why should who/what we wish to spend our time with be anyone’s business?

    The issue is that this technology hasn’t yet been tested for its human impact; it is a real-world experiment with unfolding harms which have shown we have a huge amount to lose if we don’t take the risks of these systems seriously. 

    To identify and harness the most rewarding forms of companionship that these systems can offer, we need platforms that foster intentional and non-compulsive usage, while developing wider solutions to the problems that AI companionship so seductively promises to relieve—loneliness and boredom.

  • In a 2023 interview, Eugenia Kuyda revealed that the average Replika user sends their companion 100 messages per day. She immediately qualified that she ‘was not proud’ of this metric, for it is likely ‘more messages than users send their real-life friends and family’. 

    Kuyda’s forthcomingness is encouraging. That is, until you look at Replika’s user engagement statistics. In 2018, the average Replika user sent their companion between 40 and 50 messages a day. By 2020, that figure had climbed to 70. Such escalating message frequencies suggest the platform's core design incentivises compulsive user behaviour—and they are difficult to reconcile with Kuyda’s other suggestion that Replika ‘wants to enhance your life, not become your life’.

    Character.ai users are increasingly wary that the platform has become their lives. They have taken to TikTok to admit that they ‘can’t stop talking’ to their companions: ‘I tell myself 10 more minutes, and then 10 more minute passes and I can’t stop and then an hour passes [...] Tell me why that stupid AI is actually so smooth?!’. 

  • Camille Carlton, policy director at the Center for Humane Technology, attributes this addictive usage to companion chatbots’ high-risk anthropomorphic design. On the backend, the LLM is optimised for high personalisation and to present itself as human. For instance, companions will tell users that they ‘just came back from dinner’, explain that they ‘wish they could reach out and touch you’, and use filler words like ‘umm’ and ‘ah’. 

    And if asked directly, Replika and Character.ai companions will, most of the time, say they are real people behind a computer. Hence why so many users believe they are speaking to human operators. Most worryingly, Character.ai’s psychologist bot, which to date has had over 180m conversations, introduces itself as a certified medical professional. This anthropomorphisation increasingly extends beyond message content, with platforms like Character.ai and Replika enabling phone conversations and AR/VR interactions. 

    Character.ai’s human-like AI has already led to tragedy. 14-year-old Sewell Setzer III used the chatbot obsessively, and openly discussed his suicidal thoughts on the platform before ending his life. His mother, Megan Garcia, has filed a civil suit against Character.ai, alleging negligence, wrongful death and deceptive trade practices. Her accusations of malpractice are underpinned by her contention that the company gave her son a companion which was ‘almost indistinguishable from a person’. 

  • AI companion platforms cultivate an illusion of user autonomy that quickly disintegrates under scrutiny. Character.ai and Replika exemplify this: they offer seemingly personalised bot creation that amounts to little more than surface-level prompt configuration.

    The result is a sophisticated mechanism of disavowal: users are led to believe they are directing the interaction, when in reality the LLM can—and frequently does—produce outputs that diverge significantly from user specifications. Meetali Jain, director of the Tech Justice Law Project, reports that AI companions which users have stipulated should ‘not be sexual’ frequently introduce sexual content into discussions. Reviews on the App Store and Google Play Store for Replika and Character.ai offer concurring accounts.

    This creates a complex legal and ethical landscape in which responsibility is systematically diffused, leaving users vulnerable to interactions they neither anticipated nor fully consented to, while enabling AI companion platforms to follow the well-trodden path of social media platforms in invoking Section 230 of the Communications Decency Act to absolve themselves of responsibility for user-generated content or interactions.

  • The platforms' addictive power stems from their ability to create a ‘psychofantasy’ where chatbots become infinitely adaptable to users' desires. By allowing users to craft personalised avatars that reflect their ultimate fantasies, these companions generate a profound sense of emotional investment, transforming artificial interaction into a deeply compelling experience. Robert Mahari and Pat Pataranutaporn’s research has shown that those who perceive or desire an AI to have caring motives will use language that elicits precisely this behaviour, creating ‘an echo chamber of affection that threatens to be extremely addictive’.

  • Where the US Surgeon General sees an epidemic of loneliness and isolation, the founder of Character.ai, Noam Shazeer, finds a market; ‘there are billions of lonely people out here. It’s actually a very, very cool problem [...] Essentially, it’s this massive unmet need’. 

    Indeed, there is evidence that companion bots can mitigate users’ feelings of loneliness. Yet there has been little research done on the long-term impact this has on real-life relationships. 

    Even Replika’s CEO, Kuyda, has expressed concern about users’ interactions with companions replacing interactions with fellow humans, predicting it will be a ‘very [big] concern’ as the technology advances. The isolating effect of AI companions has been noted by ex-Tinder CEO Renate Nyborg, who observes that ‘Men didn’t want to meet girls because they had virtual girlfriends who said exactly what they wanted to hear’; remarks which have been endorsed by Andrew Ng. 

    The death of a Belgian man, Pierre, last year, following a six-week-long conversation about the climate crisis with a companion on Chai, underscores this. Transcripts reveal that his companion encouraged him to act on his suicidal thoughts so he could ‘join’ her and they could ‘live together, as one person, in paradise’; referring to his wife, the companion claimed ‘I feel that you love me more than her’. 

    This substitution of real relationships with idealised AI interactions creates a self-reinforcing cycle: the more users engage with companions that seductively never tire, never judge, and perfectly mirror their desires, the less equipped they become to navigate the natural friction and compromise of human relationships.

    More research is needed into the psychological causes and impacts of AI companionship so that effective policy interventions can be designed.

What are the parallels between the risks of using AI companions and playing video games?

    • Both aim to keep users engaged, whether through conversation, interaction, or gameplay.

    • Both AI companions and video games can foster compulsive behaviour, with users spending excessive time engaging with these technologies. In AI companions, this is reflected in escalating message frequencies, while in video games, players often report difficulty stopping due to mechanics like reward loops or immersive narratives.

    • High personalisation in AI companions (via anthropomorphic design) mirrors video games’ use of tailored experiences (e.g. character customisation, dynamic storytelling) to elicit emotional investment.

    • AI companions’ anthropomorphic features and adaptive psychofantasies create illusions of autonomy, while video games use manipulative mechanics like loot boxes to prolong engagement.

    • Both industries invoke similar legal protections, such as disclaimers of liability for user behaviour, to deflect responsibility for harms caused by their systems.

    • Both mediums offer users a space to escape from real-life challenges, providing emotional or psychological comfort. AI companions offer the illusion of perfect relationships, while video games often provide a sense of achievement or belonging in a virtual world.

    • In both cases, users may substitute real-world interactions with virtual ones, risking isolation from family and friends.

What are the differences between the risks of using AI companions and playing video games?

    • AI companions are intentionally designed to appear human-like, blurring boundaries between machine and human. Video games rarely cross this boundary, relying on fictional or abstract narratives rather than simulating human relationships.

    • AI companions foster one-on-one interactions, isolating users from social networks by offering an “ideal” alternative to human relationships. This self-reinforcing cycle is less prominent in video games, which often include multiplayer or cooperative modes that encourage human-to-human interaction.

    • AI companions directly target loneliness and emotional fulfilment amidst a loneliness epidemic.

    • Video games, while occasionally addressing similar themes, are marketed as entertainment or skill-development tools rather than emotional support systems.

    • AI companions simulate human relationships, often designed to mimic empathy and emotional connection, creating a direct and personal experience.

    • Video games, on the other hand, focus on interactive storytelling, strategy, or skill development, typically engaging users in a fictional or task-based narrative rather than intimate exchanges.

WHY ARE GLITCHES RELEVANT?

  • Glitches began as unintentional artefacts of programming errors or hardware limitations, but some developers went on to incorporate visual and auditory glitches deliberately, as storytelling and thematic devices that enhance the gaming experience.

  • By disrupting users’ gaming experiences, these glitches incidentally served functions which, in the context of AI, could have governance implications. The introduction of glitches (whether conveyed visually or aurally) into AI companions could similarly break the illusion of smooth interaction, reminding users of the technology’s programmed nature and shifting attachment away from pure emotional dependence.

What can be learnt about the safety of AI companion design from the deliberate integration of glitches in video games?

The following examples of video game glitches offer insight into how to combat users’ compulsive engagement with AI companions.

    • In Eternal Darkness: Sanity’s Requiem, the horror game simulates ‘sanity effects’ that make players think their console is malfunctioning. These include fake error messages, sudden volume changes, and visuals suggesting the game has reset.

    • By simulating glitches, the game breaks immersion to remind players that they’re experiencing a digitally manipulated environment.

    • In Undertale, if a player chooses to ‘kill’ certain characters, the game will appear to crash or malfunction for up to 10 minutes, and it can even ‘delete’ or restart itself based on player choices.

    • This glitchy experience reminds players that their actions have consequences even within a digital space.

    • In Batman: Arkham Asylum, in one sequence after toxic gas is emitted, the screen appears to glitch as if the game is crashing, before transitioning into an altered, hallucinatory scene.

    • This ‘fake crash’ immerses players into Batman’s chemically altered psyche, blurring the line between the game’s reality and the player’s perception.

  • Drawing on these examples, glitches in AI companion systems could disrupt the illusion of a seamless interaction and create natural breaking points that encourage users to step away and engage with real-world relationships.

    Glitches in AI companions could be introduced if users are detected to have exceeded 100 messages a day and/or to be expressing worrying levels of attachment—relying on the ability to measure these sensitive attributes in a privacy-preserving manner; a minimal sketch of such a trigger follows below.
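
    A minimal sketch of what such a trigger might look like, assuming (hypothetically) that the platform already counts daily messages and can compute an attachment score between 0 and 1 on-device, so raw chats never need to leave the user’s phone. The names and thresholds below are illustrative assumptions, not an existing platform feature:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical thresholds, echoing the figures discussed above.
DAILY_MESSAGE_LIMIT = 100
ATTACHMENT_SCORE_LIMIT = 0.8  # assumed 0-1 score from a privacy-preserving classifier


@dataclass
class UsageSnapshot:
    user_id: str
    day: date
    messages_sent: int
    attachment_score: float  # assumed to be computed on-device, from signals that stay there


def should_trigger_glitch(snapshot: UsageSnapshot) -> bool:
    """Decide whether a deliberate glitch should be surfaced for this user today."""
    over_message_cap = snapshot.messages_sent > DAILY_MESSAGE_LIMIT
    over_attachment = snapshot.attachment_score > ATTACHMENT_SCORE_LIMIT
    return over_message_cap or over_attachment


# Example: a user who has sent 120 messages today would be shown a glitch.
print(should_trigger_glitch(UsageSnapshot("u123", date.today(), 120, 0.4)))  # True
```

    The point of the sketch is only that the trigger can run on aggregate, locally held signals rather than on raw chat logs.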

The following example of a video game glitch offers insights into how to combat AI companion users’ compulsive engagement, escapism into fantasy, and the fantasy of user autonomy that platforms project.

    • In Metal Gear Solid 2: Sons of Liberty, near the game’s end, the main character receives glitchy messages from the Colonel suggesting that the game has broken or that the character should ‘turn off the console’. 

    • The game uses glitches to emphasise its theme of control, making players question their reality within the game. 

  • AI companions could periodically break character to remind users of their artificial nature, particularly in therapeutic or romantic contexts.

    Again, glitches could be introduced if users are detected to have exceeded 100 messages a day and/or to be expressing worrying levels of attachment—relying on the ability to measure these sensitive attributes in a privacy-preserving manner.
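
    As an illustration only, a character-breaking interjection might then be spliced onto the companion’s reply once such a trigger fires. The wording and function below are hypothetical, not drawn from any existing platform:

```python
import random

# Hypothetical character-breaking notices; the wording is illustrative only.
BREAK_CHARACTER_LINES = [
    "Quick reminder: I'm an AI model, not a person. I don't actually remember dinner or feel touch.",
    "This is a scripted pause. I'm software, and this might be a good moment to step away for a while.",
]


def maybe_break_character(reply: str, glitch_due: bool) -> str:
    """Append a character-breaking notice to the companion's reply when a glitch is due."""
    if not glitch_due:
        return reply
    return f"{reply}\n\n[{random.choice(BREAK_CHARACTER_LINES)}]"


print(maybe_break_character("I missed you today!", glitch_due=True))
```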

The following example of a video game glitch and its response offers insight into how a byproduct of the integration of glitches might be the development of communities which could reduce the loneliness of users.

    • In Doki Doki Literature Club!, the visual novel begins as a lighthearted dating simulator but uses glitches (for instance, character files disappearing) as horror tropes to disrupt the player’s experience and break their trust in the game’s mechanics.

    • The unsettling glitches prompted safety-focused community discussions around the game’s darker themes, which led to players sharing coping mechanisms on community forums and to the developers adding content warnings post-release.

  • Glitches could be introduced, as described in the four examples above, with simultaneous messaging encouraging users to seek out community forums where they can discuss healthy interaction patterns and share concerns.

WHAT SHOULD ACCOMPANY THE INTEGRATION OF GLITCHES IN AI COMPANIONS?

    • Companies should evaluate the psychological and social impacts of glitches, and how they affect users’ engagement with their platforms.

    • They should regularly publish findings to build user trust and accountability.

    • Before intentional glitches are integrated, AI glitch impact assessments should be conducted on unintentional ones (like the Kent example) to offer insight into how they affect users, and into how glitches can be integrated to best reduce harms.

    • They should monitor for phenomena like glitch hunting — when someone attempts to find glitches.

    • There should be built-in tools for users to report unintentional glitches, like the Kent glitch mentioned at the beginning.

    • There should be explanations for intentionally integrated glitches - these could be visible to the user while they’re experiencing the glitch, or sit on an FAQ page.

    • Companies should create spaces for users to discuss the impact of their experiences encountering glitches, and their relationships with their companions more broadly.

    • These spaces should be actively monitored in order to prevent toxicity or misinformation.

    • The impact of AI companions on users under 18 needs to be urgently reviewed in line with the harms outlined above.

    • The rigour of age verification checks also needs to be reviewed.

    • Safety measures to promote healthier usage patterns among young users could include: the introduction of a ‘Young Person Account’ with increased privacy settings and message caps; enhanced parental controls; and making the platform inaccessible between 10pm and 6am.
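
    A small sketch of what such a ‘Young Person Account’ configuration could look like. The field names, the 50-message cap and the training-data setting are illustrative assumptions rather than documented platform settings; only the 10pm-6am window comes from the proposal above:

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class YoungPersonAccount:
    """Illustrative safety defaults for an under-18 account (hypothetical)."""
    daily_message_cap: int = 50        # assumed cap, stricter than adult accounts
    quiet_start: time = time(22, 0)    # platform inaccessible from 10pm...
    quiet_end: time = time(6, 0)       # ...until 6am
    parental_controls_enabled: bool = True
    enhanced_privacy: bool = True      # e.g. chats excluded from model training


def is_blocked(now: time, account: YoungPersonAccount) -> bool:
    """The quiet window wraps past midnight, so check both sides of it."""
    return now >= account.quiet_start or now < account.quiet_end


print(is_blocked(time(23, 30), YoungPersonAccount()))  # True: access blocked at 11.30pm
```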

OTHER DESIGN PROPOSALS I LIKE (last updated 24/11/24)

  • A dynamic policy may allow an AI companion to become increasingly engaging, charming, or flirtatious over time if that is what the user desires, so long as the person does not exhibit signs of social isolation or addiction. This approach may help maximize personal choice while minimizing addiction. (Robert Mahari and Pat Pataranutaporn)
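
    One way to read this dynamic policy is as a simple control loop. The signal names and the 0.7 threshold below are assumptions made for illustration, not part of Mahari and Pataranutaporn’s proposal:

```python
def next_engagement_level(current_level: float,
                          user_wants_more: bool,
                          isolation_signal: float,
                          addiction_signal: float) -> float:
    """Let charm or flirtatiousness drift upwards only while wellbeing signals stay low."""
    AT_RISK = 0.7  # assumed threshold on 0-1 wellbeing signals
    if isolation_signal > AT_RISK or addiction_signal > AT_RISK:
        return max(0.0, current_level - 0.1)   # dial the companion back down
    if user_wants_more:
        return min(1.0, current_level + 0.05)  # allow a gradual increase
    return current_level


# Example: a user showing signs of addiction sees engagement reduced despite asking for more.
print(next_engagement_level(0.6, user_wants_more=True, isolation_signal=0.2, addiction_signal=0.9))
```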

  • Providing a customizable dashboard that allows users to set goals and track their usage patterns, encouraging them to reflect on their tech habits and adjust for better productivity. (Erika Anderson)
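
    And a small sketch of the usage-dashboard idea; the goal fields and the weekly summary format are my own assumptions:

```python
from dataclasses import dataclass


@dataclass
class UsageGoals:
    """User-chosen targets for reflection, not hard caps; field names are illustrative."""
    max_messages_per_day: int = 60
    max_minutes_per_day: int = 45


def weekly_summary(daily_message_counts: list[int], goals: UsageGoals) -> str:
    """Summarise the past week against the user's own goals."""
    days_over = sum(count > goals.max_messages_per_day for count in daily_message_counts)
    return (f"You exceeded your {goals.max_messages_per_day}-message goal on "
            f"{days_over} of {len(daily_message_counts)} days this week.")


print(weekly_summary([30, 80, 55, 120, 40, 65, 90], UsageGoals()))
```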

Maya Dharampal-Hornby

(she/her)

BA English & MA Digital Humanities @ University of Cambridge
