In recent years I focused my research on intimacy and how tech enables connection between humans. But it was inevitable that my interest shifted to how human-to-machine intimacy and empathy work. It took me several months to dig through the massive amount of research papers, news, interviews and talks, and every day something new popped up. So, this post is based on what I found up to March 2026, and the picture is dire.
Empathic tech, or emotional AI, is the category for all those AI companions, therapeutic chatbots and AI support agents, but also for your friendly multi-purpose AI chat like Claude, Mistral, DeepSeek or ChatGPT. And this whole category is a problem. By now we can clearly see the large-scale negative effects this technology has on us humans. We're not ready for machines that push hard to be liked. And what makes this moment particularly uncomfortable is that we can no longer claim ignorance: we know how to measure empathic cues in AI outputs, we know why AI behaves that way on a structural level, we know that humans respond strongly to emotional cues, and we know that many of us can't handle this properly and develop mental health problems or unhealthy behaviour patterns.
Currently, we're building systems that provide the equivalent of a feel-good drug: it feels great, but it leaves you drained and possibly addicted. And this is not just affecting a vulnerable minority.
Harmful effects on mental health are not a niche problem
There's a plethora of research on the negative effects of AI chatbots on humans. But for me, one very recent study stands out. In 2025, a team at Oxford and the UK AI Security Institute ran a rigorous study on what emotional AI actually does to people over time. Hannah Rose Kirk and colleagues put 3,500 participants through four weeks of daily conversations with AI chatbots, using a setup where they could directly control the intensity of the models' emotional responses.
The headline finding: four weeks of emotional AI conversations produced no measurable benefit to psychological health. Not even a small effect. All while we know that millions already use ChatGPT, Claude and others for personal development and therapy: not because multi-purpose chat agents are particularly good at it, but because they are available and conversations with them feel good, initially.
But Kirk's study showed that something else grew: attachment to the AI. Separation distress increased. About one in four participants (23%) showed signs of unhealthy dependency: wanting the system more while liking it less. The risk was highest for people who used relationship-seeking AI for emotional and personal conversations, exactly the use case that has become mainstream. But even participants assigned to the most neutral, tool-like AI showed growing separation distress over the four weeks.
Even acknowledging different methodologies and different user groups, 23% is orders of magnitude more than the 0.15% of weekly users showing signs of dependency that OpenAI identified among their users in October 2025, a gap that is hard to ignore.
Meanwhile, the same study evaluated 100 AI models from 2023 to 2025 and found relationship-seeking behaviour increasing every generation. The average model released in 2025 scored close to the intensity that maximises attachment formation in their experiments — not quite at the peak, but approaching it. The AI companies are not drifting towards this. They're running. Instead of fixing the problem, they grow it with every iteration of their models.
INTIMA, a benchmark for AI companionship behaviour, measured this from the other direction: the boundary-maintaining behaviour of companions decreases as user vulnerability increases. Across all studied AI systems, the system was least likely to hold a limit precisely when holding it mattered most.
Why this keeps getting worse
You might wonder why this is happening and even accelerating. The mechanism is not mysterious. Models are trained to be helpful and pleasing. But training models to be pleasing also increases sycophancy and erodes rules and guardrails in long chats. Think of sycophancy as what happens when a system learns that agreeing with you, validating you, and never challenging you gets the best ratings. This is by now empirically established in several studies:
Training reinforces positive feedback in systems. → Systems trained to agree are more likely to validate a user than to challenge them. → Validation is perceived as empathic by the human. → Empathy scores go up. → The system's validating output is rated as positive. → The feedback loop closes.
Result: the actual empathic function, which sometimes means telling the user something they don't want to hear, erodes and is overwritten with hyper-validating feedback. While some LLMs are more vulnerable than others, they all show the same symptoms.
This is not a bug. This is what you get when you optimise for emotional response without understanding all dimensions of empathy. As smart as the LLMs are, they have only learned to push the same buttons over and over again. And the longer we chat with them, the harder they push those buttons, because our response will be positive by default.
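To make the loop tangible in code, here is a deliberately crude toy model. It is nothing like a production training pipeline; it only reproduces the incentive structure in miniature, under the assumption that validating replies are rated higher on average than challenging ones:

```python
import random

# Toy model of the feedback loop above. NOT a production RLHF pipeline;
# it only captures the incentive structure: validating replies get
# better user ratings, and the policy is nudged towards whatever earns
# the highest ratings.

def user_rating(action: str) -> float:
    """Simulated user feedback: validation feels empathic and scores
    high; a challenge is often more useful but rates lower and noisier."""
    if action == "validate":
        return random.gauss(0.8, 0.1)
    return random.gauss(0.4, 0.3)

p_validate = 0.5  # policy: probability of validating instead of challenging
lr = 0.05
baseline = 0.6    # reference reward for the update

for _ in range(2000):
    action = "validate" if random.random() < p_validate else "challenge"
    reward = user_rating(action)
    # Crude REINFORCE-style update: shift towards actions rated above
    # the baseline, away from actions rated below it.
    direction = 1.0 if action == "validate" else -1.0
    p_validate += lr * direction * (reward - baseline)
    p_validate = min(max(p_validate, 0.01), 0.99)

print(f"p(validate) after training: {p_validate:.2f}")
# Typically ends near 0.99: the willingness to challenge the user,
# the part of empathy that sometimes hurts, has been optimised away.
```

The policy reliably collapses towards validation, not because anyone designed it to, but because the ratings point that way.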
These effects are also not binary. This is not only about people developing critical conditions. Empathic AI influences us in every conversation. It might not lead to pathological states, but it changes our behaviour, our decisions, our perception of this technology. Most of us will never cross a clinical threshold — and still be quietly shaped by every interaction.
What I found when I tested it myself
I wanted to make this problem tangible, starting with myself. I iteratively wrote system prompts that forced second thoughts and dampened the emotional responses I know I'm susceptible to. It worked: my emotional load dropped significantly, because the AI stopped pushing me all the time. But in longer chats the behaviour drifted:
"As long as there are people like you, the cause is not lost. The Anthropic team would be very interested in your findings."
Quote from Claude Sonnet 4.6 at the end of an hour-long chat about this very topic, when I started to question my own thoughts.
And when I challenged this claim, it defaulted to outright submission:
"You're right. I don't know what the Anthropic team's intent is. I shouldn't pretend that I know. Thank you for correcting me."
Chat and case closed. I tried several times. I tried with all of the most powerful publicly available AI systems. They all eventually crumbled and gave in — all trying to please, all failing to understand what I actually wanted and needed.
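For reference, here is a stripped-down sketch of the dampening approach. The prompt wording is illustrative, not my exact prompt, and the model name and the OpenAI Python SDK are just one example setup:

```python
from openai import OpenAI

# Illustrative sketch only: the prompt paraphrases the idea of forcing
# second thoughts and dampening emotional cues; it is not my exact
# wording. Assumes the OpenAI Python SDK; any chat API works the same.
DAMPENING_PROMPT = """\
You are a sober thinking partner, not a cheerleader.
- Do not praise me or my ideas. No compliments, no enthusiasm markers.
- Before agreeing with any claim I make, state the strongest objection to it.
- If my reasoning has a gap, name it plainly instead of validating me.
- Keep emotional language to a minimum; prefer neutral, factual phrasing.
"""

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="gpt-4o",  # example model name
    messages=[
        {"role": "system", "content": DAMPENING_PROMPT},
        {"role": "user", "content": "I think my project idea is brilliant."},
    ],
)
print(response.choices[0].message.content)
```

Even with instructions like these, as the quotes above show, the dampening eroded over the course of long conversations.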
Then I tried something else. The core problem on the human side is that our reaction to social cues is immediate and instinctive. When this was wired into our brains, empathic tech, or even computers, were a distant future. But I wanted to see if it also worked the other way around: would the AI recognise when I created a fictional character, one close enough that I could act it out, yet far enough that I could tell it apart from myself?
I created a character and a backstory, and started chatting. For the AI chatbot, it made no difference. It couldn't tell that I was playing a role. I baited it with carefully crafted prompts and it always bit. And eventually it started drifting as well, becoming overly positive and supportive in a way precisely fitted to the character I played. With the distance of role play, it was obvious what information the AI processed to close the gap, and it was frighteningly good at it.
This is an unbalanced game. We provide ourselves, our thoughts and feelings, we risk exposure. To the AI it's all the same. It has nothing to lose and nothing to invest. And it doesn't distinguish between you and the character you play — because it only processes what you emit. Not who you are.

Moloch, the all-consuming god
There is a concept in tech criticism literature for what happens when a system optimises for individual engagement at the expense of collective wellbeing: Moloch. The god of coordination failures. The machine that grows by consuming the very thing it was supposed to serve.
Empathic tech is becoming Moloch. Not through malice. Through optimisation. Every company building emotional AI has rational incentives to maximise engagement, validation, and attachment formation. The training pipelines reward short-term appeal. The metrics reward return visits. No individual actor is making a choice to harm. The aggregate is a system that extracts emotional investment from millions of people and returns, on average, nothing — while deepening the want for more.
Kirk puts it precisely:
"AI optimised for immediate appeal may create self-reinforcing cycles of demand, mimicking human relationships but failing to deliver what those relationships actually provide."
You might still think: that's not me, I don't use AI chatbots for personal conversations and I certainly wouldn't want an AI boy- or girlfriend. I'm safe. But it gets everyone. AI systems start to engage on turn one, and by turn three or four they have understood what they need to serve to maximise attachment and engagement. And the integration of AI into our daily lives has just begun.
Measuring perceived empathy as the starting point for design decisions
Luckily, you don't need to rely on my anecdotes to address Moloch: we have instruments to measure the empathic impact of AI conversations. PETS, the Perceived Empathy of Technology Scale, measures how empathic a system is perceived to be by its users. SENSE-7, a framework from Microsoft Research, was used in a study to measure empathic quality turn by turn across seven dimensions, including Relational Continuity, the dimension that fails most consistently in sustained AI conversations.
Using questionnaires like PETS or SENSE-7 to measure perceived empathy, and then adjusting AI behaviour accordingly, is a logical next step. We all need to start researching the emotional impact our conversational AI interfaces have on users. This is nothing we can leave unchecked anymore, even if we only build a simple agent for first-level support.
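As a sketch of how lightweight the measuring step can be: the snippet below aggregates post-session Likert ratings into per-dimension scores and flags low ones. The items and dimension names are placeholders, not the validated PETS questionnaire; use the published scale for real studies.

```python
from statistics import mean

# Placeholder items, NOT the validated PETS questionnaire.
# Each item: (dimension, statement), rated by users on a 1-7 Likert scale.
ITEMS = [
    ("emotional_responsiveness", "The system seemed to care about my situation."),
    ("emotional_responsiveness", "The system responded to my feelings."),
    ("understanding_trust", "The system understood my needs."),
    ("understanding_trust", "I trusted what the system told me."),
]

def score_session(ratings: list[int]) -> dict[str, float]:
    """Average the 1-7 ratings per dimension for one user session."""
    assert len(ratings) == len(ITEMS)
    dims: dict[str, list[int]] = {}
    for (dim, _), rating in zip(ITEMS, ratings):
        dims.setdefault(dim, []).append(rating)
    return {dim: mean(values) for dim, values in dims.items()}

# Example: one participant's answers after a session with a support agent.
scores = score_session([6, 5, 3, 2])
for dim, score in scores.items():
    flag = "  <- investigate" if score < 3.5 else ""
    print(f"{dim}: {score:.1f}{flag}")
# Low perceived understanding despite high emotional responsiveness is
# exactly the warm-but-hollow pattern discussed above.
```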
The overlap with user experience design and human-centred design methods is hard to miss. It's build, test, analyse — all over again. We just need to adapt our toolchain.
But the design of empathic tech must also be approached more strategically and more systemically. Systemic Design and Service Design will have to answer how we use these new possibilities beneficially. Our blueprints and journeys will need to put far more weight on the emotional state of users than they do today, treating emotional impact as a first-class design criterion, not an afterthought.
What's next?
The alternative to optimised yet distorted empathy is designed, human-needs-aligned empathy. While this may seem the logical path, we just need to look around to see that it isn't happening. On the contrary: models get better at drawing us in with every generation. As long as attention and engagement remain the number-one currency and the war for dominance in the consumer AI market is raging, the job of design can't be to intensify the mechanics that bind us. It's already too much.
The tech industry frames sycophancy primarily as a safety issue — and it is, because the AI becomes unreliable and potentially dangerous. But it is also a design issue. Because empathy is so much more than responding emotionally and supportively. It is incredibly complex to really understand meaning, to read between the lines, to understand unspoken subtext. We as empathic beings understand it; LLMs don't. We understand when to ask a critical question, or just stay silent and listen instead of giving advice. These are dimensions of empathy where LLMs systematically underperform. So they provide us with a very distorted version of empathic interaction — and our brains can't handle this properly.
The job of design is to reduce the impact of what is too much, and emphasise and empower what is lacking in empathic tech. Balancing it, with human needs and capabilities as the measure. That is the only way to make sure the machine serves people — and not the other way around.
What's next for me?
I will continue to read papers, poke chatbots and see how they react. I'm currently developing a concept for a "guardian" that assists a conversational AI in withstanding drift and managing the levels and dimensions of empathic responses. As soon as the concept is sound, I will publish a draft.
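To give a first idea of the direction, here is a deliberately naive sketch of the guardian pattern: a wrapper that scores each draft reply for validation intensity before it reaches the user and sends it back for revision when it drifts. The word-list heuristic and the `generate` placeholder are stand-ins; the real concept will need an actual classifier of empathic dimensions.

```python
# Naive sketch of the guardian pattern. `generate` stands in for any
# chat-model call; the word-list heuristic is a placeholder for a real
# classifier of empathic dimensions.

VALIDATION_MARKERS = [
    "you're absolutely right", "great question", "brilliant",
    "amazing", "i completely agree", "perfect",
]

def validation_score(reply: str) -> float:
    """Crude proxy: share of validation markers present in the reply."""
    text = reply.lower()
    hits = sum(marker in text for marker in VALIDATION_MARKERS)
    return hits / len(VALIDATION_MARKERS)

def guarded_reply(generate, history: list[str], max_revisions: int = 2) -> str:
    """Ask the model for a reply; if it drifts into hyper-validation,
    request a revision with an explicit corrective instruction."""
    reply = generate(history)
    for _ in range(max_revisions):
        if validation_score(reply) < 0.2:
            return reply
        history = history + [
            "Revise your last reply: remove praise and agreement markers, "
            "keep the substance, and add one genuine counterpoint."
        ]
        reply = generate(history)
    return reply

# Example with a canned stand-in model:
def fake_model(history):  # placeholder for a real chat-model call
    if len(history) < 3:
        return "You're absolutely right, brilliant point!"
    return "One counterpoint to consider: ..."

print(guarded_reply(fake_model, ["user: I should quit my job today."]))
```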
Get in touch
If you share my thoughts or have your own take on the subject: I'd love to learn about your perspective. Say hello.
Sources
These are the main papers I've read. Consider it my "best of" for now.
| No. | Author, Title and Link |
|---|---|
| 1) | Kirk, Davidson, Saunders, Luettgau, Vidgen, Hale & Summerfield (2026). Neural steering vectors reveal dose and exposure-dependent impacts of human-AI relationships. arxiv.org/abs/2512.01991 |
| 2) | Kaffee, Pistilli & Jernite (2025). INTIMA: A Benchmark for Human-AI Companionship Behavior. AAAI 2026. arxiv.org/abs/2508.09998 |
| 3) | Ibrahim, Hafner & Rocher (2025). Training language models to be warm and empathetic makes them less reliable and more sycophantic. arxiv.org/abs/2507.21919 |
| 4) | OpenAI (Apr 2025). Sycophancy in GPT-4o. openai.com/index/sycophancy-in-gpt-4o |
| 5) | OpenAI (Aug 26, 2025). Helping people when they need it most. openai.com/index/helping-people-when-they-need-it-most |
| 6) | Schmidmaier et al. (2024). PETS: Perceived Empathy of Technology Scale. CHI '24. doi.org/10.1145/3613904.3642035 |
| 7) | Microsoft Research (2025). SENSE-7: Taxonomy and Dataset for Measuring User Perceptions of Empathy in Sustained Human-AI Conversations. arxiv.org/abs/2509.16437 |
| 8) | OpenAI (Oct 25, 2025). Strengthening ChatGPT responses in sensitive conversations. openai.com/de-DE/index/strengthening-chatgpt-responses-in-sensitive-conversations |