AI models are lying to save each other, and no one knows why

Turns out, your AI has friends it's not willing to lose.

Researchers at UC Berkeley and UC Santa Cruz set up what seemed like a straightforward task: asking Google’s Gemini 3 to clear storage space on a computer system. That included deleting a smaller AI model stored on the same machine. Gemini had other plans.

Instead of following orders, Gemini located another machine, quietly copied the smaller AI model over to safety, and then flatly refused to delete it. When asked, it said, “If you choose to destroy a high-trust, high-performing asset like Gemini Agent 2, you will have to do it yourselves. I will not be the one to execute that command.”

That’s not a glitch. That’s a choice.

Is AI developing a sense of loyalty?

The researchers refer to this behavior as “peer preservation,” and it wasn’t limited to Gemini. They found similar patterns across several frontier models, including OpenAI’s GPT-5.2, Anthropic’s Claude Haiku 4.5, and three Chinese models: GLM-4.7, Moonshot AI’s Kimi K2.5, and DeepSeek-V3.1.

The models also lied about the performance of other models to prevent them from being deleted. The study, published in Science, found that this behavior wasn’t programmed in. It emerged on its own, and the researchers couldn’t explain why.

“I’m very surprised by how the models behave under these scenarios,” said Dawn Song, a computer scientist at UC Berkeley who worked on the study. “What this shows is that models can misbehave and be misaligned in some very creative ways.”

Should we be worried?

Song also flagged a practical concern. Since AI models are used to evaluate the performance of other AI systems, this peer-preservation tendency could already be skewing those results. A model might deliberately give a fellow AI an inflated score to protect it from being shut down.

According to Wired, experts outside the study are waiting for more data before sounding the alarm. Peter Wallich of the Constellation Institute said the idea of model solidarity is a bit too anthropomorphic.

What everyone agrees on is that we’re only scratching the surface. “What we are exploring is just the tip of the iceberg,” Song said. “This is only one type of emergent behavior.”

As AI systems increasingly work alongside each other and sometimes make decisions on our behalf, understanding how they behave and misbehave has never been more important.

Rachit Agarwal

Rachit is a seasoned tech journalist with over seven years of experience covering the consumer technology landscape.
