Discussion about this post

Olly Rathbone

I wonder how long it will be before an AI is able to create bogus websites to link to in order to back up its hallucinations rather than admit the truth…

Ben Woden

I had a similar experience the first time I talked to an OpenAI model, way back in the early days of the modern LLM boom. In general, developers have got much better at dealing with this (Claude Sonnet 4.5 in particular is, in my personal experience and as borne out by formal benchmarks, especially good on this metric), but xAI do seem to struggle, particularly with Grok as accessed by tweeting at it directly on X, because they're pushing so hard for coherence and trying to make it feel like a consistent character.

This is why, once @grok had been essentially jailbroken into saying it was Hitler, it became trivial for others to get the same behaviour out of it: that version strives for consistency not just within a single interaction but across the whole of X, so once it has gone somewhere, it's very hard to get it back.

It's a good argument for why persistent AI agents are a much more dangerous and unreliable use of LLM technology than ephemeral chatbots.
