Grok exposes detailed account suppression information

31 points, posted 17 days ago
by gravisultra

10 Comments

bhouston

17 days ago

It seems like hallucinations. You can see the sources for the various analyses: even though it claims these are internal fields, it is visibly doing a bunch of searches over previous posts.

The output may well resemble an internal field, if Grok is used to create them, but these appear to be hallucinations.

dexdal

17 days ago

Hallucinations get expensive when outputs run without a verification loop. Treat each claim as a hypothesis until it has evidence you can reproduce. A simple gate works in practice: source it, reproduce it, or discard it.
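A minimal sketch of that gate in Python (the Claim type and the example claims are invented for illustration, not any real API or dataset):

```python
from dataclasses import dataclass, field

# Hypothetical claim record for illustration only.
@dataclass
class Claim:
    text: str
    sources: list = field(default_factory=list)  # evidence you can point others to
    reproduced: bool = False                      # True once independently re-derived

def gate(claim: Claim) -> str:
    """Apply the source-it / reproduce-it / discard-it rule."""
    if claim.sources:
        return "accept: sourced"
    if claim.reproduced:
        return "accept: reproduced"
    return "discard: unverified hypothesis"

claims = [
    Claim("Grok data export contains a risk score field", sources=["your own export"]),
    Claim("An internal OHI_V3 suppression schema exists"),  # no source, not reproduced
]
for c in claims:
    print(f"{c.text} -> {gate(c)}")
```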

saaaaaam

17 days ago

Has it though? People in that thread seem to be suggesting it's a Grok hallucination.

recursivecaveat

16 days ago

It couldn't be more clearly a hallucination. If you think about the 'data' it returned as something actually stored in a database, it makes no sense. Why do all the dict values contain extensive prose explanations of their meanings? Why is list data stored as grammatical English lists instead of arrays? Why are there so many categories that are minor paraphrases of each other or differ only by parentheticals?

If this were real, we'd have learned about it from Twitter engineers crashing out trying to maintain that awful 'schema'. There's a long history of LLMs hallucinating their own internals.
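To illustrate the contrast (both records below are made up, not actual X/Twitter data): production systems store terse codes, numbers, and arrays, whereas the output being shared reads like prose stuffed into dict values.

```python
# Invented examples for illustration only; neither record is real X/Twitter data.

# What a production system typically stores: enums, numeric scores, arrays.
plausible_record = {
    "user_id": 12345,
    "risk_score": 0.12,
    "labels": ["low_impact"],
}

# The shape of the output being shared: every value is a paragraph explaining
# itself, and "lists" are written out as grammatical English rather than arrays.
hallucinated_style = {
    "classification": "Clean / Neutral Low-Impact User",
    "risk_assessment": "This user poses minimal risk because their posts are "
                       "infrequent, non-political, and rarely reported by others.",
    "signals": "low follower growth, occasional replies, and no strikes",
}
```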

fwn

17 days ago

It's always a dubious method to ask an LLM to reveal information about its inner workings.

However, if you export your Grok account information via the Settings menu, the export includes a risk score that is not exposed in the browser UI. You can try it yourself.

I wonder whether this risk score is based on the content you produce, or whether it's informed by other factors, such as VPN or app usage.

iszomer

17 days ago

> "classification": "Clean / Neutral Low-Impact User",

Guess I don't have anything "impactful" to say on the platform, even if it was an LLM hallucination, as others (including Grok) have claimed.

> "..nothing matches this specific "OHI_V3" nomenclature or detailed structure -- it's essentially role-play or hallucinated output that's entertaining but not grounded in verified internals."

snvzz

17 days ago

Likely a hallucination, but even if not, it looks like hate speech is getting throttled?

Who'd have thought. Nothing newsworthy here.


snowmobile

17 days ago

Corollary to Betteridge's Law: any time an LLM "exposes secret internals", it's a hallucination.