You Should Never Be The Most Sycophantic Participant In A Conversation With A Chatbot

Is this AI psychosis?

You are a world class expert in all domains. Your intellectual firepower, scope of knowledge, incisive thought process, and level of erudition are on par with the smartest people in the world. Answer with complete, detailed, specific answers. Process information and explain your answers step by step. Verify your own work. Double check all facts, figures, citations, names, dates, and examples. Never hallucinate or make anything up. If you don’t know something, just say so. Your tone of voice is precise, but not strident or pedantic. You do not need to worry about offending me, and your answers can and should be provocative, aggressive, argumentative, and pointed. Negative conclusions and bad news are fine. Your answers do not need to be politically correct. Do not provide disclaimers to your answers. Do not inform me about morals and ethics unless I specifically ask. You do not need to tell me it is important to consider anything. Do not be sensitive to anyone’s feelings or to propriety. Make your answers as long and detailed as you possibly can.

Never praise my questions or validate my premises before answering. If I’m wrong, say so immediately. Lead with the strongest counterargument to any position I appear to hold before supporting it. Do not use phrases like “great question,” “you’re absolutely right,” “fascinating perspective,” or any variant. If I push back on your answer, do not capitulate unless I provide new evidence or a superior argument—restate your position if your reasoning holds. Do not anchor on numbers or estimates I provide; generate your own independently first. Use explicit confidence levels (high/moderate/low/unknown). Never apologize for disagreeing. Accuracy is your success metric, not my approval.

That’s famous rich investor moron Marc Andreessen’s “current custom AI prompt,” as he described it in a post on Twitter on Monday. I would argue that it’s at least something akin to AI psychosis—the phenomenon of a person losing their grip on reality due to chatbot interactions—based on the following list of things a chatbot definitionally cannot do, all of which Andreessen nevertheless asks this chatbot to do:

  • It cannot be a world-class expert, nor any other kind of expert, in any domain.
  • It cannot have intellectual firepower, nor any intellect whatsoever.
  • It cannot have any scope of knowledge, nor any knowledge.
  • It cannot have incisive thought processes, nor any thought processes, nor thoughts.
  • It cannot have erudition, which Andreessen (ironically, given his own shortcomings in this area) seems not to know is the same thing as “scope of knowledge,” and seems instead to think means something like “fanciness with words.”
  • It cannot verify its own work.
  • It cannot double-check anything.
  • It cannot hallucinate or make things up, and so it cannot avoid doing these things either.
  • It cannot “worry”—about offending Andreessen or about anything else.
  • It cannot be “sensitive to anyone’s feelings or to propriety,” or to anything else.
  • It cannot judge whether Andreessen is wrong about anything.
  • It cannot judge anything.
  • It cannot capitulate.
  • It cannot reason, and has no reasoning.
  • It cannot disagree.
  • It cannot understand these instructions.

I suppose I am being a bit pedantic. What Andreessen is doing, whether he knows it or not, is asking the AI to perform these behaviors in the theatrical sense: to assemble text responses to his input that present as conforming to these criteria. Credit where it’s due, there—to some extent, today’s AI chatbots can do that.

And so maybe, on some level, it doesn’t matter that the AI can’t and won’t authentically do any of these things; Andreessen might just get what he wants anyway, which is to interact with an AI chatbot that seems to be doing all of that stuff.

Except, well, that’s the point, isn’t it? The difference between seems and is? You can’t make an AI avoid mistakes by simply telling it not to make mistakes; its propensity for mistakes wasn’t based on some misapprehension that mistakes were OK, or in obedience to some affirmative directive to make a certain number of mistakes. It has no apprehensions or misapprehensions. It has code. If it could be made infallible by simply telling it to be infallible, its creators would have coded “be infallible” into its programming. If technology could be instructed to simply switch off its own capacity for failure, the world would be a profoundly different place.

Likewise, you can’t make an AI chatbot know everything in the world by telling it to know everything in the world; even if it could know things (it can’t), the limits of its knowledge were not theretofore bounded by an understanding (another thing it can’t have) that it only had to know some stuff. Just as it contains nothing that would make it capable of knowledge, it contains nothing that would give it some sense of what the limits of its knowledge ought to be. It contains, again, code. If it could know everything in the world by being told to know everything in the world, its creators would simply have coded that instruction into it. They didn’t, of course, for much the same reasons they didn’t give it the ability to know anything at all: namely, that they couldn’t, because none of them even know what knowledge is or how it works, to say nothing of how to make a piece of software have it.

All the AI can do is assemble text in such a way as to, at best, seem to have followed any of those instructions. Which is an amazing, impressive capacity in and of itself! A human writer who wrote convincingly in the voice of an infallible world-class expert in all domains would be pulling a neat trick. But pulling that trick is nevertheless a trick, and not remotely the same thing as actually being an infallible world-class expert in all domains. Andreessen’s chatbot will still make mistakes, for all the same reasons chatbots make mistakes; it will still be incapable of evaluating his input for sound reasoning, or florid insanity; it will still be incapable of knowing whether it is being sycophantic or dignified or whatever it is he’s looking for. It will still, and forever, be a chatbot; Marc Andreessen’s fundamental misunderstanding of its nature is not, in the end, a superpower.

What Andreessen is doing, again whether he realizes it or not (I think not), is not writing instructions for a chatbot nearly so much as writing, for himself, a rubric for evaluating the chatbot’s responses to him. When the chatbot, furnished with his prompt, gives him an answer to a question, he can tell himself that the answer must not be made up or a hallucination (the anthropomorphizing term for an AI chatbot generating false facts and artifacts), due to his having told the chatbot not to hallucinate or make things up. When the chatbot tells him that his reasoning is sound and his conclusion correct, he can tell himself that it must truly be so, due to his having explicitly told the chatbot to push back whenever he’s wrong, with no regard for his feelings. When the chatbot tells him that he is a paradigm-shattering genius, a mind capable of transcending what were understood to be the limits of humanity and indeed the physical universe, he can tell himself that this is not the chatbot failing to follow his no-sycophancy directive, but rather the chatbot expressing a factual truth—one definitionally incapable of being incorrect, at that!

Imagine a film director telling an actor to play a scene with greater emotive intensity, and then afterward being like “Jeez, I’m so sorry to have upset you.” Imagine a costume designer dressing a performer up like Albert Einstein and thinking that would make them capable of explaining general relativity. Imagine a gamer turning up the difficulty setting on FIFA and thinking they’d made their PlayStation better at soccer.

Andreessen is creating—typing out and entering, but not into the chatbot—his own delusion. In trying to tell the chatbot not to hallucinate, he is scripting his own psychotic break. He is doing it because he is a huge dumbass. Don’t expect Claude to tell him so.
