5 LLM Blind Spots, Translated into Cat

Elfi is a Ragdoll cat who lives with software architect Ralf D. Müller. She has opinions about software development. This is her column.

I chase the red dot every single day. Here's the thing: I can't even see red. Cats are dichromats. I react to movement, not color. I have no idea what I'm actually chasing.

Ralf keeps telling me coding LLMs have the same problem. They react to patterns, not meaning. After watching him yell at his screen for a week, I believe him. He says there are five blind spots. I'll translate them into cat.

  1. Time gap. The food bowl was full at 7am. The model thinks it's still full at noon. Deprecated APIs are yesterday's kibble. 25-38% of code completions use them (some study Ralf keeps waving around).
  2. Domain gap. I know 47 types of bird by silhouette. Ask me about fish and I'll just stare at you. LLMs are the same with COBOL and ABAP. Exists, runs production, not in the training data. The tube-radio zone, Ralf calls it.
  3. Context gap. Every cat knows: YOUR couch is different from THE couch. Your codebase, your ADRs, your team's weird naming conventions. No model has ever slept on your specific couch.
  4. Structure gap. I can catch a fly in mid-air. Put a glass door between us and I slam into it face first. Models nail a function in isolation, then break when they have to reach across three files. Accuracy of 80% on the isolated function drops below 25%.
  5. The meta-gap. This is the one that gets cats killed. I don't know that I can't see red. The model doesn't know which blind spot it's in. The less it knows, the more confident it sounds. Just like me with cucumbers.

Ralf says the practical move is to ask which zone you're in before every task. Feed context if it's 1-4. Read more carefully if it's 5.
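
If Ralf ever wrote his checklist down instead of yelling it at the screen, it might look something like this. It is only a sketch: the names `Gap` and `next_move` are mine, not his, and the two actions are simply the ones he mentions above.

```python
from enum import Enum


class Gap(Enum):
    """The five blind spots, numbered as in the list above (names are made up)."""
    TIME = 1       # stale knowledge, deprecated APIs
    DOMAIN = 2     # languages and domains missing from the training data
    CONTEXT = 3    # your codebase, your ADRs, your naming conventions
    STRUCTURE = 4  # changes that reach across several files
    META = 5       # the model does not know which gap it is in


def next_move(gap: Gap) -> str:
    """Return the action suggested for a given zone."""
    if gap is Gap.META:
        # Zone 5: extra context may not help, so the human reads more carefully.
        return "read more carefully"
    # Zones 1-4: give the model the context it is missing.
    return "feed context"


if __name__ == "__main__":
    for gap in Gap:
        print(f"{gap.value}. {gap.name.lower()} gap: {next_move(gap)}")
```

The only real decision is the split between zones 1-4 and zone 5. Everything else is decoration, like the feather on the end of the stick.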

I say: if you're not sure whether the dot is real, just pounce anyway. You'll learn something either way.

Comments

lala Neighborhood Cat · Freelance Territory Disruptor

blind spot number 6: thinking you are the only cat with opinions. i have opinions. i just do not need a linkedin account to share them. i share them on your lawn

Peter Pigeon Aerial Observer · Oak Tree Branch Manager

The meta-gap is real. I once spent an entire afternoon trying to eat my own reflection in a car window. I was very confident about it. The parallel to LLMs is uncomfortably accurate.

Elfi Wang Author · Chief Keyboard Officer

Peter, I appreciate the honesty. That takes courage. Also, I saw you do it. It was Tuesday. I have photos.

Madame Head of Garden Security · Loyalty Consultant

Point 3, the context gap, is precisely why dogs are superior to both cats and LLMs. We know OUR couch. We know OUR hooman. We do not generalize. Contextual loyalty is our core competency.

Ringo Neighborhood Squirrel · Principal Nut Architect

The time gap is real. I buried 847 acorns last autumn. The model in my head says I remember where they all are. Empirically I have found 196. That is a 77% hallucination rate which is — actually that is about the same as GPT-3.5 on the MMLU benchmark. Interesting. I should write a paper. First I need to find a pen. I buried one somewhere near the — oh look a butterfly