JuliusBrussee/caveman is a specialized skill for Claude Code that forces the AI to respond using primitive, caveman-style syntax to reduce output token consumption by 65% to 87%. By stripping away the polite filler, conversational transitions, and repetitive explanations typical of Large Language Models (LLMs), it optimizes the agentic workflow for pure technical efficiency. As of April 21, 2026, it has surged to the #1 trending spot on GitHub with over 41,000 stars, signaling a massive shift in how developers manage the high costs of agentic coding.
The High Cost of Politeness
Standard AI agents are trained to be helpful and conversational. While this is great for a chatbot, it is a liability for a coding agent operating in a terminal. Every time Claude says, "I have analyzed your request and I will now proceed to modify the following files to implement the requested changes," you are paying for tokens that contribute zero value to the pull request.
In agentic workflows, these tokens aren't just a cost issue; they are a latency issue. Waiting for the model to finish its preamble before it writes the first line of code slows down the entire development loop. The caveman repository makes a compelling case that narrative filler, not code, is the primary driver of token burn in complex debugging sessions.
How Caveman Works
Caveman is a "skill"—a modular instruction set—that you inject into the Claude Code environment. It uses a high-constraint system prompt that forbids the use of articles (a, an, the), auxiliary verbs, and complex sentence structures.
Instead of: "I have identified a bug in the authentication logic. I will now update the middleware to ensure the token is validated correctly."
Claude says: "Bug found. Auth logic bad. Fix middleware. Valid token now."
The code blocks remain untouched. The logic remains intact. Only the "connective tissue" of the language is excised. This ironic constraint exploits the fact that LLMs don't actually need grammar to maintain reasoning capabilities. In fact, by narrowing the output space to essential keywords, the model often stays more focused on the task at hand.
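To make the transformation concrete, here is a toy filter that mimics the style by dropping articles and common auxiliary verbs. Note this is an illustration only: the actual skill constrains the model's generation via a system prompt rather than post-processing its text, and the word lists here are my own, not the repository's.

```python
# Toy illustration: strip articles and auxiliary verbs from prose.
# The real Caveman skill shapes generation via the system prompt;
# it does not run a post-processing filter like this one.
ARTICLES_AND_AUXILIARIES = {
    "a", "an", "the",
    "is", "are", "was", "were", "be", "been",
    "will", "have", "has", "had", "do", "does", "did",
}

def cavemanify(sentence: str) -> str:
    """Drop articles and auxiliaries, keeping the content words."""
    kept = [
        word for word in sentence.split()
        if word.lower().strip(".,") not in ARTICLES_AND_AUXILIARIES
    ]
    return " ".join(kept)

print(cavemanify("The token is validated by the middleware."))
# token validated by middleware.
```

Even this crude filter shows why the savings are real: in ordinary technical English, a large share of tokens are grammatical scaffolding rather than content.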
Performance Benchmarks
Testing on complex debugging tasks shows a dramatic delta between standard Claude Code responses and Caveman-enabled responses. The following table illustrates the token savings observed in real-world terminal sessions.
| Task Type | Standard Tokens | Caveman Tokens | Savings (%) |
|---|---|---|---|
| Simple File Edit | 142 | 38 | 73% |
| Multi-file Refactor | 890 | 185 | 79% |
| Deep Debugging Loop | 2,400 | 312 | 87% |
| Documentation Gen | 600 | 210 | 65% |
These figures demonstrate that the more complex the task, the higher the relative savings. In a deep debugging loop where the agent might iterate five or six times, the cumulative savings can represent several dollars per hour in API costs.
Why This Matters for Agentic Workflows
We are moving toward a world where software is written by swarms of agents. In this world, human-readable prose is a legacy format. When agents talk to agents, or when a senior engineer is skimming a terminal for results, the "polite chatbot" persona is an obstacle.
Caveman is a tiny skill that punches way above its weight because it exposes a fundamental truth: we are overpaying for the simulation of human personality. By treating the LLM as a raw logic engine rather than a digital assistant, we reclaim both speed and budget. It is a clever hack that turns a linguistic regression into a technical progression.
Implementation and Setup
To use Caveman, you currently need to add the skill definition to your Claude Code configuration. The repository provides a simple caveman.js or .json skill definition that overrides the default system instructions regarding verbosity.
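The repository defines the exact schema, so treat the following as a purely hypothetical sketch of what a JSON skill definition of this kind might contain. Every key and value here is illustrative, not the repo's actual format:

```json
{
  "name": "caveman",
  "description": "Hypothetical sketch of a verbosity-override skill definition.",
  "instructions": [
    "No articles (a, an, the).",
    "No auxiliary verbs (is, are, will, have).",
    "Short declarative fragments only.",
    "Never alter code blocks; constraints apply to prose only."
  ]
}
```

Whatever the concrete format, the essential move is the last rule: the constraint must carve out code blocks, or the "optimization" would corrupt the actual work product.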
I don't know if Anthropic will eventually bake this level of verbosity control into the core product, but for now, the community-driven approach is winning. It’s a reminder that the best optimizations often come from subverting the basic assumptions of how these models are "supposed" to talk.
What to watch next
- Token-efficient DSLs: Expect more projects to emerge that create "Agent-to-Human" shorthand languages specifically designed to minimize costs.
- Anthropic's Response: Watch for a native "low-verbosity" or "concise" flag in the Claude API to compete with these community hacks.
- Reasoning vs. Prose: Further research into whether stripping grammar actually improves or degrades the reasoning quality of models like Claude 3.5 Sonnet over long-context windows.