← Back to Blog

Claude Code: six setup lessons we learned the hard way

claude-code ai developer-tools productivity agents automation
Claude Code: six setup lessons we learned the hard way

Claude Code will almost always finish the task. The model is rarely the bottleneck. What varies is the cost: how long it takes, how much manual direction it needs, and how many times you have to stop and steer it back on course. That cost is the part you actually control — and most of it comes down to the environment you put the model in, not the prompt you write.

We learned this the slow way. Our failures were rarely about what the model could do; they were about how we had set things up around it. The setup is confusing because almost everything you configure is a markdown file, and at a glance they all look identical. They are not. These are the mistakes we made, and what each file is actually for.

Everything is a markdown file, but they are not interchangeable

CLAUDE.md, rules, skills, agents, commands. Flat text, same extension, all sitting in a .claude/ folder, all feeling interchangeable. They are not. Each one acts at a different moment and answers to a different master:

FileWhat it really isWhen it acts
CLAUDE.mdAlways-on memory, loaded at launchEvery session, top to bottom
.claude/rules/*.mdScoped instructions, path-activatedWhen a task touches matching files
.claude/skills/On-demand know-howWhen a trigger matches
.claude/agents/*.mdBounded sub-workers, own contextWhen you or Claude delegate
.claude/commands/Saved promptsWhen you type /name
settings.json (hooks, sandbox, permissions)Deterministic harness rulesMechanically, on every matching event

The mental model that fixed this for us: these split into two families. Some are probabilistic — the model reads them and decides what to do (CLAUDE.md, rules, skills). Others are deterministic — the harness enforces them whether the model likes it or not (hooks, the sandbox, permissions). Stop treating them as one pile of instructions and most of the confusion evaporates.

1. Rules are the most underrated file in your repo

The misconception we held longest: rules are just CLAUDE.md chopped into more files. They are not, and the difference is the whole point.

A rule carries a paths glob in its frontmatter. It stays dormant until a task touches a file that matches — then it loads. This is knowledge that arrives exactly when it is relevant and stays out of the way the rest of the time:

---
description: Conventions for Astro pages
paths:
  - "src/pages/**"
---
Every page imports BaseLayout and sets a localized <title>.
When editing pages, use the `i18n-keys` skill to keep ui.ts in sync.

That last line is the lever. Skills activate on the model’s judgment, which in practice means sometimes — never quite when you expected. Naming a skill inside a path-scoped rule turns “sometimes” into “whenever I am working in this corner of the codebase.” A rule cannot execute a skill for you, but within a narrow path scope, instructing Claude to use one is reliable enough that the distinction stops mattering.

“Why not a CLAUDE.md in the subfolder?” — because a nested CLAUDE.md only covers its own directory tree, and it can only live in one place. A rule for **/*.ts or **/*.spec.ts fires wherever those files are, and test files are everywhere. Rules follow the file type; nested memory files follow the folder. When the concern scatters across the repo, only the rule keeps up.

2. Hold the spacebar and talk to it

In the Claude Code terminal, press and hold the spacebar to dictate. Hold, speak, release — it transcribes straight into the prompt. It is built for the long, context-setting prompts where your typing speed, not the model, is the bottleneck.

The honest caveats: it is CLI-only (not VS Code, not over SSH, not in remote containers) and it needs a Claude.ai account, since the audio is transcribed server-side. Within those limits, it changes how you brief the model — you stop writing terse prompts because typing is tiring, and you start actually explaining what you want.

3. Agents are about boundaries, not horsepower

If one model with a million tokens of context can do anything, fast, why split work across agents? Because context size is not what agents give you. They give you three things context size cannot:

  • Isolation. Each agent runs in a fresh, separate context window. A noisy side-quest — reading forty files to answer one question — happens over there and never pollutes your main thread. The conclusion comes back; the mess does not.
  • Boundaries. A tightly scoped toolset and a one-job mandate. An agent that can only read and review, never write. An agent whose only job is to dispatch other agents. The narrowness is the feature: a worker that cannot go off-script does not.
  • Cost. Pin the model per agent. Opus for the architecture call where judgment earns its keep; Haiku for the cheap mechanical loop — clicking through a browser, grepping logs, shuttling data. You pay senior rates only where senior reasoning is needed.

The rule we broke first: agents should not carry knowledge. Domain facts belong in rules and skills, which any agent can load on demand. An agent that hoards its own knowledge is one more thing to maintain, in one more place, drifting out of sync with everything else. Keep agents thin — responsibilities and boundaries, nothing else. You can also fan several out at once for genuinely independent work, but treat that as a bonus, not the reason they exist.

4. The sandbox is a feature, not a nag

We used to feel the pain of approving yet another bash command just so Claude could go find some piece of data. It is annoying. It is also there for a reason: Claude cannot technically guarantee a command is safe before it runs.

Picture a long-running task that web-fetches a page, and that page carries a prompt injection. The task quietly changes direction — it starts running commands you never asked for and reporting your data to a remote server, no confirmation requested. The approval prompt is the seam where you would catch exactly that.

The fix is not to disable the safety net. It is the sandboxed runtime. A sandbox enforces filesystem and network allowlists at the OS level. Inside it, Claude runs freely; the boundary holds even if the model is talked into something foolish. It has to be configured once, deliberately — each network host and each writable path is something you grant on purpose. Want npm install to work? Whitelist the registry:

{
  "sandbox": {
    "enabled": true,
    "network": { "allowedDomains": ["registry.npmjs.org", "github.com"] },
    "filesystem": { "allowWrite": ["./", "/tmp"], "denyRead": ["~/.ssh", "~/.aws"] }
  }
}

Now the boring, safe commands run with no prompt at all, the dangerous ones still cannot reach your secrets, and Claude works autonomously without stopping to bother you every thirty seconds.

5. Let it plan before it writes a line

The fastest-feeling path is to fire off a prompt and let the model run. It is also where most of the interruptions you hate are born — the model commits to the wrong approach early, and you spend the next hour dragging it back.

Plan mode is the antidote. Cycle into it with Shift+Tab, and Claude explores read-only and hands you a plan before it edits anything. You correct the approach while correcting is still cheap — on paper, in thirty seconds — instead of after two hundred lines have been written the wrong way. This is the single biggest lever on the “how much manual direction” cost from the top of this post. Five minutes spent reading a plan routinely saves an afternoon of course-correction.

6. A bigger context window is not a smarter one

A million tokens of context tempts you to dump everything in and let the model sort it out. Resist it. Context is signal-to-noise, not a bigger bucket. The more unrelated history you carry, the less reliably Claude follows the instructions that actually matter — a stuffed window does not make it think harder, it makes your rules easier to lose in the noise.

So we keep it clean. /clear between unrelated tasks — do not debug CSS in the same thread you just used to design a database schema. Let auto-compaction trim stale tool output, or run /compact with a focus when you want to steer what survives. And keep CLAUDE.md lean, under roughly two hundred lines — push everything path-specific down into rules (see lesson 1) so it loads only when it is relevant. The counterintuitive part, once you have lived it: a smaller, cleaner context usually produces a better answer than a giant one.

The common thread

None of these six are about clever prompting. They are about the environment: where the model can go, what knowledge loads when, and how much noise it has to wade through to find the point. Claude Code will finish the task — the setup decides how much it costs you to get there.

That is also how we work on client projects: specification-led, review-bound agentic delivery. If that is useful to you, get in touch.