
Chapter 14: Context Construction and System Prompt

What You'll Learn

When you type a message into Claude Code, the model receives far more than those few words. Before your input arrives, a carefully orchestrated pipeline has already assembled a rich backdrop: a multi-section system prompt, the current working directory and Git snapshot, layered CLAUDE.md instructions, a persistent memory index, and potentially a compressed summary of conversation history that had grown too long. This chapter traces that entire pipeline from source code to API request.

By the end of this chapter you will understand:

  1. What getUserContext() and getSystemContext() each assemble, and why both are wrapped with memoize caching
  2. The complete CLAUDE.md loading hierarchy — Managed, User, Project, Local — and how files can cross-reference each other with @include directives
  3. How getSystemPrompt() is ordered to maximize Prompt Cache hit rates, and what the SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker actually does
  4. How the memdir Auto Memory system persists user preferences and project context across sessions
  5. The three compression strategies — Auto Compact, Micro Compact, and Time-Based Micro Compact — and the specific conditions that trigger each one

14.1 Context Variables: getUserContext and getSystemContext

Every conversation turn assembles two distinct groups of context variables before any API call is made. Both functions live in context.ts and are wrapped with lodash-es/memoize, which means they compute their result once per session and return the cached value on all subsequent calls.
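A minimal sketch of what this memoization buys (the helper below is illustrative, not the lodash-es implementation; lodash's memoize keys its cache on the first argument, but since both context functions take no arguments, one cached value per session suffices):

```typescript
// Illustrative single-value memoizer: the first call computes, all later
// calls return the cached result for the lifetime of the session.
function memoizeOnce<T>(fn: () => T): () => T {
  let cached: { value: T } | null = null
  return () => {
    if (cached === null) {
      cached = { value: fn() }
    }
    return cached.value
  }
}

// Hypothetical expensive context assembly: runs exactly once.
let computations = 0
const getContext = memoizeOnce(() => {
  computations++
  return { currentDate: `Today's date is 2025-01-01.` }
})

getContext()
getContext()
// computations is 1: the second call hit the cache.
```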

14.1.1 getSystemContext: The Git Snapshot

getSystemContext is responsible for the repository state snapshot. Its primary output is a formatted Git status block:

typescript
// context.ts:116-150
export const getSystemContext = memoize(
  async (): Promise<{ [k: string]: string }> => {
    // Skip git status in CCR (unnecessary overhead on resume) or when git instructions are disabled
    const gitStatus =
      isEnvTruthy(process.env.CLAUDE_CODE_REMOTE) ||
      !shouldIncludeGitInstructions()
        ? null
        : await getGitStatus()

    return {
      ...(gitStatus && { gitStatus }),
      ...(feature('BREAK_CACHE_COMMAND') && injection
        ? { cacheBreaker: `[CACHE_BREAKER: ${injection}]` }
        : {}),
    }
  },
)

The getGitStatus() helper fires five Git commands concurrently:

typescript
// context.ts:61-77
const [branch, mainBranch, status, log, userName] = await Promise.all([
  getBranch(),
  getDefaultBranch(),
  execFileNoThrow(gitExe(), ['--no-optional-locks', 'status', '--short'], ...),
  execFileNoThrow(gitExe(), ['--no-optional-locks', 'log', '--oneline', '-n', '5'], ...),
  execFileNoThrow(gitExe(), ['config', 'user.name'], ...),
])

These five pieces are joined into a descriptive text block so the model knows which branch it is on, what the main branch is called, which files have been modified, and what the last five commits look like — all before the first user message is processed.

Two design decisions are worth noting. First, the Git snapshot is taken once at session start and is explicitly marked as non-updating: the injected text reads "Note that this status is a snapshot in time, and will not update during the conversation." Second, status output is truncated at MAX_STATUS_CHARS = 2000 characters, with an instruction to use BashTool for the full listing if needed.
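The truncation step can be sketched as follows. Only the 2,000-character cap and the BashTool redirection come from the source; the function name and the exact notice wording are assumptions:

```typescript
const MAX_STATUS_CHARS = 2000

// Illustrative truncation: cap the `git status --short` output and point
// the model at BashTool for the full listing.
function truncateStatus(status: string): string {
  if (status.length <= MAX_STATUS_CHARS) {
    return status
  }
  return (
    status.slice(0, MAX_STATUS_CHARS) +
    '\n... (truncated; use BashTool to see the full status)'
  )
}

const short = 'M src/index.ts'
const long = 'M '.repeat(2000) // 4,000 characters of fake status output
const kept = truncateStatus(short)   // returned unchanged
const capped = truncateStatus(long)  // cut at 2,000 chars plus the notice
```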

14.1.2 getUserContext: CLAUDE.md and the Date

getUserContext loads all CLAUDE.md files and appends today's date:

typescript
// context.ts:155-188
export const getUserContext = memoize(
  async (): Promise<{ [k: string]: string }> => {
    const shouldDisableClaudeMd =
      isEnvTruthy(process.env.CLAUDE_CODE_DISABLE_CLAUDE_MDS) ||
      (isBareMode() && getAdditionalDirectoriesForClaudeMd().length === 0)

    const claudeMd = shouldDisableClaudeMd
      ? null
      : getClaudeMds(filterInjectedMemoryFiles(await getMemoryFiles()))

    setCachedClaudeMdContent(claudeMd || null)

    return {
      ...(claudeMd && { claudeMd }),
      currentDate: `Today's date is ${getLocalISODate()}.`,
    }
  },
)

The --bare mode skips automatic CLAUDE.md discovery but still loads files from directories that were explicitly passed via --add-dir. The inline comment captures the principle precisely: "--bare means 'skip what I didn't ask for', not 'ignore what I asked for'."

The call to setCachedClaudeMdContent resolves a circular dependency: the auto-mode classifier (yoloClassifier.ts) needs to read CLAUDE.md content, but if it imported claudemd.ts directly that would create a cycle through permissions/filesystem → permissions → yoloClassifier. The cache sidesteps the cycle.


14.2 CLAUDE.md Loading: The Four-Layer Hierarchy

The CLAUDE.md loading logic lives in utils/claudemd.ts, nearly 1500 lines long. The file-level comment at the top provides the clearest possible summary:

// utils/claudemd.ts:1-26
/**
 * Files are loaded in the following order:
 *
 * 1. Managed memory (/etc/claude-code/CLAUDE.md)   - Global instructions for all users
 * 2. User memory (~/.claude/CLAUDE.md)              - Private global instructions for all projects
 * 3. Project memory (CLAUDE.md, .claude/CLAUDE.md,
 *    and .claude/rules/*.md in project roots)       - Instructions checked into the codebase
 * 4. Local memory (CLAUDE.local.md in project roots) - Private project-specific instructions
 *
 * Files are loaded in reverse order of priority, i.e. the latest files are highest priority
 */

Loading order is lowest-priority-first: Managed first, then User, then Project, then Local. Since all files are eventually concatenated into a single string, content appearing later in that string sits closer to the model's attention window and naturally carries higher effective priority.

14.2.1 Directory Traversal

getMemoryFiles() walks upward from the current working directory to the filesystem root, collecting candidate paths:

typescript
// utils/claudemd.ts:850-934
let currentDir = originalCwd
while (currentDir !== parse(currentDir).root) {
  dirs.push(currentDir)
  currentDir = dirname(currentDir)
}

// Process from root downward to CWD
for (const dir of dirs.reverse()) {
  result.push(...(await processMemoryFile(join(dir, 'CLAUDE.md'), 'Project', ...)))
  result.push(...(await processMemoryFile(join(dir, '.claude', 'CLAUDE.md'), 'Project', ...)))
  // .claude/rules/*.md files...
  result.push(...(await processMemoryFile(join(dir, 'CLAUDE.local.md'), 'Local', ...)))
}

Paths are collected in CWD-to-root order but reversed before processing. This means the CLAUDE.md closest to the project root is loaded first and ends up earliest in the final string, while the CLAUDE.md in the CWD itself is loaded last and ends up at the end — and therefore has higher model attention.
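A tiny demonstration of that collect-then-reverse ordering, using a hypothetical POSIX path and simplified string slicing in place of Node's dirname:

```typescript
// Walk from a hypothetical CWD up to the filesystem root, collecting dirs.
const cwd = '/home/user/project/packages/app'
const dirs: string[] = []
let dir = cwd
while (dir !== '/') {
  dirs.push(dir)
  dir = dir.slice(0, dir.lastIndexOf('/')) || '/'
}
// dirs[0] is the CWD; dirs[dirs.length - 1] is '/home'.

// Reversing gives root-first processing order, so the CWD's CLAUDE.md is
// appended last and sits closest to the end of the concatenated string.
const processingOrder = [...dirs].reverse()
// processingOrder[0] is '/home'; the last entry is the CWD itself.
```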

14.2.2 The @include Directive

Memory files can pull in other files using @path syntax:

markdown
<!-- In CLAUDE.md -->
@./rules/typescript-guidelines.md
@~/shared-team-guidelines.md

The function extractIncludePathsFromTokens uses the marked lexer to tokenize the Markdown and scan text nodes for @path patterns, deliberately skipping code blocks so that @ references inside code examples are not followed. Resolution supports four prefix forms: bare @path (treated as @./path), @./relative, @~/home, and @/absolute. Circular references are prevented through a processedPaths set, and depth is capped at MAX_INCLUDE_DEPTH = 5 levels.
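The four prefix forms can be sketched as a small resolver. The joining logic here is deliberately simplified (the real implementation uses Node's path utilities relative to the including file's directory), and the function name is hypothetical:

```typescript
// Resolve the four @include prefix forms the chapter lists.
function resolveInclude(ref: string, baseDir: string, homeDir: string): string {
  const path = ref.slice(1) // strip the leading '@'
  if (path.startsWith('~/')) return homeDir + path.slice(1)  // @~/home
  if (path.startsWith('/')) return path                      // @/absolute
  if (path.startsWith('./')) return baseDir + path.slice(1)  // @./relative
  return baseDir + '/' + path                                // bare @path, treated as @./path
}

const rel = resolveInclude('@./rules/ts.md', '/repo', '/home/u')    // '/repo/rules/ts.md'
const home = resolveInclude('@~/shared.md', '/repo', '/home/u')     // '/home/u/shared.md'
const bare = resolveInclude('@guides/style.md', '/repo', '/home/u') // '/repo/guides/style.md'
```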

External includes — files outside the project's CWD — are blocked by default and require explicit approval from the user, surfaced as a warning dialog on first encounter.

14.2.3 Content Processing Pipeline

After reading each file, parseMemoryFileContent runs three transformations:

  1. Frontmatter parsing: Strips YAML frontmatter delimited by --- and extracts the paths field for conditional rule matching
  2. HTML comment stripping: Removes <!-- ... --> block-level comments using the marked lexer, preserving comments inside code spans and fenced blocks
  3. MEMORY.md truncation: For AutoMem and TeamMem types, enforces a dual cap of 200 lines and 25,000 bytes, appending a warning when truncation occurs
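The dual cap in step 3 can be sketched as below. The 200-line and 25,000-byte limits come from the source; the trimming strategy and warning text are assumptions:

```typescript
const MAX_LINES = 200
const MAX_BYTES = 25_000
const encoder = new TextEncoder()

// Illustrative dual cap: enforce the line limit first, then the byte
// limit, appending a warning whenever either fires.
function truncateMemory(content: string): string {
  let lines = content.split('\n')
  let truncated = false
  if (lines.length > MAX_LINES) {
    lines = lines.slice(0, MAX_LINES)
    truncated = true
  }
  let text = lines.join('\n')
  while (encoder.encode(text).length > MAX_BYTES) {
    text = text.slice(0, -1000) // trim in chunks until under the byte cap
    truncated = true
  }
  return truncated ? text + '\n[MEMORY.md truncated]' : text
}

const small = 'remember: user prefers tabs'
const huge = Array.from({ length: 500 }, (_, i) => `- note ${i}`).join('\n')
const keptSmall = truncateMemory(small)  // unchanged
const cappedHuge = truncateMemory(huge)  // cut to 200 lines plus the warning
```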

The final assembly happens in getClaudeMds, which formats each file with a descriptive header and joins them under a fixed preamble:

typescript
// utils/claudemd.ts:1153-1195
const MEMORY_INSTRUCTION_PROMPT =
  'Codebase and user instructions are shown below. Be sure to adhere to these instructions. ' +
  'IMPORTANT: These instructions OVERRIDE any default behavior and you MUST follow them exactly as written.'

// Each file becomes:
// "Contents of /path/to/CLAUDE.md (project instructions, checked into the codebase):\n\n[content]"

14.2.4 Conditional Rules

Files in .claude/rules/ directories may include frontmatter paths patterns:

yaml
---
paths:
  - "**/*.test.ts"
  - "**/*.spec.ts"
---
# Rules for test files
Always write assertions with `expect().toBe()` not `assert.equal()`.

These files are only injected into context when the model is working on a file that matches one of those glob patterns. This is the foundation of Claude Code's nested memory system, which loads progressively more specific rules as the model navigates into subdirectories.


14.3 getSystemPrompt: Assembling the Full Prompt

The complete system prompt assembly is in constants/prompts.ts. The function signature is:

typescript
export async function getSystemPrompt(
  tools: Tools,
  model: string,
  additionalWorkingDirectories?: string[],
  mcpClients?: MCPServerConnection[],
): Promise<string[]>

It returns a string array rather than a single string because the array structure allows the API layer to assign different caching scopes to different elements.

14.3.1 The Static/Dynamic Split

The returned array is organized in two zones:

typescript
// constants/prompts.ts:560-576
return [
  // --- Static content (cacheable) ---
  getSimpleIntroSection(outputStyleConfig),
  getSimpleSystemSection(),
  getSimpleDoingTasksSection(),
  getActionsSection(),
  getUsingYourToolsSection(enabledTools),
  getSimpleToneAndStyleSection(),
  getOutputEfficiencySection(),
  // === BOUNDARY MARKER ===
  ...(shouldUseGlobalCacheScope() ? [SYSTEM_PROMPT_DYNAMIC_BOUNDARY] : []),
  // --- Dynamic content (registry-managed) ---
  ...resolvedDynamicSections,
].filter(s => s !== null)

The boundary marker '__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__' is consumed by splitSysPromptPrefix in src/utils/api.ts. Everything before the marker gets scope: 'global' (cross-organization cache sharing); everything after contains session-specific content and is not cached globally.
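A minimal sketch of the split (the real splitSysPromptPrefix lives in src/utils/api.ts; the function name here and the behavior when the marker is absent are assumptions):

```typescript
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = '__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__'

// Everything before the marker is globally cacheable; everything after is
// session-specific. The marker itself never reaches the API.
function splitAtBoundary(sections: string[]): { global: string[]; session: string[] } {
  const i = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY)
  if (i === -1) {
    // Assumed fallback: with no marker, treat nothing as globally cacheable.
    return { global: [], session: sections }
  }
  return { global: sections.slice(0, i), session: sections.slice(i + 1) }
}

const prompt = ['intro', 'tools', SYSTEM_PROMPT_DYNAMIC_BOUNDARY, 'env_info', 'memory']
const split = splitAtBoundary(prompt)
// split.global is ['intro', 'tools']; split.session is ['env_info', 'memory']
```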

The comment in the source is emphatic about this:

typescript
// constants/prompts.ts:106-115
/**
 * Boundary marker separating static (cross-org cacheable) content from dynamic content.
 * Everything BEFORE this marker in the system prompt array can use scope: 'global'.
 * Everything AFTER contains user/session-specific content and should not be cached.
 *
 * WARNING: Do not remove or reorder this marker without updating cache logic in:
 * - src/utils/api.ts (splitSysPromptPrefix)
 * - src/services/api/claude.ts (buildSystemPromptBlocks)
 */

The practical consequence is significant: the behavior-defining static sections (seven sections covering identity, tool usage, code style, and output format) are identical across all users and all sessions. The Prompt Cache can therefore amortize the cost of processing those tokens across many users, dramatically reducing per-request input token costs.

14.3.2 The Static Sections

Introduction (getSimpleIntroSection): Declares Claude's role, includes a cyber-risk warning, and forbids generating URLs unless confident they are for programming help.

System mechanics (getSimpleSystemSection): Explains the permission model for tool calls, the semantics of <system-reminder> tags, and includes the sentence "The system will automatically compress prior messages in your conversation as it approaches context limits." This is how the model learns that context compression exists.

Doing tasks (getSimpleDoingTasksSection): The longest static section. It codifies the YAGNI principle (don't add unrequested features, don't add error handling for impossible scenarios, don't create abstractions for one-off operations), security guidelines (avoid command injection, XSS, SQL injection), and code cleanliness rules. When process.env.USER_TYPE === 'ant' (internal Anthropic builds), additional guidance appears — including detailed commenting philosophy and a rule about faithfully reporting test failures rather than hiding them.

Careful actions (getActionsSection): A detailed taxonomy of what requires user confirmation before proceeding: destructive operations (deleting files, git reset --hard), shared-state operations (pushing code, creating PRs, sending messages), and third-party uploads. Critically, it states: "A user approving an action once does NOT mean that they approve it in all contexts."

Tool usage (getUsingYourToolsSection): Directs the model toward purpose-built tools (Read instead of cat, Edit instead of sed, Glob instead of find) and explains when to parallelize tool calls versus when to chain them sequentially.

Tone and style (getSimpleToneAndStyleSection): Prohibits unsolicited emoji, requires file_path:line_number format for code references, and mandates ending sentences with periods before tool calls rather than using colons.

Output efficiency (getOutputEfficiencySection): Two very different variants exist. The external build gets a concise "go straight to the point" instruction. The internal build is a multi-paragraph writing guide covering inverted pyramid structure, avoiding semantic backtracking, and calibrating verbosity to the user's expertise level.

14.3.3 The Dynamic Sections Registry

Dynamic sections use a section registry pattern:

typescript
// constants/prompts.ts:491-555
const dynamicSections = [
  systemPromptSection('session_guidance', () =>
    getSessionSpecificGuidanceSection(enabledTools, skillToolCommands),
  ),
  systemPromptSection('memory', () => loadMemoryPrompt()),
  systemPromptSection('ant_model_override', () => getAntModelOverrideSection()),
  systemPromptSection('env_info_simple', () =>
    computeSimpleEnvInfo(model, additionalWorkingDirectories),
  ),
  systemPromptSection('language', () => getLanguageSection(settings.language)),
  systemPromptSection('output_style', () => getOutputStyleSection(outputStyleConfig)),
  DANGEROUS_uncachedSystemPromptSection(
    'mcp_instructions',
    () => isMcpInstructionsDeltaEnabled()
      ? null
      : getMcpInstructionsSection(mcpClients),
    'MCP servers connect/disconnect between turns',
  ),
  // ...
]

systemPromptSection caches the result of its factory function for the lifetime of the session — once computed, it does not recompute. The DANGEROUS_uncachedSystemPromptSection variant skips caching and recomputes every turn, which breaks the per-turn prompt cache but is necessary for content that can genuinely change mid-session (MCP server connections).
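The caching contrast between the two wrappers can be sketched as follows. This keeps only the caching behavior; the real registry presumably carries the section name for other purposes:

```typescript
type SectionFactory = () => string | null

// Session-lifetime caching: the factory runs once, then the value is reused.
function systemPromptSection(name: string, factory: SectionFactory): SectionFactory {
  let cached: { value: string | null } | null = null
  return () => {
    if (cached === null) {
      cached = { value: factory() }
    }
    return cached.value
  }
}

// The uncached variant simply recomputes on every call (every turn).
function DANGEROUS_uncachedSystemPromptSection(name: string, factory: SectionFactory): SectionFactory {
  return factory
}

let calls = 0
const memory = systemPromptSection('memory', () => {
  calls++
  return '# auto memory\n...'
})
memory()
memory()
// calls is 1: the factory ran once for the session.
```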

The session_guidance section (getSessionSpecificGuidanceSection) is notable for containing multiple runtime conditionals that were specifically moved behind the boundary marker to prevent cache fragmentation. The comment explains:

typescript
// constants/prompts.ts:352-356
/**
 * Session-variant guidance that would fragment the cacheScope:'global'
 * prefix if placed before SYSTEM_PROMPT_DYNAMIC_BOUNDARY. Each conditional
 * here is a runtime bit that would otherwise multiply the Blake2b prefix
 * hash variants (2^N). See PR #24490, #24171 for the same bug class.
 */

If even one conditional were placed before the boundary marker, the number of distinct cache prefix hashes would double for each such boolean. With four or five such conditionals, that would be 16-32 different cache entries that rarely or never share hits.
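The doubling is simple arithmetic:

```typescript
// N independent boolean conditionals in the cacheable prefix yield 2^N
// distinct prefix hash variants.
const variants = (conditionals: number): number => 2 ** conditionals

const one = variants(1)  // 2
const four = variants(4) // 16
const five = variants(5) // 32 — the 16-32 range the text describes
```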

14.3.4 Environment Information

computeSimpleEnvInfo generates the # Environment block injected into every session:

typescript
// constants/prompts.ts:651-709
return [
  `# Environment`,
  `You have been invoked in the following environment: `,
  ...prependBullets([
    `Primary working directory: ${cwd}`,
    [`Is a git repository: ${isGit}`],
    `Platform: ${env.platform}`,
    getShellInfoLine(),
    `OS Version: ${unameSR}`,
    modelDescription,
    knowledgeCutoffMessage,
    // Model family reference, Claude Code availability info, Fast mode description...
  ]),
].join(`\n`)

The model's knowledge cutoff is determined by getKnowledgeCutoff(modelId), which maps canonical model IDs to their training cutoff dates. The source code marks each such entry with // @[MODEL LAUNCH]: Add a knowledge cutoff date for the new model. to signal that this table needs updating when new models are released.


14.4 Persistent Memory: The Memdir System

Auto Memory provides cross-session persistence, allowing Claude Code to remember user preferences, project context, and past feedback without relying on the conversation history that gets compressed away.

14.4.1 Path Resolution

The memory directory location is determined by getAutoMemPath():

typescript
// memdir/paths.ts:223-235
export const getAutoMemPath = memoize(
  (): string => {
    const override = getAutoMemPathOverride() ?? getAutoMemPathSetting()
    if (override) {
      return override
    }
    const projectsDir = join(getMemoryBaseDir(), 'projects')
    return join(projectsDir, sanitizePath(getAutoMemBase()), AUTO_MEM_DIRNAME) + sep
  },
  () => getProjectRoot(),
)

The default path is ~/.claude/projects/<sanitized-git-root>/memory/. The key design choice is that getAutoMemBase() calls findCanonicalGitRoot() rather than getProjectRoot() — this ensures all worktrees of the same repository share a single memory directory, not one per worktree. The code comment references the original issue: // anthropics/claude-code#24382.

The path validation in validateMemoryPath is notably thorough, explicitly rejecting relative paths, root and near-root paths (length < 3), Windows drive-roots (C:\), UNC paths, and null bytes. The comment explains the threat model: these paths could become dangerous read-allowlist roots if not validated.
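Those rejections can be sketched as a predicate. The function name and check ordering here are illustrative; only the categories of rejected paths come from the source:

```typescript
// Mirror the rejections the chapter lists for memory path validation.
function isSafeMemoryPath(p: string): boolean {
  if (p.includes('\0')) return false              // null bytes
  if (p.startsWith('\\\\')) return false          // UNC paths
  if (/^[A-Za-z]:\\?$/.test(p)) return false      // Windows drive roots like C:\
  const isAbsolute = p.startsWith('/') || /^[A-Za-z]:\\/.test(p)
  if (!isAbsolute) return false                   // relative paths
  if (p.length < 3) return false                  // root and near-root paths
  return true
}

isSafeMemoryPath('/home/u/.claude/projects/repo/memory/') // true
isSafeMemoryPath('../memory')                             // false: relative
isSafeMemoryPath('C:\\')                                  // false: drive root
isSafeMemoryPath('/')                                     // false: near-root
```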

14.4.2 MEMORY.md as the Index

The memory entrypoint file is MEMORY.md, capped at 200 lines / 25,000 bytes. buildMemoryLines() in memdir/memdir.ts generates the behavioral instructions that tell the model how to maintain this system:

typescript
// memdir/memdir.ts:199-266
export function buildMemoryLines(
  displayName: string,
  memoryDir: string,
  extraGuidelines?: string[],
  skipIndex = false,
): string[] {
  const lines: string[] = [
    `# ${displayName}`,
    `You have a persistent, file-based memory system at \`${memoryDir}\`.`,
    "You should build up this memory system over time so that future conversations can have a complete picture...",
    // Type taxonomy, what NOT to save, how to save, when to access...
  ]
  // ...
}

The memory type taxonomy is constrained to four categories: user (preferences and working style), feedback (corrections and what to avoid repeating), project (background context not derivable from code), and reference (external system pointers). Critically, content that can be derived from the current project state — code patterns, architecture, git history — is explicitly excluded from memory. The model can re-read the code at any time; memory is for the knowledge that would not survive a fresh checkout.

The saving process is deliberately two-step: write the memory to its own topic file, then add a one-line pointer to MEMORY.md. The index stays concise (under 150 characters per entry); the full content lives in the topic file. This prevents MEMORY.md from growing so large that its truncation throws away important information.
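The two-step save can be sketched like this. The in-memory map stands in for the filesystem; the 150-character index limit comes from the source, while the entry format and file naming are assumptions:

```typescript
// Step 1 writes full content to a topic file; step 2 appends only a short
// pointer line to the MEMORY.md index.
const topicFiles = new Map<string, string>()
let memoryIndex = '# MEMORY.md\n'

function saveMemory(topic: string, content: string, summary: string): void {
  topicFiles.set(`${topic}.md`, content)
  const entry = `- ${summary} (see ${topic}.md)`
  memoryIndex += entry.slice(0, 150) + '\n' // keep the index entry concise
}

saveMemory(
  'user-preferences',
  'The user prefers small, focused PRs and dislikes speculative abstractions.',
  'User prefers small, focused PRs',
)
// memoryIndex now holds one pointer line; the details live in the topic file.
```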

14.4.3 loadMemoryPrompt Dispatch

The memory section factory in getSystemPrompt calls loadMemoryPrompt(), which dispatches based on enabled feature flags:

typescript
// memdir/memdir.ts:419-507
export async function loadMemoryPrompt(): Promise<string | null> {
  const autoEnabled = isAutoMemoryEnabled()

  // KAIROS (assistant mode): append-only daily log files
  if (feature('KAIROS') && autoEnabled && getKairosActive()) {
    return buildAssistantDailyLogPrompt(skipIndex)
  }

  // TEAMMEM: combined personal + team memory
  if (feature('TEAMMEM') && teamMemPaths!.isTeamMemoryEnabled()) {
    return teamMemPrompts!.buildCombinedMemoryPrompt(extraGuidelines, skipIndex)
  }

  // Default: individual auto memory
  if (autoEnabled) {
    return buildMemoryLines('auto memory', autoDir, extraGuidelines, skipIndex).join('\n')
  }

  return null
}

Because loadMemoryPrompt is wrapped in systemPromptSection caching, it runs exactly once per session. Memory file changes made during a session do not automatically update the system prompt. This is intentional — updating the system prompt mid-session would invalidate the Prompt Cache and force the model to reprocess all prior context.


14.5 Context Window Management: Three Compression Strategies

As conversation history accumulates, it approaches the model's context window limit. Claude Code implements three distinct compression mechanisms at different granularities, applied in sequence from least to most disruptive.

14.5.1 Auto Compact: Full Conversation Summarization

Auto Compact is the most drastic strategy. It sends the entire conversation history to a separate API call with a structured summarization prompt, then replaces all prior messages with the resulting summary. The threshold calculation is:

typescript
// services/compact/autoCompact.ts:33-91
export function getEffectiveContextWindowSize(model: string): number {
  const reservedTokensForSummary = Math.min(
    getMaxOutputTokensForModel(model),
    MAX_OUTPUT_TOKENS_FOR_SUMMARY,  // 20,000
  )
  let contextWindow = getContextWindowForModel(model, getSdkBetas())
  return contextWindow - reservedTokensForSummary
}

export const AUTOCOMPACT_BUFFER_TOKENS = 13_000

export function getAutoCompactThreshold(model: string): number {
  return getEffectiveContextWindowSize(model) - AUTOCOMPACT_BUFFER_TOKENS
}

The effective window is the model's context size minus the tokens reserved for the summary output (the model's maximum output, capped at 20,000). The auto-compact threshold is the effective window minus an additional 13,000-token buffer. These margins ensure the compaction API call itself has room to complete without hitting a secondary context limit.
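A worked example of the arithmetic, assuming a 200,000-token context window and a model whose maximum output is at least 20,000 tokens (so the summary reservation caps out):

```typescript
const MAX_OUTPUT_TOKENS_FOR_SUMMARY = 20_000
const AUTOCOMPACT_BUFFER_TOKENS = 13_000

const contextWindow = 200_000 // assumed window size for illustration
const effectiveWindow = contextWindow - MAX_OUTPUT_TOKENS_FOR_SUMMARY   // 180,000
const autoCompactThreshold = effectiveWindow - AUTOCOMPACT_BUFFER_TOKENS // 167,000
// Auto Compact fires once estimated usage exceeds 167,000 of the 200,000 window.
```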

A circuit breaker prevents runaway retry attempts when the context is irrecoverably oversized:

typescript
// services/compact/autoCompact.ts:68-71, 258-264
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3

if (tracking?.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
  return { wasCompacted: false }
}

The comment cites production telemetry: "1,279 sessions had 50+ consecutive failures (up to 3,272) in a single session, wasting ~250K API calls/day globally." This is not speculative design — the circuit breaker was added in response to a measured problem.

The compact prompt (BASE_COMPACT_PROMPT in services/compact/prompt.ts) requests a nine-section structured summary:

  1. Primary Request and Intent
  2. Key Technical Concepts
  3. Files and Code Sections (with full code snippets)
  4. Errors and Fixes
  5. Problem Solving
  6. All User Messages (verbatim, for intent tracking)
  7. Pending Tasks
  8. Current Work
  9. Optional Next Step (with direct quotes from the conversation)

The summarizing model generates an <analysis> block first as a drafting scratchpad, then a <summary> block. formatCompactSummary() strips the analysis block before the summary enters the context window, keeping only the polished output.

After compaction, compactConversation runs extensive post-processing: re-injecting recently read files (createPostCompactFileAttachments, up to 5 files, 50,000 token budget), re-injecting invoked skills (createSkillAttachmentIfNeeded), re-injecting the plan file if one exists, and firing pre/post compact hooks.

14.5.2 Micro Compact: Surgical Tool Result Clearing

Micro Compact operates at a finer granularity. Rather than replacing the whole conversation, it finds tool call results in older messages and replaces their content with a placeholder, preserving the structural record of what was done without keeping the bulk of the output.

The set of compactable tool types is defined explicitly:

typescript
// services/compact/microCompact.ts:41-50
const COMPACTABLE_TOOLS = new Set<string>([
  FILE_READ_TOOL_NAME,
  ...SHELL_TOOL_NAMES,
  GREP_TOOL_NAME,
  GLOB_TOOL_NAME,
  WEB_SEARCH_TOOL_NAME,
  WEB_FETCH_TOOL_NAME,
  FILE_EDIT_TOOL_NAME,
  FILE_WRITE_TOOL_NAME,
])

These are all tools whose results tend to be large but whose specific output content is often no longer needed once the action has been processed. The model still knows the action occurred; it just no longer has the full text of what was returned.
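A sketch of the clearing pass, with simplified tool names and message shape standing in for the real tool-name constants and message types; the placeholder string follows the wording quoted later in the chapter:

```typescript
interface ToolResult {
  toolName: string
  content: string
}

const COMPACTABLE = new Set(['Read', 'Bash', 'Grep', 'Glob'])

// Clear older compactable tool results, keeping the structural record that
// each call happened plus the most recent results as working context.
function microCompact(results: ToolResult[], keepRecent: number): ToolResult[] {
  const compactableIdx = results
    .map((r, i) => (COMPACTABLE.has(r.toolName) ? i : -1))
    .filter(i => i !== -1)
  const preserve = new Set(compactableIdx.slice(-keepRecent))
  return results.map((r, i) => {
    if (!COMPACTABLE.has(r.toolName) || preserve.has(i)) return r
    return { ...r, content: '[Old tool result content cleared]' }
  })
}

const history: ToolResult[] = [
  { toolName: 'Read', content: '5,000 lines of file text...' },
  { toolName: 'Bash', content: 'long build log...' },
  { toolName: 'Read', content: 'recently read file' },
]
const compacted = microCompact(history, 1)
// The two older results are cleared; the most recent Read is preserved.
```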

Cached Micro Compact (behind feature('CACHED_MICROCOMPACT')) is the advanced variant. Instead of modifying local message content, it queues cache_edits instructions that tell the API server to delete specific tool results from its cached representation of the conversation. The local messages array is unchanged; the API receives both the original messages (for cache key matching) and a cache_edits block that specifies which tool result content to clear on the server side. This preserves the existing Prompt Cache hot state while still reducing the effective token count.

The state is managed through pendingCacheEdits and pinnedCacheEdits:

typescript
// services/compact/microCompact.ts:88-118
export function consumePendingCacheEdits(): CacheEditsBlock | null {
  const edits = pendingCacheEdits
  pendingCacheEdits = null
  return edits
}

export function pinCacheEdits(userMessageIndex: number, block: CacheEditsBlock): void {
  if (cachedMCState) {
    cachedMCState.pinnedEdits.push({ userMessageIndex, block })
  }
}

consumePendingCacheEdits is called once per turn to extract any newly prepared deletions. pinCacheEdits is called after a successful API response to record the deletions so they can be re-sent in subsequent requests (the API needs them present to maintain cache coherence).

14.5.3 Time-Based Micro Compact

When a user returns to a session after a gap, the server-side Prompt Cache has typically expired (it expires after 5 minutes of inactivity). At that point, sending cached micro compact cache_edits would be counterproductive — the cache is cold, so there is no warm prefix to preserve.

Instead, when the gap exceeds a configured threshold, the time-based trigger fires first and content-clears tool results directly in the local message array:

typescript
// services/compact/microCompact.ts:422-444
export function evaluateTimeBasedTrigger(
  messages: Message[],
  querySource: QuerySource | undefined,
): { gapMinutes: number; config: TimeBasedMCConfig } | null {
  const config = getTimeBasedMCConfig()
  if (!config.enabled || !querySource || !isMainThreadSource(querySource)) {
    return null
  }
  const lastAssistant = messages.findLast(m => m.type === 'assistant')
  const gapMinutes =
    (Date.now() - new Date(lastAssistant.timestamp).getTime()) / 60_000
  if (gapMinutes < config.gapThresholdMinutes) {
    return null
  }
  return { gapMinutes, config }
}

When triggered, it keeps the most recent keepRecent compactable tool results (to preserve current working context) and replaces all older ones with '[Old tool result content cleared]'. The code comment explains the floor:

typescript
// The floor at 1: slice(-0) returns the full array (paradoxically keeps
// everything), and clearing ALL results leaves the model with zero working
// context. Neither degenerate is sensible — always keep at least the last.
const keepRecent = Math.max(1, config.keepRecent)

After time-based clearing, the cached micro compact state is reset (resetMicrocompactState()), because the server-side cache entries those IDs referred to no longer exist after the gap-induced expiration.
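The slice(-0) paradox the comment describes is easy to demonstrate:

```typescript
// "Keep the last 0" via slice(-0) is actually slice(0): it keeps everything.
const results = ['a', 'b', 'c']
const keepZero = results.slice(-0)              // ['a', 'b', 'c'], all three kept
const keepFloored = results.slice(-Math.max(1, 0)) // ['c'], the floor keeps exactly one
```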


14.6 The Full Pipeline

With all components understood, the end-to-end flow for each turn runs from context assembly (getSystemContext's Git snapshot, getUserContext's CLAUDE.md content and date) through system prompt construction (getSystemPrompt's static/dynamic split) to the compression checks (time-based clearing, micro compact, auto compact) before the API request is built.

The context functions (getUserContext and getSystemContext) are memoized for the session, and getSystemPrompt's dynamic sections are cached through the section registry. The compression mechanisms interact with this caching in a defined way: when compactConversation completes, it calls notifyCompaction to reset cache-read baselines so the post-compact drop in cache hits is not flagged as an unexpected cache break. It also calls resetGetMemoryFilesCache('compact') so that the next time CLAUDE.md files are loaded, the InstructionsLoaded hook fires with 'compact' as the reason rather than 'session_start'.


Key Takeaways

Context construction in Claude Code is a layered, cache-aware pipeline with several design principles that appear consistently throughout the codebase:

Cache boundary discipline: The static sections of the system prompt are ordered specifically to maximize cross-session, cross-organization prompt cache hit rates. Anything that could vary between users or sessions is pushed past SYSTEM_PROMPT_DYNAMIC_BOUNDARY. Every new session-specific conditional that gets added to the static zone doubles the number of distinct cache hashes and reduces hit rates — the code comments actively warn about this class of bug.

Priority through position: CLAUDE.md files are loaded lowest-priority-first and concatenated, so higher-priority files naturally appear later in the string and receive more model attention. This ordering means the hierarchy is implicit in file position rather than explicit metadata.

Graduated compression: The three compression strategies form a defense-in-depth stack. Time-based micro compact handles the "coming back after a break" case where the cache is cold. Cached micro compact handles ongoing large sessions where the cache is warm. Auto compact handles the case where even micro-compaction cannot prevent a context overflow.

Selective persistence: Auto Memory persists only what cannot be re-derived from the codebase itself. The memory type taxonomy deliberately excludes code patterns and architecture because those are always available through file reading. What memory stores is the meta-layer: preferences, corrections, external context, and decisions whose rationale lives in conversation rather than code.

For the query loop that consumes all of this context, see Chapter 5. For the permission model that influences which tool usage instructions appear in the system prompt, see Chapter 7.

Built for learners who want to read Claude Code like a real system.