codebase-context

MCP that gives AI the context of an experienced dev on your team: it knows your patterns, where things are, and why decisions were made.

AI coding agents don't know your codebase. This MCP fixes that.

Your team has internal libraries, naming conventions, and patterns that external AI models have never seen. This MCP server gives AI assistants real-time visibility into your codebase: which libraries your team actually uses, how often, and where to find canonical examples.

Quick Start

Add this to your MCP client config (Claude Desktop, VS Code, Cursor, etc.).

"mcpServers": {
  "codebase-context": {
    "command": "npx",
    "args": ["codebase-context", "/path/to/your/project"]
  }
}

If your environment prompts on first run, use npx --yes ... (or npx -y ...) to auto-confirm.
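
For example, a sketch of the same config with auto-confirm baked in; this assumes your client passes args straight through to npx, so the flag goes before the package name:

{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["-y", "codebase-context", "/path/to/your/project"]
    }
  }
}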

What You Get

  • Internal library discovery → @mycompany/ui-toolkit: 847 uses vs primeng: 3 uses
  • Pattern frequencies → inject(): 97%, constructor(): 3%
  • Pattern momentum → Signals: Rising (last used 2 days ago) vs RxJS: Declining (180+ days)
  • Golden file examples → Real implementations showing all patterns together
  • Testing conventions → Jest: 74%, Playwright: 6%
  • Framework patterns → Angular signals, standalone components, etc.
  • Circular dependency detection → Find toxic import cycles between files
  • Memory system → Record the "why" behind choices so AI doesn't repeat mistakes

How It Works

When generating code, the agent checks your patterns first:

| Without MCP | With MCP |
| --- | --- |
| Uses constructor(private svc: Service) | Uses inject() (97% team adoption) |
| Suggests primeng/button directly | Uses @mycompany/ui-toolkit wrapper |
| Generic Jest setup | Your team's actual test utilities |
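
To make the first row concrete, here is a minimal sketch of the two Angular injection styles side by side (LoggerService is a hypothetical service, not part of this project):

import { Component, inject, Injectable } from '@angular/core';

@Injectable({ providedIn: 'root' })
export class LoggerService {
  log(msg: string) { console.log(msg); }
}

// Without MCP context: classic constructor injection (the 3% pattern above)
@Component({ selector: 'app-legacy', template: '', standalone: true })
export class LegacyComponent {
  constructor(private logger: LoggerService) {}
}

// With MCP context: the team's dominant inject() pattern (97%)
@Component({ selector: 'app-modern', template: '', standalone: true })
export class ModernComponent {
  private logger = inject(LoggerService);
}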

Tip: Auto-invoke in your rules

Add this to your .cursorrules, CLAUDE.md, or AGENTS.md:

## Codebase Context

**At start of each task:** Call `get_memory` to load team conventions.

**CRITICAL:** When user says "remember this" or "record this":
- STOP immediately and call `remember` tool FIRST
- DO NOT proceed with other actions until memory is recorded
- This is a blocking requirement, not optional

Now the agent checks patterns automatically instead of waiting for you to ask.

Tools

| Tool | Purpose |
| --- | --- |
| search_codebase | Semantic + keyword hybrid search |
| get_component_usage | Find where a library/component is used |
| get_team_patterns | Pattern frequencies + canonical examples |
| get_codebase_metadata | Project structure overview |
| get_indexing_status | Indexing progress + last stats |
| get_style_guide | Query style guide rules |
| detect_circular_dependencies | Find import cycles between files |
| remember | Record memory (conventions/decisions/gotchas) |
| get_memory | Query recorded memory by category/keyword |
| refresh_index | Re-index the codebase |
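
You never call these by hand; your MCP client issues JSON-RPC tools/call requests on the agent's behalf. As a rough sketch, a search_codebase call might look like this (the query argument name is an illustrative assumption, not this server's documented schema):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_codebase",
    "arguments": { "query": "ui-toolkit button wrapper" }
  }
}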

File Structure

The MCP creates the following structure in your project:

.codebase-context/
  ├── memory.json         # Team knowledge (commit this)
  ├── intelligence.json   # Pattern analysis (generated)
  ├── index.json          # Keyword index (generated)
  └── index/              # Vector database (generated)

Recommended .gitignore: The vector database and generated files can be large. Add this to your .gitignore to keep them local while sharing team memory:

# Codebase Context MCP - ignore generated files, keep memory
.codebase-context/*
!.codebase-context/memory.json

Memory System

Patterns tell you what the team does ("97% use inject"), but not why ("standalone compatibility"). Use remember to capture rationale that prevents repeated mistakes:

// AI won't change this again after recording the decision
remember({
  type: 'decision',
  category: 'dependencies',
  memory: 'Use node-linker: hoisted, not isolated',
  reason:
    "Some packages don't declare transitive deps. Isolated forces manual package.json additions."
});

Memories surface automatically in search_codebase results and get_team_patterns responses.
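
You can also pull memory back on demand. A minimal sketch in the same call style as the example above; the parameter names (category, keyword) are assumptions inferred from the tool description, so check the tool schema for the exact shape:

// Hypothetical query shape; exact parameter names may differ
get_memory({
  category: 'dependencies',
  keyword: 'node-linker'
});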

Early baseline — known quirks:

  • Agents may bundle multiple things into one entry
  • Duplicates can happen if you record the same thing twice
  • Edit .codebase-context/memory.json directly to clean up
  • Be explicit: "Remember this: use X not Y"

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| EMBEDDING_PROVIDER | transformers | openai (fast, cloud) or transformers (local, private) |
| OPENAI_API_KEY | - | Required if provider is openai |
| CODEBASE_ROOT | - | Project root to index (CLI arg takes precedence) |
| CODEBASE_CONTEXT_DEBUG | - | Set to 1 to enable verbose logging (startup messages, analyzer registration) |
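
As an example, switching to OpenAI embeddings might look like the sketch below, assuming your MCP client supports an env block for server entries (most, including Claude Desktop and Cursor, do):

{
  "mcpServers": {
    "codebase-context": {
      "command": "npx",
      "args": ["codebase-context", "/path/to/your/project"],
      "env": {
        "EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}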

Performance Note

This tool runs locally on your machine using your hardware.

  • Initial Indexing: The first run works hard. It may take several minutes (e.g., ~2-5 mins for 30k files) to compute embeddings for your entire codebase.
  • Caching: Subsequent queries are instant (milliseconds).
  • Updates: Currently, refresh_index re-scans the codebase. True incremental indexing (processing only changed files) is on the roadmap.


License

MIT
