AI coding agents don't know your codebase. This MCP fixes that.
Your team has internal libraries, naming conventions, and patterns that external AI models have never seen. This MCP server gives AI assistants real-time visibility into your codebase: which libraries your team actually uses, how often, and where to find canonical examples.
Add this to your MCP client config (Claude Desktop, VS Code, Cursor, etc.).
"mcpServers": {
"codebase-context": {
"command": "npx",
"args": ["codebase-context", "/path/to/your/project"]
}
}If your environment prompts on first run, use npx --yes ... (or npx -y ...) to auto-confirm.
- **Internal library discovery** → `@mycompany/ui-toolkit`: 847 uses vs `primeng`: 3 uses
- **Pattern frequencies** → `inject()`: 97%, `constructor()`: 3%
- **Pattern momentum** → Signals: Rising (last used 2 days ago) vs RxJS: Declining (180+ days)
- **Golden file examples** → Real implementations showing all patterns together
- **Testing conventions** → Jest: 74%, Playwright: 6%
- **Framework patterns** → Angular signals, standalone components, etc.
- **Circular dependency detection** → Find toxic import cycles between files
- **Memory system** → Record the "why" behind choices so AI doesn't repeat mistakes
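The "pattern momentum" signal can be approximated from last-seen dates. A minimal sketch, assuming a simple days-since-last-use heuristic; the function name and the cutoff are illustrative, not the server's actual logic:

```javascript
// Classify a pattern as Rising or Declining from its last-seen date.
// The 180-day cutoff mirrors the "RxJS: Declining (180+ days)" example above;
// the real analyzer's heuristic may differ.
function patternMomentum(lastUsed, now = new Date()) {
  const days = Math.floor((now - lastUsed) / (1000 * 60 * 60 * 24));
  return { days, trend: days >= 180 ? 'Declining' : 'Rising' };
}

console.log(patternMomentum(new Date('2024-06-08'), new Date('2024-06-10')));
// { days: 2, trend: 'Rising' }
console.log(patternMomentum(new Date('2023-11-01'), new Date('2024-06-10')).trend);
// Declining
```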
When generating code, the agent checks your patterns first:
| Without MCP | With MCP |
|---|---|
| Uses `constructor(private svc: Service)` | Uses `inject()` (97% team adoption) |
| Suggests `primeng/button` directly | Uses `@mycompany/ui-toolkit` wrapper |
| Generic Jest setup | Your team's actual test utilities |
Add this to your .cursorrules, CLAUDE.md, or AGENTS.md:
```markdown
## Codebase Context

**At start of each task:** Call `get_memory` to load team conventions.

**CRITICAL:** When user says "remember this" or "record this":

- STOP immediately and call `remember` tool FIRST
- DO NOT proceed with other actions until memory is recorded
- This is a blocking requirement, not optional
```
Now the agent checks patterns automatically instead of waiting for you to ask.
| Tool | Purpose |
|---|---|
| `search_codebase` | Semantic + keyword hybrid search |
| `get_component_usage` | Find where a library/component is used |
| `get_team_patterns` | Pattern frequencies + canonical examples |
| `get_codebase_metadata` | Project structure overview |
| `get_indexing_status` | Indexing progress + last stats |
| `get_style_guide` | Query style guide rules |
| `detect_circular_dependencies` | Find import cycles between files |
| `remember` | Record memory (conventions/decisions/gotchas) |
| `get_memory` | Query recorded memory by category/keyword |
| `refresh_index` | Re-index the codebase |
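The `detect_circular_dependencies` tool can be understood as cycle detection over the import graph. A minimal sketch of that idea using depth-first search; the server's actual algorithm and output format are not documented here, and the file names are made up:

```javascript
// Find one import cycle in a module graph via depth-first search.
// `graph` maps each file to the files it imports.
function findCycle(graph) {
  const visiting = new Set(); // files on the current DFS path
  const done = new Set();     // files fully explored, known cycle-free
  const path = [];

  function dfs(file) {
    if (visiting.has(file)) {
      // Reached a file already on the path: slice out the cycle.
      return path.slice(path.indexOf(file)).concat(file);
    }
    if (done.has(file)) return null;
    visiting.add(file);
    path.push(file);
    for (const dep of graph[file] ?? []) {
      const cycle = dfs(dep);
      if (cycle) return cycle;
    }
    path.pop();
    visiting.delete(file);
    done.add(file);
    return null;
  }

  for (const file of Object.keys(graph)) {
    const cycle = dfs(file);
    if (cycle) return cycle;
  }
  return null;
}

const graph = {
  'a.ts': ['b.ts'],
  'b.ts': ['c.ts'],
  'c.ts': ['a.ts'], // toxic cycle: a → b → c → a
  'd.ts': ['a.ts'],
};
console.log(findCycle(graph)); // [ 'a.ts', 'b.ts', 'c.ts', 'a.ts' ]
```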
The MCP creates the following structure in your project:
```
.codebase-context/
├── memory.json         # Team knowledge (commit this)
├── intelligence.json   # Pattern analysis (generated)
├── index.json          # Keyword index (generated)
└── index/              # Vector database (generated)
```
**Recommended `.gitignore`:** The vector database and generated files can be large. Add this to your `.gitignore` to keep them local while sharing team memory:
```gitignore
# Codebase Context MCP - ignore generated files, keep memory
.codebase-context/*
!.codebase-context/memory.json
```

Patterns tell you what the team does ("97% use inject"), but not why ("standalone compatibility"). Use `remember` to capture rationale that prevents repeated mistakes:
```javascript
// AI won't change this again after recording the decision
remember({
  type: 'decision',
  category: 'dependencies',
  memory: 'Use node-linker: hoisted, not isolated',
  reason:
    "Some packages don't declare transitive deps. Isolated forces manual package.json additions."
});
```

Memories surface automatically in `search_codebase` results and `get_team_patterns` responses.
Early baseline — known quirks:
- Agents may bundle multiple things into one entry
- Duplicates can happen if you record the same thing twice
- Edit `.codebase-context/memory.json` directly to clean up
- Be explicit: "Remember this: use X not Y"
| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_PROVIDER` | `transformers` | `openai` (fast, cloud) or `transformers` (local, private) |
| `OPENAI_API_KEY` | - | Required if provider is `openai` |
| `CODEBASE_ROOT` | - | Project root to index (CLI arg takes precedence) |
| `CODEBASE_CONTEXT_DEBUG` | - | Set to `1` to enable verbose logging (startup messages, analyzer registration) |
This tool runs locally on your machine using your hardware.
- Initial Indexing: The first run works hard. It may take several minutes (e.g., ~2-5 mins for 30k files) to compute embeddings for your entire codebase.
- Caching: Subsequent queries are instant (milliseconds).
- Updates: Currently, `refresh_index` re-scans the entire codebase. True incremental indexing (processing only changed files) is on the roadmap.
- 📄 Motivation — Why this exists, research, learnings
- 📋 Changelog — Version history
- 🤝 Contributing — How to add analyzers
MIT