Codebase Memory MCP parses all 158 supported languages using vendored tree-sitter grammars compiled directly into the static binary, so there is nothing to install, no runtime to configure, and nothing that breaks when a grammar library updates. For 11 languages, a second Hybrid LSP pass runs on top of the tree-sitter AST to add semantic type resolution: it resolves user.profile.display_name() to Profile.display_name declared three modules away, tracks generics, infers return types, and resolves cross-file imports. The resulting call edges are accurate enough to drive trace_path across package boundaries and inheritance hierarchies.

Hybrid LSP Languages

Hybrid LSP is a lightweight C implementation of language type-resolution algorithms structurally inspired by major language servers (tsserver/typescript-go, pyright, gopls, Roslyn, Eclipse JDT, rust-analyzer). It runs in-process alongside tree-sitter on every parse — no language server process, no per-project setup, no API key.
| Language | What the Hybrid LSP Pass Handles |
| --- | --- |
| Python (v0.7.0+) | Imports + dotted submodule walks, dataclasses, Self return types, generics, @property, match/case class patterns, SQLAlchemy 2.0 Mapped[T], Pydantic BaseModel, typing.Annotated / ClassVar / Final / InitVar, async/await, classmethod/staticmethod, narrowing (isinstance / is not None / walrus), typing.cast / assert_type, common stdlib (logging, pathlib, json, functools) |
| TypeScript / JavaScript / JSX / TSX | Generics, JSX component dispatch, JSDoc inference for plain JS, .d.ts declarations, module re-exports, method chaining via return-type propagation, per-file overlay chained to a shared cross-file registry |
| PHP (v0.7.0+) | Namespaces, traits, late-static-binding, PHPDoc inference, parameter binding, return-type inference |
| C# (v0.7.0+) | Global usings, file-scoped namespaces, records (including C# 12 primary constructors), LINQ method syntax, async Task<T> / ValueTask<T> unwrap, generic methods, this / base dispatch, var inference, common BCL stdlib |
| Go (sharpened v0.7.0+) | Pre-built per-package cross-file registry, generics, embedded structs, interface satisfaction, package-aware import resolution |
| C / C++ (sharpened v0.7.0+) | Pre-built per-language cross-file registry shared across C and C++; C handles macros + typedef chains + header-vs-source linking; C++ handles templates, namespaces, auto inference, and method resolution via class hierarchy |
| Java (v0.8.0+) | Imports (single-type, on-demand, static), class hierarchies with this / super dispatch, generics, annotations, overload matching by arity and parameter types, lambdas / method references bound to functional interfaces, field-type inference, common JDK stdlib |
| Kotlin (v0.8.0+) | Imports + same-package resolution, classes / objects / companion objects, extension functions, data classes, nullable-type unwrapping, scope functions (let / apply / run / also / with), infix calls, common stdlib |
| Rust (v0.8.0+) | use declarations + module paths, impl blocks and trait methods, struct fields, generics with trait bounds, operator-trait desugaring, derive-macro method synthesis, UFCS static paths, common std prelude |

Two-layer architecture

  1. Tree-sitter pass: fast syntactic analysis that runs for all 158 languages. It extracts definitions, call sites, and imports.
  2. Hybrid LSP pass: type-aware analysis that runs on top of the tree-sitter pass for the 11 languages listed above. It refines call edges using the import graph and a per-file or pre-built cross-file definition registry.

Languages without a Hybrid LSP pass fall back to textual resolution, so every language still yields a result, though a less precise one.
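
The difference between the two layers can be illustrated with a minimal Python sketch. This is not the tool's implementation (which is written in C); the `CallSite` type, the `FIELD_TYPES` and `METHODS` registries, and all function names are hypothetical, chosen only to show how a type-aware pass can narrow a call such as user.profile.display_name() to a single target while a purely textual match cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSite:
    receiver_chain: tuple  # e.g. ("user", "profile")
    method: str            # e.g. "display_name"


# Hypothetical cross-file registry: class -> {field -> declared type}
FIELD_TYPES = {"User": {"profile": "Profile"}}
# Hypothetical definition index: class -> set of method names
METHODS = {"Profile": {"display_name"}, "Account": {"display_name"}}


def textual_resolve(call):
    """Fallback: match on the method name alone, ignoring types."""
    return sorted(
        f"{cls}.{call.method}"
        for cls, names in METHODS.items()
        if call.method in names
    )


def typed_resolve(call, local_types):
    """Type-aware pass: walk the receiver chain through declared field types."""
    if not call.receiver_chain:
        return None
    head, *rest = call.receiver_chain
    current = local_types.get(head)
    for field in rest:
        if current is None:
            return None
        current = FIELD_TYPES.get(current, {}).get(field)
    if current is not None and call.method in METHODS.get(current, set()):
        return f"{current}.{call.method}"
    return None


def resolve(call, local_types, has_type_pass):
    """Prefer the type-aware result; otherwise fall back to textual matching."""
    if has_type_pass:
        target = typed_resolve(call, local_types)
        if target is not None:
            return [target]
    return textual_resolve(call)


call = CallSite(("user", "profile"), "display_name")
print(resolve(call, {"user": "User"}, has_type_pass=True))
print(resolve(call, {}, has_type_pass=False))
```

In this toy setup the type-aware path returns one candidate, while the textual fallback returns every class that defines a method with that name.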

Benchmark Tiers

The benchmark covers 64 real open-source repositories (from 78 to roughly 49K nodes), with 12 standard questions per language. Overall score: 91.8% across all tested languages.

Tier 1 — Excellent (≥ 90%)

17 languages with perfect or near-perfect scores across all benchmark questions:
| Language | Score | Benchmark Repository |
| --- | --- | --- |
| Lua | 100% | neovim/neovim (23,955 nodes) |
| Kotlin | 100% | ktorio/ktor (25,297 nodes) |
| C++ | 100% | nlohmann/json (5,262 nodes) |
| Perl | 100% | mojolicious/mojo (3,287 nodes) |
| Objective-C | 100% | AFNetworking/AFNetworking (1,087 nodes) |
| Groovy | 100% | spockframework/spock (14,081 nodes) |
| C | 100% | jqlang/jq (1,330 nodes) |
| Bash | 100% | bats-core/bats-core (436 nodes) |
| Zig | 100% | zigtools/zls (2,824 nodes) |
| CSS | 100% | animate-css/animate.css |
| YAML | 100% | kubernetes/examples |
| TOML | 100% | rust-lang/cargo |
| HTML | 100% | twbs/bootstrap |
| SCSS | 100% | twbs/bootstrap |
| HCL | 100% | hashicorp/terraform |
| Dockerfile | 100% | docker-library/official-images |
| Swift | 95% | Alamofire/Alamofire (3,631 nodes) |

Tier 2 — Good (75–89%)

16 languages with solid results across all core operations:
| Language | Score | Benchmark Repository |
| --- | --- | --- |
| Python | 87% | django/django (49,398 nodes) |
| TypeScript | 87% | nestjs/nest (9,063 nodes) |
| TSX | 87% | shadcn-ui/ui (29,755 nodes) |
| Go | 87% | codebase-memory-mcp (self) |
| Rust | 87% | BurntSushi/ripgrep (4,118 nodes) |
| Java | 87% | spring-projects/spring-petclinic (660 nodes) |
| R | 87% | tidyverse/dplyr (1,618 nodes) |
| Dart | 87% | felangel/bloc (5,089 nodes) |
| JavaScript | 86% | lodash/lodash (244 nodes) |
| Erlang | 86% | ninenines/cowboy (3,270 nodes) |
| Elixir | 86% | elixir-plug/plug (870 nodes) |
| Scala | 75% | playframework/playframework (19,627 nodes) |
| Ruby | 75% | sinatra/sinatra (1,377 nodes) |
| PHP | 75% | laravel/framework (38,644 nodes) |
| C# | 75% | jasontaylordev/CleanArchitecture (1,043 nodes) |
| SQL | 75% | flyway/flyway |

Tier 3 — Functional (< 75%)

2 languages with functional but limited semantic analysis:
| Language | Score | Notes |
| --- | --- | --- |
| OCaml | 72% | Module functor indirection limits call resolution |
| Haskell | 62% | Function composition (f . g) not modeled as CALLS edges |
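
The tier boundaries used in this section can be written down as a small Python helper. This is only a sketch of the thresholds stated in the tier headings; the function name `benchmark_tier` is hypothetical, and treating a fractional score between 89 and 90 as Tier 2 is an assumption, since the page only lists whole-number scores.

```python
def benchmark_tier(score_percent: float) -> str:
    """Map a benchmark score (in percent) to the tier names used on this page."""
    if score_percent >= 90:
        return "Tier 1 (Excellent)"
    if score_percent >= 75:
        # Assumption: anything from 75 up to, but not including, 90 is Tier 2.
        return "Tier 2 (Good)"
    return "Tier 3 (Functional)"


# A few scores taken from the tables in this section.
sample_scores = {"Swift": 95, "Python": 87, "Scala": 75, "OCaml": 72, "Haskell": 62}
for language, score in sample_scores.items():
    print(f"{language}: {score}% -> {benchmark_tier(score)}")
```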

All 158 Supported Languages

In addition to the benchmarked languages above, the following are fully supported via vendored tree-sitter grammars (not yet benchmarked): Ada, Agda, Apex, Assembly (NASM), Astro, AWK, Beancount, BibTeX, Bicep, Bitbake, Blade, Cairo, Cap’n Proto, Clojure, CMake, COBOL, Common Lisp, Crystal, CSV, CUDA, D, Devicetree, Diff, .env, Elm, Emacs Lisp, F#, Fennel, Fish, FORM, Fortran, FunC, GDScript, .gitattributes, .gitignore, Gleam, GLSL, GN, Go module, Go template, GraphQL, Hare, HLSL, Hyprlang, INI, ISPC, Janet, Jinja2, JSDoc, JSON, JSON5, Jsonnet, Julia, Just, Kconfig, KDL, Lean 4, Linker Script, Liquid, LLVM IR, Luau, Magma, Makefile, Markdown, MATLAB, Mermaid, Meson, Move, Nickel, Nim, Nix, Odin, Pascal, Pkl, PO (gettext), Pony, PowerShell, Prisma, .properties, Protobuf, Puppet, PureScript, Racket, Regex, requirements.txt, ReScript, RON, reStructuredText, Scheme, Slang, Smali, Smithy, Solidity, SOQL, SOSL, Squirrel, SSH config, Starlark, Svelte, Sway, SystemVerilog, TableGen, Tcl, Teal, Templ, Thrift, TLA+, Typst, Verilog, VHDL, Vim script, Vue, WGSL, WIT, Wolfram, XML, Zsh.

Custom File Extensions

Map additional file extensions to supported languages using a JSON configuration file. This is useful for framework-specific extensions that don’t match standard patterns.

Per-project (place in your repository root):
// .codebase-memory.json
{
  "extra_extensions": {
    ".blade.php": "php",
    ".mjs": "javascript"
  }
}
Global (applies to all projects):
// ~/.config/codebase-memory-mcp/config.json
{
  "extra_extensions": {
    ".twig": "html",
    ".phtml": "php"
  }
}
Per-project configuration overrides global for conflicting extensions. Unknown language values are silently skipped. Missing config files are ignored.
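
The precedence rules above can be sketched in Python. This is an illustration of the documented behavior, not the tool's actual loader: `load_extra_extensions`, `merge_extra_extensions`, and the `KNOWN_LANGUAGES` subset are hypothetical names, and the sketch assumes the files contain plain JSON (without the `//` filename label shown in the examples). What happens when a per-project entry with an unknown language collides with a valid global entry is not specified on this page; here the global entry is kept.

```python
import json
from pathlib import Path

# Illustrative subset only; the real tool knows all supported languages.
KNOWN_LANGUAGES = {"php", "javascript", "html"}


def load_extra_extensions(path):
    """Read the "extra_extensions" map from a config file; a missing file yields {}."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}
    return data.get("extra_extensions", {})


def merge_extra_extensions(global_map, project_map, known=KNOWN_LANGUAGES):
    """Apply global entries first, then per-project entries, so the project wins."""
    merged = {}
    for source in (global_map, project_map):
        for extension, language in source.items():
            if language in known:  # unknown language values are skipped silently
                merged[extension] = language
    return merged


global_map = {".twig": "html", ".phtml": "php", ".mjs": "html"}
project_map = {".blade.php": "php", ".mjs": "javascript", ".foo": "klingon"}
print(merge_extra_extensions(global_map, project_map))
```

In this example `.mjs` ends up mapped to `javascript` because the per-project entry is applied last, and `.foo` is dropped because `klingon` is not a known language.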

Nothing to Install

All 158 tree-sitter grammars are compiled into the binary at build time. The binary is fully self-contained — no grammar libraries to install, no version mismatches, no runtime dependencies. Download the binary, run install, restart your agent.