The CodeChunker module splits source files into semantically meaningful segments for embedding and search. For TypeScript and JavaScript files it uses tree-sitter to parse the AST and emit one chunk per top-level declaration (function, class, interface, etc.) along with structured metadata. All other supported file types fall back to sliding line-window chunking.
CodeChunk
Normalised, forward-slash path of the source file relative to the index root.
1-based line number of the first line of the chunk.
1-based line number of the last line of the chunk (inclusive).
The raw source text of the chunk.
Identifier extracted from the AST node, typically the function, class, method, or variable name. undefined for non-AST (line-window) chunks and for anonymous declarations.
Semantic category of the chunk as determined by the AST node type. undefined for non-AST chunks.
The containing declaration formatted as "<type> <name>" (e.g. "class MyService"). Set for class methods and declarations inside namespaces. undefined for top-level chunks.
ChunkType
| ChunkType | Tree-sitter node types |
|---|---|
| "function" | function_declaration, generator_function_declaration, or a variable declarator whose value is a function expression |
| "method" | method_definition, generator_method_definition |
| "class" | class_declaration |
| "namespace" | internal_module, module |
| "interface" | interface_declaration |
| "type-alias" | type_alias_declaration |
| "enum" | enum_declaration |
| "variable" | lexical_declaration, variable_declaration |
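The table can be read as a lookup from tree-sitter node type to ChunkType. The sketch below is an illustration, not the module's actual code: `NODE_TYPE_TO_CHUNK_TYPE`, `chunkTypeForNode`, and the `declaresFunction` flag are hypothetical names. How the chunker decides that a variable's value is a function expression is not described here, so the caller supplies that decision as a flag.

```typescript
type ChunkType =
  | "function"
  | "method"
  | "class"
  | "namespace"
  | "interface"
  | "type-alias"
  | "enum"
  | "variable"

// Direct node-type mappings copied from the table above.
const NODE_TYPE_TO_CHUNK_TYPE: Readonly<Record<string, ChunkType>> = {
  function_declaration: "function",
  generator_function_declaration: "function",
  method_definition: "method",
  generator_method_definition: "method",
  class_declaration: "class",
  internal_module: "namespace",
  module: "namespace",
  interface_declaration: "interface",
  type_alias_declaration: "type-alias",
  enum_declaration: "enum",
  lexical_declaration: "variable",
  variable_declaration: "variable",
}

// Hypothetical helper. `declaresFunction` is true when the declaration's
// variable declarator has a function expression as its value.
function chunkTypeForNode(
  nodeType: string,
  declaresFunction = false
): ChunkType | undefined {
  const mapped = NODE_TYPE_TO_CHUNK_TYPE[nodeType]
  // A variable declaration holding a function is reported as "function".
  if (mapped === "variable" && declaresFunction) return "function"
  // Node types that are not in the table yield undefined.
  return mapped
}
```

Under these assumptions, `chunkTypeForNode("lexical_declaration", true)` yields `"function"`, while the same node type without the flag yields `"variable"`.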
CodeChunker service
listFiles
Enumerate all indexable files under a root directory.
Absolute path to the root directory.
Maximum file size passed to rg --max-filesize. Defaults to "1M".
Effect<ReadonlyArray<string>>: paths relative to root, sorted lexicographically. Files inside ignored directories (.git, node_modules, dist, etc.) and minified bundles are excluded automatically.
chunkFile
Chunk a single source file into an array of CodeChunk values.
Absolute path to the root directory. Used to resolve and normalise path.
Path to the file, relative to root.
Maximum number of lines per chunk for the sliding-window fallback.
Number of lines to overlap between adjacent sliding-window chunks.
Maximum character count per chunk. Chunks that exceed this limit are split further.
Effect<ReadonlyArray<CodeChunk>>. Returns an empty array if the file does not exist, has no meaningful content, or appears to be minified.
chunkFiles
Chunk a list of files and stream results as they become available. Files are processed with a concurrency of 5.
Absolute path to the root directory.
Paths of files to chunk, relative to root.
Maximum lines per sliding-window chunk.
Line overlap between adjacent sliding-window chunks.
Maximum character count per chunk.
Stream<CodeChunk>.
chunkCodebase
Enumerate and chunk an entire codebase in one operation. Combines listFiles and chunkFiles.
Absolute path to the root directory.
Maximum file size filter for enumeration. Defaults to "1M".
Maximum lines per sliding-window chunk.
Line overlap between adjacent sliding-window chunks.
Maximum character count per chunk.
Stream<CodeChunk> — a continuous stream of chunks from all indexable files.
CodeChunker.layer
Layer that provides the CodeChunker implementation. This layer has no failure channel; errors from the underlying file system or spawner are surfaced through the service methods themselves.
Requirements:
- ChildProcessSpawner.ChildProcessSpawner: spawns the rg process used by listFiles.
- FileSystem.FileSystem: reads file content in chunkFile.
- Path.Path: resolves and normalises file paths.
SemanticSearch.layer provisions CodeChunker.layer automatically. You only need to provide CodeChunker.layer directly when using the chunker outside of SemanticSearch.
AST chunking
For files with a .ts, .tsx, .js, or .jsx extension, CodeChunker parses the file with tree-sitter and extracts top-level declarations as chunk boundaries.
Chunking rules:
- Each exported or top-level declaration becomes its own chunk.
- Leading JSDoc/block comments immediately preceding a declaration are included in that chunk.
- Class methods are emitted as separate child chunks with parent set to "class ClassName".
- Declarations inside TypeScript namespaces/modules are emitted with parent set to "namespace Name".
- A class chunk that has method child chunks and spans more than chunkSize lines is truncated to its first sliding-window segment only; the methods carry the rest of the content.
- Minified files (detected heuristically by line count and line length) are skipped entirely.
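The class-truncation rule can be sketched as follows. This is an interpretation rather than the module's code: `ClassChunk`, `truncateClassChunk`, and the parameter names are hypothetical, and "first sliding-window segment" is assumed to mean the first chunkSize lines of the class.

```typescript
// Hypothetical shape for illustration; field names are assumptions.
interface ClassChunk {
  startLine: number // 1-based
  endLine: number // 1-based, inclusive
  content: string
}

function truncateClassChunk(
  chunk: ClassChunk,
  hasMethodChunks: boolean,
  chunkSize: number
): ClassChunk {
  const lines = chunk.content.split("\n")
  // Classes without method chunks, or short classes, are kept whole.
  if (!hasMethodChunks || lines.length <= chunkSize) return chunk
  // Otherwise keep only the first chunkSize lines; the separately emitted
  // method chunks cover the rest of the class body.
  return {
    startLine: chunk.startLine,
    endLine: chunk.startLine + chunkSize - 1,
    content: lines.slice(0, chunkSize).join("\n"),
  }
}
```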
All other supported file types are chunked with a sliding line window controlled by chunkSize and chunkOverlap.
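A minimal sketch of sliding line-window chunking, assuming each window holds at most chunkSize lines and starts chunkSize minus chunkOverlap lines after the previous one. The names `LineWindow` and `lineWindows`, the field names, and the handling of the final window are assumptions for illustration; the module's actual splitting, including the maximum character count, may differ.

```typescript
// Hypothetical shape for illustration; field names are assumptions.
interface LineWindow {
  startLine: number // 1-based
  endLine: number // 1-based, inclusive
  content: string
}

function lineWindows(
  text: string,
  chunkSize: number,
  chunkOverlap: number
): LineWindow[] {
  if (chunkSize < 1) throw new RangeError("chunkSize must be at least 1")
  // Blank content produces no windows.
  if (text.trim() === "") return []
  // Clamp the overlap so every window advances by at least one line.
  const overlap = Math.min(Math.max(chunkOverlap, 0), chunkSize - 1)
  const step = chunkSize - overlap
  const lines = text.split(/\r?\n/)
  const windows: LineWindow[] = []
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + chunkSize, lines.length)
    windows.push({
      startLine: start + 1,
      endLine: end,
      content: lines.slice(start, end).join("\n"),
    })
    // Stop once a window has reached the last line of the file.
    if (end === lines.length) break
  }
  return windows
}
```

With a 10-line input, a chunkSize of 4, and a chunkOverlap of 1, this sketch yields windows covering lines 1-4, 4-7, and 7-10.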
Supported file types
The chunker indexes files with the following extensions:
Source code: c, cc, cpp, cs, css, cts, cxx, go, gql, graphql, h, hpp, html, ini, java, js, jsx, kt, kts, less, lua, mjs, mts, php, py, rb, rs, sass, scala, scss, sh, sql, svelte, swift, ts, tsx, vue, xml, zsh
Documentation: adoc, asciidoc, md, mdx, rst, txt
Files inside .git, .next, .nuxt, .svelte-kit, .turbo, build, coverage, dist, node_modules, and target directories are always excluded.
Utility functions
chunkFileContent
Returns an empty array when content is blank or heuristically detected as minified.
isProbablyMinified
Returns true when the content is likely a minified bundle. Content is considered minified when all three conditions hold: it has fewer than 20 newlines, at least 80% of its lines are 300 or more characters long, and the total content is at least 2,000 characters.
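One way to read the heuristic is sketched below. `looksMinified` is a hypothetical stand-in, not the exported function, and it assumes the three conditions are combined with a logical AND and that lines are split on "\n".

```typescript
function looksMinified(content: string): boolean {
  // Short content is never treated as minified.
  if (content.length < 2000) return false
  const lines = content.split("\n")
  // The number of newlines is one less than the number of lines.
  const newlineCount = lines.length - 1
  if (newlineCount >= 20) return false
  // Count lines that are at least 300 characters long.
  const longLines = lines.filter((line) => line.length >= 300).length
  return longLines / lines.length >= 0.8
}
```

Under this reading, a single 5,000-character line is flagged, while a file of 30 long lines is not, because it has 20 or more newlines.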
isMeaningfulFile
Returns true when the file at path has a supported extension and does not reside inside an ignored directory. The path is matched in a case-insensitive, normalised (forward-slash) form.
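The check can be approximated as follows. This is a sketch under assumptions, not the exported function: `looksIndexable` and the two constant names are hypothetical, the extension and directory lists are copied from the "Supported file types" section, and the treatment of dotfiles is a guess.

```typescript
const SUPPORTED_EXTENSIONS = new Set([
  // Source code
  "c", "cc", "cpp", "cs", "css", "cts", "cxx", "go", "gql", "graphql", "h",
  "hpp", "html", "ini", "java", "js", "jsx", "kt", "kts", "less", "lua",
  "mjs", "mts", "php", "py", "rb", "rs", "sass", "scala", "scss", "sh",
  "sql", "svelte", "swift", "ts", "tsx", "vue", "xml", "zsh",
  // Documentation
  "adoc", "asciidoc", "md", "mdx", "rst", "txt",
])

const IGNORED_DIRECTORIES = new Set([
  ".git", ".next", ".nuxt", ".svelte-kit", ".turbo",
  "build", "coverage", "dist", "node_modules", "target",
])

function looksIndexable(path: string): boolean {
  // Normalise to forward slashes and lower case before matching.
  const normalised = path.replace(/\\/g, "/").toLowerCase()
  const segments = normalised.split("/").filter((s) => s.length > 0)
  const fileName = segments.pop()
  if (fileName === undefined) return false
  // Reject files located anywhere beneath an ignored directory.
  if (segments.some((dir) => IGNORED_DIRECTORIES.has(dir))) return false
  const dot = fileName.lastIndexOf(".")
  // No extension, or a dotfile such as ".gitignore" (assumed unsupported).
  if (dot <= 0) return false
  return SUPPORTED_EXTENSIONS.has(fileName.slice(dot + 1))
}
```

For example, `looksIndexable("SRC\\Utils\\Helper.TSX")` is true in this sketch, while `looksIndexable("node_modules/pkg/index.js")` is false.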