A minimal agentic coding assistant — a “Hello, World!” for AI agents. It mirrors the core design of Claude Code at a readable scale: a model-independent agentic loop, a small set of coding tools, simple context management, and a CLI interface.

How it works

The assistant runs a loop:
  1. You type a request
  2. The model decides which tools to call
  3. Tools are executed and results are fed back to the model
  4. The loop continues until the model has nothing left to do
  5. The final response is printed and you can type the next request
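The loop above can be sketched in a few lines of Python. This is a hedged illustration with hypothetical names and a stub standing in for the model; the real loop lives in agent.py and talks to the actual model APIs:

```python
# Minimal sketch of the agentic loop, with a stub standing in for the model.
def run_agent(model, tools, user_request):
    """Call tools on the model's behalf until it has nothing left to do."""
    messages = [{"role": "user", "content": user_request}]
    while True:
        reply = model(messages)  # the model decides which tool to call, if any
        messages.append({"role": "assistant", "content": reply})
        if "tool_call" not in reply:  # no tool requested: we're done
            return reply["text"]
        call = reply["tool_call"]
        result = tools[call["name"]](**call["args"])  # execute the tool
        messages.append({"role": "tool", "content": result})  # feed result back

# Stub model: requests list_directory once, then produces a final answer.
def stub_model(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "list_directory", "args": {"path": "."}}}
    return {"text": "The directory contains agent.py and cli.py."}

tools = {"list_directory": lambda path: "agent.py\ncli.py"}
print(run_agent(stub_model, tools, "List the files in this directory"))
```

The key design point mirrored from Claude Code: the loop itself knows nothing about any particular model, so swapping backends only changes how `model(messages)` is implemented.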
The four tools available to the model are:

Tool            What it does
read_file       Read the contents of a file
write_file      Create or overwrite a file
list_directory  List files in a directory
run_bash        Run any shell command (grep, git, python, tests, …)
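Each tool maps naturally onto a small Python function. A rough sketch — the project's actual signatures, error handling, and safety checks may differ:

```python
import os
import subprocess

def read_file(path: str) -> str:
    """Return the contents of a file."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def write_file(path: str, content: str) -> str:
    """Create or overwrite a file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"Wrote {len(content)} characters to {path}"

def list_directory(path: str = ".") -> str:
    """List files in a directory, one per line."""
    return "\n".join(sorted(os.listdir(path)))

def run_bash(command: str, cwd: str = ".") -> str:
    """Run a shell command and return its combined output."""
    proc = subprocess.run(command, shell=True, cwd=cwd,
                          capture_output=True, text=True, timeout=60)
    return proc.stdout + proc.stderr
```

Because run_bash can invoke anything on the system, a small tool set like this still covers most coding workflows: searching with grep, running tests, calling git, and so on.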

Setup

Prerequisites: uv

1. Enter the project directory:

   cd examples/local-coding-assistant

2. Install dependencies:

   uv sync

3. Configure API key:

   # Copy the example env file and add your API key
   cp .env.example .env
   # Edit .env and set ANTHROPIC_API_KEY

Running

uv run lca
You’ll see a prompt like this:
╔══════════════════════════════════════╗
║      Local Coding Assistant          ║
║  Type your request and press Enter.  ║
║  Ctrl+C or 'exit' to quit.           ║
╚══════════════════════════════════════╝

  Backend : anthropic
  Model   : claude-sonnet-4-6
  Work dir: .

>
Type your request and press Enter. Use exit or Ctrl+C to quit.

Example interactions

> List the files in this directory

> Read pyproject.toml and summarize it

> Create a file hello.py that prints Hello World, then run it

> Find all .py files and count their lines of code

> What does the agent.py file do?

Switching to a local model

The assistant supports llama.cpp as a local backend via its OpenAI-compatible server. Pass --backend local and --model to select the model. The llama-server is started and stopped automatically.
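The auto-start behaviour amounts to spawning llama-server as a child process and terminating it when the assistant exits. A hypothetical sketch of the mechanism (the project handles this internally):

```python
import atexit
import subprocess

def start_server(cmd: list[str]) -> subprocess.Popen:
    """Spawn a server process and register it for cleanup at exit."""
    proc = subprocess.Popen(cmd)
    atexit.register(proc.terminate)  # stop the server when the assistant exits
    return proc

# Example invocation (requires llama-server on PATH):
# start_server(["llama-server", "-hf", "LiquidAI/LFM2-24B-A2B-GGUF:Q4_0",
#               "--port", "8080"])
```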

From a HuggingFace repo

The model is downloaded and cached on first run:
uv run lca --backend local --model LiquidAI/LFM2-24B-A2B-GGUF:Q4_0
If the repo requires authentication, set HF_TOKEN in your environment or .env file.

From a local GGUF file

uv run lca --backend local --model /path/to/model.gguf

Non-interactive mode

uv run lca --backend local --model LiquidAI/LFM2-24B-A2B-GGUF:Q4_0 -p "What does agent.py do?"

Manual server management

If you prefer to manage the llama-server yourself, omit --model and the assistant will connect to the already-running server:
# terminal 1 — start server manually
llama-server -hf LiquidAI/LFM2-24B-A2B-GGUF:Q4_0 --port 8080

# terminal 2 — connect without auto-start
uv run lca --backend local

Configuration

All settings are controlled via environment variables (or a .env file):
Variable              Default                   Description
LCA_BACKEND           anthropic                 Backend to use: anthropic or local
ANTHROPIC_API_KEY                               Your Anthropic API key
LCA_ANTHROPIC_MODEL   claude-sonnet-4-6         Anthropic model name
LCA_LOCAL_BASE_URL    http://localhost:8080/v1  llama.cpp server URL
LCA_LOCAL_MODEL       local                     Model passed to the server (HF path or file path)
LCA_LOCAL_CTX_SIZE    32768                     Context window size for the local server
LCA_LOCAL_GPU_LAYERS  99                        Number of layers to offload to GPU
LCA_MAX_TOKENS        8192                      Max tokens per response
LCA_WORKING_DIR       .                         Working directory for bash commands
HF_TOKEN                                        HuggingFace token (required for gated models)
CLI flags override env vars:
uv run lca --backend local --model LiquidAI/LFM2-24B-A2B-GGUF:Q4_0 --working-dir /path/to/my/project
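The precedence can be illustrated with argparse defaults drawn from the environment. This is a sketch of the pattern, not the project's actual CLI code:

```python
import argparse
import os

# Defaults fall back to env vars, which fall back to built-in values,
# so an explicit flag always wins.
parser = argparse.ArgumentParser(prog="lca")
parser.add_argument("--backend",
                    default=os.environ.get("LCA_BACKEND", "anthropic"))
parser.add_argument("--working-dir",
                    default=os.environ.get("LCA_WORKING_DIR", "."))

args = parser.parse_args(["--backend", "local"])
print(args.backend)  # → local: the flag overrides env var and default
```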

Benchmarking

The benchmark/ directory contains two task suites, each with 10 tasks of increasing difficulty (easy → hard) and automated verifiers.

Default suite — this project

Tasks range from reading pyproject.toml to multi-file code analysis:
uv run python benchmark/run.py --backend anthropic --task 1,2,3
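A verifier is essentially a predicate over the task's outcome. A hypothetical example of the shape such a check might take (the real verifiers live in benchmark/ and may work differently):

```python
import os

def verify_file_created(workdir: str, filename: str, expected: str) -> bool:
    """Pass if the agent created the file and it contains the expected text."""
    path = os.path.join(workdir, filename)
    if not os.path.exists(path):
        return False
    with open(path, encoding="utf-8") as f:
        return expected in f.read()
```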

llama.cpp suite — real-world C++ codebase

Tasks operate on a large open-source C++ project:
git clone https://github.com/ggerganov/llama.cpp /tmp/llama.cpp

Output

Results are saved to benchmark/results/<timestamp>-<suite>-<backend>-<model>.json and a summary table is printed:
Model : claude-sonnet-4-6 (anthropic)
Date  : 2026-02-27 12:55

#    Task                                     Pass       Time       In/Out tokens  Turns
----------------------------------------------------------------------------------------
1    List directory                           ✓          4.8s            2191/267      2
...
10   Compare LLM backends                     ✓         47.5s          10594/2419      4
----------------------------------------------------------------------------------------

Score: 10/10  |  Total tokens: 75702  |  Avg time: 13.5s

Source code

View the complete source code on GitHub.
