LLM lets you run prompts directly from the command line with `llm 'your prompt'`, or the equivalent `llm prompt 'your prompt'`. The default model is `gpt-4o-mini` (requires an OpenAI API key), but you can swap in any installed model with a single flag. Responses stream to your terminal as they arrive, and every exchange is automatically saved to a local SQLite database.
Executing a Prompt
Run a basic prompt
Tokens stream to your terminal as they arrive. To disable streaming and wait for the complete response, pass `--no-stream`.
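As a minimal sketch of both modes, assuming `llm` is installed and an OpenAI API key is configured for the default model (the prompt text is only an illustration):

```shell
# Illustrative prompt; any text works.
PROMPT='Ten names for cheesecakes'

# Streams tokens to the terminal as they arrive.
llm "$PROMPT"

# Equivalent explicit form using the prompt subcommand.
llm prompt "$PROMPT"

# Waits for the complete response before printing anything.
llm "$PROMPT" --no-stream
```

The `--no-stream` form tends to be the more convenient one when the output is captured by a script rather than read live.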
Choose a model
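The three ways of choosing a model covered in this section can be sketched as follows. The model IDs are examples only and assume the corresponding models are installed and have keys configured:

```shell
# Select a model explicitly by ID or alias.
llm 'Ten names for cheesecakes' -m gpt-4o

# Fuzzy-search: every -q term must match; the shortest matching ID is used.
llm 'Ten names for cheesecakes' -q 4o -q mini

# Change the default for the rest of this shell session.
export LLM_MODEL=gpt-4o
llm 'Ten names for cheesecakes'
```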
Pass `-m` with a model ID or alias. Use `-q` one or more times to fuzzy-search for a model by substring; LLM picks the shortest matching ID. Set the `LLM_MODEL` environment variable to change the default for the current shell session.
Model Options
Some models expose configurable parameters. Pass them with `-o name value`.
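A short sketch, assuming an OpenAI chat model is in use; `temperature` and `max_tokens` are options those models accept, and other models may expose different ones:

```shell
# Illustrative value; OpenAI models accept a temperature between 0 and 2.
TEMPERATURE=1.5

# A single option.
llm 'Ten names for cheesecakes' -o temperature "$TEMPERATURE"

# Repeat -o to pass several options at once.
llm 'Ten names for cheesecakes' -o temperature 0.2 -o max_tokens 100
```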
Attachments
Multi-modal models such as `gpt-4o` and `gpt-4o-mini` accept attachments; depending on the model, these can be images, PDFs, audio, or video. Pass them with `-a`.
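A minimal sketch, assuming a model that accepts images. The file name and URL are hypothetical placeholders:

```shell
# Hypothetical local file and URL, used purely for illustration.
IMAGE='photo.jpg'
IMAGE_URL='https://example.com/photo.jpg'

# Attach a local file.
llm 'Describe this image' -a "$IMAGE"

# Attach by URL, and repeat -a to send more than one attachment.
llm 'Compare these two images' -a "$IMAGE" -a "$IMAGE_URL"
```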
System Prompts
Use `-s` / `--system` to prepend a system prompt that instructs the model before your user prompt.
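For illustration, a sketch with an invented system prompt. The second command relies on `llm` reading the user prompt from standard input:

```shell
# Illustrative system prompt.
SYSTEM='You are a terse assistant. Answer in one sentence.'

# System prompt plus a regular user prompt.
llm 'What is a SQLite database?' -s "$SYSTEM"

# Piped input becomes the user prompt; the system prompt says what to do with it.
echo 'def add(a, b): return a + b' | llm -s 'Explain this code'
```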
Extracting Fenced Code Blocks
When you ask an LLM to write code, it typically wraps the output in a Markdown fenced block. The `-x` / `--extract` flag returns only the content of the first such block.
Use `--xl` / `--extract-last` to return the last fenced block instead. The full response (including surrounding text) is still saved to the log database.
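A brief sketch of both flags, with an illustrative prompt:

```shell
# Illustrative prompt that is likely to produce a fenced code block.
PROMPT='Python function to reverse a string'

# Print only the contents of the first fenced block.
llm "$PROMPT" -x

# Print only the contents of the last fenced block.
llm "$PROMPT" --xl
```

Because only the code is printed, the output can be redirected straight into a file if that is useful.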
Schemas
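The three ways of supplying a schema can be sketched as follows, assuming a model with schema support. The field names and the file name `dog.schema.json` are hypothetical:

```shell
# Concise shorthand: comma-separated fields, with an optional type after the name.
SCHEMA='name, age int, one_sentence_bio'
llm --schema "$SCHEMA" 'Invent a cool dog'

# Inline JSON schema.
llm --schema '{"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}, "required": ["name", "age"]}' 'Invent a cool dog'

# JSON schema stored in a file (hypothetical path).
llm --schema dog.schema.json 'Invent a cool dog'
```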
Models from OpenAI, Anthropic, and Google Gemini can return structured JSON matching a schema. Pass the schema inline, as a file, or using LLM’s concise shorthand.
Fragments
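A rough sketch of fragment usage with `-f`. The file name, URL, and alias are hypothetical placeholders:

```shell
# Hypothetical local file used as a fragment.
DOC='README.md'

# Use a file as a fragment.
llm -f "$DOC" 'Summarize this document'

# Use a URL as a fragment; repeat -f to combine several.
llm -f "$DOC" -f https://example.com/notes.txt 'How do these two texts differ?'

# Save an alias for a fragment, then refer to it by name.
llm fragments set mydocs "$DOC"
llm -f mydocs 'List the main headings'
```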
Fragments let you inject reusable blocks of text (files, URLs, or saved aliases) into a prompt without duplicating the content in the database. Pass each one with `-f`.
Continuing a Conversation
By default each `llm` invocation starts a fresh conversation. Pass `-c` / `--continue` to continue the most recent one.
`--continue` automatically uses the same model as the conversation you are continuing. To continue a specific past conversation, supply its ID with `--cid`.
You can look up conversation IDs with `llm logs`.
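A minimal sketch of continuing conversations. The conversation ID is a made-up placeholder; real IDs appear in the output of `llm logs`:

```shell
# Start a new conversation.
llm 'Ten names for cheesecakes'

# Continue the most recent conversation, using the same model.
llm 'Pick the three best and explain why' -c

# Show the three most recent log entries, including their conversation IDs.
llm logs -n 3

# Continue a specific conversation by ID (hypothetical value).
CID='01h53zma5txeby33t1kbe3xk8q'
llm 'Five more in the same style' --cid "$CID"
```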
Additional Flags
| Flag | Description |
|---|---|
| `--no-stream` | Wait for the full response before printing |
| `--async` | Run the prompt asynchronously |
| `-u` / `--usage` | Print token usage after the response |
| `-n` / `--no-log` | Skip saving this prompt to the database |
| `--log` | Force logging even when logging is disabled globally |
Starting an Interactive Chat
`llm chat` opens a persistent interactive session. This is especially useful for local models so they do not have to reload into memory for each prompt.
`llm chat` accepts many of the same options as `llm prompt`, including `-m`, `-s`, and `-o`.
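A sketch of a few ways to start a chat, assuming the named model is installed. Each command opens an interactive session, so they are meant to be run one at a time:

```shell
# Example model ID; substitute any installed model.
CHAT_MODEL='gpt-4o'

# Start a chat with a specific model.
llm chat -m "$CHAT_MODEL"

# Start a chat with a system prompt and a model option.
llm chat -m "$CHAT_MODEL" -s 'You are a terse assistant' -o temperature 0.5

# Resume the most recent conversation as a chat.
llm chat -c
```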
Chat Commands
Once inside a chat session, several special commands are available:
!multi: paste multiple lines
Type `!multi` to begin a multi-line block, then `!end` to submit it. Use a custom delimiter if your text contains `!end`: start the block with `!multi abc` and finish it with `!end abc`.
!edit: open your editor
Type `!edit` to open the prompt in your `$EDITOR` before sending.
!fragment: inject a fragment mid-chat
Type `!fragment` followed by one or more fragments to inject them into the current turn. It can also be combined with `!multi`.
exit / quit: end the session
Type `exit` or `quit` followed by Enter to leave the chat.
`llm chat` accepts the same `--tool`/`-T` and `--functions` options as `llm prompt`, letting you start a chat with specific tools already enabled.
Listing Available Models
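Run `llm models` to list every available model. Filter the list by substring with `-q`, look up specific model IDs with `-m`, or add `--options` to display parameter documentation alongside each model.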
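As a brief sketch of these listing commands (the search terms and model IDs are examples only):

```shell
# Example search term.
QUERY='4o'

# List every available model.
llm models

# Keep only models whose ID contains the search term; repeat -q to narrow further.
llm models -q "$QUERY"
llm models -q "$QUERY" -q mini

# Show specific models by ID.
llm models -m gpt-4o -m gpt-4o-mini

# Include the documented options for each model.
llm models --options
```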
Setting Default Options for Models
Configure a persistent default option for a model with `llm models options set`.
The setting is saved to `model_options.json` in the LLM config directory and applied automatically on every subsequent prompt.
Default model options apply to both `llm prompt` and `llm chat` but are not used when calling LLM as a Python library.
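A minimal sketch of managing default options; the model ID and the option value are illustrative:

```shell
# Example model ID.
MODEL='gpt-4o'

# Store a default temperature for the model.
llm models options set "$MODEL" temperature 0.5

# Review the stored defaults, for all models or for one.
llm models options list
llm models options show "$MODEL"

# Remove the default again.
llm models options clear "$MODEL" temperature
```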