Vision
nanobot aims to be the simplest, most hackable AI agent framework while maintaining full functionality. We’re not trying to be the biggest or most feature-rich; we’re optimizing for clarity, simplicity, and research-readiness.
Design Goals
- Keep it tiny: Target ~5,000 core agent lines (currently ~4,000)
- Stay readable: Every line should be understandable
- Make it hackable: Easy to modify and extend
- Remain practical: Real features, not toy examples
Current Status (v0.1.4.post3)
✅ What’s Working
- 10 chat channels: Telegram, Discord, WhatsApp, Feishu, Email, Slack, QQ, DingTalk, Matrix, Mochat
- 15+ LLM providers: OpenRouter, Anthropic, OpenAI, DeepSeek, Gemini, Groq, and more
- 8 built-in tools: Shell, filesystem, web, spawn, cron, message, MCP
- MCP support: Model Context Protocol integration
- Multi-modal: Images, voice transcription (Groq Whisper)
- Memory system: Persistent MEMORY.md
- Subagents: Background task spawning
- Scheduled tasks: Cron-based scheduling + heartbeat
- Session isolation: Per-user/thread conversations
- Prompt caching: Anthropic/OpenRouter support
- OAuth providers: OpenAI Codex, GitHub Copilot
- Thinking mode: Experimental reasoning support
🔧 Current Limitations
- Manual testing only: No automated test suite
- Basic memory: Simple markdown, no vector search
- Limited multimodal: Images receive-only (most channels)
- No streaming: Responses sent after completion
- Simple context: No advanced retrieval
Roadmap
Phase 1: Enhanced Multi-Modal (Q2 2026)
Goal: See, hear, and create media.
Features:
- Vision: Image understanding (GPT-4V, Claude 3)
- Image generation: DALL-E, Stable Diffusion integration
- Video support: Receive and analyze videos
- Voice output: Text-to-speech responses
- Audio analysis: Analyze audio files beyond transcription
- Telegram (send images)
- Discord (send images)
- WhatsApp (send images)
- All channels (receive images for vision)
Proposed tools:
generate_image(prompt: str) -> image_path
analyze_image(image_path: str) -> description
speak(text: str) -> audio_path
Phase 2: Long-Term Memory (Q3 2026)
Goal: Never forget important context.
Features:
- Vector search: Semantic memory retrieval
- Automatic summarization: Compress old conversations
- Entity tracking: Remember people, places, facts
- Memory importance scoring: Prioritize key information
- Multi-document memory: Organize by topic/project
- Use lightweight vector DB (ChromaDB, DuckDB)
- Auto-summarize conversations > N messages
- Extract entities with LLM calls
- Store in ~/.nanobot/memory/ with indexes
Proposed tools:
remember(fact: str, importance: int)
recall(query: str) -> relevant_facts
forget(fact_id: str)
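As a rough illustration of how the three proposed memory tools could fit together, here is a minimal Python sketch. It is not nanobot's implementation: the `MemoryStore` class, the `Fact` record, the `limit` parameter, and the generated `fact-N` IDs are all hypothetical, and plain keyword overlap weighted by importance stands in for the planned vector search.

```python
import itertools
import re
from dataclasses import dataclass


@dataclass
class Fact:
    fact_id: str
    text: str
    importance: int


def _words(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class MemoryStore:
    """Hypothetical in-memory store; a real version would persist to disk."""

    def __init__(self) -> None:
        self._facts: dict[str, Fact] = {}
        self._ids = itertools.count(1)

    def remember(self, fact: str, importance: int = 1) -> str:
        """Store a fact and return its generated ID."""
        fact_id = f"fact-{next(self._ids)}"
        self._facts[fact_id] = Fact(fact_id, fact, importance)
        return fact_id

    def recall(self, query: str, limit: int = 3) -> list[Fact]:
        """Return facts sharing words with the query, best match first.

        Score = (number of shared words) * importance. This is a stand-in
        for semantic search, not an embedding-based lookup.
        """
        query_words = _words(query)
        scored = []
        for fact in self._facts.values():
            overlap = len(query_words & _words(fact.text))
            if overlap:
                scored.append((overlap * fact.importance, fact))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:limit]]

    def forget(self, fact_id: str) -> bool:
        """Delete a fact; return False if the ID was unknown."""
        return self._facts.pop(fact_id, None) is not None
```

Swapping the scoring in `recall` for an embedding lookup would leave the three-call interface unchanged, which is the main point of the sketch.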
Phase 3: Better Reasoning (Q4 2026)
Goal: Multi-step planning and self-reflection.
Features:
- Chain-of-thought: Explicit reasoning steps
- Task decomposition: Break complex tasks into subtasks
- Self-critique: Evaluate and revise outputs
- Plan visualization: Show reasoning tree to user
- Alternative exploration: Consider multiple approaches
- Add reason() tool for internal thinking
- Multi-pass agent loop (plan → execute → reflect)
- Reasoning prompt templates
- Visualization in web UI (future)
Proposed tools:
plan(goal: str) -> steps[]
critique(output: str) -> improvements[]
reflect() -> insights
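One way to picture the multi-pass loop (plan → execute → reflect) is the Python sketch below. It is an assumption-laden illustration rather than the planned design: `run_agent_loop`, the `execute` callable, and `max_passes` are hypothetical names, and the planner, executor, and critic are passed in as plain functions so the control flow can be read without any LLM calls.

```python
from typing import Callable


def run_agent_loop(
    goal: str,
    plan: Callable[[str], list[str]],
    execute: Callable[[str], str],
    critique: Callable[[str], list[str]],
    max_passes: int = 3,
) -> list[str]:
    """Repeat plan -> execute -> reflect until the critic is satisfied.

    Returns the step outputs from the last pass that ran.
    """
    outputs: list[str] = []
    current_goal = goal
    for _ in range(max_passes):
        steps = plan(current_goal)                  # plan
        outputs = [execute(step) for step in steps]  # execute
        improvements = critique("\n".join(outputs))  # reflect
        if not improvements:
            break
        # Fold the critique back into the goal for the next pass.
        current_goal = f"{goal} (address: {'; '.join(improvements)})"
    return outputs
```

The `max_passes` cap is there so a critic that never returns an empty list cannot keep the loop running indefinitely.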
Phase 4: More Integrations (Ongoing)
Goal: Work with more platforms and tools.
Channels:
- Twitter/X: Post tweets, reply to mentions
- LinkedIn: Messaging integration
- SMS: Twilio integration
- Mastodon: Fediverse support
- iMessage: Apple Messages (via bridge)
- Signal: Private messaging
- Zulip: Team chat
- Mattermost: Open-source Slack alternative
Providers:
- Together AI: Fast inference
- Fireworks: Model zoo
- Replicate: Run any model
- Cohere: Command models
- AI21: Jurassic models
- Mistral: Mistral AI (if not via OpenRouter)
Tools:
- Calendar: Google Calendar, Outlook integration
- Email send: Proactive email sending
- File sync: Dropbox, Google Drive
- Database: SQL query execution
- Code execution: Jupyter kernels
- Browser: Playwright/Selenium automation
Phase 5: Self-Improvement (2027)
Goal: Learn from feedback and mistakes.
Features:
- User feedback loop: Rate responses, agent learns
- Error tracking: Log and analyze failures
- Automatic retries: Fix mistakes without user intervention
- Preference learning: Adapt to user style
- Skill discovery: Auto-install useful skills
- Feedback storage in ~/.nanobot/feedback/
- Error pattern detection
- Preference profiles in config
- Skill marketplace integration (ClawHub)
Proposed tools:
rate_response(rating: int, feedback: str)
analyze_errors() -> patterns[]
adjust_preferences(key: str, value: any)
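To make the feedback-storage idea concrete, the following Python sketch appends ratings to a JSON Lines file and counts repeated complaints among low-rated responses. Everything beyond the two function names is an assumption for illustration: the explicit `feedback_dir` argument, the `ratings.jsonl` filename, the 1 to 5 rating scale, and the `threshold` parameter are hypothetical, and exact-text matching is a crude stand-in for real error pattern detection.

```python
import json
from collections import Counter
from pathlib import Path


def rate_response(feedback_dir: Path, rating: int, feedback: str) -> Path:
    """Append one rating to a JSON Lines log and return the log path."""
    if not 1 <= rating <= 5:  # assumed scale, not specified by the roadmap
        raise ValueError("rating must be between 1 and 5")
    feedback_dir.mkdir(parents=True, exist_ok=True)
    log_path = feedback_dir / "ratings.jsonl"
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps({"rating": rating, "feedback": feedback}) + "\n")
    return log_path


def analyze_errors(feedback_dir: Path, threshold: int = 2) -> list[tuple[str, int]]:
    """Count identical feedback texts among ratings at or below threshold.

    Returns (normalized feedback text, count) pairs, most frequent first.
    """
    log_path = feedback_dir / "ratings.jsonl"
    if not log_path.exists():
        return []
    counts = Counter()
    with log_path.open(encoding="utf-8") as log:
        for line in log:
            entry = json.loads(line)
            if entry["rating"] <= threshold:
                counts[entry["feedback"].strip().lower()] += 1
    return counts.most_common()
```

An append-only text log keeps the storage inspectable with ordinary tools, which fits the stated preference for simple, readable components.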
Community Priorities
Based on GitHub discussions and Discord feedback:
High Demand
- Web UI: Browser-based interface (like OpenWebUI)
- Streaming responses: Real-time output
- Function calling improvements: Parallel tool execution
- Better error messages: More helpful diagnostics
- RAG support: Document Q&A
Medium Demand
- Plugin system: Third-party tool installation
- Multi-agent coordination: Agents working together
- Custom prompts: User-defined system prompts
- Voice UI: Speak to agent directly
- Mobile app: iOS/Android companion
Low Demand (but interesting)
- Agent marketplace: Share and download agents
- Blockchain integration: Web3 tools
- IoT control: Smart home integration
- AR/VR: Spatial computing
Non-Goals
What we’re not building (to keep nanobot simple):
❌ Enterprise features: SSO, multi-tenancy, admin panels
❌ Complex UIs: Rich web dashboards (keep it CLI-first)
❌ Heavy dependencies: Avoid large frameworks (Django, etc.)
❌ Monolithic architecture: Stay modular and hackable
❌ Kitchen sink: Don’t add every possible feature
If you need these, consider building on top of nanobot or using a different framework.
How to Contribute
Want to help with the roadmap?
- Pick an item from the roadmap above
- Open a GitHub Discussion to discuss your approach
- Create a PR with your implementation
- Get feedback from maintainers
- Iterate until it’s ready to merge
Versioning Strategy
Current: v0.1.x (Alpha)
- Rapid iteration
- Breaking changes allowed
- Focus on core features
Future: v0.2.x (Beta)
- Stable API
- Deprecation warnings before breaking changes
- Focus on polish and reliability
Long-term: v1.0.0 (Stable)
- Production-ready
- Semantic versioning
- Long-term support
Release Cadence
- Patch releases (v0.1.4.post1): As needed (bug fixes)
- Minor releases (v0.1.5): Every 1-2 weeks (new features)
- Major releases (v0.2.0): When API changes significantly
Feature Requests
Have an idea? Here’s how to suggest it:
- Check existing issues/discussions: Might already be planned
- Open a GitHub Discussion: Describe the feature and use case
- Gauge community interest: See if others want it too
- Estimate complexity: How many lines of code?
- Propose implementation: How would it fit into nanobot?
Good feature requests:
- Align with nanobot’s goals (simple, hackable)
- Have clear use cases
- Don’t add excessive complexity
- Can be implemented in fewer than 500 lines
Requests likely to be declined:
- “Add everything from framework X”
- Niche features used by fewer than 1% of users
- Features that require heavy dependencies
- Anything that violates the “keep it simple” principle
Research Areas
For academic/research use:
- Memory architectures: Better long-term memory designs
- Multi-agent systems: Agent communication protocols
- Tool learning: Automatic tool discovery and composition
- Context optimization: Smarter prompt compression
- Reasoning methods: Novel planning and reflection techniques
Metrics
How we measure success:
- Lines of code: Keep core agent under 5,000 lines
- Startup time: CLI mode under 1 second
- Dependencies: Minimize third-party packages
- Documentation: Every feature documented
- Community: Active Discord, GitHub discussions
- Real usage: People actually use it daily
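The lines-of-code metric can be checked mechanically. The Python sketch below is one possible way to do it, not the project's own script: `count_code_lines` and `within_budget` are hypothetical helpers, and the counting rule (non-blank lines that do not start with `#`, so docstrings still count) is an assumption that may differ from how the ~4,000 figure is actually measured.

```python
from pathlib import Path


def count_code_lines(root: Path) -> int:
    """Count non-blank, non-comment lines in every .py file under root."""
    total = 0
    for path in sorted(root.rglob("*.py")):
        for line in path.read_text(encoding="utf-8").splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                total += 1
    return total


def within_budget(root: Path, budget: int = 5000) -> bool:
    """Return True if the code under root fits the line budget."""
    return count_code_lines(root) <= budget
```

Run against the core agent directory in CI, a check like this would flag a change that pushes the core past the 5,000-line target.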
Timeline
| Quarter | Focus | Key Features |
|---|---|---|
| Q2 2026 | Multi-modal | Vision, image generation, voice output |
| Q3 2026 | Memory | Vector search, summarization, entity tracking |
| Q4 2026 | Reasoning | Chain-of-thought, task decomposition |
| Q1 2027 | Integrations | New channels, providers, tools |
| Q2 2027 | Self-improvement | Feedback loops, error learning |
| Q3 2027 | Polish | Web UI, streaming, better UX |
| Q4 2027 | v1.0 | Production-ready release |
Long-Term Vision (2028+)
- Autonomous agents: Proactively help without prompting
- Agent collaboration: Multiple agents working together
- Continuous learning: Improve over time from usage
- Universal interface: Control anything via natural language
- Personal AI OS: nanobot as your digital assistant layer
Get Involved
- Discord: Join the community
- GitHub: HKUDS/nanobot
- Discussions: Share ideas
- Issues: Report bugs