Desktop Assistant, known as Jarvis, is an open-source virtual assistant written in Python. It listens for your voice, carries out tasks on your computer, answers questions, searches the web, sends emails, plays music, and speaks responses back to you using text-to-speech. Whether you prefer a graphical interface or just your microphone, Jarvis is designed to make everyday desktop tasks faster and more hands-free.
What Jarvis Can Do
Jarvis combines several capabilities into a single conversational interface:
- Voice recognition: captures audio from your microphone and transcribes it using the Google Speech Recognition engine via the SpeechRecognition library.
- Text-to-speech (TTS): responds out loud using pyttsx3, an offline TTS engine that works with SAPI5 on Windows and eSpeak on Ubuntu.
- Web browsing: opens popular websites (Google, YouTube, Wikipedia, Amazon; plus GitHub on Windows) and fires off searches on your configured search engine.
- Wikipedia lookups: fetches two-sentence summaries directly from Wikipedia.
- Email: composes and sends emails via SMTP using credentials stored in config.ini.
- Music playback: plays and controls MP3 files from a configured folder using pygame.mixer.
- System time and date: announces the current time and date on request (Ubuntu entry point only).
- Graphical interface: displays a scrollable chat log and a Speak button inside a Tkinter window so you can follow the conversation visually.
Architecture Overview
The project’s source lives entirely in the src/ directory and is split into five modules:
| File | Purpose |
|---|---|
| Jarvis2.py | Ubuntu entry point. Self-contained: embeds TTS, speech recognition, and command dispatch in one file. |
| Jarvis2_4windows.py | Windows entry point. Reads config.ini at startup, then delegates to actions.py and commands.py. |
| actions.py | Windows helpers: speak(), wish_me(), search_engine_selector(), and voice-settings mutators (change_rate, change_voice, change_volume). Used by Jarvis2_4windows.py only; initialises the SAPI5 TTS engine. |
| commands.py | Individual command handlers (command_wikipedia, command_open, command_search, command_mail, command_play_music, etc.). |
| gui.py | Tkinter window definition: the 700×500 root window, scrollable Listbox chat log, and the Speak button. |
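The split between an entry point and per-command handlers can be pictured as a small keyword dispatcher. The following is a minimal sketch, not the project's actual dispatch logic: the handler names are taken from the table above, while the trigger phrases, the TRIGGERS list, and the pick_handler function are hypothetical names introduced only for illustration.

```python
from typing import Optional

# Hypothetical trigger phrases mapped to the handler names listed for commands.py.
# Order matters: the first trigger found in the query wins.
TRIGGERS = [
    ("wikipedia", "command_wikipedia"),
    ("open", "command_open"),
    ("search", "command_search"),
    ("mail", "command_mail"),
    ("play music", "command_play_music"),
]


def pick_handler(query: str) -> Optional[str]:
    """Return the name of the first handler whose trigger appears in the query."""
    text = query.lower().strip()
    for trigger, handler in TRIGGERS:
        if trigger in text:
            return handler
    # No trigger matched; the caller decides how to respond.
    return None
```

An entry point built this way would look up the returned name in commands.py and call it with the transcribed query; how the real entry points match phrases may differ.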
The Windows entry point (Jarvis2_4windows.py) checks for config.ini before doing anything else. If the file is missing, it prints an error and exits. The Ubuntu entry point (Jarvis2.py) is configuration-free and runs directly.
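The startup check described above might look roughly like the sketch below, which uses only the standard library. The function name load_config, the wording of the error message, and the exit code are assumptions; the project's own code may differ.

```python
import sys
from configparser import ConfigParser
from pathlib import Path


def load_config(path="config.ini"):
    """Return the parsed configuration, or exit with an error if the file is missing."""
    config_file = Path(path)
    if not config_file.is_file():
        # Assumed message and exit code; the real entry point may word this differently.
        print(f"Error: {config_file} not found.", file=sys.stderr)
        sys.exit(1)
    parser = ConfigParser()
    parser.read(config_file, encoding="utf-8")
    return parser
```

Failing early like this keeps later code free of "is the config there?" checks, since every caller can assume a parsed configuration exists.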
Key Features
Offline Text-to-Speech
Uses pyttsx3 for TTS, so no internet connection is required to speak responses. On Windows, voice, rate (default 150 wpm), and volume are adjustable at runtime or via config.ini.
Google Speech Recognition
Captures audio with PyAudio and transcribes it through the Google Speech Recognition API. Energy threshold (default 300) and pause threshold (0.5 s) are tunable.
Configurable Search Engine
Pick Google, Bing, DuckDuckGo, or YouTube as your default search engine in config.ini (Windows only). Any other value is validated as a live URL before falling back to Google.
Debug / Type Mode
Set debug = True in config.ini to type commands at the terminal instead of speaking, which is useful for testing without a microphone (Windows entry point only).
Music Playback
Plays, pauses, unpauses, and stops MP3 files via pygame.mixer. On Windows, point musicpath in config.ini to your music folder.
Cross-Platform
Runs on Windows (with SAPI5) and Ubuntu (with eSpeak). Separate entry points and requirement files keep platform-specific dependencies isolated.
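The search-engine fallback described under Configurable Search Engine can be sketched as follows. This is an illustration under stated assumptions, not the project's search_engine_selector(): the URL templates, the KNOWN_ENGINES table, and both function names are hypothetical, and the default check only verifies that a value is shaped like an http(s) URL. A real liveness check would need a network request, which can be passed in through the is_valid parameter.

```python
from urllib.parse import urlparse

# Assumed query-URL prefixes; these are not taken from the project.
GOOGLE = "https://www.google.com/search?q="
KNOWN_ENGINES = {
    "google": GOOGLE,
    "bing": "https://www.bing.com/search?q=",
    "duckduckgo": "https://duckduckgo.com/?q=",
    "youtube": "https://www.youtube.com/results?search_query=",
}


def looks_like_url(value):
    """Syntactic check only: an http(s) scheme and a host must be present."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def select_search_engine(value, is_valid=looks_like_url):
    """Resolve a configured engine name or custom URL, falling back to Google."""
    cleaned = value.strip()
    key = cleaned.lower()
    if key in KNOWN_ENGINES:
        return KNOWN_ENGINES[key]
    if is_valid(cleaned):
        # Custom value accepted as-is.
        return cleaned
    return GOOGLE
```

Passing the validator as a parameter keeps the selection logic testable offline while leaving room for a stricter reachability check.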
System Requirements
| Requirement | Details |
|---|---|
| Python | 3.9 or later (3.9 recommended — all dependencies are pinned against it) |
| Microphone | Required for voice input; can be bypassed with debug = True |
| Internet | Required for Google Speech Recognition and Wikipedia lookups |
| OS | Windows 10+ or Ubuntu (tested with espeak installed) |
| Ubuntu extra | espeak system package (sudo apt-get install espeak) and portaudio19-dev for PyAudio |
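A quick way to confirm the Python and OS rows before installing is a small preflight check. The sketch below is only an illustration: check_environment is a hypothetical helper, and it can tell Windows from Linux but cannot confirm the Windows release, the Ubuntu distribution, or whether espeak and portaudio19-dev are installed.

```python
import platform
import sys

MIN_PYTHON = (3, 9)  # Minimum version from the requirements table.


def check_environment(version=None, system=None):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    version = sys.version_info if version is None else version
    system = platform.system() if system is None else system
    problems = []
    if tuple(version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found; 3.9 or later is required."
        )
    # platform.system() reports "Linux" on Ubuntu and "Windows" on Windows.
    if system not in ("Windows", "Linux"):
        problems.append(f"Unsupported operating system: {system}.")
    return problems
```

Calling check_environment() with no arguments inspects the running interpreter; the parameters exist so the check can be exercised with other values.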
Next Steps
Installation
Install Python dependencies on Windows or Ubuntu, or download the pre-built binary installer.
Quickstart
Clone the repo, drop in your config.ini, and give Jarvis your first voice command in under five minutes.