Desktop Assistant — known as Jarvis — is an open-source virtual assistant written in Python. It listens for your voice, carries out tasks on your computer, answers questions, searches the web, sends emails, plays music, and speaks responses back to you using text-to-speech. Whether you prefer a graphical interface or just your microphone, Jarvis is designed to make everyday desktop tasks faster and more hands-free.

What Jarvis Can Do

Jarvis combines several capabilities into a single conversational interface:
  • Voice recognition — captures audio from your microphone and transcribes it using the Google Speech Recognition engine via the SpeechRecognition library.
  • Text-to-speech (TTS) — responds out loud using pyttsx3, an offline TTS engine that works with SAPI5 on Windows and eSpeak on Ubuntu.
  • Web browsing — opens popular websites (Google, YouTube, Wikipedia, Amazon; plus GitHub on Windows) and fires off searches on your configured search engine.
  • Wikipedia lookups — fetches two-sentence summaries directly from Wikipedia.
  • Email — composes and sends emails via SMTP using credentials stored in config.ini.
  • Music playback — plays and controls MP3 files from a configured folder using pygame.mixer.
  • System time and date — announces the current time and date on request (Ubuntu entry point only).
  • Graphical interface — displays a scrollable chat log and a Speak button inside a Tkinter window so you can follow the conversation visually.
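One way to picture how a single transcribed phrase reaches one of the capabilities listed above is a keyword dispatcher. The sketch below is illustrative only: the keywords, the `dispatch` function, and the stub handlers are assumptions made for this example, not the project's actual code.

```python
from typing import Callable, Dict


# Hypothetical stub handlers; the real project keeps its handlers in commands.py.
def handle_wikipedia(query: str) -> str:
    return f"wikipedia lookup: {query}"


def handle_open(query: str) -> str:
    return f"open site: {query}"


def handle_search(query: str) -> str:
    return f"web search: {query}"


# Keyword -> handler table (assumed keywords). The first matching keyword wins.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "wikipedia": handle_wikipedia,
    "open": handle_open,
    "search": handle_search,
}


def dispatch(phrase: str) -> str:
    """Route a transcribed phrase to the first handler whose keyword it contains."""
    text = phrase.lower().strip()
    for keyword, handler in HANDLERS.items():
        if keyword in text:
            # Remove the keyword so the handler receives only its argument.
            return handler(text.replace(keyword, "", 1).strip())
    return "unrecognised command"
```

A table like this keeps each capability in its own function, so adding a command means adding one entry rather than extending a long if/elif chain.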

Architecture Overview

The project’s source lives entirely in the src/ directory and is split into five modules:
| File | Purpose |
| --- | --- |
| Jarvis2.py | Ubuntu entry point. Self-contained: embeds TTS, speech recognition, and command dispatch in one file. |
| Jarvis2_4windows.py | Windows entry point. Reads config.ini at startup, then delegates to actions.py and commands.py. |
| actions.py | Windows helpers: speak(), wish_me(), search_engine_selector(), and voice-settings mutators (change_rate, change_voice, change_volume). Used by Jarvis2_4windows.py only; initialises the SAPI5 TTS engine. |
| commands.py | Individual command handlers (command_wikipedia, command_open, command_search, command_mail, command_play_music, etc.). |
| gui.py | Tkinter window definition: the 700×500 root window, scrollable Listbox chat log, and the Speak button. |

The Windows entry point (Jarvis2_4windows.py) checks for config.ini before doing anything else. If the file is missing, it prints an error and exits. The Ubuntu entry point (Jarvis2.py) needs no configuration file and runs directly.
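The startup check described above can be sketched with the standard library. This is a minimal illustration, not the project's code: the `load_config` name, the error message, the exit code, and the assumption that config.ini sits in the current working directory are all hypothetical.

```python
import configparser
import os
import sys

CONFIG_PATH = "config.ini"  # assumed location: the current working directory


def load_config(path: str = CONFIG_PATH) -> configparser.ConfigParser:
    """Return the parsed config, or exit with an error if the file is missing."""
    if not os.path.isfile(path):
        # Fail fast: nothing else can run sensibly without the settings.
        print(f"Error: {path} not found", file=sys.stderr)
        sys.exit(1)
    config = configparser.ConfigParser()
    config.read(path)
    return config
```

Checking for the file explicitly matters because `ConfigParser.read` silently skips files it cannot find, which would otherwise surface later as a confusing missing-key error.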

Key Features

Offline Text-to-Speech

Uses pyttsx3 for TTS — no internet connection required to speak responses. On Windows, voice, rate (default 150 wpm), and volume are adjustable at runtime or via config.ini.
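As a rough sketch of what adjusting rate and volume looks like with pyttsx3, consider the following. The function signature and the `clamp_volume` helper are assumptions for illustration; the project's own speak() in actions.py may differ. pyttsx3 is imported inside the function so the snippet loads even where the library is not installed.

```python
def clamp_volume(volume: float) -> float:
    """Keep volume inside the 0.0 to 1.0 range that pyttsx3 expects."""
    return max(0.0, min(1.0, volume))


def speak(text: str, rate: int = 150, volume: float = 1.0) -> None:
    """Speak text aloud; rate is in words per minute (150 matches the stated default)."""
    import pyttsx3  # third-party: pip install pyttsx3

    engine = pyttsx3.init()  # selects the platform driver (SAPI5 on Windows, eSpeak on Linux)
    engine.setProperty("rate", rate)
    engine.setProperty("volume", clamp_volume(volume))
    engine.say(text)
    engine.runAndWait()  # blocks until the queued speech has finished
```

Calling `speak("Good morning", rate=170)` would read the phrase slightly faster than the default.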

Google Speech Recognition

Captures audio with PyAudio and transcribes it through the Google Speech Recognition API. Energy threshold (default 300) and pause threshold (0.5 s) are tunable.
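The two thresholds map onto attributes of the SpeechRecognition library's `Recognizer` object. The sketch below shows one way to apply them and capture a single phrase; the function names and the choice to return an empty string on failure are assumptions, not the project's behaviour.

```python
def configure_recognizer(recognizer, energy_threshold: int = 300, pause_threshold: float = 0.5):
    """Apply the documented defaults to a Recognizer-like object and return it."""
    recognizer.energy_threshold = energy_threshold  # minimum audio energy treated as speech
    recognizer.pause_threshold = pause_threshold    # seconds of silence that end a phrase
    return recognizer


def listen_once() -> str:
    """Record one phrase from the default microphone and return its transcription."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition PyAudio

    recognizer = configure_recognizer(sr.Recognizer())
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # speech was unintelligible
    except sr.RequestError:
        return ""  # the recognition service could not be reached
```

A higher energy threshold helps in noisy rooms, while a longer pause threshold suits speakers who pause mid-sentence.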

Configurable Search Engine

Pick Google, Bing, DuckDuckGo, or YouTube as your default search engine in config.ini (Windows only). Any other value is treated as a custom URL and checked to see whether it is live; if the check fails, Jarvis falls back to Google.
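The selection rule could be sketched as follows. This is an illustration under stated assumptions: `select_engine`, `search_url`, and the injected `is_live` check are hypothetical names (the project's own helper is search_engine_selector() in actions.py), and the URL prefixes are the engines' usual public query URLs rather than values taken from the source.

```python
from typing import Callable
from urllib.parse import quote_plus

# Assumed query-URL prefixes for the four named engines.
ENGINES = {
    "google": "https://www.google.com/search?q=",
    "bing": "https://www.bing.com/search?q=",
    "duckduckgo": "https://duckduckgo.com/?q=",
    "youtube": "https://www.youtube.com/results?search_query=",
}


def select_engine(value: str, is_live: Callable[[str], bool]) -> str:
    """Resolve a config value to a URL prefix, falling back to Google."""
    key = value.strip().lower()
    if key in ENGINES:
        return ENGINES[key]
    if is_live(value):
        return value  # custom URL that passed the liveness check
    return ENGINES["google"]


def search_url(engine_prefix: str, query: str) -> str:
    """Build the final search URL with the query safely encoded."""
    return engine_prefix + quote_plus(query)
```

Passing the liveness check in as a function keeps the selection logic testable without network access; a real check might issue an HTTP request and treat any error as "not live".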

Debug / Type Mode

Set debug = True in config.ini to type commands at the terminal instead of speaking — useful for testing without a microphone (Windows entry point only).
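The switch between typed and spoken input might look like the sketch below. The `get_command` function, its parameters, and the prompt text are assumptions for illustration; `listen` stands in for whatever function performs speech recognition.

```python
from typing import Callable


def get_command(
    debug: bool,
    listen: Callable[[], str],
    read: Callable[[str], str] = input,
) -> str:
    """Return the next command, typed when debug is on and spoken otherwise."""
    if debug:
        # Debug mode: read from the terminal, so no microphone is needed.
        return read("Command: ").strip().lower()
    return listen().strip().lower()
```

Normalising both paths to lowercase means the rest of the program can treat typed and spoken commands identically.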

Music Playback

Plays, pauses, unpauses, and stops MP3 files via pygame.mixer. On Windows, point musicpath in config.ini to your music folder.
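A minimal sketch of folder-based playback with pygame.mixer follows. The helper names are hypothetical and the folder argument stands in for the musicpath setting; pygame is imported inside the function so the file-listing part works without it.

```python
import os
from typing import List


def list_mp3s(folder: str) -> List[str]:
    """Return the full paths of all MP3 files in folder, sorted by path."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".mp3")
    )


def play(path: str) -> None:
    """Start playing one MP3 file; playback continues in the background."""
    import pygame  # third-party: pip install pygame

    pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()
    # pygame.mixer.music.pause(), .unpause(), and .stop() control the active track.
```

Because `play` returns immediately, the assistant can keep listening for a pause or stop command while the track plays.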

Cross-Platform

Runs on Windows (with SAPI5) and Ubuntu (with eSpeak). Separate entry points and requirement files keep platform-specific dependencies isolated.

System Requirements

| Requirement | Details |
| --- | --- |
| Python | 3.9 or later (3.9 recommended; all dependencies are pinned against it) |
| Microphone | Required for voice input; can be bypassed with debug = True |
| Internet | Required for Google Speech Recognition and Wikipedia lookups |
| OS | Windows 10+ or Ubuntu (tested with espeak installed) |
| Ubuntu extras | espeak system package (sudo apt-get install espeak) and portaudio19-dev for PyAudio |

Next Steps

Installation

Install Python dependencies on Windows or Ubuntu, or download the pre-built binary installer.

Quickstart

Clone the repo, drop in your config.ini, and give Jarvis your first voice command in under five minutes.
