Get Started with Desktop Assistant (Jarvis) in 5 Minutes

This guide walks you from a fresh clone to a running Jarvis session in about five minutes. You’ll install the dependencies, drop in your config.ini, launch the assistant, and speak your first command.

Prerequisites

Before you begin, make sure you have:

Python 3.9 installed and available on your PATH (python --version should report 3.9.x)
A microphone connected and set as the default input device in your OS sound settings
The repository cloned locally (the first step below covers this)
On Ubuntu only: espeak installed (sudo apt-get install espeak)

Step-by-Step

Clone the repository

git clone https://github.com/Harsha200105/DesktopAssistant.git
cd DesktopAssistant

Install dependencies

Choose the command that matches your operating system:

Windows

pip install -r "Requirements&COC/requirements.txt"

Ubuntu

sudo apt-get update
sudo apt-get install espeak
sudo apt install portaudio19-dev python3-pyaudio
pip install -r "Requirements&COC/ubuntu_requirements.txt"

Set up config.ini (Windows only)

The Windows entry point (Jarvis2_4windows.py) checks for a config.ini file in the directory you run it from and exits immediately if the file is missing. Copy the template from Requirements&COC/ into src/:

Windows (cmd)

copy "Requirements&COC\config.ini" src\config.ini

Windows (PowerShell)

Copy-Item "Requirements&COC\config.ini" -Destination "src\config.ini"

Open src/config.ini and fill in at least your name and, optionally, your preferred search engine:

[DEFAULT]
master = YourName
search_engine = Google
debug = False
musicpath =
voice = Male
rate = 150
volume = 100
energy_threshold = 300

[EMAIL]
server = smtp.gmail.com
port = 587
username =
password =

The Ubuntu entry point (Jarvis2.py) does not read config.ini. Settings like TTS voice, speech rate, and energy threshold are set directly in the source. Skip this step entirely if you are on Ubuntu.

config.ini key reference

Key	Default	Description
`master`	`YourName`	Your name — used in the greeting (“Good Morning, YourName”)
`search_engine`	`Google`	Default search engine: `Google`, `Bing`, `DuckDuckGo`, or `Youtube`
`debug`	`False`	Set to `True` to type commands instead of speaking (see tip below)
`musicpath`	(empty)	Absolute path to your music folder for the “play music” command
`voice`	`Male`	TTS voice: `Male` or `Female`
`rate`	`150`	Speech rate in words per minute
`volume`	`100`	Volume as a percentage (0–100)
`energy_threshold`	`300`	Microphone sensitivity; increase if Jarvis stops responding
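
To illustrate how these keys could be read and converted to the right types, here is a minimal sketch using Python's standard configparser module. It is not the code from Jarvis2_4windows.py: the function name parse_settings, the SAMPLE text, and the fallback values are assumptions made for this example.

```python
import configparser

# Sample text mirroring the [DEFAULT] section of the template shown earlier.
SAMPLE = """
[DEFAULT]
master = YourName
search_engine = Google
debug = False
musicpath =
voice = Male
rate = 150
volume = 100
energy_threshold = 300
"""


def parse_settings(text):
    """Parse config.ini text into a dict of typed settings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    defaults = parser["DEFAULT"]
    return {
        "master": defaults.get("master", fallback="Sir"),
        "search_engine": defaults.get("search_engine", fallback="Google"),
        # getboolean accepts True/False, yes/no, on/off and 1/0.
        "debug": defaults.getboolean("debug", fallback=False),
        "musicpath": defaults.get("musicpath", fallback=""),
        "voice": defaults.get("voice", fallback="Male"),
        "rate": defaults.getint("rate", fallback=150),
        "volume": defaults.getint("volume", fallback=100),
        "energy_threshold": defaults.getint("energy_threshold", fallback=300),
    }


settings = parse_settings(SAMPLE)
print(settings["master"], settings["rate"], settings["debug"])
```

To try it against a real file, pass the file's contents, for example parse_settings(open("config.ini").read()). An empty musicpath comes back as an empty string, so a caller can test it for truthiness before looking for music files.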

Run the assistant

Launch Jarvis from inside the src/ directory so that config.ini (and the actions.py / commands.py imports) resolve correctly:

Windows

cd src
python Jarvis2_4windows.py

Ubuntu

cd src
python Jarvis2.py

You will see Initializing Jarvis.... printed to the terminal, followed by a time-appropriate greeting (“Good Morning / Afternoon / Evening, YourName”) spoken aloud. The Tkinter GUI window opens at the same time — you’re live.
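
A time-appropriate greeting like the one described above can be sketched in a few lines. This is an illustration only: the function name greeting and the hour boundaries (noon and 6 PM) are assumptions, and the actual thresholds in the Jarvis source may differ.

```python
from datetime import datetime


def greeting(master, now=None):
    """Return a greeting for the given time (hypothetical helper).

    The cut-off hours below are assumed for illustration.
    """
    hour = (now or datetime.now()).hour
    if hour < 12:
        period = "Morning"
    elif hour < 18:
        period = "Afternoon"
    else:
        period = "Evening"
    return f"Good {period}, {master}"


print(greeting("YourName", datetime(2025, 6, 2, 10, 30)))
```

Passing the time as an optional argument keeps the function easy to check without waiting for a particular hour of the day.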

Give your first voice command

Click the Speak button in the GUI window (or, in debug mode, type at the terminal prompt) and say one of these to verify everything is working:

Hello

Jarvis responds: “Hello Sir”

Try a follow-up:

What's up

Jarvis replies with one of the following, chosen at random: “Just doing my thing!”, “I am fine!”, “Nice!”, or “I am nice and full of energy”.

To close the session, say:

Bye

Jarvis says “Bye Sir, have a good day.” and exits.
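
The random reply to “What's up” can be pictured with Python's standard random module. The sketch below is a guess at the general shape rather than the project's code; the names WHATS_UP_REPLIES and whats_up_reply are hypothetical.

```python
import random

# The four replies listed in this guide.
WHATS_UP_REPLIES = [
    "Just doing my thing!",
    "I am fine!",
    "Nice!",
    "I am nice and full of energy",
]


def whats_up_reply(rng=random):
    """Pick one reply at random; rng can be a seeded random.Random for tests."""
    return rng.choice(WHATS_UP_REPLIES)


print(whats_up_reply())
```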

Debug Mode: Type Instead of Speaking

If you don’t have a microphone available, or you want to test new commands quickly, enable debug mode in config.ini:

[DEFAULT]
debug = True

With debug = True, Jarvis2_4windows.py replaces the microphone listener with a simple input() call:

def take_command():
    return input("Command |--> ")

The terminal shows a Command |--> prompt for every command cycle. All speech output still plays through TTS — only the input path changes.

Debug mode is also useful for diagnosing unrecognised commands. Because the raw query string is returned verbatim, you can verify exactly what text the command-matching logic receives.
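
One way to picture the switch between typed and spoken input is a small factory that returns the appropriate take_command function. This is a sketch under assumptions, not the code in Jarvis2_4windows.py: make_take_command, the listen callback, and the read parameter are hypothetical names introduced here.

```python
def make_take_command(debug, listen, read=input):
    """Return the function used to obtain the next command.

    debug  -- the value of the debug key from config.ini
    listen -- a callable that records from the microphone and returns text
              (stands in for the speech-recognition path)
    read   -- the prompt function; defaults to the built-in input()
    """
    if debug:
        # Typed input: the raw string is returned verbatim.
        return lambda: read("Command |--> ")
    return listen


# Example with a stand-in listener instead of a real microphone.
take_command = make_take_command(debug=False, listen=lambda: "hello")
print(take_command())
```

Because only the input function changes, the rest of the command cycle (matching, execution, TTS output) stays the same in both modes.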

The GUI Window

When Jarvis starts it opens a 700×500 px Tkinter window titled “Desktop assistant”. The window is not resizable horizontally but can be resized vertically.

┌─────────────────────────────────────────────┐
│  Desktop assistant                          │
├─────────────────────────────────────────────┤
│  Assistant: Initializing Jarvis....         │
│  Assistant: Good Morning YourName           │
│  Assistant: Hello Sir                       │
│  Assistant: Next Command! Sir!              │
│                                        │▲│  │
│                                        │ │  │
│                                        │▼│  │
├─────────────────────────────────────────────┤
│ [Speak]                                     │
└─────────────────────────────────────────────┘

Element	Details
Chat log (`Listbox`)	Scrollable list; each entry is prefixed with `Assistant:` and added by the `gui.speak()` function whenever Jarvis says something
Scroll bar	Linked to the chat listbox via `yscrollcommand`; lets you scroll back through the conversation
Speak button	Triggers one command cycle — Jarvis listens, recognises, executes, and speaks the result. The button is anchored to the bottom-left (`SW`) of the window

Do not close the Tkinter window with the OS close button while a command is in progress. Doing so may leave the TTS engine (pyttsx3) in a locked state. Always end the session by saying “Bye” or “Stop” so Jarvis calls sys.exit() cleanly.
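
The layout described in this section can be approximated with a short Tkinter sketch. It is an illustration based on the description above, not the project's GUI module: the names format_entry and build_window, the on_speak callback, and the use of a Frame to group the listbox and scroll bar are assumptions.

```python
def format_entry(text):
    """Prefix a spoken line the way the chat log displays it."""
    return f"Assistant: {text}"


def build_window(on_speak):
    """Build a window resembling the one described above (hypothetical helper)."""
    # Imported here so format_entry can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("Desktop assistant")
    root.geometry("700x500")
    # Fixed width, adjustable height.
    root.resizable(False, True)

    frame = tk.Frame(root)
    scrollbar = tk.Scrollbar(frame)
    # The listbox reports its scroll position to the scroll bar...
    chat_log = tk.Listbox(frame, yscrollcommand=scrollbar.set)
    # ...and the scroll bar drives the listbox's vertical view.
    scrollbar.config(command=chat_log.yview)
    scrollbar.pack(side=tk.RIGHT, fill=tk.Y)
    chat_log.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
    frame.pack(fill=tk.BOTH, expand=True)

    # The Speak button sits at the bottom-left of the window.
    tk.Button(root, text="Speak", command=on_speak).pack(anchor=tk.SW)
    return root, chat_log
```

To try it, call root, chat_log = build_window(lambda: None), add a line with chat_log.insert("end", format_entry("Hello Sir")), and then start the event loop with root.mainloop().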

Voice Commands Reference

Some commands are only available in one entry point. The Ubuntu column applies to Jarvis2.py; the Windows column applies to Jarvis2_4windows.py.

What you say	What Jarvis does	Ubuntu	Windows
`Hello`	Greets you: “Hello Sir”	✓	✓
`What's up`	Replies with a random upbeat message	✓	✓
`How are you`	Replies with a random upbeat message	✓	—
`Open Google` / `Open YouTube`	Opens the site in your default browser	✓	✓
`Search for <query>`	Searches using your configured search engine	✓	✓
`Wikipedia <topic>`	Speaks a two-sentence Wikipedia summary	✓	✓
`Date`	Announces the current date (e.g., Monday, June 02, 2025)	✓	—
`Time`	Announces the current time (e.g., 10 30 AM)	✓	—
`Open email`	Prompts for a recipient and message, then sends via SMTP	✓	—
`Mail`	Prompts for a recipient and message, then sends via SMTP	—	✓
`Play music`	Plays a random MP3 from `musicpath`	✓	✓
`Pause music`	Pauses playback	✓	✓
`Unpause`	Resumes playback	✓	✓
`Stop music`	Stops playback	✓	✓
`Change rate to <wpm>`	Changes TTS speech rate on the fly	—	✓
`Change voice to male/female`	Switches TTS voice gender	—	✓
`Change volume to <0-100>`	Adjusts TTS volume	—	✓
`Bye` / `Stop` / `Abort` / `Nothing`	Ends the session	✓	✓
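
Several phrases in the table overlap: “Stop music” contains the exit word “Stop”, and “Unpause music” contains “Pause music”. The sketch below shows one way a keyword matcher could keep them apart by checking the more specific phrases first. It is a hypothetical illustration, not the matching logic in Jarvis2.py or Jarvis2_4windows.py; ROUTES, EXIT_WORDS, route, and the action names are all assumed.

```python
# Ordered from most to least specific; the first match wins.
ROUTES = [
    ("stop music", "stop_music"),
    ("unpause", "resume_music"),
    ("pause music", "pause_music"),
    ("play music", "play_music"),
    ("hello", "greet"),
]

EXIT_WORDS = ("bye", "stop", "abort", "nothing")


def route(query):
    """Map a recognised query to an action name, or None if nothing matches."""
    text = query.lower().strip()
    for phrase, action in ROUTES:
        if phrase in text:
            return action
    # Exit words are compared as whole words and only after the routes,
    # so "stop music" never ends the session.
    if any(word in text.split() for word in EXIT_WORDS):
        return "exit"
    return None


for spoken in ("Stop music", "Stop", "Unpause music", "Hello"):
    print(spoken, "->", route(spoken))
```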
