For the last 4 years I’ve been working toward creating the best JARVIS replica in the real world. While I have no experience in AI and this project uses no AI, I built a voice command system to control my PC: opening applications, answering simple questions, googling things, and sending emails. Previously I used Alexa on a Raspberry Pi, but this time I wanted more specific control over the program, so I wrote it myself in Python. This post will be less of a tutorial and more of an explanation, as the code is a bit too extensive to show in full in this article.

To start this project I needed to learn speech recognition. Instead of suffering for hours doing this, I found libraries that do it for me: an STT (Speech to Text) engine and a TTS (Text to Speech) engine. The first takes vocal input through the microphone and converts it into a string I can use in the code. The second takes a string and speaks it through the computer’s speakers.

Then I set up the engines to take input and speak output, and programmed in some little details like how the bot addresses the user depending on the time of day. After this I created two external scripts where I wrote a ton of functions.
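As a rough illustration of the time-of-day detail, here is a minimal sketch. The function name `greeting`, the hour boundaries, and the default form of address are all assumptions made for this example, not the original code.

```python
from datetime import datetime


def greeting(now=None, name="sir"):
    """Return a greeting that depends on the hour of the day.

    The cut-off hours below are an assumption; adjust them to taste.
    """
    now = now or datetime.now()
    hour = now.hour
    if 5 <= hour < 12:
        part = "Good morning"
    elif 12 <= hour < 18:
        part = "Good afternoon"
    else:
        part = "Good evening"
    return f"{part}, {name}."


if __name__ == "__main__":
    # In the real bot this string would be passed to the TTS engine.
    print(greeting())
```

Passing `now` in explicitly makes the function easy to check without waiting for the clock to change.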

Each of these functions was a task I could have the program perform for me. I had one script that handled tasks that required the internet, and one that handled tasks that run purely on my computer.

In the online script I used the Weather, News, Advice, and Dad Jokes APIs from the Morning Texting Bot project. I also included a Find My IP Address API, the TMDB API, a Google Search API, a Wikipedia Search API, as well as Email and WhatsApp messaging APIs.

For the offline script, I simply set up a bunch of tasks that open files. First I created shortcuts to every file I needed to open and put them in a folder I could easily access, then I used the command:

os.startfile("path")

to open the file at the given path.
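A sketch of how one of those offline tasks could look is below. The folder `C:/Shortcuts`, the `.lnk` extension, and the helper names `shortcut_path` and `open_shortcut` are hypothetical, chosen only for illustration. Note that `os.startfile` exists only on Windows, so the sketch checks the platform before calling it.

```python
import os
import sys
from pathlib import Path

# Hypothetical folder holding the shortcuts; not the author's real path.
SHORTCUT_DIR = Path("C:/Shortcuts")


def shortcut_path(name, folder=SHORTCUT_DIR):
    """Build the path to a shortcut file such as 'notepad.lnk'."""
    return Path(folder) / f"{name}.lnk"


def open_shortcut(name, folder=SHORTCUT_DIR):
    """Open a shortcut by name. Returns False if the file is missing."""
    path = shortcut_path(name, folder)
    if not path.exists():
        return False
    if sys.platform != "win32":
        # os.startfile is a Windows-only function.
        raise OSError("os.startfile is only available on Windows")
    os.startfile(str(path))
    return True
```

Each task function in the offline script could then be a one-liner such as `open_shortcut("notepad")`.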

Once I had all the tasks I wanted, I needed to set up keyword recognition so the computer would know which task to run based on what I said. All of the voice input the computer recorded while the script was running was stored as a string and then split into a list of words. That let me wait until the list contained a certain keyword, in this case “DAVIS”, and after hearing that word the script would await further instruction. If the next words contained a keyword, one of the functions from the other files would run. If not, Davis would return a simple output.
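The keyword matching described above can be sketched roughly as follows. This is a simplified version that handles a single transcript string rather than a live audio loop, and the command keywords, placeholder task functions, and fallback reply are assumptions invented for the example.

```python
WAKE_WORD = "davis"


# Placeholder tasks; the real ones live in the online and offline scripts.
def check_weather():
    return "Checking the weather."


def tell_joke():
    return "Here is a dad joke."


# Hypothetical mapping from spoken keyword to task function.
COMMANDS = {
    "weather": check_weather,
    "joke": tell_joke,
}


def handle_transcript(text, commands=COMMANDS):
    """Run the first matching command that follows the wake word.

    Returns None when the wake word was not heard at all.
    """
    # Lowercase, split into words, and drop surrounding punctuation.
    words = [w.strip(".,!?") for w in text.lower().split()]
    if WAKE_WORD not in words:
        return None
    following = words[words.index(WAKE_WORD) + 1:]
    for word in following:
        if word in commands:
            return commands[word]()
    return "Sorry, I did not catch a command."


if __name__ == "__main__":
    print(handle_transcript("Hey Davis, what's the weather like?"))
```

Using a dictionary keeps the list of keywords in one place, so adding a new task only means adding one entry.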

This program was really simple. However, it liked to disable my audio in other applications, which I found annoying, but I’m sure it’s just a bug I didn’t fix. While this project definitely isn’t Stark-level tech, it was cool to see what could be done with just some simple beginner Python scripts. Maybe Mark III will have smart home control.
