
ESP-Assist 1.0
Description
Hello! This is ESP Assistant, a DIY voice assistant powered by an ESP32. When reset, the ESP32 starts listening through an INMP441 microphone, captures your voice, and sends the audio via local HTTP to a PC.
The PC handles speech-to-text (STT) conversion, queries the OpenAI API, and then uses text-to-speech (TTS) to generate the audio response. The audio file is then sent back via HTTP to the ESP32, stored on SPIFFS, and played on a JBL speaker using a Bluetooth A2DP library.
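To make the PC side of this flow more concrete, here is a minimal sketch of an HTTP endpoint that accepts an uploaded clip and returns the spoken reply. It is not the project's actual code: the `/ask` route, the port, and the three stage functions (`speech_to_text`, `ask_model`, `text_to_speech`) are hypothetical placeholders that you would replace with a real STT engine, an OpenAI API call, and a TTS engine.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical placeholders: swap in a real STT engine, an OpenAI API
# client, and a TTS engine. They return dummy values so the flow runs.
def speech_to_text(audio: bytes) -> str:
    return "hello"

def ask_model(prompt: str) -> str:
    return f"You said: {prompt}"

def text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")  # stand-in for real audio bytes

def handle_clip(audio: bytes) -> bytes:
    """Run the three PC-side stages on one uploaded clip."""
    text = speech_to_text(audio)
    reply = ask_model(text)
    return text_to_speech(reply)

class AssistHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/ask":  # hypothetical route name
            self.send_error(404)
            return
        # Read exactly the number of bytes the ESP32 announced.
        length = int(self.headers.get("Content-Length", 0))
        audio = self.rfile.read(length)
        reply_audio = handle_clip(audio)
        self.send_response(200)
        self.send_header("Content-Type", "audio/wav")
        self.send_header("Content-Length", str(len(reply_audio)))
        self.end_headers()
        self.wfile.write(reply_audio)

def serve(host: str = "0.0.0.0", port: int = 8000) -> None:
    """Listen on the local network until interrupted."""
    HTTPServer((host, port), AssistHandler).serve_forever()
```

Calling `serve()` starts the listener; the ESP32 would then POST its recording to the PC's address and save the response body to SPIFFS. Keeping the three stages in separate functions makes it easy to swap one engine without touching the HTTP code.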
In this first version, only short audio clips can be played. In the future, support for an SD card will allow longer and higher-quality audio responses.
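To see why clips stay short, a rough estimate helps: uncompressed PCM audio needs sample rate × bytes per sample × channels bytes for every second. The sketch below assumes an uncompressed WAV file with a 44-byte header and, purely for illustration, 1,000,000 bytes of free SPIFFS space; the real figure depends on the partition layout you flash.

```python
def max_clip_seconds(free_bytes: int, sample_rate: int = 16_000,
                     bits_per_sample: int = 16, channels: int = 1,
                     header_bytes: int = 44) -> float:
    """Longest uncompressed WAV clip, in seconds, that fits in free_bytes."""
    bytes_per_second = sample_rate * (bits_per_sample // 8) * channels
    return max(free_bytes - header_bytes, 0) / bytes_per_second

# Assumed free space: 1,000,000 bytes (illustrative only).
print(round(max_clip_seconds(1_000_000), 1))                               # 31.2
print(round(max_clip_seconds(1_000_000, sample_rate=44_100, channels=2), 1))  # 5.7
```

Under these assumptions a 16 kHz mono clip can last about half a minute, while 44.1 kHz stereo, a format commonly used for Bluetooth audio, runs out after a few seconds.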
You can find the source code on GitHub, and more project details in my YouTube videos.
Technical Specifications
Feature | Details |
---|---|
Main Controller | ESP32 WROOM-32 |
Microphone | INMP441 (I2S digital mic) |
Audio Capture | ESP32 → HTTP upload → PC |
AI Processing | OpenAI API (ChatGPT) |
Text-to-Speech | PC generates audio, sends back to ESP32 |
Audio Storage | SPIFFS (SD card planned for longer audio) |
Audio Output | Bluetooth A2DP (tested with JBL speaker) |
Connectivity | WiFi (local HTTP + OpenAI API) |
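
As an illustration of the audio capture row, the sketch below shows one way the PC could wrap uploaded samples in a WAV container before passing them to a speech-to-text engine. It assumes the ESP32 uploads raw 16-bit mono PCM at 16 kHz; the format the firmware actually sends may differ, and `pcm_to_wav` is a hypothetical helper name.

```python
import io
import wave

def pcm_to_wav(pcm: bytes, sample_rate: int = 16_000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()

# One second of silence under the assumed format.
wav_bytes = pcm_to_wav(b"\x00\x00" * 16_000)
```

If the firmware already uploads a complete WAV file, this step is unnecessary.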
FAQ
- Does it work offline?
  - No, ESP Assistant requires WiFi and a PC for STT/TTS and OpenAI API calls.
- Can it play long audio responses?
  - Currently only short clips (due to SPIFFS limits). Future versions with SD card support will allow longer audio.
- Can I change the voice or responses?
  - Yes! Since it's open source, you can modify the code and integrate different APIs or TTS engines.
Copyright & Intellectual Property
The design, electronics, and content related to this project are open for personal use, modification, and redistribution. Commercial use, resale, or redistribution for profit is not permitted without prior consent.