Description
This project demonstrates how to convert speech to text using Python. It uses the SpeechRecognition library (imported as speech_recognition) together with pyaudio to listen to microphone input, recognizes the speech using Google's Web Speech API, and appends the recognized text to a file called journal.txt.
Requirements
Install the necessary Python libraries:
pip install SpeechRecognition
pip install pyaudio # You may also need to install PortAudio system dependencies
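To confirm that both libraries installed correctly before running the script, a small check like the one below may help. It is only a sketch: the helper name missing_dependencies is hypothetical and not part of this project. It relies on the fact that the SpeechRecognition package is imported as speech_recognition and PyAudio as pyaudio.

```python
import importlib.util


def missing_dependencies(modules=("speech_recognition", "pyaudio")):
    """Return the names of the given top-level modules that cannot be found.

    Hypothetical helper for illustration; it only checks that each module
    is installed, not that a working microphone is available.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

Calling missing_dependencies() returns an empty list when everything is installed; any name it returns still needs a pip install.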
What It Does:
- Listens through your microphone.
- Converts spoken voice into text.
- Saves the text into a file (journal.txt) for journaling or logging purposes.
Code
# vtt stands for voice to text
# This file is part of the Brain project.
import speech_recognition as sr


def save_text_to_file(text, filename="journal.txt"):
    """Append text to the journal file. Return True on success, False on error."""
    try:
        with open(filename, "a", encoding="utf-8") as file:
            file.write(text)
            file.write("\n")  # Add a newline for better readability
        print(f"Text saved to {filename}")
        return True
    except Exception as e:
        print(f"Error saving text to file: {e}")
        return False


def convert_voice_to_text():
    """Record one phrase from the microphone and return the recognized text, or None on failure."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Speak something...")
        recognizer.adjust_for_ambient_noise(source)  # optional
        # Optional arguments to listen():
        #   timeout=5            to avoid waiting indefinitely
        #   phrase_time_limit=5  to limit the length of the phrase
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio)
        print("You said:", text)
        if save_text_to_file(text):
            print("Text saved successfully.")
        else:
            print("Failed to save text.")
        return text
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        # Raised when the Google API cannot be reached or rejects the request
        print(f"Could not request results; {e}")
    return None


if __name__ == "__main__":
    # Run one listen/recognize/save cycle when the script is executed directly
    convert_voice_to_text()
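The optional timeout and phrase_time_limit arguments mentioned in the comments above can be passed to recognizer.listen(). When timeout is set and no speech starts in time, listen() raises sr.WaitTimeoutError, so the call needs its own handler. The following is a minimal sketch, assuming a wrapper named listen_with_limits (a hypothetical helper, not part of the original script):

```python
def listen_with_limits(recognizer, source, timeout=5, phrase_time_limit=5):
    """Capture one phrase, or return None if no speech starts within `timeout` seconds.

    Hypothetical helper for illustration. `recognizer` is expected to be an
    sr.Recognizer and `source` an open sr.Microphone.
    """
    # Imported here so the helper can be defined before the library is installed.
    import speech_recognition as sr

    try:
        # timeout: maximum seconds to wait for speech to begin.
        # phrase_time_limit: maximum seconds to record once speech has begun.
        return recognizer.listen(
            source, timeout=timeout, phrase_time_limit=phrase_time_limit
        )
    except sr.WaitTimeoutError:
        print("No speech detected before the timeout.")
        return None
```

Inside convert_voice_to_text, the line audio = recognizer.listen(source) could then be swapped for audio = listen_with_limits(recognizer, source), returning early when the result is None.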
How to Use:
- Run the script in your Python environment.
- Speak clearly into your microphone when prompted.
- The recognized text will be printed and saved in journal.txt.
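Because each recognized phrase is appended as its own line, the journal can be read back with ordinary file handling. The sketch below assumes a helper named read_journal, which is hypothetical and not part of the original script; it returns the most recent entries and skips blank lines.

```python
from pathlib import Path


def read_journal(filename="journal.txt", last=5):
    """Return up to `last` of the most recent non-empty journal lines, oldest first.

    Hypothetical helper for illustration; returns an empty list if the
    journal file does not exist yet.
    """
    path = Path(filename)
    if not path.exists() or last <= 0:
        return []
    # errors="replace" keeps the read from failing on unexpected byte sequences
    text = path.read_text(encoding="utf-8", errors="replace")
    entries = [line.strip() for line in text.splitlines() if line.strip()]
    return entries[-last:]
```

For example, read_journal(last=3) returns a list with the three most recently saved phrases.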
Notes
- This script uses the Google Web Speech API, which requires an internet connection.
- pyaudio may need additional system-level dependencies (such as PortAudio) depending on your OS.
Learning Outcome:
- Basics of audio input in Python
- Using third-party APIs for speech recognition
- File handling for persistent storage