speech_recognition_robot

Overview

This project implements a speech-controlled UR10 robot using OpenAI's Whisper model for speech recognition. The robot listens for spoken commands and performs the corresponding vehicle-assembly tasks.


Features

  • Speech Recognition: Utilizes OpenAI's Whisper model for accurate speech-to-text conversion.
  • Robot Control: Controls the UR10 robot using the URBasic library.
  • Flask Server: A Flask-based server processes audio inputs and sends commands to the robot.
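The command-handling step — turning a Whisper transcription into a robot action — can be sketched as a simple keyword lookup. This is an illustrative sketch only: the function name and command vocabulary below are assumptions, not the repository's actual command set.

```python
# Hypothetical sketch: map a Whisper transcription to a UR10 action name.
# The keywords and action names are illustrative, not the project's real set.
from typing import Optional

COMMANDS = {
    "pick": "pick_part",
    "place": "place_part",
    "assemble": "assemble_vehicle",
    "stop": "halt",
}

def parse_command(transcription: str) -> Optional[str]:
    """Return the first robot action whose keyword appears in the text."""
    text = transcription.lower()
    for keyword, action in COMMANDS.items():
        if keyword in text:
            return action
    return None  # unrecognized speech -> no robot motion
```

For example, `parse_command("Please pick up the wheel")` would return `"pick_part"`, while unrecognized speech returns `None` so the robot stays idle.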

Setup Instructions

1. Install Dependencies

Ensure you have Python 3.8+ installed, then install the required libraries:

pip install -r requirements.txt

2. Run the Flask Server

Start the server to process speech and control the UR10 robot:

python scripts/server.py
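The overall shape of such a server can be sketched as below. The route name, payload format, and port are assumptions rather than the repository's actual API, and the Whisper call is stubbed out so the request/response flow is visible without downloading a model:

```python
# Minimal sketch of a Flask server that accepts audio and returns a
# transcription. Route name and payload format are assumptions; the real
# scripts/server.py may differ.
from flask import Flask, jsonify, request

app = Flask(__name__)

def transcribe(audio_bytes: bytes) -> str:
    # In the real script this step would run Whisper, e.g.:
    #   model = whisper.load_model("base")
    #   result = model.transcribe(audio_path)
    return "pick up the wheel"  # placeholder transcription for the sketch

@app.route("/command", methods=["POST"])
def command():
    audio = request.get_data()   # raw audio bytes from the client
    text = transcribe(audio)     # speech -> text
    # Here the transcription would be mapped to a UR10 motion via URBasic.
    return jsonify({"transcription": text})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```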

3. Record and Send Audio

Run the send_audio.py script to record and send speech commands:

python scripts/send_audio.py
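The client side — record a short clip and POST it to the server — can be sketched as follows. The endpoint URL, sample rate, and recording duration are assumptions; `sounddevice` is imported only inside `record()` so the rest of the sketch works without audio hardware:

```python
# Sketch of a send_audio-style client: capture microphone audio, wrap it in a
# WAV container, and POST it to the server. URL and parameters are assumptions.
import io
import urllib.request
import wave

SERVER_URL = "http://localhost:5000/command"  # assumed endpoint
SAMPLE_RATE = 16000                           # Whisper expects 16 kHz mono

def to_wav_bytes(samples: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(samples)
    return buf.getvalue()

def record(seconds: float = 3.0) -> bytes:
    """Capture microphone audio as raw 16-bit PCM (requires sounddevice)."""
    import sounddevice as sd  # imported here: needs audio hardware
    frames = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                    channels=1, dtype="int16")
    sd.wait()
    return frames.tobytes()

def send(wav_bytes: bytes) -> bytes:
    """POST the WAV payload to the server and return its response body."""
    req = urllib.request.Request(SERVER_URL, data=wav_bytes,
                                 headers={"Content-Type": "audio/wav"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

if __name__ == "__main__":
    print(send(to_wav_bytes(record())))
```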

Dependencies

  • torch
  • transformers
  • flask
  • numpy
  • sounddevice
  • URBasic
  • whisper

Future Improvements

  • Improve recognition accuracy by fine-tuning the model on domain-specific command audio.
  • Support additional languages.

About

This project integrates OpenAI's Whisper model with a UR10 robotic arm to enable speech-controlled automation. Users give voice commands, which are transcribed using Whisper and processed by a Flask server to control the UR10 robot. The robot then performs specific assembly tasks based on the detected command.
