RJGY/AITranscribe

AITranscribe

A lightweight, fast audio transcription service that automatically converts audio files to text, generates summaries, and saves them as markdown files. Built with Python, Whisper AI, and Gemini AI.

Features

  • 🎯 Automatic audio file detection and processing
  • 🔄 Automatic file format conversion to MP3 using FFmpeg
  • 📝 High-quality transcription using OpenAI's Whisper
  • 📚 AI-powered summaries using Google's Gemini
  • 📋 Clean markdown output format
  • 🔄 Background processing service
  • 🚀 FastAPI endpoint for direct uploads

Setup

  1. Install dependencies:
pip install fastapi uvicorn openai-whisper google-generativeai python-dotenv python-multipart
  2. Install FFmpeg (required for audio conversion)

  3. Create a .env file with:

GENERATIVEAI_API_KEY=your_gemini_api_key
SCAN_DIR=path/to/input/folder
TRANSCRIPT_DIR=path/to/output/folder
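As an illustration of how these three settings might be read and validated at startup, here is a minimal sketch. The `Settings` type and `load_settings` function are hypothetical names, not part of the project. The sketch reads from any mapping so it works without extra packages; in the service itself, `load_dotenv()` from python-dotenv would typically be called first so the `.env` values land in `os.environ`.

```python
import os
from pathlib import Path
from typing import Mapping, NamedTuple

# The three keys documented for the .env file.
REQUIRED_KEYS = ("GENERATIVEAI_API_KEY", "SCAN_DIR", "TRANSCRIPT_DIR")


class Settings(NamedTuple):
    """Hypothetical container for the values read from the environment."""
    api_key: str
    scan_dir: Path
    transcript_dir: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required settings and fail early if any is missing or empty."""
    # With python-dotenv installed, call dotenv.load_dotenv() before this
    # function so that values from the .env file are present in os.environ.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return Settings(
        api_key=env["GENERATIVEAI_API_KEY"],
        scan_dir=Path(env["SCAN_DIR"]),
        transcript_dir=Path(env["TRANSCRIPT_DIR"]),
    )
```

Failing at startup with the list of missing keys tends to be easier to diagnose than an error that surfaces later, when the first file is processed.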

Usage

Option 1: Automatic File Processing

  1. Start the server:
uvicorn main:app --host 0.0.0.0 --port 8000
  2. Drop audio files into your SCAN_DIR folder
  3. Files will be automatically processed and transcripts will appear in TRANSCRIPT_DIR
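How the background service detects new files is not described here. One simple approach is to poll SCAN_DIR and pick up audio files that have no matching transcript yet. The sketch below assumes that approach; the `find_pending` name, the extension list, and the `<stem>.md` naming rule are assumptions rather than the project's documented behavior.

```python
from pathlib import Path

# Assumed set of extensions treated as audio; adjust to match your inputs.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def find_pending(scan_dir: Path, transcript_dir: Path) -> list:
    """Return audio files in scan_dir that have no <stem>.md transcript yet."""
    pending = []
    for path in sorted(scan_dir.iterdir()):
        # Skip subdirectories and anything that is not a known audio type.
        if not path.is_file() or path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        # Assumed naming rule: meeting.wav -> meeting.md in transcript_dir.
        if not (transcript_dir / (path.stem + ".md")).exists():
            pending.append(path)
    return pending
```

A background task could call `find_pending` every few seconds and hand each result to the transcription step. Deriving the pending list from the transcripts already on disk means a restart does not reprocess finished files.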

Option 2: API Endpoint

Send a POST request to /transcribe/ with the audio file in a multipart form field named file:

curl -X POST -F "file=@your_audio.mp3" http://localhost:8000/transcribe/
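For callers that prefer Python over curl, the same upload can be sketched with the standard library alone. The `build_multipart` and `upload` helpers are hypothetical names, and the `audio/mpeg` content type is an assumption. The response format is not documented here, so the sketch returns the raw response body as text.

```python
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "audio/mpeg"):
    """Encode one file as multipart/form-data; returns (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload(path: str, url: str = "http://localhost:8000/transcribe/") -> str:
    """POST the file at `path` to the endpoint and return the response text."""
    with open(path, "rb") as fh:
        body, header = build_multipart("file", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `print(upload("your_audio.mp3"))` would send the same request as the curl command above.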

Output Format

Transcripts are saved as markdown files with:

  • Timestamp of generation
  • AI-generated summary
  • Full transcription text
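To make the three parts above concrete, here is a sketch of how such a markdown file might be assembled. The heading names, the timestamp format, and the `render_transcript` function are assumptions for illustration; the files the service actually writes may be laid out differently.

```python
from datetime import datetime


def render_transcript(title: str, summary: str, transcription: str,
                      generated_at: datetime) -> str:
    """Build a markdown document with a timestamp, a summary, and the full text."""
    return (
        f"# {title}\n\n"
        f"Generated: {generated_at.strftime('%Y-%m-%d %H:%M:%S')}\n\n"
        "## Summary\n\n"
        f"{summary.strip()}\n\n"
        "## Transcription\n\n"
        f"{transcription.strip()}\n"
    )
```

The returned string could then be written to TRANSCRIPT_DIR, for example with `Path(transcript_dir, "meeting.md").write_text(doc, encoding="utf-8")`.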

Requirements

  • Python 3.8+
  • FFmpeg
  • Google Gemini API key
