An intelligent surveillance solution built on LLaVA-7B for both pre-recorded video and live webcam feeds. The system analyzes visual input, detects abnormal or violent behavior, sends real-time alerts via Telegram, and stores all flagged data in MongoDB Atlas.
- 🧠 LLaVA-7B-based Vision-to-Text Analysis
- 🎥 Pre-recorded video frame analysis
- 📡 Real-time surveillance via phone/webcam (IP stream)
- 📬 Telegram Bot Alerts (triggered only on detected violence/anomaly)
- 🗃️ MongoDB Atlas for storing frame data, timestamps, and captions
Ensure you have Python 3.8+ and run:
pip install -r requirements.txt

Download and set up the LLaVA-7B model using Ollama.
Once setup is complete, make sure the model is accessible via:
ollama run llava

Run the Jupyter notebook for analyzing a recorded video:
jupyter notebook llava_prevideo.ipynb

- Extracts one frame every 30 frames
- Sends each extracted frame to the model for caption generation
- Stores results in MongoDB Atlas
- Triggers Telegram alert only if a frame contains violence or anomaly
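
The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`sample_every_nth`, `read_frames`, `build_caption_request`, `caption_frame`) and the prompt text are assumptions, and it assumes Ollama is serving the `llava` model on its default local port.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def sample_every_nth(frames, step=30):
    """Yield (index, frame) for every `step`-th frame, starting at frame 0."""
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


def read_frames(video_path):
    """Yield decoded frames from a video file (requires opencv-python)."""
    import cv2  # imported lazily so the other helpers work without OpenCV

    capture = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file or read error
                break
            yield frame
    finally:
        capture.release()


def build_caption_request(jpeg_bytes, prompt, model="llava"):
    """Build the JSON body for Ollama's /api/generate with one attached image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(jpeg_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def caption_frame(jpeg_bytes, prompt="Describe what is happening in this image."):
    """Send one JPEG-encoded frame to the local Ollama server and return its caption."""
    body = json.dumps(build_caption_request(jpeg_bytes, prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

To connect the pieces, each sampled frame would be JPEG-encoded first (for example with `cv2.imencode(".jpg", frame)` followed by `.tobytes()` on the returned buffer) and then passed to `caption_frame`.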
- Processed frame
- Output provided by the model
- Screenshot of triggered Telegram alert
Run the live surveillance script using:
python llava_multi.py

- Captures frames from a live webcam (a phone camera can be used via an IP stream)
- Sends them to the model for real-time analysis
- If violence/anomaly is detected:
  - Triggers a Telegram alert that includes the frame description, timestamp, and IP-based location
  - Saves flagged frames and captions to MongoDB Atlas
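
The alerting step could look something like the sketch below. How the project actually decides that a caption indicates violence is not described here, so the keyword check (`ALERT_KEYWORDS`, `is_flagged`) is purely a hypothetical stand-in. The function names and message layout are likewise assumptions. The location is passed in as a plain string because the IP-based lookup is not shown. The only external call is the Telegram Bot API's `sendMessage` method.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Hypothetical keyword list; the project's real detection rule may differ.
ALERT_KEYWORDS = ("fight", "weapon", "knife", "violence", "attack", "assault")


def is_flagged(caption, keywords=ALERT_KEYWORDS):
    """Return True if the caption mentions any alert keyword (case-insensitive)."""
    lowered = caption.lower()
    return any(word in lowered for word in keywords)


def format_alert(caption, location, when=None):
    """Compose the alert text: timestamp, location, and frame description."""
    when = when or datetime.now(timezone.utc)
    return (
        "ALERT: possible violence or anomaly detected\n"
        f"Time: {when.isoformat(timespec='seconds')}\n"
        f"Location: {location}\n"
        f"Description: {caption}"
    )


def send_telegram_alert(bot_token, chat_id, text):
    """POST the message to the Telegram Bot API and return the parsed reply."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=30) as response:
        return json.loads(response.read())
```

A plain substring match is crude (it ignores negation and context), so asking the model directly for a yes/no judgment may work better in practice. The point of the sketch is only that alerts are sent when `is_flagged` returns `True` and skipped otherwise.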
- Live flagged frame
- Description of frame
This project demonstrates a flexible AI-driven surveillance pipeline:
- Processes both offline and live video
- Alerts only when anomalies occur
- Stores event data for traceability and further investigation
- Highly modular: extendable with gesture recognition, facial ID, object detection
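
To make the traceability point concrete, a stored event might be shaped roughly like this. The field names, database name, and collection name are all assumptions for illustration and are not taken from the project's code. `save_event` requires `pymongo` and a valid MongoDB Atlas connection string.

```python
from datetime import datetime, timezone


def build_event_document(frame_index, caption, jpeg_bytes, source, when=None):
    """Assemble one flagged-frame record (hypothetical schema)."""
    return {
        "source": source,            # e.g. a video filename or a stream label
        "frame_index": frame_index,
        "timestamp": when or datetime.now(timezone.utc),
        "caption": caption,
        "frame_jpeg": jpeg_bytes,    # stored by pymongo as BSON binary data
    }


def save_event(mongo_uri, document,
               db_name="surveillance", collection_name="flagged_frames"):
    """Insert the record and return its generated _id (requires pymongo)."""
    from pymongo import MongoClient  # imported lazily; not needed to build documents

    client = MongoClient(mongo_uri)
    try:
        result = client[db_name][collection_name].insert_one(document)
        return result.inserted_id
    finally:
        client.close()
```

Storing the JPEG bytes inline keeps each event self-contained, but MongoDB caps a single document at 16 MB, so large frames may be better stored elsewhere with only a reference in the record.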
# Install dependencies
pip install -r requirements.txt
# For Pre-recorded Analysis
jupyter notebook llava_prevideo.ipynb
# For Live Webcam Surveillance
python llava_multi.py



