sam84723/seismic_magnitude_predictor

A comprehensive machine learning pipeline for predicting earthquake magnitudes from early-stage 3-component seismic waveforms using 1D CNNs with data augmentation and weighted loss strategies.

🌟 Key Features

  • End-to-end pipeline from raw seismic data to magnitude prediction
  • Multi-component analysis using Z, N, E seismic components
  • Physics-informed data augmentation for rare high-magnitude events
  • Weighted loss function to handle severe class imbalance
  • Comprehensive evaluation across different sampling rates and configurations

📊 Performance Highlights

| Configuration | Sampling Rate | Augmentation | MAE |  | Accuracy (±0.2 mag) |
|---|---|---|---|---|---|
| Best Model | 100 Hz | ✅ Yes | 0.276 | 0.166 | 41.6% |
| Baseline | 50 Hz | ❌ No | 0.313 | -0.007 | 32.7% |

🚀 Quick Start

Prerequisites

Core Dependencies

pip install torch==2.2.2 torchvision==0.17.2
pip install numpy==1.26.4 pandas==2.2.3 scikit-learn==1.7.1
pip install obspy==1.4.2 matplotlib==3.10.0
pip install scipy==1.16.0 tqdm==4.67.1

Full Environment

pip install -r requirements.txt

Usage

from seismic_magnitude_prediction import run_experiment

# Run best configuration
results = run_experiment(
    seconds=5,
    sample_rate=100,
    data_file="seismic_data_5_seconds_sampling_rate_100.pkl",
    target_length=500,
    batch_size=32,
    num_epochs=100,
    learning_rate=0.001,
    is_augmented=True
)
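
The call above uses `target_length=500`, which equals `seconds * sample_rate` (5 × 100). A small helper can keep those values consistent when comparing sampling rates. This is a sketch only: `make_config` is a hypothetical name, and the 50 Hz file name is an assumption extrapolated from the 100 Hz file name shown above.

```python
def make_config(sample_rate, is_augmented, seconds=5):
    """Build keyword arguments for run_experiment (hypothetical helper)."""
    return {
        "seconds": seconds,
        "sample_rate": sample_rate,
        # Assumed naming pattern, inferred from the single file name in the README.
        "data_file": f"seismic_data_{seconds}_seconds_sampling_rate_{sample_rate}.pkl",
        # Window length in samples: duration times sampling rate.
        "target_length": seconds * sample_rate,
        "batch_size": 32,
        "num_epochs": 100,
        "learning_rate": 0.001,
        "is_augmented": is_augmented,
    }


# One configuration per (sampling rate, augmentation) combination.
configs = [make_config(rate, aug) for rate in (50, 100) for aug in (False, True)]
```

Each dictionary could then be passed as `run_experiment(**config)`. The return value of `run_experiment` is not documented here, so the sketch does not inspect it.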

📁 Project Structure

├── seismic_magnitude_prediction.py  # Main ML pipeline
├── data_preprocessing.py            # Data extraction & validation
├── requirements.txt
├── README.md
└── data/
   ├── Turkey_data/
   │   ├── Catalog/                 # Text catalog files
   │   └── DATA/                    # MiniSEED waveform files
   └── processed/
       └── *.pkl                    # Processed datasets

🛠️ Methodology

Data Pipeline

  • Catalog Parsing: Extract event metadata from Turkish seismic catalogs
  • Waveform Extraction: Extract 5-second snippets around P-wave arrivals
  • Quality Control: 7-level validation system for data integrity
  • Standardization: Resample to 100 Hz and normalize signals
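
The waveform extraction and standardization steps above can be sketched with NumPy. This is a minimal illustration, not the repository's code. The names `extract_window` and `normalize`, the zero-padding of short records, and peak-amplitude normalization are all assumptions, since this README does not say how signals are normalized. Resampling is left out, and the window is taken to start at the P-wave sample.

```python
import numpy as np


def extract_window(waveform, p_index, sample_rate, seconds=5):
    """Cut `seconds` of data starting at the P-wave sample index.

    waveform: array of shape (3, n_samples) holding the Z, N, E components.
    Returns shape (3, seconds * sample_rate), zero-padded at the end
    if the record is too short (assumed behaviour).
    """
    target_length = int(seconds * sample_rate)
    window = waveform[:, p_index:p_index + target_length]
    missing = target_length - window.shape[1]
    if missing > 0:
        window = np.pad(window, ((0, 0), (0, missing)))
    return window


def normalize(window, eps=1e-8):
    """Remove each component's mean, then divide by the peak absolute
    amplitude across all three components (assumed normalization)."""
    centered = window - window.mean(axis=1, keepdims=True)
    peak = np.abs(centered).max()
    return centered / (peak + eps)
```

Using a single peak value for all three components keeps their relative amplitudes intact; normalizing each component separately would discard that information.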

Model Architecture

  • Input: 3-channel waveforms (Z, N, E components) × 500 time samples
  • Architecture: 4-layer 1D CNN with batch normalization and global pooling
  • Output: Single regression value (earthquake magnitude)
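
A network matching this description could look like the PyTorch sketch below. Only the input shape, the four convolutional layers, batch normalization, global pooling and the single regression output come from this README. The class name `MagnitudeCNN`, the channel widths, kernel size, ReLU activations, max pooling between layers and the choice of average pooling are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class MagnitudeCNN(nn.Module):
    """Illustrative 4-layer 1D CNN for magnitude regression (hypothetical)."""

    def __init__(self, in_channels=3, widths=(32, 64, 128, 256), kernel_size=7):
        super().__init__()
        layers = []
        prev = in_channels
        for width in widths:  # one conv block per entry: four blocks by default
            layers += [
                nn.Conv1d(prev, width, kernel_size, padding=kernel_size // 2),
                nn.BatchNorm1d(width),
                nn.ReLU(),
                nn.MaxPool1d(2),  # halves the time axis
            ]
            prev = width
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool1d(1)  # global pooling over time
        self.head = nn.Linear(prev, 1)       # single regression output

    def forward(self, x):
        x = self.features(x)             # (batch, widths[-1], reduced time)
        x = self.pool(x).squeeze(-1)     # (batch, widths[-1])
        return self.head(x).squeeze(-1)  # (batch,)
```

Because of the global pooling layer, the same model accepts the 250-sample windows produced at 50 Hz as well as the 500-sample windows at 100 Hz.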

Class Imbalance Solutions

  • Data Augmentation: Physics-informed noise, time shift, amplitude scaling
  • Weighted Loss: Higher penalties for rare high-magnitude events (10x weight for mag 6.5+)
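
Both strategies can be sketched in a few lines of NumPy. The three augmentations and the 10x weight for magnitude 6.5+ come from this README. Everything else is an assumption for illustration: the function names, the noise level, shift and scaling ranges, the weights for the intermediate bins, and the use of a weighted mean squared error. The label is left unchanged under amplitude scaling, which is only reasonable for small scale factors.

```python
import numpy as np


def augment(window, rng, noise_std=0.01, max_shift=10, scale_range=(0.9, 1.1)):
    """Return a perturbed copy of a (3, n_samples) window."""
    out = window * rng.uniform(*scale_range)              # amplitude scaling
    shift = int(rng.integers(-max_shift, max_shift + 1))  # time shift in samples
    out = np.roll(out, shift, axis=1)
    if shift > 0:
        out[:, :shift] = 0.0   # zero the samples that wrapped around
    elif shift < 0:
        out[:, shift:] = 0.0
    return out + rng.normal(0.0, noise_std, size=out.shape)  # additive noise


def sample_weights(magnitudes):
    """Per-sample loss weights by magnitude bin."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    weights = np.ones_like(magnitudes)
    weights[magnitudes >= 4.5] = 2.0   # assumed value
    weights[magnitudes >= 5.5] = 5.0   # assumed value
    weights[magnitudes >= 6.5] = 10.0  # 10x weight stated in this README
    return weights


def weighted_mse(pred, target):
    """Mean squared error with each sample weighted by its true magnitude bin."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    w = sample_weights(target)
    return float(np.sum(w * (pred - target) ** 2) / np.sum(w))
```

The same arithmetic carries over to PyTorch tensors for use inside a training loop.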

📈 Dataset Information

Source: Turkish Seismic Network
Total Samples: 17,773 (50 Hz) / 12,671 (100 Hz)
Magnitude Range: 3.5 - 7.5
Components: 3-channel seismic recordings (Z/N/E)
Duration: 5 seconds post P-wave arrival

Class Distribution

Magnitude 3.5-4.5: ~90% of samples (dominant)
Magnitude 4.5-5.5: ~8% of samples  
Magnitude 5.5-6.5: ~1% of samples
Magnitude 6.5-7.5: <0.1% of samples (critical but rare)

🔬 Experimental Results

Key Findings

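100 Hz + Augmentation is the best configuration tested: MAE 0.276 and 41.6% accuracy within ±0.2 magnitude units, versus MAE 0.313 and 32.7% for the 50 Hz baseline without augmentation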
Data augmentation is crucial for high-resolution data (prevents overfitting)
Class imbalance remains the primary challenge for high-magnitude prediction

Performance by Magnitude Range

Mag 3.5-4.5: MAE = 0.224 ✅ (Excellent)
Mag 4.5-5.5: MAE = 0.766 ⚠️ (Acceptable)
Mag 5.5+: MAE > 1.3 ❌ (Poor - insufficient data)
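
The metrics reported in this README can be computed as in the sketch below. It assumes that "Accuracy (±0.2 mag)" means the fraction of predictions within 0.2 magnitude units of the catalog value, and that bins are half-open on the right. The function name `evaluate` and the report layout are hypothetical.

```python
import numpy as np


def evaluate(pred, target, tolerance=0.2,
             bins=((3.5, 4.5), (4.5, 5.5), (5.5, float("inf")))):
    """Overall MAE, tolerance accuracy, and MAE per magnitude bin."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    err = np.abs(pred - target)
    report = {
        "mae": float(err.mean()),
        # Fraction of predictions within `tolerance` magnitude units.
        "accuracy": float((err <= tolerance).mean()),
        "per_bin_mae": {},
    }
    for lo, hi in bins:
        mask = (target >= lo) & (target < hi)  # bin by true magnitude
        if mask.any():
            report["per_bin_mae"][f"{lo}-{hi}"] = float(err[mask].mean())
    return report
```

Binning by the true magnitude rather than the predicted one keeps the rare large events in their own bin even when the model underestimates them.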

📋 Requirements

System Requirements

Python 3.11+
CUDA 12.1+ (optional, for GPU acceleration)
8GB+ RAM (for large seismic datasets)

Tested Environment
This project was developed and tested with:

Python 3.11.13
PyTorch 2.2.2 with CUDA 12.1 support
ObsPy 1.4.2 for seismic data processing
NumPy 1.26.4 (ObsPy compatibility requirement)

Core Dependencies

# Core ML & Data Processing
torch==2.2.2
torchvision==0.17.2
numpy==1.26.4
pandas==2.2.3
scikit-learn==1.7.1

# Seismic Data Processing
obspy==1.4.2

# Visualization
matplotlib==3.10.0

# Data Utilities
scipy==1.16.0
tqdm==4.67.1

# Development Environment (optional)
jupyter==1.1.1
ipython==9.4.0

📄 License

MIT License - see LICENSE file for details.

📚 Citation

@misc{seismic2025,
  title={Seismic Magnitude Prediction using Deep Learning},
  author={Sam84723},
  year={2025},
  url={https://github.com/sam84723/seismic_magnitude_predictor}
}

🙏 Acknowledgments

Dr. Itzhak Lior, Hebrew University of Jerusalem (HUJI)
Turkish Seismic Network for providing the dataset
ObsPy community for seismic data processing tools
PyTorch team for the deep learning framework

⚠️ Note: This model is for research purposes. For real earthquake early warning systems, additional validation and safety measures are required.
