Sign Language Segmentation

Pose segmentation model for sign language that segments pose sequences into signs and sentences, using a CNN + Transformer with RoPE.

Usage

# Install
pip install git+https://github.com/sign-language-processing/segmentation

# Acquire a MediaPipe Holistic pose file and the corresponding video
wget -O example.pose https://datasets.sigma-sign-language.com/poses/holistic/dgs_corpus/1413451-11105600-11163240_a.pose
wget -O example.mp4 https://www.sign-lang.uni-hamburg.de/meinedgs/videos/1413451-11105600-11163240/1413451-11105600-11163240_1a1.mp4

# Run the model (video is linked into the ELAN file so it plays in ELAN)
pose_to_segments --pose example.pose --elan output.eaf --video example.mp4

The model reads a .pose file and writes an ELAN (.eaf) annotation file with SIGN and SENTENCE tiers.

from pose_format import Pose
from sign_language_segmentation.bin import segment_pose

with open("example.pose", "rb") as f:
    pose = Pose.read(f)

eaf, tiers = segment_pose(pose)
# tiers["SIGN"] and tiers["SENTENCE"] are lists of {"start": int, "end": int} frame dicts
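
Because the tiers hold frame indices, turning them into timestamps needs the frame rate. The following is a minimal sketch, assuming the frame rate is available as `pose.body.fps` (as in pose_format) and that `end` is a frame index on the same scale as `start`. The helper name `frames_to_ms` is hypothetical and not part of this package.

```python
def frames_to_ms(segments, fps):
    """Convert {"start", "end"} frame dicts to millisecond dicts."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    ms_per_frame = 1000.0 / fps
    return [
        {
            "start": round(seg["start"] * ms_per_frame),
            "end": round(seg["end"] * ms_per_frame),
        }
        for seg in segments
    ]


# Made-up values for illustration; in practice pass tiers["SIGN"] and pose.body.fps
example = [{"start": 0, "end": 25}, {"start": 50, "end": 100}]
print(frames_to_ms(example, fps=50))
```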

Server

# Build and run the inference server
docker build -t segmentation-serve .
docker run -p 8080:8080 -e PORT=8080 segmentation-serve

# Segment a pose file (input/output are file paths or gs:// URIs)
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"input": "/path/to/input.pose", "output": "/path/to/output.eaf"}'

# Health check
curl http://localhost:8080/health
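
The same request can be issued from Python with the standard library only. This sketch mirrors the curl call above (same URL, header, and JSON keys). The helper name `build_segment_request` is hypothetical, and the shape of the server's response is not documented here, so the send step is left commented out.

```python
import json
import urllib.request


def build_segment_request(input_path, output_path, url="http://localhost:8080"):
    """Build the POST request shown in the curl example."""
    payload = json.dumps({"input": input_path, "output": output_path}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_segment_request("/path/to/input.pose", "/path/to/output.eaf")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=60) as response:
#     print(response.status, response.read().decode("utf-8"))
```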

Training

Prerequisites

Training requires the DGS Corpus and the matching MediaPipe Holistic poses (internal datasets).

Docker (recommended)

# Build the training image
docker build -f Dockerfile.train -t segmentation-train .

# Train
docker run --rm --gpus all \
  -v /path/to/dgs-corpus:/data/dgs-corpus:ro \
  -v /path/to/mediapipe-poses:/data/poses:ro \
  -v $(pwd)/models:/app/models \
  segmentation-train \
  python -m sign_language_segmentation.train \
    --corpus /data/dgs-corpus \
    --poses /data/poses \
    --hidden_dim 384 --encoder_depth 4 --attn_nhead 8 \
    --batch_size 8 --num_frames 1024 \
    --dice_loss_weight 1.5 \
    --epochs 500 --patience 100

# Evaluate on dev split
docker run --rm --gpus all \
  -v /path/to/dgs-corpus:/data/dgs-corpus:ro \
  -v /path/to/mediapipe-poses:/data/poses:ro \
  -v $(pwd)/models:/app/models \
  segmentation-train \
  python -m sign_language_segmentation.evaluate \
    --checkpoint /app/models/<run_name>/best.ckpt \
    --corpus /data/dgs-corpus \
    --poses /data/poses \
    --split dev
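
The `--dice_loss_weight` flag in the training command suggests a Dice term in the training loss. How this repository combines it with other loss terms is not stated here. The sketch below is a generic soft Dice loss for one binary class over per-frame probabilities, written in plain Python for illustration; it is not the repository's implementation, and the name `soft_dice_loss` is hypothetical.

```python
def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: 1 - 2 * overlap / (sum of probs + sum of targets)."""
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    overlap = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    # eps keeps the ratio defined when both sequences are all zeros
    return 1.0 - (2.0 * overlap + eps) / (total + eps)


# A perfect prediction gives a loss near 0, a fully wrong one a loss near 1
print(soft_dice_loss([1.0, 0.0, 1.0], [1, 0, 1]))
print(soft_dice_loss([1.0, 0.0], [0, 1]))
```

Under this reading, a weight such as 1.5 would scale the Dice term relative to whatever other terms the loss contains.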

For the best hyperparameters and architecture details, see dist/2026/README.md.

Local (development)

conda create --name segmentation python=3.12 -y
conda activate segmentation
pip install ".[dev]"
python -m sign_language_segmentation.train --corpus /path/to/dgs-corpus --poses /path/to/poses

Architecture

CNN-medium-attn + RoPE (2026):

Stage 1: Two-stage UNet CNN that first compresses spatially over joints, then adds temporal context
Stage 2: N-layer pre-norm Transformer with Rotary Position Embedding (RoPE)
Output: two heads, one for sign (gloss) BIO tags and one for phrase (sentence) BIO tags
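
To illustrate how per-frame BIO tags relate to segments, here is a minimal decoding sketch. It assumes hard `"B"`/`"I"`/`"O"` labels per frame and an inclusive `end` index. The model's actual decoding may work on probabilities with thresholds, and `bio_to_segments` is a hypothetical name.

```python
def bio_to_segments(tags):
    """Turn per-frame "B"/"I"/"O" tags into {"start", "end"} frame dicts (end inclusive)."""
    segments = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # "B" always opens a new segment; a stray "I" is treated as an opening too
            if start is not None:
                segments.append({"start": start, "end": i - 1})
            start = i
        elif tag == "O" and start is not None:
            segments.append({"start": start, "end": i - 1})
            start = None
    if start is not None:
        # Close a segment that runs to the last frame
        segments.append({"start": start, "end": len(tags) - 1})
    return segments


print(bio_to_segments(list("OBIIOBIBII")))
```

The "B" tag is what lets two adjacent signs be separated even when no "O" frame lies between them.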

See dist/2026/README.md for what worked, what didn't, and key bug fixes.
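
For readers unfamiliar with RoPE, the sketch below shows the general idea in plain Python: each consecutive pair of embedding dimensions is rotated by an angle proportional to the frame position, so the dot product between a query and a key depends only on their relative offset. It uses the interleaved-pair convention and base 10000 from the original RoPE formulation. Both are assumptions here, and the layout used in this repository may differ.

```python
import math


def apply_rope(vector, position, base=10000.0):
    """Rotate consecutive pairs of `vector` by position-dependent angles."""
    dim = len(vector)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    rotated = []
    for i in range(0, dim, 2):
        angle = position * base ** (-i / dim)  # earlier pairs rotate faster
        cos, sin = math.cos(angle), math.sin(angle)
        x, y = vector[i], vector[i + 1]
        rotated.extend([x * cos - y * sin, x * sin + y * cos])
    return rotated


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.0, 0.5, -0.5, 1.5]
# Same relative offset (3 frames) at two different absolute positions
early = dot(apply_rope(q, 5), apply_rope(k, 2))
late = dot(apply_rope(q, 105), apply_rope(k, 102))
print(abs(early - late) < 1e-9)
```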

2023 Version (v2023)

The v2023 version contains the exact code for the paper "Linguistically Motivated Sign Language Segmentation".

Citation

@inproceedings{moryossef-etal-2023-linguistically,
    title = "Linguistically Motivated Sign Language Segmentation",
    author = {Moryossef, Amit  and Jiang, Zifan  and M{\"u}ller, Mathias  and Ebling, Sarah  and Goldberg, Yoav},
    editor = "Bouamor, Houda  and Pino, Juan  and Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.846",
    doi = "10.18653/v1/2023.findings-emnlp.846",
    pages = "12703--12724",
}
