An R package for extracting, analysing, and visualising BirdNET audio embeddings. The heavy lifting (TFLite inference on audio clips) is done in Python via reticulate; everything else — distance calculations, dimensionality reduction, and plotting — is pure R and integrates naturally with ggplot2.
| Capability | Functions |
|---|---|
| Extract embeddings from a clip library | embed_clips() |
| Load / save results without re-running Python | save_embeddings(), load_embeddings_rds(), load_embeddings_npz(), load_centroids_csv() |
| Identify potentially mislabelled clips | identify_misclassifications(), misclassification_summary() |
| Dimensionality reduction | reduce_embeddings() — UMAP, PCA, t-SNE |
| ggplot2 visualisation | plot_embedding_space(), plot_misclassifications(), plot_species_distance_matrix() |
| Built-in sample dataset | sample_embeddings(), sample_clips_dir() |
```r
# Install from source (once on CRAN/r-universe, use install.packages())
remotes::install_local("/path/to/birdnetEmbed")
```

Optional packages for UMAP and t-SNE:

```r
install.packages(c("umap", "Rtsne"))
```

embed_clips() calls the birdnet_analyzer Python package. All other functions are pure R and work without Python.
Step 1 — point reticulate at your BirdNET virtual environment:

```r
reticulate::use_python("/path/to/BirdNET-Analyzer/.venv/bin/python",
                       required = TRUE)
```

Add this line to your project's .Rprofile (or a setup.R script) so it runs automatically.
Step 2 — verify the environment:

```r
library(birdnetEmbed)
check_python_env()
#> ✓ Python environment OK
```

Tip: all analysis and visualisation functions work on pre-computed embeddings — you only need Python the first time you extract embeddings from a clip library.
```r
library(birdnetEmbed)

# Optional — only needed for embed_clips()
reticulate::use_python("path/to/.venv/bin/python", required = TRUE)

# ── 1. Extract embeddings ─────────────────────────────────────────────────
# Input directory layout:
#   clips/
#     Corvus brachyrhynchos_American Crow/    ← "Genus species_Common Name"
#       clip001.wav
#       clip002.wav
#     Turdus migratorius_American Robin/
#       clip001.wav
x <- embed_clips(
  model_path = "path/to/custom_classifier.tflite",
  input_dir  = "path/to/clips/"
)
print(x)
#> <birdnet_embeddings> 12 species | 480 clips | 1024-dim embeddings
#> model : custom_classifier.tflite
#> input : path/to/clips/

# Save so you never have to re-run Python for this dataset
save_embeddings(x, "my_library.rds")

# ── 2. Reload later (no Python needed) ────────────────────────────────────
x <- load_embeddings_rds("my_library.rds")

# ── 3. Find potentially mislabelled clips ─────────────────────────────────
suspects <- identify_misclassifications(x)
head(suspects[suspects$likely_misplaced, c("file", "short_name",
                                           "nearest_short_name", "ratio")])
#>          file    short_name nearest_short_name ratio
#>  clip_023.wav American Crow       Common Raven  1.12
misclassification_summary(suspects)

# ── 4. Dimensionality reduction ───────────────────────────────────────────
reduced_pca  <- reduce_embeddings(x, method = "pca")
reduced_umap <- reduce_embeddings(x, method = "umap")  # requires umap pkg

# reduced_* is a plain data.frame — use directly in ggplot2
library(ggplot2)
ggplot(reduced_pca, aes(dim1, dim2, colour = short_name)) +
  geom_point(alpha = 0.6) +
  theme_bw()

# ── 5. Package plot helpers ───────────────────────────────────────────────
# Embedding space scatter (runs reduce_embeddings() internally)
plot_embedding_space(x, method = "pca", show_centroids = TRUE)

# Misclassification bar chart
plot_misclassifications(suspects, top_n = 20)

# Pairwise species distance heatmap
plot_species_distance_matrix(x)
```

```
birdnetEmbed/
├── R/
│   ├── extract_embeddings.R     embed_clips(), embeddings_to_df(), centroids_to_df()
│   ├── load_embeddings.R        load/save RDS · load NPZ · load centroids CSV
│   ├── misclassification.R      identify_misclassifications(), misclassification_summary()
│   ├── reduce.R                 reduce_embeddings() → UMAP / PCA / t-SNE backends
│   ├── visualize.R              plot_embedding_space(), plot_misclassifications(),
│   │                            plot_species_distance_matrix()
│   ├── sample_data.R            sample_embeddings(), sample_clips_dir()
│   ├── python_env.R             check_python_env(), module cache
│   └── utils.R                  short_name(), .cosine_dist_matrix(), helpers
├── inst/
│   ├── python/
│   │   └── birdnet_embed.py     Python module loaded by reticulate
│   └── extdata/
│       ├── sample_clips/        12 WAV files across 4 species (~3.9 MB)
│       └── sample_embeddings.rds   pre-computed embeddings (~60 KB)
└── tests/testthat/              104 tests (pure R, no Python required)
```
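The ratio column reported by identify_misclassifications() can be read as a nearest-centroid test: how far a clip sits from its own species centroid relative to the nearest other centroid. A minimal sketch of that idea — the function name, arguments, and exact criterion here are illustrative assumptions, not the package's actual implementation:

```r
# Hypothetical sketch of a nearest-centroid mislabel check; the real
# identify_misclassifications() criterion may differ in its details.
# ratio = dist(clip, own centroid) / dist(clip, nearest other centroid);
# a ratio above 1 means the clip sits closer to another species.
nearest_centroid_ratio <- function(clip, centroids, own_species) {
  cosd <- function(v, M) {
    v <- v / sqrt(sum(v^2))          # unit-normalise the clip
    M <- M / sqrt(rowSums(M^2))      # unit-normalise each centroid row
    as.vector(1 - M %*% v)           # 1 - cosine similarity
  }
  d <- setNames(cosd(clip, centroids), rownames(centroids))
  d[[own_species]] / min(d[names(d) != own_species])
}

# A clip labelled "crow" that actually lies near the raven centroid
set.seed(1)
cents <- rbind(crow  = rnorm(8),
               raven = rnorm(8))
clip  <- cents["raven", ] + rnorm(8, sd = 0.1)
nearest_centroid_ratio(clip, cents, "crow")   # > 1 → flagged as suspect
```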
| Decision | Rationale |
|---|---|
| Python only for inference | birdnet_analyzer and TensorFlow are heavy dependencies. Isolating them behind reticulate means all downstream work (analysis, plotting) is pure R. |
| birdnet_embeddings S3 list | Stores embedding matrices per species plus centroids and metadata. Keeps the data compact (no redundant long-format copies) while remaining easy to inspect. |
| Cosine distance in R | Implemented as a normalised matrix multiply — fast on 1 024-dim vectors and matches the Python scipy.spatial.distance.cdist(metric="cosine") reference implementation. |
| reduce_embeddings() returns a plain data.frame | Reduced coordinates land directly in a data.frame with dim1/dim2 columns, so they drop straight into ggplot() without any conversion step. |
| Pre-built sample RDS | Lets users and CI/CD test analysis and visualisation without a Python environment or a TFLite model. |
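The "normalised matrix multiply" mentioned above can be sketched in a few lines of base R. This is an illustrative version of the idea — the package's internal .cosine_dist_matrix() is assumed to work along these lines but may differ in detail:

```r
# Cosine distance as a normalised matrix multiply (illustrative sketch).
# Rows of A and B are embedding vectors; the result is the pairwise
# distance matrix 1 - cos(a_i, b_j).
cosine_dist <- function(A, B) {
  A <- A / sqrt(rowSums(A^2))   # unit-normalise each row
  B <- B / sqrt(rowSums(B^2))
  1 - tcrossprod(A, B)          # tcrossprod gives cosine similarity
}

set.seed(42)
emb       <- matrix(rnorm(5 * 1024), nrow = 5)   # 5 mock 1024-dim embeddings
centroids <- matrix(rnorm(2 * 1024), nrow = 2)   # 2 mock species centroids
d <- cosine_dist(emb, centroids)
dim(d)   # 5 x 2: distance from every clip to every centroid
```

Because tcrossprod() is a single BLAS call, this stays fast even for thousands of 1 024-dim vectors.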
A small built-in dataset is included for exploration and testing:

```r
# Load pre-computed embeddings — no Python needed
x <- sample_embeddings()
print(x)
#> <birdnet_embeddings> 4 species | 12 clips | 1024-dim embeddings

# Or point embed_clips() at the bundled WAV files
sample_clips_dir()   # path to inst/extdata/sample_clips/
list.files(sample_clips_dir())
#> [1] "Aviceda subcristata_Pacific Baza"
#> [2] "Cyclorana platycephala_Eastern Water-holding Frog"
#> [3] "Falco subniger_Black Falcon"
#> [4] "Limnodynastes interioris_Giant Banjo Frog"
```

Species included (4 Australian species, 3 clips each):
| Species | Common name | Clips |
|---|---|---|
| Aviceda subcristata | Pacific Baza | 3 |
| Cyclorana platycephala | Eastern Water-holding Frog | 3 |
| Falco subniger | Black Falcon | 3 |
| Limnodynastes interioris | Giant Banjo Frog | 3 |
Embeddings were computed with the littlered0-4.tflite custom classifier
trained on the reallybig Australian species library.
Benchmarked on Apple M-series (macOS) using littlered0-4.tflite and
the reallybig library (3-second WAV clips at 48 kHz).
| Phase | Clips | Time | Throughput |
|---|---|---|---|
| First call (TFLite model load + inference) | 150 | ~13 s | ~12 clips/s |
| Subsequent calls (model already loaded) | 150 | ~4 s | ~38 clips/s |
Expected times for larger datasets (steady-state throughput, model loaded):
| Dataset size | Estimated time |
|---|---|
| 500 clips | ~13 s |
| 5 000 clips | ~2 min |
| 20 000 clips | ~9 min |
| 71 751 clips (reallybig full library, 386 species) | ~31 min |
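The estimates above are plain extrapolations from the steady-state rate of ~38 clips/s; they can be reproduced with a one-liner:

```r
# Back-of-envelope extrapolation from the steady-state throughput
rate  <- 38                                # clips per second, model loaded
clips <- c(500, 5000, 20000, 71751)
data.frame(clips   = clips,
           seconds = round(clips / rate),
           minutes = round(clips / rate / 60, 1))
```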
The ~9-second startup cost (TFLite model load) is incurred once per R session. Save the result with save_embeddings() so you never need to re-run extraction for the same library.
```r
reticulate::use_python("path/to/.venv/bin/python", required = TRUE)
library(birdnetEmbed)

t <- system.time(
  x <- embed_clips(
    model_path = "path/to/littlered0-4.tflite",
    input_dir  = "path/to/clips/"
  )
)
n_clips <- x$metadata$total_clips
cat(sprintf("%.0f clips in %.1f s (%.1f clips/s)\n",
            n_clips, t["elapsed"], n_clips / t["elapsed"]))
```

Run the full test suite (no Python or model required):

```r
devtools::test()
#> [ FAIL 0 | WARN 0 | SKIP 0 | PASS 104 ]
```

Tests use either the pre-computed sample_embeddings() object or randomly generated mock embeddings — no Python environment is needed to run them.
- BirdNET-Analyzer — the Python package providing the TFLite models and audio processing pipeline
- embedding_analysis/ — standalone Python scripts this package is based on
MIT