Skip to content

Conversation

AdonaiVera
Copy link
Contributor

@AdonaiVera AdonaiVera commented Oct 2, 2025

What changes are proposed in this pull request?

This PR introduces a series of improvements across the plugins, docs, and zoo sections to improve consistency, usability, and user experience:

  1. Updated the grey button to a cleaner orange style, and added a function to dynamically update its input variables.
  2. Removed the count from the plugin list.
  3. Added a link to the GitHub repository for all plugins.
  4. Standardized the Model Zoo:
    • Improved the search bar to make it more dynamic and reusable for both datasets and models.
    • Moved the index to an overview page.
    • In the future, we should simplified tag alignment (e.g., task-based tags like segmentation, object detection).
  5. Standardized the Dataset Zoo:
    • Moved each dataset into its own folder, keeping the existing text and repeating the function for now.
    • In the future, we could introduce an automated generation function leveraging list_zoo_datasets to dynamically build the pages. For this first version, the manual approach was chosen to preserve extra metadata provided in each dataset page.
    • Example script: Google Drive link

How is this patch tested? If it is not, please explain why.

  • Verified UI updates (button style, plugin list changes) in the local development environment.
  • Validated search bar functionality for both models and datasets.
  • Checked that all dataset and model zoo pages render correctly with the new folder structure and overview organization.
  • You can test it here: https://docs.dev.voxel51.com/

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.
  • No. You can skip the rest of this section.

This release improves the overall usability of the plugin ecosystem, documentation, and zoo pages by:

  • Introducing a redesigned button style.
  • Adding a GitHub link for all the community plugins.
  • Updating the Getting Started guide with user feedback.
  • Integrating a new recipe on retrieving items by ID.
  • Standardizing both the Model Zoo and Dataset Zoo for improved discoverability.

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other (Plugins, Model Zoo, Dataset Zoo)

Summary by CodeRabbit

Release Notes

  • New Features

    • Model Zoo redesigned with card-based interface for streamlined browsing
    • Comprehensive Dataset Zoo documentation with searchable dataset catalog
    • Enhanced search functionality supporting models, datasets, and plugins across the platform
    • Animated call-to-action buttons with dynamic hover effects
  • Documentation

    • Expanded dataset documentation with 40+ new dataset guides
    • Updated interactive UI layouts for Model Zoo and Dataset Zoo sections
    • Improved documentation navigation and cross-references
  • Style

    • Refined button animations and visual transitions

@AdonaiVera AdonaiVera requested a review from a team as a code owner October 2, 2025 13:10
Copy link
Contributor

coderabbitai bot commented Oct 2, 2025

Walkthrough

This pull request refactors the documentation ecosystem to support card-based layouts for models and datasets, introduces a new CustomAnimatedCTADirective for dynamic call-to-action buttons, updates model/dataset zoo generation scripts, migrates search/filter functionality across multiple inputs, and adds extensive dataset zoo documentation pages.

Changes

Cohort / File(s) Summary
Configuration & Directives
docs/conf.py, docs/source/custom_directives.py
Registers new CustomAnimatedCTADirective, updates toctree filtering, replaces plugins-search.js with unified search.js, adds CustomAnimatedCTADirective import and directive binding.
Model Zoo Generation
docs/scripts/make_model_zoo_docs.py
Restructures model zoo generation from section-based to card-based layout; introduces _render_card_model_content(), replaces multi-section templates with per-model card rendering, changes output to models/model_slug.rst structure with model_cards.rst index.
Plugin Documentation Generation
docs/scripts/generate_plugin_docs.py
Adds _generate_github_badge() helper method to generate GitHub repository badges for plugin ecosystem README documents.
.gitignore & Directory Structure
docs/.gitignore
Changes ignored path from source/model_zoo/models.rst (file) to source/model_zoo/models/ (directory).
Model Zoo Index & Structure
docs/source/model_zoo/index.rst, docs/source/model_zoo/overview.rst, docs/source/model_zoo/remote.rst
Updates model zoo index to card-based layout with search UI; creates new overview.rst documenting zoo concepts, built-in/remote models, and API reference; updates cross-reference from model-zoo-models to model-zoo.
Dataset Zoo Index & Structure
docs/source/dataset_zoo/index.rst, docs/source/dataset_zoo/overview.rst, docs/source/dataset_zoo/remote.rst
Restructures dataset zoo index from text-heavy to card-based layout with search UI; creates new overview.rst; updates cross-reference targets from dataset-zoo-datasets to dataset-zoo.
Dataset Documentation Pages
docs/source/dataset_zoo/datasets/*.rst
Adds 40+ new dataset documentation pages (ActivityNet, CIFAR, COCO, Kinetics, ImageNet, KITTI, MNIST, VOC, BDD100K, Caltech, FIW, HMDB51, LFW, Open Images, Places, UCF101, Quickstart variants, Sama-COCO, etc.) and dataset_cards.rst.
Dynamic CTA Styling
docs/source/assets/styles/components/_buttons.sass
Updates .sd-btn.sd-btn-primary.book-a-demo hover transforms to use CSS custom properties --cta-arrow-shift and --cta-text-shift with fallbacks; removes fixed .plugins-cta hover block.
Search & Filter Styling
docs/source/assets/styles/components/_plugins-search.sass
Extends search input selector from input#plugin-search to cover input#plugin-search, input#model-search, input#dataset-search.
Layout Styling
docs/source/assets/styles/layout/_main.sass
Adds styling for .book-a-demo links in tutorials callout with margin-top: auto, align-self: flex-start, and width: fit-content.
CSS Font Declarations
docs/source/_static/css/custom.css
Adds duplicate @font-face declarations for Geist and Geist Mono fonts.
Dynamic CTA JavaScript
docs/source/_static/js/custom.js
Adds initDynamicCTAs() function to compute and apply CSS variable shifts for CTA arrow/text elements based on measured dimensions; invoked on DOMContentLoaded and window resize.
Search JavaScript
docs/source/_static/js/search.js
Updates All button to show static "All" text instead of count; expands search input binding to support multiple IDs (plugin-search, model-search, dataset-search).
Tutorial Filters JavaScript
docs/source/_static/js/tutorial-filters.js
Changes pagination from "6" to "10"; updates updateList() to derive search term from multiple input sources (plugin-search, model-search, dataset-search); updates All button text to static label.
Documentation Cross-References
docs/source/index.rst, docs/source/user_guide/basics.rst, docs/source/user_guide/using_datasets.rst, docs/source/plugins/index.rst, docs/source/integrations/pytorch_hub.rst, docs/source/integrations/coco.rst
Replaces .. custombutton:: directives with .. customanimatedcta::; updates cross-reference anchors from model-zoo-models/dataset-zoo-datasets to model-zoo/dataset-zoo; minor wording updates (e.g., "ints of strings" → "ints or strings" in COCO docs).
Release Notes
docs/source/release-notes.rst
Updates multiple model zoo references from model-zoo-models to model-zoo across version entries.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Browser
    participant initDynamicCTAs as JS: initDynamicCTAs()
    participant DOM
    participant CSS

    User->>Browser: Page Load
    Browser->>initDynamicCTAs: DOMContentLoaded event
    activate initDynamicCTAs
    initDynamicCTAs->>DOM: Query [data-cta-dynamic="true"]
    initDynamicCTAs->>DOM: Measure .sd-btn dimensions
    initDynamicCTAs->>DOM: Measure .arrow & .text dimensions
    initDynamicCTAs->>initDynamicCTAs: Calculate --cta-arrow-shift<br/>Calculate --cta-text-shift
    initDynamicCTAs->>CSS: Set CSS Variables on element
    deactivate initDynamicCTAs
    
    Note over CSS: Hover transforms use<br/>var(--cta-arrow-shift, fallback)<br/>var(--cta-text-shift, fallback)
    
    User->>Browser: Resize window
    Browser->>initDynamicCTAs: resize event
    initDynamicCTAs->>DOM: Recalculate & update CSS vars
Loading
sequenceDiagram
    participant Gen as make_model_zoo_docs.py
    participant FS as Filesystem
    participant RST as RST Output

    Gen->>Gen: Load all models
    alt No models found
        Gen->>Gen: Log warning & exit
    else Models available
        Gen->>FS: Create docs/source/model_zoo/models/
        Gen->>Gen: Render model cards template
        Gen->>RST: Write model_cards.rst
        
        loop For each model
            Gen->>Gen: Get model_name & slug
            Gen->>Gen: Fetch description & tags
            Gen->>Gen: Render per-model page
            Gen->>RST: Write models/model_slug.rst
        end
        
        Gen->>Gen: Log success & counts
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The changes span multiple heterogeneous systems: significant refactoring of model/dataset zoo generation logic with new template structures and output flows; introduction of new directives and CSS variable-based dynamic styling; widespread cross-reference updates across 40+ documentation files; JavaScript enhancements for multi-input search support and dynamic CTA calculations; and extensive addition of new dataset documentation pages. While many dataset page additions are repetitive, the core logic changes in generation scripts, styling system integration, and documentation architecture require careful review.

Poem

🐰 Behold the zoo reimagined in cards,
With CTAs dancing their dynamic regards!
Search spans three realms where ecosystems dwell,
While datasets bloom in a documentation shell.
The rabbit hops through modular splendor anew! 🎪

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "Standardize Zoos, Update Plugins, and Improve Docs" is directly related to the main changes in this changeset. The title accurately captures the three primary areas of improvement: standardization of the Model Zoo and Dataset Zoo, updates to the plugins ecosystem (button redesign, GitHub links, count removal), and general documentation enhancements. While the title covers multiple coordinated improvements rather than a single narrow change, it is specific and clear enough to convey the primary objectives to a developer reviewing the commit history.
Description Check ✅ Passed The pull request description comprehensively follows the required template structure. It includes all major sections: "What changes are proposed" with detailed enumerated improvements, "How is this patch tested" with specific validation steps, and a complete "Release Notes" section with the user-facing change checkbox marked as "Yes" and a descriptive summary of the changes. The description also identifies affected areas, marking "Documentation" as the impacted component. The content is specific, well-organized, and provides sufficient context for reviewers to understand the scope and rationale of the changes.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch docs/unified_model_dataset_plugin

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d7ef36 and 612440a.

📒 Files selected for processing (39)
  • docs/source/conf.py (5 hunks)
  • docs/source/dataset_zoo/datasets/activitynet_100.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/activitynet_200.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/bdd100k.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/caltech101.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/caltech256.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/cifar10.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/cifar100.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/cityscapes.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/coco_2014.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/coco_2017.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/fashion_mnist.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/fiw.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/hmdb51.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/imagenet_2012.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/imagenet_sample.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_400.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_600.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_700.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_700_2020.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kitti.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kitti_multiview.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/lfw.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/mnist.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/open_images_v6.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/open_images_v7.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/places.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_3d.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_geo.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_groups.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_video.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/sama_coco.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/ucf101.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/voc_2007.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/voc_2012.rst (1 hunks)
  • docs/source/dataset_zoo/overview.rst (1 hunks)
  • docs/source/model_zoo/index.rst (1 hunks)
  • docs/source/model_zoo/overview.rst (1 hunks)
✅ Files skipped from review due to trivial changes (10)
  • docs/source/dataset_zoo/datasets/voc_2012.rst
  • docs/source/dataset_zoo/datasets/activitynet_100.rst
  • docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
  • docs/source/dataset_zoo/datasets/quickstart.rst
  • docs/source/dataset_zoo/datasets/quickstart_3d.rst
  • docs/source/dataset_zoo/datasets/places.rst
  • docs/source/dataset_zoo/datasets/caltech256.rst
  • docs/source/dataset_zoo/datasets/bdd100k.rst
  • docs/source/dataset_zoo/datasets/voc_2007.rst
  • docs/source/dataset_zoo/datasets/mnist.rst
🚧 Files skipped from review as they are similar to previous changes (22)
  • docs/source/dataset_zoo/datasets/activitynet_200.rst
  • docs/source/dataset_zoo/datasets/kinetics_700.rst
  • docs/source/dataset_zoo/datasets/quickstart_groups.rst
  • docs/source/dataset_zoo/datasets/cityscapes.rst
  • docs/source/dataset_zoo/datasets/cifar10.rst
  • docs/source/dataset_zoo/datasets/coco_2014.rst
  • docs/source/dataset_zoo/datasets/imagenet_2012.rst
  • docs/source/model_zoo/overview.rst
  • docs/source/dataset_zoo/datasets/kitti_multiview.rst
  • docs/source/dataset_zoo/datasets/fiw.rst
  • docs/source/dataset_zoo/datasets/open_images_v6.rst
  • docs/source/dataset_zoo/datasets/cifar100.rst
  • docs/source/dataset_zoo/datasets/sama_coco.rst
  • docs/source/conf.py
  • docs/source/dataset_zoo/datasets/imagenet_sample.rst
  • docs/source/dataset_zoo/datasets/caltech101.rst
  • docs/source/dataset_zoo/overview.rst
  • docs/source/dataset_zoo/datasets/quickstart_geo.rst
  • docs/source/dataset_zoo/datasets/hmdb51.rst
  • docs/source/dataset_zoo/datasets/open_images_v7.rst
  • docs/source/dataset_zoo/datasets/ucf101.rst
  • docs/source/dataset_zoo/datasets/lfw.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (15)
docs/source/dataset_zoo/datasets/fashion_mnist.rst (3)

26-30: The cross-reference anchor exists and is properly defined—no action needed.

The verification confirms that the dataset-zoo-ml-backend anchor is defined at docs/source/dataset_zoo/api.rst:599 and is correctly referenced in the fashion_mnist.rst file. The documentation link is valid.


56-58: Image file exists and path is correct.

The dataset preview image has been properly added to the repository at docs/source/images/dataset_zoo/fashion-mnist-test.png, which corresponds correctly to the documentation reference of /images/dataset_zoo/fashion-mnist-test.png.


23-24: No issues found—documentation references are valid.

The verification confirms that both FashionMNISTDataset classes exist at the specified module paths and are properly defined and exported:

  • fiftyone.zoo.datasets.tf.FashionMNISTDataset ✓ (line 101 in tf.py)
  • fiftyone.zoo.datasets.torch.FashionMNISTDataset ✓ (line 101 in torch.py)

Both classes are registered in their respective module dictionaries and follow the established FiftyOne documentation pattern.

docs/source/dataset_zoo/datasets/kitti.rst (3)

55-57: Image asset verified—no issues found.

The image file kitti-train.png exists at docs/source/images/dataset_zoo/kitti-train.png. The RST directive correctly references the asset with appropriate alt text and alignment settings.


29-29: ZooDataset class reference is valid.

The KITTIDataset class is correctly defined at fiftyone.zoo.datasets.base.KITTIDataset, and the image asset kitti-train.png exists at the referenced path. The :class: directive properly links to the dataset class.


40-45: No issues found. Python example code is correct.

The documentation example correctly uses the FiftyOne API. Verification confirms:

  • foz.load_zoo_dataset() accepts the split parameter as shown
  • The KITTI dataset is properly registered and supported
  • fo.launch_app() correctly accepts a dataset parameter
  • The usage pattern matches established patterns in the codebase
docs/source/dataset_zoo/datasets/kinetics_400.rst (1)

1-180: Comprehensive and well-structured documentation.

The Kinetics-400 entry is thorough and follows good documentation practices: clear metadata, detailed split statistics, well-explained partial download semantics, and practical Python/CLI examples. The version-specific note at lines 90–94 appropriately clarifies the zip availability constraint.

docs/source/dataset_zoo/datasets/kinetics_600.rst (1)

1-175: Kinetics-600 documentation is correct; the zip availability constraint applies only to Kinetics 400.

The code shows that Kinetics400DatasetInfo has supports_classwise_s3_downloads returning False, while Kinetics600DatasetInfo inherits from ClasswiseS3KineticsDatasetInfo—indicating that Kinetics 600 does support class-wise S3 downloads. The constraint documented in kinetics_400.rst does not apply to Kinetics 600, so the current documentation is accurate and should not include that note.

Likely an incorrect or invalid review comment.

docs/source/dataset_zoo/datasets/quickstart_video.rst (3)

25-45: Example usage is clear and follows best practices.

The Python and CLI examples are well-structured and provide clear usage patterns for loading and viewing the dataset. The code formatting with line numbers is appropriate.


47-54: No issues found—both references are valid.

The troubleshooting anchor is properly defined at docs/source/installation/troubleshooting.rst:92, and the image file exists at docs/source/images/dataset_zoo/quickstart-video.png. The code is correct as written.


1-21: All documentation references verified and correct.

The class path fiftyone.zoo.datasets.base.QuickstartVideoDataset resolves to the existing class definition at fiftyone/zoo/datasets/base.py:3158. Image asset /images/dataset_zoo/quickstart-video.png exists, and the troubleshooting anchor reference is consistent with documentation patterns across other video dataset pages. File structure, metadata, and code examples are well-formed and properly formatted for Sphinx.

docs/source/model_zoo/index.rst (4)

1-85: LGTM: Card-based layout and navigation structure look solid.

The overall refactoring introduces a clean card-based UI with search, filtering, pagination, and an animated CTA. The structure aligns well with the PR objectives to standardize the Model Zoo with a modern, discoverable layout. Once the HTML structure issue and directive/file verifications are confirmed, this refactor should significantly improve the user experience.


19-29: Search functionality is properly implemented and wired.

Verification confirms that docs/source/_static/js/search.js implements the search filtering logic and properly attaches an event listener to the model-search input element. The search term is extracted, normalized, and used to filter the tutorial list. No issues found.


31-34: The customanimatedcta directive is properly registered and functional.

The directive is correctly defined as a Sphinx Directive subclass in custom_directives.py (line 352), explicitly registered in conf.py via app.add_directive("customanimatedcta", CustomAnimatedCTADirective) (line 289), and properly implements all required options including button_text, button_link, and align. The implementation correctly handles the alignment parameter and generates the button HTML with appropriate styling. The usage in the reviewed file matches the directive's API perfectly.


60-60: No action required—include path is correct.

The file models/model_cards.rst is generated during the documentation build process by make_model_zoo_docs.py (line 386-405), not committed to the repository. The include directive on line 60 correctly references the location where the script writes this file: docs/source/model_zoo/models/model_cards.rst. The path will resolve properly when the documentation is built.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@AdonaiVera AdonaiVera requested a review from brimoor October 2, 2025 13:13
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/source/getting_started/manufacturing/06_augmentation.ipynb (1)

242-242: Fix misleading comment.

The comment references "mvtec_bottle" but the code creates "mvtec-screw". Update the comment to match the actual dataset name.

Apply this diff:

-    # Clone the dataset into a new one called "mvtec_bottle"
+    # Clone the dataset into a new one called "mvtec-screw"
     mvtec_screw = test_split.clone("mvtec-screw", persistent=True)
docs/source/getting_started/manufacturing/02_embeddings.ipynb (1)

48-118: Unify dataset handle; current branch logic raises NameError

When the dataset already exists, dataset_emb is never defined, so the later cells (compute_visualization, reload/print) crash. Conversely, in the creation branch dataset is never assigned, so launch_app(dataset, …) fails. Please collapse to a single dataset variable that is always defined and reuse it downstream.

-if dataset_name in fo.list_datasets():
-    print(f"Dataset '{dataset_name}' exists. Loading...")
-    dataset = fo.load_dataset(dataset_name)
-else:
-    print(f"Dataset '{dataset_name}' does not exist. Creating a new one...")
-    # Clone the dataset with a new name and make it persistent
-    #dataset_ = fouh.load_from_hub("Voxel51/mvtec-ad", persistent=True, overwrite=True)
-    dataset_ = fo.Dataset.from_dir(
-        dataset_dir="/content/mvtec-ad",
-        dataset_type=fo.types.FiftyOneDataset
-    )
-    dataset_emb = dataset_.clone("MVTec_AD")
-
-print(dataset_emb)
+if dataset_name in fo.list_datasets():
+    print(f"Dataset '{dataset_name}' exists. Loading...")
+    dataset = fo.load_dataset(dataset_name)
+else:
+    print(f"Dataset '{dataset_name}' does not exist. Creating a new one...")
+    # Clone the dataset with a new name and make it persistent
+    #dataset_src = fouh.load_from_hub("Voxel51/mvtec-ad", persistent=True, overwrite=True)
+    dataset_src = fo.Dataset.from_dir(
+        dataset_dir="/content/mvtec-ad",
+        dataset_type=fo.types.FiftyOneDataset
+    )
+    dataset = dataset_src.clone(dataset_name)
+
+print(dataset)
@@
-fob.compute_visualization(
-    dataset_emb,
+fob.compute_visualization(
+    dataset,
@@
-dataset_emb.reload()
-print(dataset_emb)
-print(dataset_emb.last())
+dataset.reload()
+print(dataset)
+print(dataset.last())
docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb (1)

764-769: Fix compute_visualization embeddings field name

You store the per-sample vectors in the "embedding" field (see the loop on lines 704-727), but the compute_visualization call asks for embeddings="padin_emb". FiftyOne will raise ValueError: field 'padin_emb' does not exist and the notebook stops. Pass the actual field name so the visualization step works.

-    mvtec_bottle,
-    embeddings="padin_emb",
+    mvtec_bottle,
+    embeddings="embedding",
🧹 Nitpick comments (13)
docs/source/getting_started/manufacturing/06_augmentation.ipynb (2)

137-137: Improve idempotency of unzip operation.

The unzip command will fail if run multiple times in the same environment because the directory already exists. Consider adding a check or using the -o flag to overwrite, or better yet, skip if the directory already exists.

Apply this approach:

import os
if not os.path.exists('/content/mvtec-ad'):
    !unzip mvtec_ad.zip

130-158: Consider cleanup of temporary download files.

The downloaded mvtec_ad.zip file is not removed after extraction, consuming unnecessary storage in the Colab environment. Consider adding cleanup after successful extraction.

Add cleanup after the unzip:

!unzip mvtec_ad.zip
import os
if os.path.exists("mvtec_ad.zip"):
    os.remove("mvtec_ad.zip")
docs/source/_static/css/custom.css (1)

1-1: Avoid duplicate @font-face definitions

This block re-declares the Geist / Geist Mono faces that already live earlier in the sheet. Keeping both sets bloats the asset and makes future edits harder. Please consolidate to a single definition (and update the URLs there if needed) so the font setup has one source of truth.

docs/source/dataset_zoo/dataset_zoo_ecosystem/bdd100k.rst (1)

41-48: Prefer HTTPS for BDD links

Update URLs to HTTPS for consistency and security.

-    You can register at http://bdd-data.berkeley.edu in order to get links
+    You can register at https://bdd-data.berkeley.edu in order to get links
-    to download the data.
+    to download the data.
--   Dataset source: http://bdd-data.berkeley.edu
-+   Dataset source: https://bdd-data.berkeley.edu
--   Dataset license: http://bdd-data.berkeley.edu/download.html
-+   Dataset license: https://bdd-data.berkeley.edu/download.html
docs/source/getting_started/manufacturing/03_clustering.ipynb (2)

1-1: Potential path mismatch after unzip

You unzip mvtec_ad.zip but later read from /content/mvtec-ad. Ensure the extracted folder name matches. Consider detecting it programmatically.

from pathlib import Path

!unzip -q mvtec_ad.zip
root = next((p for p in Path(".").iterdir() if p.is_dir() and p.name.startswith("mvtec")), None)
dataset_dir = str(root) if root else "/content/mvtec-ad"  # fallback
dataset_ = fo.Dataset.from_dir(dataset_dir=dataset_dir, dataset_type=fo.types.FiftyOneDataset)

1-1: Reduce duplication and unused imports

You both foz.load_zoo_model("open-clip-vit-base32-torch") and later fozm.load_zoo_model("clip-vit-base32-torch"). Keep one. Also drop unused fiftyone.utils.huggingface as fouh if not used.

docs/scripts/make_model_zoo_docs.py (4)

371-381: Consider enabling Jinja2 autoescape (or explicitly disabling)

RST ultimately renders to HTML. To avoid accidental HTML injection in descriptions, use select_autoescape for html/xml, or explicitly set autoescape=False with a comment.

-from jinja2 import Environment, BaseLoader
+from jinja2 import Environment, BaseLoader, select_autoescape
...
-    environment = Environment(
+    environment = Environment(
         loader=BaseLoader,
         trim_blocks=True,
         lstrip_blocks=True,
+        autoescape=select_autoescape(default_for_string=False, default=False),
     )

382-389: Cross‑platform and cleaner path handling

Replace manual string splitting with pathlib for portability; you already import Path.

-    docs_dir = "/".join(os.path.realpath(__file__).split("/")[:-2])
-    docs_source_dir = os.path.join(docs_dir, "source")
-    model_zoo_ecosystem_dir = os.path.join(
-        docs_source_dir, "model_zoo", "model_zoo_ecosystem"
-    )
+    script_path = Path(__file__).resolve()
+    docs_dir = script_path.parents[1]
+    docs_source_dir = docs_dir / "source"
+    model_zoo_ecosystem_dir = docs_source_dir / "model_zoo" / "model_zoo_ecosystem"
...
-    os.makedirs(model_zoo_ecosystem_dir, exist_ok=True)
+    os.makedirs(str(model_zoo_ecosystem_dir), exist_ok=True)

Also update later writes to str(path).


404-412: Use pathlib consistently for writes

Consistent with the previous refactor; avoid mixing os.path with Path.

-    cards_path = os.path.join(model_zoo_ecosystem_dir, "model_cards.rst")
-    etau.write_file("\n".join(model_cards_content), cards_path)
+    cards_path = model_zoo_ecosystem_dir / "model_cards.rst"
+    etau.write_file("\n".join(model_cards_content), str(cards_path))
...
-        model_slug = model_name.replace(".", "_").replace("-", "_")
-        model_path = os.path.join(model_zoo_ecosystem_dir, f"{model_slug}.rst")
-        etau.write_file(model_content, model_path)
+        model_slug = model_name.replace(".", "_").replace("-", "_")
+        model_path = model_zoo_ecosystem_dir / f"{model_slug}.rst"
+        etau.write_file(model_content, str(model_path))

241-266: Template conditionals test substrings in a comma‑joined string

'foo' in tags does substring checks on a single string. Prefer passing a tags_list to the template and using membership against a list to avoid false positives.

-    content = template.render(
+    content = template.render(
         ...
-        tags=tags_str,
+        tags=tags_str,
+        tags_list=zoo_model.tags,
         ...
     )

And in the template, change checks like {% if 'clip' in tags %} to {% if 'clip' in tags_list %}.

docs/source/_static/js/custom.js (3)

605-628: Consider guarding against zero-width elements.

The compute function uses getBoundingClientRect() but doesn't check if the button has rendered dimensions. If elements are hidden or not yet laid out, btnRect.width could be zero, leading to incorrect CSS variable values.

Apply this diff to add a safety check:

 const compute = (btn) => {
   const arrow = btn.querySelector(".arrow");
   const text = btn.querySelector(".text");
   if (!arrow || !text) return;

   const btnRect = btn.getBoundingClientRect();
+  if (btnRect.width === 0 || btnRect.height === 0) return;
+
   const arrowRect = arrow.getBoundingClientRect();
   const textRect = text.getBoundingClientRect();

614-621: Extract magic numbers to named constants.

The calculation uses several magic numbers (5, 30, 320, 0.5) that lack context. Consider extracting these to named constants for maintainability.

For example:

const ARROW_PADDING = 5;
const MIN_TEXT_SHIFT = 30;
const MAX_ARROW_SHIFT = 320;
const TEXT_SHIFT_RATIO = 0.5;

const spaceRight =
  btnRect.width - (arrowRect.left - btnRect.left) - arrowRect.width;
const arrowShift = Math.max(0, spaceRight - ARROW_PADDING);

const textShift = -Math.min(
  Math.max(arrowRect.width, MIN_TEXT_SHIFT),
  btnRect.width * TEXT_SHIFT_RATIO
);

btn.style.setProperty(
  "--cta-arrow-shift",
  `${Math.min(arrowShift, MAX_ARROW_SHIFT)}px`
);

637-637: Consider debouncing the resize handler.

The resize event can fire rapidly during window resizing, causing frequent recalculations. Consider debouncing or throttling the updateAll call for better performance.

Example implementation:

let resizeTimeout;
const debouncedUpdate = () => {
  clearTimeout(resizeTimeout);
  resizeTimeout = setTimeout(updateAll, 150);
};
window.addEventListener("resize", debouncedUpdate);
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b94955 and 2e3d1c6.

📒 Files selected for processing (62)
  • docs/.gitignore (1 hunks)
  • docs/scripts/generate_plugin_docs.py (3 hunks)
  • docs/scripts/make_model_zoo_docs.py (4 hunks)
  • docs/source/_static/css/custom.css (1 hunks)
  • docs/source/_static/js/custom.js (2 hunks)
  • docs/source/_static/js/search.js (1 hunks)
  • docs/source/_static/js/tutorial-filters.js (3 hunks)
  • docs/source/assets/styles/components/_buttons.sass (1 hunks)
  • docs/source/assets/styles/components/_plugins-search.sass (1 hunks)
  • docs/source/assets/styles/layout/_main.sass (1 hunks)
  • docs/source/conf.py (5 hunks)
  • docs/source/custom_directives.py (3 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_100.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_200.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/bdd100k.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech101.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech256.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/cifar10.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/cifar100.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/cityscapes.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2014.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2017.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/dataset_cards.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/fashion_mnist.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/fiw.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/hmdb51.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/imagenet_2012.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/imagenet_sample.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_400.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_600.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_700.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_700_2020.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti_multiview.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/lfw.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/mnist.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/open_images_v6.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/open_images_v7.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/places.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_3d.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_geo.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_groups.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_video.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/sama_coco.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/ucf101.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2007.rst (1 hunks)
  • docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2012.rst (1 hunks)
  • docs/source/dataset_zoo/index.rst (1 hunks)
  • docs/source/dataset_zoo/overview.rst (1 hunks)
  • docs/source/getting_started/manufacturing/01_intro.ipynb (1 hunks)
  • docs/source/getting_started/manufacturing/02_embeddings.ipynb (1 hunks)
  • docs/source/getting_started/manufacturing/03_clustering.ipynb (1 hunks)
  • docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb (13 hunks)
  • docs/source/getting_started/manufacturing/06_augmentation.ipynb (6 hunks)
  • docs/source/index.rst (8 hunks)
  • docs/source/model_zoo/index.rst (1 hunks)
  • docs/source/model_zoo/overview.rst (1 hunks)
  • docs/source/plugins/index.rst (1 hunks)
  • docs/source/recipes/index.rst (2 hunks)
  • docs/source/user_guide/basics.rst (9 hunks)
  • docs/source/user_guide/using_datasets.rst (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T19:53:08.432Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6337
File: docs/source/tutorials/dinov3.ipynb:121-124
Timestamp: 2025-09-10T19:53:08.432Z
Learning: The `foz.load_zoo_dataset()` function supports loading datasets from external sources including GitHub URLs (e.g., "https://github.com/voxel51/coco-2017"), which is different from loading built-in zoo datasets by name (e.g., "coco-2017"). GitHub URL loading allows access to custom datasets not available in the standard FiftyOne Dataset Zoo.

Applied to files:

  • docs/source/index.rst
  • docs/source/dataset_zoo/overview.rst
  • docs/source/dataset_zoo/index.rst
📚 Learning: 2025-09-04T19:49:48.247Z
Learnt from: brimoor
PR: voxel51/fiftyone#6317
File: fiftyone/utils/caltech.py:149-157
Timestamp: 2025-09-04T19:49:48.247Z
Learning: The Caltech 256 dataset has a static directory structure where all class directory names follow the format "XXX.category_name" (e.g., "001.ak47", "002.american-flag"), meaning all directory names contain dots and the split operation in the label normalization code is safe.

Applied to files:

  • docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech256.rst
🧬 Code graph analysis (3)
docs/source/_static/js/search.js (1)
docs/source/_static/js/tutorial-filters.js (1)
  • allButton (180-182)
docs/source/conf.py (1)
docs/source/custom_directives.py (1)
  • CustomAnimatedCTADirective (352-405)
docs/scripts/make_model_zoo_docs.py (1)
fiftyone/zoo/models/__init__.py (2)
  • get_zoo_model (356-366)
  • list_zoo_models (36-73)
🪛 Pylint (3.3.8)
docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb

[refactor] 7-10: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)

docs/source/getting_started/manufacturing/02_embeddings.ipynb

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)

docs/source/getting_started/manufacturing/03_clustering.ipynb

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)

docs/source/getting_started/manufacturing/01_intro.ipynb

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)

🪛 Ruff (0.13.2)
docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb

49-49: Redefinition of unused torch from line 2

Remove definition: torch

(F811)


66-66: Redefinition of unused torch from line 49

Remove definition: torch

(F811)

docs/source/custom_directives.py

371-375: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

docs/scripts/make_model_zoo_docs.py

373-373: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (41)
docs/source/getting_started/manufacturing/06_augmentation.ipynb (1)

361-361: Verify the nbformat_minor downgrade.

The nbformat_minor has been changed from 2 to 0, which is a downgrade. This is unusual and could potentially affect compatibility with newer Jupyter features. Typically, this value increases as the notebook format evolves.

Please verify this change is intentional and test that the notebook renders correctly in both Jupyter and Colab environments.

docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2012.rst (1)

1-65: LGTM!

The documentation structure is consistent, well-organized, and follows the standard format for dataset zoo pages. The content accurately describes the VOC-2012 dataset with appropriate details, backend requirements, and usage examples.

docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_video.rst (1)

1-52: LGTM!

The documentation is well-structured and includes all necessary information including the important ffmpeg requirement note. The format is consistent with other dataset zoo documentation pages.

docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech256.rst (1)

1-52: LGTM!

The documentation is well-structured, accurate, and consistent with the standard format for dataset zoo pages. All sections are complete and properly formatted.

docs/source/dataset_zoo/dataset_zoo_ecosystem/lfw.rst (1)

1-51: LGTM!

The documentation is well-organized and provides comprehensive information about the LFW dataset. The structure is consistent with other dataset zoo documentation pages.

docs/scripts/generate_plugin_docs.py (2)

106-108: LGTM!

The _generate_github_badge method is simple, focused, and generates a properly formatted GitHub badge. The implementation is clean and serves its purpose well.


510-522: LGTM!

The badge generation is properly integrated into the documentation workflow. For community plugins, it's placed after the community note, and for other plugins, it's placed at the beginning. This provides consistent visual branding across all plugin documentation.

docs/source/dataset_zoo/dataset_zoo_ecosystem/cifar10.rst (1)

1-55: LGTM!

The CIFAR-10 documentation is well-structured and complete. It follows the standard format for dataset zoo pages and includes all necessary information including backend requirements and usage examples.

docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_600.rst (1)

1-173: LGTM! Well-structured dataset documentation.

The Kinetics-600 documentation page is comprehensive and follows best practices:

  • Clear dataset overview and metadata
  • Detailed partial download parameter documentation
  • Practical usage examples in both Python and CLI
  • Appropriate notes about video availability and download behavior
docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_3d.rst (1)

1-51: LGTM! Clear and concise quickstart documentation.

The Quickstart 3D documentation is well-suited for its purpose as an introductory dataset. The description clearly explains the data source, processing methods (Poisson disk sampling, convex hull bounding boxes), and provides straightforward usage examples.

docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_100.rst (1)

1-208: LGTM! Comprehensive ActivityNet documentation.

The ActivityNet 100 documentation effectively addresses the complexities of this dataset:

  • Clear notes about video availability issues and YouTube/source file alternatives
  • Detailed parameter documentation for partial downloads
  • Helpful guidance on full split downloads with external form reference
  • Well-organized example usage for both Python and CLI
docs/source/getting_started/manufacturing/01_intro.ipynb (2)

1-1: Disregard static analysis configuration warnings.

The Pylint warnings about unrecognized options and deprecated settings are configuration issues, not code problems. These don't affect the notebook's functionality and can be safely ignored.


1-1: LGTM! Improved notebook with standardized naming and alternative download.

The notebook updates improve consistency and usability:

  • Standardized file naming with underscores (mvtec_ad.zip, mvtec_ad)
  • Activated Google Drive download alternative for users who encounter Hugging Face issues
  • Minor formatting improvements (port specification)

The changes maintain the tutorial's educational flow while providing better alternatives for dataset access.

docs/source/_static/js/tutorial-filters.js (2)

93-101: LGTM! Flexible search input handling.

The updates enable the filter system to work across different page types (plugins, models, datasets) by:

  • Checking for multiple possible search input IDs
  • Using optional chaining for safe element access
  • Properly falling back to empty string when no input exists

The approach is consistent and maintains backward compatibility.

Also applies to: 171-174


150-151: Simplified "All" button label is clearer.

Changing from a dynamic count display to a static "All" label simplifies the UI and reduces unnecessary DOM updates. The count information is still visually available through the filtered cards themselves.

docs/source/dataset_zoo/dataset_zoo_ecosystem/imagenet_2012.rst (1)

1-92: LGTM! Clear documentation with appropriate manual download guidance.

The ImageNet 2012 documentation effectively communicates:

  • The manual download requirement with clear directory structure
  • Registration process for obtaining download links
  • Backend installation prerequisites (Torch or TensorFlow)
  • Straightforward usage examples for both Python and CLI

The emphasis on manual steps is appropriate given ImageNet's access requirements.

docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_400.rst (1)

1-179: Well-documented dataset with important availability notes.

The Kinetics-400 documentation is comprehensive and includes an important distinction about download behavior (no class-based zips, requiring YouTube downloads for partial loads). This helps users understand the limitations compared to other Kinetics versions.

docs/source/dataset_zoo/dataset_zoo_ecosystem/open_images_v6.rst (1)

216-223: No action required: examples are consistent. Both the CLI and Python snippets use the supported plural label_types values (“detections”, “classifications”, “relationships”, “segmentations”), matching the parameter documentation.

Likely an incorrect or invalid review comment.

docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2014.rst (1)

86-89: Verify “attrs” parameter mention

Docs mention filtering on “attrs”, but the parameter isn’t described above. Confirm it’s supported by COCO2014Dataset/Importer; if not, remove or document it.

docs/source/dataset_zoo/dataset_zoo_ecosystem/fiw.rst (1)

3-142: Well-structured and clear FIW docs

Accurate overview, labels, examples, and utilities. LGTM.

docs/source/dataset_zoo/dataset_zoo_ecosystem/fashion_mnist.rst (1)

1-56: LGTM!

The documentation page follows the established pattern for dataset zoo entries and provides clear, accurate information about the Fashion-MNIST dataset. The structure, examples, and references are consistent with other dataset pages in this PR.

docs/source/dataset_zoo/overview.rst (1)

1-107: LGTM!

The overview page provides a clear introduction to the Dataset Zoo with well-structured sections and accurate code examples. The use of the new customanimatedcta directive aligns with the PR's goal of standardizing documentation CTAs.

docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti_multiview.rst (1)

1-58: LGTM!

The KITTI Multiview dataset page is well-structured and provides comprehensive information about the dataset's multiview components, splits, and usage. The documentation is consistent with other dataset pages in the ecosystem.

docs/source/user_guide/basics.rst (1)

58-58: LGTM!

The systematic replacement of custombutton with customanimatedcta aligns with the PR's goal of standardizing CTA buttons across documentation. All directive usages maintain their existing options and the changes are consistent throughout the file.

Also applies to: 99-99, 212-212, 225-225, 241-241, 264-264, 336-336, 393-393, 432-432

docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti.rst (1)

1-55: LGTM!

The KITTI dataset documentation is clear, well-organized, and provides all necessary information for users to load and use the dataset. The structure and examples are consistent with other dataset pages in the ecosystem.

docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_groups.rst (1)

1-48: LGTM!

The Quickstart Groups dataset page provides clear information about the grouped multimodal dataset. The documentation structure, examples, and metadata are consistent with other quickstart dataset pages in the ecosystem.

docs/source/assets/styles/components/_buttons.sass (2)

47-52: LGTM!

The CSS changes correctly integrate with the dynamic CTA JavaScript functionality by using CSS custom properties with appropriate fallback values. The fallbacks preserve the existing behavior when the JavaScript hasn't executed or the variables aren't set.


47-47: Verify the fallback value precision.

The fallback value 116.22px for --cta-arrow-shift is oddly specific. Ensure this value was measured from the existing design and not accidentally introduced with excessive precision.

docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_geo.rst (1)

1-46: LGTM! Well-structured dataset documentation.

The documentation follows a clear, consistent structure with appropriate sections (anchor, description, details, usage examples). The tabbed examples provide both Python and CLI usage patterns, which improves accessibility for different user workflows.

docs/source/dataset_zoo/dataset_zoo_ecosystem/dataset_cards.rst (1)

1-244: LGTM! Comprehensive dataset card catalog.

The dataset cards are well-organized with consistent structure across all entries. Each card provides essential metadata (header, description, link, image, tags) that enables effective browsing and discovery of available datasets.

docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2007.rst (1)

1-65: LGTM! Clear and comprehensive dataset documentation.

The documentation properly notes the test set limitation (no annotations) and backend requirements, which are important for users to understand before loading the dataset.

docs/source/custom_directives.py (3)

208-210: Minor style adjustment in image formatting.

The image formatting now includes explicit margin styling for better visual spacing in callout items.


240-249: Enhanced CTA button with animated arrow.

The updated template adds a dynamic arrow SVG and the data-cta-dynamic="true" attribute, which enables the animated hover effects described in the new CustomAnimatedCTADirective.


352-423: Well-implemented animated CTA directive.

The new CustomAnimatedCTADirective provides a clean interface for creating animated call-to-action buttons with flexible alignment options. The implementation correctly handles all directive options and generates appropriate HTML output.

Note: The static analysis warning about option_spec being a mutable class attribute is a standard pattern for Sphinx directives and can be safely ignored in this context.

docs/source/model_zoo/overview.rst (1)

1-240: LGTM! Excellent model zoo overview documentation.

The documentation provides a comprehensive introduction to the Model Zoo with:

  • Clear organization of key concepts (built-in models, remote models, interface, API)
  • Practical code examples demonstrating predictions, embeddings, and logits
  • Effective use of the new customanimatedcta directive for navigation
  • Helpful notes about requirements and capabilities
docs/source/conf.py (4)

25-25: Import added for new directive.

The CustomAnimatedCTADirective is imported to enable registration in the Sphinx setup.


187-191: Toctree configuration for ecosystem documentation.

The remove_from_toctrees setting excludes ecosystem subdirectories from the main table of contents, which is appropriate for the card-based ecosystem structure where individual pages are accessed via cards rather than toctree navigation.


207-207: New search functionality added.

The js/search.js file is included to support enhanced search/filter capabilities for the ecosystem pages.


285-285: New directive registered successfully.

The customanimatedcta directive is properly registered with the Sphinx application, enabling its use throughout the documentation.

docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2017.rst (1)

1-236: LGTM! Comprehensive COCO-2017 documentation.

This is excellent documentation that provides:

  • Clear overview and context (including COCO team recommendation)
  • Important notes about class counts and data availability
  • Detailed partial download parameter descriptions
  • Multiple practical examples for common use cases
  • Both Python and CLI examples

The documentation enables users to efficiently work with this large dataset by understanding partial download options.

docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_200.rst (1)

1-209: LGTM! Thorough ActivityNet 200 documentation.

The documentation effectively addresses the unique challenges of this dataset:

  • Clear explanation of ActivityNet 100 vs 200 differences
  • Important warnings about video availability and manual downloads
  • Comprehensive parameter documentation for partial downloads
  • Link to full split download instructions
  • Practical examples demonstrating common workflows

The note about requiring ffmpeg installation is a helpful troubleshooting pointer.

@AdonaiVera
Copy link
Contributor Author

I moved some commits into individual PRs to keep this PR focused on the dataset and model improvements

@AdonaiVera
Copy link
Contributor Author

Instead of duplicating content across multiple files, the last commit makes dataset.rst the source of truth. Dynamic individual files are now generated directly from it, ensuring consistency and reducing maintenance overhead as a temporal solution.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e3d1c6 and 21dd913.

📒 Files selected for processing (4)
  • docs/.gitignore (1 hunks)
  • docs/generate_docs.bash (1 hunks)
  • docs/scripts/make_dataset_zoo_docs.py (1 hunks)
  • docs/source/dataset_zoo/dataset.rst (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/.gitignore
🧰 Additional context used
🪛 Pylint (3.3.8)
docs/scripts/make_dataset_zoo_docs.py

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)

🪛 Ruff (0.13.3)
docs/scripts/make_dataset_zoo_docs.py

34-34: Local variable name is assigned to but never used

Remove assignment to unused variable name

(F841)


128-128: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)


165-165: Loop control variable title not used within loop body

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21dd913 and d8288fa.

📒 Files selected for processing (1)
  • docs/scripts/make_dataset_zoo_docs.py (1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.8)
docs/scripts/make_dataset_zoo_docs.py

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)

🪛 Ruff (0.13.3)
docs/scripts/make_dataset_zoo_docs.py

34-34: Local variable name is assigned to but never used

Remove assignment to unused variable name

(F841)


128-128: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)


165-165: Loop control variable title not used within loop body

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (3)
docs/scripts/make_dataset_zoo_docs.py (3)

128-131: Static analysis autoescape warning is a false positive.

The Ruff warning about missing autoescape=True is not applicable here since this script generates RST documentation files, not HTML. XSS vulnerabilities are not a concern for RST generation.


165-175: Duplicate anchor issue has been resolved.

The previous review concern about duplicate .. _dataset-zoo-… anchors causing Sphinx warnings has been properly addressed. Lines 168-172 now strip the original anchor from each generated dataset file, preventing duplicate target warnings.

Based on past review comments.


74-114: Verified description extraction heuristics. Sampled outputs show accurate and properly truncated descriptions across datasets.

Copy link
Contributor

@brimoor brimoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AdonaiVera this is looking awesome!!!

I explored the new layout on https://docs.dev.voxel51.com and just have some small cleanup-related feedback for you 🤓

General Points

  • What do you think about increasing the number of cards per page from 6 to, say, 10 on all of our card lists? (tutorials, models, datasets, plugins, etc). 6 feels a bit too few to me

Dataset Zoo

Per page feedback:

/dataset_zoo/
    index.html
        rename anchor link to `dataset-zoo`
        ensure breadcrumb at the top reads "[Home] > Dataset Zoo"
        change main CTA text to "Learn more about the Dataset Zoo"
    overview.html
        rename page title to "Dataset Zoo Overview"
        rename anchor link to `dataset-zoo-overview`
        update link on "Explore the datasets in the zoo" button to `dataset_zoo/index.html`
        ensure breadcrumb at the top reads "[Home] > Dataset Zoo > Dataset Zoo Overview"
    dataset.html
        find all instances of `dataset-zoo-datasets` anchor link throughout the docs and replace with `dataset-zoo`
        delete this page, it is now deprecated
    dataset_zoo_ecosystem/
        rename this folder to datasets/

        <dataset_name>/
            move each dataset's anchor link from `dataset.rst` to the top of its new dedicated page
            ensure breadcrumb at the top reads "[Home] > Dataset Zoo > <Dataset Name>"

Model Zoo

Per page feedback:

/model_zoo/
    index.html
        rename anchor link to `model-zoo`
        ensure breadcrumb at the top reads "[Home] > Model Zoo"
        change main CTA text to "Learn more about the Model Zoo"
    overview.html
        rename page title to "Model Zoo Overview"
        rename anchor link to `model-zoo-overview`
        update link on "Explore the models in the zoo" button to `model_zoo/index.html`
        ensure breadcrumb at the top reads "[Home] > Model Zoo > Model Zoo Overview"
    models.html
        find all instances of `model-zoo-models` anchor link throughout the docs and replace with `model-zoo`
        delete this page, it is now deprecated

    model_zoo_ecosystem/
        rename this folder to models/

        <model_name>/
            ensure breadcrumb at the top reads "[Home] > Model Zoo > <Model Name>"

@brimoor brimoor added the documentation Documentation work label Oct 8, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/source/_static/js/custom.js (2)

600-637: Consider debouncing the resize listener for better performance.

The updateAll() function performs layout calculations (getBoundingClientRect()) for every CTA on each resize event. Without debouncing or throttling, frequent resize events (e.g., during window drag) could cause performance issues, especially if many CTAs exist on the page.

Apply this pattern to debounce the resize handler:

+ let resizeTimeout;
  const updateAll = () => ctas.forEach(compute);

  if (document.readyState === "loading") {
    document.addEventListener("DOMContentLoaded", updateAll, { once: true });
  } else {
    updateAll();
  }
- window.addEventListener("resize", updateAll);
+ window.addEventListener("resize", () => {
+   clearTimeout(resizeTimeout);
+   resizeTimeout = setTimeout(updateAll, 150);
+ });

614-627: Consider documenting the magic numbers for maintainability.

The calculations use several unexplained constants (5, 30, 0.5, 320). Adding brief inline comments or extracting them as named constants would improve code clarity.

Example:

+ // Buffer to prevent arrow from touching button edge
+ const ARROW_BUFFER = 5;
+ const MIN_TEXT_SHIFT = 30;
+ const MAX_TEXT_SHIFT_RATIO = 0.5;
+ const MAX_ARROW_SHIFT = 320;
+
  const spaceRight =
    btnRect.width - (arrowRect.left - btnRect.left) - arrowRect.width;
- const arrowShift = Math.max(0, spaceRight - 5);
+ const arrowShift = Math.max(0, spaceRight - ARROW_BUFFER);

  const textShift = -Math.min(
-   Math.max(arrowRect.width, 30),
-   btnRect.width * 0.5
+   Math.max(arrowRect.width, MIN_TEXT_SHIFT),
+   btnRect.width * MAX_TEXT_SHIFT_RATIO
  );

  btn.style.setProperty(
    "--cta-arrow-shift",
-   `${Math.min(arrowShift, 320)}px`
+   `${Math.min(arrowShift, MAX_ARROW_SHIFT)}px`
  );
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8288fa and 7c0ed12.

📒 Files selected for processing (1)
  • docs/source/_static/js/custom.js (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
docs/source/_static/js/custom.js (1)

600-637: LGTM! The dynamic CTA calculation logic is correct.

The function correctly computes shifts for arrow and text elements based on their bounding rectangles. The initialization pattern (checking document.readyState) ensures it works whether called before or after DOM load.

@AdonaiVera
Copy link
Contributor Author

Thank you for the feedback, @brimoor ! 🙌

I applied all the feedback for the dataset, model, and the 10 cards in pagination. The only open point was about removing dataset.rst. My first idea was to split it into individual files per dataset, but that created 10k+ new lines just for moving content. Instead, I kept dataset.rst as the single source and generate the individual files only at build time (not committed to GitHub). This way, any change to a core dataset only needs to happen in dataset.rst, and it avoids flooding the repo with thousands of lines. It’s not the most scalable solution, but good enough until we move to a more robust source like an internal DB.

One question: for model-zoo-models and dataset-zoo-datasets, I updated the names everywhere, but not in the release notes since that’s part of history. Should I update those paths there too, or leave them as is?

You can see the changes here:
https://docs.dev.voxel51.com/index.html

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
docs/source/dataset_zoo/index.rst (1)

42-49: Close tutorial-tags-container before ending the nav.

<div class="tutorial-tags-container"> opens but never closes before </nav>, leaving the DOM unbalanced and breaking downstream JS/CSS selectors. Add the missing </div> just before </nav>.

         <div id="dropdown-filter-tags">
             <div class="tutorial-filter-menu">
                 <div class="tutorial-filter filter-btn all-tag-selected" data-tag="all">All</div>
             </div>
         </div>
+        </div>
     </nav>
docs/scripts/make_model_zoo_docs.py (1)

293-296: Make the inline-link scrubber non-greedy.

re.sub(" <.*>", "", description) grabs from the first < through the last >, erasing large chunks of text whenever multiple links/angle brackets appear. Limit the match to a single bracketed fragment.

-    description = re.sub(" <.*>", "", description)
+    description = re.sub(r" <[^>]+>", "", description)
docs/source/model_zoo/index.rst (1)

42-49: Close tutorial-tags-container before </nav>.

You open <div class="tutorial-tags-container"> but never close it before ending the <nav>, leaving the markup unbalanced and breaking styling/scripts that rely on that wrapper. Insert the missing </div> ahead of </nav>.

         <div id="dropdown-filter-tags">
             <div class="tutorial-filter-menu">
                 <div class="tutorial-filter filter-btn all-tag-selected" data-tag="all">All</div>
             </div>
         </div>
+        </div>
     </nav>
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c0ed12 and 268c62d.

📒 Files selected for processing (13)
  • docs/.gitignore (1 hunks)
  • docs/scripts/make_dataset_zoo_docs.py (1 hunks)
  • docs/scripts/make_model_zoo_docs.py (4 hunks)
  • docs/source/_static/js/custom.js (2 hunks)
  • docs/source/_static/js/tutorial-filters.js (4 hunks)
  • docs/source/conf.py (4 hunks)
  • docs/source/dataset_zoo/index.rst (1 hunks)
  • docs/source/dataset_zoo/overview.rst (1 hunks)
  • docs/source/dataset_zoo/remote.rst (1 hunks)
  • docs/source/integrations/pytorch_hub.rst (1 hunks)
  • docs/source/model_zoo/index.rst (1 hunks)
  • docs/source/model_zoo/overview.rst (1 hunks)
  • docs/source/model_zoo/remote.rst (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/source/integrations/pytorch_hub.rst
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/source/_static/js/custom.js
  • docs/source/_static/js/tutorial-filters.js
  • docs/source/model_zoo/overview.rst
  • docs/source/dataset_zoo/overview.rst
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T19:53:08.432Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6337
File: docs/source/tutorials/dinov3.ipynb:121-124
Timestamp: 2025-09-10T19:53:08.432Z
Learning: The `foz.load_zoo_dataset()` function supports loading datasets from external sources including GitHub URLs (e.g., "https://github.com/voxel51/coco-2017"), which is different from loading built-in zoo datasets by name (e.g., "coco-2017"). GitHub URL loading allows access to custom datasets not available in the standard FiftyOne Dataset Zoo.

Applied to files:

  • docs/source/dataset_zoo/remote.rst
📚 Learning: 2025-10-03T13:44:19.839Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6373
File: docs/scripts/make_model_zoo_docs.py:289-292
Timestamp: 2025-10-03T13:44:19.839Z
Learning: In docs/scripts/make_model_zoo_docs.py, the generated model_cards.rst file is included from docs/source/model_zoo/index.rst using `.. include:: model_zoo_ecosystem/model_cards.rst`. Links within model_cards.rst should be relative to the includer's location (index.rst), so `model_zoo_ecosystem/{slug}.html` is the correct path format for linking to individual model pages.

Applied to files:

  • docs/scripts/make_model_zoo_docs.py
🧬 Code graph analysis (2)
docs/scripts/make_model_zoo_docs.py (1)
fiftyone/zoo/models/__init__.py (2)
  • get_zoo_model (356-366)
  • list_zoo_models (36-73)
docs/source/conf.py (1)
docs/source/custom_directives.py (1)
  • CustomAnimatedCTADirective (352-405)
🪛 Pylint (3.3.9)
docs/scripts/make_dataset_zoo_docs.py

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)

🪛 Ruff (0.13.3)
docs/scripts/make_model_zoo_docs.py

373-373: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)

docs/scripts/make_dataset_zoo_docs.py

127-127: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)


164-164: Loop control variable title not used within loop body

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/scripts/make_dataset_zoo_docs.py (1)

169-175: Strip duplicated dataset anchors to avoid Sphinx target collisions

Per-dataset files currently include the original .. _dataset-zoo-… label from dataset.rst, yielding duplicate explicit targets and refs resolving to the old page. Remove the leading label (and following blank line) before writing.

Apply:

-    for slug, title, section_content in datasets:
+    for slug, title, section_content in datasets:
         link_slug = _normalize_slug(slug)
         dataset_path = out_dir / f"{link_slug}.rst"
-        section_lines = section_content.strip().splitlines()
-        dataset_path.write_text(
-            "\n".join(section_lines) + "\n", encoding="utf-8"
-        )
+        section_lines = section_content.strip().splitlines()
+        # Drop duplicated explicit targets from the source sections
+        if section_lines and section_lines[0].startswith(".. _dataset-zoo-"):
+            section_lines = section_lines[1:]
+            if section_lines and not section_lines[0].strip():
+                section_lines = section_lines[1:]
+        dataset_path.write_text("\n".join(section_lines) + "\n", encoding="utf-8")
🧹 Nitpick comments (2)
docs/scripts/make_dataset_zoo_docs.py (1)

121-127: Make image directive regex accept optional leading slash

Some sections may use .. image:: images/dataset_zoo/... (no leading slash). Accept both.

-    pattern = r"\.\. image::\s+/images/dataset_zoo/([^\s]+)"
+    pattern = r"\.\. image::\s+/?images/dataset_zoo/([^\s]+)"
docs/scripts/make_model_zoo_docs.py (1)

384-389: Use pathlib for robust, cross-platform paths; also removes unused import warning

Avoid manual splitting and os.path juggling; use Path consistently.

-    docs_dir = "/".join(os.path.realpath(__file__).split("/")[:-2])
-    docs_source_dir = os.path.join(docs_dir, "source")
-    models_dir = os.path.join(docs_source_dir, "model_zoo", "models")
-
-    os.makedirs(models_dir, exist_ok=True)
+    docs_dir = Path(__file__).resolve().parents[1]
+    docs_source_dir = docs_dir / "source"
+    models_dir = docs_source_dir / "model_zoo" / "models"
+    models_dir.mkdir(parents=True, exist_ok=True)
@@
-    cards_path = os.path.join(models_dir, "model_cards.rst")
-    etau.write_file("\n".join(model_cards_content), cards_path)
+    cards_path = models_dir / "model_cards.rst"
+    etau.write_file("\n".join(model_cards_content), str(cards_path))
@@
-        model_path = os.path.join(models_dir, f"{model_slug}.rst}")
-        etau.write_file(model_content, model_path)
+        model_path = models_dir / f"{model_slug}.rst"
+        etau.write_file(model_content, str(model_path))

Note: ensure the stray brace in the original f-string is corrected if present locally.

Also applies to: 404-405, 407-414

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 268c62d and b5a9bac.

📒 Files selected for processing (2)
  • docs/scripts/make_dataset_zoo_docs.py (1 hunks)
  • docs/scripts/make_model_zoo_docs.py (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T13:44:19.839Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6373
File: docs/scripts/make_model_zoo_docs.py:289-292
Timestamp: 2025-10-03T13:44:19.839Z
Learning: In docs/scripts/make_model_zoo_docs.py, the generated model_cards.rst file is included from docs/source/model_zoo/index.rst using `.. include:: model_zoo_ecosystem/model_cards.rst`. Links within model_cards.rst should be relative to the includer's location (index.rst), so `model_zoo_ecosystem/{slug}.html` is the correct path format for linking to individual model pages.

Applied to files:

  • docs/scripts/make_model_zoo_docs.py
🔇 Additional comments (2)
docs/scripts/make_model_zoo_docs.py (1)

289-299: Good fix on greedy inline-link stripping

Switch to bracket-bounded regex for RST inline links is correct and avoids over-stripping.

docs/scripts/make_dataset_zoo_docs.py (1)

166-168: Confirm include path matches link base

Cards are written to dataset_zoo/datasets/dataset_cards.rst and links use datasets/{slug}.html. Ensure dataset.rst includes the cards from datasets/dataset_cards.rst so links resolve relative to the includer.

Run:

@AdonaiVera AdonaiVera requested a review from brimoor October 14, 2025 14:47
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/source/model_zoo/api.rst (1)

22-28: Tighten wording, pluralize, and add helpful cross-links

Improve clarity and discoverability; avoid "(s)". Link to ensure/install methods and CLI.

 .. note::
 
-    Some models require additional packages to be installed. If you do not have
-    a :ref:`required package <model-zoo-requirements>` installed when loading a
-    zoo model, you'll see an error message that will help you install the
-    necessary package(s).
+    Some models require additional packages. If any :ref:`required packages <model-zoo-requirements>`
+    are missing when you load a zoo model, you'll see an error with installation
+    instructions. See :ref:`Installing zoo model requirements <model-zoo-requirements>` to
+    pre-check or install via :meth:`ensure_zoo_model_requirements() <fiftyone.zoo.models.ensure_zoo_model_requirements>`,
+    :meth:`install_zoo_model_requirements() <fiftyone.zoo.models.install_zoo_model_requirements>`,
+    or the :ref:`CLI command <cli-fiftyone-zoo-models-requirements>`.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5a9bac and cd40dc6.

📒 Files selected for processing (2)
  • docs/source/dataset_zoo/api.rst (1 hunks)
  • docs/source/model_zoo/api.rst (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/source/dataset_zoo/api.rst

@AdonaiVera
Copy link
Contributor Author

Based on our conversation, @brimoor, Now we have one file per dataset, I removed the script, and updated both the model-zoo-models and dataset-zoo-datasets for the new version.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 21

🧹 Nitpick comments (23)
docs/source/dataset_zoo/datasets/kitti_multiview.rst (5)

24-26: Use HTTPS and consider linking the license.

  • Switch to HTTPS for the source URL.
  • Optionally link the license to its canonical page for clarity.

Apply:

- -   Dataset source: http://www.cvlibs.net/datasets/kitti
+ -   Dataset source: https://www.cvlibs.net/datasets/kitti/
- -   Dataset license: CC-BY-NC-SA-3.0
+ -   Dataset license: CC BY-NC-SA 3.0

27-28: Align tags with the standardized taxonomy.

“detection” is ambiguous. Consider aligning with project-wide tags (e.g., “object-detection”, “3d-detection”, “stereo”/“multiview”) for consistent filtering/search.

Please confirm which canonical tags are being used across the new zoo pages and update here accordingly.


29-31: Verify the ZooDataset class import path resolves.

Ensure Sphinx resolves this cross-ref. Dataset classes typically aren’t under .base. If needed, update to the correct module path and consider ~ to shorten display.

Example (adjust as appropriate):
:class:~fiftyone.zoo.datasets.kitti.KITTIMultiviewDataset``


56-58: Verify image asset exists and builds cleanly.

Confirm /images/dataset_zoo/kitti-multiview-train.png is present and referenced correctly (path, case). Consider adding a width if needed for layout consistency.


18-20: Optional: link directly to the KITTI devkit page.

Hyperlink “object development kit on the KITTI homepage” to the specific devkit/README page for faster navigation.

docs/source/dataset_zoo/datasets/quickstart_geo.rst (3)

8-10: Add source/citation (and license) for BDD100K.

Attribute the upstream dataset and link to its homepage; optionally mention license/terms to avoid ambiguity.

Apply:

-The dataset consists of 500 images from the validation split of the BDD100K
-dataset in the New York City area with object detections and GPS timestamps.
+The dataset consists of 500 images from the validation split of the
+`BDD100K <https://bdd-data.berkeley.edu/>`_ dataset in the New York City
+area with object detections and GPS timestamps.
+
+Source: BDD100K (validation split subset). Please refer to the BDD100K
+license/terms of use on the linked site.

14-14: Avoid exact size; prefer approximate to prevent drift.

--   Dataset size: 33.50 MB
+-   Dataset size: ~33.5 MB

16-16: Clarify “Supported splits” vs text mentioning validation.

“Supported splits: N/A” can read as contradictory. Clarify that the zoo dataset is a single combined set without split options.

--   Supported splits: ``N/A``
+-   Supported splits: ``none (single combined set)``
docs/source/dataset_zoo/datasets/ucf101.rst (3)

48-54: Set explicit dataset name for consistency with CLI.

Without a name, the dataset’s actual name may vary. Align with CLI examples.

-        dataset = foz.load_zoo_dataset("ucf101", split="test")
+        dataset = foz.load_zoo_dataset(
+            "ucf101", split="test", dataset_name="ucf101-test"
+        )

59-65: Make CLI deterministic by naming the dataset.

Ensure subsequent commands reference the exact dataset.

-        fiftyone zoo datasets load ucf101 --split test
+        fiftyone zoo datasets load ucf101 --split test --dataset-name ucf101-test

6-9: Consistency: use “dataset” (one word).

The text mixes “data set” and “dataset”. Standardize to “dataset”.

Example edits (apply across the section):

- UCF101 is an action recognition data set
+ UCF101 is an action recognition dataset
- This data set is an extension
+ This dataset is an extension
- data sets are not realistic
+ datasets are not realistic

Also applies to: 14-16

docs/source/dataset_zoo/datasets/voc_2012.rst (1)

20-20: Optional: mark size as approximate.

Dataset downloads vary; consider “~3.6 GB” to avoid false precision.

docs/source/dataset_zoo/datasets/kinetics_400.rst (2)

75-77: Use double backticks for inline code literals

Single backticks can create interpreted text; prefer double backticks for code/param names for consistency with the rest of the page.

Apply diffs like:

-By default, `multiprocessing.cpu_count()` is used
+By default, ``multiprocessing.cpu_count()`` is used
-... whenever either `classes` or `max_samples` is provided ...
+... whenever either ``classes`` or ``max_samples`` is provided ...
-Videos that contain all `classes` will be prioritized first, followed
+Videos that contain all ``classes`` will be prioritized first, followed

Also applies to: 90-92, 125-128


21-21: Stats/size accuracy: add “approx.” and verify sources

Numbers look internally consistent (370+56+30=456 GB). Consider marking sizes as “approx.” and confirm counts for “Original” vs “CVDF” splits match the referenced sources.

Example:

--   Dataset size: 456 GB
+-   Dataset size: ~456 GB

Also applies to: 27-38, 41-44

docs/source/dataset_zoo/datasets/cifar100.rst (1)

21-23: Standardize ZooDataset links (use base module) and keep backend note in text

Other pages link to base.* classes. Recommend linking to the base class once and keeping the TF/Torch backend availability in prose to avoid crossref breaks.

-    -   :class:`CIFAR100Dataset <fiftyone.zoo.datasets.tf.CIFAR100Dataset>` (TF backend)
-    -   :class:`CIFAR100Dataset <fiftyone.zoo.datasets.torch.CIFAR100Dataset>` (Torch backend)
+    -   :class:`CIFAR100Dataset <fiftyone.zoo.datasets.base.CIFAR100Dataset>` (available for Torch and TensorFlow backends)

Please confirm the tf/torch module paths resolve in Sphinx; if they do, feel free to keep both, but be consistent with other pages.

docs/source/dataset_zoo/datasets/coco_2017.rst (1)

31-33: Prefer HTTPS for external links

Use HTTPS for COCO source.

--   Dataset source: http://cocodataset.org/#home
+-   Dataset source: https://cocodataset.org/#home
docs/source/dataset_zoo/datasets/bdd100k.rst (1)

41-49: Use HTTPS and current URLs

Prefer HTTPS for registration/source links.

-    You can register at http://bdd-data.berkeley.edu in order to get links
+    You can register at https://bdd-data.berkeley.edu in order to get links
...
--   Dataset source: http://bdd-data.berkeley.edu
--   Dataset license: http://bdd-data.berkeley.edu/download.html
+-   Dataset source: https://bdd-data.berkeley.edu
+-   Dataset license: https://bdd-data.berkeley.edu/download.html

Consider whether https://www.bdd100k.com/ should be used instead.

docs/source/dataset_zoo/datasets/coco_2014.rst (5)

31-31: Use HTTPS for source link

Switch to HTTPS.

--   Dataset source: http://cocodataset.org/#home
+-   Dataset source: https://cocodataset.org/#home

56-60: Minor consistency: split ordering

Earlier you list “train, validation, test”. Mirror that here for consistency.

-    ``("train", "test", "validation")``. If neither is provided, all available
+    ``("train", "validation", "test")``. If neither is provided, all available

81-85: Clarify include_license values (booleans vs strings)

Represent booleans as booleans, not quoted strings, and simplify wording.

-    -   ``"False"`` (default): don't load the license
-    -   ``True``/``"name"``: store the string license name
+    -   ``False`` (default): don't load the license
+    -   ``True`` or ``"name"``: store the string license name
     -   ``"id"``: store the integer license ID
     -   ``"url"``: store the license URL

86-89: Avoid referencing undefined parameter attrs

Either document an attrs parameter above, or remove it here. Quick fix below removes it.

--   **only_matching** (*False*): whether to only load labels that match the
-    ``classes`` or ``attrs`` requirements that you provide (True), or to load
+-   **only_matching** (*False*): whether to only load labels that match the
+    ``classes`` requirements that you provide (True), or to load

6-11: Clarify scope (captions not included by this loader)

You mention captioning in COCO, but this loader covers detections/segmentations. Add a sentence to avoid confusion.

-This version contains images, bounding boxes, and segmentations for the 2014
-version of the dataset.
+This version contains images, bounding boxes, and segmentations for the 2014
+version of the dataset. Captions are not included in this loader.
docs/source/dataset_zoo/datasets/open_images_v7.rst (1)

164-166: PEP 8 keyword arg spacing in example

Optional: remove spaces around '=' in keyword args.

-            classes = ["Tortoise", "Sea turtle"],
+            classes=["Tortoise", "Sea turtle"],
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd40dc6 and 084f6cb.

📒 Files selected for processing (38)
  • docs/.gitignore (1 hunks)
  • docs/source/dataset_zoo/datasets/activitynet_100.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/activitynet_200.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/bdd100k.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/caltech101.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/caltech256.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/cifar10.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/cifar100.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/cityscapes.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/coco_2014.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/coco_2017.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/dataset_cards.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/fashion_mnist.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/fiw.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/hmdb51.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/imagenet_2012.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/imagenet_sample.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_400.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_600.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_700.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_700_2020.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kitti.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kitti_multiview.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/lfw.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/mnist.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/open_images_v6.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/open_images_v7.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/places.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_3d.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_geo.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_groups.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/quickstart_video.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/sama_coco.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/ucf101.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/voc_2007.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/voc_2012.rst (1 hunks)
  • docs/source/release-notes.rst (3 hunks)
✅ Files skipped from review due to trivial changes (22)
  • docs/source/dataset_zoo/datasets/cityscapes.rst
  • docs/source/dataset_zoo/datasets/quickstart.rst
  • docs/source/dataset_zoo/datasets/imagenet_sample.rst
  • docs/source/dataset_zoo/datasets/kitti.rst
  • docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
  • docs/source/dataset_zoo/datasets/activitynet_200.rst
  • docs/source/dataset_zoo/datasets/fiw.rst
  • docs/source/dataset_zoo/datasets/kinetics_700.rst
  • docs/source/dataset_zoo/datasets/voc_2007.rst
  • docs/source/dataset_zoo/datasets/cifar10.rst
  • docs/source/dataset_zoo/datasets/quickstart_3d.rst
  • docs/source/dataset_zoo/datasets/imagenet_2012.rst
  • docs/source/dataset_zoo/datasets/kinetics_600.rst
  • docs/source/dataset_zoo/datasets/lfw.rst
  • docs/source/dataset_zoo/datasets/places.rst
  • docs/source/dataset_zoo/datasets/quickstart_groups.rst
  • docs/source/dataset_zoo/datasets/activitynet_100.rst
  • docs/source/dataset_zoo/datasets/fashion_mnist.rst
  • docs/source/dataset_zoo/datasets/mnist.rst
  • docs/source/release-notes.rst
  • docs/source/dataset_zoo/datasets/caltech101.rst
  • docs/source/dataset_zoo/datasets/quickstart_video.rst
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-04T19:49:48.247Z
Learnt from: brimoor
PR: voxel51/fiftyone#6317
File: fiftyone/utils/caltech.py:149-157
Timestamp: 2025-09-04T19:49:48.247Z
Learning: The Caltech 256 dataset has a static directory structure where all class directory names follow the format "XXX.category_name" (e.g., "001.ak47", "002.american-flag"), meaning all directory names contain dots and the split operation in the label normalization code is safe.

Applied to files:

  • docs/source/dataset_zoo/datasets/caltech256.rst
🔇 Additional comments (27)
docs/.gitignore (1)

1-1: Good update; also ignore generated dataset pages and verify no tracked files remain

The models directory ignore is correct. Since dataset pages are also generated (not committed), add the datasets directory too, and ensure no already-tracked files remain under these paths.

Apply this diff:

 source/model_zoo/models/
+source/dataset_zoo/datasets/

Run to verify paths and tracked files:

docs/source/dataset_zoo/datasets/kitti_multiview.rst (1)

50-55: Confirm CLI-created dataset name.

Does fiftyone zoo datasets load kitti-multiview --split train create kitti-multiview-train by default in the new flow? If naming changed, update the launch command accordingly.

docs/source/dataset_zoo/datasets/quickstart_geo.rst (3)

24-43: Tabs directive compatibility check.

Using .. tabs:: with .. group-tab:: depends on the sphinx-tabs extension. Ensure it’s enabled and consistent across docs (or switch to .. tab:: if that’s the project standard).


18-18: Verify API reference target for ZooDataset class.

Confirm that fiftyone.zoo.datasets.base.QuickstartGeoDataset is the correct import path documented in API docs; otherwise the link will break.

Run:


44-46: Verify image presence and improve alt text for accessibility.

  • Ensure the image file exists in the repo.
  • Alt text should describe the content, not the filename.
-.. image:: /images/dataset_zoo/quickstart-geo.png
-   :alt: quickstart-geo
+.. image:: /images/dataset_zoo/quickstart-geo.png
+   :alt: FiftyOne App showing the Quickstart Geo dataset (NYC clips with detections)
    :align: center
docs/source/dataset_zoo/datasets/ucf101.rst (4)

68-69: Verify :ref: target exists.

Confirm the “troubleshooting-video” label is defined to avoid a broken link.


78-80: Confirm image asset exists and path is correct.

Ensure /images/dataset_zoo/ucf101-test.png is added in this PR or already present to prevent a broken image.


37-55: Tab directive name likely incorrect (“group-tab”).

Sphinx typically uses “tab” (sphinx-tabs) or “tab-item” (sphinx-design). “group-tab” may not render.

Apply this diff if using sphinx-tabs:

-.. tabs::
-
-  .. group-tab:: Python
+.. tabs::
+
+  .. tab:: Python

And similarly on Line 55:

-  .. group-tab:: CLI
+  .. tab:: CLI

If using sphinx-design, use:

.. tab-set::
  .. tab-item:: Python
  .. tab-item:: CLI

27-31: Fix license; verify source URL and splits; qualify size.

  • UCF101 is not CC0; it uses custom/non-commercial terms. Update license text.
  • Please verify the source URL (the official page is typically under /data/UCF101.php).
  • Confirm supported splits exposed by the zoo (train/test, and whether “all” is allowed).
  • Prefer “Approx.” for size, as mirrors/compression vary.

Apply this diff:

- -   Dataset source: https://www.crcv.ucf.edu/research/data-sets/ucf101
- -   Dataset license: CC0-1.0
- -   Dataset size: 6.48 GB
+ -   Dataset source: https://www.crcv.ucf.edu/research/data-sets/ucf101
+ -   Dataset license: Custom (see source terms)
+ -   Dataset size: Approx. 6.48 GB
docs/source/dataset_zoo/datasets/voc_2012.rst (2)

22-22: Confirm “validation” split alias is supported for VOC.

Many VOC resources use “val”. Please confirm the loader and CLI accept “validation”. If not, update docs to “val”.

Run:

Also applies to: 51-52, 57-61


63-66: Verify image asset exists.

Ensure /images/dataset_zoo/voc-2012-validation.png is present so the page builds without broken images.

Run:

docs/source/dataset_zoo/datasets/kinetics_400.rst (2)

24-26: Verify Sphinx target for ZooDataset class

Ensure this cross‑reference resolves. The dataset class may not live under fiftyone.zoo.datasets.base. If it 404s, consider linking to the public API or replacing with a literal and linking to load_zoo_dataset.

Suggestion:

-:class:`Kinetics400Dataset <fiftyone.zoo.datasets.base.Kinetics400Dataset>`
+:class:`Kinetics400Dataset <fiftyone.zoo.datasets.Kinetics400Dataset>`

or

See :func:`fiftyone.zoo.datasets.load_zoo_dataset` with name ``"kinetics-400"``.

155-160: CLI kwargs and dataset name: verify syntax and output name

Confirm --kwargs max_samples=10 is valid. In many FO docs, JSON is passed to --kwargs.

If needed, update to:

Also verify the auto‑generated dataset name used in fiftyone app launch (e.g., kinetics-400-validation-10) matches actual CLI behavior.

docs/source/dataset_zoo/datasets/hmdb51.rst (1)

16-18: Verify supported splits

“other” is unusual for HMDB51. Are the supported splits exactly train/test (and possibly validation)? Please confirm and update.

docs/source/dataset_zoo/datasets/caltech256.rst (1)

1-53: LGTM

Content, structure, and examples look correct.

docs/source/dataset_zoo/datasets/dataset_cards.rst (1)

1-245: Fix card links: remove redundant “datasets/” prefix

This file lives in dataset_zoo/datasets/, but links use “datasets/.html”, which will likely resolve to dataset_zoo/datasets/datasets/.html. Use the page filename only (e.g., cifar100.html), or an absolute path (/dataset_zoo/datasets/cifar100.html) depending on how customcarditem resolves links.

Examples (apply globally):

-    :link: datasets/activitynet_100.html
+    :link: activitynet_100.html
...
-    :link: datasets/coco_2017.html
+    :link: coco_2017.html
...
-    :link: datasets/hmdb51.html
+    :link: hmdb51.html

If the directive expects site-root paths, use “/dataset_zoo/datasets/.html” instead. Please confirm against the docs preview.

docs/source/dataset_zoo/datasets/open_images_v6.rst (1)

31-31: Verify cross-references and assets resolve (dataset class, importer, image).

Please confirm these exist and resolve in the built docs:

  • :class:OpenImagesV6Dataset <fiftyone.zoo.datasets.base.OpenImagesV6Dataset>
  • :class:OpenImagesV6DatasetImporter <fiftyone.utils.openimages.OpenImagesV6DatasetImporter>
  • Image file: /images/dataset_zoo/open-images-v6.png

Run:

Also applies to: 111-116, 237-239

docs/source/dataset_zoo/datasets/coco_2014.rst (5)

34-34: Verify dataset size figure

37.57 GB may vary depending on what’s counted (images vs +annotations). Confirm or note “~37.6 GB”.


180-233: CLI dataset names and flags: please verify

Confirm the default dataset names (e.g., coco-2014-validation-50) and the use of --kwargs are accurate for current CLI behavior.


1-5: Confirm commit vs build-time generation

PR notes per-dataset pages are generated at build time and not committed. Is this file intended to be committed, or should it be generated from the single source (dataset.rst)? Align with the new process.


37-38: Cross-reference targets may be incorrect

Validate the module paths for COCO2014Dataset and COCODetectionDatasetImporter. Update if needed.

Also applies to: 108-113


234-236: Verify image asset exists

Ensure /images/dataset_zoo/coco-2014-validation.png is present in the repo.

docs/source/dataset_zoo/datasets/open_images_v7.rst (5)

30-32: Verify class paths resolve (linking/accuracy)

Confirm these symbols and module paths exist; adjust if needed (e.g., module moved).

Also applies to: 113-116


19-21: Check target anchor exists for :ref:open-images

Avoid broken cross-refs.


237-240: Ensure image asset exists

Prevent broken image in docs.


36-38: No edits needed: image rescaling statement is accurate

The FiftyOne Open Images V7 dataset downloads all images rescaled so their largest side is at most 1024 px.


41-44: Confirm split counts and sizes
Official V7 detection/bounding‐box subset: Train 1,743,042 images (513 GB), Validation 41,620 images (12 GB), Test 125,436 images (36 GB).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
docs/source/dataset_zoo/datasets/coco_2017.rst (2)

81-84: Document the actual boolean default for include_license

Line 82 still quotes "False" and hides the True alias ambiguity. Please state the boolean default plainly and note that True is an alias for "name" so users don’t pass string values that won’t work.

-    -   ``"False"`` (default): don't load the license
-    -   ``True``/``"name"``: store the string license name
+    -   ``False`` (default): don't load the license
+    -   ``True`` (alias for ``"name"``): store the string license name

99-103: Fix “labels types” typo

Line 102 still says “labels types”; please change it to “label types” for clarity.

-    labels types or classes. The actual number of samples loaded may be less
+    label types or classes. The actual number of samples loaded may be less
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 084f6cb and 8d7ef36.

📒 Files selected for processing (11)
  • docs/source/dataset_zoo/datasets/coco_2014.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/coco_2017.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/dataset_cards.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/hmdb51.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_400.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_600.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_700.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/kinetics_700_2020.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/open_images_v7.rst (1 hunks)
  • docs/source/dataset_zoo/datasets/sama_coco.rst (1 hunks)
  • docs/source/integrations/coco.rst (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/source/dataset_zoo/datasets/open_images_v7.rst
🚧 Files skipped from review as they are similar to previous changes (5)
  • docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
  • docs/source/dataset_zoo/datasets/sama_coco.rst
  • docs/source/dataset_zoo/datasets/hmdb51.rst
  • docs/source/dataset_zoo/datasets/kinetics_600.rst
  • docs/source/dataset_zoo/datasets/dataset_cards.rst
🔇 Additional comments (1)
docs/source/dataset_zoo/datasets/kinetics_700.rst (1)

24-26: Fix the ZooDataset class cross-reference

Kinetics700Dataset is defined in fiftyone.zoo.datasets.kinetics, not under .base, so the current target renders a broken link. Please point the :class: directive at the module that actually defines the class to keep the docs intact.

⛔ Skipped due to learnings
Learnt from: brimoor
PR: voxel51/fiftyone#6373
File: docs/source/dataset_zoo/datasets/ucf101.rst:32-0
Timestamp: 2025-10-16T17:24:11.562Z
Learning: In FiftyOne documentation, dataset zoo classes follow the pattern `fiftyone.zoo.datasets.base.{DatasetName}Dataset` and should be referenced using `:class:` directives (e.g., `:class:`UCF101Dataset <fiftyone.zoo.datasets.base.UCF101Dataset>``). This is a valid and established documentation pattern.

Copy link
Contributor

@brimoor brimoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Documentation work

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants