-
Notifications
You must be signed in to change notification settings - Fork 675
Standardize Zoos, Update Plugins, and Improve Docs #6373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request refactors the documentation ecosystem to support card-based layouts for models and datasets, introduces a new Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Browser
participant initDynamicCTAs as JS: initDynamicCTAs()
participant DOM
participant CSS
User->>Browser: Page Load
Browser->>initDynamicCTAs: DOMContentLoaded event
activate initDynamicCTAs
initDynamicCTAs->>DOM: Query [data-cta-dynamic="true"]
initDynamicCTAs->>DOM: Measure .sd-btn dimensions
initDynamicCTAs->>DOM: Measure .arrow & .text dimensions
initDynamicCTAs->>initDynamicCTAs: Calculate --cta-arrow-shift<br/>Calculate --cta-text-shift
initDynamicCTAs->>CSS: Set CSS Variables on element
deactivate initDynamicCTAs
Note over CSS: Hover transforms use<br/>var(--cta-arrow-shift, fallback)<br/>var(--cta-text-shift, fallback)
User->>Browser: Resize window
Browser->>initDynamicCTAs: resize event
initDynamicCTAs->>DOM: Recalculate & update CSS vars
sequenceDiagram
participant Gen as make_model_zoo_docs.py
participant FS as Filesystem
participant RST as RST Output
Gen->>Gen: Load all models
alt No models found
Gen->>Gen: Log warning & exit
else Models available
Gen->>FS: Create docs/source/model_zoo/models/
Gen->>Gen: Render model cards template
Gen->>RST: Write model_cards.rst
loop For each model
Gen->>Gen: Get model_name & slug
Gen->>Gen: Fetch description & tags
Gen->>Gen: Render per-model page
Gen->>RST: Write models/model_slug.rst
end
Gen->>Gen: Log success & counts
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes The changes span multiple heterogeneous systems: significant refactoring of model/dataset zoo generation logic with new template structures and output flows; introduction of new directives and CSS variable-based dynamic styling; widespread cross-reference updates across 40+ documentation files; JavaScript enhancements for multi-input search support and dynamic CTA calculations; and extensive addition of new dataset documentation pages. While many dataset page additions are repetitive, the core logic changes in generation scripts, styling system integration, and documentation architecture require careful review. Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (39)
✅ Files skipped from review due to trivial changes (10)
🚧 Files skipped from review as they are similar to previous changes (22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (15)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
docs/source/getting_started/manufacturing/06_augmentation.ipynb (1)
242-242
: Fix misleading comment.The comment references "mvtec_bottle" but the code creates "mvtec-screw". Update the comment to match the actual dataset name.
Apply this diff:
- # Clone the dataset into a new one called "mvtec_bottle" + # Clone the dataset into a new one called "mvtec-screw" mvtec_screw = test_split.clone("mvtec-screw", persistent=True)docs/source/getting_started/manufacturing/02_embeddings.ipynb (1)
48-118
: Unify dataset handle; current branch logic raises NameErrorWhen the dataset already exists,
dataset_emb
is never defined, so the later cells (compute_visualization
, reload/print) crash. Conversely, in the creation branchdataset
is never assigned, solaunch_app(dataset, …)
fails. Please collapse to a singledataset
variable that is always defined and reuse it downstream.-if dataset_name in fo.list_datasets(): - print(f"Dataset '{dataset_name}' exists. Loading...") - dataset = fo.load_dataset(dataset_name) -else: - print(f"Dataset '{dataset_name}' does not exist. Creating a new one...") - # Clone the dataset with a new name and make it persistent - #dataset_ = fouh.load_from_hub("Voxel51/mvtec-ad", persistent=True, overwrite=True) - dataset_ = fo.Dataset.from_dir( - dataset_dir="/content/mvtec-ad", - dataset_type=fo.types.FiftyOneDataset - ) - dataset_emb = dataset_.clone("MVTec_AD") - -print(dataset_emb) +if dataset_name in fo.list_datasets(): + print(f"Dataset '{dataset_name}' exists. Loading...") + dataset = fo.load_dataset(dataset_name) +else: + print(f"Dataset '{dataset_name}' does not exist. Creating a new one...") + # Clone the dataset with a new name and make it persistent + #dataset_src = fouh.load_from_hub("Voxel51/mvtec-ad", persistent=True, overwrite=True) + dataset_src = fo.Dataset.from_dir( + dataset_dir="/content/mvtec-ad", + dataset_type=fo.types.FiftyOneDataset + ) + dataset = dataset_src.clone(dataset_name) + +print(dataset) @@ -fob.compute_visualization( - dataset_emb, +fob.compute_visualization( + dataset, @@ -dataset_emb.reload() -print(dataset_emb) -print(dataset_emb.last()) +dataset.reload() +print(dataset) +print(dataset.last())docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb (1)
764-769
: Fixcompute_visualization
embeddings field nameYou store the per-sample vectors in the
"embedding"
field (see the loop on lines 704-727), but thecompute_visualization
call asks forembeddings="padin_emb"
. FiftyOne will raiseValueError: field 'padin_emb' does not exist
and the notebook stops. Pass the actual field name so the visualization step works.- mvtec_bottle, - embeddings="padin_emb", + mvtec_bottle, + embeddings="embedding",
🧹 Nitpick comments (13)
docs/source/getting_started/manufacturing/06_augmentation.ipynb (2)
137-137
: Improve idempotency of unzip operation.The unzip command will fail if run multiple times in the same environment because the directory already exists. Consider adding a check or using the
-o
flag to overwrite, or better yet, skip if the directory already exists.Apply this approach:
import os if not os.path.exists('/content/mvtec-ad'): !unzip mvtec_ad.zip
130-158
: Consider cleanup of temporary download files.The downloaded
mvtec_ad.zip
file is not removed after extraction, consuming unnecessary storage in the Colab environment. Consider adding cleanup after successful extraction.Add cleanup after the unzip:
!unzip mvtec_ad.zip import os if os.path.exists("mvtec_ad.zip"): os.remove("mvtec_ad.zip")docs/source/_static/css/custom.css (1)
1-1
: Avoid duplicate@font-face
definitionsThis block re-declares the Geist / Geist Mono faces that already live earlier in the sheet. Keeping both sets bloats the asset and makes future edits harder. Please consolidate to a single definition (and update the URLs there if needed) so the font setup has one source of truth.
docs/source/dataset_zoo/dataset_zoo_ecosystem/bdd100k.rst (1)
41-48
: Prefer HTTPS for BDD linksUpdate URLs to HTTPS for consistency and security.
- You can register at http://bdd-data.berkeley.edu in order to get links + You can register at https://bdd-data.berkeley.edu in order to get links - to download the data. + to download the data.-- Dataset source: http://bdd-data.berkeley.edu -+ Dataset source: https://bdd-data.berkeley.edu -- Dataset license: http://bdd-data.berkeley.edu/download.html -+ Dataset license: https://bdd-data.berkeley.edu/download.htmldocs/source/getting_started/manufacturing/03_clustering.ipynb (2)
1-1
: Potential path mismatch after unzipYou unzip
mvtec_ad.zip
but later read from/content/mvtec-ad
. Ensure the extracted folder name matches. Consider detecting it programmatically.from pathlib import Path !unzip -q mvtec_ad.zip root = next((p for p in Path(".").iterdir() if p.is_dir() and p.name.startswith("mvtec")), None) dataset_dir = str(root) if root else "/content/mvtec-ad" # fallback dataset_ = fo.Dataset.from_dir(dataset_dir=dataset_dir, dataset_type=fo.types.FiftyOneDataset)
1-1
: Reduce duplication and unused importsYou both
foz.load_zoo_model("open-clip-vit-base32-torch")
and laterfozm.load_zoo_model("clip-vit-base32-torch")
. Keep one. Also drop unusedfiftyone.utils.huggingface as fouh
if not used.docs/scripts/make_model_zoo_docs.py (4)
371-381
: Consider enabling Jinja2 autoescape (or explicitly disabling)RST ultimately renders to HTML. To avoid accidental HTML injection in descriptions, use
select_autoescape
for html/xml, or explicitly setautoescape=False
with a comment.-from jinja2 import Environment, BaseLoader +from jinja2 import Environment, BaseLoader, select_autoescape ... - environment = Environment( + environment = Environment( loader=BaseLoader, trim_blocks=True, lstrip_blocks=True, + autoescape=select_autoescape(default_for_string=False, default=False), )
382-389
: Cross‑platform and cleaner path handlingReplace manual string splitting with pathlib for portability; you already import Path.
- docs_dir = "/".join(os.path.realpath(__file__).split("/")[:-2]) - docs_source_dir = os.path.join(docs_dir, "source") - model_zoo_ecosystem_dir = os.path.join( - docs_source_dir, "model_zoo", "model_zoo_ecosystem" - ) + script_path = Path(__file__).resolve() + docs_dir = script_path.parents[1] + docs_source_dir = docs_dir / "source" + model_zoo_ecosystem_dir = docs_source_dir / "model_zoo" / "model_zoo_ecosystem" ... - os.makedirs(model_zoo_ecosystem_dir, exist_ok=True) + os.makedirs(str(model_zoo_ecosystem_dir), exist_ok=True)Also update later writes to
str(path)
.
404-412
: Use pathlib consistently for writesConsistent with the previous refactor; avoid mixing os.path with Path.
- cards_path = os.path.join(model_zoo_ecosystem_dir, "model_cards.rst") - etau.write_file("\n".join(model_cards_content), cards_path) + cards_path = model_zoo_ecosystem_dir / "model_cards.rst" + etau.write_file("\n".join(model_cards_content), str(cards_path)) ... - model_slug = model_name.replace(".", "_").replace("-", "_") - model_path = os.path.join(model_zoo_ecosystem_dir, f"{model_slug}.rst") - etau.write_file(model_content, model_path) + model_slug = model_name.replace(".", "_").replace("-", "_") + model_path = model_zoo_ecosystem_dir / f"{model_slug}.rst" + etau.write_file(model_content, str(model_path))
241-266
: Template conditionals test substrings in a comma‑joined string
'foo' in tags
does substring checks on a single string. Prefer passing atags_list
to the template and using membership against a list to avoid false positives.- content = template.render( + content = template.render( ... - tags=tags_str, + tags=tags_str, + tags_list=zoo_model.tags, ... )And in the template, change checks like
{% if 'clip' in tags %}
to{% if 'clip' in tags_list %}
.docs/source/_static/js/custom.js (3)
605-628
: Consider guarding against zero-width elements.The
compute
function usesgetBoundingClientRect()
but doesn't check if the button has rendered dimensions. If elements are hidden or not yet laid out,btnRect.width
could be zero, leading to incorrect CSS variable values.Apply this diff to add a safety check:
const compute = (btn) => { const arrow = btn.querySelector(".arrow"); const text = btn.querySelector(".text"); if (!arrow || !text) return; const btnRect = btn.getBoundingClientRect(); + if (btnRect.width === 0 || btnRect.height === 0) return; + const arrowRect = arrow.getBoundingClientRect(); const textRect = text.getBoundingClientRect();
614-621
: Extract magic numbers to named constants.The calculation uses several magic numbers (5, 30, 320, 0.5) that lack context. Consider extracting these to named constants for maintainability.
For example:
const ARROW_PADDING = 5; const MIN_TEXT_SHIFT = 30; const MAX_ARROW_SHIFT = 320; const TEXT_SHIFT_RATIO = 0.5; const spaceRight = btnRect.width - (arrowRect.left - btnRect.left) - arrowRect.width; const arrowShift = Math.max(0, spaceRight - ARROW_PADDING); const textShift = -Math.min( Math.max(arrowRect.width, MIN_TEXT_SHIFT), btnRect.width * TEXT_SHIFT_RATIO ); btn.style.setProperty( "--cta-arrow-shift", `${Math.min(arrowShift, MAX_ARROW_SHIFT)}px` );
637-637
: Consider debouncing the resize handler.The resize event can fire rapidly during window resizing, causing frequent recalculations. Consider debouncing or throttling the
updateAll
call for better performance.Example implementation:
let resizeTimeout; const debouncedUpdate = () => { clearTimeout(resizeTimeout); resizeTimeout = setTimeout(updateAll, 150); }; window.addEventListener("resize", debouncedUpdate);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (62)
docs/.gitignore
(1 hunks)docs/scripts/generate_plugin_docs.py
(3 hunks)docs/scripts/make_model_zoo_docs.py
(4 hunks)docs/source/_static/css/custom.css
(1 hunks)docs/source/_static/js/custom.js
(2 hunks)docs/source/_static/js/search.js
(1 hunks)docs/source/_static/js/tutorial-filters.js
(3 hunks)docs/source/assets/styles/components/_buttons.sass
(1 hunks)docs/source/assets/styles/components/_plugins-search.sass
(1 hunks)docs/source/assets/styles/layout/_main.sass
(1 hunks)docs/source/conf.py
(5 hunks)docs/source/custom_directives.py
(3 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_100.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_200.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/bdd100k.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech101.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech256.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/cifar10.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/cifar100.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/cityscapes.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2014.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2017.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/dataset_cards.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/fashion_mnist.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/fiw.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/hmdb51.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/imagenet_2012.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/imagenet_sample.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_400.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_600.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_700.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_700_2020.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti_multiview.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/lfw.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/mnist.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/open_images_v6.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/open_images_v7.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/places.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_3d.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_geo.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_groups.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_video.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/sama_coco.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/ucf101.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2007.rst
(1 hunks)docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2012.rst
(1 hunks)docs/source/dataset_zoo/index.rst
(1 hunks)docs/source/dataset_zoo/overview.rst
(1 hunks)docs/source/getting_started/manufacturing/01_intro.ipynb
(1 hunks)docs/source/getting_started/manufacturing/02_embeddings.ipynb
(1 hunks)docs/source/getting_started/manufacturing/03_clustering.ipynb
(1 hunks)docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb
(13 hunks)docs/source/getting_started/manufacturing/06_augmentation.ipynb
(6 hunks)docs/source/index.rst
(8 hunks)docs/source/model_zoo/index.rst
(1 hunks)docs/source/model_zoo/overview.rst
(1 hunks)docs/source/plugins/index.rst
(1 hunks)docs/source/recipes/index.rst
(2 hunks)docs/source/user_guide/basics.rst
(9 hunks)docs/source/user_guide/using_datasets.rst
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T19:53:08.432Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6337
File: docs/source/tutorials/dinov3.ipynb:121-124
Timestamp: 2025-09-10T19:53:08.432Z
Learning: The `foz.load_zoo_dataset()` function supports loading datasets from external sources including GitHub URLs (e.g., "https://github.com/voxel51/coco-2017"), which is different from loading built-in zoo datasets by name (e.g., "coco-2017"). GitHub URL loading allows access to custom datasets not available in the standard FiftyOne Dataset Zoo.
Applied to files:
docs/source/index.rst
docs/source/dataset_zoo/overview.rst
docs/source/dataset_zoo/index.rst
📚 Learning: 2025-09-04T19:49:48.247Z
Learnt from: brimoor
PR: voxel51/fiftyone#6317
File: fiftyone/utils/caltech.py:149-157
Timestamp: 2025-09-04T19:49:48.247Z
Learning: The Caltech 256 dataset has a static directory structure where all class directory names follow the format "XXX.category_name" (e.g., "001.ak47", "002.american-flag"), meaning all directory names contain dots and the split operation in the label normalization code is safe.
Applied to files:
docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech256.rst
🧬 Code graph analysis (3)
docs/source/_static/js/search.js (1)
docs/source/_static/js/tutorial-filters.js (1)
allButton
(180-182)
docs/source/conf.py (1)
docs/source/custom_directives.py (1)
CustomAnimatedCTADirective
(352-405)
docs/scripts/make_model_zoo_docs.py (1)
fiftyone/zoo/models/__init__.py (2)
get_zoo_model
(356-366)list_zoo_models
(36-73)
🪛 Pylint (3.3.8)
docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb
[refactor] 7-10: Unnecessary "elif" after "return", remove the leading "el" from "elif"
(R1705)
docs/source/getting_started/manufacturing/02_embeddings.ipynb
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
docs/source/getting_started/manufacturing/03_clustering.ipynb
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
docs/source/getting_started/manufacturing/01_intro.ipynb
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
🪛 Ruff (0.13.2)
docs/source/getting_started/manufacturing/04_custom_embeddings.ipynb
49-49: Redefinition of unused torch
from line 2
Remove definition: torch
(F811)
66-66: Redefinition of unused torch
from line 49
Remove definition: torch
(F811)
docs/source/custom_directives.py
371-375: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
docs/scripts/make_model_zoo_docs.py
373-373: By default, jinja2 sets autoescape
to False
. Consider using autoescape=True
or the select_autoescape
function to mitigate XSS vulnerabilities.
(S701)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (41)
docs/source/getting_started/manufacturing/06_augmentation.ipynb (1)
361-361
: Verify the nbformat_minor downgrade.The
nbformat_minor
has been changed from 2 to 0, which is a downgrade. This is unusual and could potentially affect compatibility with newer Jupyter features. Typically, this value increases as the notebook format evolves.Please verify this change is intentional and test that the notebook renders correctly in both Jupyter and Colab environments.
docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2012.rst (1)
1-65
: LGTM!The documentation structure is consistent, well-organized, and follows the standard format for dataset zoo pages. The content accurately describes the VOC-2012 dataset with appropriate details, backend requirements, and usage examples.
docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_video.rst (1)
1-52
: LGTM!The documentation is well-structured and includes all necessary information including the important ffmpeg requirement note. The format is consistent with other dataset zoo documentation pages.
docs/source/dataset_zoo/dataset_zoo_ecosystem/caltech256.rst (1)
1-52
: LGTM!The documentation is well-structured, accurate, and consistent with the standard format for dataset zoo pages. All sections are complete and properly formatted.
docs/source/dataset_zoo/dataset_zoo_ecosystem/lfw.rst (1)
1-51
: LGTM!The documentation is well-organized and provides comprehensive information about the LFW dataset. The structure is consistent with other dataset zoo documentation pages.
docs/scripts/generate_plugin_docs.py (2)
106-108
: LGTM!The
_generate_github_badge
method is simple, focused, and generates a properly formatted GitHub badge. The implementation is clean and serves its purpose well.
510-522
: LGTM!The badge generation is properly integrated into the documentation workflow. For community plugins, it's placed after the community note, and for other plugins, it's placed at the beginning. This provides consistent visual branding across all plugin documentation.
docs/source/dataset_zoo/dataset_zoo_ecosystem/cifar10.rst (1)
1-55
: LGTM!The CIFAR-10 documentation is well-structured and complete. It follows the standard format for dataset zoo pages and includes all necessary information including backend requirements and usage examples.
docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_600.rst (1)
1-173
: LGTM! Well-structured dataset documentation.The Kinetics-600 documentation page is comprehensive and follows best practices:
- Clear dataset overview and metadata
- Detailed partial download parameter documentation
- Practical usage examples in both Python and CLI
- Appropriate notes about video availability and download behavior
docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_3d.rst (1)
1-51
: LGTM! Clear and concise quickstart documentation.The Quickstart 3D documentation is well-suited for its purpose as an introductory dataset. The description clearly explains the data source, processing methods (Poisson disk sampling, convex hull bounding boxes), and provides straightforward usage examples.
docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_100.rst (1)
1-208
: LGTM! Comprehensive ActivityNet documentation.The ActivityNet 100 documentation effectively addresses the complexities of this dataset:
- Clear notes about video availability issues and YouTube/source file alternatives
- Detailed parameter documentation for partial downloads
- Helpful guidance on full split downloads with external form reference
- Well-organized example usage for both Python and CLI
docs/source/getting_started/manufacturing/01_intro.ipynb (2)
1-1
: Disregard static analysis configuration warnings.The Pylint warnings about unrecognized options and deprecated settings are configuration issues, not code problems. These don't affect the notebook's functionality and can be safely ignored.
1-1
: LGTM! Improved notebook with standardized naming and alternative download.The notebook updates improve consistency and usability:
- Standardized file naming with underscores (
mvtec_ad.zip
,mvtec_ad
)- Activated Google Drive download alternative for users who encounter Hugging Face issues
- Minor formatting improvements (port specification)
The changes maintain the tutorial's educational flow while providing better alternatives for dataset access.
docs/source/_static/js/tutorial-filters.js (2)
93-101
: LGTM! Flexible search input handling.The updates enable the filter system to work across different page types (plugins, models, datasets) by:
- Checking for multiple possible search input IDs
- Using optional chaining for safe element access
- Properly falling back to empty string when no input exists
The approach is consistent and maintains backward compatibility.
Also applies to: 171-174
150-151
: Simplified "All" button label is clearer.Changing from a dynamic count display to a static
"All"
label simplifies the UI and reduces unnecessary DOM updates. The count information is still visually available through the filtered cards themselves.docs/source/dataset_zoo/dataset_zoo_ecosystem/imagenet_2012.rst (1)
1-92
: LGTM! Clear documentation with appropriate manual download guidance.The ImageNet 2012 documentation effectively communicates:
- The manual download requirement with clear directory structure
- Registration process for obtaining download links
- Backend installation prerequisites (Torch or TensorFlow)
- Straightforward usage examples for both Python and CLI
The emphasis on manual steps is appropriate given ImageNet's access requirements.
docs/source/dataset_zoo/dataset_zoo_ecosystem/kinetics_400.rst (1)
1-179
: Well-documented dataset with important availability notes.The Kinetics-400 documentation is comprehensive and includes an important distinction about download behavior (no class-based zips, requiring YouTube downloads for partial loads). This helps users understand the limitations compared to other Kinetics versions.
docs/source/dataset_zoo/dataset_zoo_ecosystem/open_images_v6.rst (1)
216-223
: No action required: examples are consistent. Both the CLI and Python snippets use the supported plural label_types values (“detections”, “classifications”, “relationships”, “segmentations”), matching the parameter documentation.Likely an incorrect or invalid review comment.
docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2014.rst (1)
86-89
: Verify “attrs” parameter mentionDocs mention filtering on “attrs”, but the parameter isn’t described above. Confirm it’s supported by COCO2014Dataset/Importer; if not, remove or document it.
docs/source/dataset_zoo/dataset_zoo_ecosystem/fiw.rst (1)
3-142
: Well-structured and clear FIW docsAccurate overview, labels, examples, and utilities. LGTM.
docs/source/dataset_zoo/dataset_zoo_ecosystem/fashion_mnist.rst (1)
1-56
: LGTM!The documentation page follows the established pattern for dataset zoo entries and provides clear, accurate information about the Fashion-MNIST dataset. The structure, examples, and references are consistent with other dataset pages in this PR.
docs/source/dataset_zoo/overview.rst (1)
1-107
: LGTM!The overview page provides a clear introduction to the Dataset Zoo with well-structured sections and accurate code examples. The use of the new
customanimatedcta
directive aligns with the PR's goal of standardizing documentation CTAs.docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti_multiview.rst (1)
1-58
: LGTM!The KITTI Multiview dataset page is well-structured and provides comprehensive information about the dataset's multiview components, splits, and usage. The documentation is consistent with other dataset pages in the ecosystem.
docs/source/user_guide/basics.rst (1)
58-58
: LGTM!The systematic replacement of
custombutton
withcustomanimatedcta
aligns with the PR's goal of standardizing CTA buttons across documentation. All directive usages maintain their existing options and the changes are consistent throughout the file.Also applies to: 99-99, 212-212, 225-225, 241-241, 264-264, 336-336, 393-393, 432-432
docs/source/dataset_zoo/dataset_zoo_ecosystem/kitti.rst (1)
1-55
: LGTM!The KITTI dataset documentation is clear, well-organized, and provides all necessary information for users to load and use the dataset. The structure and examples are consistent with other dataset pages in the ecosystem.
docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_groups.rst (1)
1-48
: LGTM!The Quickstart Groups dataset page provides clear information about the grouped multimodal dataset. The documentation structure, examples, and metadata are consistent with other quickstart dataset pages in the ecosystem.
docs/source/assets/styles/components/_buttons.sass (2)
47-52
: LGTM!The CSS changes correctly integrate with the dynamic CTA JavaScript functionality by using CSS custom properties with appropriate fallback values. The fallbacks preserve the existing behavior when the JavaScript hasn't executed or the variables aren't set.
47-47
: Verify the fallback value precision.The fallback value
116.22px
for--cta-arrow-shift
is oddly specific. Ensure this value was measured from the existing design and not accidentally introduced with excessive precision.docs/source/dataset_zoo/dataset_zoo_ecosystem/quickstart_geo.rst (1)
1-46
: LGTM! Well-structured dataset documentation.The documentation follows a clear, consistent structure with appropriate sections (anchor, description, details, usage examples). The tabbed examples provide both Python and CLI usage patterns, which improves accessibility for different user workflows.
docs/source/dataset_zoo/dataset_zoo_ecosystem/dataset_cards.rst (1)
1-244
: LGTM! Comprehensive dataset card catalog.The dataset cards are well-organized with consistent structure across all entries. Each card provides essential metadata (header, description, link, image, tags) that enables effective browsing and discovery of available datasets.
docs/source/dataset_zoo/dataset_zoo_ecosystem/voc_2007.rst (1)
1-65
: LGTM! Clear and comprehensive dataset documentation.The documentation properly notes the test set limitation (no annotations) and backend requirements, which are important for users to understand before loading the dataset.
docs/source/custom_directives.py (3)
208-210
: Minor style adjustment in image formatting.The image formatting now includes explicit margin styling for better visual spacing in callout items.
240-249
: Enhanced CTA button with animated arrow.The updated template adds a dynamic arrow SVG and the
data-cta-dynamic="true"
attribute, which enables the animated hover effects described in the new CustomAnimatedCTADirective.
352-423
: Well-implemented animated CTA directive.The new
CustomAnimatedCTADirective
provides a clean interface for creating animated call-to-action buttons with flexible alignment options. The implementation correctly handles all directive options and generates appropriate HTML output.Note: The static analysis warning about
option_spec
being a mutable class attribute is a standard pattern for Sphinx directives and can be safely ignored in this context.docs/source/model_zoo/overview.rst (1)
1-240
: LGTM! Excellent model zoo overview documentation.The documentation provides a comprehensive introduction to the Model Zoo with:
- Clear organization of key concepts (built-in models, remote models, interface, API)
- Practical code examples demonstrating predictions, embeddings, and logits
- Effective use of the new
customanimatedcta
directive for navigation- Helpful notes about requirements and capabilities
docs/source/conf.py (4)
25-25
: Import added for new directive.The
CustomAnimatedCTADirective
is imported to enable registration in the Sphinx setup.
187-191
: Toctree configuration for ecosystem documentation.The
remove_from_toctrees
setting excludes ecosystem subdirectories from the main table of contents, which is appropriate for the card-based ecosystem structure where individual pages are accessed via cards rather than toctree navigation.
207-207
: New search functionality added.The
js/search.js
file is included to support enhanced search/filter capabilities for the ecosystem pages.
285-285
: New directive registered successfully.The
customanimatedcta
directive is properly registered with the Sphinx application, enabling its use throughout the documentation.docs/source/dataset_zoo/dataset_zoo_ecosystem/coco_2017.rst (1)
1-236
: LGTM! Comprehensive COCO-2017 documentation.This is excellent documentation that provides:
- Clear overview and context (including COCO team recommendation)
- Important notes about class counts and data availability
- Detailed partial download parameter descriptions
- Multiple practical examples for common use cases
- Both Python and CLI examples
The documentation enables users to efficiently work with this large dataset by understanding partial download options.
docs/source/dataset_zoo/dataset_zoo_ecosystem/activitynet_200.rst (1)
1-209
: LGTM! Thorough ActivityNet 200 documentation.The documentation effectively addresses the unique challenges of this dataset:
- Clear explanation of ActivityNet 100 vs 200 differences
- Important warnings about video availability and manual downloads
- Comprehensive parameter documentation for partial downloads
- Link to full split download instructions
- Practical examples demonstrating common workflows
The note about requiring ffmpeg installation is a helpful troubleshooting pointer.
I moved some commits into individual PRs to keep this PR focused on the dataset and model improvements |
…iles, avoiding add new files
Instead of duplicating content across multiple files, the last commit makes dataset.rst the source of truth. Dynamic individual files are now generated directly from it, ensuring consistency and reducing maintenance overhead as a temporal solution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/.gitignore
(1 hunks)docs/generate_docs.bash
(1 hunks)docs/scripts/make_dataset_zoo_docs.py
(1 hunks)docs/source/dataset_zoo/dataset.rst
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/.gitignore
🧰 Additional context used
🪛 Pylint (3.3.8)
docs/scripts/make_dataset_zoo_docs.py
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
🪛 Ruff (0.13.3)
docs/scripts/make_dataset_zoo_docs.py
34-34: Local variable name
is assigned to but never used
Remove assignment to unused variable name
(F841)
128-128: By default, jinja2 sets autoescape
to False
. Consider using autoescape=True
or the select_autoescape
function to mitigate XSS vulnerabilities.
(S701)
165-165: Loop control variable title
not used within loop body
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/scripts/make_dataset_zoo_docs.py
(1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.8)
docs/scripts/make_dataset_zoo_docs.py
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
🪛 Ruff (0.13.3)
docs/scripts/make_dataset_zoo_docs.py
34-34: Local variable name
is assigned to but never used
Remove assignment to unused variable name
(F841)
128-128: By default, jinja2 sets autoescape
to False
. Consider using autoescape=True
or the select_autoescape
function to mitigate XSS vulnerabilities.
(S701)
165-165: Loop control variable title
not used within loop body
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (3)
docs/scripts/make_dataset_zoo_docs.py (3)
128-131
: Static analysis autoescape warning is a false positive.The Ruff warning about missing
autoescape=True
is not applicable here since this script generates RST documentation files, not HTML. XSS vulnerabilities are not a concern for RST generation.
165-175
: Duplicate anchor issue has been resolved.The previous review concern about duplicate
.. _dataset-zoo-…
anchors causing Sphinx warnings has been properly addressed. Lines 168-172 now strip the original anchor from each generated dataset file, preventing duplicate target warnings.Based on past review comments.
74-114
: Verified description extraction heuristics. Sampled outputs show accurate and properly truncated descriptions across datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AdonaiVera this is looking awesome!!!
I explored the new layout on https://docs.dev.voxel51.com and just have some small cleanup-related feedback for you 🤓
General Points
- What do you think about increasing the number of cards per page from 6 to, say, 10 on all of our card lists? (tutorials, models, datasets, plugins, etc). 6 feels a bit too few to me
Dataset Zoo
Per page feedback:
/dataset_zoo/
index.html
rename anchor link to `dataset-zoo`
ensure breadcrumb at the top reads "[Home] > Dataset Zoo"
change main CTA text to "Learn more about the Dataset Zoo"
overview.html
rename page title to "Dataset Zoo Overview"
rename anchor link to `dataset-zoo-overview`
update link on "Explore the datasets in the zoo" button to `dataset_zoo/index.html`
ensure breadcrumb at the top reads "[Home] > Dataset Zoo > Dataset Zoo Overview"
dataset.html
find all instances of `dataset-zoo-datasets` anchor link throughout the docs and replace with `dataset-zoo`
delete this page, it is now deprecated
dataset_zoo_ecosystem/
rename this folder to datasets/
<dataset_name>/
move each dataset's anchor link from `dataset.rst` to the top of its new dedicated page
ensure breadcrumb at the top reads "[Home] > Dataset Zoo > <Dataset Name>"
Model Zoo
Per page feedback:
/model_zoo/
index.html
rename anchor link to `model-zoo`
ensure breadcrumb at the top reads "[Home] > Model Zoo"
change main CTA text to "Learn more about the Model Zoo"
overview.html
rename page title to "Model Zoo Overview"
rename anchor link to `model-zoo-overview`
update link on "Explore the models in the zoo" button to `model_zoo/index.html`
ensure breadcrumb at the top reads "[Home] > Model Zoo > Model Zoo Overview"
models.html
find all instances of `model-zoo-models` anchor link throughout the docs and replace with `model-zoo`
delete this page, it is now deprecated
model_zoo_ecosystem/
rename this folder to models/
<model_name>/
ensure breadcrumb at the top reads "[Home] > Model Zoo > <Model Name>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/source/_static/js/custom.js (2)
600-637
: Consider debouncing the resize listener for better performance.The
updateAll()
function performs layout calculations (getBoundingClientRect()
) for every CTA on each resize event. Without debouncing or throttling, frequent resize events (e.g., during window drag) could cause performance issues, especially if many CTAs exist on the page.Apply this pattern to debounce the resize handler:
+ let resizeTimeout; const updateAll = () => ctas.forEach(compute); if (document.readyState === "loading") { document.addEventListener("DOMContentLoaded", updateAll, { once: true }); } else { updateAll(); } - window.addEventListener("resize", updateAll); + window.addEventListener("resize", () => { + clearTimeout(resizeTimeout); + resizeTimeout = setTimeout(updateAll, 150); + });
614-627
: Consider documenting the magic numbers for maintainability.The calculations use several unexplained constants (5, 30, 0.5, 320). Adding brief inline comments or extracting them as named constants would improve code clarity.
Example:
+ // Buffer to prevent arrow from touching button edge + const ARROW_BUFFER = 5; + const MIN_TEXT_SHIFT = 30; + const MAX_TEXT_SHIFT_RATIO = 0.5; + const MAX_ARROW_SHIFT = 320; + const spaceRight = btnRect.width - (arrowRect.left - btnRect.left) - arrowRect.width; - const arrowShift = Math.max(0, spaceRight - 5); + const arrowShift = Math.max(0, spaceRight - ARROW_BUFFER); const textShift = -Math.min( - Math.max(arrowRect.width, 30), - btnRect.width * 0.5 + Math.max(arrowRect.width, MIN_TEXT_SHIFT), + btnRect.width * MAX_TEXT_SHIFT_RATIO ); btn.style.setProperty( "--cta-arrow-shift", - `${Math.min(arrowShift, 320)}px` + `${Math.min(arrowShift, MAX_ARROW_SHIFT)}px` );
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/source/_static/js/custom.js
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
docs/source/_static/js/custom.js (1)
600-637
: LGTM! The dynamic CTA calculation logic is correct.The function correctly computes shifts for arrow and text elements based on their bounding rectangles. The initialization pattern (checking
document.readyState
) ensures it works whether called before or after DOM load.
Thank you for the feedback, @brimoor ! 🙌 I applied all the feedback for the dataset, model, and the 10 cards in pagination. The only open point was about removing dataset.rst. My first idea was to split it into individual files per dataset, but that created 10k+ new lines just for moving content. Instead, I kept dataset.rst as the single source and generate the individual files only at build time (not committed to GitHub). This way, any change to a core dataset only needs to happen in dataset.rst, and it avoids flooding the repo with thousands of lines. It’s not the most scalable solution, but good enough until we move to a more robust source like an internal DB. One question: for model-zoo-models and dataset-zoo-datasets, I updated the names everywhere, but not in the release notes since that’s part of history. Should I update those paths there too, or leave them as is? You can see the changes here: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
docs/source/dataset_zoo/index.rst (1)
42-49
: Closetutorial-tags-container
before ending the nav.
<div class="tutorial-tags-container">
opens but never closes before</nav>
, leaving the DOM unbalanced and breaking downstream JS/CSS selectors. Add the missing</div>
just before</nav>
.<div id="dropdown-filter-tags"> <div class="tutorial-filter-menu"> <div class="tutorial-filter filter-btn all-tag-selected" data-tag="all">All</div> </div> </div> + </div> </nav>
docs/scripts/make_model_zoo_docs.py (1)
293-296
: Make the inline-link scrubber non-greedy.
re.sub(" <.*>", "", description)
grabs from the first<
through the last>
, erasing large chunks of text whenever multiple links/angle brackets appear. Limit the match to a single bracketed fragment.- description = re.sub(" <.*>", "", description) + description = re.sub(r" <[^>]+>", "", description)docs/source/model_zoo/index.rst (1)
42-49
: Closetutorial-tags-container
before</nav>
.You open
<div class="tutorial-tags-container">
but never close it before ending the<nav>
, leaving the markup unbalanced and breaking styling/scripts that rely on that wrapper. Insert the missing</div>
ahead of</nav>
.<div id="dropdown-filter-tags"> <div class="tutorial-filter-menu"> <div class="tutorial-filter filter-btn all-tag-selected" data-tag="all">All</div> </div> </div> + </div> </nav>
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
docs/.gitignore
(1 hunks)docs/scripts/make_dataset_zoo_docs.py
(1 hunks)docs/scripts/make_model_zoo_docs.py
(4 hunks)docs/source/_static/js/custom.js
(2 hunks)docs/source/_static/js/tutorial-filters.js
(4 hunks)docs/source/conf.py
(4 hunks)docs/source/dataset_zoo/index.rst
(1 hunks)docs/source/dataset_zoo/overview.rst
(1 hunks)docs/source/dataset_zoo/remote.rst
(1 hunks)docs/source/integrations/pytorch_hub.rst
(1 hunks)docs/source/model_zoo/index.rst
(1 hunks)docs/source/model_zoo/overview.rst
(1 hunks)docs/source/model_zoo/remote.rst
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/source/integrations/pytorch_hub.rst
🚧 Files skipped from review as they are similar to previous changes (4)
- docs/source/_static/js/custom.js
- docs/source/_static/js/tutorial-filters.js
- docs/source/model_zoo/overview.rst
- docs/source/dataset_zoo/overview.rst
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T19:53:08.432Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6337
File: docs/source/tutorials/dinov3.ipynb:121-124
Timestamp: 2025-09-10T19:53:08.432Z
Learning: The `foz.load_zoo_dataset()` function supports loading datasets from external sources including GitHub URLs (e.g., "https://github.com/voxel51/coco-2017"), which is different from loading built-in zoo datasets by name (e.g., "coco-2017"). GitHub URL loading allows access to custom datasets not available in the standard FiftyOne Dataset Zoo.
Applied to files:
docs/source/dataset_zoo/remote.rst
📚 Learning: 2025-10-03T13:44:19.839Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6373
File: docs/scripts/make_model_zoo_docs.py:289-292
Timestamp: 2025-10-03T13:44:19.839Z
Learning: In docs/scripts/make_model_zoo_docs.py, the generated model_cards.rst file is included from docs/source/model_zoo/index.rst using `.. include:: model_zoo_ecosystem/model_cards.rst`. Links within model_cards.rst should be relative to the includer's location (index.rst), so `model_zoo_ecosystem/{slug}.html` is the correct path format for linking to individual model pages.
Applied to files:
docs/scripts/make_model_zoo_docs.py
🧬 Code graph analysis (2)
docs/scripts/make_model_zoo_docs.py (1)
fiftyone/zoo/models/__init__.py (2)
get_zoo_model
(356-366)list_zoo_models
(36-73)
docs/source/conf.py (1)
docs/source/custom_directives.py (1)
CustomAnimatedCTADirective
(352-405)
🪛 Pylint (3.3.9)
docs/scripts/make_dataset_zoo_docs.py
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
🪛 Ruff (0.13.3)
docs/scripts/make_model_zoo_docs.py
373-373: By default, jinja2 sets autoescape
to False
. Consider using autoescape=True
or the select_autoescape
function to mitigate XSS vulnerabilities.
(S701)
docs/scripts/make_dataset_zoo_docs.py
127-127: By default, jinja2 sets autoescape
to False
. Consider using autoescape=True
or the select_autoescape
function to mitigate XSS vulnerabilities.
(S701)
164-164: Loop control variable title
not used within loop body
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/scripts/make_dataset_zoo_docs.py (1)
169-175
: Strip duplicated dataset anchors to avoid Sphinx target collisionsPer-dataset files currently include the original
.. _dataset-zoo-…
label from dataset.rst, yielding duplicate explicit targets and refs resolving to the old page. Remove the leading label (and following blank line) before writing.Apply:
- for slug, title, section_content in datasets: + for slug, title, section_content in datasets: link_slug = _normalize_slug(slug) dataset_path = out_dir / f"{link_slug}.rst" - section_lines = section_content.strip().splitlines() - dataset_path.write_text( - "\n".join(section_lines) + "\n", encoding="utf-8" - ) + section_lines = section_content.strip().splitlines() + # Drop duplicated explicit targets from the source sections + if section_lines and section_lines[0].startswith(".. _dataset-zoo-"): + section_lines = section_lines[1:] + if section_lines and not section_lines[0].strip(): + section_lines = section_lines[1:] + dataset_path.write_text("\n".join(section_lines) + "\n", encoding="utf-8")
🧹 Nitpick comments (2)
docs/scripts/make_dataset_zoo_docs.py (1)
121-127
: Make image directive regex accept optional leading slashSome sections may use
.. image:: images/dataset_zoo/...
(no leading slash). Accept both.- pattern = r"\.\. image::\s+/images/dataset_zoo/([^\s]+)" + pattern = r"\.\. image::\s+/?images/dataset_zoo/([^\s]+)"docs/scripts/make_model_zoo_docs.py (1)
384-389
: Use pathlib for robust, cross-platform paths; also removes unused import warningAvoid manual splitting and os.path juggling; use Path consistently.
- docs_dir = "/".join(os.path.realpath(__file__).split("/")[:-2]) - docs_source_dir = os.path.join(docs_dir, "source") - models_dir = os.path.join(docs_source_dir, "model_zoo", "models") - - os.makedirs(models_dir, exist_ok=True) + docs_dir = Path(__file__).resolve().parents[1] + docs_source_dir = docs_dir / "source" + models_dir = docs_source_dir / "model_zoo" / "models" + models_dir.mkdir(parents=True, exist_ok=True) @@ - cards_path = os.path.join(models_dir, "model_cards.rst") - etau.write_file("\n".join(model_cards_content), cards_path) + cards_path = models_dir / "model_cards.rst" + etau.write_file("\n".join(model_cards_content), str(cards_path)) @@ - model_path = os.path.join(models_dir, f"{model_slug}.rst}") - etau.write_file(model_content, model_path) + model_path = models_dir / f"{model_slug}.rst" + etau.write_file(model_content, str(model_path))Note: ensure the stray brace in the original f-string is corrected if present locally.
Also applies to: 404-405, 407-414
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/scripts/make_dataset_zoo_docs.py
(1 hunks)docs/scripts/make_model_zoo_docs.py
(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T13:44:19.839Z
Learnt from: AdonaiVera
PR: voxel51/fiftyone#6373
File: docs/scripts/make_model_zoo_docs.py:289-292
Timestamp: 2025-10-03T13:44:19.839Z
Learning: In docs/scripts/make_model_zoo_docs.py, the generated model_cards.rst file is included from docs/source/model_zoo/index.rst using `.. include:: model_zoo_ecosystem/model_cards.rst`. Links within model_cards.rst should be relative to the includer's location (index.rst), so `model_zoo_ecosystem/{slug}.html` is the correct path format for linking to individual model pages.
Applied to files:
docs/scripts/make_model_zoo_docs.py
🔇 Additional comments (2)
docs/scripts/make_model_zoo_docs.py (1)
289-299
: Good fix on greedy inline-link strippingSwitch to bracket-bounded regex for RST inline links is correct and avoids over-stripping.
docs/scripts/make_dataset_zoo_docs.py (1)
166-168
: Confirm include path matches link baseCards are written to
dataset_zoo/datasets/dataset_cards.rst
and links usedatasets/{slug}.html
. Ensure dataset.rst includes the cards fromdatasets/dataset_cards.rst
so links resolve relative to the includer.Run:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/source/model_zoo/api.rst (1)
22-28
: Tighten wording, pluralize, and add helpful cross-linksImprove clarity and discoverability; avoid "(s)". Link to ensure/install methods and CLI.
.. note:: - Some models require additional packages to be installed. If you do not have - a :ref:`required package <model-zoo-requirements>` installed when loading a - zoo model, you'll see an error message that will help you install the - necessary package(s). + Some models require additional packages. If any :ref:`required packages <model-zoo-requirements>` + are missing when you load a zoo model, you'll see an error with installation + instructions. See :ref:`Installing zoo model requirements <model-zoo-requirements>` to + pre-check or install via :meth:`ensure_zoo_model_requirements() <fiftyone.zoo.models.ensure_zoo_model_requirements>`, + :meth:`install_zoo_model_requirements() <fiftyone.zoo.models.install_zoo_model_requirements>`, + or the :ref:`CLI command <cli-fiftyone-zoo-models-requirements>`.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/source/dataset_zoo/api.rst
(1 hunks)docs/source/model_zoo/api.rst
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/source/dataset_zoo/api.rst
Based on our conversation, @brimoor, Now we have one file per dataset, I removed the script, and updated both the model-zoo-models and dataset-zoo-datasets for the new version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 21
🧹 Nitpick comments (23)
docs/source/dataset_zoo/datasets/kitti_multiview.rst (5)
24-26
: Use HTTPS and consider linking the license.
- Switch to HTTPS for the source URL.
- Optionally link the license to its canonical page for clarity.
Apply:
- - Dataset source: http://www.cvlibs.net/datasets/kitti + - Dataset source: https://www.cvlibs.net/datasets/kitti/ - - Dataset license: CC-BY-NC-SA-3.0 + - Dataset license: CC BY-NC-SA 3.0
27-28
: Align tags with the standardized taxonomy.“detection” is ambiguous. Consider aligning with project-wide tags (e.g., “object-detection”, “3d-detection”, “stereo”/“multiview”) for consistent filtering/search.
Please confirm which canonical tags are being used across the new zoo pages and update here accordingly.
29-31
: Verify the ZooDataset class import path resolves.Ensure Sphinx resolves this cross-ref. Dataset classes typically aren’t under
.base
. If needed, update to the correct module path and consider~
to shorten display.Example (adjust as appropriate):
:class:
~fiftyone.zoo.datasets.kitti.KITTIMultiviewDataset``
56-58
: Verify image asset exists and builds cleanly.Confirm
/images/dataset_zoo/kitti-multiview-train.png
is present and referenced correctly (path, case). Consider adding a width if needed for layout consistency.
18-20
: Optional: link directly to the KITTI devkit page.Hyperlink “object development kit on the KITTI homepage” to the specific devkit/README page for faster navigation.
docs/source/dataset_zoo/datasets/quickstart_geo.rst (3)
8-10
: Add source/citation (and license) for BDD100K.Attribute the upstream dataset and link to its homepage; optionally mention license/terms to avoid ambiguity.
Apply:
-The dataset consists of 500 images from the validation split of the BDD100K -dataset in the New York City area with object detections and GPS timestamps. +The dataset consists of 500 images from the validation split of the +`BDD100K <https://bdd-data.berkeley.edu/>`_ dataset in the New York City +area with object detections and GPS timestamps. + +Source: BDD100K (validation split subset). Please refer to the BDD100K +license/terms of use on the linked site.
14-14
: Avoid exact size; prefer approximate to prevent drift.-- Dataset size: 33.50 MB +- Dataset size: ~33.5 MB
16-16
: Clarify “Supported splits” vs text mentioning validation.“Supported splits: N/A” can read as contradictory. Clarify that the zoo dataset is a single combined set without split options.
-- Supported splits: ``N/A`` +- Supported splits: ``none (single combined set)``docs/source/dataset_zoo/datasets/ucf101.rst (3)
48-54
: Set explicit dataset name for consistency with CLI.Without a name, the dataset’s actual name may vary. Align with CLI examples.
- dataset = foz.load_zoo_dataset("ucf101", split="test") + dataset = foz.load_zoo_dataset( + "ucf101", split="test", dataset_name="ucf101-test" + )
59-65
: Make CLI deterministic by naming the dataset.Ensure subsequent commands reference the exact dataset.
- fiftyone zoo datasets load ucf101 --split test + fiftyone zoo datasets load ucf101 --split test --dataset-name ucf101-test
6-9
: Consistency: use “dataset” (one word).The text mixes “data set” and “dataset”. Standardize to “dataset”.
Example edits (apply across the section):
- UCF101 is an action recognition data set + UCF101 is an action recognition dataset - This data set is an extension + This dataset is an extension - data sets are not realistic + datasets are not realisticAlso applies to: 14-16
docs/source/dataset_zoo/datasets/voc_2012.rst (1)
20-20
: Optional: mark size as approximate.Dataset downloads vary; consider “~3.6 GB” to avoid false precision.
docs/source/dataset_zoo/datasets/kinetics_400.rst (2)
75-77
: Use double backticks for inline code literalsSingle backticks can create interpreted text; prefer double backticks for code/param names for consistency with the rest of the page.
Apply diffs like:
-By default, `multiprocessing.cpu_count()` is used +By default, ``multiprocessing.cpu_count()`` is used-... whenever either `classes` or `max_samples` is provided ... +... whenever either ``classes`` or ``max_samples`` is provided ...-Videos that contain all `classes` will be prioritized first, followed +Videos that contain all ``classes`` will be prioritized first, followedAlso applies to: 90-92, 125-128
21-21
: Stats/size accuracy: add “approx.” and verify sourcesNumbers look internally consistent (370+56+30=456 GB). Consider marking sizes as “approx.” and confirm counts for “Original” vs “CVDF” splits match the referenced sources.
Example:
-- Dataset size: 456 GB +- Dataset size: ~456 GBAlso applies to: 27-38, 41-44
docs/source/dataset_zoo/datasets/cifar100.rst (1)
21-23
: Standardize ZooDataset links (use base module) and keep backend note in textOther pages link to base.* classes. Recommend linking to the base class once and keeping the TF/Torch backend availability in prose to avoid crossref breaks.
- - :class:`CIFAR100Dataset <fiftyone.zoo.datasets.tf.CIFAR100Dataset>` (TF backend) - - :class:`CIFAR100Dataset <fiftyone.zoo.datasets.torch.CIFAR100Dataset>` (Torch backend) + - :class:`CIFAR100Dataset <fiftyone.zoo.datasets.base.CIFAR100Dataset>` (available for Torch and TensorFlow backends)Please confirm the tf/torch module paths resolve in Sphinx; if they do, feel free to keep both, but be consistent with other pages.
docs/source/dataset_zoo/datasets/coco_2017.rst (1)
31-33
: Prefer HTTPS for external linksUse HTTPS for COCO source.
-- Dataset source: http://cocodataset.org/#home +- Dataset source: https://cocodataset.org/#homedocs/source/dataset_zoo/datasets/bdd100k.rst (1)
41-49
: Use HTTPS and current URLsPrefer HTTPS for registration/source links.
- You can register at http://bdd-data.berkeley.edu in order to get links + You can register at https://bdd-data.berkeley.edu in order to get links ... -- Dataset source: http://bdd-data.berkeley.edu -- Dataset license: http://bdd-data.berkeley.edu/download.html +- Dataset source: https://bdd-data.berkeley.edu +- Dataset license: https://bdd-data.berkeley.edu/download.htmlConsider whether https://www.bdd100k.com/ should be used instead.
docs/source/dataset_zoo/datasets/coco_2014.rst (5)
31-31
: Use HTTPS for source linkSwitch to HTTPS.
-- Dataset source: http://cocodataset.org/#home +- Dataset source: https://cocodataset.org/#home
56-60
: Minor consistency: split orderingEarlier you list “train, validation, test”. Mirror that here for consistency.
- ``("train", "test", "validation")``. If neither is provided, all available + ``("train", "validation", "test")``. If neither is provided, all available
81-85
: Clarify include_license values (booleans vs strings)Represent booleans as booleans, not quoted strings, and simplify wording.
- - ``"False"`` (default): don't load the license - - ``True``/``"name"``: store the string license name + - ``False`` (default): don't load the license + - ``True`` or ``"name"``: store the string license name - ``"id"``: store the integer license ID - ``"url"``: store the license URL
86-89
: Avoid referencing undefined parameterattrs
Either document an
attrs
parameter above, or remove it here. Quick fix below removes it.-- **only_matching** (*False*): whether to only load labels that match the - ``classes`` or ``attrs`` requirements that you provide (True), or to load +- **only_matching** (*False*): whether to only load labels that match the + ``classes`` requirements that you provide (True), or to load
6-11
: Clarify scope (captions not included by this loader)You mention captioning in COCO, but this loader covers detections/segmentations. Add a sentence to avoid confusion.
-This version contains images, bounding boxes, and segmentations for the 2014 -version of the dataset. +This version contains images, bounding boxes, and segmentations for the 2014 +version of the dataset. Captions are not included in this loader.docs/source/dataset_zoo/datasets/open_images_v7.rst (1)
164-166
: PEP 8 keyword arg spacing in exampleOptional: remove spaces around '=' in keyword args.
- classes = ["Tortoise", "Sea turtle"], + classes=["Tortoise", "Sea turtle"],
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (38)
docs/.gitignore
(1 hunks)docs/source/dataset_zoo/datasets/activitynet_100.rst
(1 hunks)docs/source/dataset_zoo/datasets/activitynet_200.rst
(1 hunks)docs/source/dataset_zoo/datasets/bdd100k.rst
(1 hunks)docs/source/dataset_zoo/datasets/caltech101.rst
(1 hunks)docs/source/dataset_zoo/datasets/caltech256.rst
(1 hunks)docs/source/dataset_zoo/datasets/cifar10.rst
(1 hunks)docs/source/dataset_zoo/datasets/cifar100.rst
(1 hunks)docs/source/dataset_zoo/datasets/cityscapes.rst
(1 hunks)docs/source/dataset_zoo/datasets/coco_2014.rst
(1 hunks)docs/source/dataset_zoo/datasets/coco_2017.rst
(1 hunks)docs/source/dataset_zoo/datasets/dataset_cards.rst
(1 hunks)docs/source/dataset_zoo/datasets/fashion_mnist.rst
(1 hunks)docs/source/dataset_zoo/datasets/fiw.rst
(1 hunks)docs/source/dataset_zoo/datasets/hmdb51.rst
(1 hunks)docs/source/dataset_zoo/datasets/imagenet_2012.rst
(1 hunks)docs/source/dataset_zoo/datasets/imagenet_sample.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_400.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_600.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_700.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
(1 hunks)docs/source/dataset_zoo/datasets/kitti.rst
(1 hunks)docs/source/dataset_zoo/datasets/kitti_multiview.rst
(1 hunks)docs/source/dataset_zoo/datasets/lfw.rst
(1 hunks)docs/source/dataset_zoo/datasets/mnist.rst
(1 hunks)docs/source/dataset_zoo/datasets/open_images_v6.rst
(1 hunks)docs/source/dataset_zoo/datasets/open_images_v7.rst
(1 hunks)docs/source/dataset_zoo/datasets/places.rst
(1 hunks)docs/source/dataset_zoo/datasets/quickstart.rst
(1 hunks)docs/source/dataset_zoo/datasets/quickstart_3d.rst
(1 hunks)docs/source/dataset_zoo/datasets/quickstart_geo.rst
(1 hunks)docs/source/dataset_zoo/datasets/quickstart_groups.rst
(1 hunks)docs/source/dataset_zoo/datasets/quickstart_video.rst
(1 hunks)docs/source/dataset_zoo/datasets/sama_coco.rst
(1 hunks)docs/source/dataset_zoo/datasets/ucf101.rst
(1 hunks)docs/source/dataset_zoo/datasets/voc_2007.rst
(1 hunks)docs/source/dataset_zoo/datasets/voc_2012.rst
(1 hunks)docs/source/release-notes.rst
(3 hunks)
✅ Files skipped from review due to trivial changes (22)
- docs/source/dataset_zoo/datasets/cityscapes.rst
- docs/source/dataset_zoo/datasets/quickstart.rst
- docs/source/dataset_zoo/datasets/imagenet_sample.rst
- docs/source/dataset_zoo/datasets/kitti.rst
- docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
- docs/source/dataset_zoo/datasets/activitynet_200.rst
- docs/source/dataset_zoo/datasets/fiw.rst
- docs/source/dataset_zoo/datasets/kinetics_700.rst
- docs/source/dataset_zoo/datasets/voc_2007.rst
- docs/source/dataset_zoo/datasets/cifar10.rst
- docs/source/dataset_zoo/datasets/quickstart_3d.rst
- docs/source/dataset_zoo/datasets/imagenet_2012.rst
- docs/source/dataset_zoo/datasets/kinetics_600.rst
- docs/source/dataset_zoo/datasets/lfw.rst
- docs/source/dataset_zoo/datasets/places.rst
- docs/source/dataset_zoo/datasets/quickstart_groups.rst
- docs/source/dataset_zoo/datasets/activitynet_100.rst
- docs/source/dataset_zoo/datasets/fashion_mnist.rst
- docs/source/dataset_zoo/datasets/mnist.rst
- docs/source/release-notes.rst
- docs/source/dataset_zoo/datasets/caltech101.rst
- docs/source/dataset_zoo/datasets/quickstart_video.rst
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-04T19:49:48.247Z
Learnt from: brimoor
PR: voxel51/fiftyone#6317
File: fiftyone/utils/caltech.py:149-157
Timestamp: 2025-09-04T19:49:48.247Z
Learning: The Caltech 256 dataset has a static directory structure where all class directory names follow the format "XXX.category_name" (e.g., "001.ak47", "002.american-flag"), meaning all directory names contain dots and the split operation in the label normalization code is safe.
Applied to files:
docs/source/dataset_zoo/datasets/caltech256.rst
🔇 Additional comments (27)
docs/.gitignore (1)
1-1
: Good update; also ignore generated dataset pages and verify no tracked files remainThe models directory ignore is correct. Since dataset pages are also generated (not committed), add the datasets directory too, and ensure no already-tracked files remain under these paths.
Apply this diff:
source/model_zoo/models/ +source/dataset_zoo/datasets/
Run to verify paths and tracked files:
docs/source/dataset_zoo/datasets/kitti_multiview.rst (1)
50-55
: Confirm CLI-created dataset name.Does
fiftyone zoo datasets load kitti-multiview --split train
createkitti-multiview-train
by default in the new flow? If naming changed, update the launch command accordingly.docs/source/dataset_zoo/datasets/quickstart_geo.rst (3)
24-43
: Tabs directive compatibility check.Using
.. tabs::
with.. group-tab::
depends on the sphinx-tabs extension. Ensure it’s enabled and consistent across docs (or switch to.. tab::
if that’s the project standard).
18-18
: Verify API reference target for ZooDataset class.Confirm that
fiftyone.zoo.datasets.base.QuickstartGeoDataset
is the correct import path documented in API docs; otherwise the link will break.Run:
44-46
: Verify image presence and improve alt text for accessibility.
- Ensure the image file exists in the repo.
- Alt text should describe the content, not the filename.
-.. image:: /images/dataset_zoo/quickstart-geo.png - :alt: quickstart-geo +.. image:: /images/dataset_zoo/quickstart-geo.png + :alt: FiftyOne App showing the Quickstart Geo dataset (NYC clips with detections) :align: centerdocs/source/dataset_zoo/datasets/ucf101.rst (4)
68-69
: Verify :ref: target exists.Confirm the “troubleshooting-video” label is defined to avoid a broken link.
78-80
: Confirm image asset exists and path is correct.Ensure /images/dataset_zoo/ucf101-test.png is added in this PR or already present to prevent a broken image.
37-55
: Tab directive name likely incorrect (“group-tab”).Sphinx typically uses “tab” (sphinx-tabs) or “tab-item” (sphinx-design). “group-tab” may not render.
Apply this diff if using sphinx-tabs:
-.. tabs:: - - .. group-tab:: Python +.. tabs:: + + .. tab:: PythonAnd similarly on Line 55:
- .. group-tab:: CLI + .. tab:: CLIIf using sphinx-design, use:
.. tab-set:: .. tab-item:: Python .. tab-item:: CLI
27-31
: Fix license; verify source URL and splits; qualify size.
- UCF101 is not CC0; it uses custom/non-commercial terms. Update license text.
- Please verify the source URL (the official page is typically under /data/UCF101.php).
- Confirm supported splits exposed by the zoo (train/test, and whether “all” is allowed).
- Prefer “Approx.” for size, as mirrors/compression vary.
Apply this diff:
- - Dataset source: https://www.crcv.ucf.edu/research/data-sets/ucf101 - - Dataset license: CC0-1.0 - - Dataset size: 6.48 GB + - Dataset source: https://www.crcv.ucf.edu/research/data-sets/ucf101 + - Dataset license: Custom (see source terms) + - Dataset size: Approx. 6.48 GBdocs/source/dataset_zoo/datasets/voc_2012.rst (2)
22-22
: Confirm “validation” split alias is supported for VOC.Many VOC resources use “val”. Please confirm the loader and CLI accept “validation”. If not, update docs to “val”.
Run:
Also applies to: 51-52, 57-61
63-66
: Verify image asset exists.Ensure /images/dataset_zoo/voc-2012-validation.png is present so the page builds without broken images.
Run:
docs/source/dataset_zoo/datasets/kinetics_400.rst (2)
24-26
: Verify Sphinx target for ZooDataset classEnsure this cross‑reference resolves. The dataset class may not live under
fiftyone.zoo.datasets.base
. If it 404s, consider linking to the public API or replacing with a literal and linking toload_zoo_dataset
.Suggestion:
-:class:`Kinetics400Dataset <fiftyone.zoo.datasets.base.Kinetics400Dataset>` +:class:`Kinetics400Dataset <fiftyone.zoo.datasets.Kinetics400Dataset>`or
See :func:`fiftyone.zoo.datasets.load_zoo_dataset` with name ``"kinetics-400"``.
155-160
: CLI kwargs and dataset name: verify syntax and output nameConfirm
--kwargs max_samples=10
is valid. In many FO docs, JSON is passed to--kwargs
.If needed, update to:
Also verify the auto‑generated dataset name used in
fiftyone app launch
(e.g.,kinetics-400-validation-10
) matches actual CLI behavior.docs/source/dataset_zoo/datasets/hmdb51.rst (1)
16-18
: Verify supported splits“other” is unusual for HMDB51. Are the supported splits exactly train/test (and possibly validation)? Please confirm and update.
docs/source/dataset_zoo/datasets/caltech256.rst (1)
1-53
: LGTMContent, structure, and examples look correct.
docs/source/dataset_zoo/datasets/dataset_cards.rst (1)
1-245
: Fix card links: remove redundant “datasets/” prefixThis file lives in dataset_zoo/datasets/, but links use “datasets/.html”, which will likely resolve to dataset_zoo/datasets/datasets/.html. Use the page filename only (e.g., cifar100.html), or an absolute path (/dataset_zoo/datasets/cifar100.html) depending on how customcarditem resolves links.
Examples (apply globally):
- :link: datasets/activitynet_100.html + :link: activitynet_100.html ... - :link: datasets/coco_2017.html + :link: coco_2017.html ... - :link: datasets/hmdb51.html + :link: hmdb51.htmlIf the directive expects site-root paths, use “/dataset_zoo/datasets/.html” instead. Please confirm against the docs preview.
docs/source/dataset_zoo/datasets/open_images_v6.rst (1)
31-31
: Verify cross-references and assets resolve (dataset class, importer, image).Please confirm these exist and resolve in the built docs:
- :class:
OpenImagesV6Dataset <fiftyone.zoo.datasets.base.OpenImagesV6Dataset>
- :class:
OpenImagesV6DatasetImporter <fiftyone.utils.openimages.OpenImagesV6DatasetImporter>
- Image file: /images/dataset_zoo/open-images-v6.png
Run:
Also applies to: 111-116, 237-239
docs/source/dataset_zoo/datasets/coco_2014.rst (5)
34-34
: Verify dataset size figure37.57 GB may vary depending on what’s counted (images vs +annotations). Confirm or note “~37.6 GB”.
180-233
: CLI dataset names and flags: please verifyConfirm the default dataset names (e.g., coco-2014-validation-50) and the use of --kwargs are accurate for current CLI behavior.
1-5
: Confirm commit vs build-time generationPR notes per-dataset pages are generated at build time and not committed. Is this file intended to be committed, or should it be generated from the single source (dataset.rst)? Align with the new process.
37-38
: Cross-reference targets may be incorrectValidate the module paths for COCO2014Dataset and COCODetectionDatasetImporter. Update if needed.
Also applies to: 108-113
234-236
: Verify image asset existsEnsure /images/dataset_zoo/coco-2014-validation.png is present in the repo.
docs/source/dataset_zoo/datasets/open_images_v7.rst (5)
30-32
: Verify class paths resolve (linking/accuracy)Confirm these symbols and module paths exist; adjust if needed (e.g., module moved).
Also applies to: 113-116
19-21
: Check target anchor exists for :ref:open-images
Avoid broken cross-refs.
237-240
: Ensure image asset existsPrevent broken image in docs.
36-38
: No edits needed: image rescaling statement is accurateThe FiftyOne Open Images V7 dataset downloads all images rescaled so their largest side is at most 1024 px.
41-44
: Confirm split counts and sizes
Official V7 detection/bounding‐box subset: Train 1,743,042 images (513 GB), Validation 41,620 images (12 GB), Test 125,436 images (36 GB).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/source/dataset_zoo/datasets/coco_2017.rst (2)
81-84
: Document the actual boolean default forinclude_license
Line 82 still quotes
"False"
and hides theTrue
alias ambiguity. Please state the boolean default plainly and note thatTrue
is an alias for"name"
so users don’t pass string values that won’t work.- - ``"False"`` (default): don't load the license - - ``True``/``"name"``: store the string license name + - ``False`` (default): don't load the license + - ``True`` (alias for ``"name"``): store the string license name
99-103
: Fix “labels types” typoLine 102 still says “labels types”; please change it to “label types” for clarity.
- labels types or classes. The actual number of samples loaded may be less + label types or classes. The actual number of samples loaded may be less
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
docs/source/dataset_zoo/datasets/coco_2014.rst
(1 hunks)docs/source/dataset_zoo/datasets/coco_2017.rst
(1 hunks)docs/source/dataset_zoo/datasets/dataset_cards.rst
(1 hunks)docs/source/dataset_zoo/datasets/hmdb51.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_400.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_600.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_700.rst
(1 hunks)docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
(1 hunks)docs/source/dataset_zoo/datasets/open_images_v7.rst
(1 hunks)docs/source/dataset_zoo/datasets/sama_coco.rst
(1 hunks)docs/source/integrations/coco.rst
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/source/dataset_zoo/datasets/open_images_v7.rst
🚧 Files skipped from review as they are similar to previous changes (5)
- docs/source/dataset_zoo/datasets/kinetics_700_2020.rst
- docs/source/dataset_zoo/datasets/sama_coco.rst
- docs/source/dataset_zoo/datasets/hmdb51.rst
- docs/source/dataset_zoo/datasets/kinetics_600.rst
- docs/source/dataset_zoo/datasets/dataset_cards.rst
🔇 Additional comments (1)
docs/source/dataset_zoo/datasets/kinetics_700.rst (1)
24-26
: Fix the ZooDataset class cross-reference
Kinetics700Dataset
is defined infiftyone.zoo.datasets.kinetics
, not under.base
, so the current target renders a broken link. Please point the:class:
directive at the module that actually defines the class to keep the docs intact.⛔ Skipped due to learnings
Learnt from: brimoor PR: voxel51/fiftyone#6373 File: docs/source/dataset_zoo/datasets/ucf101.rst:32-0 Timestamp: 2025-10-16T17:24:11.562Z Learning: In FiftyOne documentation, dataset zoo classes follow the pattern `fiftyone.zoo.datasets.base.{DatasetName}Dataset` and should be referenced using `:class:` directives (e.g., `:class:`UCF101Dataset <fiftyone.zoo.datasets.base.UCF101Dataset>``). This is a valid and established documentation pattern.
…m/voxel51/fiftyone into docs/unified_model_dataset_plugin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
What changes are proposed in this pull request?
This PR introduces a series of improvements across the plugins, docs, and zoo sections to improve consistency, usability, and user experience:
list_zoo_datasets
to dynamically build the pages. For this first version, the manual approach was chosen to preserve extra metadata provided in each dataset page.How is this patch tested? If it is not, please explain why.
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
This release improves the overall usability of the plugin ecosystem, documentation, and zoo pages by:
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesSummary by CodeRabbit
Release Notes
New Features
Documentation
Style