shoyer
Member

@shoyer shoyer commented Sep 24, 2025

The current DataTree repr never expands sub-groups, and shows all DataTree attributes even if they are empty. This makes it annoying to visualize nested DataTrees, especially if the top levels are just splitting into sub-groups.

This PR hides missing sections, and as a simple heuristic, automatically expands sub-groups if there are no data variables associated with a group level. I've also added a 📁 icon to make groups more visually distinct.
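
For readers skimming the thread, the two rules described above can be sketched in plain Python. This is only an illustration of the heuristic as worded in this description, not the code in the PR: the `Group` class and the `visible_sections` and `expand_children` helpers are hypothetical names that do not exist in xarray.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical stand-in for one tree node (not xarray's DataTree)."""

    name: str
    coords: dict = field(default_factory=dict)
    data_vars: dict = field(default_factory=dict)
    attrs: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)


def visible_sections(group):
    """Return the titles of the non-empty sections; empty ones are hidden."""
    sections = {
        "Coordinates": group.coords,
        "Data variables": group.data_vars,
        "Attributes": group.attrs,
    }
    return [title for title, content in sections.items() if content]


def expand_children(group):
    """Expand sub-groups by default only if this level has no data variables."""
    return len(group.data_vars) == 0


# Mirror the shape of the example tree: the root only carries a coordinate,
# while /weather has data variables of its own.
temperature = Group("temperature", data_vars={"air_temperature": 4, "dewpoint": 5})
weather = Group(
    "weather",
    coords={"station": list("abcdef")},
    data_vars={"wind_speed": 2, "pressure": 3},
    children={"temperature": temperature},
)
root = Group("/", coords={"time": ["2022-01", "2023-01"]}, children={"weather": weather})

print(visible_sections(root), expand_children(root))
print(visible_sections(weather), expand_children(weather))
```

Under these assumptions the root shows only its coordinates and opens its sub-groups, while `/weather` shows its own variables and leaves `/weather/temperature` collapsed.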

Using one of the examples from the docs:

Before:
image

After:
image

import xarray as xr
import numpy as np

# Set up coordinates
time = xr.DataArray(data=["2022-01", "2023-01"], dims="time")
stations = xr.DataArray(data=list("abcdef"), dims="station")
lon = [-100, -80, -60]
lat = [10, 20, 30]

# Set up fake data
wind_speed = xr.DataArray(np.ones((2, 6)) * 2, dims=("time", "station"))
pressure = xr.DataArray(np.ones((2, 6)) * 3, dims=("time", "station"))
air_temperature = xr.DataArray(np.ones((2, 6)) * 4, dims=("time", "station"))
dewpoint = xr.DataArray(np.ones((2, 6)) * 5, dims=("time", "station"))
infrared = xr.DataArray(np.ones((2, 3, 3)) * 6, dims=("time", "lon", "lat"))
true_color = xr.DataArray(np.ones((2, 3, 3)) * 7, dims=("time", "lon", "lat"))

dt2 = xr.DataTree.from_dict(
    {
        "/": xr.Dataset(
            coords={"time": time},
        ),
        "/weather": xr.Dataset(
            coords={"station": stations},
            data_vars={
                "wind_speed": wind_speed,
                "pressure": pressure,
            },
        ),
        "/weather/temperature": xr.Dataset(
            data_vars={
                "air_temperature": air_temperature,
                "dewpoint": dewpoint,
            },
        ),
        "/satellite": xr.Dataset(
            coords={"lat": lat, "lon": lon},
            data_vars={
                "infrared": infrared,
                "true_color": true_color,
            },
        ),
    },
)
dt2
  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

@shoyer
Member Author

shoyer commented Sep 24, 2025

CC folks who may have opinions/ideas here: @jsignell @eni-awowale @TomNicholas

@dcherian
Contributor

dcherian commented Sep 25, 2025

This PR hides missing sections, and as a simple heuristic, automatically expands sub-groups if there are no data variables associated with a group level.

I like all this.

Shall we add an emoji to indicate a group and separate it from the remaining sections, so 📁 satellite and 📁 weather? Using bold black for the text might be enough too.

@jsignell
Contributor

Yeah I think this is much better. It still uses a lot of vertical space and is fairly information sparse but this is an improvement!

Contributor

@jsignell jsignell left a comment

I like it!

@shoyer
Member Author

shoyer commented Sep 26, 2025

I did a bigger clean-up of the DataTree HTML code, adding in the 📁 icon. I ended up deleting a handful of very fragile mock tests.

I tried to make the folder icon / path name clickable to hide individual groups. I was able to get something working with the aid of AI code generation, but couldn't get it looking right after an hour of fiddling, so I'll let someone with better CSS skills figure that out later.

@TomNicholas
Member

This looks great, but if we're going to hide empty sections, shouldn't we also do that for Dataset and so on?

@shoyer
Member Author

shoyer commented Sep 26, 2025

This looks great, but if we're going to hide empty sections, shouldn't we also do that for Dataset and so on?

Probably yes?

This would make text and HTML reprs more aligned.

It's less of an issue for Dataset, because Dataset does not have a recursive repr.

@shoyer
Member Author

shoyer commented Sep 26, 2025

This looks great, but if we're going to hide empty sections, shouldn't we also do that for Dataset and so on?

I've made this change for Dataset and DataArray, too. I'm now using the exact same logic for deciding to show sections between the text and HTML reprs, and added some (non-exhaustive) tests.

Before:
image

After:
image
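
A rough sketch of what "the exact same logic" for both reprs could look like: a single filter that drops empty sections, called by a text renderer and an HTML renderer. The function names and the markup are made up for illustration and are not taken from xarray's formatting code.

```python
from html import escape


def nonempty_sections(sections):
    """Shared rule: keep only sections that have at least one entry."""
    return [(title, items) for title, items in sections if items]


def text_repr(sections):
    """Plain-text rendering built on the shared filter."""
    lines = []
    for title, items in nonempty_sections(sections):
        lines.append(f"{title}:")
        lines.extend(f"    {name}" for name in items)
    return "\n".join(lines)


def html_repr(sections):
    """HTML rendering built on the same filter, so both reprs agree."""
    parts = []
    for title, items in nonempty_sections(sections):
        rows = "".join(f"<li>{escape(name)}</li>" for name in items)
        parts.append(f"<section><h4>{escape(title)}</h4><ul>{rows}</ul></section>")
    return "".join(parts)


# A Dataset-like object with one coordinate and nothing else.
sections = [("Coordinates", ["time"]), ("Data variables", []), ("Attributes", [])]
print(text_repr(sections))
print(html_repr(sections))
```

Because both renderers go through `nonempty_sections`, a section hidden in the text repr cannot still appear in the HTML repr, which is the alignment discussed above.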

Member

@TomNicholas TomNicholas left a comment

Amazing!

@shoyer shoyer changed the title from "Improve display of DataTree HTML repr" to "Improve display of HTML reprs" Sep 28, 2025
@shoyer shoyer merged commit c703ce4 into pydata:main Sep 28, 2025
36 checks passed
@shoyer shoyer deleted the better-tree-repr branch September 28, 2025 19:08
@shoyer
Member Author

shoyer commented Sep 28, 2025

#10795 removes some of the unnecessary vertical whitespace from Jupyter notebooks.

As usual with HTML formatting glitches, the issue was mostly due to defaults from inherited styles.
