…I/O handler, and fields

Key differences from RAMSES:
- Stream-based binary I/O (no Fortran record markers)
- info.txt has nfile first (not ncpu)
- File naming: amr.NNNNN, hydro.NNNNN (not amr_NNNNN.outNNNNN)
- Separate particle files per type (part, star, sink, tree, trac)
- Float32 primitive variables in hydro output
- Compact AMR format (ckey + refined per grid)
- Multi-integer Hilbert keys with different state diagrams

Co-authored-by: James11222 <49885659+James11222@users.noreply.github.com>
…parameter Co-authored-by: James11222 <49885659+James11222@users.noreply.github.com>
…d-reader Add mini-ramses frontend reader
…hydro, sinks, and stars
Pull request overview
This PR introduces a new mini_ramses frontend in yt, including dataset/index/IO plumbing, field definitions, and a synthetic-output test suite, and registers the frontend in yt.frontends.
Changes:
- Added MiniRAMSESDataset/MiniRAMSESIndex/domain subset + file sanitizer for mini-ramses outputs.
- Added an IO handler for fluid (hydro/gravity) and particle stream-binary reads, plus basic field definitions.
- Added pytest-based synthetic-output tests and registered mini_ramses in the frontend package list.
Reviewed changes
Copilot reviewed 7 out of 9 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| yt/frontends/mini_ramses/data_structures.py | Core dataset/index/octree construction, file validation, and fluid data extraction logic. |
| yt/frontends/mini_ramses/io.py | Fluid selection and particle IO routines for mini-ramses stream binary files. |
| yt/frontends/mini_ramses/fields.py | Declares known fluid/particle fields and a derived temperature field. |
| yt/frontends/mini_ramses/definitions.py | Filename patterns/regex and field alias groupings for mini-ramses. |
| yt/frontends/mini_ramses/api.py | Exposes the frontend's public API symbols. |
| yt/frontends/mini_ramses/tests/test_mini_ramses.py | Synthetic mini-ramses output writer and basic dataset/sanitizer tests. |
| yt/frontends/mini_ramses/tests/__init__.py | Test package marker for the new frontend. |
| yt/frontends/__init__.py | Registers mini_ramses in the frontend namespace. |
```python
import os
import struct
import tempfile
```
tempfile is imported but never used, which will fail Ruff (F401) under the repo's lint configuration. Please remove the unused import (or use it if intended).
Suggested change:
```diff
-import tempfile
```
```python
def fill(self, fn, fields, selector, field_list):
    """Read and fill field data from a mini-ramses binary file.

    Parameters
    ----------
```
This docstring block includes blank lines with trailing whitespace (e.g. immediately after the opening docstring). Ruff/pycodestyle will flag this as W293/W291. Please strip trailing whitespace throughout this file (running the repo formatter/linter should handle it).
```python
    self, base_region, domain, ds, over_refine_factor=1, num_ghost_zones=0
):
    super().__init__(
        base_region, domain, ds, over_refine_factor, num_ghost_zones
    )
```
MiniRAMSESDomainSubset.__init__ passes over_refine_factor positionally into OctreeSubset(..., num_zones=...). For RAMSES-style octs this should be num_zones=2 (2×2×2 cells per oct); passing 1 sets num_zones=1 and can break reshaping/ghost-zone logic. Consider renaming the parameter to num_zones (default 2) and/or passing by keyword to avoid this mismatch.
```python
# Separate fields by type
ftypes = {f[0] for f in fields}

for chunk in chunks:
```
There are blank lines containing whitespace in this block (e.g. after ftypes = ...). Ruff/pycodestyle will flag this as W293/W291. Please strip trailing whitespace (running the configured formatter/linter should fix it automatically).
```python
self.base_region,
self._base_domain,
self.ds,
self._num_ghost_zones,
```
retrieve_ghost_zones passes self._num_ghost_zones as the 4th positional argument to MiniRAMSESDomainSubset(...), which maps to num_zones in OctreeSubset.__init__. This will incorrectly set num_zones to the previous ghost-zone count. Pass the existing self._num_zones/num_zones=2 and set num_ghost_zones=ngz separately.
Suggested change:
```diff
-self._num_ghost_zones,
+self._num_zones,
```
```python
self.min_level = params.get("levelmin", 1)
self.max_level = params.get("levelmax", 1)
self.domain_dimensions = np.ones(3, dtype="int64") * 2 ** (
    self.min_level
```
MiniRAMSESDataset sets min_level/max_level directly from info.txt and then uses domain_dimensions = 2**min_level. In the RAMSES frontend, ds.min_level is stored as levelmin - 1 and ds.max_level is stored relative to ds.min_level to keep octree level math consistent. Aligning to that convention will make ds.index.max_level, level stats, and cell-size calculations consistent across octree frontends.
Suggested change:
```diff
-self.min_level = params.get("levelmin", 1)
-self.max_level = params.get("levelmax", 1)
-self.domain_dimensions = np.ones(3, dtype="int64") * 2 ** (
-    self.min_level
+levelmin = params.get("levelmin", 1)
+levelmax = params.get("levelmax", 1)
+# Align level bookkeeping with the RAMSES frontend convention:
+# - store min_level as levelmin - 1
+# - store max_level relative to min_level
+self.min_level = levelmin - 1
+self.max_level = levelmax - self.min_level
+# The coarse grid domain has 2**levelmin cells per dimension.
+# Since levelmin = self.min_level + 1, use (self.min_level + 1) here
+# to keep domain_dimensions consistent with the RAMSES frontend.
+self.domain_dimensions = np.ones(3, dtype="int64") * 2 ** (
+    self.min_level + 1
```
```python
def test_load_dataset(self, tmp_path):
    outdir = _create_mini_ramses_output(str(tmp_path))
    ds = MiniRAMSESDataset(outdir)
    assert ds is not None
    assert ds.dimensionality == 3
```
The tests validate dataset detection and basic attributes, but they never actually access a fluid/particle field through yt data objects to exercise the IO handler + selection plumbing end-to-end. Adding at least one assertion that ds.all_data()[('mini-ramses','Density')] (and a particle field like ('io','particle_mass')) can be read and returns an array of the expected size would significantly improve coverage of the new frontend.
```python
twotondim = 2**ndim

with open(domain.amr_fn, "rb") as f:
    ndim_file = struct.unpack("i", f.read(4))[0]
```
ndim_file is unpacked from the AMR header but never used. This will trigger Ruff F841, and also misses a cheap sanity check (e.g. verify it matches ds.dimensionality). Either validate it or unpack into _ to explicitly ignore it.
Suggested change:
```diff
 ndim_file = struct.unpack("i", f.read(4))[0]
+if ndim_file != ndim:
+    mylog.warning(
+        "Dimensionality mismatch in AMR header for domain %s: "
+        "file reports %dD, dataset expects %dD.",
+        domain.domain_id,
+        ndim_file,
+        ndim,
+    )
```
```python
)

# Read refinement map (twotondim int32 values)
refined = np.array(
```
refined is read for every grid but never used, which will fail Ruff (F841). If refinement info is intentionally ignored, unpack into _ (or delete the variable) to satisfy lint. Otherwise, consider using it for validation (e.g. ensure refinement flags are consistent with the octs present at the next level).
Suggested change:
```diff
-refined = np.array(
+_refined = np.array(
```
```python
for icell in range(twotondim):
    raw_data[fname][cell_offset + icell] = oct_data[fidx, icell]
```
The nested Python loops copying oct_data into raw_data (for fname... / for icell...) will be very slow for large outputs. This can be vectorized by slice-assigning oct_data[fidx, :] into raw_data[fname][cell_offset:cell_offset + twotondim], and ideally you should avoid reading/copying data for fields that weren't requested by the caller.
Suggested change:
```diff
-for icell in range(twotondim):
-    raw_data[fname][cell_offset + icell] = oct_data[fidx, icell]
+# Vectorized copy: assign all cells for this field in one slice
+raw_data[fname][cell_offset:cell_offset + twotondim] = oct_data[fidx, :]
```
cphyc
left a comment
This is great! I'm on holidays but ping me if I haven't reviewed this come early March, I personally really want this included and shipped with the next release of yt.
Also, do you have a small-ish dataset (<1GiB) you could share so that we could host it on https://yt-project.org/data/?
ya definitely, I can certainly get a nice small sample dataset to be hosted for you in the next week or so.
chrishavlin
left a comment
awesome! a couple of very quick notes on the new test file you added:
- super helpful to have the test itself mock up a dataset to use in the test. thank you!
- you'll want to exclude this test from the old nose testing framework that yt still uses for some of its tests: update the `nose_ignores` file at the yt root directory as well as `tests/tests.yaml` (add the filename to the end of `tests.yaml` where it has a list of files to ignore)
cphyc
left a comment
This is a great start!
However, I think there are a few pieces that would benefit from refactorisation so that the reader is more future-proof and flexible.
In addition, you're doing many small reads from the file + struct.unpack calls, which could (and probably should) be replaced by np.fromfile reads for performance.
One final decision is related to https://github.com/yt-project/yt/pull/5380/changes#r2864451870: do you want to have one octree per domain, or one global octree (as you currently have)?
The former option is more efficient, the second is easier and avoids a few minor plotting artefacts.
```python
self.amr_fn = os.path.join(basename, f"amr.{icpu_str}")
self.hydro_fn = os.path.join(basename, f"hydro.{icpu_str}")
self.grav_fn = os.path.join(basename, f"grav.{icpu_str}")
```
This specific pattern makes it slightly more complicated to add support for rt later.
In the RAMSES dataset, I've used a "FieldFileHandler" (see https://github.com/yt-project/yt/blob/main/yt/frontends/ramses/field_handlers.py) that is specialised for hydro, grav and rt files.
This allows us to handle each file type separately and to factorize code.
```python
def _read_amr_header(self):
    """Read the AMR header from the stream-based binary file."""
    if not os.path.exists(self.amr_fn):
```
Is this case really useful? As in, can a dataset be read without any AMR file?
```python
self._level_count = np.zeros(nlevels, dtype="int64")
for ilevel in range(levelmin - 1, nlevelmax):
    idx = ilevel - (self.ds.min_level - 1)
    if 0 <= idx < nlevels:
```
Can this ever be false if levelmin and nlevelmax are properly set for the dataset? This condition seems like a tautology.
```python
self._level_offsets = np.zeros(nlevels, dtype="int64")
cumsum = 0
for i in range(nlevels):
    self._level_offsets[i] = cumsum
    cumsum += self._level_count[i]
```
Suggested change:
```diff
-self._level_offsets = np.zeros(nlevels, dtype="int64")
-cumsum = 0
-for i in range(nlevels):
-    self._level_offsets[i] = cumsum
-    cumsum += self._level_count[i]
+self._level_offsets = self._level_count.cumsum() - self._level_count
```
```python
def _read_hydro_header(self):
    """Read the hydro header to get nvar."""
    if not os.path.exists(self.hydro_fn):
        return 0
    with open(self.hydro_fn, "rb") as f:
        ndim = struct.unpack("i", f.read(4))[0]
        nvar = struct.unpack("i", f.read(4))[0]
    return nvar

def _read_grav_header(self):
    """Read the gravity header to get nvar."""
    if not os.path.exists(self.grav_fn):
        return 0
    with open(self.grav_fn, "rb") as f:
        ndim = struct.unpack("i", f.read(4))[0]
        nvar = struct.unpack("i", f.read(4))[0]
    return nvar
```
Note: If you move the handling of hydro/grav to a class of its own, you'll be able to prevent the repetition here
```python
"particle_position_x": "pos_x",
"particle_position_y": "pos_y",
"particle_position_z": "pos_z",
"particle_velocity_x": "vel_x",
"particle_velocity_y": "vel_y",
"particle_velocity_z": "vel_z",
"particle_mass": "mass",
"particle_refinement_level": "level",
"particle_identity": "birth_id",
"particle_metallicity": "metallicity",
"particle_birth_time": "birth_date",
```
This definition (and the other mappings) belong in definitions.py
```python
    return [], {}


def _detect_optional_particle_fields(header_fn, prefix):
```
This is very similar to what you already used in _detect_particle_fields in data_structure.py. Could you refactor?
```python
# Select the requested cells
for ftype, fname in fields:
    if fname in all_data:
```
I would expect that _read_all_cell_data has the promise to read everything, so it shouldn't be necessary to check here that fname is in all_data.
```python
for igrid in range(ncache):
    # Each oct: nvar fields * twotondim cells * 4 bytes
    # Layout is (nvar, twotondim) - field-major
    oct_data = np.frombuffer(
        f.read(4 * twotondim * f_nvar),
        dtype="<f4"
    ).reshape(f_nvar, twotondim)

    # Store each field's data for this oct's cells
    for fname, fidx in field_indices.items():
        for icell in range(twotondim):
            raw_data[fname][cell_offset + icell] = oct_data[fidx, icell]

    cell_offset += twotondim
```
I'd suggest reading everything in one go.
Note that the code below is only valid if `field_indices` is sorted by values (such that the field names are in the same order as they appear on file).
```python
# Since we're reading everything, we can do it in one pass:
oct_data_t = np.dtype([
    (fname, "<f4", (twotondim,))
    for fname in field_indices.keys()
])
oct_data = np.fromfile(f, oct_data_t, ncache)
# Store each field's data for this oct's cells
for fname in field_indices:
    raw_data[fname] = oct_data[fname].flatten()
```
```python
root_nodes = sum(int(d.level_count[0]) for d in self.domains)
mylog.info("Allocating %s octs", domain_counts.sum())

self.oct_handler = RAMSESOctreeContainer(
```
RAMSESOctreeContainer is a specification for RAMSES so that it can handle missing octs at levelmin. However, here, since you're adding everything into a single oct, you can probably use the (slightly more efficient) OctreeContainer.
Alternatively, you can also use the same strategy as RAMSES and have each domain have its own octree, in which case you'd need to keep RAMSESOctreeContainer. The main advantage of such an approach is that, when you iterate over chunks (each of the subfiles), you only need to build the octree of that file and not the others. That will simplify parallelisation greatly (in a distant future!). Otherwise, every parallel task would need to read the entirety of the AMR files, which duplicates efforts and kills performance.
The main disadvantage is that operations that require interaction with the oct (such as particle deposition) will only have access to the oct of their domain, so you'll have small artefacts at the boundary between domains (I think this is what we're seeing e.g. in #1573).
PR Summary
This pull request includes a new, fully functional frontend for mini-ramses. With the assistance of GitHub Copilot, I was able to learn from Corentin's valuable insights and get a tested, fully functional (and, I think, just as fast as RAMSES) frontend working for mini-ramses. I tested on cosmological and non-cosmological simulations, and it appears to work identically to the RAMSES frontend. Currently it reads all particle data (dark matter, star, sink) and AMR data (grav, hydro). I have not yet ensured compatibility with MHD or RT mini-ramses simulations.
Example plots made with fixed resolution buffers and particle data showcase it working in action.
Fixes Issue: #5215
PR Checklist