temporal: t.rast.aggregate: add -e flag to extend existing STRDS#7223
temporal: t.rast.aggregate: add -e flag to extend existing STRDS#7223Sourish-spc wants to merge 25 commits intoOSGeo:mainfrom
Conversation
In stds_import.py, the finally block used the deprecated 'gisdbase' option when calling g.mapset to switch back to the original location. This caused a deprecation warning on every t.rast.import run with the project parameter. Replaced 'gisdbase' with 'dbase' to match the current g.mapset API. Fixes OSGeo#4231
7ee4adc to
9b2110e
Compare
temporal/t.rast.aggregate/testsuite/test_aggregation_absolute.py
Outdated
Show resolved
Hide resolved
ninsbl
left a comment
There was a problem hiding this comment.
Please also install pre-commit in your local working directory to avoid CI runs for formalities...
| gs.fatal( | ||
| _( | ||
| "Space time raster dataset <%s> does not exist. " | ||
| "Cannot extend a non-existing dataset." | ||
| ) | ||
| % output | ||
| ) |
There was a problem hiding this comment.
Personally, I do not think a fatal error should be thrown here. I would say, a warning at best. In any case, this part should be consistent across all temporal tools that receive the e-flag.
There was a problem hiding this comment.
Updated: changed fatal error to warning and fall back to creating new STRDS when dataset doesn't exist.
Adds support for extending an existing Space Time Raster Dataset by running t.rast.aggregate with the -e flag. This allows recurring workflows to append new aggregated results to an existing STRDS without deleting previous data. See OSGeo#3427
| # Verify STRDS was extended | ||
| lister = SimpleModule("t.rast.list", input="B", columns="name", flags="u") | ||
| self.runModule(lister) | ||
| self.assertIn("b_ext", lister.outputs.stdout) |
There was a problem hiding this comment.
This only checks that some map with "b_ext" in the name exists. It doesn't verify the original maps are still present, that the STRDS wasn't just overwritten, and/or the total map count.
There was a problem hiding this comment.
Thank you for the review. The test has been strengthened to also verify that original maps are still present and the total map count increased after extending.
| if output_exists: | ||
| output_strds.update_command_string(dbif=dbif) |
There was a problem hiding this comment.
Does this make sense when overwriting?
There was a problem hiding this comment.
Good point, iI did not really think about it. It may make sense in any case. I did not check the behavior. Updates of the STDS should be written to the history I`ld say, which is what update_command_string does. The command should not be duplicated in the history though. That should be checked...
There was a problem hiding this comment.
Updated the condition to if not output_exists or extend to avoid duplicating the command string in history when overwriting, while still updating it correctly when extending.
|
|
||
| if __name__ == "__main__": | ||
| options, flags = gs.parser() | ||
|
|
There was a problem hiding this comment.
Please, review your code, including the PR changes.
There was a problem hiding this comment.
Thank you for the review. The extra blank line has been removed in the latest commit.
…ctions for extend support
ninsbl
left a comment
There was a problem hiding this comment.
Please keep in mind that the code in open_stds pre-dates Python 3 and has not been re-factored after since.
We should not do re-factoring as part of this PR, but don`t repeat outdated pattern in stuff you add....
| dbif, connection_state_changed = init_dbif(dbif) | ||
| msgr = get_tgis_message_interface() | ||
|
|
||
| mapset = get_current_mapset() | ||
| id = name + "@" + mapset if name.find("@") < 0 else name | ||
|
|
||
| if type in {"strds", "rast", "raster"}: | ||
| sp = dataset_factory("strds", id) | ||
| elif type in {"str3ds", "raster3d", "rast3d", "raster_3d"}: | ||
| sp = dataset_factory("str3ds", id) | ||
| elif type in {"stvds", "vect", "vector"}: | ||
| sp = dataset_factory("stvds", id) | ||
| else: | ||
| msgr.fatal(_("Unknown type: %s") % (type)) | ||
| if extend and sp.is_in_db(dbif): | ||
| # extend-if-exists: load and return existing dataset | ||
| sp.select(dbif) | ||
| if connection_state_changed: | ||
| dbif.close() | ||
| return sp |
There was a problem hiding this comment.
You duplicate quite a bit of code... You may consider adding a helper function similar to e.g. (untested, so not suitable for copy and paste!):
def get_stds(stds_id: str, stds_type: str) -> SpaceTimeDataset:
"""Return an initialized SpaceTimeDataset (STDS)."""
msgr = get_tgis_message_interface()
supported_stds_types = {
"strds": "strds",
"rast": "strds",
"raster": "strds",
"str3ds": "str3ds",
"raster3d": "str3ds",
"rast3d": "str3ds",
"raster_3d": "str3ds",
"stvds": "stvds",
"vect": "stvds",
"vector": "stvds",
}
if stds_type not in supported_stds_types:
msgr.fatal(_("Unknown type: %s") % (type))
return dataset_factory(supported_stds_types[stds_type], stds_id)]
There was a problem hiding this comment.
Added _get_stds helper function to avoid repeating the type mapping block, and ensure_id helper to avoid repeating the id construction pattern. Kept them private ( prefix) since they're only used internally in open_stds.py happy to make them public if needed.
ninsbl
left a comment
There was a problem hiding this comment.
Now we are getting somewhere. Almost done. I just had some hopefully final, minor comments.
Once those are addressed, this can be approved as far as I am concerned. I would appreciate a second opinion, especially on having the new helper function public or private...
python/grass/temporal/open_stds.py
Outdated
| dbif, connection_state_changed = init_dbif(dbif) | ||
| output_exists = sp.is_in_db(dbif) | ||
|
|
||
| if output_exists: |
There was a problem hiding this comment.
See the comment above about collapsible-if...
There was a problem hiding this comment.
Fixed: collapsed nested if statements into single conditions.
python/grass/temporal/open_stds.py
Outdated
| """ | ||
| msgr = get_tgis_message_interface() | ||
|
|
||
| if "@" in name: |
There was a problem hiding this comment.
You can skip this check if you _ensure-id before.
If ensure_id is a public function and used in t.rast.aggregate, the new functions you add here could expect an id as input and those checks are not needed at all. Yet, they are insignificant and may be safer to double-check. No strong opinion from my side on this...
There was a problem hiding this comment.
Removed the redundant mapset check
| ) | ||
|
|
||
| # Check original maps are still present (STRDS was not overwritten) | ||
| for original_map in original_maps.strip().split(os.linesep): |
There was a problem hiding this comment.
Maybe use all() or a set() based check and only one assertion...
There was a problem hiding this comment.
Replaced the loop with a single set.issubset() assertion.
| self.assertIn(original_map, extended_maps) | ||
|
|
||
| # Check total map count increased (original 3 + extended 3 = 6) | ||
| info = SimpleModule("t.info", flags="g", input="B") |
There was a problem hiding this comment.
You get the number of maps from t.rast.list. No need to run t.info in addition for this check. If you want to check that the STRDS history is updated properly (command added to comments) you need to.info...
There was a problem hiding this comment.
Removed the t.info call, map count is now derived directly from t.rast.list output using len().
|
Note: test_aggregation_relative.py was already failing before this PR's changes, verified by running the tests on the original code via git stash. This appears to be a pre-existing issue unrelated to the -e flag changes." |
…check, simplify test assertions
ninsbl
left a comment
There was a problem hiding this comment.
Two minor and hopefully final comments from my side. Looks otherwise good. I would appreciate a second opinion though, esp. regarding the library parts...
| # Refresh STRDS object to reflect newly registered maps | ||
| output_strds.select(dbif) |
There was a problem hiding this comment.
What is the reason to add this here? Was it a necessary change? And what does it accomplish?
There was a problem hiding this comment.
Yes, it was necessary to fix a bug. Without it, after register_map_object_list adds the new maps into the existing STRDS, the in-memory object still holds the old state (3 maps). select(dbif) refreshes it from the database so metadata.update records the correct map count (6). Removing it causes the extend test to fail.
temporal/t.rast.aggregate/testsuite/test_aggregation_absolute.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Stefan Blumentrath <stefan.blumentrath@gmx.de>
Description
This PR adds support for extending an existing Space Time Raster Dataset (STRDS) in
t.rast.aggregateby introducing a new-eflag, following the same pattern implemented in #3798 fort.rast.neighbors.Changes
-eflag (Extend existing space time raster dataset) to module interface-eflag is set instead of always creating a new oneupdate_command_string()when extending existing STRDStest_extend_existing_strdsto existing test suiteUsage
First run — creates new STRDS:
t.rast.aggregate input=daily_rain output=monthly_rain granularity="1 month" method=average basename=result
Second run — extends existing STRDS:
t.rast.aggregate input=daily_rain output=monthly_rain granularity="1 month" method=average basename=result_new -e
Related
Closes part of #3427
See also #3798