
Commit 076fb64

Merge pull request #51 from swiss-territorial-data-lab/gs/factorize_ex_dqry

Factorize detection merge

2 parents 794620b + 70fe2e1

15 files changed: +260 additions, −305 deletions

README.md (1 addition, 1 deletion)

````diff
@@ -236,7 +236,7 @@ train_model.py:
   model_zoo_checkpoint_url: <zoo model to start training from, e.g. "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml">
   init_model_weights: <True or False; if True, the model weights will be initialized to 0 (optional, defaults to False)>
   resume_training: <True or False; if True, the training is resumed from the final weights saved in the log folder. Defaults to False>
-  data_augmentation: <True or False; if True, apply random adjustment of brightness, contrast, saturation, lightning, and size, plus flip the image horizontally. Defaults to False>
+  data_augmentation: <True or False; if True, apply random adjustment of brightness, contrast, saturation, lighting, size, and flip the image horizontally. Defaults to False>
 ```
 
 Detectron2's configuration files are provided in the example folders mentioned here-below. We warn the end-user about the fact that, **for the time being, no hyperparameters tuning is automatically performed**.
````

examples/anthropogenic-activities/README.md (2 additions, 2 deletions)

````diff
@@ -7,10 +7,10 @@ It consists of the following elements:
 * input data
 * scripts for data preparation and the first step of post-processing
 
-The full project is available is its [own repository](https://github.com/swiss-territorial-data-lab/proj-sda).
+The full project is available in its [own repository](https://github.com/swiss-territorial-data-lab/proj-sda).
 
 
-The **installation** can be carried out by following the instructions [here](../../README.md). When using Docker, the container must be launched from this repository root folder before running the workflow:
+The **installation** can be carried out by following the instructions [here](../../README.md). If used, the Docker container must be launched from the root folder of this repository before running the workflow:
 
 ```bash
 $ sudo chown -R 65534:65534 examples
````

examples/anthropogenic-activities/config_trne.yaml (6 additions, 6 deletions)

```diff
@@ -1,11 +1,11 @@
 # Produce tile geometries based on the AoI extent and zoom level
 prepare_data.py:
   datasets:
-    shapefile: data/sda_ground_truth_250410.gpkg # GT labels
+    shapefile: data/sda_ground_truth.gpkg # GT labels
     fp_shapefile: data/FP_labels.gpkg # FP labels
     # empty_tiles_aoi: data/AoI/<AOI_SHPFILE> # AOI in which additional empty tiles can be selected. Only one 'empty_tiles' option can be selected
     # empty_tiles_year: 2023 # If "empty_tiles_aoi" selected then provide a year. Choice: (1) numeric (i.e. 2020), (2) [year1, year2] (random selection of a year within a given year range)
-    empty_tiles_shp: data/20250924_emtpy_tiles.gpkg # Provided shapefile of selected empty tiles. Only one 'empty_tiles' option can be selected
+    empty_tiles_shp: data/empty_tiles.gpkg # Provided shapefile of selected empty tiles. Only one 'empty_tiles' option can be selected
     category_field: Classe
   output_folder: output/trne/
   zoom_level: 16
@@ -18,10 +18,10 @@ generate_tilesets.py:
   working_directory: .
   output_folder: output/trne/
   datasets:
-    aoi_tiles: output/trne/tiles.geojson
-    ground_truth_labels: output/trne/labels.geojson
+    aoi_tiles: output/trne/tiles.gpkg
+    ground_truth_labels: output/trne/labels.gpkg
   add_fp_labels: # Uncomment if FP shapefile exists in prepare_data.py
-    fp_labels: output/trne/FP.geojson
+    fp_labels: output/trne/FP_labels.gpkg
     frac_trn: 0.7 # fraction of fp tiles to add to the trn dataset, then the remaining tiles will be split in 2 and added to tst and val datasets
   image_source:
     type: XYZ # supported values: 1. MIL = Map Image Layer 2. WMS 3. XYZ 4. FOLDER
@@ -82,7 +82,7 @@ assess_detections.py:
   working_directory: output/trne
   output_folder: assessment
   datasets:
-    ground_truth_labels: labels.geojson
+    ground_truth_labels: labels.gpkg
     split_aoi_tiles: split_aoi_tiles.geojson # aoi = Area of Interest
     categories: category_ids.json
   detections:
```

examples/anthropogenic-activities/merge_detections.py (3 additions, 73 deletions)

```diff
@@ -9,7 +9,7 @@
 import pandas as pd
 
 sys.path.insert(1, '../..')
-from helpers.functions_for_examples import get_categories
+from helpers.functions_for_examples import get_categories, merge_adjacent_detections, read_dets_and_aoi
 import helpers.misc as misc
 
 from loguru import logger
@@ -62,87 +62,17 @@
     logger.success(f"Done! All files already exist in folder {OUTPUT_DIR}. Exiting.")
     sys.exit(0)
 
-logger.info("Loading split AoI tiles as a GeoPandas DataFrame...")
-tiles_gdf = gpd.read_file('split_aoi_tiles.geojson')
-tiles_gdf = tiles_gdf.to_crs(2056)
-if 'year_tile' in tiles_gdf.keys():
-    tiles_gdf['year_tile'] = tiles_gdf.year_tile.astype(int)
-logger.success(f"Done! {len(tiles_gdf)} features were found.")
-
-logger.info("Loading detections as a GeoPandas DataFrame...")
-
-detections_gdf = gpd.GeoDataFrame()
-
-for dataset, dets_file in DETECTION_FILES.items():
-    detections_ds_gdf = gpd.read_file(dets_file)
-    detections_ds_gdf[f'dataset'] = dataset
-    detections_gdf = pd.concat([detections_gdf, detections_ds_gdf], axis=0, ignore_index=True)
-detections_gdf = detections_gdf.to_crs(2056)
-detections_gdf['area'] = detections_gdf.area
-detections_gdf['det_id'] = detections_gdf.index
-if 'year_det' in detections_gdf.keys():
-    detections_gdf['year_det'] = detections_gdf.year_det.astype(int)
-logger.success(f"Done! {len(detections_gdf)} features were found.")
+tiles_gdf, detections_gdf = read_dets_and_aoi(DETECTION_FILES)
 
 # Merge features
 logger.info(f"Merge adjacent polygons overlapping tiles with a buffer of {DISTANCE} m...")
 detections_all_years_gdf = gpd.GeoDataFrame()
 
 # Process detection by year
 for year in detections_gdf.year_det.unique():
-    detections_by_year_gdf = detections_gdf[detections_gdf['year_det']==year]
-
-    detections_buffer_gdf = detections_by_year_gdf.copy()
-    detections_buffer_gdf['geometry'] = detections_by_year_gdf.geometry.buffer(DISTANCE, resolution=2)
-
-    # Saves the id of polygons contained entirely within the tile (no merging with adjacent tiles), to avoid merging them if they are at a distance of less than thd
-    detections_tiles_join_gdf = gpd.sjoin(tiles_gdf, detections_buffer_gdf, how='left', predicate='contains')
-    remove_det_list = detections_tiles_join_gdf.det_id.unique().tolist()
-
-    detections_within_tiles_gdf = detections_by_year_gdf[detections_by_year_gdf.det_id.isin(remove_det_list)].drop_duplicates(subset=['det_id'], ignore_index=True)
-    detections_overlap_tiles_gdf = detections_by_year_gdf[~detections_by_year_gdf.det_id.isin(remove_det_list)].drop_duplicates(subset=['det_id'], ignore_index=True)
-
-    # Merge polygons within the thd distance
-    detections_overlap_tiles_gdf.loc[:, 'geometry'] = detections_overlap_tiles_gdf.buffer(DISTANCE, resolution=2)
-    detections_dissolve_gdf = detections_overlap_tiles_gdf[['det_id', 'geometry']].dissolve(as_index=False)
-    detections_merge_gdf = detections_dissolve_gdf.explode(ignore_index=True)
-    del detections_dissolve_gdf, detections_overlap_tiles_gdf
-
-    if detections_merge_gdf.isnull().values.any():
-        detections_merge_gdf = gpd.GeoDataFrame()
-    else:
-        detections_merge_gdf.geometry = detections_merge_gdf.buffer(-DISTANCE, resolution=2)
-
-    # Spatially join merged detection with raw ones to retrieve relevant information (score, area,...)
-    detections_merge_gdf['index_merge'] = detections_merge_gdf.index
-    detections_join_gdf = gpd.sjoin(detections_merge_gdf, detections_by_year_gdf, how='inner', predicate='intersects')
-
-    det_class_all = []
-    det_score_all = []
-
-    for id in detections_merge_gdf.index_merge.unique():
-        detections_by_year_gdf = detections_join_gdf.copy()
-        detections_by_year_gdf = detections_by_year_gdf[(detections_by_year_gdf['index_merge']==id)]
-        detections_by_year_gdf.rename(columns={'score_left': 'score'}, inplace=True)
-        det_score_all.append(detections_by_year_gdf['score'].mean())
-        detections_by_year_gdf = detections_by_year_gdf.dissolve(by='det_class', aggfunc='sum', as_index=False)
-        # Keep class of largest det
-        if len(detections_by_year_gdf) > 0:
-            detections_by_year_gdf['det_class'] = detections_by_year_gdf.loc[detections_by_year_gdf['area'] == detections_by_year_gdf['area'].max(),
-                                                                             'det_class'].iloc[0]
-            det_class = detections_by_year_gdf['det_class'].drop_duplicates().tolist()
-        else:
-            det_class = [0]
-        det_class_all.append(det_class[0])
-
-    detections_merge_gdf['det_class'] = det_class_all
-    detections_merge_gdf['score'] = det_score_all
-
-    complete_merge_dets_gdf = pd.merge(detections_merge_gdf, detections_join_gdf[['index_merge', 'year_det'] + ([] if 'dataset' in detections_merge_gdf.columns else ['dataset'])], on='index_merge')
+    complete_merge_dets_gdf, detections_within_tiles_gdf = merge_adjacent_detections(detections_gdf, tiles_gdf, year, DISTANCE)
     detections_all_years_gdf = pd.concat([detections_all_years_gdf, complete_merge_dets_gdf, detections_within_tiles_gdf], ignore_index=True)
 
-    del complete_merge_dets_gdf, detections_merge_gdf, detections_by_year_gdf, detections_within_tiles_gdf, detections_join_gdf
-
 # get classe ids
 CATEGORIES = os.path.join('category_ids.json')
 categories_info_df, _ = get_categories(CATEGORIES)
```
examples/anthropogenic-activities/prepare_data.py (5 additions, 24 deletions)

```diff
@@ -59,32 +59,13 @@
 if not os.path.exists(OUTPUT_DIR):
     os.makedirs(OUTPUT_DIR)
 
-written_files = []
-
-gt_labels_4326_gdf = ffe.prepare_labels(SHPFILE, CATEGORY, supercategory=SUPERCATEGORY)
-
-label_filepath = os.path.join(OUTPUT_DIR, 'labels.geojson')
-gt_labels_4326_gdf.to_file(label_filepath, driver='GeoJSON')
-written_files.append(label_filepath)
-logger.success(f"Done! A file was written: {label_filepath}")
+gt_labels_4326_gdf, written_files = ffe.prepare_labels(SHPFILE, CATEGORY, supercategory=SUPERCATEGORY, output_dir=OUTPUT_DIR)
 
-tiles_4326_all_gdf, tmp_written_files = ffe.format_all_tiles(
-    FP_SHPFILE, os.path.join(OUTPUT_DIR, 'FP.geojson'), EPT_SHPFILE, ept_data_type=EPT, ept_year=EPT_YEAR, labels_4326_gdf=gt_labels_4326_gdf,
-    category=CATEGORY, supercategory=SUPERCATEGORY, zoom_level=ZOOM_LEVEL
+_, tmp_written_files = ffe.format_all_tiles(
+    FP_SHPFILE, EPT_SHPFILE, ept_data_type=EPT, ept_year=EPT_YEAR, labels_4326_gdf=gt_labels_4326_gdf,
+    category=CATEGORY, supercategory=SUPERCATEGORY, zoom_level=ZOOM_LEVEL, output_dir=OUTPUT_DIR
 )
-
-# Save tile shapefile
-tile_filepath = os.path.join(OUTPUT_DIR, 'tiles.geojson')
-if tiles_4326_all_gdf.empty:
-    logger.warning('No tile generated for the designated area.')
-    tile_filepath = os.path.join(OUTPUT_DIR, 'area_without_tiles.gpkg')
-    gt_labels_4326_gdf.to_file(tile_filepath)
-    written_files.append(tile_filepath)
-else:
-    logger.info("Export tiles to GeoJSON (EPSG:4326)...")
-    tiles_4326_all_gdf.to_file(tile_filepath, driver='GeoJSON')
-    written_files.append(tile_filepath)
-    logger.success(f"Done! A file was written: {tile_filepath}")
+written_files.extend(tmp_written_files)
 
 print()
 logger.info("The following files were written. Let's check them out!")
```
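The pattern applied in this hunk is that helpers now write their own output files, take an `output_dir`, and return the list of written paths for the caller to accumulate. A stdlib-only sketch of the calling convention (the helper names and file names below are stand-ins, not the repository's real `ffe` functions):

```python
# Sketch of the "helpers write and report their own files" refactoring pattern.
import os
import tempfile

def prepare_labels(output_dir):
    path = os.path.join(output_dir, "labels.gpkg")
    open(path, "w").close()          # stand-in for gdf.to_file(path)
    return "labels-gdf", [path]      # data plus the list of written files

def format_all_tiles(output_dir):
    path = os.path.join(output_dir, "tiles.gpkg")
    open(path, "w").close()
    return "tiles-gdf", [path]

out_dir = tempfile.mkdtemp()
_, written_files = prepare_labels(out_dir)
_, tmp_written_files = format_all_tiles(out_dir)
written_files.extend(tmp_written_files)
print([os.path.basename(p) for p in written_files])  # ['labels.gpkg', 'tiles.gpkg']
```

Centralizing the writes in the helpers keeps the example scripts down to orchestration, which is what lets this commit delete most of their body.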
examples/mineral-extract-sites-detection/config_det.yaml (1 addition, 2 deletions)

```diff
@@ -16,7 +16,7 @@ generate_tilesets.py:
   nb_tiles_max: 5000
   working_directory: ./output/
   datasets:
-    aoi_tiles: det/tiles.geojson
+    aoi_tiles: det/tiles.gpkg
   image_source:
     type: XYZ # supported values: 1. MIL = Map Image Layer 2. WMS 3. XYZ 4. FOLDER
     year: 2020 # supported values: 1. multi-year (tiles of different year), 2. <year> (i.e. 2020)
@@ -57,7 +57,6 @@ make_detections.py:
 # Assess the final results
 merge_detections.py:
   working_directory: ./output/det/
-  labels: labels.geojson
   detections:
     oth: oth_detections_at_0dot3_threshold.gpkg
   distance: 10 # m, distance use as a buffer to merge close polygons (likely to belong to the same object) together
```

examples/mineral-extract-sites-detection/config_trne.yaml (3 additions, 17 deletions)

```diff
@@ -6,12 +6,6 @@ prepare_data.py:
   srs: EPSG:2056
   datasets:
     shapefile: ./data/labels/mes_swisstlm3d_swissimage2020.shp
-    # fp_shapefile: ./data/FP/[FP_SHPFILE] # FP labels. Optional: can contain a 'year' column
-    # empty_tiles:
-    #   type: shp # supported values: 1. aoi (area in which tiles will be selected randomly) 2. shp (provided empty tiles). Adapt the following keys accordingly
-    #   shapefile: ./data/AoI/[EPT_SHPFILE] # shapefile in which additional empty tiles can be selected.
-    #   year: 2020 # if type = aoi selected, then provide a year, otherwise comment line. Supported value: (1) numeric (i.e. 2020), (2) [year1, year2] (random selection of a year within a given year range)
-    #   category: [CLASS_COL_NAME] # if it exists, indicate the attribute column name of the label class
   output_folder: ./output/trne/
   zoom_level: 16 # z, keep between 15 and 18
@@ -22,19 +16,12 @@ generate_tilesets.py:
   nb_tiles_max: 2000
   working_directory: ./output/
   datasets:
-    aoi_tiles: trne/tiles.geojson
-    ground_truth_labels: trne/labels.geojson
-    # fp_labels:
-    #   fp_shp: trne/FP.geojson
-    #   frac_trn: 0.7 # fraction of fp tiles to add to the trn dataset, then the remaining tiles will be split in 2 and added to tst and val datasets
+    aoi_tiles: trne/tiles.gpkg
+    ground_truth_labels: trne/labels.gpkg
   image_source:
     type: XYZ # supported values: 1. MIL = Map Image Layer 2. WMS 3. XYZ 4. FOLDER
     year: 2020 # supported values: 1. multi-year (tiles of different year), 2. <year> (i.e. 2020)
     location: https://wmts.geo.admin.ch/1.0.0/ch.swisstopo.swissimage-product/default/{year}/3857/{z}/{x}/{y}.jpeg
-  # empty_tiles: # add empty tiles to datasets
-  #   tiles_frac: 0.5 # fraction (relative to the number of tiles intersecting labels) of empty tiles to add
-  #   frac_trn: 0.7 # fraction of empty tiles to add to the trn dataset, then the remaining tiles will be split in 2 and added to tst and val datasets
-  #   keep_oth_tiles: False # keep tiles in oth dataset not intersecting oth labels
   output_folder: trne/
   tile_size: 256 # per side, in pixels
   overwrite: True
@@ -86,7 +73,7 @@ make_detections.py:
 assess_detections.py:
   working_directory: ./output/trne/
   datasets:
-    ground_truth_labels: labels.geojson
+    ground_truth_labels: labels.gpkg
     image_metadata_json: img_metadata.json
     split_aoi_tiles: split_aoi_tiles.geojson # aoi = Area of Interest
     categories: category_ids.json
@@ -102,7 +89,6 @@ assess_detections.py:
 
 # Assess the final results
 merge_detections.py:
   working_directory: ./output/trne/
-  labels: labels.geojson
   detections:
     trn: trn_detections_at_0dot05_threshold.gpkg
     val: val_detections_at_0dot05_threshold.gpkg
```
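The detection file names in this config (e.g. `trn_detections_at_0dot05_threshold.gpkg`) encode the confidence threshold at which detections were kept. A pure-Python sketch of that thresholding step (illustrative only, not the repository's code):

```python
# Sketch of the score thresholding implied by the '..._at_0dot05_threshold'
# file names: keep only detections whose confidence reaches the threshold.
detections = [
    {"det_id": 1, "score": 0.92},
    {"det_id": 2, "score": 0.04},
    {"det_id": 3, "score": 0.31},
]

def filter_by_score(dets, threshold):
    """Keep detections scoring at or above the confidence threshold."""
    return [d for d in dets if d["score"] >= threshold]

print([d["det_id"] for d in filter_by_score(detections, 0.05)])  # [1, 3]
```

A low threshold like 0.05 keeps most candidates for assessment; the `oth` dataset in config_det.yaml uses a stricter 0.3 cut.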
