Open
Labels
question: Further information is requested
Description
Search before asking
- I have searched the Supervision issues and found no similar feature requests.
Question
Summary of the RLE formats supported by the COCO dataset, as generated by Gemini 2.5 Pro:
"""
Structure of the RLE Output
The output of the mask.encode() function is a dictionary with the following structure:
- 'size': A list of two integers [height, width] representing the dimensions of the mask.
- 'counts': This field contains the run-length encoded data. It can be one of two types:
  - A list of integers: This is an uncompressed RLE. The list represents the lengths of alternating runs of 0s and 1s. For example, [2, 3, 1, 1] for a mask [0, 0, 1, 1, 1, 0, 1] means two 0s, followed by three 1s, one 0, and one 1. If the first pixel is a 1, the RLE starts with a 0 count for the initial run of 0s.
  - A UTF-8 encoded string: This is a compressed RLE, which is more space-efficient. The pycocotools library handles the compression and decompression of this string.
"""
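To make the uncompressed format concrete, here is a minimal pure-Python sketch of the alternating-run scheme described in the summary. The helper names `rle_encode` and `rle_decode` are hypothetical and not part of pycocotools or supervision, and the sketch works on an already flattened sequence of 0s and 1s (COCO flattens masks column by column before encoding).

```python
def rle_encode(pixels):
    """Return alternating run lengths of 0s and 1s, always starting with the 0s."""
    counts = []
    current = 0  # the encoding starts with a (possibly empty) run of 0s
    run = 0
    for value in pixels:
        if value == current:
            run += 1
        else:
            counts.append(run)
            current = value
            run = 1
    counts.append(run)
    return counts


def rle_decode(counts):
    """Expand run lengths back into a flat list; even indices are runs of 0s."""
    pixels = []
    for index, run in enumerate(counts):
        pixels.extend([index % 2] * run)
    return pixels


print(rle_encode([0, 0, 1, 1, 1, 0, 1]))  # [2, 3, 1, 1]
print(rle_encode([1, 1, 0]))              # [0, 2, 1]
```

The second call shows the leading 0 count that appears when the first pixel is a 1.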
In particular, my dataset uses the UTF-8 encoded string format, because it was produced by pycocotools.
Does the sv.DetectionDataset.from_coco function support loading a dataset with UTF-8 encoded RLE segmentation masks?
From what I can tell, it calls the rle_to_mask function to handle the conversion, and that function does not support the string format:
supervision/supervision/dataset/utils.py
Lines 142 to 195 in deb1c9c
def rle_to_mask(
    rle: npt.NDArray[np.int_] | list[int], resolution_wh: tuple[int, int]
) -> npt.NDArray[np.bool_]:
    """
    Converts run-length encoding (RLE) to a binary mask.

    Args:
        rle (Union[npt.NDArray[np.int_], List[int]]): The 1D RLE array, the format
            used in the COCO dataset (column-wise encoding, values of an array with
            even indices represent the number of pixels assigned as background,
            values of an array with odd indices represent the number of pixels
            assigned as foreground object).
        resolution_wh (Tuple[int, int]): The width (w) and height (h)
            of the desired binary mask.

    Returns:
        The generated 2D Boolean mask of shape `(h, w)`, where the foreground object is
            marked with `True`'s and the rest is filled with `False`'s.

    Raises:
        AssertionError: If the sum of pixels encoded in RLE differs from the
            number of pixels in the expected mask (computed based on resolution_wh).

    Examples:
        ```python
        import supervision as sv

        sv.rle_to_mask([5, 2, 2, 2, 5], (4, 4))
        # array([
        #     [False, False, False, False],
        #     [False, True,  True,  False],
        #     [False, True,  True,  False],
        #     [False, False, False, False],
        # ])
        ```
    """
    if isinstance(rle, list):
        rle = np.array(rle, dtype=int)

    width, height = resolution_wh

    assert width * height == np.sum(rle), (
        "the sum of the number of pixels in the RLE must be the same "
        "as the number of pixels in the expected mask"
    )

    zero_one_values = np.zeros(shape=(rle.size, 1), dtype=np.uint8)
    zero_one_values[1::2] = 1

    decoded_rle = np.repeat(zero_one_values, rle, axis=0)
    decoded_rle = np.append(
        decoded_rle, np.zeros(width * height - len(decoded_rle), dtype=np.uint8)
    )
    return decoded_rle.reshape((height, width), order="F")
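For context, a possible workaround is to turn the compressed string back into a plain list of counts before the mask conversion. The sketch below is a pure-Python rendering of the string decoding as I understand it from the pycocotools C source (`rleFrString` in `maskApi.c`): each character carries 5 data bits plus a continuation bit, and from the fourth count onward the stored value is a difference from the count two positions earlier. The function name `coco_rle_string_to_counts` is hypothetical, and the bit layout should be treated as an assumption to verify against pycocotools; when pycocotools is installed, `pycocotools.mask.decode` is the safer route.

```python
def coco_rle_string_to_counts(encoded):
    """Decode a COCO compressed RLE string into a list of run lengths.

    Assumption: mirrors pycocotools' rleFrString (6 bits per character,
    ASCII offset 48, delta coding against the count two positions back).
    """
    if isinstance(encoded, bytes):
        encoded = encoded.decode("ascii")

    counts = []
    position = 0
    while position < len(encoded):
        value = 0
        chunk = 0
        more = True
        while more:
            char = ord(encoded[position]) - 48
            value |= (char & 0x1F) << (5 * chunk)  # low 5 bits are data
            more = bool(char & 0x20)               # bit 5 marks continuation
            position += 1
            chunk += 1
            if not more and (char & 0x10):
                value |= -1 << (5 * chunk)         # sign-extend negative deltas
        if len(counts) > 2:
            value += counts[-2]                    # undo the delta coding
        counts.append(value)
    return counts


print(coco_rle_string_to_counts("52203"))  # [5, 2, 2, 2, 5]
```

Under these assumptions, the resulting list has the same shape as the input that rle_to_mask already accepts, for example the `[5, 2, 2, 2, 5]` counts from its docstring.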