Performance drop on 1D Zarr file #231

@asparsa

Description

There is a performance degradation when writing to and reading from a zarr3 file with fewer dimensions.
For example, reading a [1024,1024,1024] dataset in full takes 4.66201 s, but reshaping it to a [32768,32768] dataset (the same number of elements) and reading that reduces the read time to 2.59770 s.

Most parameters remained the same, including the number of chunks: each dataset is chunked into 64 pieces.
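For reference, a TensorStore zarr3 spec along these lines describes the two layouts. The chunk shapes, path, and data type below are assumptions (the issue does not state the actual `chunk_grid`); they are chosen so each dataset splits into exactly 64 chunks (4×4×4 in the 3D case, 8×8 in the 2D case):

```json
{
  "driver": "zarr3",
  "kvstore": {"driver": "file", "path": "/tmp/ds_3d"},
  "metadata": {
    "shape": [1024, 1024, 1024],
    "data_type": "float32",
    "chunk_grid": {
      "name": "regular",
      "configuration": {"chunk_shape": [256, 256, 256]}
    }
  }
}
```

The reshaped variant would use `"shape": [32768, 32768]` with an assumed `"chunk_shape": [4096, 4096]`, again yielding 64 chunks over the same 2^30 elements.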

This is the code used to capture the read time in both situations:

static void main_read_full(const char* path, json_object* main_obj, int loadnumber)
{
    MACSIO_TIMING_GroupMask_t main_read_full_grp = MACSIO_TIMING_GroupMask("main_read_full");
    MACSIO_TIMING_TimerId_t main_read_full_tid;
    double timer_dt;

    // num_th, context, and readjson are defined elsewhere in the program.
    ::nlohmann::json json_spec = readjson(num_th, path);

    // Time opening the store.
    main_read_full_tid = MT_StartTimer("open", main_read_full_grp, loadnumber);
    auto result = tensorstore::Open(json_spec, context, tensorstore::OpenMode::open);
    timer_dt = MT_StopTimer(main_read_full_tid);
    auto store = std::move(result).value();

    // Time a full read of the dataset.
    main_read_full_tid = MT_StartTimer("read_data", main_read_full_grp, loadnumber);
    auto read_result = tensorstore::Read(store).result();
    timer_dt = MT_StopTimer(main_read_full_tid);
    if (!read_result.ok()) {
        std::cerr << "Failed to read data: " << read_result.status() << std::endl;
    }
}
