Regression: pandas.read_parquet hangs when using filprofiler 2022.09.0 #415

I hope the following is sufficient for reproducing the issue.

Writing with `df.to_parquet` works fine; the code hangs when reading the data back with `pd.read_parquet`. The parquet engine used is pyarrow. No error is raised; the Docker container simply hangs forever.

python: 3.10.7
OS: Linux
pandas: 1.4.4
numpy: 1.23.3
pyarrow: 9.0.0

Disabling filprofiler resolves the issue (I use the Python API, enabled conditionally through an environment variable, as documented in https://pythonspeed.com/fil/docs/api.html#using-the-python-api). Reverting to filprofiler 2022.06.0, with everything else exactly the same, also resolves the issue.
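For anyone trying to reproduce this, here is a minimal sketch of the setup described above. It is not the original code: the DataFrame contents, the `USE_FIL` environment variable name, the helper names, and the output paths are all assumptions made for illustration. It assumes `filprofiler.api.profile(callable, path)` as described on the linked API page, and that the script is launched under Fil as that page explains.

```python
import os
import tempfile


def parquet_round_trip(path):
    """Write a small DataFrame to Parquet, then read it back with pyarrow."""
    # Imported lazily so the wrapper below can be used without pandas installed.
    import pandas as pd

    # Hypothetical data; the original report does not say what was stored.
    df = pd.DataFrame({"a": range(1000), "b": [str(i) for i in range(1000)]})
    df.to_parquet(path, engine="pyarrow")  # reported to work
    return pd.read_parquet(path, engine="pyarrow")  # reported to hang under 2022.09.0


def run_maybe_profiled(func, output_dir, env_var="USE_FIL"):
    """Run func under Fil only when env_var is set to "1" (name is an assumption)."""
    if os.environ.get(env_var) == "1":
        # Only imported when profiling is requested.
        from filprofiler.api import profile

        return profile(func, output_dir)
    return func()


def main():
    with tempfile.TemporaryDirectory() as tmp:
        data_path = os.path.join(tmp, "data.parquet")
        report_dir = os.path.join(tmp, "fil-result")
        df = run_maybe_profiled(lambda: parquet_round_trip(data_path), report_dir)
        print(df.shape)
```

Calling `main()` with `USE_FIL` unset exercises the plain pandas path; setting `USE_FIL=1` and launching the script under Fil exercises the path that hangs in this report.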

