First principles datasets by gAldeia · Pull Request #181 · EpistasisLab/pmlb

gAldeia · 2024-09-03T21:58:25Z

Data comes from two symbolic regression repos:

- Miles Cranmer's PySR: https://github.com/MilesCranmer/PySR
- Etienne Russeil et al.'s MvSR: https://github.com/erusseil/MvSR-analysis

Each dataset has a first-principles equation that was derived from data, and each was used in its respective paper to show that symbolic regression can potentially recover the original equation when only observational data is available.

While some of them have just a few samples and others are synthetically generated, they are challenging for symbolic regression methods and can be used to evaluate these algorithms.

The idea of pushing them into PMLB is to help other users to quickly set up experiments with the data.

I still need to write proper metadata for them. My understanding is that opening a PR will trigger a GitHub Action that pushes some new files to my fork, which I should complete before the new datasets go to review. Please let me know if there is anything I got wrong and need to update!
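For readers who want to try the "quickly set up experiments" workflow, here is a minimal sketch using `fetch_data` from the `pmlb` Python package. The helper name `load_xy` and its `fetch` parameter are hypothetical and exist only so the loader can be swapped out offline. The sketch also assumes the installed `pmlb` release already lists these datasets, which may not be true for older versions.

```python
def load_xy(name, fetch=None):
    """Return (X, y) for a PMLB dataset.

    `fetch` is a hypothetical hook: any callable with the same
    (name, return_X_y=...) interface as pmlb.fetch_data.
    """
    if fetch is None:
        # Requires `pip install pmlb` and network access on first use.
        from pmlb import fetch_data
        fetch = fetch_data
    # return_X_y=True asks for a (features, target) pair instead of a DataFrame.
    return fetch(name, return_X_y=True)


if __name__ == "__main__":
    # Dataset name taken from the metadata path mentioned in this thread.
    X, y = load_xy("first_principles_rydberg")
    print(X.shape, y.shape)
```

Any symbolic regression method that accepts a feature matrix and a target vector could then be fitted on `X` and `y`.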

trangdata · 2025-02-19T21:47:58Z

Thank you for this PR, @gAldeia! 💯 🌈

If you could update the metadata.yaml files (e.g. datasets/first_principles_rydberg/metadata.yaml), I will review!

gAldeia · 2025-02-20T12:08:18Z

Done! Let me know if everything is ok or if it needs a major review.
Thanks!

gAldeia

This is correct; I accidentally pushed changes from another branch into this one.

trangdata · 2025-02-20T13:52:54Z

Ah, I think one of the dataset names does not match (directory name vs. metadata dataset field):

first_principles_supernovae_zr
first_principles_supernovae_zg

Could you fix it, please, @gAldeia?
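A check like the one described above (directory name vs. the metadata dataset field) can be sketched in a few lines. This is not PMLB's actual CI code. It assumes each dataset lives in `datasets/<name>/metadata.yaml` and that the file has a top-level `dataset: <name>` line, and both function names are hypothetical.

```python
from pathlib import Path


def read_dataset_field(metadata_path):
    """Return the value of the top-level `dataset:` key, or None if absent."""
    text = Path(metadata_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        # Only unindented lines count as top-level keys.
        if line.startswith("dataset:"):
            return line.split(":", 1)[1].strip().strip("'\"")
    return None


def find_name_mismatches(datasets_root):
    """List (directory_name, declared_name) pairs that disagree."""
    mismatches = []
    for meta in sorted(Path(datasets_root).glob("*/metadata.yaml")):
        declared = read_dataset_field(meta)
        if declared != meta.parent.name:
            mismatches.append((meta.parent.name, declared))
    return mismatches


if __name__ == "__main__":
    for directory, declared in find_name_mismatches("datasets"):
        print(f"{directory}: metadata declares {declared!r}")
```

A plain line scan is used instead of a YAML parser to keep the sketch dependency-free. A real check would more likely parse the file with a YAML library.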

gAldeia and others added 3 commits September 3, 2024 18:48

Re-generated broken datasets

f23672c

CI was failing to parse the contents of these specific ones.

update dataset files

42b29f7

Created by https://github.com/gAldeia/pmlb/actions/runs/11616806556 from f23672c on 2024-10-31

gAldeia added 3 commits February 20, 2025 09:04

New metadata

4742074

Updated summary

253f0dd

Merge branch 'symreg_first_principles' of https://github.com/gAldeia/…

bdc87c8

…pmlb into symreg_first_principles

github-actions bot and others added 2 commits February 20, 2025 12:08

update dataset files

641950a

Created by https://github.com/gAldeia/pmlb/actions/runs/13434894123 from bdc87c8 on 2025-02-20

Merge branch 'EpistasisLab:master' into symreg_first_principles

13c7796

gAldeia and others added 4 commits February 20, 2025 13:38

Fix typo

f02e9f9

Merge branch 'symreg_first_principles' of https://github.com/gAldeia/…

e673b36

…pmlb into symreg_first_principles

Merge branch 'master' into symreg_first_principles

a226e6b

update dataset files

f710510

Created by https://github.com/gAldeia/pmlb/actions/runs/13465733857 from a226e6b on 2025-02-21

trangdata merged commit 7c1f4bd into EpistasisLab:master Feb 25, 2025

trangdata merged 12 commits into EpistasisLab:master from gAldeia:symreg_first_principles