Accuracy benchmarks
The code in `benchmark/accuracy` and the module `src/AccuracyBenchmark.jl` support a variety of benchmarks of Celeste's accuracy. Each accuracy benchmark consists of running Celeste on the images of a single field (technically five images, one per band) and comparing the inferred parameters for all sources present to known "ground truth" values. An accuracy benchmark has the following components:
- We start with a ground truth catalog.
- We get a set of band images corresponding to this catalog.
- We run Celeste on these images, with particular initialization values corresponding to the given images, generating a prediction catalog.
- We compare one or more prediction catalogs to the ground truth, summarizing accuracy for each parameter.
All catalogs are stored in a common CSV format; see `AccuracyBenchmark.read_catalog()` and `AccuracyBenchmark.write_catalog()`.
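As a rough illustration of that format in use, here is a minimal sketch; the module path and exact signatures are assumptions, so consult `src/AccuracyBenchmark.jl` for the real interface:

```julia
# Hypothetical round trip through the common CSV catalog format.
# Assumptions (not confirmed): AccuracyBenchmark is accessible as a submodule
# of Celeste, read_catalog(path) returns a tabular catalog (e.g. a DataFrame),
# and write_catalog(path, catalog) writes one back out in the same format.
import Celeste.AccuracyBenchmark

catalog = AccuracyBenchmark.read_catalog("output/prior_<hash>.csv")  # placeholder filename
# ... inspect or modify the catalog here ...
AccuracyBenchmark.write_catalog("output/prior_copy.csv", catalog)
```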
At each step we have some choices:
The command

```
$ julia write_ground_truth_catalog_csv.jl [coadd|prior]
```

writes a ground truth catalog to the `output` subdirectory.
- `coadd` uses the SDSS Stripe82 "coadd" catalog, already pulled from the SDSS CasJobs server (in FITS format). By default, this uses the coadd file for the 4263/5/119 RCF stored under `test/data`, but you can specify a path manually as the next command-line argument (see the example below). (Use `make RUN=... CAMCOL=... FIELD=...` in the `test/data` directory to pull these files from the SDSS server.)
- `prior` draws 500 random sources from the Celeste prior (with a few added prior distributions for parameters which don't have a prior specified in Celeste).
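For example, both modes can be invoked as follows; the FITS path passed to `coadd` in the second command is a hypothetical stand-in for a coadd catalog you pulled yourself:

```
# Default Stripe82 coadd catalog under test/data:
$ julia write_ground_truth_catalog_csv.jl coadd
# Hypothetical manually specified coadd FITS path:
$ julia write_ground_truth_catalog_csv.jl coadd path/to/your_coadd_catalog.fits
# 500 sources drawn from the Celeste prior:
$ julia write_ground_truth_catalog_csv.jl prior
```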
For Stripe82, one can use real SDSS imagery which has already been downloaded (under `test/data`).
For any ground truth catalog, one can generate synthetic imagery in two ways:
- Using GalSim, with the `benchmark/galsim/galsim_field.py` script. See the README in that directory for more details (setting up GalSim is nontrivial). This process will generate a FITS file under `benchmark/galsim/output`.
- By drawing from the Celeste likelihood model (implemented in `src/Synthetic.jl`), and inserting synthetic light sources into template imagery (metadata only; the pixels are all new), using the command `$ julia generate_synthetic_field.jl <ground truth CSV>` (see the sketch after this list).
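As a sketch of the second option, with output filenames following the `<hash>` naming convention used in the examples at the end of this page:

```
# Draw a ground truth catalog from the prior, then render synthetic imagery
# from it using the Celeste likelihood model:
$ julia write_ground_truth_catalog_csv.jl prior
$ julia generate_synthetic_field.jl output/prior_<hash>.csv
```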
The command

```
$ julia sdss_rcf_to_csv.jl
```

will read the Stripe82 "primary" catalog (from pre-downloaded FITS files) and write it in CSV form to the `output` subdirectory. This is useful for two things:
- Initializing Celeste for a run on Stripe82 imagery.
- Comparing Celeste's accuracy to the "primary" catalog.
You can also specify a different RCF and omit the ground truth catalog, to run Celeste outside of Stripe82 and compare to primary.
The script `run_celeste_on_field.jl` will run Celeste on given images, writing predictions to a new catalog under the `output` subdirectory.
- The default behavior is to read Stripe82 SDSS images. You can specify a JLD file containing imagery to use instead with `--images-jld <filename>`.
- By default, Celeste detects sources on the images. To skip this and initialize sources from an existing CSV catalog, use `--initialization-catalog <filename>`. Sources will be initialized only with a noisy position, so you can pass a ground truth catalog for synthetic imagery without "cheating". Alternatively, if you pass `--use-full-initialization`, Celeste will be initialized with all information from the given catalog (see the combined example after this list).
- The script supports single (default) or joint inference (`--joint`).
- The script writes predictions in the common catalog CSV format.
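For instance, a hypothetical invocation combining the options above, running joint inference on synthetic imagery with noisy-position initialization from a ground truth catalog (filenames follow the `<hash>` naming used in the examples below):

```
# Hypothetical combination of the flags documented above: joint inference on
# synthetic imagery, initialized (noisy positions only) from a ground truth catalog.
$ julia run_celeste_on_field.jl --joint \
    --images-jld output/prior_<hash>_synthetic_<hash>.jld \
    --initialization-catalog output/prior_<hash>.csv
```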
The command

```
$ julia benchmark/accuracy/score_predictions.jl \
    <ground truth CSV> <predictions CSV> [predictions CSV]
```

compares one or two prediction catalogs to a ground truth catalog, summarizing their performance (and comparing to each other, if two are given). You can also use

```
$ julia benchmark/accuracy/score_uncertainty.jl \
    <ground truth CSV> <Celeste predictions CSV>
```

to examine the distribution of errors relative to posterior SDs (for those parameters with a posterior distribution).
Here are some examples of use (commands are relative to the `benchmark/accuracy` directory):
- To run Celeste on real Stripe82 imagery, using "primary" predictions for initialization (as in real runs), and compare Celeste's accuracy to that of Stripe82 primary:

  ```
  $ julia write_ground_truth_catalog_csv.jl coadd
  $ julia sdss_rcf_to_csv.jl \
      --objid-csv output/coadd_for_4263_5_119_<hash>.csv
  $ julia run_celeste_on_field.jl --use-full-initialization \
      output/sdss_4263_5_119_primary_<hash>.csv
  $ julia score_predictions.jl \
      output/coadd_for_4263_5_119_<hash>.csv \
      output/sdss_4263_5_119_primary_<hash>.csv \
      output/sdss_4263_5_119_predictions_<hash>.csv
  $ julia score_uncertainty.jl \
      output/coadd_for_4263_5_119_<hash>.csv \
      output/sdss_4263_5_119_predictions_<hash>.csv
  ```
- To run Celeste on GalSim imagery from a "prior" ground truth catalog, using partial information from the ground truth catalog for initialization, and compare single to joint inference:

  ```
  $ julia write_ground_truth_catalog_csv.jl prior
  # go to benchmark/galsim/ and generate synthetic imagery from the above-generated catalog
  $ julia run_celeste_on_field.jl \
      output/prior_<hash>.csv --images-jld output/prior_<hash>_synthetic_<hash>.jld
  $ julia run_celeste_on_field.jl \
      output/prior_<hash>.csv --images-jld output/prior_<hash>_synthetic_<hash>.jld --joint
  $ julia score_predictions.jl \
      output/prior_<hash>.csv \
      output/prior_<hash>_images_<hash>_predictions_<first hash>.csv \
      output/prior_<hash>_images_<hash>_predictions_<second hash>.csv
  ```
- To run Celeste on another SDSS RCF using "primary" predictions both for initialization and as ground truth:

  ```
  $ julia sdss_rcf_to_csv.jl
  $ julia run_celeste_on_field.jl --use-full-initialization \
      output/sdss_4263_5_119_primary_<hash>.csv
  $ julia score_predictions.jl \
      output/sdss_4263_5_119_primary_<hash>.csv \
      output/sdss_4263_5_119_predictions_<hash>.csv
  ```