Using PBench

Steve Burnett edited this page May 8, 2024 · 15 revisions

This page presents the online help for every pbench command except pbench run. For help with pbench run, see Running PBench.

pbench

Run ./pbench --help to see the online help for pbench.

Tool for running Presto benchmarks

Usage:
  pbench [command]

Available Commands:
  cmp         Compare two query result directories
  completion  Generate the autocompletion script for the specified shell
  genconfig   Generate benchmark cluster configurations
  help        Help about any command
  loadjson    Load query JSON files into event listener database and run recorders
  round       Round the decimal values in the benchmark query output files for easier comparison.
  run         Run a benchmark
  save        Save table information for recreating the schema and data

Flags:
  -h, --help   help for pbench

Use "pbench [command] --help" for more information about a command.

pbench cmp

Run ./pbench cmp --help to see the online help for pbench cmp.

Compare two query result directories

Usage:
  pbench cmp [flags] [directory 1] [directory 2]

Flags:
  -r, --file-id-regex string   regex to extract file id from file names in two directories to find matching files to compare (default ".*(query_\\d{2}).*\\.output")
  -h, --help                   help for cmp
  -o, --output-path string     diff output path (default "./diff")
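To illustrate how the default --file-id-regex selects files to compare, here is a minimal Python sketch. It assumes the doubled backslashes in the help output are display escaping for \d and \., and that files are paired when they yield the same captured ID. The helper names and the file names are hypothetical, and this is not pbench's actual implementation.

```python
import re

# Default --file-id-regex from the help text, with display escaping removed
# (assumption: "\\d" in the help output stands for "\d").
FILE_ID_REGEX = re.compile(r".*(query_\d{2}).*\.output")

def file_id(name):
    """Return the captured file ID, or None if the name does not match."""
    m = FILE_ID_REGEX.search(name)
    return m.group(1) if m else None

def pair_by_id(names_1, names_2):
    """Pair file names from two directories that share the same file ID."""
    ids_1 = {file_id(n): n for n in names_1 if file_id(n)}
    ids_2 = {file_id(n): n for n in names_2 if file_id(n)}
    common = sorted(ids_1.keys() & ids_2.keys())
    return {i: (ids_1[i], ids_2[i]) for i in common}

# Hypothetical file names, for illustration only.
dir_1 = ["query_01.output", "query_02.output", "query_03.log"]
dir_2 = ["run2_query_01_c0.output", "query_04.output", "query_02.output"]
pairs = pair_by_id(dir_1, dir_2)
```

Because the pattern captures exactly two digits, a name such as query_100.output would yield the ID query_10 under this sketch. If your query files use three-digit numbers, a custom -r value is probably needed.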

pbench completion

Run ./pbench completion --help to see the online help for pbench completion.

Generate the autocompletion script for pbench for the specified shell.
See each sub-command's help for details on how to use the generated script.

Usage:
  pbench completion [command]

Available Commands:
  bash        Generate the autocompletion script for bash
  fish        Generate the autocompletion script for fish
  powershell  Generate the autocompletion script for powershell
  zsh         Generate the autocompletion script for zsh

Flags:
  -h, --help   help for completion

Use "pbench completion [command] --help" for more information about a command.

pbench genconfig

Run ./pbench genconfig --help to see the online help for pbench genconfig.

Generate benchmark cluster configurations

Usage:
  pbench genconfig [flags] [directory to search recursively for config.json]
  pbench genconfig [command]

Available Commands:
  default     Print the built-in default generator parameter file.

Flags:
  -h, --help                    help for genconfig
  -p, --parameter-file string   Specifies the parameter file. Use built-in defaults if not specified.
  -t, --template-dir string     Specifies the template directory. Use built-in template if not specified.

Use "pbench genconfig [command] --help" for more information about a command.

PBench's default configuration is in the params.json file in the PBench repository. pbench genconfig default displays the contents of params.json.

pbench help

Run ./pbench help --help to see the online help for pbench help.

Help provides help for any command in the application.
Simply type pbench help [path to command] for full details.

Usage:
  pbench help [command] [flags]

Flags:
  -h, --help   help for help

For example, the two commands

  • pbench genconfig default --help
  • pbench help genconfig default

return the same output:

Print the built-in default generator parameter file.

Usage:
  pbench genconfig default

Flags:
  -h, --help   help for default

pbench loadjson

Run ./pbench loadjson --help to see the online help for pbench loadjson.

Load query JSON files into event listener database and run recorders

Usage:
  pbench loadjson [flags] [list of files or directories to process]

Flags:
  -c, --comment string       Add a comment to this run (optional)
  -h, --help                 help for loadjson
      --influx string        InfluxDB connection config for run recorder (optional)
      --mysql string         MySQL connection config for event listener and run recorder (optional)
  -n, --name string          Assign a name to this run. (default: "load_<current time>") (default "load_240502-144312")
  -o, --output-path string   Output directory path (default "/Users/steveburnett/Downloads/pbench")
  -P, --parallel int         Number of parallel threads to load json files (default 10)
  -r, --record-run           Record all the loaded JSON as a run

The default value of -P is the number of CPU cores on the system, so the value shown in the help output varies from machine to machine.
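Because the -P and -n defaults depend on the machine and the current time, a script that needs reproducible runs may want to pass both explicitly. The sketch below builds such a command line in Python using only flags from the help output above. The loadjson_argv helper is hypothetical, and the timestamp layout is inferred from the sample default "load_240502-144312", which is an assumption.

```python
import os
from datetime import datetime

def loadjson_argv(paths, parallel=None, name=None, now=None):
    """Build an explicit 'pbench loadjson' command line (hypothetical helper)."""
    # Mirror the documented default: one thread per CPU core.
    if parallel is None:
        parallel = os.cpu_count() or 1
    # Assumed layout of the default run name: load_<yymmdd-HHMMSS>.
    if name is None:
        now = now or datetime.now()
        name = "load_" + now.strftime("%y%m%d-%H%M%S")
    return ["./pbench", "loadjson", "-P", str(parallel), "-n", name, *paths]

# "./results" is a placeholder path.
argv = loadjson_argv(["./results"], parallel=4,
                     now=datetime(2024, 5, 2, 14, 43, 12))
```

The resulting list can be passed to subprocess.run(argv, check=True) from the directory that contains the pbench binary.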

pbench round

Run ./pbench round --help to see the online help for pbench round.

The program will try to match every column in the first row to see which column has matching decimal.
After processing the first row, it will only look at the matched columns. So if the overly long decimal only appears from the second row, this might not work properly.
A PR was opened to fix the native/Java decimal precision discrepancy but so far it does not work quite well:
https://github.com/facebookincubator/velox/pull/7944

Usage:
  pbench round [flags] [list of files or directories to process]

Flags:
  -e, --file-extension stringArray   Specifies the file extensions to include for processing (including the dot). You can specify multiple file extensions. (default [.output])
  -f, --format string                Specifies the format of the files. Accepted values are: "csv" or "json" which is the output file from the "run" command (default "json")
  -h, --help                         help for round
  -p, --precision int                Decimal precision to preserve. (default 12)
  -r, --recursive                    Recursively walk a path if a directory is provided in the arguments.
  -i, --rewrite-in-place             When turned on, we will rewrite the file in-place. Otherwise, we save the rewritten file separately.
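The first-row caveat in the help text is easier to see with a small example. The Python sketch below works on in-memory rows of strings rather than files. It assumes that a "matching decimal" means a value with more fractional digits than the requested precision, and that rounding is half-up. Neither detail is confirmed by the help text, and the function names are hypothetical.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Assumption: a plain decimal literal such as -12.3456
DECIMAL_RE = re.compile(r"-?\d+\.(\d+)")

def is_long_decimal(value, precision):
    """True if value is a decimal with more than `precision` fractional digits."""
    m = DECIMAL_RE.fullmatch(value)
    return bool(m) and len(m.group(1)) > precision

def round_rows(rows, precision=12):
    """Round overly long decimals, choosing candidate columns from row 1 only."""
    if not rows:
        return []
    quantum = Decimal(1).scaleb(-precision)  # e.g. 1E-12 for precision 12
    # Only the first row decides which columns are examined later.
    cols = [i for i, v in enumerate(rows[0]) if is_long_decimal(v, precision)]
    out = []
    for row in rows:
        new_row = list(row)
        for i in cols:
            if is_long_decimal(row[i], precision):
                rounded = Decimal(row[i]).quantize(quantum, rounding=ROUND_HALF_UP)
                new_row[i] = str(rounded)
        out.append(new_row)
    return out

rows = [["1.5", "0.1234"], ["2.123456", "0.5678"]]
result = round_rows(rows, precision=2)
```

In this sketch the second column is rounded in both rows, while 2.123456 in the first column is left untouched because the first row's value in that column (1.5) was not overly long. This is the situation the help text warns about.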

pbench run

See Running PBench.

pbench save

Run ./pbench save --help to see the online help for pbench save.

Save table information for recreating the schema and data

Usage:
  pbench save [flags] [list of table names]

Flags:
      --catalog string        Catalog name
  -f, --file string           CSV file to read catalog,schema,table
      --force-https           Force all API requests to use HTTPS
  -h, --help                  help for save
  -o, --output-path string    Output directory path (default "/Users/ezhang/Downloads/collect-stats")
  -P, --parallel int          Number of parallel threads to save table summaries. (default 10)
  -p, --password string       Presto user password (optional)
      --schema string         Schema name
  -s, --server string         Presto server address (default "http://127.0.0.1:8080")
      --session stringArray   Session property (property can be used multiple times; format is
                              key=value; use 'SHOW SESSION' in Presto CLI to see available properties)
      --trino                 Use Trino protocol
  -u, --user string           Presto user name (default "pbench")

The default value of -P is the number of CPU cores on the system, so the value shown in the help output varies from machine to machine.
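As a rough sketch of how the -f and --session flags might be used together, the Python below writes a table list and builds a matching command line. It assumes the CSV holds one catalog,schema,table triple per line with no header row, which the help text does not confirm. The helper names, table names, and session property are hypothetical.

```python
import csv
import io

def tables_csv(tables):
    """Render (catalog, schema, table) triples as CSV text, one per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(tables)
    return buf.getvalue()

def save_argv(csv_path, server="http://127.0.0.1:8080", user="pbench",
              session=None):
    """Build a 'pbench save' command line from flags shown in the help text."""
    argv = ["./pbench", "save", "-s", server, "-u", user, "-f", csv_path]
    # --session may be repeated; each value uses the key=value format.
    for key, value in (session or {}).items():
        argv += ["--session", f"{key}={value}"]
    return argv

# Hypothetical tables and session property, for illustration only.
text = tables_csv([("hive", "tpch", "orders"), ("hive", "tpch", "lineitem")])
argv = save_argv("tables.csv", session={"query_max_memory": "1GB"})
```

Writing text to tables.csv and passing argv to subprocess.run(argv, check=True) would then run the command against the default local server.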
