Skip to content

Unify row_limit #3424

@keflavich

Description

@keflavich

https://github.com/search?q=repo%3Aastropy%2Fastroquery+row_limit&type=code

ROW_LIMIT is used in SIMBAD, Vizier docs, vizier, ESO (for the moment), gaia, Euclid, IRSA's docs,

but row_limit is used in IRSA, esasky, mpc, vizier

The case difference is not consistent between all of these. I am not sure there was good motivation for the all-caps variables. Maybe, but if not, let's change all of them to row_limit everywhere.

More serious a problem, several modules use a completely different keyword to specify row limit:
maxrec:

  • heasarc
  • cadc
  • mocserver
  • alma

maxrecords:

  • mast

I'm not sure if there are other hidden ones, but I'd like to unify these at the user-facing level. TAP uses maxrec, so for most use cases, row_limit can just be passed in to TAP modules as maxrec.

I assume the rest of the modules that have none of these simply don't have a mechanism to limit query length.

In this discussion, I learned that there are separate top and maxrec concepts. The former lets users downselect in a TAP query based on the top N rows, while maxrec is just a hard cap on what gets returned and doesn't guarantee any logical order, if I understand correctly. My take is that row_limit is much closer to maxrec, but top is more valuable and should be the default when supported - as long as top shares the effect of limiting the amount of data sent. Generally the use case I see for these limits is running small test queries before running big production queries, which means the actual content of the small query is secondary to it being quick; the corollary is that if top takes longer than maxrec (because it has to do an additional sort), it should not be the default option.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions