Return type of broadcast on GroupedDataFrame #1680

@nalimilan

Currently, broadcasting over a GroupedDataFrame returns a Vector. This is inconsistent with map, which returns a GroupedDataFrame. Should we change this? I'd say yes:

  • a Vector doesn't carry any information about the groups, making the result almost useless
  • one can always use a comprehension to get a vector

If we agree to change this, we need to decide what to return exactly:

  • a GroupedDataFrame: consistent with map, which makes sense since in Base broadcast and map generally return the same kind of object
  • a DataFrame: like combine, which may be more convenient since most operations are not supported on GroupedDataFrame
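
To illustrate the Base behavior the first option appeals to, here is a minimal sketch using only Base collections (no DataFrames types). The helper `double` is a hypothetical name introduced for the example. It shows map and broadcast agreeing on the container type, and a comprehension producing a plain vector when one is wanted:

```julia
# Base-only sketch; `double` is a hypothetical helper, not part of any package.
double(x) = 2x

t = (1, 2, 3)
v = [1, 2, 3]

# map and broadcast return the same kind of container for tuples and vectors.
mapped_t = map(double, t)        # a Tuple
bcast_t  = broadcast(double, t)  # also a Tuple
mapped_v = map(double, v)        # a Vector
bcast_v  = double.(v)            # also a Vector

# A comprehension over the tuple yields a Vector instead.
comp_t = [double(x) for x in t]
```

By analogy, returning a GroupedDataFrame from broadcast would match what map does, while a comprehension would remain available for anyone who wants a Vector.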

Maybe the solution is to return a GroupedDataFrame, but make that type behave more like a DataFrame (#1256). One issue is that a GroupedDataFrame doesn't make a lot of sense when each group contains a single row. So the choice depends on whether the most common use case for broadcast is applying a function that returns multiple rows per group (like describe in #1539) or a single row.
