Currently, the `to_bigquery` presented in the gist uses temporary storage. I think this is not ideal, given that the user will have to create the storage bucket beforehand just to perform the write.
I was wondering if it would be possible to take a similar approach to what was done for dask-mongo, where `write_gbq` would call `to_gbq()` on the pandas DataFrame that comes from each partition. The partition loop would look something like:
```python
def to_gbq(ddf, some_args):
    partitions = [
        write_gbq(partition, connection_args)
        for partition in ddf.to_delayed()
    ]
    dask.compute(partitions)
```

and `write_gbq` would have something of the form:
```python
@delayed
def write_gbq(df, some_args):
    with bigquery.Client() as bq_client:
        df.to_gbq(some_args)
```