to_bigquery ideas - no intermediate storage #3

@ncclementi

Description

Currently, the `to_bigquery` presented in the gist uses temporary storage. I don't think this is ideal, given that the user has to create that storage before they can write anything.

I was wondering if it would be possible to take a similar approach to what was done for dask-mongo, where `write_gbq` would call `pandas.DataFrame.to_gbq()` on the pandas DataFrame that comes from each partition. The per-partition writes would look something like:

    def to_gbq(ddf, connection_args):
        partitions = [
            write_gbq(partition, connection_args)
            for partition in ddf.to_delayed()
        ]
        dask.compute(partitions)

and `write_gbq` would have something of the form:

    @delayed
    def write_gbq(df, connection_args):
        # each partition arrives as a plain pandas DataFrame, so it can be
        # uploaded directly with DataFrame.to_gbq
        df.to_gbq(**connection_args)
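
For illustration, here is a minimal end-to-end sketch of this idea, assuming a pandas-gbq style signature; the table name, project id, and `if_exists="append"` behavior are placeholders, not part of the proposal:

    import dask
    import dask.dataframe as dd
    import pandas as pd
    from dask import delayed


    @delayed
    def write_gbq(df, destination_table, project_id):
        # Each partition is a plain pandas DataFrame, so pandas-gbq can
        # upload it directly; "append" lets the partitions accumulate in
        # the same table.
        df.to_gbq(destination_table, project_id=project_id, if_exists="append")


    def to_gbq(ddf, destination_table, project_id):
        # One delayed write task per partition, executed in parallel.
        writes = [
            write_gbq(partition, destination_table, project_id)
            for partition in ddf.to_delayed()
        ]
        dask.compute(writes)


    # Hypothetical usage: write a small Dask DataFrame to a BigQuery table.
    ddf = dd.from_pandas(pd.DataFrame({"x": range(10)}), npartitions=2)
    to_gbq(ddf, "my_dataset.my_table", project_id="my-project")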
