-
Notifications
You must be signed in to change notification settings - Fork 668
Description
Is your feature request related to a problem?
I have a postgres db, with a table with a geometry column. If I attach to that DB using duckb, and then copy that data over to a local duckdb table, it errors, because ibis interprets the type as ibis.dt.Binary(), so then it CREATE TABLE
s with a column with duckdb type of BLOB, and then when actually inserting into that new table, it errors because duckdb can't cast from WKB_BLOB to BLOB. Even if it did, that wouldn't be right, since I'd want the table to be created with type geometry, not blob.
Relevant is #7818, where it looks like we explicitly chose to interpret "WKB_BLOB" as ibis.dt.Binary(). I'm not exactly sure what problem that PR was trying to avoid? If I understand that then I could help come up with a solution that doesn't cause a regression there. I HEREBY SUMMON @gforsyth TO TRY TO REMEMBER THE CONTEXT OF A DECISION FROM 2 YEARS AGO 🤣 (kidding, this is absolutely not your responsibility, but maybe there is some chance you remember)
import os
import duckdb
import ibis
ibis.options.interactive = True
conn = ibis.duckdb.connect()
conn.raw_sql(f"ATTACH '{os.environ['DATABASE_URL']}' AS pg (TYPE postgres)")
print(duckdb.__version__) # 1.3.2
conn.raw_sql("INSTALL spatial; LOAD spatial;")
print(
conn.sql(
"SELECT extension_name, extension_version FROM duckdb_extensions() where extension_name in ('spatial', 'postgres_scanner')"
)
)
# ┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
# ┃ extension_name ┃ extension_version ┃
# ┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
# │ string │ string │
# ├──────────────────┼───────────────────┤
# │ postgres_scanner │ c0411b9 │
# │ spatial │ 662718c │
# └──────────────────┴───────────────────┘
pg_table = conn.table("precincts", database=("pg", "districts"))
print(pg_table.precinct__geometry.type()) # binary
# This errors:
print(pg_table.to_sql()) # SELECT * FROM "pg"."districts"."precincts"
print(pg_table.limit(1).precinct__geometry.typeof().as_scalar().execute()) # WKB_BLOB
# conn.create_table("duckdb_table", pg_table)
# ConversionException Traceback (most recent call last)
# Cell In[14], line 10
# 8 print(pg_table.precinct__geometry.type()) # binary
# 9 print(pg_table.to_sql())
# ---> 10 conn.create_table("duckdb_table", pg_table)
# File ~/code/ibis/ibis/backends/duckdb/__init__.py:230, in Backend.create_table(self, name, obj, schema, database, temp, overwrite)
# 224 columns = [
# 225 sge.to_identifier(col, quoted=quoted) for col in table.columns
# 226 ]
# 227 insert_stmt = sge.insert(
# 228 query, into=initial_table, columns=columns
# 229 ).sql(dialect)
# --> 230 cur.execute(insert_stmt).fetchall()
# 232 if overwrite:
# 233 cur.execute(
# 234 sge.Drop(kind="TABLE", this=final_table, exists=True).sql(dialect)
# 235 )
# ConversionException: Conversion Error: Unimplemented type for cast (WKB_BLOB -> BLOB) when casting from source column precinct__geometry
# It works to explicitly cast to geometry, the created table is actually geometry
fixed = pg_table.mutate(precinct__geometry=pg_table.precinct__geometry.cast("geometry"))
print(fixed.to_sql())
# SELECT
# *
# REPLACE (ST_ASWKB("precinct__geometry") AS "precinct__geometry")
# FROM (
# SELECT
# "t0"."precinct__id",
# "t0"."precinct__hd",
# "t0"."precinct__sd",
# "t0"."precinct__borough",
# "t0"."precinct__region",
# "t0"."precinct__doe_name",
# "t0"."precinct__description",
# ST_GEOMFROMWKB("t0"."precinct__geometry") AS "precinct__geometry"
# FROM "pg"."districts"."precincts" AS "t0"
# )
print(fixed.limit(1).precinct__geometry.typeof().as_scalar().execute()) # GEOMETRY
conn.create_table("duckdb_table", fixed)
What is the motivation behind your request?
nightly syncing from prod db to analytical for analysis.
Describe the solution you'd like
- The
pg_table.precinct__geometry
column to be interpreted as geometry (eg displayed in the repr as such), not as binary. - The vanilla
conn.create_table("duckdb_table", pg_table)
to work as I describe
What version of ibis are you running?
main
What backend(s) are you using, if any?
duckdb and postgres
Code of Conduct
- I agree to follow this project's Code of Conduct
Metadata
Metadata
Assignees
Labels
Type
Projects
Status