Define schemas with pydantic #9969
Is there a way to define table schemas as pydantic models and use that as the source of truth for what the data contains, irrespective of the backend used (duckdb in my case)?
Replies: 1 comment 3 replies
Hey @tgy! We don't have explicit support for pydantic, but you can define a schema in Ibis and create an unbound table from it:

    >>> import ibis
    >>> sch = ibis.schema({"a": "array<int>", "b": "float32", "c": "str"})
    >>> ibis.table(sch, name="my_table")
    UnboundTable: my_table
      a array<int64>
      b float32
      c string

Or if you prefer not to use our string parsing to get the datatypes, you can use explicit ibis dtypes:

    >>> import ibis.expr.datatypes as dt
    >>> sch = ibis.schema({"a": dt.Array(dt.int), "b": dt.float32, "c": dt.str})
    >>> ibis.table(sch, name="my_table")
    UnboundTable: my_table
      a array<int64>
      b float32
      c string

Alternatively, if you have a schema-like thing defined in any one of the libraries Ibis can convert from (polars, pyarrow, etc.), you can build the schema from that:

    >>> from ibis.expr.schema import Schema
    >>> sch = Schema.from_polars(schema_obj)  # or `pyarrow`, etc...

Now you can operate on this table like any other table:

    >>> t = ibis.table(sch, name="my_table")
    >>> expr = t.mutate(c=t.c.replace("foo", "bar"))
    >>> expr
    r0 := UnboundTable: my_table
      a array<int64>
      b float32
      c string
    Project[r0]
      a: r0.a
      b: r0.b
      c: StringReplace(r0.c, pattern='foo', replacement='bar')

Once you have the expression how you want it (or just to test it out), you can run it against actual data by doing something like:

    con = ibis.duckdb.connect("my_db_with_that_actual_table_definition.ddb")
    con.to_pandas(expr)
You could do something like this? Depending on how you want to type the dataclass, you might have to add some type mapping (for example, if you type fields with Python built-in types instead of Ibis types).
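To illustrate the type-mapping idea, here is a minimal sketch that uses only the standard library. It assumes a dataclass annotated with Python built-in types and translates each field into an Ibis type string. The `PY_TO_IBIS` table, the helper names, and the `MyTable` class are all hypothetical, not part of Ibis, and the choice of `float` -> `float64` is an assumption (the earlier example used `float32`).

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import get_args, get_origin, get_type_hints

# Hypothetical mapping from Python built-in types to Ibis type strings.
PY_TO_IBIS = {int: "int64", float: "float64", str: "string", bool: "boolean"}


def to_ibis_type(py_type) -> str:
    """Translate one Python annotation into an Ibis type string."""
    if get_origin(py_type) is list:  # e.g. list[int] -> "array<int64>"
        (item_type,) = get_args(py_type)
        return f"array<{to_ibis_type(item_type)}>"
    try:
        return PY_TO_IBIS[py_type]
    except KeyError:
        raise TypeError(f"no Ibis mapping for {py_type!r}") from None


def dataclass_to_ibis_types(cls) -> dict[str, str]:
    """Build a {column name: Ibis type string} dict from a dataclass."""
    if not is_dataclass(cls):
        raise TypeError(f"{cls!r} is not a dataclass")
    hints = get_type_hints(cls)  # resolves string annotations too
    return {f.name: to_ibis_type(hints[f.name]) for f in fields(cls)}


@dataclass
class MyTable:  # hypothetical table definition
    a: list[int]
    b: float
    c: str


column_types = dataclass_to_ibis_types(MyTable)
print(column_types)
```

The resulting dict should be usable the same way as the string-based schema above, i.e. `ibis.table(ibis.schema(column_types), name="my_table")`. A pydantic model could probably be handled similarly by walking its field annotations, but that is not shown here.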