`website/docs/reference/global-configs/databricks-changes.md` (10 additions, 0 deletions)

@@ -11,6 +11,8 @@ The following are the current [behavior change flags](/docs/reference/global-con
| [`use_info_schema_for_columns`](#use-information-schema-for-columns) | 1.9.0 | TBD |
| [`use_user_folder_for_python`](#use-users-folder-for-python-model-notebooks) | 1.9.0 | TBD |
| [`use_materialization_v2`](#use-restructured-materializations) | 1.10.0 | TBD |
| [`use_managed_iceberg`](#use-managed-iceberg) | 1.11.0 | TBD |
| [`use_replace_on_for_insert_overwrite`](#use-replace-on-for-insert_overwrite-strategy) | 1.11.0 | TBD |

## Use information schema for columns

@@ -178,3 +180,11 @@ models:
```

</File>

## Use managed Iceberg

The `use_managed_iceberg` flag is `False` by default and results in a [UniForm](https://www.databricks.com/blog/delta-uniform-universal-format-lakehouse-interoperability) table when `table_format` is set to `iceberg`. When this flag is set to `True`, the table is created as a [managed Iceberg table](https://docs.databricks.com/aws/en/tables/managed).
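As a minimal sketch, the flag can be enabled in `dbt_project.yml` alongside a model-level `table_format` setting (flag placement is assumed to match the other behavior change flags on this page; `my_project` and `iceberg_model` are hypothetical names):

```yaml
# dbt_project.yml (sketch) -- project and model names are illustrative
flags:
  use_managed_iceberg: true

models:
  my_project:
    iceberg_model:
      +materialized: table
      +table_format: iceberg
```

With the flag off, the same `table_format: iceberg` config produces a UniForm table instead.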

## Use `replace on` for `insert_overwrite` strategy

The `use_replace_on_for_insert_overwrite` flag is only relevant for incremental models that use the `insert_overwrite` strategy on SQL warehouses. The flag is `True` by default, which uses the `replace on` syntax to perform partition overwrites. When the flag is set to `False`, partition overwrites are performed via `insert overwrite` with dynamic partition overwrite. The latter is only officially supported on cluster computes, and it truncates the entire table when used with SQL warehouses.
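A sketch of opting out of the new behavior (flag placement assumed to match the other behavior change flags on this page):

```yaml
# dbt_project.yml (sketch): fall back to classic `insert overwrite`
# with dynamic partition overwrite instead of `replace on`
flags:
  use_replace_on_for_insert_overwrite: false
```

Note that with the flag set to `False`, runs against SQL warehouses will truncate the whole table rather than overwriting only the affected partitions.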
**@longvu-db** (Contributor, Oct 27, 2025):

1. Should we focus on the fact that there will be no more table truncation on SQL Warehouses and instead DPO?

For Cluster Computes, nothing changes. If we mention Cluster Computes, then we should at least emphasize that `REPLACE ON` and the old Classic dynamic `INSERT OVERWRITE` syntax both perform a dynamic partition overwrite.

**Contributor**:

1. Also, should we still mention partition if we also have liquid clusters?

**Author** (reply):

@longvu-db feel free to directly suggest edits to address these points. Since your team owns the feature, you can probably provide the most appropriate wording (sorry, I should have shared this PR earlier).

@@ -240,9 +240,9 @@ insert into table analytics.databricks_incremental

### The `insert_overwrite` strategy

This strategy is most effective when specified alongside a `partition_by` clause in your model config. dbt will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy.
This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into .. replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions/clusters included in your query. Be sure to re-select _all_ of the relevant data for a partition/cluster when using this incremental strategy. When using `liquid_clustered_by`, the `replace on` keys are the `liquid_clustered_by` columns (mirroring the `partition_by` behavior). If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False`, or the Databricks Runtime is older than 17.1, this strategy runs an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.

If no `partition_by` is specified, then the `insert_overwrite` strategy will atomically replace all contents of the table, overriding all existing data with only the new records. The column schema of the table remains the same, however. This can be desirable in some limited circumstances, since it minimizes downtime while the table contents are overwritten. The operation is comparable to running `truncate` and `insert` on other databases. For atomic replacement of Delta-formatted tables, use the `table` materialization (which runs `create or replace`) instead.
If no `partition_by` or `liquid_clustered_by` is specified, then the `insert_overwrite` strategy will atomically replace all contents of the table, overriding all existing data with only the new records. The column schema of the table remains the same, however. This can be desirable in some limited circumstances, since it minimizes downtime while the table contents are overwritten. The operation is comparable to running `truncate` and `insert` on other databases. For atomic replacement of Delta-formatted tables, use the `table` materialization (which runs `create or replace`) instead.
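As a sketch of the liquid-clustering variant described above, an incremental model might look like the following (model and column names are hypothetical):

```sql
-- models/databricks_incremental.sql (sketch): `events` and `date_day`
-- are illustrative names, not part of this documentation's examples
{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    liquid_clustered_by='date_day'
) }}

select date_day, count(*) as event_count
from {{ ref('events') }}
{% if is_incremental() %}
-- re-select ALL rows for each clustering key value you touch,
-- since every matching cluster is replaced wholesale
where date_day >= current_date() - interval 7 days
{% endif %}
group by date_day
```

Here `replace on` would replace only the clusters whose `date_day` values appear in the query's output, leaving older data untouched.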

<Tabs
defaultValue="source"