Merged · Changes from 2 commits
10 changes: 10 additions & 0 deletions website/docs/reference/global-configs/databricks-changes.md
@@ -11,6 +11,8 @@ The following are the current [behavior change flags](/docs/reference/global-con
| [`use_info_schema_for_columns`](#use-information-schema-for-columns) | 1.9.0 | TBD |
| [`use_user_folder_for_python`](#use-users-folder-for-python-model-notebooks) | 1.9.0 | TBD |
| [`use_materialization_v2`](#use-restructured-materializations) | 1.10.0 | TBD |
| [`use_managed_iceberg`](#use-managed-iceberg) | 1.11.0 | TBD |
| [`use_replace_on_for_insert_overwrite`](#use-replace-on-for-insert_overwrite-strategy) | 1.11.0 | TBD |

## Use information schema for columns

@@ -178,3 +180,11 @@ models:
```

</File>

## Use managed Iceberg

The `use_managed_iceberg` flag is `False` by default, which results in a [UniForm](https://www.databricks.com/blog/delta-uniform-universal-format-lakehouse-interoperability) table when `table_format` is set to `iceberg`. When this flag is set to `True`, the table is created as a [managed Iceberg table](https://docs.databricks.com/aws/en/tables/managed) instead.
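
As a minimal sketch of how this might be wired up, assuming the usual dbt conventions for behavior flags and model configs (the model name `iceberg_orders` and the `stg_orders` ref are hypothetical):

```yaml
# dbt_project.yml — behavior change flags live under `flags:`
flags:
  use_managed_iceberg: true
```

```sql
-- models/iceberg_orders.sql (hypothetical model)
{{
  config(
    materialized = 'table',
    table_format = 'iceberg'
  )
}}

select * from {{ ref('stg_orders') }}
```

With the flag left at its default of `False`, the same model config would instead produce a UniForm table.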

## Use `replace on` for `insert_overwrite` strategy

The `use_replace_on_for_insert_overwrite` flag is only relevant when using incremental models with the `insert_overwrite` strategy on SQL warehouses. The flag is `True` by default and results in the `replace on` syntax being used to perform partition overwrites. When the flag is set to `False`, partition overwrites are performed via `insert overwrite` with dynamic partition overwrite. The latter is only officially supported on cluster compute and will truncate the entire table when used on SQL warehouses.
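
As a conceptual sketch of the SQL each setting maps to (table and column names are hypothetical, dbt's actual generated SQL may differ, and the exact `replace on` grammar is defined in the linked Databricks reference):

```sql
-- use_replace_on_for_insert_overwrite: true (default, DBR 17.1+):
-- rows in partitions touched by the query are replaced via `replace on`
-- (shape is illustrative; see the Databricks docs for the exact grammar).
insert into analytics.events
replace on (event_date)
select * from analytics.events__new;

-- use_replace_on_for_insert_overwrite: false: classic dynamic partition
-- overwrite (requires spark.sql.sources.partitionOverwriteMode=DYNAMIC,
-- typically set in cluster config). Only officially supported on cluster
-- compute; on a SQL warehouse this truncates the whole table instead.
insert overwrite table analytics.events
select * from analytics.events__new;
```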
@longvu-db (Contributor) commented on Oct 27, 2025:
1. Should we focus on the fact that there will be no more table truncation on SQL warehouses, and that dynamic partition overwrite (DPO) is performed instead?

For cluster compute, nothing changes. If we mention cluster compute, then we should at least emphasize that REPLACE ON and the old classic dynamic INSERT OVERWRITE syntax both perform a dynamic partition overwrite.

Contributor commented:

1. Also, should we still mention partitions if we also have liquid clustering?

Author replied:

@longvu-db feel free to directly suggest edits to address these points. Since your team owns the feature, you can probably provide the most appropriate wording (sorry I should have shared this PR earlier)

@@ -240,9 +240,9 @@ insert into table analytics.databricks_incremental

### The `insert_overwrite` strategy

This strategy is most effective when specified alongside a `partition_by` clause in your model config. dbt will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy.
This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into ... replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` or the Databricks Runtime version is older than 17.1, this strategy will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.
Contributor commented:
How does liquid_clustered_by interact with insert_overwrite?

Author replied:

It's the same as the `partition_by` handling: the `liquid_clustered_by` columns will be used in the `insert into ... replace on` clause.

Reviewer comment:

Have we updated the dbt insert overwrite docs to capture that `liquid_clustered_by` is also supported?

Contributor commented:

+1, I think we should document explicitly that for `liquid_clustered_by`, the REPLACE ON keys will be the same as the `liquid_clustered_by` columns, and the same for `partition_by`.

Author replied:

Added a sentence in 00b5b11. Feel free to suggest edits if the wording seems off.


If no `partition_by` is specified, then the `insert_overwrite` strategy will atomically replace all contents of the table, overriding all existing data with only the new records. The column schema of the table remains the same, however. This can be desirable in some limited circumstances, since it minimizes downtime while the table contents are overwritten. The operation is comparable to running `truncate` and `insert` on other databases. For atomic replacement of Delta-formatted tables, use the `table` materialization (which runs `create or replace`) instead.
If no `partition_by` or `liquid_clustered_by` is specified, then the `insert_overwrite` strategy will atomically replace all contents of the table, overriding all existing data with only the new records. The column schema of the table remains the same, however. This can be desirable in some limited circumstances, since it minimizes downtime while the table contents are overwritten. The operation is comparable to running `truncate` and `insert` on other databases. For atomic replacement of Delta-formatted tables, use the `table` materialization (which runs `create or replace`) instead.
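
For reference, a minimal sketch of a model opting into this strategy, assuming standard dbt incremental conventions (model and column names are hypothetical). Per the discussion above, with `liquid_clustered_by` in place of `partition_by`, the clustering columns serve as the `replace on` keys in the same way:

```sql
{{
  config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = ['date_day']
  )
}}

select
    date_day,
    count(*) as events
from {{ ref('stg_events') }}
{% if is_incremental() %}
  -- re-select *all* rows for every partition touched by this run,
  -- since matched partitions are replaced wholesale
  where date_day >= date_add(current_date(), -7)
{% endif %}
group by date_day
```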

<Tabs
defaultValue="source"