Document new behavior flags for dbt-databricks #8049
Conversation
@ericj-db is attempting to deploy a commit to the dbt-labs Team on Vercel. A member of the Team first needs to authorize it.
```diff
 ### The `insert_overwrite` strategy

-This strategy is most effective when specified alongside a `partition_by` clause in your model config. dbt will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy.
+This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into .. replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` or runtime is older than 17.1, this strategy will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.
```
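For reviewers following along, a minimal model config using this strategy might look like the sketch below (the model, source, and column names are hypothetical illustrations, not from this PR):

```sql
-- models/daily_events.sql (hypothetical example)
{{
  config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = 'event_date'
  )
}}

select * from {{ source('raw', 'events') }}

{% if is_incremental() %}
  -- re-select *all* rows for every partition touched by this run,
  -- since each affected partition is replaced wholesale
  where event_date >= date_add(current_date(), -3)
{% endif %}
```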
How does `liquid_clustered_by` interact with `insert_overwrite`?
It's the same as the `partition_by` handling. The `liquid_clustered_by` columns will be used in the `insert into .. replace on` clause.
Have we updated the dbt `insert_overwrite` docs to capture that `liquid_clustered_by` is also supported?
+1, I think we should document explicitly that for `liquid_clustered_by`, the `replace on` keys will be the same as the `liquid_clustered_by` keys, and likewise for `partition_by`.
Added a sentence in 00b5b11. Feel free to suggest edits if the wording seems off.
```diff
+## Use `replace on` for `insert_overwrite` strategy
+
+The `use_replace_on_for_insert_overwrite` flag is only relevant when using incremental models with the `insert_overwrite` strategy on SQL warehouses. The flag is `True` by default and results in using the `replace on` syntax to perform partition overwrites. When the flag is set to `False`, partition overwrites will be performed via `insert overwrite` with dynamic partition overwrite. The latter is only officially supported for cluster computes, and will truncate the entire table when used with SQL warehouses.
```
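As a sketch of how a project would opt out, dbt behavior flags are generally set under `flags:` in `dbt_project.yml`; assuming this new flag follows the same convention as other dbt-databricks behavior flags, it might look like:

```yaml
# dbt_project.yml (hypothetical project; placement assumed from
# the usual dbt behavior-flag convention)
flags:
  # fall back to the legacy `insert overwrite` behavior;
  # per this PR, the flag defaults to True
  use_replace_on_for_insert_overwrite: False
```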
- Should we focus on the fact that there will be no more table truncation on SQL Warehouses, and instead a dynamic partition overwrite (DPO)?
For Cluster Computes, nothing changes; if we mention Cluster Computes, then we should at least emphasize that `replace on` and the old classic dynamic `insert overwrite` syntax both perform a dynamic partition overwrite.
- Also, should we still mention partitions if we also have liquid clustering?
@longvu-db feel free to directly suggest edits to address these points. Since your team owns the feature, you can probably provide the most appropriate wording (sorry, I should have shared this PR earlier).
Co-authored-by: Thang Long Vu <[email protected]>
Head branch was pushed to by a user without write access
Co-authored-by: Thang Long Vu <[email protected]>
```diff
 ### The `insert_overwrite` strategy

-This strategy is most effective when specified alongside a `partition_by` clause in your model config. dbt will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy.
+This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into .. replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions/clusters included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` or runtime is older than 17.1, this strategy will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.
```
This is referring to the Databricks `insert overwrite` syntax, so does it need the underscore? `insert_overwrite` would be referring to the dbt incremental strategy.
```diff
 ### The `insert_overwrite` strategy

-This strategy is most effective when specified alongside a `partition_by` clause in your model config. dbt will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) that dynamically replaces all partitions included in your query. Be sure to re-select _all_ of the relevant data for a partition when using this incremental strategy.
+This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into .. replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions/clusters included in your query. Be sure to re-select _all_ of the relevant data for a partition/cluster when using this incremental strategy. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` or runtime is older than 17.1, this strategy will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.
```
```diff
-This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into .. replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions/clusters included in your query. Be sure to re-select _all_ of the relevant data for a partition/cluster when using this incremental strategy. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` or runtime is older than 17.1, this strategy will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.
+This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `INSERT INTO .. REPLACE ON` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions/clusters included in your query. Be sure to re-select _all_ of the relevant data for a partition/cluster when using this incremental strategy. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` or runtime is older than 17.1, this strategy will run an [atomic `insert overwrite` statement](https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-overwrite-table.html) instead.
```
SQL syntax is intentionally lowercase to align with the rest of the dbt documentation.
Let's leave it lowercase
Co-authored-by: Thang Long Vu <[email protected]>
LGTM with 2 small comments
```diff
 This strategy is most effective when specified alongside a `partition_by` or `liquid_clustered_by` clause in your model config. dbt will run an [atomic `insert into ... replace on` statement](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-dml-insert-into#replace-on) that dynamically replaces all partitions/clusters included in your query. Be sure to re-select _all_ of the relevant data for a partition/cluster when using this incremental strategy.

 If no `partition_by` is specified, then the `insert_overwrite` strategy will atomically replace all contents of the table, overriding all existing data with only the new records. The column schema of the table remains the same, however. This can be desirable in some limited circumstances, since it minimizes downtime while the table contents are overwritten. The operation is comparable to running `truncate` and `insert` on other databases. For atomic replacement of Delta-formatted tables, use the `table` materialization (which runs `create or replace`) instead.
+When using `liquid_clustered_by`, the `replace on` keys used will be equivalent to the `liquid_clustered_by` value (same as `partition_by` behavior).
```
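To illustrate the added sentence, a liquid-clustered variant of the model config might look like the sketch below (names are hypothetical); per the review discussion, the `liquid_clustered_by` columns would then serve as the `replace on` keys:

```sql
-- models/daily_events.sql (hypothetical example)
{{
  config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    -- clustering column doubles as the `replace on` key
    liquid_clustered_by = 'event_date'
  )
}}

select * from {{ source('raw', 'events') }}
```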
```diff
-When using `liquid_clustered_by`, the `replace on` keys used will be equivalent to the `liquid_clustered_by` value (same as `partition_by` behavior).
+When using `liquid_clustered_by`, the `replace on` keys used will be the same as the `liquid_clustered_by` keys (same as `partition_by` behavior).
```
Since `liquid_clustered_by` specifies column names, it seems slightly more appropriate to also call them keys.
```diff
 If no `partition_by` is specified, then the `insert_overwrite` strategy will atomically replace all contents of the table, overriding all existing data with only the new records. The column schema of the table remains the same, however. This can be desirable in some limited circumstances, since it minimizes downtime while the table contents are overwritten. The operation is comparable to running `truncate` and `insert` on other databases. For atomic replacement of Delta-formatted tables, use the `table` materialization (which runs `create or replace`) instead.
 When using `liquid_clustered_by`, the `replace on` keys used will be equivalent to the `liquid_clustered_by` value (same as `partition_by` behavior).

+If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `True` in SQL warehouses or if cluster computes are used, this strategy will run a [partitionOverwriteMode='dynamic' `insert overwrite` statement](https://docs.databricks.com/aws/en/delta/selective-overwrite#dynamic-partition-overwrites-with-partitionoverwritemode-legacyl) instead. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` in SQL warehouses, this strategy will truncate the entire table.
```
```diff
-If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `True` in SQL warehouses or if cluster computes are used, this strategy will run a [partitionOverwriteMode='dynamic' `insert overwrite` statement](https://docs.databricks.com/aws/en/delta/selective-overwrite#dynamic-partition-overwrites-with-partitionoverwritemode-legacyl) instead. If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` in SQL warehouses, this strategy will truncate the entire table.
+If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `True` in SQL warehouses or if cluster computes are used, this strategy will run a [partitionOverwriteMode='dynamic' `insert overwrite` statement](https://docs.databricks.com/aws/en/delta/selective-overwrite#dynamic-partition-overwrites-with-partitionoverwritemode-legacyl). If [`use_replace_on_for_insert_overwrite`](/reference/global-configs/databricks-changes#use-replace-on-for-insert_overwrite-strategy) is set to `False` in SQL warehouses, this strategy will truncate the entire table.
```
Nit
What are you changing in this pull request and why?
Document two new behavior flags that will be released with dbt-databricks 1.11.0.
Screenshots of updated sections
Checklist