@stankiewicz
Contributor

Goal: bring more flexibility and more strategies to incremental models. updatePartitionFilter alone is not enough.
This adds an insert-overwrite strategy that runs a DML DELETE before the model executes. The DELETE statement should select the partitions to remove either dynamically or from static input.

Why it's needed:
Customers compute expensive aggregates. Sometimes neither the input nor the output has a unique key, so the incremental model is append-only, and working around that with a pre-statement is error-prone.

Solution suggested:
The adapter for incremental tables should support:

  • Append only (especially if no keys are provided)
  • MERGE statement if keys are provided
  • Insert overwrite via DELETE FROM followed by INSERT
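
The selection between these three strategies could be sketched roughly as below. This is only an illustration of the rule described above, not the adapter code in this PR: `IncrementalConfig`, `strategy`, and `chooseStrategy` are hypothetical names, and treating insert overwrite as explicit opt-in is an assumption.

```typescript
// Hypothetical types and names for illustration; not Dataform's actual API.
type IncrementalStrategy = "append" | "merge" | "insert_overwrite";

interface IncrementalConfig {
  uniqueKey?: string[];            // keys used to match rows for MERGE
  strategy?: IncrementalStrategy;  // assumed explicit opt-in field
}

function chooseStrategy(config: IncrementalConfig): IncrementalStrategy {
  const hasKeys = (config.uniqueKey?.length ?? 0) > 0;
  if (config.strategy) {
    // MERGE cannot match rows without keys, so reject that combination.
    if (config.strategy === "merge" && !hasKeys) {
      throw new Error("merge strategy requires at least one unique key");
    }
    return config.strategy;
  }
  // No explicit choice: MERGE when keys are provided, otherwise append only.
  return hasKeys ? "merge" : "append";
}
```

With this shape, a model with no keys and no explicit strategy stays append-only, and partitioning never enters the decision.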

Partitioning should not be enforced: some SCD tables, such as those in a Data Vault, may be clustered only.

The insert-overwrite strategy allows setting an overwrite_filter:

  • Default (empty): when partition_by is set, a DML DELETE FROM is issued based on the columns used in partition_by; otherwise the run fails
  • Custom, e.g. overwrite_filter = "current_date()" or overwrite_filter = ${dataform.projectConfig.vars.date}
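
As a rough sketch of how the two overwrite_filter modes might translate into a DELETE statement: `OverwriteConfig` and `buildDeleteStatement` are hypothetical names, and the exact SQL shape is an assumption. The sketch reads "dynamic" as deleting every partition value produced by the incremental query, and reads a custom filter as a value expression compared against the partition_by expression. Failing on a custom filter without partition_by is also an assumption, since the description above only specifies that failure for the default case.

```typescript
// Hypothetical config shape for illustration; not the code from this PR.
interface OverwriteConfig {
  target: string;           // fully qualified target table
  partitionBy?: string;     // partition column or expression
  overwriteFilter?: string; // empty means "derive partitions dynamically"
}

function buildDeleteStatement(config: OverwriteConfig, incrementalQuery: string): string {
  const { target, partitionBy, overwriteFilter } = config;
  if (!partitionBy) {
    // Default mode fails without partition_by per the description above;
    // this sketch (assumption) applies the same rule to custom filters.
    throw new Error("insert overwrite needs partition_by to build the DELETE");
  }
  if (!overwriteFilter) {
    // Dynamic: remove every partition the new data is about to rewrite.
    return `DELETE FROM ${target} WHERE ${partitionBy} IN ` +
      `(SELECT DISTINCT ${partitionBy} FROM (${incrementalQuery}))`;
  }
  // Static: remove only the partition matching the supplied expression.
  return `DELETE FROM ${target} WHERE ${partitionBy} = ${overwriteFilter}`;
}
```

For example, calling it with `overwriteFilter: "current_date()"` and `partitionBy: "event_date"` yields a DELETE restricted to today's partition, after which the model's INSERT would repopulate it.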

@lewish lewish changed the base branch from master to main July 20, 2022 15:36