Replies: 1 comment
Hello @drewbanin, @jtcohen6, @emmyoop, @gshank, @MichelleArk, @cmcarthur, @dbeatty10. Please let me know if you have any questions.
Scalability of dbt for Large Projects
We’re starting this thread to discuss the scalability of dbt when handling a large number of models. In our organization, we have a dbt project with 2,000+ models. To execute these models, we’ve integrated dbt with Airflow, running dbt commands via the Airflow BashOperator.
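For reference, a minimal sketch of this kind of integration, assuming Airflow 2.4+ (the DAG id, schedule, project path, and selector below are illustrative placeholders, not our actual setup):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Sketch of the integration described above: each Airflow task shells out to
# the dbt CLI for a slice of the project. The DAG id, schedule, project path,
# and selector are placeholders.
with DAG(
    dag_id="dbt_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    run_staging = BashOperator(
        task_id="dbt_run_staging",
        bash_command="cd /opt/dbt_project && dbt run --select tag:staging",
    )
```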
Current Challenge:
While executing models through bash commands, we pass certain variables using the --vars flag. However, due to a known dbt limitation, dbt cannot apply partial parsing in this setup and performs a full parse of the entire project for every Airflow dbt task run. This significantly increases parsing time as the number of models grows.
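The command with variables is built roughly as sketched below; the variable name, value, and selector are illustrative placeholders:

```python
import json

# Illustrative construction of the bash command with per-run variables
# (here, Airflow's logical date). As noted above, passing values via --vars
# prevents dbt from reusing its partial-parse state, so every task run pays
# the full parse cost.
run_vars = {"run_date": "{{ ds }}"}  # "{{ ds }}" is templated by Airflow at runtime

bash_command = (
    "cd /opt/dbt_project && "
    f"dbt run --select tag:daily --vars '{json.dumps(run_vars)}'"
)
print(bash_command)
# cd /opt/dbt_project && dbt run --select tag:daily --vars '{"run_date": "{{ ds }}"}'
```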
Optimization Efforts So Far:
We’ve explored the optimizations suggested in the dbt documentation. Despite these efforts, parsing time remains a bottleneck due to the sheer scale of our project.
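For anyone trying to reproduce the numbers, one rough way to isolate parse overhead from model execution is to time `dbt parse` on its own (the project path is a placeholder):

```python
import subprocess
import time

# `dbt parse` builds the manifest without executing any models, so timing it
# approximates the per-task parsing overhead discussed above. The project
# path is a placeholder.
start = time.monotonic()
subprocess.run(["dbt", "parse", "--project-dir", "/opt/dbt_project"], check=True)
print(f"dbt parse took {time.monotonic() - start:.1f}s")
```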
Request for Suggestions:
We’re seeking additional strategies or best practices to optimize dbt parsing time for large projects. If you’ve faced similar challenges or have insights to share, we’d love to hear from you!
Tagging top contributors for attention: @drewbanin, @jtcohen6, @emmyoop, @gshank, @MichelleArk, @cmcarthur, @dbeatty10