Description
all models that feed into dex_solana.bot_trades are out of date: they have not picked up the bug fixes and new DEXs added upstream in dex_solana.trades. when a change is made to dex_solana.trades (a fix or a new DEX), the table fully rebuilds in prod efficiently. once that completes, all bot trades models kick off, since they read from dex_solana.trades. the bot trades models will not run in prod, regardless of the compute infrastructure in place (number of worker nodes in the cluster, amount of concurrency, etc).
because of this, we typically don't run these models after dex is refreshed. we also have not added a new bot trades model since.
to ensure data quality and add more models, we need to revamp how these models are built so that they run efficiently in prod.
the most impactful recent learning for spellbook performance on the trino engine has been to materialize early and materialize often. we can use staging tables liberally to get data into the shape we need downstream in the final tables. trino performance drops sharply when you put a CTE or subquery into a join condition; the planner doesn't handle it well. if you materialize that CTE or subquery as its own table and then join to that table in the downstream model, it performs much better.
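a rough sketch of the two query shapes, in case it helps make the pattern concrete. this uses python's built-in sqlite3 only so the example is runnable; it is not trino and says nothing about performance. the table names (dex_trades, transfers, stg_fee_payments) and the 'bot_fee_wallet' address are made up for illustration and are not actual spellbook models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# hypothetical stand-ins for an upstream trades table and a transfers table
cur.executescript("""
CREATE TABLE dex_trades (tx_id TEXT, trader TEXT, amount_usd REAL);
CREATE TABLE transfers (tx_id TEXT, to_address TEXT, amount REAL);
INSERT INTO dex_trades VALUES
    ('tx1', 'alice', 100.0), ('tx2', 'bob', 50.0), ('tx3', 'carol', 75.0);
INSERT INTO transfers VALUES
    ('tx1', 'bot_fee_wallet', 1.0),
    ('tx2', 'someone_else', 9.0),
    ('tx3', 'bot_fee_wallet', 0.75);
""")

# shape 1: the filtered set lives in a CTE that is joined directly
inline_rows = cur.execute("""
WITH fee_payments AS (
    SELECT tx_id, amount AS fee
    FROM transfers
    WHERE to_address = 'bot_fee_wallet'
)
SELECT t.tx_id, t.trader, t.amount_usd, f.fee
FROM dex_trades t
JOIN fee_payments f ON f.tx_id = t.tx_id
ORDER BY t.tx_id
""").fetchall()

# shape 2: materialize the same logic as a staging table first...
cur.execute("""
CREATE TABLE stg_fee_payments AS
SELECT tx_id, amount AS fee
FROM transfers
WHERE to_address = 'bot_fee_wallet'
""")

# ...then the final model joins to it like any other table
staged_rows = cur.execute("""
SELECT t.tx_id, t.trader, t.amount_usd, f.fee
FROM dex_trades t
JOIN stg_fee_payments f ON f.tx_id = t.tx_id
ORDER BY t.tx_id
""").fetchall()

# both shapes return the same rows; only where the work happens differs
print(inline_rows == staged_rows)
```

in a dbt project the staging step would presumably be its own model with a table or incremental materialization, and the final model would reference it instead of carrying the CTE inline.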
i think bonkbot is one of the worst-performing models, so it would be a good one to start with: open a fresh PR that rearchitects it to follow these patterns.
for any upstream models built as staging tables, don't add the post hooks that display them on the data explorer.