Description
I propose we split the runtime back into separate crates, along the lines of how the code was organized before we merged most of them into a single runtime crate.
The main problem I would like to address is compilation time, and thus developer velocity: currently it takes at least ~30s on my machine (M1 Max) to iterate on any change within the runtime crate. The codebase is growing quickly, and compilation times are getting worse along with it.
Another problem to address is a clearer dependency split. For example, nothing should depend on the UI module, but because Rust allows cyclic dependencies between modules within a single crate, there are already suboptimal dependencies. I would like to avoid that and keep the dependency graph simple.
With that said, here is the proposal (placeholder names are used; feel free to suggest better ones):
- Have a single "fat" `main` where all dependencies are constructed, set up and wired together. This is the only crate/module that depends on all the others (possibly behind conditional compilation flags, e.g. "compile with UI disabled").
- `embucket-executor` provides the `ExecutionService` trait and the `CoreExecutor` implementation; depends on `Metastore`.
- `embucket-metastore` provides the `Metastore` trait and the `SlatedbMetastore` implementation; leaf dependency (doesn't depend on other domain services and entities).
- `embucket-http-ui` provides an axum `Router`; depends on possibly all/many services; serves the UI.
- `embucket-http-v1` provides an axum `Router`; depends on possibly all/many services; serves the Snowflake v1 REST API.
- `embucket-http-iceberg` provides an axum `Router`; depends on possibly all/many services; serves the Iceberg REST API.
- `embucket-history` provides the `WorksheetsStore` trait and the `SlateDBWorksheetsStore` + `RecordingExecutionService` implementations; depends on `dyn ExecutionService`.
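To make the intended dependency direction concrete, here is a minimal single-file sketch in which each module stands in for one of the proposed crates. Only the names `Metastore`, `ExecutionService`, `CoreExecutor` and `RecordingExecutionService` come from the proposal; the method signatures, `InMemoryMetastore` (a stand-in for `SlatedbMetastore`), `run_query` and `wire_up` are assumptions made up for illustration and are not the real APIs.

```rust
use std::sync::Arc;

use executor::ExecutionService;

// Stand-in for `embucket-metastore`: a leaf with no domain dependencies.
pub mod metastore {
    // Hypothetical method; the real trait surface is not part of this proposal.
    pub trait Metastore: Send + Sync {
        fn table_exists(&self, name: &str) -> bool;
    }

    // Hypothetical in-memory stand-in for `SlatedbMetastore`.
    pub struct InMemoryMetastore {
        pub tables: Vec<String>,
    }

    impl Metastore for InMemoryMetastore {
        fn table_exists(&self, name: &str) -> bool {
            self.tables.iter().any(|t| t == name)
        }
    }
}

// Stand-in for `embucket-executor`: depends only on the `Metastore` trait.
pub mod executor {
    use super::metastore::Metastore;
    use std::sync::Arc;

    pub trait ExecutionService: Send + Sync {
        fn query(&self, table: &str) -> Result<String, String>;
    }

    pub struct CoreExecutor {
        metastore: Arc<dyn Metastore>,
    }

    impl CoreExecutor {
        pub fn new(metastore: Arc<dyn Metastore>) -> Self {
            Self { metastore }
        }
    }

    impl ExecutionService for CoreExecutor {
        fn query(&self, table: &str) -> Result<String, String> {
            if self.metastore.table_exists(table) {
                Ok(format!("scanned {}", table))
            } else {
                Err(format!("unknown table {}", table))
            }
        }
    }
}

// Stand-in for `embucket-history`: wraps any `dyn ExecutionService`.
pub mod history {
    use super::executor::ExecutionService;
    use std::sync::{Arc, Mutex};

    pub struct RecordingExecutionService {
        inner: Arc<dyn ExecutionService>,
        log: Mutex<Vec<String>>,
    }

    impl RecordingExecutionService {
        pub fn new(inner: Arc<dyn ExecutionService>) -> Self {
            Self { inner, log: Mutex::new(Vec::new()) }
        }

        // Returns a copy of every table name queried so far, in order.
        pub fn recorded(&self) -> Vec<String> {
            self.log.lock().unwrap().clone()
        }
    }

    impl ExecutionService for RecordingExecutionService {
        fn query(&self, table: &str) -> Result<String, String> {
            self.log.lock().unwrap().push(table.to_string());
            self.inner.query(table)
        }
    }
}

// Callers (e.g. the HTTP crates) would only ever see the trait object.
pub fn run_query(svc: &dyn ExecutionService, table: &str) -> Result<String, String> {
    svc.query(table)
}

// What the "fat" main would do: the only place that names concrete types.
pub fn wire_up() -> Arc<history::RecordingExecutionService> {
    let metastore: Arc<dyn metastore::Metastore> = Arc::new(metastore::InMemoryMetastore {
        tables: vec!["orders".to_string()],
    });
    let core: Arc<dyn executor::ExecutionService> =
        Arc::new(executor::CoreExecutor::new(metastore));
    Arc::new(history::RecordingExecutionService::new(core))
}
```

In this sketch only `wire_up` knows about concrete implementations; everything else depends on traits. Once the modules become separate crates, Cargo rejects cyclic crate dependencies, so an accidental back-edge (say, the executor reaching into the UI) would fail to build instead of going unnoticed.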
There are other crates I'm less confident about (mostly cross-application dependencies like the Iceberg spec, utils and such):
- `embucket-datafusion-functions`: does it make sense to also separate it from the original crate?
- `embucket-datafusion-catalog`: same, not sure
- `embucket-utils`: historically it provided SlateDB-related utility functions that were reused across entities storing data in SlateDB
- anything else I forgot