[Discussion] Crates separation proposal #719

@rampage644

I propose we split the crates again, along the lines of how they were organized before we merged most of them into a single runtime crate.

The main problem I would like to address is compilation time, and thus developer velocity: currently it takes at least ~30s on my machine (M1 Max) to iterate on any change within the runtime crate. The codebase is growing quickly, and compilation times are getting worse along with it.

Another problem to address is a clearer dependency split. For example, nothing should depend on the UI module, yet because Rust allows cyclic dependencies between modules within a single crate (but not between crates), there are already suboptimal dependencies. I would like to avoid that and keep the dependency graph simple.

With that being said, here is the proposal (pseudo naming is used, feel free to suggest better naming):

  • Have a single "fat" main where all dependencies are constructed, set up, and wired together. This is the only crate/module that depends on all the others (possibly with conditional compilation flags, e.g. "compile with UI disabled")
  • embucket-executor provides ExecutionService trait and CoreExecutor implementation, depends on Metastore
  • embucket-metastore provides Metastore trait and SlatedbMetastore implementation, leaf dependency (doesn't depend on other domain services and entities)
  • embucket-http-ui provides axum Router, depends on possibly all/many services, serves UI
  • embucket-http-v1 provides axum Router, depends on possibly all/many services, serves Snowflake v1 REST API
  • embucket-http-iceberg provides axum Router, depends on possibly all/many services, serves Iceberg REST API
  • embucket-history provides WorksheetsStore trait and SlateDBWorksheetsStore + RecordingExecutionService implementations, depends on dyn ExecutionService
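
To make the intended layering concrete, here is a minimal single-file sketch of how the traits and implementations listed above could relate. Everything beyond the names Metastore, ExecutionService, CoreExecutor and RecordingExecutionService is an assumption for illustration: the method signatures, the InMemoryMetastore stand-in (used here in place of SlatedbMetastore), and the toy "query is a table name" logic are hypothetical, not the actual Embucket API. In the real split each commented section would live in its own crate.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

// --- would live in `embucket-metastore` (leaf crate, no domain deps) ---
pub trait Metastore: Send + Sync {
    // Hypothetical method, for illustration only.
    fn table_exists(&self, name: &str) -> bool;
}

// Hypothetical in-memory stand-in for `SlatedbMetastore`.
pub struct InMemoryMetastore {
    tables: HashSet<String>,
}

impl InMemoryMetastore {
    pub fn with_tables(names: &[&str]) -> Self {
        Self { tables: names.iter().map(|n| n.to_string()).collect() }
    }
}

impl Metastore for InMemoryMetastore {
    fn table_exists(&self, name: &str) -> bool {
        self.tables.contains(name)
    }
}

// --- would live in `embucket-executor` (depends on the Metastore trait) ---
pub trait ExecutionService: Send + Sync {
    // Hypothetical signature, for illustration only.
    fn execute(&self, query: &str) -> Result<String, String>;
}

pub struct CoreExecutor {
    metastore: Arc<dyn Metastore>,
}

impl CoreExecutor {
    pub fn new(metastore: Arc<dyn Metastore>) -> Self {
        Self { metastore }
    }
}

impl ExecutionService for CoreExecutor {
    fn execute(&self, query: &str) -> Result<String, String> {
        // Toy logic: the "query" is treated as a bare table name.
        if self.metastore.table_exists(query) {
            Ok(format!("scanned {}", query))
        } else {
            Err(format!("unknown table {}", query))
        }
    }
}

// --- would live in `embucket-history` (depends only on dyn ExecutionService) ---
pub struct RecordingExecutionService {
    inner: Arc<dyn ExecutionService>,
    history: Mutex<Vec<String>>,
}

impl RecordingExecutionService {
    pub fn new(inner: Arc<dyn ExecutionService>) -> Self {
        Self { inner, history: Mutex::new(Vec::new()) }
    }

    pub fn history(&self) -> Vec<String> {
        self.history.lock().unwrap().clone()
    }
}

impl ExecutionService for RecordingExecutionService {
    fn execute(&self, query: &str) -> Result<String, String> {
        // Record the query, then delegate to the wrapped service.
        self.history.lock().unwrap().push(query.to_string());
        self.inner.execute(query)
    }
}

// --- the "fat" main: the only place that knows every concrete type ---
fn main() {
    let metastore: Arc<dyn Metastore> =
        Arc::new(InMemoryMetastore::with_tables(&["orders"]));
    let core: Arc<dyn ExecutionService> = Arc::new(CoreExecutor::new(metastore));
    let service = RecordingExecutionService::new(core);

    println!("{:?}", service.execute("orders"));
    println!("{:?}", service.execute("missing"));
    println!("history: {:?}", service.history());
}
```

The point of the sketch is that the history layer only sees dyn ExecutionService and the executor only sees dyn Metastore, so a change inside one implementation should not force the other crates to recompile unless a trait itself changes.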

There are other crates I'm less confident about (mostly cross-application dependencies such as the Iceberg spec, utils, and so on):

  • embucket-datafusion-functions: does it make sense to also separate this from the original crate?
  • embucket-datafusion-catalog: same, not sure
  • embucket-utils: historically it provided slatedb-related utility functions that were reused across entities storing data in slatedb
  • anything else I forgot
