EPIC: Rust Based Compaction #624

This is an EPIC issue that marks a direction worth our community's attention. We can use it to track the features we want to offer and how close we are to delivering them.

---

This issue concerns compaction: specifically native compaction and, more precisely, Rust-based compaction.

We all know that compaction is a resource-intensive task: it involves heavy computation, significant I/O, and substantial memory consumption. I believe compaction is the killer feature that iceberg-rust can provide for the whole community. I expect iceberg-rust can implement compaction more efficiently in terms of both performance and cost.

In this EPIC, I want iceberg-rust to deliver:

## Compaction API for a table.

- It should have a simple API that is easy to use for small tables, such as `table.compact()`.
- It should have a well-designed planner and scheduler that functions efficiently in a distributed system, processing large tables quickly.
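
To make the planner idea a bit more concrete, here is a minimal sketch of one possible planning step: greedily bin-packing small files into rewrite groups. Everything in it (`DataFile`, `CompactionTask`, `plan_compaction`, the `target_bytes` parameter) is hypothetical and only for illustration; none of it is an existing iceberg-rust API, and a real planner would also need to consider partitions, delete files, and snapshots.

```rust
/// Hypothetical stand-in for a data file entry; NOT the iceberg-rust type.
#[derive(Debug, Clone, PartialEq)]
struct DataFile {
    path: String,
    size_bytes: u64,
}

/// One unit of work: a group of small files to rewrite into a single file.
/// Each task is independent, so a scheduler could hand tasks to different workers.
#[derive(Debug, PartialEq)]
struct CompactionTask {
    inputs: Vec<DataFile>,
}

/// Hypothetical planner: greedily packs files smaller than `target_bytes`
/// into groups whose total size does not exceed `target_bytes`.
fn plan_compaction(files: &[DataFile], target_bytes: u64) -> Vec<CompactionTask> {
    // Files already at or above the target size are left untouched.
    let mut small: Vec<DataFile> = files
        .iter()
        .filter(|f| f.size_bytes < target_bytes)
        .cloned()
        .collect();
    // Largest first, so big files open a group and small ones fill the gaps.
    small.sort_by(|a, b| b.size_bytes.cmp(&a.size_bytes));

    let mut tasks: Vec<CompactionTask> = Vec::new();
    let mut current: Vec<DataFile> = Vec::new();
    let mut current_size: u64 = 0;

    for file in small {
        // Close the current group if adding this file would overshoot the target.
        if !current.is_empty() && current_size + file.size_bytes > target_bytes {
            tasks.push(CompactionTask { inputs: std::mem::take(&mut current) });
            current_size = 0;
        }
        current_size += file.size_bytes;
        current.push(file);
    }
    if !current.is_empty() {
        tasks.push(CompactionTask { inputs: current });
    }

    // Rewriting a single file on its own gains nothing, so drop such groups.
    tasks.retain(|t| t.inputs.len() > 1);
    tasks
}
```

Under this sketch, a simple `table.compact()` could run every planned task locally, while a distributed engine could take the same task list and schedule each task on a separate worker.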

## Bindings for Python and Java.

- This API should be available in Python so that PyIceberg can benefit from our implementation.
- This API should be available in Java, allowing users to enhance their Spark jobs.

## Tests (E2E tests, behavior tests, fuzz tests, ...)

Compaction is more complex than just reading. Mistakes we make could break the user's entire table.

We will need various tests, including end-to-end tests, behavior tests, and fuzz tests, to ensure we have done it correctly.
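
As a rough illustration of what a fuzz-style behavior test could check, the sketch below tests one invariant: compaction must not add, drop, or change rows. It uses a toy model in which a "file" is just a list of row ids; `toy_compact`, `rows_preserved`, and `fuzz_row_preservation` are hypothetical names, not part of iceberg-rust, and a real test would run against actual tables and the real compaction code.

```rust
/// Toy model: a "file" is just a list of row ids. Hypothetical, for illustration only.
type File = Vec<u64>;

/// Toy compaction: merges all input rows into files of at most `rows_per_file` rows.
fn toy_compact(files: &[File], rows_per_file: usize) -> Vec<File> {
    let all: Vec<u64> = files.iter().flatten().copied().collect();
    // `chunks` panics on a size of 0, so clamp to at least 1.
    all.chunks(rows_per_file.max(1)).map(|c| c.to_vec()).collect()
}

/// Invariant: the multiset of rows is identical before and after compaction.
fn rows_preserved(before: &[File], after: &[File]) -> bool {
    let mut a: Vec<u64> = before.iter().flatten().copied().collect();
    let mut b: Vec<u64> = after.iter().flatten().copied().collect();
    a.sort_unstable();
    b.sort_unstable();
    a == b
}

/// Tiny deterministic generator (an LCG) so the sketch needs no external crates.
fn next_rand(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

/// Generates `iterations` random file layouts, compacts each one,
/// and reports whether the row-preservation invariant held every time.
fn fuzz_row_preservation(seed: u64, iterations: usize) -> bool {
    let mut state = seed;
    for _ in 0..iterations {
        let n_files = (next_rand(&mut state) % 8) as usize;
        let mut files: Vec<File> = Vec::with_capacity(n_files);
        for _ in 0..n_files {
            let n_rows = (next_rand(&mut state) % 16) as usize;
            let mut rows: File = Vec::with_capacity(n_rows);
            for _ in 0..n_rows {
                rows.push(next_rand(&mut state) % 100);
            }
            files.push(rows);
        }
        let compacted = toy_compact(&files, 10);
        if !rows_preserved(&files, &compacted) {
            return false;
        }
    }
    true
}
```

The fixed seed keeps any failure reproducible, which matters when a fuzz run uncovers a layout that breaks the invariant.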
