Closed
Description
Feature Request / Improvement
This is a placeholder ticket for implementing write support for PyIceberg.
Since we want PyIceberg to handle only the metadata part of the Iceberg table format, and not to write the actual data itself, we need an overview of the frameworks we most likely want to integrate with (PyArrow, Dask (fastparquet?), etc.).
I would suggest the following first steps to keep it simple: write using PyArrow (since that's the most commonly used FileIO), and start with unpartitioned tables.
What we need:
- Avro write support: Python: Avro write support iceberg#7255
- Write files and extract statistics: Python: Write Parquet file using PyArrow iceberg#7256
- Ability to alter the metadata JSON: Ability to the write Metadata JSON #22
- Proper integration tests between Java and Python: Python: Integration tests iceberg#6398