Calling load_table().scan().to_arrow() emits error on empty "names" field in name mapping #925

@spock-abadai

Description

Apache Iceberg version

main (development)

Please describe the bug 🐞

On one of my Iceberg tables, loading the table and scanning it fails while the name mapping in the table properties is being parsed. pydantic raises the following ValidationError:

    def parse_mapping_from_json(mapping: str) -> NameMapping:
>       return NameMapping.model_validate_json(mapping)
E       pydantic_core._pydantic_core.ValidationError: 1 validation error for NameMapping
E       9.names
E         Value error, At least one mapped name must be provided for the field [type=value_error, input_value=[], input_type=list]
E           For further information visit https://errors.pydantic.dev/2.8/v/value_error

This seems to be a result of the code in table/name_mapping.py in the method check_at_least_one, which (if I understand correctly) checks that all fields in the name mapping have at least one name. However, if I'm reading the Iceberg spec correctly, it states that:

[image: screenshot of the relevant passage of the Iceberg spec]

I'm not 100% sure what scenario led to this, but the name mapping we have does contain a field with id 10 whose list of names is empty. This field existed in the schema at one point but seems to have been removed. In any case, requiring the list of names to contain at least one value doesn't seem to be in line with the spec, and situations where the list is empty do happen in practice.
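For anyone who wants to check whether their own table is affected, here is a minimal sketch that lists every entry of a name mapping whose `names` list is empty. It uses only the standard library and assumes the mapping is the JSON list of objects with `field-id`, `names` and optional nested `fields` keys described in the spec. The helper name `find_empty_name_fields` and the sample mapping are made up for illustration and are not part of pyiceberg.

```python
import json
from typing import Any, Iterator


def find_empty_name_fields(mapping_json: str) -> list[tuple[str, Any]]:
    """Return (path, field-id) for every mapped field with no names.

    The path is built from list indexes, similar to the leading index in
    pydantic's "9.names" error location. A missing "names" key is treated
    the same as an empty list (an assumption made for this sketch).
    """

    def walk(fields: list[dict[str, Any]], prefix: str) -> Iterator[tuple[str, Any]]:
        for index, field in enumerate(fields):
            path = f"{prefix}{index}"
            if not field.get("names"):
                yield path, field.get("field-id")
            # Nested struct, list and map fields keep their children under "fields".
            yield from walk(field.get("fields") or [], f"{path}.fields.")

    return list(walk(json.loads(mapping_json), ""))


# Hypothetical mapping for illustration only, not the mapping from my table.
example = json.dumps([
    {"field-id": 1, "names": ["id"]},
    {"field-id": 2, "names": []},
    {"field-id": 3, "names": ["location"], "fields": [
        {"field-id": 4, "names": ["lat"]},
        {"field-id": 5, "names": []},
    ]},
])

print(find_empty_name_fields(example))
```

On a real table the input would presumably be the raw string stored in the `schema.name-mapping.default` table property, read before pyiceberg tries to parse it.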

Note that this Iceberg table was never created, written to, or modified using pyiceberg (only Spark and Trino). pyiceberg is only used to read it.
