Skip to content

Conversation

@tustvold
Copy link
Contributor

Which issue does this PR close?

Closes #1459
Closes #1557
Part of #1556

Rationale for this change

See tickets

What changes are included in this PR?

See tickets

Are there any user-facing changes?

Yes

Fix inference from null logical type (apache#1557)

Replace some `&Option<T>` with `Option<&T>` (apache#1556)
@github-actions github-actions bot added the parquet Changes to the parquet crate label Apr 13, 2022
/// Returns the key value metadata, returns `None` if [`ArrowReaderOptions::skip_arrow_metadata`]
fn get_kv_metadata(&self) -> Option<&Vec<KeyValue>> {
if self.options.skip_arrow_metadata {
return None;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be written without the change in #1556 to move away from &Option<T>

))),
},
// https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#unknown-always-null
(Some(LogicalType::UNKNOWN(_)), _) => Ok(DataType::Null),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix for #1557

assert_eq!(schema.field(0), &arrow_field);
}

#[test]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the closest I could get to a test of #1459 as we always write and decode the LogicalType, even when technically PARQUET_1_0 doesn't support it. The nature of thrift means this isn't actually a bug I don't think

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- thanks @tustvold

@codecov-commenter
Copy link

Codecov Report

Merging #1558 (54755f5) into master (c9549bb) will increase coverage by 0.02%.
The diff coverage is 98.46%.

@@            Coverage Diff             @@
##           master    #1558      +/-   ##
==========================================
+ Coverage   82.83%   82.85%   +0.02%     
==========================================
  Files         190      190              
  Lines       54957    55042      +85     
==========================================
+ Hits        45521    45606      +85     
  Misses       9436     9436              
Impacted Files Coverage Δ
parquet/src/file/serialized_reader.rs 95.65% <ø> (ø)
parquet/src/arrow/arrow_reader.rs 94.48% <98.30%> (+0.69%) ⬆️
parquet/src/arrow/array_reader.rs 87.07% <100.00%> (ø)
parquet/src/arrow/schema.rs 85.71% <100.00%> (+0.03%) ⬆️
parquet/src/file/metadata.rs 94.42% <100.00%> (ø)
parquet/src/encodings/encoding.rs 93.37% <0.00%> (-0.19%) ⬇️
arrow/src/json/reader.rs 84.48% <0.00%> (+0.02%) ⬆️
arrow/src/datatypes/field.rs 54.10% <0.00%> (+0.30%) ⬆️
arrow/src/ipc/reader.rs 88.28% <0.00%> (+0.48%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c9549bb...54755f5. Read the comment docs.

@tustvold tustvold added the api-change Changes to the arrow API label Apr 14, 2022
@tustvold tustvold merged commit d701939 into apache:master Apr 14, 2022
@alamb alamb changed the title Add option to skip decoding arrow metadata from parquet (#1459) Add ArrowReaderOptions to ParquetFileArrowReader, add option to skip decoding arrow metadata from parquet (#1459) Apr 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change Changes to the arrow API parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error Infering Schema for LogicalType::UNKNOWN Handle Parquet Files With Inconsistent Timestamp Units

3 participants