@andrew4699
Contributor

Description

This PR adds the ability to wrap File IO by specifying a wrapper factory. It ships with a no-op wrapper factory that performs no wrapping. I did not implement any specific wrapper, since which interceptors (if any) belong in the repo can be discussed separately, but it should be uncontroversial that the ability to intercept File IO is generally useful for use cases such as metrics.
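
As a rough illustration of the wrapper-factory idea described above, here is a minimal, self-contained sketch. It does not reproduce this PR's code: `SimpleFileIO`, `FileIOWrapperFactory`, `NoOpWrapperFactory`, and `InMemoryFileIO` are hypothetical names invented for this example, and `SimpleFileIO` is only a simplified stand-in for a real file IO abstraction.

```java
import java.util.HashSet;
import java.util.Set;

public class FileIOWrapperSketch {
  /** Hypothetical, simplified stand-in for a file IO abstraction. */
  interface SimpleFileIO {
    void writeFile(String path);

    void deleteFile(String path);

    boolean exists(String path);
  }

  /** Hypothetical hook: receives the real IO and returns the IO that callers will use. */
  interface FileIOWrapperFactory {
    SimpleFileIO wrap(SimpleFileIO delegate);
  }

  /** No-op factory: hands back the delegate unchanged, so behavior is identical to no wrapping. */
  static final class NoOpWrapperFactory implements FileIOWrapperFactory {
    @Override
    public SimpleFileIO wrap(SimpleFileIO delegate) {
      return delegate;
    }
  }

  /** Tiny in-memory IO used only to make the sketch runnable. */
  static final class InMemoryFileIO implements SimpleFileIO {
    private final Set<String> paths = new HashSet<>();

    @Override
    public void writeFile(String path) {
      paths.add(path);
    }

    @Override
    public void deleteFile(String path) {
      paths.remove(path);
    }

    @Override
    public boolean exists(String path) {
      return paths.contains(path);
    }
  }

  public static void main(String[] args) {
    SimpleFileIO raw = new InMemoryFileIO();
    SimpleFileIO io = new NoOpWrapperFactory().wrap(raw);
    io.writeFile("table/metadata.json");
    System.out.println("same instance: " + (io == raw));
    System.out.println("file exists: " + io.exists("table/metadata.json"));
  }
}
```

Because callers only ever see the object returned by `wrap`, a different factory can substitute an intercepting implementation without any call site changing.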

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • I added a unit test testFileIOWrapper.
  • All existing tests work with the NoOp wrapper.
  • I ran through the quickstart and verified that I could perform all the same table operations.

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue
  • I have signed and submitted the ICLA and if needed, the CCLA. See Contributing for details.

@andrew4699 andrew4699 requested a review from a team as a code owner August 16, 2024 23:56
@andrew4699
Contributor Author

@collado-mike I changed it to just be FileIOFactory and added MetricRegistryAware to be consistent with the other xAware interfaces.

collado-mike
collado-mike previously approved these changes Aug 21, 2024
Contributor

@collado-mike collado-mike left a comment

Nice. Minor suggestion for an additional metric; add that and a javadoc and we'll be good.

Contributor

measure files deleted? Seems like a useful metric (especially if it spikes real high real fast 😅)

Contributor Author

Added some testing for measuring deleted files. Originally I left it out as MeasuredFileIOFactory is just a test class but I may as well add it as an example.

Contributor

oh! didn't realize this was a test class. It seems super useful to me. Maybe move it into src/main?

Contributor Author

This implementation isn't that great, since it doesn't clean up IOs and its metrics may not be super useful. It was just intended to be super simple, to confirm that File IO is wrapped properly.
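
To make the thread above concrete, here is a minimal sketch of a wrapper that counts deletes. It is not the `MeasuredFileIOFactory` from this PR; `SimpleFileIO`, `InMemoryFileIO`, `CountingFileIO`, and `CountingWrapperFactory` are hypothetical names, and a plain `AtomicLong` stands in for whatever metrics registry a real implementation would use.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class DeleteCountingSketch {
  /** Hypothetical, simplified stand-in for a file IO abstraction. */
  interface SimpleFileIO {
    void writeFile(String path);

    void deleteFile(String path);

    boolean exists(String path);
  }

  /** Tiny in-memory IO used only to make the sketch runnable. */
  static final class InMemoryFileIO implements SimpleFileIO {
    private final Set<String> paths = new HashSet<>();

    @Override
    public void writeFile(String path) {
      paths.add(path);
    }

    @Override
    public void deleteFile(String path) {
      paths.remove(path);
    }

    @Override
    public boolean exists(String path) {
      return paths.contains(path);
    }
  }

  /** Wrapper that forwards every call and increments a shared counter on each delete call. */
  static final class CountingFileIO implements SimpleFileIO {
    private final SimpleFileIO delegate;
    private final AtomicLong deletes;

    CountingFileIO(SimpleFileIO delegate, AtomicLong deletes) {
      this.delegate = delegate;
      this.deletes = deletes;
    }

    @Override
    public void writeFile(String path) {
      delegate.writeFile(path);
    }

    @Override
    public void deleteFile(String path) {
      delegate.deleteFile(path);
      deletes.incrementAndGet(); // counted only after the delegate call returns normally
    }

    @Override
    public boolean exists(String path) {
      return delegate.exists(path);
    }
  }

  /** Hypothetical factory: every IO it wraps reports into the same counter. */
  static final class CountingWrapperFactory {
    private final AtomicLong deletes = new AtomicLong();

    SimpleFileIO wrap(SimpleFileIO delegate) {
      return new CountingFileIO(delegate, deletes);
    }

    long deleteCount() {
      return deletes.get();
    }
  }

  public static void main(String[] args) {
    CountingWrapperFactory factory = new CountingWrapperFactory();
    SimpleFileIO io = factory.wrap(new InMemoryFileIO());
    io.writeFile("data/a.parquet");
    io.writeFile("data/b.parquet");
    io.deleteFile("data/a.parquet");
    io.deleteFile("data/b.parquet");
    System.out.println("deletes observed: " + factory.deleteCount());
  }
}
```

Like the test class discussed here, this sketch counts delete calls rather than verified deletions and does nothing to close or clean up the wrapped IOs, so it is best read as a wiring check rather than a production metric.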

Member

I think this is redundant.

Contributor Author

factoryType could be a top-level config, but I made it io => factoryType to encourage grouping io-related configs together. For example, if you choose to make a metrics-emitting implementation of FileIOFactory, you may want some further configs.
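
A minimal sketch of how a nested `io => factoryType` setting might be resolved to a factory instance is shown below. It is an assumption-laden illustration, not how this project loads its configuration: the nested-map config shape, the `REGISTRY` lookup, the type names `"default"` and `"metrics"`, and the `FileIOFactoryLike` interface are all hypothetical.

```java
import java.util.Map;
import java.util.function.Supplier;

public class IoConfigSketch {
  /** Hypothetical stand-in for a file IO factory; describe() exists only for the demo. */
  interface FileIOFactoryLike {
    String describe();
  }

  static final class DefaultFactory implements FileIOFactoryLike {
    @Override
    public String describe() {
      return "default";
    }
  }

  static final class MetricsFactory implements FileIOFactoryLike {
    @Override
    public String describe() {
      return "metrics";
    }
  }

  /** Hypothetical mapping from a factoryType name to a constructor. */
  static final Map<String, Supplier<FileIOFactoryLike>> REGISTRY =
      Map.of("default", DefaultFactory::new, "metrics", MetricsFactory::new);

  /** Reads root["io"]["factoryType"] and instantiates the matching factory. */
  static FileIOFactoryLike fromConfig(Map<String, Object> root) {
    Object io = root.get("io");
    if (!(io instanceof Map)) {
      throw new IllegalArgumentException("missing 'io' section");
    }
    Object type = ((Map<?, ?>) io).get("factoryType");
    Supplier<FileIOFactoryLike> supplier = REGISTRY.get(String.valueOf(type));
    if (supplier == null) {
      throw new IllegalArgumentException("unknown io.factoryType: " + type);
    }
    return supplier.get();
  }

  public static void main(String[] args) {
    Map<String, Object> config = Map.of("io", Map.of("factoryType", "metrics"));
    System.out.println(fromConfig(config).describe());
  }
}
```

Keeping `factoryType` under an `io` section leaves room for sibling keys that only a particular factory cares about, which is the grouping argument made in the comment above.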

@andrew4699 andrew4699 force-pushed the aguterman-fileio-wrapper branch from 92c2539 to 55b34b8 Compare August 21, 2024 16:09
@andrew4699 andrew4699 force-pushed the aguterman-fileio-wrapper branch from 55b34b8 to f56f773 Compare August 26, 2024 16:39
@andrew4699 andrew4699 force-pushed the aguterman-fileio-wrapper branch 2 times, most recently from b863463 to 7b92ae1 Compare August 26, 2024 16:43
@andrew4699 andrew4699 requested a review from takidau as a code owner August 26, 2024 16:43
@andrew4699 andrew4699 force-pushed the aguterman-fileio-wrapper branch 2 times, most recently from baccf83 to 5ed91cc Compare August 26, 2024 17:05
@andrew4699 andrew4699 force-pushed the aguterman-fileio-wrapper branch from a6f6846 to 434e1e4 Compare August 26, 2024 22:57
@takidau takidau merged commit 07c8444 into apache:main Aug 29, 2024
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* fix typo in Ozone getting-started guide (apache#2975)

* NoSQL: Persistence API (apache#2965)

Provides the persistence API parts for the NoSQL persistence work. Most of this PR consists purely of interfaces and annotations.

It consists of a low-level `Persistence` interface to read and write individual `Obj`ects and the corresponding pluggable `ObjType`s.

The API module also contains upstream SPIs for database specific implementations and downstream APIs for atomic commits and indexes.
Also some CDI specific infrastructure helper annotations.

Unit tests cover the few pieces of actual executable code in this change.

This change also adds a README, which references functionality and modules that are not in this PR, but already provide an overview of the overall interactions.

* Handle poetry lock update (apache#2942)

* Update dependency com.azure:azure-sdk-bom to v1.3.2 (apache#2979)

* Added interface for reporting metrics (apache#2887)

Co-authored-by: Alexandre Dutra

* Prefer RealmConfig fields (apache#2971)

minor cleanup of verbose code

* Update community meetings note links due to document split (apache#2981)

* NoSQL: database agnostic implementation + in-memory backend (apache#2977)

This change contains the database-agnostic implementation plus the in-memory backend used for testing purposes, and a JUnit extension.

These three modules are difficult to put into isolated PRs.

The "main implementation" contains the commit-logic, indexes-logic and the caching part.
`PersistenceImplementation` is (more or less) a wrapper providing higher-level functionality backed by a database's `Backend` implementation. The latter provides the bare minimum functionality.
Other implementations of the `Persistence` interface exist only to transparently add caching and commit-attempt-specific handling.
No call site needs to bother about the actual implementation and/or its layers.

Tests in the `polaris-persistence-nosql-impl` module use the in-memory backend via the JUnit extension.
Common tests for all backends, in-memory in this PR and MongoDB in a follow-up, are in the testFixtures of the `polaris-persistence-nosql-impl`.

* Fix incorrect column name in ModelEvent (apache#2987)

Fixes apache#2913

* Rename request ID header (apache#2988)

* Update dependency org.agrona:agrona to v2.3.1 (apache#2986)

* Update actions/stale digest to fad0de8 (apache#2984)

* Last merged commit 291bd7d

---------

Co-authored-by: Dmitri Bourlatchkov
Co-authored-by: Yong Zheng
Co-authored-by: Mend Renovate
Co-authored-by: cccs-cat001
Co-authored-by: Alexandre Dutra
Co-authored-by: Christopher Lambert
Co-authored-by: JB Onofré
4 participants