Skip to content

Conversation

@MonkeyCanCode
Copy link
Contributor

@MonkeyCanCode MonkeyCanCode commented Aug 13, 2024

Description

The latest cli introduced a breaking change on cli initialization if an end-user tried to initialize it for the first time (for user who already initialized the cli, this won't cause issue as the code snippet will get skipped due to the present of env dir). This is due to SCRIPT_DIR is already set to xxxx/regtests but the init script still refs to one dir above in couple places. To fix it, I basically fixed the path refs.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Before the chance for an end-user to do initial init:

...
cp: cannot stat '/home/xxxxx/polaris/regtests/regtests/client/python/pyproject.toml': No such file or directory

Poetry could not find a pyproject.toml file in /home/xxxxx/polaris/regtests or its parents
First time setup complete.
/home/xxxxx/polaris/regtests/polaris-venv/bin/python3: can't open file 'regtests/client/python/cli/polaris_cli.py': [Errno 2] No such file or directory
...

After the fix:

Package operations: 25 installs, 1 update, 0 removals

  • Installing jmespath (1.0.1)
  • Installing python-dateutil (2.9.0.post0)
  • Installing botocore (1.34.160)
  • Updating platformdirs (3.11.0 -> 4.2.2)
  • Installing typing-extensions (4.12.2)
  • Installing annotated-types (0.7.0)
  • Installing cachetools (5.4.0)
  • Installing chardet (5.2.0)
  • Installing colorama (0.4.6)
  • Installing exceptiongroup (1.2.2)
  • Installing iniconfig (2.0.0)
  • Installing mccabe (0.7.0)
  • Installing mypy-extensions (1.0.0)
  • Installing pluggy (1.5.0)
  • Installing pycodestyle (2.9.1)
  • Installing pydantic-core (2.20.1)
  • Installing pyflakes (2.5.0)
  • Installing s3transfer (0.10.2)
  • Installing pyproject-api (1.7.1)
  • Installing boto3 (1.34.120)
  • Installing flake8 (5.0.4)
  • Installing mypy (1.4.1)
  • Installing pydantic (2.8.2)
  • Installing pytest (8.3.2)
  • Installing tox (4.18.0)
  • Installing types-python-dateutil (2.9.0.20240316)

Writing lock file
First time setup complete.

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue
  • I have signed and submitted the ICLA and if needed, the CCLA. See Contributing for details.

@MonkeyCanCode MonkeyCanCode requested a review from a team as a code owner August 13, 2024 22:57
aihuaxu
aihuaxu previously approved these changes Aug 15, 2024
Copy link
Contributor

@aihuaxu aihuaxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MonkeyCanCode
Copy link
Contributor Author

will check why fail on regression tests later tonight.

@MonkeyCanCode
Copy link
Contributor Author

MonkeyCanCode commented Aug 17, 2024

@eric-maynard / @aihuaxu so the original change was actually invalid. This is due to the content of polaris cli between the project root dir and regtests dir is similar and I was trying to used the one within regtests dir earlier for bootstrapping. As the build is happening within regtests dir and it is copying the polaris cli from that dir, thus the dir ref is correct (as it is copied to the one dir above):

COPY ./polaris /home/spark

To avoid the confusion, I had add a check for polari cli within regtests dir to state this is for regression test only and people should not be able to bootstrap from there.

Here is how it will looks like if a user tried to bootstrap within regtests dir:

xxxx@DESKTOP:~/polaris/regtests(fix_polaris_cli)$ ./polaris
This script is intended solely for regression testing and cannot be executed from within the 'regtests' directory. Please use the script from the parent directory: /home/xxxx/polaris/polaris

@flyrain
Copy link
Contributor

flyrain commented Aug 19, 2024

The name is still a bit confusing even though we ban it from running within the dir regtests. How about renaming the regtests/polaris to something like regtests/polaris-reg-test? Not a blocker though.

@eric-maynard
Copy link
Contributor

I like @flyrain 's idea -- IMO ideally we would not even need the regtests/polaris script and we could somehow generate it at test-time based on the actual polaris script. But that may be too much complexity for now.

@MonkeyCanCode
Copy link
Contributor Author

@flyrain / @eric-maynard thanks for reviewed. Pushed the change for file rename.

# Save the current directory
CURRENT_DIR=$(pwd)
cd $SCRIPT_DIR > /dev/null
pushd $SCRIPT_DIR > /dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushd is not supported in all shells, which is why this script originally didn't use it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is what I was guessing as well. But I did check last night that spark python image does support it. Maybe I am missing something here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends on the shell docker uses. Maybe it's better to avoid pushd etc. if we can

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the variable in both places if pushd isn't supported in all shell. Thoughts?

Copy link
Contributor

@eric-maynard eric-maynard Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, although the polaris script is meant to be run in a different environment and we can reasonably expect that the shell header will be respected. Users can install bash if it's not there and the script will fail if the header can't be respected. That's not necessarily true or desirable for a docker container used for regression tests.

In any event changing this is probably outside the scope of this PR. I'd like to get this fix merged ASAP as main is broken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge the previous commit then. @MonkeyCanCode , can you revert the latest one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flyrain / @eric-maynard changes are reverted (diff file content and diff name). I am not aware of the main is broken issue and this change won't fix that I think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noted in the description, the CLI's first-time setup current doesn't work and so the CLI is unusable without the user doing a workaround.

Copy link
Contributor

@collado-mike collado-mike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really like to move the python client and cli out of the regtests directory altogether. Originally, the python client was generated there purely for running the pyspark tests, but now that the client is being used for non-test purposes, I don't think it makes sense to keep this code in that directory at all.

@eric-maynard
Copy link
Contributor

+1 to @collado-mike's comment above. We should move this and the script will need to change then too. But let's get main un-broken ASAP for now.

@flyrain
Copy link
Contributor

flyrain commented Aug 20, 2024

I'd really like to move the python client and cli out of the regtests directory altogether.

yeah, this is not done by this PR, only because that dockerfile CANNOT visit its parent directory. We will figure it out in a follow-up.

@flyrain flyrain merged commit 8510e9f into apache:main Aug 20, 2024
@flyrain
Copy link
Contributor

flyrain commented Aug 20, 2024

Thanks a lot, @MonkeyCanCode for the change. Thanks @eric-maynard @aihuaxu @collado-mike for the reivew.

@MonkeyCanCode MonkeyCanCode deleted the fix_polaris_cli branch August 25, 2024 16:58
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Fix Jandex Maven coordinates (apache#2888)

The entry `jandex = { module = "io.smallrye.jandex:jandex", version ="3.5.0" }` is wrong (coordinates are `io.smallrye:jandex`), and Jandex is defined elsewhere as `smallrye-jandex`.
Interestingly, these (broken) coordinates seem to cause the consistent re-creation of the Quarkus 3.29.0 PR (the cause is a mystery).

* Update plugin com.gradle.develocity to v4.2.2 (apache#2597)

* Site: Hugo docs relative links (apache#2892)

* Update dependency software.amazon.awssdk:bom to v2.36.2 (apache#2901)

* Update GitHub Artifact Actions (apache#2895)

* Formatting: apply Spotless to :polaris-distribution (apache#2900)

* Build: Capture jcstress output in a log file (apache#2890)

The jcstress output is pretty verbose and prints a lot to the console.
This change captures the output in a log file. In case of a test failure, the output is logged to the console, but only in case of a failure.

* Prep: Site for 1.2 release  (apache#2877)

* Adding 1.2.0 as one of active releases (apache#2916)

Co-authored-by: Yufei Gu <yufei.apache.org>

* Use official spark image (apache#2899)

* Update dependency ipykernel to v7.1.0 (apache#2918)

* Added missing features doc (apache#2898)

* Added missing features doc

* Added missing features doc

* Site: Add a blog for StarRocks and Apache Polaris Integration (apache#2851)

* NoSQL: Node IDs - API, SPI + general implementation (apache#2728)

* NoSQL: Node IDs - API, SPI + general implementation

This PR provides a mechanism to assign a Polaris-cluster-wide unique node-ID to each Polaris instance, which is then used when generating Polaris-cluster-wide unique Snowflake-IDs.

The change is fundamental for the NoSQL work, but also demanded for the existing relational JDBC persistence.

Does not include any persistence specific implementation.

* NoSQL: Fail node-management-impl init after timeout

Also move the expensive part to a `@PostConstruct` to not block CDI entirely from initializing.

* Update dependency io.prometheus:prometheus-metrics-exporter-servlet-jakarta to v1.4.2 (apache#2929)

* Build-logic: `GitInfo` refactor (apache#2908)

Allows use of `GitInfo` for other use cases than just Jar manifest attributes.
SBOM generation will be another use case for Git information.

* Memoize ASF project information (apache#2909)

Information included in Polaris publications pulls some information about the project from ASF project metadata sources (Whimsey).
This information is currently only used when generating Maven poms, but will also be needed in SBOMs.

This change adds a new, memoized `AsfProject` information object, which holds the project infromation from Whimsey.

* Build: Simplify signing + fix execution in polaris-distribution (apache#2906)

This change simplifies generation of non-publication artifacts by adding a function taking the task which outputs shall be signed. That function takes care of setting up the correct task dependencies and task execution.

Also fixes an issue that signing does not always happen when running `./gradlew :polaris-distribution:assemble`, because the task dependency graph for the archive tasks and the corresponding signing tasks isn't properly set up.

* Proposed Test Fix (apache#2936)

Co-authored-by: Travis Michael Bowen <[email protected]>

* Update docker.io/prom/prometheus Docker tag to v3.7.3 (apache#2944)

* Update Quarkus Platform and Group to v3.29.0 (apache#2934)

* Update Gradle to v9.2.0 (apache#2938)

Co-authored-by: Robert Stupp <[email protected]>

* Update dependency openapi-generator-cli to v7.17.0 (apache#2940)

* Implement OpaPolarisAuthorizer (apache#2680)

* Update dependency com.github.ben-manes.caffeine:caffeine to v3.2.3 (apache#2923)

* Prefer PolarisPrincipal.getRoles in Resolver (apache#2925)

it should be sufficient to rely on `SecurityContext.getUserPrincipal`
alone, we dont need to call `isUserInRole` explicitly.

note due to the `ResolverTest` testing with non-existent roles we have
to add null-filtering to the `Resolver`.

* Move `nodeids` to `nosql` package parent (apache#2931)

Following up on apache#2728 this change moves "nodeids" code to the
`org.apache.polaris.persistence.nosql.nodeids` package.

* Update actions/stale digest to 39bea7d (apache#2950)

* Update dependency org.junit:junit-bom to v5.14.1 (apache#2951)

* docs(2843): Add documentation around Polaris-Tools (apache#2946)

* Add documentation around Polaris-Tools
* Related to apache#2843

* Add getting started with Apache Ozone (apache#2853)

* Add getting started with Apache Ozone

Use Apache Ozone as an example S3 impl. that does not have STS.

* fix typo in MinIO readme

* Update dependency com.azure:azure-sdk-bom to v1.3.0 (apache#2754)

* docs: add feature configuration section to Hive federation guide (apache#2952)

Add documentation for required feature flags when enabling
Hive Metastore federation. Users must configure three properties
in `application.properties` before Hive federation will work:

- `SUPPORTED_CATALOG_CONNECTION_TYPES`
- `SUPPORTED_EXTERNAL_CATALOG_AUTHENTICATION_TYPES`
- `ENABLE_CATALOG_FEDERATION`

Inspired from [this](https://apache-polaris.slack.com/archives/C084XDM50CB/p1761851426511259) Slack thread.

Co-authored-by: Prathyush Shankar <[email protected]>

* Change getting-start docker file to use official spark image from outdated jupyter image (apache#2943)

* Use official spark image

* Use official spark image

* Use official spark image

* Use official spark image

* Use official spark image

* Use Iterable for realms in BootstrapCommand (apache#2956)

* Simplify digest generation (apache#2907)

Similarly to the change to simplify artifact signing, this change simplifies digest generation by introducing a function to digest the output files of any task. That function takes care of setting up the correct task dependencies and task execution.

Also removes an unnecessary double buffering during digest generation.

* Build: `GitInfo` function to build a raw github content URL (apache#2910)

* NoSQL: nodeids renames

* NoSQL: Update test for Caffeine 3.2.3

The read of `Eviction` properties is "just" a volatile read since Caffeine 3.2.3 and trigger cleanups asynchronously. Before 3.2.3, cleanups happened synchronously.  This change breaks the initially present assertions of this test, but not the functionality of the production code.

See ben-manes/caffeine#1897

* Last merged commit cec41c4

---------

Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: olsoloviov <[email protected]>
Co-authored-by: Prashant Singh <[email protected]>
Co-authored-by: Yufei Gu <[email protected]>
Co-authored-by: Yong Zheng <[email protected]>
Co-authored-by: Youngwb <[email protected]>
Co-authored-by: Travis Bowen <[email protected]>
Co-authored-by: Travis Michael Bowen <[email protected]>
Co-authored-by: Sung Yun <[email protected]>
Co-authored-by: Christopher Lambert <[email protected]>
Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: Adam Christian <105929021+adam-christian-software@users.noreply.github.com>
Co-authored-by: carc-prathyush-shankar <[email protected]>
Co-authored-by: Prathyush Shankar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants