Skip to content

Conversation

@panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Dec 14, 2023

What changes were proposed in this pull request?

The pr aims to

Why are the changes needed?

We use the local maven repo as the first-level cache in ivy. The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user. Based on the original intention, we should skip the cache directly in similar situations.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually test.

Was this patch authored or co-authored using generative AI tooling?

No.

@panbingkun panbingkun changed the title [SPARK-46400][CORE][SQL] When there are corrupted files in the local maven repo, retry to skip this cache [SPARK-46400][CORE][SQL] When there are damaged files in the local maven repo, skip this cache and try again Dec 14, 2023
@panbingkun panbingkun changed the title [SPARK-46400][CORE][SQL] When there are damaged files in the local maven repo, skip this cache and try again [SPARK-46400][CORE][SQL] When there are corrupted files in the local maven repo, skip this cache and try again Dec 14, 2023
@panbingkun
Copy link
Contributor Author

friendly ping @vsevolodstep-db @hvanhovell
If you have time, could you take some time to help review this PR?

// scalastyle:on println

val ivy = Ivy.newInstance(ivySettings)
var ivy = Ivy.newInstance(ivySettings)
Copy link
Contributor

@LuciferYang LuciferYang Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can give it a new name below, no need to change it to var here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
// resolve dependencies
val rr: ResolveReport = ivy.resolve(md, resolveOptions)
var rr: ResolveReport = ivy.resolve(md, resolveOptions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

if (failedReports.nonEmpty && noCacheIvySettings.isDefined) {
val failedArtifacts = failedReports.map(
fr => fr.getArtifact).mkString("[", ", ", "]")
logInfo(s"Download failed: $failedArtifacts, attempt to skip local maven cache.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip local-m2-cache?

throw new RuntimeException(rr.getAllProblemMessages.toString)
// SPARK-46302: When there are some corrupted jars in the maven repo,
// we try to continue without the cache
val failedReports = rr.getArtifactsReports(DownloadStatus.FAILED, true)
Copy link
Contributor

@LuciferYang LuciferYang Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that if the submission machine has a .m2 dir, it will most likely enter the retry process of without local-m2-cache because the local-m2-cache produced from building spark distribution also has this issues now.

Perhaps it's difficult to obtain a perfect local-m2-cache now.

* @param ivySettings
* An IvySettings containing resolvers to use
* @param noCacheIvySettings
* An no-cache IvySettings containing resolvers to use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation

def resolveMavenCoordinates(
coordinates: String,
ivySettings: IvySettings,
noCacheIvySettings: Option[IvySettings] = None,
Copy link
Contributor

@LuciferYang LuciferYang Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where was a non-None noCacheIvySettings passed in? I build a new client with this pr , but also found the error message:

:: problems summary ::
:::: WARNINGS
		[NOT FOUND  ] org.apache.hive#hive-exec;2.3.9!hive-exec.jar (0ms)

	==== local-m2-cache: tried

	  file:/Users/yangjie01/.m2/repository/org/apache/hive/hive-exec/2.3.9/hive-exec-2.3.9.jar

		[NOT FOUND  ] asm#asm;3.2!asm.jar (1ms)

	==== local-m2-cache: tried

	  file:/Users/yangjie01/.m2/repository/asm/asm/3.2/asm-3.2.jar

		::::::::::::::::::::::::::::::::::::::::::::::

		::              FAILED DOWNLOADS            ::

		:: ^ see resolution messages for details  ^ ::

		::::::::::::::::::::::::::::::::::::::::::::::

		:: asm#asm;3.2!asm.jar

		:: org.apache.hive#hive-exec;2.3.9!hive-exec.jar

		::::::::::::::::::::::::::::::::::::::::::::::

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the input for noCacheIvySettings in IsolatedClientLoader and built a new client. Although the retry was successful, there will be a large number of error messages like the following:

:: problems summary ::
:::: ERRORS
	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where was a non-None noCacheIvySettings passed in? I build a new client with this pr , but also found the error message:

:: problems summary ::
:::: WARNINGS
		[NOT FOUND  ] org.apache.hive#hive-exec;2.3.9!hive-exec.jar (0ms)

	==== local-m2-cache: tried

	  file:/Users/yangjie01/.m2/repository/org/apache/hive/hive-exec/2.3.9/hive-exec-2.3.9.jar

		[NOT FOUND  ] asm#asm;3.2!asm.jar (1ms)

	==== local-m2-cache: tried

	  file:/Users/yangjie01/.m2/repository/asm/asm/3.2/asm-3.2.jar

		::::::::::::::::::::::::::::::::::::::::::::::

		::              FAILED DOWNLOADS            ::

		:: ^ see resolution messages for details  ^ ::

		::::::::::::::::::::::::::::::::::::::::::::::

		:: asm#asm;3.2!asm.jar

		:: org.apache.hive#hive-exec;2.3.9!hive-exec.jar

		::::::::::::::::::::::::::::::::::::::::::::::

Yes, this is my bad, it should be passed in when IsolatedClientLoader calls it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the input for noCacheIvySettings in IsolatedClientLoader and built a new client. Although the retry was successful, there will be a large number of error messages like the following:

:: problems summary ::
:::: ERRORS
	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

	unknown resolver local-m2-cache

The above issues have been resolved.


/**
* Extracts maven coordinates from a comma-delimited string
* Create a ChainResolver used by Ivy to search for and resolve dependencies.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code comments were incorrect.

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, it is acceptable to directly remove or disable the use of local-m2-cache by default for me, as this only affects the speed of the first startup with spark.sql.hive.metastore.jars=maven

Besides, we should listen to other people's suggestions

@LuciferYang
Copy link
Contributor

LuciferYang commented Jan 2, 2024

friendly ping @HyukjinKwon, do you know who would be a good fit to review this PR together? Thanks ~

@LuciferYang
Copy link
Contributor

@srowen Do you have any suggestions on this pr? As far as I remember, the current maven daily test still can't test HiveExternalCatalogVersionsSuite due to this issue. Thanks ~

@srowen
Copy link
Member

srowen commented Jan 30, 2024

If this fixes the issue, great! any downside?

@mridulm
Copy link
Contributor

mridulm commented Jan 30, 2024

+CC @shardulm94

@panbingkun
Copy link
Contributor Author

If this fixes the issue, great! any downside?

The advantage: when there is a corrupted jar in Local Maven, it will retry in the skip Local Maven cache and will no longer fail to exit due to this reason.
The disadvantage: during the second retry, local Maven will no longer be used as the cache, which will increase some download time.

@srowen srowen closed this in f2a471e Jan 31, 2024
@srowen
Copy link
Member

srowen commented Jan 31, 2024

Merged to master

@LuciferYang
Copy link
Contributor

@panbingkun Could you submit a fix for branch-3.5 and branch-3.4 respectively? When testing HiveExternalCatalogVersionsSuite using maven, it requires downloading Spark-3.5 and Spark-3.4 with this fix in order to complete the creation of the data table.

@panbingkun
Copy link
Contributor Author

@panbingkun Could you submit a fix for branch-3.5 and branch-3.4 respectively? When testing HiveExternalCatalogVersionsSuite using maven, it requires downloading Spark-3.5 and Spark-3.4 with this fix in order to complete the creation of the data table.

Okay, let me do it.

LuciferYang pushed a commit that referenced this pull request Feb 8, 2024
…ocal maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: #44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.5.
Master branch pr: #44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #45017 from panbingkun/branch-3.5_SPARK-46400.

Authored-by: panbingkun <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Feb 15, 2024
…ocal maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: #44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.4.
Master branch pr: #44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #45018 from panbingkun/branch-3.4_SPARK-46400.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Feb 15, 2024

Thank you, @panbingkun and all!

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Mar 26, 2024
…ocal maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: apache#44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.4.
Master branch pr: apache#44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#45018 from panbingkun/branch-3.4_SPARK-46400.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…ocal maven repo, skip this cache and try again (apache#367)

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: apache#44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.5.
Master branch pr: apache#44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#45017 from panbingkun/branch-3.5_SPARK-46400.

Authored-by: panbingkun <[email protected]>

Signed-off-by: yangjie01 <[email protected]>
Co-authored-by: panbingkun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants