@snmvaughan
Contributor

Description of PR

The following CVEs can be addressed by upgrading dependencies within the build. This includes a replacement of HTrace with a noop implementation.

This addresses all of the CVEs from branch-3.3.4 except for the kotlin library associated with okhttp and the ones that would require upgrading Netty to 4.x.

This is a backport specifically targeted at 3.3.4

How was this patch tested?

Tested using a local build of branch-3.3.4 along with this patch.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@brahmareddybattula
Contributor

Changes LGTM. Some unchanged files also appear to be popping up here. Any chance of backporting this to branch-3.2 as well?

@steveloughran
Contributor

steveloughran commented Jun 23, 2022

(you are going to hate me here. sorry)

First, please let's not have "update a few dependencies" patches. It is not a useful title, and updating multiple dependencies simultaneously makes it a lot harder to identify problems through git bisect and makes the changes harder to roll back and cherry-pick.

Second, we must not have anything in this release which isn't already in branch-3.3 and so hasn't been stabilising there through the use other developers have been making of that branch.

Finally, I am scared of any and all last-minute updates of dependencies, as the blast radius of changing a few digits in a version number in a POM file can have a dramatic impact on a project two hops away.

That's why I believe the default decision on any last minute dependency update should be "no". This is worth bearing in mind as I intend to share release manager responsibilities with Mukund on the branch-3.3 feature release this summer, and refusing last-minute changes is going to be my default action, especially when it comes to jar updates. Get those changes in and stabilising now!
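
To illustrate the git bisect point above, here is a minimal sketch (not part of the patch) of what a bisection can tell you when dependency updates are bundled versus split. The commit ids and the `suspects` helper are hypothetical, and the search assumes the history starts good and ends bad.

```python
from typing import Callable, Dict, List, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit, as git bisect does.

    Assumes commits are in history order, the state before commits[0]
    is good, and the state at the last commit is bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is after mid
    return commits[lo]


def suspects(history: Dict[str, List[str]], culprit: str) -> List[str]:
    """Return the dependencies changed by the commit that bisection blames."""
    order = list(history)

    def is_bad(commit: str) -> bool:
        # The build is bad once the culprit dependency change has been applied.
        applied = [d for c in order[: order.index(commit) + 1] for d in history[c]]
        return culprit in applied

    return history[first_bad_commit(order, is_bad)]


# Hypothetical histories: commit id -> dependencies changed in that commit.
bundled = {"c1": ["jetty", "zookeeper", "aws-sdk", "htrace"]}
split = {"c1": ["jetty"], "c2": ["zookeeper"], "c3": ["aws-sdk"], "c4": ["htrace"]}

print(suspects(bundled, "aws-sdk"))  # every dependency in the bundle is still a suspect
print(suspects(split, "aws-sdk"))    # only the one dependency that was changed
```

With one dependency per commit, the blamed commit names a single dependency, and that same commit can be reverted or cherry-picked on its own.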

jetty

-1 to jetty update because I'm scared of what will break. the hadoop.next release will upgrade to jetty 2 and shade it.

Htrace

-1 to htrace as it was fixed in this branch by #3520

9e2936f8d1f HADOOP-17424. Replace HTrace with No-Op tracer (#3520)

If this is not the case then we have a serious issue which needs to be fixed across all the recent branches. file a critical hadoop JIRA and we can go from there.

Zookeeper

-1 until/unless in branch-3.3

Interesting one there. trunk is on 3.6.3 after HADOOP-17612. Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0 #3241

For any change there, an increment on 3.5.x is lower risk and may not need a matching curator increment, but that'd still need qualification

for the branch-3.3 release, why don't we cherrypick #3241 and followons?

AWS SDK

-1 to updating the AWS SDK except as a standalone cherrypick of our branch-3.3 patch #3864 with full requalification

d8ab84275e0 - HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)

The SDK is covered in HADOOP-18068; any backporting should just be a cherry-pick. But as with most AWS SDK updates, it caused a regression (HADOOP-18085). Anyone proposing it as a backport has to

  1. Run the full hadoop-aws integration test suite with -Dscale and declare which endpoint they ran against.
  2. Look at the section "Qualifying an AWS SDK Update" and treat the instructions there as a MUST, not a MAY: https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/testing.html#Qualifying_an_AWS_SDK_Update
  3. note that instruction 1 there is "Don’t make this a last minute action."

I have encountered other cases where people have been updating this SDK dependency in private forks. Yes, tools do highlight Jackson serialisation issues which exist in the shaded Jackson dependency. However, the AWS SDK does not use those bits of Jackson. And because the Jackson classes in this library are shaded, nothing else uses those bits either, so the risk does not actually manifest in the S3A connector.

If you really want this in, create a single PR cherry-picking HADOOP-18068 and all follow-on fixes which are applicable to this branch, and say which AWS endpoint you ran the hadoop-aws test suites against. Also do the entire SDK update qualification covered in the testing doc. I will then merge the chain of commits in one by one.

This should be safe because we have actually been using this in branch 3.3+ and other than the regression in tests there have been no adverse consequences. It MUST be the exact version we have been using (1.12.132) as no later release has been validated.
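
The "exact version" requirement above can be checked mechanically. A minimal sketch, assuming the SDK version lives in a POM property named `aws-java-sdk.version` (an assumption here, as are the helper names and the sample POM fragment):

```python
import xml.etree.ElementTree as ET
from typing import Optional

POM_NS = "http://maven.apache.org/POM/4.0.0"
VALIDATED_SDK_VERSION = "1.12.132"      # the version named in this thread
SDK_PROPERTY = "aws-java-sdk.version"   # assumed property name


def pom_property(pom_xml: str, name: str) -> Optional[str]:
    """Return the value of <properties>/<name> in a POM, or None if absent."""
    root = ET.fromstring(pom_xml)
    props = root.find(f"{{{POM_NS}}}properties")
    if props is None:
        return None
    for child in props:
        if child.tag == f"{{{POM_NS}}}{name}":
            return (child.text or "").strip() or None
    return None


def sdk_version_is_validated(pom_xml: str) -> bool:
    """True only when the POM pins exactly the validated SDK version."""
    return pom_property(pom_xml, SDK_PROPERTY) == VALIDATED_SDK_VERSION


# Hypothetical POM fragment for illustration only.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <properties>
    <aws-java-sdk.version>1.12.132</aws-java-sdk.version>
  </properties>
</project>"""

print(sdk_version_is_validated(SAMPLE_POM))
```

The comparison is an exact string match on purpose: a later SDK release does not pass, since only 1.12.132 has been validated.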

@steveloughran
Contributor

sorry, i confused jetty with jersey. don't know how jetty is on branch 3.3. it is not quite as bad as that jersey thing.

steveloughran and others added 3 commits June 23, 2022 20:22
With this update, the versions of key shaded dependencies are

  jackson    2.12.3
  httpclient 4.5.13

Contributed by Steve Loughran

Change-Id: Id9ed677352d54e8ea71b9729b6a4bfedc6142825
…lation (apache#3902)

Part of HADOOP-17198. Support S3 Access Points.

HADOOP-18068. "upgrade AWS SDK to 1.12.132" broke the access point endpoint
translation.

Correct endpoints should start with "s3-accesspoint.", after SDK upgrade they start with
"s3.accesspoint-" which messes up tests + region detection by the SDK.

Contributed by Bogdan Stolojan

Change-Id: I0c0181628ab803afc39036003777eaec79aa378c
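
The prefix mismatch described in this commit message can be shown with a small sketch. This is an illustration only, not the code in the actual patch; the function name and the example host names are hypothetical and built from the two prefixes quoted above.

```python
GOOD_PREFIX = "s3-accesspoint."   # what a correct endpoint starts with
BAD_PREFIX = "s3.accesspoint-"    # what was produced after the SDK upgrade


def fix_access_point_endpoint(endpoint: str) -> str:
    """Rewrite a mistranslated access point endpoint; leave others untouched."""
    if endpoint.startswith(BAD_PREFIX):
        return GOOD_PREFIX + endpoint[len(BAD_PREFIX):]
    return endpoint


# Hypothetical host names for illustration.
print(fix_access_point_endpoint("s3.accesspoint-eu-west-1.amazonaws.com"))
print(fix_access_point_endpoint("s3-accesspoint.eu-west-1.amazonaws.com"))
```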
@steveloughran
Contributor

I have done the aws sdk update with followup patch, run the ITests with only an expected failure (marker tool and the landsat bucket). not going to do the others.

however, zookeeper may merit a change into branch 3.3 and then back to here. can you do that as its own JIRA. thanks

@steveloughran
Contributor

oh, and -1. like I said, sorry.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 0s | | Docker mode activated. |
| -1 ❌ | patch | 0m 20s | | #4491 does not apply to branch-3.3.4. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help. |

| Subsystem | Report/Notes |
|----------:|:-------------|
| GITHUB PR | #4491 |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4491/2/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

@snmvaughan
Contributor Author

I understand the desire to evaluate and manage the individual changes, so I'll resubmit the individual dependency updates. I was already planning on submitting these for branch-3.3 when the 3.3.4 switch came to my attention.

The HTrace library is showing up in the distribution as part of a transitive dependency from HBase. Given the goal to remove the dependency, the CVE, and ongoing work to move to OpenTelemetry, I would suggest we re-consider the hbase-noop-htrace swap.

I was already planning on looking into the Curator 5 related ZooKeeper pull request, but felt the elimination of CVEs in the short-term was important.

@jojochuang
Copy link
Contributor

Maintenance release update of Jetty is usually fine as long as the unit tests pass and that no additional dependencies got introduced.

Contributor

@jojochuang jojochuang left a comment

and didn't #4495 / HADOOP-18044 update jquery?

@steveloughran
Contributor

  1. I've cherrypicked the aws sdk update we'd had in branch-3.3.
  2. the jquery stuff should all be good now.
  3. if htrace is still visible then that's a problem across the branches. Is it getting into the distribution, or is it a transitive dependency of something (test jar?) we don't distribute but do publish on Maven?
  4. jetty will need to be reviewed carefully.
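
For the htrace question in item 3, one way to see where a transitive dependency comes from is to walk the text output of `mvn dependency:tree`. The following is a rough sketch under stated assumptions: the `dependency_path` helper is hypothetical, it assumes the default text tree layout (three characters of indent per level), and the sample coordinates and versions are made up for illustration.

```python
import re
from typing import List, Optional

# Indent groups are "|  " or "   "; a node is introduced by "+- " or "\- ".
_NODE = re.compile(r"^((?:[| ] {2})*)([+\\]- )?(\S+)")


def dependency_path(tree_output: str, needle: str) -> Optional[List[str]]:
    """Return the chain of artifacts leading to the first match of needle."""
    stack: List[str] = []
    for raw in tree_output.splitlines():
        line = raw[len("[INFO] "):] if raw.startswith("[INFO] ") else raw
        match = _NODE.match(line)
        if not match:
            continue
        indent, marker, coords = match.groups()
        depth = len(indent) // 3 + (1 if marker else 0)
        del stack[depth:]          # drop siblings and deeper nodes
        stack.append(coords)
        if needle in coords:
            return list(stack)
    return None


# Hypothetical dependency:tree output for illustration only.
SAMPLE_TREE = r"""[INFO] org.example:app:jar:1.0
[INFO] +- org.apache.hbase:hbase-client:jar:1.7.1:compile
[INFO] |  \- org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile
[INFO] \- org.slf4j:slf4j-api:jar:1.7.36:compile"""

for hop in dependency_path(SAMPLE_TREE, "htrace") or []:
    print(hop)
```

The returned chain shows which direct dependency pulls the artifact in. Whether that artifact ends up in the distribution or only in a published test jar still has to be checked against the module that declares it.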

I am away for a week and want to make a release next week which is up to date security wise but not going to cause regressions. please can people sort htrace/jetty stuff out this week so everything is ready then. thanks

@snmvaughan
Contributor Author

snmvaughan commented Jun 27, 2022 via email

@snmvaughan
Contributor Author

I'm closing this pull request, and replacing it with smaller targeted ones.

@snmvaughan snmvaughan closed this Jun 28, 2022
@apache apache deleted a comment from hadoop-yetus Jul 18, 2022
@steveloughran
Contributor

this patch includes another aws sdk update in a9c174b7d3e69b6eee1e271.

steve, this merits a whole new jira with full qualification. as now it is very, very late to get it into 3.3.4, though I know we will get complaints if it isn't
