Commit 9d618bc

virajjasani authored and jojochuang committed
HADOOP-18125. Utility to identify git commit / Jira fixVersion discrepancies for RC preparation (#3991)
Signed-off-by: Wei-Chiu Chuang <[email protected]> (cherry picked from commit 697e5d4) (cherry picked from commit d763c99)
1 parent a8512d6 commit 9d618bc

3 files changed

+270
-0
lines changed

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

Apache Hadoop Git/Jira FixVersion validation
============================================================

Git commits in Apache Hadoop contain a Jira number of the format
HADOOP-XXXX, HDFS-XXXX, YARN-XXXX or MAPREDUCE-XXXX.
While creating a release candidate, we also include a changelist,
and this changelist can be identified based on Fixed/Closed Jiras
with the correct fix versions. However, sometimes there are a few
inconsistencies between the fixed Jiras and the Git commit messages.

The git_jira_fix_version_check.py script identifies all git commits
whose commit messages have any of these issues:

1. commit is reverted as per commit message
2. commit does not contain Jira number format in message
3. Jira does not have expected fixVersion
4. Jira has expected fixVersion, but it is not yet resolved

Moreover, this script also finds any resolved Jira with the expected
fixVersion but without any corresponding commit present.

This should be useful as part of RC preparation.

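
The order in which the four checks above are applied can be sketched as a small pure function. This is only an illustrative sketch, not the script itself: `classify_commit`, its parameters and the returned labels are hypothetical names, and the Jira lookup is replaced by plain arguments (`fix_versions`, `status`) so that the logic can be run without a Jira server.

```python
import re

# Statuses treated as "resolved", mirroring the two named in the checks above.
RESOLVED_STATUSES = ("Resolved", "Closed")


def classify_commit(message, project_keys, fix_versions, status, expected_version):
    """Return a label for the first check that a commit message trips.

    `fix_versions` and `status` stand in for what a Jira lookup would return.
    """
    # 1. Any mention of "revert" marks the commit as reverted.
    if re.search("revert", message, re.IGNORECASE):
        return "reverted"
    # 2. No known project key (e.g. "HADOOP-") in the message.
    if not any(key in message for key in project_keys):
        return "no-jira"
    # 3. The Jira exists but does not carry the expected fixVersion.
    if expected_version not in fix_versions:
        return "wrong-fix-version"
    # 4. The fixVersion matches but the Jira is not resolved yet.
    if status not in RESOLVED_STATUSES:
        return "unresolved"
    return "ok"
```

Because the checks run in this order, a revert commit is reported as reverted even when its message also contains a valid Jira number.
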
git_jira_fix_version_check supports Python 3 and requires the
jira package to be installed:

```
$ python3 --version
Python 3.9.7

$ python3 -m venv ./venv

$ ./venv/bin/pip install -r dev-support/git-jira-validation/requirements.txt

$ ./venv/bin/python dev-support/git-jira-validation/git_jira_fix_version_check.py

```

The script also requires the inputs below:
```
1. First commit hash to start excluding commits from history:
   Usually we can provide the latest commit hash from the last tagged
   release so that the script only loops through the commits that precede
   this hash in git log output (i.e. commits newer than it). e.g. for the
   3.3.2 release, we can provide
   git hash: fa4915fdbbbec434ab41786cb17b82938a613f16
   because this commit bumps up hadoop pom versions to 3.3.2:
   https://github.com/apache/hadoop/commit/fa4915fdbbbec434ab41786cb17b82938a613f16

2. Fix Version:
   Exact fixVersion that we would like to compare all Jiras' fixVersions
   with. e.g. for the 3.3.2 release, it should be 3.3.2.

3. JIRA Project Name:
   The exact, case-sensitive name of the project, e.g. HADOOP / OZONE

4. Path of project's working dir with release branch checked-in:
   Path of the project whose git hashes we want to compare. The local fork
   of the project should be up to date with upstream and the expected
   release branch should be checked out.

5. Jira server url (default url: https://issues.apache.org/jira):
   The default value of the server points to ASF Jira, but this script can
   be used outside of ASF Jira too.
```


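
How the first input bounds the scan can be sketched as follows. This is a simplified sketch under stated assumptions: `commits_to_check` is a hypothetical helper name, and `log_lines` stands for the lines of `git log --pretty=oneline` output (newest commit first), each starting with the full commit hash.

```python
def commits_to_check(log_lines, first_exclude_commit_hash):
    """Return the log lines that appear before the excluded hash.

    `log_lines` is expected in `git log --pretty=oneline` order (newest first).
    """
    selected = []
    for line in log_lines:
        # Stop at the first line whose hash matches the boundary commit;
        # that commit and everything older is ignored.
        if line.startswith(first_exclude_commit_hash):
            break
        selected.append(line)
    return selected
```

Two edge cases follow from the prefix match: a hash that never appears in the log means the whole history is scanned, and an empty hash matches the very first line, so nothing is scanned.
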
Example of script execution:
```
JIRA Project Name (e.g HADOOP / OZONE etc): HADOOP
First commit hash to start excluding commits from history: fa4915fdbbbec434ab41786cb17b82938a613f16
Fix Version: 3.3.2
Jira server url (default: https://issues.apache.org/jira):
Path of project's working dir with release branch checked-in: /Users/vjasani/Documents/src/hadoop-3.3/hadoop

Check git status output and verify expected branch

On branch branch-3.3.2
Your branch is up to date with 'origin/branch-3.3.2'.

nothing to commit, working tree clean


Jira/Git commit message diff starting: ##############################################
Jira not present with version: 3.3.2. Commit: 8cd8e435fb43a251467ca74fadcb14f21a3e8163 HADOOP-17198. Support S3 Access Points (#3260) (branch-3.3.2) (#3955)
WARN: Jira not found. Commit: 8af28b7cca5c6020de94e739e5373afc69f399e5 Updated the index as per 3.3.2 release
WARN: Jira not found. Commit: e42e483d0085aa46543ebcb1196dd155ddb447d0 Make upstream aware of 3.3.1 release
Commit seems reverted. Commit: 6db1165380cd308fb74c9d17a35c1e57174d1e09 Revert "HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)"
Commit seems reverted. Commit: 1e3f94fa3c3d4a951d4f7438bc13e6f008f228f4 Revert "HDFS-16333. fix balancer bug when transfer an EC block (#3679)"
Jira not present with version: 3.3.2. Commit: ce0bc7b473a62a580c1227a4de6b10b64b045d3a HDFS-16344. Improve DirectoryScanner.Stats#toString (#3695)
Jira not present with version: 3.3.2. Commit: 30f0629d6e6f735c9f4808022f1a1827c5531f75 HDFS-16339. Show the threshold when mover threads quota is exceeded (#3689)
Jira not present with version: 3.3.2. Commit: e449daccf486219e3050254d667b74f92e8fc476 YARN-11007. Correct words in YARN documents (#3680)
Commit seems reverted. Commit: 5c189797828e60a3329fd920ecfb99bcbccfd82d Revert "HDFS-16336. Addendum: De-flake TestRollingUpgrade#testRollback (#3686)"
Jira not present with version: 3.3.2. Commit: 544dffd179ed756bc163e4899e899a05b93d9234 HDFS-16171. De-flake testDecommissionStatus (#3280)
Jira not present with version: 3.3.2. Commit: c6914b1cb6e4cab8263cd3ae5cc00bc7a8de25de HDFS-16350. Datanode start time should be set after RPC server starts successfully (#3711)
Jira not present with version: 3.3.2. Commit: 328d3b84dfda9399021ccd1e3b7afd707e98912d HDFS-16336. Addendum: De-flake TestRollingUpgrade#testRollback (#3686)
Jira not present with version: 3.3.2. Commit: 3ae8d4ccb911c9ababd871824a2fafbb0272c016 HDFS-16336. De-flake TestRollingUpgrade#testRollback (#3686)
Jira not present with version: 3.3.2. Commit: 15d3448e25c797b7d0d401afdec54683055d4bb5 HADOOP-17975. Fallback to simple auth does not work for a secondary DistributedFileSystem instance. (#3579)
Jira not present with version: 3.3.2. Commit: dd50261219de71eaa0a1ad28529953e12dfb92e0 YARN-10991. Fix to ignore the grouping "[]" for resourcesStr in parseResourcesString method (#3592)
Jira not present with version: 3.3.2. Commit: ef462b21bf03b10361d2f9ea7b47d0f7360e517f HDFS-16332. Handle invalid token exception in sasl handshake (#3677)
WARN: Jira not found. Commit: b55edde7071419410ea5bea4ce6462b980e48f5b Also update hadoop.version to 3.3.2
...
...
...
Found first commit hash after which git history is redundant. commit: fa4915fdbbbec434ab41786cb17b82938a613f16
Exiting successfully
Jira/Git commit message diff completed: ##############################################

Any resolved Jira with fixVersion 3.3.2 but corresponding commit not present
Starting diff: ##############################################
HADOOP-18066 is marked resolved with fixVersion 3.3.2 but no corresponding commit found
HADOOP-17936 is marked resolved with fixVersion 3.3.2 but no corresponding commit found
Completed diff: ##############################################


```

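
The last section of the output is effectively a set difference between the resolved Jiras returned by the search and the Jira keys collected from commit messages. A minimal sketch, assuming both sides are already available as plain strings (`resolved_without_commit` is a hypothetical helper name, and the second key list below is made up for illustration):

```python
def resolved_without_commit(resolved_keys, keys_from_commits):
    """Return resolved Jira keys that were not seen in any commit message."""
    seen = set(keys_from_commits)
    # Keep the order in which the Jira search returned the issues.
    return [key for key in resolved_keys if key not in seen]


# Hypothetical usage with keys taken from the example output above.
missing = resolved_without_commit(
    ["HADOOP-18066", "HADOOP-17936", "HADOOP-17198"],
    ["HADOOP-17198", "HDFS-16344"],
)
for key in missing:
    print(key + " is marked resolved but no corresponding commit found")
```

Two caveats when reading this section of the output: the script's search is scoped to the single project name entered at the prompt, so HDFS, YARN and MAPREDUCE Jiras are not covered by this reverse check, and `search_issues` in the `jira` package appears to return at most 50 issues by default, so a release with more resolved issues may need a larger `maxResults`.
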
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
#!/usr/bin/env python3
############################################################################
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
############################################################################
"""An application to assist Release Managers with ensuring that histories in
Git and fixVersions in JIRA are in agreement. See README.md for a detailed
explanation.
"""


import os
import re
import subprocess

from jira import JIRA

jira_project_name = input("JIRA Project Name (e.g HADOOP / OZONE etc): ") \
                    or "HADOOP"
# Define project_jira_keys with - appended. e.g for HADOOP Jiras,
# project_jira_keys should include HADOOP-, HDFS-, YARN-, MAPREDUCE-
project_jira_keys = [jira_project_name + '-']
if jira_project_name == 'HADOOP':
    project_jira_keys.append('HDFS-')
    project_jira_keys.append('YARN-')
    project_jira_keys.append('MAPREDUCE-')

first_exclude_commit_hash = input("First commit hash to start excluding commits from history: ")
fix_version = input("Fix Version: ")

jira_server_url = input(
    "Jira server url (default: https://issues.apache.org/jira): ") \
        or "https://issues.apache.org/jira"

jira = JIRA(server=jira_server_url)

local_project_dir = input("Path of project's working dir with release branch checked-in: ")
os.chdir(local_project_dir)

GIT_STATUS_MSG = subprocess.check_output(['git', 'status']).decode("utf-8")
print('\nCheck git status output and verify expected branch\n')
print(GIT_STATUS_MSG)

print('\nJira/Git commit message diff starting: ##############################################')

issue_set_from_commit_msg = set()

for commit in subprocess.check_output(['git', 'log', '--pretty=oneline']).decode(
        "utf-8").splitlines():
    if commit.startswith(first_exclude_commit_hash):
        print("Found first commit hash after which git history is redundant. commit: "
              + first_exclude_commit_hash)
        print("Exiting successfully")
        break
    if re.search('revert', commit, re.IGNORECASE):
        print("Commit seems reverted. \t\t\t Commit: " + commit)
        continue
    ACTUAL_PROJECT_JIRA = None
    for project_jira in project_jira_keys:
        if project_jira in commit:
            ACTUAL_PROJECT_JIRA = project_jira
            break
    if not ACTUAL_PROJECT_JIRA:
        print("WARN: Jira not found. \t\t\t Commit: " + commit)
        continue
    JIRA_NUM = ''
    for c in commit.split(ACTUAL_PROJECT_JIRA)[1]:
        if c.isdigit():
            JIRA_NUM = JIRA_NUM + c
        else:
            break
    issue = jira.issue(ACTUAL_PROJECT_JIRA + JIRA_NUM)
    EXPECTED_FIX_VERSION = False
    for version in issue.fields.fixVersions:
        if version.name == fix_version:
            EXPECTED_FIX_VERSION = True
            break
    if not EXPECTED_FIX_VERSION:
        print("Jira not present with version: " + fix_version + ". \t Commit: " + commit)
        continue
    if issue.fields.status is None or issue.fields.status.name not in ('Resolved', 'Closed'):
        print("Jira is not resolved yet? \t\t Commit: " + commit)
    else:
        # This means Jira corresponding to current commit message is resolved with expected
        # fixVersion.
        # This is no-op by default, if needed, convert to print statement.
        issue_set_from_commit_msg.add(ACTUAL_PROJECT_JIRA + JIRA_NUM)

print('Jira/Git commit message diff completed: ##############################################')

print('\nAny resolved Jira with fixVersion ' + fix_version
      + ' but corresponding commit not present')
print('Starting diff: ##############################################')
all_issues_with_fix_version = jira.search_issues(
    'project=' + jira_project_name + ' and status in (Resolved,Closed) and fixVersion='
    + fix_version)

for issue in all_issues_with_fix_version:
    if issue.key not in issue_set_from_commit_msg:
        print(issue.key + ' is marked resolved with fixVersion ' + fix_version
              + ' but no corresponding commit found')

print('Completed diff: ##############################################')
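
The script above finds the Jira number by splitting the commit line on the project key and collecting digits one character at a time, and it flags any line that mentions "revert". A possible alternative, shown here only as a hedged sketch and not as part of this commit, is to use regular expressions for both steps. `extract_jira_key` and `is_revert` are hypothetical helper names, and the revert pattern assumes the default `Revert "..."` subject that `git revert` generates.

```python
import re


def extract_jira_key(commit_line, project_keys):
    """Return the first 'PROJECT-123' style key in the line, or None.

    `project_keys` uses the same shape as in the script, e.g. ['HADOOP-', 'HDFS-'].
    """
    prefixes = "|".join(re.escape(key.rstrip("-")) for key in project_keys)
    # Require at least one digit after the dash, so a bare "HADOOP-" is ignored.
    match = re.search(r"\b(" + prefixes + r")-(\d+)", commit_line)
    return match.group(0) if match else None


# One line of `git log --pretty=oneline`: "<hash> <subject>".
REVERT_SUBJECT = re.compile(r'^[0-9a-f]+ Revert "')


def is_revert(commit_line):
    """True only when the subject starts with the default revert prefix."""
    return bool(REVERT_SUBJECT.match(commit_line))
```

Compared with the script, this sketch returns the key that appears first in the line rather than the first match in `project_keys` order, and `is_revert` is deliberately narrower than a case-insensitive search for "revert" anywhere in the line, so it would miss reverts with a hand-written subject.
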
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
jira==3.1.1
