Contributor

@NihalJain NihalJain commented Sep 15, 2024

…dule hbase-diagnostics

The purpose of this task is to refactor and move certain tools currently located under the test packaging to a new module, named 'hbase-diagnostics'.

The following tools have been initially identified for relocation (more will be added as and when they are identified):

These tools are valuable beyond the scope of testing and should be accessible in the binary distribution of HBase. However, their current location within the test jars adds unnecessary bloat to the assembly and classpath, and potentially introduces CVE-prone JARs into the binary assemblies. We plan to remove all test jars from assembly with HBASE-28433.

This task involves creating the new 'hbase-diagnostics' module, and moving the identified tools into this module. It also includes ensuring that these tools function correctly in their new location and that their relocation does not negatively impact any existing functionality or dependencies.

Also see the draft patch without this change, for the follow-up work in HBASE-28433, at #6184.

@NihalJain NihalJain marked this pull request as draft September 15, 2024 12:57
Contributor Author

NihalJain commented Sep 15, 2024

Very rough change. Lots of things to handle:

  • code duplication
  • cleanup
  • commented code
  • organising methods into proper files
  • testing: did basic ltt and pe runs without MapReduce on a local build; they worked fine
  • it also seems I did not pull the latest code before working here, so I need to resolve those as well 🤦

Will post a summary of changes and things to be taken care of later tomorrow.

Contributor Author

NihalJain commented Sep 16, 2024

Change Summary

  • Added a new module called hbase-diagnostics (as suggested by @stoty)
    • Question: Any better names?
  • Moved all the tools targeted by this task (i.e. PerformanceEvaluation, LoadTestTool, HFilePerformanceEvaluation, ScanPerformanceEvaluation and LoadBalancerPerformanceEvaluation) to main of hbase-diagnostics, along with all related classes:
    • Question: Anything else which we need to handle?
  • Created copies of the following classes only to surface all their usages, to help decide (based on references) where to move them in the final patch:
    • RandomDistribution
    • KeyProviderForTesting
    • LoadTestKVGenerator
  • Added a new util DiagnosticToolsCommonUtils and moved the required method loginAndReturnUGI() into it from HBaseKerberosUtils
  • Added a new util LoadTestUtil and moved all load-test-related code from HFileTestUtil and HBaseTestingUtil into it
    • Question: Should we merge this into DiagnosticToolsCommonUtils?
  • Copied generateData() from PerformanceEvaluation into TestHFileOutputFormat2 to break a cyclic dependency
  • Added @InterfaceAudience.Private to all classes moved from test code to main
  • Replaced UTIL.getConfiguration() with HBaseConfiguration.create() in LoadBalancerPerformanceEvaluation

TODO

  • Need to handle testHBASE14489() in hbase-server tests which depends on FilterAllFilter()
    • Should we move this class to main of hbase-server or hbase-common?
  • Deal with copied classes which have cyclic dependencies and remove all copies and their references
    • Should we copy these to main of hbase-server or hbase-common?
  • Need to do some manual testing once the code is well polished

@ndimiduk, @stoty, @Apache9 Would you be able to provide some early feedback on this draft PR?

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 29s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for branch
+1 💚 mvninstall 3m 1s master passed
+1 💚 compile 2m 5s master passed
+1 💚 javadoc 3m 54s master passed
+1 💚 shadedjars 5m 36s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for patch
+1 💚 mvninstall 2m 55s the patch passed
+1 💚 compile 2m 8s the patch passed
+1 💚 javac 2m 8s the patch passed
-0 ⚠️ javadoc 0m 12s /results-javadoc-javadoc-hbase-diagnostics.txt hbase-diagnostics generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-0 ⚠️ javadoc 1m 57s /results-javadoc-javadoc-root.txt root generated 2 new + 91 unchanged - 0 fixed = 93 total (was 91)
+1 💚 shadedjars 5m 33s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
-1 ❌ unit 245m 15s /patch-unit-root.txt root in the patch failed.
282m 15s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6249
Optional Tests javac javadoc unit compile shadedjars
uname Linux 633e00b5ce0a 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 469be3d
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/2/testReport/
Max. process+thread count 6078 (vs. ulimit of 30000)
modules C: hbase-common hbase-client hbase-balancer hbase-asyncfs hbase-server hbase-mapreduce hbase-diagnostics hbase-compression/hbase-compression-zstd . hbase-assembly hbase-it U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/2/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 2m 25s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for branch
+1 💚 mvninstall 3m 2s master passed
+1 💚 compile 7m 56s master passed
+1 💚 checkstyle 1m 13s master passed
+1 💚 spotbugs 12m 0s master passed
+1 💚 spotless 0m 44s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for patch
+1 💚 mvninstall 3m 3s the patch passed
+1 💚 compile 8m 1s the patch passed
-0 ⚠️ javac 8m 1s /results-compile-javac-root.txt root generated 61 new + 1207 unchanged - 4 fixed = 1268 total (was 1211)
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 13s /results-checkstyle-root.txt root: The patch generated 43 new + 58 unchanged - 36 fixed = 101 total (was 94)
+1 💚 xmllint 0m 0s No new issues.
-1 ❌ spotbugs 7m 42s /new-spotbugs-root.html root generated 11 new + 0 unchanged - 0 fixed = 11 total (was 0)
+1 💚 hadoopcheck 11m 33s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 0m 43s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 1m 34s The patch does not generate ASF License warnings.
77m 17s
Reason Tests
SpotBugs module:root
Integral division result cast to double or float in org.apache.hadoop.hbase.PerformanceEvaluation.calculateRowsAndSize(PerformanceEvaluation$TestOptions) At PerformanceEvaluation.java:[line 3154]
org.apache.hadoop.hbase.PerformanceEvaluation$RunResult defines compareTo(PerformanceEvaluation$RunResult) and uses Object.equals() At PerformanceEvaluation.java:[line 250]
Random object created and used only once in org.apache.hadoop.hbase.util.LoadTestKVGenerator.getValueForRowColumn(int, byte[][]) At LoadTestKVGenerator.java:[line 111]
org.apache.hadoop.hbase.util.LoadTestTool.DEFAULT_NUM_REGIONS_PER_SERVER isn't final but should be At LoadTestTool.java:[line 163]
org.apache.hadoop.hbase.util.LoadTestUtil.DEFAULT_COLUMN_FAMILY should be both final and package protected At LoadTestUtil.java:[line 50]
org.apache.hadoop.hbase.util.MultiThreadedAction.verifyResultAgainstDataGenerator(Result, boolean, boolean) concatenates strings using + in a loop At MultiThreadedAction.java:[line 415]
Integral division result cast to double or float in org.apache.hadoop.hbase.util.MultiThreadedAction$ProgressReporter.run() At MultiThreadedAction.java:[line 206]
org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.createGet(long) concatenates strings using + in a loop At MultiThreadedReader.java:[line 318]
Dead store to rowKey in org.apache.hadoop.hbase.util.MultiThreadedReaderWithACL$HBaseReaderThreadWithACL.queryKey(Get, boolean, long) At MultiThreadedReaderWithACL.java:[line 91]
Inconsistent synchronization of org.apache.hadoop.hbase.util.MultiThreadedUpdater.writer; locked 75% of time Unsynchronized access at MultiThreadedUpdater.java:[line 80]
Unwritten field:MultiThreadedUpdaterWithACL.java:[line 94]
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/3/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #6249
Optional Tests dupname asflicense javac codespell detsecrets xmllint spotless spotbugs checkstyle compile hadoopcheck hbaseanti
uname Linux 162b34e96fd4 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 167c153
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 192 (vs. ulimit of 30000)
modules C: hbase-common hbase-client hbase-balancer hbase-asyncfs hbase-server hbase-mapreduce hbase-diagnostics hbase-it hbase-compression/hbase-compression-zstd hbase-assembly . U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/3/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3 xmllint=20913
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.
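
For readers unfamiliar with the SpotBugs findings listed in the report above, here is a minimal, self-contained sketch of three of the flagged patterns (integral division cast to double, compareTo() defined while equals() is inherited from Object, and string concatenation with + in a loop) alongside one common way to address each. It is not the HBase code: the class SpotBugsFindingsSketch and the names rowsPerClientTruncated, rowsPerClientExact, RunTime and joinColumns are hypothetical and exist only for illustration. How the actual findings in PerformanceEvaluation and MultiThreadedAction should be fixed is a decision for the patch author.

```java
final class SpotBugsFindingsSketch {

  // Flagged pattern: the division is done on ints, so the fraction is
  // already lost by the time the result is cast to double.
  static double rowsPerClientTruncated(int totalRows, int clients) {
    return (double) (totalRows / clients);
  }

  // One possible fix: cast an operand first so the division itself is
  // performed in floating point.
  static double rowsPerClientExact(int totalRows, int clients) {
    return (double) totalRows / clients;
  }

  // A Comparable whose equals() and hashCode() agree with compareTo(),
  // instead of inheriting identity-based Object.equals().
  static final class RunTime implements Comparable<RunTime> {
    final long durationMs;

    RunTime(long durationMs) {
      this.durationMs = durationMs;
    }

    @Override
    public int compareTo(RunTime other) {
      return Long.compare(durationMs, other.durationMs);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof RunTime && ((RunTime) o).durationMs == durationMs;
    }

    @Override
    public int hashCode() {
      return Long.hashCode(durationMs);
    }
  }

  // Instead of result += column inside the loop, append to a single
  // StringBuilder so no intermediate String is created per iteration.
  static String joinColumns(String[] columns) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < columns.length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(columns[i]);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(rowsPerClientTruncated(10, 4)); // prints 2.0
    System.out.println(rowsPerClientExact(10, 4));     // prints 2.5
    System.out.println(new RunTime(5).equals(new RunTime(5))); // prints true
    System.out.println(joinColumns(new String[] {"cf:a", "cf:b"})); // prints cf:a, cf:b
  }
}
```

Whether each finding deserves a code change or a documented suppression depends on context; for instance, a truncating division may be intentional where whole rows per client are wanted.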

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 2m 28s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 36s Maven dependency ordering for branch
+1 💚 mvninstall 3m 1s master passed
+1 💚 compile 2m 4s master passed
+1 💚 javadoc 3m 56s master passed
+1 💚 shadedjars 5m 31s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 2m 52s the patch passed
+1 💚 compile 2m 5s the patch passed
+1 💚 javac 2m 5s the patch passed
-0 ⚠️ javadoc 0m 12s /results-javadoc-javadoc-hbase-diagnostics.txt hbase-diagnostics generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-0 ⚠️ javadoc 2m 4s /results-javadoc-javadoc-root.txt root generated 2 new + 91 unchanged - 0 fixed = 93 total (was 91)
+1 💚 shadedjars 6m 7s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
-1 ❌ unit 244m 9s /patch-unit-root.txt root in the patch failed.
283m 48s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/3/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6249
Optional Tests javac javadoc unit compile shadedjars
uname Linux 75c2ec15443b 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 167c153
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/3/testReport/
Max. process+thread count 5110 (vs. ulimit of 30000)
modules C: hbase-common hbase-client hbase-balancer hbase-asyncfs hbase-server hbase-mapreduce hbase-diagnostics hbase-it hbase-compression/hbase-compression-zstd hbase-assembly . U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6249/3/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@stoty stoty left a comment

Not having to deal with Copy classes would also make the patch much smaller; half the files differ only due to those renames.

* Remove after tfile is committed and use the tfile version of this class instead.
* </p>
*/
public class RandomDistributionCopy {
Contributor

This is a bad name.

Do we need two copies of this class? Can't we just move this from hbase-compression to hbase-common with the same name?

Contributor

The same applies to the other "Copy" classes.

Contributor Author

@NihalJain NihalJain Sep 17, 2024

Hey @stoty, thanks for having a look at the PR.

Do we need two copies of this class? Can't we just move this from hbase-compression to hbase-common with the same name?

These files are not supposed to be bundled and are here only for the draft change. Please see the previous comment, where I have captured a summary of changes: #6249 (comment)

See

Created copies of the following classes only to surface all their usages, to help decide (based on references) where to move them in the final patch:
RandomDistribution
KeyProviderForTesting
LoadTestKVGenerator

Also see

Deal with copied classes which have cyclic dependencies and remove all copies and their references
Should we copy these to main of hbase-server or hbase-common?

I have created the copies to showcase how these files are being used across the project. I just want to discuss:

  • whether it is fine to move these files to a proper module outside hbase-diagnostics, to resolve the cyclic dependencies among their usages
  • or whether it is fine to create new copies of them while retaining the test copies wherever they are, which IMO is not clean code.

Contributor Author

@NihalJain NihalJain Sep 17, 2024

Can't we just move this from hbase-compression to hbase-common with the same name?

I am +1 on moving to hbase-common. I will wait some time before updating the draft with this change; let's also see what others think about this.

Contributor

Can you add a separate patch on top of this one which moves the classes to hbase-common?
You can open a separate PR for that one, so this one still shows what you wanted.

Contributor

I'd like to see how big the patch is without the copies.

Contributor Author

Can you add a separate patch on top of this one which moves the classes to hbase-common?

Sure @stoty, let me post a new PR with the suggested change.

table.close();

}
// TODO: Below test seems to be using FilterAllFilter
Contributor

Need to resolve this before commit

Contributor

stoty commented Sep 17, 2024

Sorry, I did not read your comments @NihalJain.

@NihalJain
Contributor Author

Sorry, I did not read your comments @NihalJain.

Hey @stoty, no problem at all. I have raised a new PR with the suggested changes, where I will continue the rest of the work; I will close this one soon.

@NihalJain NihalJain closed this Sep 17, 2024
3 participants