@frostruan
Contributor

This PR introduces a ScanRangeOptimizer that tries to avoid reading unnecessary data, based on the filters the user sets.

For example, if a user wants to scan rows where rowkey > 'hhh' and rowkey < 'mmm', the optimizer can set the start row to 'hhh' and the stop row to 'mmm'. Compared to the default start row and stop row, EMPTY_START_ROW and EMPTY_STOP_ROW, this helps speed up the scan request.
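
A minimal sketch of the idea, in plain Java with no HBase dependency. The class and method names (`RowRangeSketch`, `startRowFor`, `stopRowFor`, `inRange`) are hypothetical and are not the PR's ScanRangeOptimizer. The sketch assumes the start row is inclusive and the stop row is exclusive, and that the original filter is still evaluated, so the derived range only has to be a safe superset of the matching rows.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not the ScanRangeOptimizer from this PR. */
final class RowRangeSketch {
  /** Plays the role of EMPTY_START_ROW / EMPTY_STOP_ROW: an empty bound means "unbounded". */
  static final byte[] EMPTY = new byte[0];

  /** Unsigned lexicographic byte comparison, the usual ordering for row keys. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  /** Start row for "rowkey > lower". Inclusive, so the filter must still drop 'lower' itself. */
  static byte[] startRowFor(byte[] lower) {
    return lower.clone();
  }

  /** Stop row for "rowkey < upper". Exclusive, so it matches the predicate exactly. */
  static byte[] stopRowFor(byte[] upper) {
    return upper.clone();
  }

  /** True if the row falls in [startRow, stopRow), i.e. the scan would have to read it. */
  static boolean inRange(byte[] row, byte[] startRow, byte[] stopRow) {
    boolean afterStart = startRow.length == 0 || compare(row, startRow) >= 0;
    boolean beforeStop = stopRow.length == 0 || compare(row, stopRow) < 0;
    return afterStart && beforeStop;
  }

  static byte[] b(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] start = startRowFor(b("hhh"));
    byte[] stop = stopRowFor(b("mmm"));
    for (String row : new String[] {"aaa", "hhh", "hhi", "mmm", "zzz"}) {
      System.out.println(row + " read with narrowed range: " + inRange(b(row), start, stop)
        + ", read with empty bounds: " + inRange(b(row), EMPTY, EMPTY));
    }
  }
}
```

With empty bounds every row is read and then filtered; with the narrowed range, rows such as 'aaa' and 'zzz' are never read at all.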

@frostruan frostruan requested a review from Apache9 November 12, 2023 13:17
@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 2m 23s Docker mode activated.
-0 ⚠️ yetus 0m 5s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for branch
+1 💚 mvninstall 2m 30s master passed
+1 💚 compile 0m 31s master passed
+1 💚 shadedjars 5m 14s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 25s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for patch
+1 💚 mvninstall 2m 18s the patch passed
+1 💚 compile 0m 29s the patch passed
+1 💚 javac 0m 29s the patch passed
+1 💚 shadedjars 5m 10s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 24s the patch passed
_ Other Tests _
+1 💚 unit 1m 49s hbase-common in the patch passed.
-1 ❌ unit 1m 3s hbase-client in the patch failed.
24m 3s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #5514
Optional Tests javac javadoc unit shadedjars compile
uname Linux dd7ad45a760d 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / e806350
Default Java Temurin-1.8.0_352-b08
unit https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-client.txt
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/testReport/
Max. process+thread count 360 (vs. ulimit of 30000)
modules C: hbase-common hbase-client U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 34s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for branch
+1 💚 mvninstall 3m 54s master passed
+1 💚 compile 0m 47s master passed
+1 💚 shadedjars 6m 15s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 38s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for patch
+1 💚 mvninstall 3m 19s the patch passed
+1 💚 compile 0m 40s the patch passed
+1 💚 javac 0m 40s the patch passed
+1 💚 shadedjars 6m 0s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 40s the patch passed
_ Other Tests _
+1 💚 unit 2m 50s hbase-common in the patch passed.
-1 ❌ unit 1m 27s hbase-client in the patch failed.
28m 53s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #5514
Optional Tests javac javadoc unit shadedjars compile
uname Linux cb796ef6263c 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / e806350
Default Java Eclipse Adoptium-11.0.17+8
unit https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-client.txt
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/testReport/
Max. process+thread count 370 (vs. ulimit of 30000)
modules C: hbase-common hbase-client U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 18s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for branch
+1 💚 mvninstall 2m 51s master passed
+1 💚 compile 1m 13s master passed
+1 💚 checkstyle 0m 32s master passed
+1 💚 spotless 0m 43s branch has no errors when running spotless:check.
+1 💚 spotbugs 1m 16s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for patch
+1 💚 mvninstall 2m 39s the patch passed
+1 💚 compile 1m 12s the patch passed
+1 💚 javac 1m 12s the patch passed
-0 ⚠️ checkstyle 0m 16s hbase-client: The patch generated 3 new + 4 unchanged - 0 fixed = 7 total (was 4)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 9m 16s Patch does not cause any errors with Hadoop 3.2.4 3.3.6.
-1 ❌ spotless 0m 18s patch has 66 errors when running spotless:check, run spotless:apply to fix.
+1 💚 spotbugs 1m 31s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 19s The patch does not generate ASF License warnings.
29m 46s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #5514
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile
uname Linux c61adbdc9a94 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / e806350
Default Java Eclipse Adoptium-11.0.17+8
checkstyle https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt
spotless https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/artifact/yetus-general-check/output/patch-spotless.txt
Max. process+thread count 78 (vs. ulimit of 30000)
modules C: hbase-common hbase-client U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5514/1/console
versions git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache9
Contributor

Apache9 commented Nov 13, 2023

Isn't there already a setStartStopRowForPrefixScan method on Scan? I think it serves exactly the same purpose...

@frostruan
Contributor Author

frostruan commented Nov 13, 2023

Thanks for reviewing, Duo. Yes, the setStartStopRowForPrefixScan method works for prefix filtering, but it does not work for range filtering. Maybe the title was misleading. What I want to introduce here is similar to the query optimizer subsystem in an RDBMS: it optimizes the scan range based on the filters the user sets. For example, if a user wants to scan rows where rowkey > 'hhh' and rowkey < 'mmm', the optimizer can set the start row to 'hhh' and the stop row to 'mmm'. Compared to the default start row and stop row, EMPTY_START_ROW and EMPTY_STOP_ROW, this helps speed up the scan request.
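
To illustrate the difference, here is a rough sketch of how a prefix can be turned into a start/stop pair. `PrefixVsRangeSketch` and `stopRowForPrefix` are hypothetical names, and the increment-the-last-byte technique is an assumption about how prefix scans are commonly bounded, not a copy of the HBase implementation. A prefix always yields a range whose two ends share that prefix, so it cannot express an arbitrary range such as 'hhh' to 'mmm'.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not the HBase implementation. */
final class PrefixVsRangeSketch {
  /**
   * Exclusive stop row covering every key that starts with the prefix.
   * An empty result means "no upper bound".
   */
  static byte[] stopRowForPrefix(byte[] prefix) {
    int end = prefix.length;
    // Trailing 0xFF bytes cannot be incremented, so drop them first.
    while (end > 0 && prefix[end - 1] == (byte) 0xff) {
      end--;
    }
    if (end == 0) {
      return new byte[0];
    }
    byte[] stop = Arrays.copyOf(prefix, end);
    stop[end - 1]++; // smallest key that no longer starts with the prefix
    return stop;
  }

  static String show(byte[] row) {
    return new String(row, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] prefix = "hhh".getBytes(StandardCharsets.UTF_8);
    // A prefix scan for "hhh" covers only keys beginning with "hhh".
    System.out.println("prefix scan: [" + show(prefix) + ", " + show(stopRowForPrefix(prefix)) + ")");
    // The range filter in the example needs a much wider, unrelated upper bound.
    System.out.println("range scan:  [hhh, mmm)");
  }
}
```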

@Apache9
Contributor

Apache9 commented Nov 25, 2023

Then let's change the title and post a simple design doc to discuss first? I think introducing a new mechanism is fine, but we need to discuss it first. At the very least, changing the Scan object that is passed in may break our users' code...
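
One way to avoid that concern, sketched under assumptions: the optimizer could narrow a copy and leave the caller's object untouched. `ScanSpec` below is a hypothetical stand-in for a scan request, not HBase's Scan class, and `optimize` is an illustrative name rather than anything proposed in this PR.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; ScanSpec is not HBase's Scan. */
final class NonMutatingOptimizerSketch {
  /** Minimal stand-in for a scan request: just a start row and a stop row. */
  static final class ScanSpec {
    byte[] startRow = new byte[0];
    byte[] stopRow = new byte[0];

    ScanSpec() {
    }

    /** Copy constructor: deep-copies the bounds so the two objects share no state. */
    ScanSpec(ScanSpec other) {
      this.startRow = other.startRow.clone();
      this.stopRow = other.stopRow.clone();
    }
  }

  /** Returns a narrowed copy; the caller's ScanSpec is never modified. */
  static ScanSpec optimize(ScanSpec original, byte[] lower, byte[] upper) {
    ScanSpec narrowed = new ScanSpec(original);
    narrowed.startRow = lower.clone();
    narrowed.stopRow = upper.clone();
    return narrowed;
  }

  public static void main(String[] args) {
    ScanSpec userScan = new ScanSpec();
    ScanSpec optimized = optimize(userScan,
      "hhh".getBytes(StandardCharsets.UTF_8), "mmm".getBytes(StandardCharsets.UTF_8));
    System.out.println("user start row length:      " + userScan.startRow.length);
    System.out.println("optimized start row length: " + optimized.startRow.length);
  }
}
```

Code that reuses or inspects its own scan object after submitting it would then still see the bounds it set.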

@frostruan
Contributor Author

OK. Thanks for your advice, Duo. Let me prepare the design doc first.

3 participants