* Minor improvements to the README of hbck2
* Fix typo and reflow so each sentence is on its own line
* Fix typo and maybe fix whitespace issues as well
* Fix a couple more typos
File: hbase-hbck2/README.md (+38 -45: 38 additions & 45 deletions)
Original file line number
Diff line number
Diff line change
@@ -22,17 +22,15 @@ _HBCK2_ is the repair tool for Apache HBase clusters.
22
22
23
23
Problems in operation are bugs. The need for an _HBCK2_ fix
24
24
is meant as workaround until the bug is fixed and deployed
25
-
in a new hbase version.
25
+
in a new HBase version.
26
26
27
27
## _HBCK2_ vs _hbck1_
28
-
HBCK2 is the successor to [hbck](https://hbase.apache.org/book.html#hbck.in.depth),
29
-
the repair tool that shipped with _hbase-1.x_ (A.K.A _hbck1_). Use _HBCK2_ in place of
30
-
_hbck1_ making repairs against hbase-2.x clusters. _hbck1_ should not be run against an
31
-
hbase-2.x install. It may do damage. While _hbck1_ is still bundled inside hbase-2.x
32
-
-- to minimize surprise -- it is deprecated, to be removed in _hbase-3.x_. Its
33
-
write-facility (`-fix`) has been removed. It can report on the state of an hbase-2.x
34
-
cluster but its assessments will be inaccurate since it does not understand the internal
35
-
workings of an hbase-2.x.
28
+
HBCK2 is the successor to [hbck](https://hbase.apache.org/book.html#hbck.in.depth), the repair tool that shipped with _HBase 1.x_ (A.K.A _hbck1_).
29
+
Use _HBCK2_ in place of _hbck1_ when making repairs against HBase 2.x clusters.
30
+
_hbck1_ should not be run against an HBase 2.x installation as it may do damage.
31
+
While _hbck1_ is still included in HBase 2.x to avoid surprises, it is now deprecated and will be removed in _HBase 3.x_.
32
+
The write-facility (`-fix`) of _hbck1_ has been removed.
33
+
It can report on the state of an HBase 2.x cluster, but its assessments will be inaccurate since it does not understand all the internal workings of HBase 2.x.
36
34
37
35
_HBCK2_ does not work the way _hbck1_ used to, even for the case where commands are
38
36
similarly named across the two versions. See the next section for how the tools
@@ -60,13 +58,13 @@ Run:
60
58
```
61
59
$ mvn install
62
60
```
63
-
The built _HBCK2_ jar will be in the `target`sub-directory.
61
+
The built _HBCK2_ jar will be in the `target` subdirectory.
64
62
65
63
## Running _HBCK2_
66
64
The _HBCK2_ jar does not include dependencies; it is not built as a 'fat' jar.
67
65
Dependencies must be `provided`. Building, adjusting the target hbase version in the
68
-
top-level pom to match your deploy will make for the smoothest operation when run
69
-
against your deploy (See the parent pom.xml `hbase-operator-tools` for the
66
+
top-level pom to match your deployment will make for the smoothest operation when run
67
+
against your deployment (See the parent pom.xml `hbase-operator-tools` for the
70
68
[hbase.version to set](https://github.com/apache/hbase-operator-tools/blob/master/pom.xml#L126)).
71
69
72
70
Where runtime interaction between _HBCK2_ and running cluster can get interesting is
@@ -77,15 +75,14 @@ it should fail gracefully. Use an older release or upgrade your cluster (if you
77
75
The easiest means of 'providing' _HBCK2_ its dependencies is by launching
78
76
_HBCK2_ via the `$HBASE_HOME/bin/hbase` script. The `bin/hbase` script natively
79
77
makes mention of `hbck` -- there is a `hbck` option listed in the help output.
80
-
By default, running `bin/hbase hbck`, the built-in _hbck1_tooling will be run.
78
+
By default, running `bin/hbase hbck` will run the built-in _hbck1_ tool.
81
79
To run _HBCK2_, you need to point at a built _HBCK2_ jar using the `-j` option
`/etc/hbase-conf` is where the deployment's configuration lives.
85
+
The _HBCK2_ jar is at `~/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar`.
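The launch described here can be sketched as a small dry-run helper. This is a minimal sketch, assuming `bin/hbase` accepts `--config <dir>` ahead of the `hbck` subcommand. The `hbck2_command` function name is hypothetical, introduced only for illustration, and it prints the command line rather than running it.

```shell
# Hypothetical helper (not part of HBCK2): prints the launch command
# so it can be reviewed before it is run against a cluster.
# Arguments: <hbase home> <config dir> <HBCK2 jar> [HBCK2 options/command...]
hbck2_command() {
  hbase_home=$1
  conf_dir=$2
  jar=$3
  shift 3
  # -j points bin/hbase at the built HBCK2 jar instead of the bundled hbck1.
  cmd="${hbase_home}/bin/hbase --config ${conf_dir} hbck -j ${jar}"
  # Append any HBCK2 options or command passed by the caller.
  if [ "$#" -gt 0 ]; then
    cmd="${cmd} $*"
  fi
  printf '%s\n' "$cmd"
}

# Example using the paths quoted in this README; HBASE_HOME is assumed to be set.
hbck2_command "${HBASE_HOME:-HBASE_HOME_NOT_SET}" /etc/hbase-conf \
  "$HOME/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar"
```

Printing the command first makes it easy to confirm which jar and configuration directory will be used before touching a live cluster.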
89
86
The above command with no options or arguments passed will dump out the _HBCK2_ help:
90
87
```
91
88
usage: HBCK2 [OPTIONS] COMMAND <ARGS>
@@ -445,13 +442,13 @@ Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
445
442
```
446
443
... it is because the HDFS jars are not on the CLASSPATH. The default is NOT
447
444
to bundle HDFS jars on the CLASSPATH when running `hbck` via `bin/hbase`. Define
448
-
`HADOOP_HOME` in the environment so `bin/hbase` can find your local hadoop
449
-
install and then it will load its HDFS jars.
445
+
`HADOOP_HOME` in the environment so `bin/hbase` can find your local Hadoop
446
+
installation, and then it will load its HDFS jars.
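A pre-flight check along these lines may catch the missing-variable case before `bin/hbase` is launched. This is only a sketch. The `check_hadoop_home` function is a hypothetical helper, and the Hadoop path in the usage comment is a placeholder, not a location taken from this README.

```shell
# Hypothetical pre-flight check (illustration only): verifies that
# HADOOP_HOME is defined and points at an existing directory.
check_hadoop_home() {
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "${HADOOP_HOME}" ]; then
    echo "HADOOP_HOME=${HADOOP_HOME} is not a directory" >&2
    return 1
  fi
  echo "HADOOP_HOME=${HADOOP_HOME}"
}

# Usage (placeholder path, adjust for your installation):
#   export HADOOP_HOME=/path/to/hadoop
#   check_hadoop_home && "$HBASE_HOME/bin/hbase" hbck -j /path/to/hbase-hbck2.jar
```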
450
447
451
448
## _HBCK2_ Overview
452
449
_HBCK2_ is currently a simple tool that does one thing at a time only.
453
450
454
-
In hbase-2.x, the Master is the final arbiter of all state, so a general principal for most
451
+
In hbase-2.x, the Master is the final arbiter of all state, so a general principle for most
455
452
_HBCK2_ commands is that it asks the Master to effect all repair. This means a Master must be
456
453
up before you can run _HBCK2_ commands.
457
454
@@ -498,7 +495,7 @@ its _pid_ but also its _ppid_; its parent's _pid_.
498
495
499
496
Generally all run problem free but if some unforeseen circumstance
500
497
arises, the assignment framework may sustain damage requiring
501
-
operator intervention. Below we will discuss some such scenarios
498
+
operator intervention. Below we will discuss some such scenarios,
502
499
but they can manifest in the Master log as a Region being _STUCK_ or
503
500
a Procedure transitioning an entity -- a Region or a Table --
504
501
may be blocked because another Procedure holds the exclusive lock
@@ -531,10 +528,8 @@ Procedures and Locks as well as the current set of Master Procedure WALs
531
528
directory in your hbase install). On startup, on a large
532
529
cluster when furious assigning is afoot, this page is
533
530
filled with lists of Procedures and Locks. The count of
534
-
MasterProcWALs will bloat too. If after the cluster settles,
535
-
there is a stuck Lock or Procedure or the count of WALs
536
-
doesn't ever come down but only grows, then operator intervention
537
-
is needed to alieve the blockage.
531
+
MasterProcWALs will bloat too.
532
+
If after the cluster settles, there is a stuck Lock or Procedure or the count of WALs doesn't ever come down but only grows, then operator intervention is needed to remove the blockage.
538
533
539
534
Lists of locks and procedures can also be obtained via the hbase shell:
An `HBCK Report` page was added to the Master in versions hbase 2.3.0/2.1.6/2.2.1
548
-
at `/hbck.jsp`
549
-
which shows output from two inspections run by the master on an interval; one
550
-
is output by the CatalogJanitor whenever it runs. If overlaps or holes in
551
-
`hbase:meta`, the CatalogJanitor half of the page will list what it has found
552
-
(otherwise it is quiet). Another background 'chore' process was added to compare
553
-
`hbase:meta` and filesystem content making compare; if anomaly, it will make
554
-
note in its `HBCK Report` section.
542
+
Starting with HBase 2.3.0/2.1.6/2.2.1, the Master UI includes an `HBCK Report` page located at `/hbck.jsp`.
543
+
This page displays the output from two inspections run by the Master at regular intervals.
555
544
556
-
See the 'HBCK Report' page itself for how to force runs of the inspectors.
545
+
1. The first is performed by the `CatalogJanitor` and reports any overlaps in regions or holes in `hbase:meta`.
546
+
2. The second inspection is a background 'chore' process that compares `hbase:meta` and filesystem content, and makes a note of any anomalies in the HBCK Report section.
547
+
548
+
If you want to force a run of these inspectors, refer to the HBCK Report page for instructions.
549
+
550
+
Look at the `fixMeta` command to fix overlaps and holes found by these inspections.
557
551
558
552
559
553
#### The [HBase Canary Tool](http://hbase.apache.org/book.html#_canary)
As it operates independently from Master, once it finishes successfully, additional steps are
727
+
As it operates independently of Master, once it finishes successfully, additional steps are
735
728
required to actually have the re-added regions assigned. These are listed below:
736
729
737
730
1. _addFsRegionsMissingInMeta_ outputs an _assigns_ command with all regions that got re-added. This
@@ -764,14 +757,14 @@ Start the cluster up. It won’t come up fully. It will be stuck because the _na
764
757
2019-07-10 18:30:51,090 WARN [master/localhost:16000:becomeActiveMaster] master.HMaster: hbase:namespace,,1562808216225.725a0fe6c2c869d3d0a9ed82bfa80fa3. is NOT online; state={725a0fe6c2c869d3d0a9ed82bfa80fa3 state=CLOSED, ts=1562808619952, server=null}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
765
758
```
766
759
767
-
To assign the namespace table region, you cannot use the shell. If you use the shell, it will fail with a `PleaseHoldException` because the master is not yet up (it is waiting for the namepace table to come online before it declares itself ‘up’). You have to use the `HBCK2`_assigns_ command. To assign, you will need the namespace encoded name. It shows in the log quoted above: i.e. _725a0fe6c2c869d3d0a9ed82bfa80fa3_ in this case. You will also have to pass the -skip command to ‘skip’ the master version check (without it, your `HBCK2` invocation will also elicit the above `PleaseHoldException` because the master is not yet up). Here is an example adding an assign of the namespace table:
760
+
To assign the namespace table region, you cannot use the shell. If you use the shell, it will fail with a `PleaseHoldException` because the master is not yet up (it is waiting for the namespace table to come online before it declares itself ‘up’). You have to use the `HBCK2` _assigns_ command. To assign, you will need the encoded name of the namespace region. It shows in the log quoted above: _725a0fe6c2c869d3d0a9ed82bfa80fa3_ in this case. You will also have to pass the `-skip` option to ‘skip’ the master version check (without it, your `HBCK2` invocation will also elicit the above `PleaseHoldException` because the master is not yet up). Here is an example adding an assign of the namespace table:
If the invocation comes back with ‘Connection refused’, is the Master up? The Master will shut down after a while if it can’t initialize itself. Just restart the cluster/master and rerun the above assigns command.
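Since a mistyped encoded region name is one of the failure causes mentioned in this section, a quick format check before rerunning may help. This is a minimal sketch, assuming encoded names of ordinary regions are 32-character lowercase hexadecimal strings like the one in the log line quoted above. The `is_encoded_region_name` function is a hypothetical helper and not part of HBCK2.

```shell
# Hypothetical sanity check (illustration only): succeeds when the argument
# looks like an encoded region name, i.e. exactly 32 lowercase hex characters.
is_encoded_region_name() {
  case "$1" in
    # Reject the empty string and anything containing a non-hex character.
    ""|*[!0-9a-f]*) return 1 ;;
  esac
  [ "${#1}" -eq 32 ]
}

# Example with the namespace region from the log line above.
if is_encoded_region_name 725a0fe6c2c869d3d0a9ed82bfa80fa3; then
  echo "encoded region name looks well-formed"
fi
```

The check only validates the shape of the name. It cannot tell whether the region actually exists in the cluster.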
773
766
774
-
When the assigns runs successfully, you’ll see it emit the likes of the following. The ‘48’ on the end is the pid of the assign procedure schedule. If the pid returned is ‘-1’, then the master startup has not progressed sufficently… retry. Or, the encoded regionname is incorrect. Check.
767
+
When the assigns command runs successfully, you’ll see it emit the likes of the following. The ‘48’ on the end is the pid of the scheduled assign procedure. If the pid returned is ‘-1’, then either the master startup has not progressed sufficiently (retry) or the encoded region name is incorrect (check it).
18:40:43.817 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
@@ -785,14 +778,14 @@ master.HMaster: Master has completed initialization 132.515sec
785
778
```
786
779
It might take a while to appear.
787
780
788
-
The rebuild of _hbase:meta_ adds the user tables in _DISABLED_ state and the regions in _CLOSED_ mode. Reenable tables via the shell to bring all table regions back online.
781
+
The rebuild of _hbase:meta_ adds the user tables in _DISABLED_ state and the regions in _CLOSED_ mode. Re-enable tables via the shell to bring all table regions back online.
789
782
Do it one-at-a-time or see the `enable_all ".*"` command to enable all tables in one shot.
790
783
791
784
The rebuild meta will likely be missing edits and may need subsequent repair and cleaning using facility outlined higher up in this README.
792
785
793
786
### Dropped reference files, missing hbase.version file, and corrupted hfiles
794
787
795
-
_HBCK2_ can check for hanging references and corrupt hfiles. You can ask it to sideline bad files which may be needed to get over humps where regions won't online or reads are failing. See the _filesystem_ command in the _HBCK2_ listing. Pass one or more tablename (or 'none' to check all tables). It will report bad files. Pass the _--fix_ option to effect repairs.
788
+
_HBCK2_ can check for hanging references and corrupt HFiles. You can ask it to sideline bad files, which may be needed to get over humps where regions won't come online or reads are failing. See the _filesystem_ command in the _HBCK2_ listing. Pass one or more table names (or 'none' to check all tables). It will report bad files. Pass the _--fix_ option to effect repairs.