Merged
@@ -62,6 +62,7 @@
import org.apache.hadoop.hbase.regionserver.HStore;
import org.apache.hadoop.hbase.regionserver.HStoreFile;
import org.apache.hadoop.hbase.regionserver.RegionSplitPolicy;
import org.apache.hadoop.hbase.regionserver.RegionSplitRestriction;
import org.apache.hadoop.hbase.regionserver.StoreFileInfo;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.CommonFSUtils;
@@ -110,6 +111,21 @@ public SplitTableRegionProcedure(final MasterProcedureEnv env,
// we fail-fast on construction. There it skips the split with just a warning.
checkOnline(env, regionToSplit);
this.bestSplitRow = splitRow;
TableDescriptor tableDescriptor = env.getMasterServices().getTableDescriptors()
.get(getTableName());
Configuration conf = env.getMasterConfiguration();
if (hasBestSplitRow()) {
Contributor:

We did not have this logic in the past? What has changed so that now we need to apply this restriction in SplitTableRegionProcedure? IIRC the logic is done on the region server side?

Member Author:

Yes, we didn't have this logic in the past. I think we can apply the restriction to a user-specified split point because without this logic, we can easily break the restriction by splitting with a specified split point. And since the user-specified split point is passed to the Master side, we need to do it on the Master side.

What do you think? @Apache9

Contributor:

Finally we should get the actual split point back from the region server? No? Then this should be a bug in the current code base?

Member Author:

> Finally we should get the actual split point back from region server? No?

No, I don't think so.

Let's say we have a table that has a key prefix restriction where the prefix length is 2 bytes.
When a user runs the split command specifying a split point abc in the hbase shell, this will break the key prefix restriction if we split the region at abc. So I think we can apply the restriction to the user-specified split point; the restriction-applied split point will be ab, which won't break the restriction.
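The truncation described here can be sketched as a tiny standalone example (a hypothetical class for illustration, not the actual HBase code; the real logic lives in KeyPrefixRegionSplitRestriction.getRestrictedSplitPoint):

```java
import java.util.Arrays;

// Hypothetical standalone sketch of a prefix-length split restriction.
public class KeyPrefixRestrictionSketch {
    // Keep only the first prefixLength bytes of the proposed split point,
    // so a region is never split "inside" a key prefix.
    public static byte[] restrict(byte[] splitPoint, int prefixLength) {
        if (prefixLength > 0) {
            return Arrays.copyOf(splitPoint, Math.min(prefixLength, splitPoint.length));
        }
        return splitPoint;
    }

    public static void main(String[] args) {
        // With a 2-byte prefix restriction, the user-specified point "abc"
        // is truncated to "ab".
        System.out.println(new String(restrict("abc".getBytes(), 2))); // prints "ab"
    }
}
```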

Member Author:

@Apache9 What about this? Thanks.

Contributor:

Good point. So users may be 'surprised' if we do not split where they want? Will there be a message anywhere saying that their choice has been over-ruled by the restriction? Or will it be obvious that the restriction over-ruled it?

I'm good w/ the restriction over-ruling the user as long as there is a log to this effect (add the 'behavior change' to the existing nice release note @brfrn169)

Member Author:

I think we can add a WARN message in the Master log when the user-specified split point is over-ruled by the restriction. I will do that. And I will add the 'behavior change' to the release note. Thanks.

Contributor:

So here we will only use SplitRestriction to fix the split row? Then what if a user uses the deprecated KeyPrefixSplitPolicy? We will not fix the split row if it breaks the rule?

Member Author:

Yes, that's the behavior of KeyPrefixSplitPolicy, and we will not fix the split row even if it breaks the rule. And it may not be easy to fix, because RegionSplitPolicy doesn't have any method to restrict/convert a user-specified split point. It has only byte[] getSplitPoint(), which returns an appropriate split point calculated based on its policy.

Contributor:

OK, so in fact we are not changing the behavior? If you use the old KeyPrefixSplitPolicy, nothing is changed. If you use the new SplitRestriction, then you will find out that you are not allowed to break the restriction when proposing a split point. Could mention this in the release note.

// Apply the split restriction for the table to the user-specified split point
RegionSplitRestriction splitRestriction =
RegionSplitRestriction.create(tableDescriptor, conf);
byte[] restrictedSplitRow = splitRestriction.getRestrictedSplitPoint(bestSplitRow);
if (!Bytes.equals(bestSplitRow, restrictedSplitRow)) {
LOG.warn("The specified split point {} violates the split restriction of the table. "
+ "Using {} as a split point.", Bytes.toStringBinary(bestSplitRow),
Bytes.toStringBinary(restrictedSplitRow));
bestSplitRow = restrictedSplitRow;
}
}
checkSplittable(env, regionToSplit);
final TableName table = regionToSplit.getTable();
final long rid = getDaughterRegionIdTimestamp(regionToSplit);
@@ -125,15 +141,14 @@ public SplitTableRegionProcedure(final MasterProcedureEnv env,
.setSplit(false)
.setRegionId(rid)
.build();
-    TableDescriptor htd = env.getMasterServices().getTableDescriptors().get(getTableName());
-    if(htd.getRegionSplitPolicyClassName() != null) {
+    if(tableDescriptor.getRegionSplitPolicyClassName() != null) {
// Since we don't have region reference here, creating the split policy instance without it.
// This can be used to invoke methods which don't require Region reference. This instantiation
// of a class on Master-side though it only makes sense on the RegionServer-side is
// for Phoenix Local Indexing. Refer HBASE-12583 for more information.
Class<? extends RegionSplitPolicy> clazz =
-        RegionSplitPolicy.getSplitPolicyClass(htd, env.getMasterConfiguration());
-      this.splitPolicy = ReflectionUtils.newInstance(clazz, env.getMasterConfiguration());
+        RegionSplitPolicy.getSplitPolicyClass(tableDescriptor, conf);
+      this.splitPolicy = ReflectionUtils.newInstance(clazz, conf);
}
}

@@ -219,7 +234,7 @@ private void checkSplittable(final MasterProcedureEnv env,
throw e;
}

-    if (bestSplitRow == null || bestSplitRow.length == 0) {
+    if (!hasBestSplitRow()) {
throw new DoNotRetryIOException("Region not splittable because bestSplitPoint = null, " +
"maybe table is too small for auto split. For force split, try specifying split row");
}
@@ -37,7 +37,11 @@
 * <code>userid_eventtype_eventid</code>, and use prefix delimiter _, this split policy
 * ensures that all rows starting with the same userid belong to the same region.
* @see KeyPrefixRegionSplitPolicy
*
* @deprecated since 3.0.0 and will be removed in 4.0.0. Use {@link RegionSplitRestriction},
* instead.
*/
@Deprecated
@InterfaceAudience.Private
public class DelimitedKeyPrefixRegionSplitPolicy extends IncreasingToUpperBoundRegionSplitPolicy {

Expand Down
@@ -0,0 +1,84 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.hbase.regionserver;

import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.yetus.audience.InterfaceAudience;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* A {@link RegionSplitRestriction} implementation that groups rows by a prefix of the row-key with
* a delimiter. Only the first delimiter for the row key will define the prefix of the row key that
* is used for grouping.
* <p>
* This ensures that a region is not split "inside" a prefix of a row key.
* I.e. rows can be co-located in a region by their prefix.
*
 * As an example, if you have row keys delimited with <code>_</code>, like
 * <code>userid_eventtype_eventid</code>, and use prefix delimiter _, this split restriction
 * ensures that all rows starting with the same userid belong to the same region.
*/
@InterfaceAudience.Private
public class DelimitedKeyPrefixRegionSplitRestriction extends RegionSplitRestriction {
private static final Logger LOG =
LoggerFactory.getLogger(DelimitedKeyPrefixRegionSplitRestriction.class);

public static final String DELIMITER_KEY =
"hbase.regionserver.region.split_restriction.delimiter";

private byte[] delimiter = null;

@Override
public void initialize(TableDescriptor tableDescriptor, Configuration conf) throws IOException {
String delimiterString = tableDescriptor.getValue(DELIMITER_KEY);
if (delimiterString == null || delimiterString.length() == 0) {
delimiterString = conf.get(DELIMITER_KEY);
if (delimiterString == null || delimiterString.length() == 0) {
LOG.error("{} not specified for table {}. "
+ "Using the default RegionSplitRestriction", DELIMITER_KEY,
tableDescriptor.getTableName());
return;
}
}
delimiter = Bytes.toBytes(delimiterString);
}

@Override
public byte[] getRestrictedSplitPoint(byte[] splitPoint) {
if (delimiter != null) {
// find the first occurrence of delimiter in split point
int index = org.apache.hbase.thirdparty.com.google.common.primitives.Bytes.indexOf(
splitPoint, delimiter);
if (index < 0) {
LOG.warn("Delimiter {} not found for split key {}", Bytes.toString(delimiter),
Bytes.toStringBinary(splitPoint));
return splitPoint;
}

// group split keys by a prefix
return Arrays.copyOf(splitPoint, Math.min(index, splitPoint.length));
} else {
return splitPoint;
}
}
}
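As a rough illustration of the delimiter behavior above, this self-contained sketch (a hypothetical class, not the HBase implementation, which uses Guava's Bytes.indexOf) truncates a split point at the first occurrence of the delimiter and falls back to the unmodified split point when the delimiter is absent:

```java
import java.util.Arrays;

// Hypothetical sketch of delimiter-based split-point truncation.
public class DelimitedPrefixSketch {
    // Find the index of the first occurrence of delimiter in key; -1 if absent.
    static int indexOf(byte[] key, byte[] delimiter) {
        outer:
        for (int i = 0; i <= key.length - delimiter.length; i++) {
            for (int j = 0; j < delimiter.length; j++) {
                if (key[i + j] != delimiter[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Mirrors getRestrictedSplitPoint: cut the split point at the first
    // delimiter so the region boundary never falls inside a prefix group.
    public static byte[] restrict(byte[] splitPoint, byte[] delimiter) {
        int index = indexOf(splitPoint, delimiter);
        if (index < 0) {
            return splitPoint;
        }
        return Arrays.copyOf(splitPoint, index);
    }

    public static void main(String[] args) {
        byte[] r = restrict("userid_eventtype_eventid".getBytes(), "_".getBytes());
        System.out.println(new String(r)); // prints "userid"
    }
}
```

So a proposed split point of `userid_eventtype_eventid` is restricted to `userid`, keeping all rows with that userid prefix in one region.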
@@ -695,6 +695,7 @@ void sawNoSuchFamily() {

private TableDescriptor htableDescriptor = null;
private RegionSplitPolicy splitPolicy;
private RegionSplitRestriction splitRestriction;
private FlushPolicy flushPolicy;

private final MetricsRegion metricsRegion;
@@ -1037,6 +1038,9 @@ private long initializeRegionInternals(final CancelableProgressable reporter,
// Initialize split policy
this.splitPolicy = RegionSplitPolicy.create(this, conf);

// Initialize split restriction
splitRestriction = RegionSplitRestriction.create(getTableDescriptor(), conf);

// Initialize flush policy
this.flushPolicy = FlushPolicyFactory.create(this, conf);

@@ -7870,6 +7874,9 @@ public Optional<byte[]> checkSplit(boolean force) {
}

byte[] ret = splitPolicy.getSplitPoint();
if (ret != null && ret.length > 0) {
ret = splitRestriction.getRestrictedSplitPoint(ret);
}

if (ret != null) {
try {
@@ -29,7 +29,11 @@
*
* This ensures that a region is not split "inside" a prefix of a row key.
* I.e. rows can be co-located in a region by their prefix.
*
* @deprecated since 3.0.0 and will be removed in 4.0.0. Use {@link RegionSplitRestriction},
* instead.
*/
@Deprecated
@InterfaceAudience.Private
public class KeyPrefixRegionSplitPolicy extends IncreasingToUpperBoundRegionSplitPolicy {
private static final Logger LOG = LoggerFactory
@@ -0,0 +1,76 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.hbase.regionserver;

import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.yetus.audience.InterfaceAudience;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* A {@link RegionSplitRestriction} implementation that groups rows by a prefix of the row-key.
* <p>
* This ensures that a region is not split "inside" a prefix of a row key.
* I.e. rows can be co-located in a region by their prefix.
*/
@InterfaceAudience.Private
public class KeyPrefixRegionSplitRestriction extends RegionSplitRestriction {
private static final Logger LOG =
LoggerFactory.getLogger(KeyPrefixRegionSplitRestriction.class);

public static final String PREFIX_LENGTH_KEY =
"hbase.regionserver.region.split_restriction.prefix_length";

private int prefixLength;

@Override
public void initialize(TableDescriptor tableDescriptor, Configuration conf) throws IOException {
String prefixLengthString = tableDescriptor.getValue(PREFIX_LENGTH_KEY);
if (prefixLengthString == null) {
prefixLengthString = conf.get(PREFIX_LENGTH_KEY);
if (prefixLengthString == null) {
LOG.error("{} not specified for table {}. "
+ "Using the default RegionSplitRestriction", PREFIX_LENGTH_KEY,
tableDescriptor.getTableName());
return;
}
}
try {
prefixLength = Integer.parseInt(prefixLengthString);
} catch (NumberFormatException ignored) {
}
if (prefixLength <= 0) {
LOG.error("Invalid value for {} for table {}:{}. "
+ "Using the default RegionSplitRestriction", PREFIX_LENGTH_KEY,
tableDescriptor.getTableName(), prefixLengthString);
}
}

@Override
public byte[] getRestrictedSplitPoint(byte[] splitPoint) {
if (prefixLength > 0) {
// group split keys by a prefix
return Arrays.copyOf(splitPoint, Math.min(prefixLength, splitPoint.length));
} else {
return splitPoint;
}
}
}
@@ -0,0 +1,40 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.hbase.regionserver;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.yetus.audience.InterfaceAudience;

/**
* A {@link RegionSplitRestriction} implementation that does nothing.
*/
@InterfaceAudience.Private
public class NoRegionSplitRestriction extends RegionSplitRestriction {

@Override
public void initialize(TableDescriptor tableDescriptor, Configuration conf) throws IOException {
}

@Override
public byte[] getRestrictedSplitPoint(byte[] splitPoint) {
// Do nothing
return splitPoint;
}
}
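The patch selects one of these three implementations per table. As a hedged sketch of that dispatch (a hypothetical stand-in using a plain Map instead of Hadoop's Configuration; the real factory is RegionSplitRestriction.create, and the type key name shown is assumed from the `hbase.regionserver.region.split_restriction.*` naming pattern used elsewhere in this patch):

```java
import java.util.Map;

// Hypothetical sketch of how a split restriction could be chosen per table.
public class RestrictionFactorySketch {
    // Assumed configuration key, following the split_restriction.* pattern.
    static final String TYPE_KEY = "hbase.regionserver.region.split_restriction.type";

    // Return the name of the restriction implementation to instantiate,
    // defaulting to the no-op NoRegionSplitRestriction.
    public static String chooseRestriction(Map<String, String> conf) {
        String type = conf.getOrDefault(TYPE_KEY, "None");
        switch (type) {
            case "KeyPrefix":
                return "KeyPrefixRegionSplitRestriction";
            case "DelimitedKeyPrefix":
                return "DelimitedKeyPrefixRegionSplitRestriction";
            default:
                return "NoRegionSplitRestriction";
        }
    }
}
```

With no type configured, the no-op restriction is used, which matches the unchanged behavior discussed in the review thread above.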