Build indicesAccessControl for selective indices #78327

ywangd · 2021-09-27T14:41:02Z

During index authorization, IndicesAccessControl is built for all
original indices. That list can be long on a large cluster or for
requests with liberal wildcards, so computing IndicesAccessControl for
every entry can be expensive.

Many shard/index-level requests, however, target only a single concrete
index and therefore need IndicesAccessControl for that index alone. One
twist is that the index may have been requested through an alias or
authorized as part of a data stream, in which case the relevant aliases
and data stream must also be considered. Even so, the concrete index
plus its aliases and data stream can be a much smaller list than the
original indices.

This PR adds an allowlist of request types from which we know how to
extract the single concrete target index, and computes
IndicesAccessControl only for that index and its relevant aliases and
data stream.
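
As a rough illustration of the idea (not the PR's actual code), the sketch below narrows a set of requested names to those that can resolve to a single target index. The `resolvesTo` map is a hypothetical stand-in for the cluster's alias and data stream lookup, and all class and method names are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: plain collections stand in for Elasticsearch's metadata lookup.
final class EffectiveIndicesSketch {

    /**
     * Keeps only the requested names that are the target index itself, or an
     * alias / data stream whose backing indices include the target index.
     *
     * @param requested  names from the original request (after wildcard expansion)
     * @param target     the single concrete index the shard-level request hits
     * @param resolvesTo assumed lookup: alias or data stream name -> concrete indices
     */
    static Set<String> effectiveIndices(Set<String> requested, String target, Map<String, List<String>> resolvesTo) {
        Set<String> effective = new HashSet<>();
        for (String name : requested) {
            if (name.equals(target)) {
                // The concrete index was requested by its own name.
                effective.add(name);
            } else if (resolvesTo.getOrDefault(name, List.of()).contains(target)) {
                // An alias or data stream that covers the target index.
                effective.add(name);
            }
        }
        return effective;
    }

    public static void main(String[] args) {
        Map<String, List<String>> lookup = Map.of(
            "logs", List.of("logs-000001", "logs-000002"),
            "metrics", List.of("metrics-000001")
        );
        Set<String> requested = Set.of("logs", "metrics", "logs-000001", "logs-000002", "metrics-000001");
        // Only "logs" and "logs-000001" are relevant to a shard of logs-000001.
        System.out.println(effectiveIndices(requested, "logs-000001", lookup));
    }
}
```

With five requested names, only two remain relevant to the shard being authorized, which is the saving the description is after.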

elasticmachine · 2021-09-27T14:41:06Z

Pinging @elastic/es-security (Team:Security)

ywangd · 2021-09-28T01:05:19Z

x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/RBACEngine.java

+    // An allowlist of requests that have only a single concrete target index regardless of the original indices
+    private String getSingleTargetIndex(TransportRequest request) {
+        if (request instanceof ShardSearchRequest) {
+            return ((ShardSearchRequest) request).shardId().getIndexName();
+        } else if (request instanceof FieldCapabilitiesIndexRequest) {
+            return ((FieldCapabilitiesIndexRequest) request).index();
+        } else if (request instanceof ShardFetchSearchRequest) {
+            return ((ShardFetchSearchRequest) request).getShardSearchRequest().shardId().getIndexName();
+        }
+        return null;
+    }


There are more transport requests that target a single index/shard. These three were chosen to match what is currently used in benchmarking. They also serve to demonstrate the general idea of this change. I'll work on a more complete list once we agree on the approach.

Resolve aliases from IndexAbstraction.DataStream and IndexAbstraction.Index Helps with this specifically: https://github.com/elastic/elasticsearch/pull/78327/files#r717141768 Relates to #78327

Backport elastic#78372 to 7.x Resolve aliases from IndexAbstraction.DataStream and IndexAbstraction.Index Helps with this specifically: https://github.com/elastic/elasticsearch/pull/78327/files#r717141768 Relates to elastic#78327

albertzaharovits

I took a very close look at this.
My hope is that the same improvement is going to be covered in core by #78508. I think that is almost there, and it also avoids the serialization/deserialization costs you mentioned in the last weekly meeting.

ywangd · 2021-10-06T01:15:52Z

> I took a very close look at this. My hope is that the same improvement is going to be covered in core, by #78508. I think that is almost there, and it also avoids the serialization/deserialization costs you were mentioning in the last weekly meeting.

Thanks Albert. Filtering the list of names in the core (#78508) is a superior solution in terms of performance. It covers this PR and more:

It does what this PR is trying to do in a more natural way.
It reduces the looping cost in IndicesAndAliasesResolver#resolveIndicesAndAliases, because the set of requested indices is smaller.
More importantly, it cuts down the de/serialisation cost of sending requests across nodes.

I am happy to close this PR once #78508 is merged. We should also discuss whether the same change should be actively promoted to other parts in core where applicable. I think it should be. But discussion would be great to understand the impact better and also help decide on whether core should always be aware of the source names, i.e. whether the concrete index name is from an alias or even multiple aliases.

PS: This PR and other local hacks almost screamed for the change in #78508, but I just didn't notice it. A lesson learned on how to tackle an issue in a broader context (than just security).

tvernum · 2021-10-07T07:08:31Z

x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/RBACEngine.java

+        } else if (request instanceof ShardFetchSearchRequest) {
+            return ((ShardFetchSearchRequest) request).getShardSearchRequest().shardId().getIndexName();
+        }
+        return null;


Can we implement something in core that unifies these? It seems like

    public interface ShardIndicesRequest extends IndicesRequest {
        public ShardId shardId();
    }

would be helpful.
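
A minimal sketch of how such an interface might simplify the allowlist, using stand-in types rather than the real Elasticsearch classes. `ShardId` and `IndicesRequest` below are simplified placeholders, `FakeShardSearchRequest` is hypothetical, and `getSingleTargetIndex` is reduced to a single `instanceof` check.

```java
// Hypothetical sketch with placeholder types; not the real Elasticsearch classes.
final class ShardIndicesRequestSketch {

    // Placeholder for the real ShardId, which carries more than a name and a number.
    static final class ShardId {
        private final String indexName;
        private final int id;

        ShardId(String indexName, int id) {
            this.indexName = indexName;
            this.id = id;
        }

        String getIndexName() {
            return indexName;
        }

        int id() {
            return id;
        }
    }

    // Placeholder for IndicesRequest: exposes the original (possibly many) names.
    interface IndicesRequest {
        String[] indices();
    }

    // The interface proposed in the review comment.
    interface ShardIndicesRequest extends IndicesRequest {
        ShardId shardId();
    }

    // Hypothetical shard-level request used only for this example.
    static final class FakeShardSearchRequest implements ShardIndicesRequest {
        private final String[] originalIndices;
        private final ShardId shardId;

        FakeShardSearchRequest(String[] originalIndices, ShardId shardId) {
            this.originalIndices = originalIndices;
            this.shardId = shardId;
        }

        @Override
        public String[] indices() {
            return originalIndices;
        }

        @Override
        public ShardId shardId() {
            return shardId;
        }
    }

    // One instanceof check replaces the per-class allowlist.
    static String getSingleTargetIndex(Object request) {
        if (request instanceof ShardIndicesRequest) {
            return ((ShardIndicesRequest) request).shardId().getIndexName();
        }
        return null; // not a known single-target request
    }
}
```

Requests that expose an index name but no shard id, such as the `FieldCapabilitiesIndexRequest` branch in the snippet under review, would still need their own handling under this design.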

tvernum · 2021-10-07T07:12:15Z

x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/RBACEngine.java

+                // The code uses a simple heuristic to choose between the two: if the total number of aliases is less
+                // than the number of requested names, it uses Method 1. Otherwise, it uses Method 2.
+                int totalAliases = 0;
+                totalAliases += targetIndexAbstraction.getIndices().get(0).getAliases().size();


This bothers me.
Can we add getIndexMetadata to IndexAbstraction.Index so we don't have to rely on the implementation detail of pulling an index out of the singleton list?

Or add getNumberOfAliases to IndexAbstraction? (I assume we don't call getAliases().size() because it's not very efficient).

Or we could make IndexAbstraction.getAliases() return Collection<String> and make IndexAbstraction.Index.getAliases cheaper to call.

Actually, since we call getAliases below (in method 1) we should just call it here and keep a local copy of the result.

tvernum · 2021-10-07T07:20:56Z

x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/RBACEngine.java

+                }
+                if (totalAliases < indices.size()) {
+                    // Method 1
+                    effectiveIndices = new HashSet<>();


Suggested change

-                    effectiveIndices = new HashSet<>();
+                    effectiveIndices = new HashSet<>(totalAliases + 2);

// aliases + targetIndexName + parentDataStream
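
One point to keep in mind with a pre-sized `HashSet`: the constructor argument is an initial capacity, not an element count, and with the default load factor of 0.75 the set may still resize before it holds that many elements. A small sketch of sizing by expected element count follows; `capacityFor` and `newPresizedSet` are hypothetical helper names, not part of the PR.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helpers for sizing a HashSet by the number of elements it should hold.
final class PresizedHashSetSketch {

    // An initial capacity large enough to hold expectedSize elements
    // without a resize under the default load factor of 0.75.
    static int capacityFor(int expectedSize) {
        return (int) (expectedSize / 0.75f) + 1;
    }

    static <T> Set<T> newPresizedSet(int expectedSize) {
        return new HashSet<>(capacityFor(expectedSize));
    }
}
```

For the small sets in this code path the difference is likely negligible, so this matters mostly when the alias count is large.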

tvernum · 2021-10-07T07:27:35Z

x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/RBACEngine.java

+                    effectiveIndices = indices.stream().filter(name -> {
+                        if (name.equals(targetIndexName)) {
+                            return true;
+                        }
+                        final IndexAbstraction indexAbstraction = aliasAndIndexLookup.get(name);
+                        if (indexAbstraction.getType() != IndexAbstraction.Type.CONCRETE_INDEX) {
+                            return indexAbstraction.getIndices().stream().anyMatch(im -> im.getIndex().getName().equals(targetIndexName));
+                        }
+                        return false;
+                    }).collect(Collectors.toUnmodifiableSet());


Collectors.toUnmodifiableSet() is very inefficient. If you're trying to squeeze the milliseconds here, you're better off constructing the set by hand like in method 1.

Suggested change

-                    effectiveIndices = indices.stream().filter(name -> {
-                        if (name.equals(targetIndexName)) {
-                            return true;
-                        }
-                        final IndexAbstraction indexAbstraction = aliasAndIndexLookup.get(name);
-                        if (indexAbstraction.getType() != IndexAbstraction.Type.CONCRETE_INDEX) {
-                            return indexAbstraction.getIndices().stream().anyMatch(im -> im.getIndex().getName().equals(targetIndexName));
-                        }
-                        return false;
-                    }).collect(Collectors.toUnmodifiableSet());
+                    List<String> indexNames = new ArrayList<>(indices.size());
+                    indices.stream().filter(name -> {
+                        if (name.equals(targetIndexName)) {
+                            return true;
+                        }
+                        final IndexAbstraction indexAbstraction = aliasAndIndexLookup.get(name);
+                        if (indexAbstraction.getType() != IndexAbstraction.Type.CONCRETE_INDEX) {
+                            return indexAbstraction.getIndices().stream().anyMatch(im -> im.getIndex().getName().equals(targetIndexName));
+                        }
+                        return false;
+                    }).forEach(indexNames::add);
+                    effectiveIndices = Set.copyOf(indexNames);

I've suggested using a list to collect the names here, because I think it's going to be more efficient (but I haven't benchmarked it).
Set.copyOf constructs a new HashSet even if the argument is already a HashSet, so you're better off just using an ArrayList for raw efficiency while collecting and then paying the uniqueness cost once during copyOf.
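
The collection pattern described above can be sketched in isolation as follows. This is a simplified, hypothetical example: the predicate `matchesTarget` stands in for the alias and data stream check, and no benchmark backs the efficiency claim here either.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: collect matches into a list, then build the immutable set once.
final class CollectThenCopySketch {

    static Set<String> filterToSet(Collection<String> names, Predicate<String> matchesTarget) {
        // A list append does no hashing or uniqueness check.
        List<String> matched = new ArrayList<>(names.size());
        for (String name : names) {
            if (matchesTarget.test(name)) {
                matched.add(name);
            }
        }
        // Set.copyOf tolerates duplicates in its argument and removes them here, once.
        return Set.copyOf(matched);
    }
}
```

Note that `Set.copyOf` rejects `null` elements, so this shape only fits when the names are known to be non-null.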

tvernum · 2021-10-07T07:39:31Z

x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/RBACEngine.java

+        }
+
+        final IndicesAccessControl accessControl =
+            role.authorize(action, Set.copyOf(effectiveIndices), aliasAndIndexLookup, fieldPermissionsCache);


If we're going to introduce a copyOf here, then we should change the caller to use Set.of rather than Sets.newHashSet and reduce the amount of copying that happens.
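
For context on why the caller's set type matters: the `Set.copyOf` documentation states that when the argument is already an unmodifiable `Set`, calling it will generally not create a copy, whereas a mutable set such as a `HashSet` has to be copied. A small sketch, with hypothetical names, that compares the two cases:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: observe whether Set.copyOf returns its argument or a new set.
final class CopyOfSketch {

    // True if Set.copyOf handed back the very same instance, i.e. no copy was made.
    static boolean copyOfReusesInstance(Set<String> input) {
        return Set.copyOf(input) == input;
    }

    public static void main(String[] args) {
        // Unmodifiable input: the documentation says a copy is generally not created.
        System.out.println(copyOfReusesInstance(Set.of("a", "b")));
        // Mutable input: a new unmodifiable set has to be built.
        System.out.println(copyOfReusesInstance(new HashSet<>(Set.of("a", "b"))));
    }
}
```

If the caller already passes a set built with `Set.of`, the `copyOf` inside the callee becomes close to free, which is the reduction in copying the comment asks for.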

ywangd · 2021-10-20T23:12:11Z

Closing, since the change in core (#78508) is a better pattern.

ywangd added the >enhancement, :Security/Authorization (Roles, Privileges, DLS/FLS, RBAC/ABAC), v8.0.0 and v7.16.0 labels Sep 27, 2021

ywangd requested review from albertzaharovits and tvernum September 27, 2021 14:41

elasticmachine added the Team:Security (Meta label for security team) label Sep 27, 2021

martijnvg mentioned this pull request Sep 28, 2021

Resolve aliases from IndexAbstraction #78372

Merged

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Sep 28, 2021

Resolve aliases from IndexAbstraction.DataStream and IndexAbstraction.Index (47f6c5e)

Relates to elastic#78327

Merge remote-tracking branch 'origin/master' into faster-role-authorize

291253b

ywangd added 3 commits September 29, 2021 00:51

Update to leverage the new getAliases API

abf5f69

checkstyle

9050910

check index name itself

4605f35

Simple heuristic and comments.

9b88dac

This was referenced Sep 29, 2021

ListenableFuture for authorize index action #78358

Closed

Superuser fastpath for indexAccessControl #78498

Merged

albertzaharovits mentioned this pull request Oct 1, 2021

Filter original indices in shard level request #78508

Merged

albertzaharovits reviewed Oct 1, 2021

tvernum reviewed Oct 7, 2021

ywangd closed this Oct 20, 2021

jakelandis added v8.0.0-beta1 and removed v8.0.0 labels Oct 27, 2021

5 participants
