[ML][Inference] Fix model pagination with models as resources #51573
Changes from all commits
@@ -28,7 +28,6 @@
 import org.elasticsearch.action.support.WriteRequest;
 import org.elasticsearch.client.Client;
 import org.elasticsearch.common.CheckedBiFunction;
-import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.bytes.BytesReference;
 import org.elasticsearch.common.collect.Tuple;
@@ -73,10 +72,10 @@
 import java.util.Collections;
 import java.util.Comparator;
 import java.util.HashSet;
-import java.util.LinkedHashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
+import java.util.TreeSet;

 import static org.elasticsearch.xpack.core.ClientHelper.ML_ORIGIN;
 import static org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin;
@@ -381,19 +380,34 @@ public void deleteTrainedModel(String modelId, ActionListener<Boolean> listener)
     public void expandIds(String idExpression,
                           boolean allowNoResources,
-                          @Nullable PageParams pageParams,
+                          PageParams pageParams,
                           Set<String> tags,
                           ActionListener<Tuple<Long, Set<String>>> idsListener) {
         String[] tokens = Strings.tokenizeToStringArray(idExpression, ",");
+        Set<String> matchedResourceIds = matchedResourceIds(tokens);
+        Set<String> foundResourceIds;
+        if (tags.isEmpty()) {
+            foundResourceIds = matchedResourceIds;
+        } else {
+            foundResourceIds = new HashSet<>();
+            for (String resourceId : matchedResourceIds) {
+                // Does the model as a resource have all the tags?
+                if (Sets.newHashSet(loadModelFromResource(resourceId, true).getTags()).containsAll(tags)) {
+                    foundResourceIds.add(resourceId);
+                }
+            }
+        }
         SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
             .sort(SortBuilders.fieldSort(TrainedModelConfig.MODEL_ID.getPreferredName())
                 // If there are no resources, there might be no mapping for the id field.
                 // This makes sure we don't get an error if that happens.
                 .unmappedType("long"))
-            .query(buildExpandIdsQuery(tokens, tags));
-        if (pageParams != null) {
-            sourceBuilder.from(pageParams.getFrom()).size(pageParams.getSize());
-        }
+            .query(buildExpandIdsQuery(tokens, tags))
+            // We "buffer" the from and size to take into account models stored as resources.
+            // This is so we handle the edge cases when the model that is stored as a resource
+            // is at the start/end of a page.
+            .from(Math.max(0, pageParams.getFrom() - foundResourceIds.size()))
Member (davidkyle):
So the strategy is to get extra then figure out where the resource model ids would fit in that sorted list 👍

Member (Author):
@davidkyle exactly. If we don't have the buffer, AND the resource model ID is at the start of the list, we don't know if it is ACTUALLY at the start of the page, or the END of the previous page. Hence the need for a buffer.
+            .size(Math.min(10_000, pageParams.getSize() + foundResourceIds.size()));
         sourceBuilder.trackTotalHits(true)
             // we only care about the item id's
             .fetchSource(TrainedModelConfig.MODEL_ID.getPreferredName(), null);
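To make the buffering arithmetic concrete, here is a small standalone sketch. It is illustrative only, not code from the PR; the variable names and values are made up.

```java
// Sketch of the from/size buffering used above, under assumed values.
public final class PageBufferingSketch {
    public static void main(String[] args) {
        int from = 10;               // requested page start
        int size = 10;               // requested page size
        int numResourceModels = 2;   // models stored as resources, absent from the index

        // Widen the window on both sides so a resource id that sorts near a
        // page boundary cannot be assigned to the wrong page.
        int bufferedFrom = Math.max(0, from - numResourceModels);      // 8
        int bufferedSize = Math.min(10_000, size + numResourceModels); // 12

        System.out.println("search from=" + bufferedFrom + " size=" + bufferedSize);
    }
}
```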
@@ -406,47 +420,65 @@ public void expandIds(String idExpression,
                 indicesOptions.expandWildcardsClosed(),
                 indicesOptions))
             .source(sourceBuilder);
-        Set<String> foundResourceIds = new LinkedHashSet<>();
-        if (tags.isEmpty()) {
-            foundResourceIds.addAll(matchedResourceIds(tokens));
-        } else {
-            for (String resourceId : matchedResourceIds(tokens)) {
-                // Does the model as a resource have all the tags?
-                if (Sets.newHashSet(loadModelFromResource(resourceId, true).getTags()).containsAll(tags)) {
-                    foundResourceIds.add(resourceId);
-                }
-            }
-        }

         executeAsyncWithOrigin(client.threadPool().getThreadContext(),
             ML_ORIGIN,
             searchRequest,
             ActionListener.<SearchResponse>wrap(
                 response -> {
                     long totalHitCount = response.getHits().getTotalHits().value + foundResourceIds.size();
+                    Set<String> foundFromDocs = new HashSet<>();
                     for (SearchHit hit : response.getHits().getHits()) {
                         Map<String, Object> docSource = hit.getSourceAsMap();
                         if (docSource == null) {
                             continue;
                         }
                         Object idValue = docSource.get(TrainedModelConfig.MODEL_ID.getPreferredName());
                         if (idValue instanceof String) {
-                            foundResourceIds.add(idValue.toString());
+                            foundFromDocs.add(idValue.toString());
                         }
                     }
+                    Set<String> allFoundIds = collectIds(pageParams, foundResourceIds, foundFromDocs);
                     ExpandedIdsMatcher requiredMatches = new ExpandedIdsMatcher(tokens, allowNoResources);
-                    requiredMatches.filterMatchedIds(foundResourceIds);
+                    requiredMatches.filterMatchedIds(allFoundIds);
                     if (requiredMatches.hasUnmatchedIds()) {
                         idsListener.onFailure(ExceptionsHelper.missingTrainedModel(requiredMatches.unmatchedIdsString()));
Member (davidkyle):
Could this error in a valid situation? Take from = 0, size = 10, idExpression = 'bar*,foo', where more than 10 models match bar*. The first 10 hits are all bar* models, so foo never appears in the page even though it exists.

Member (Author):
@davidkyle I think this was a problem well before I added this logic. Data frame analytics suffers from this as well. I think it is OK to error in this case. I may be wrong though (about it being OK) :D.

Member (davidkyle):
Certainly it's not part of this change, but let's follow up on it. I don't think we should be reporting an error when we don't know if it is an error or not; maybe we can relax the check.
                     } else {
-                        idsListener.onResponse(Tuple.tuple(totalHitCount, foundResourceIds));
+                        idsListener.onResponse(Tuple.tuple(totalHitCount, allFoundIds));
                     }
                 },
                 idsListener::onFailure
             ),
             client::search);
     }
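The edge case discussed above can be simulated in isolation. Below is a minimal sketch, not the PR's code: the ids are invented and the required-match check is a simplified stand-in for ExpandedIdsMatcher.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// A concrete id that sorts after a full page of wildcard matches never shows
// up in the page, so a strict required-match check reports it missing even
// though the model exists (the edge case discussed above).
public final class PageBoundaryExample {
    public static void main(String[] args) {
        // First page (from=0, size=10) of ids matching "bar*":
        Set<String> page = new TreeSet<>();
        for (int i = 10; i < 20; i++) {
            page.add("bar-" + i);
        }
        // "foo" was requested explicitly in idExpression "bar*,foo".
        List<String> requiredConcreteIds = List.of("foo");
        for (String required : requiredConcreteIds) {
            if (page.contains(required) == false) {
                // False negative: foo exists but sorts onto a later page.
                System.out.println("resource_not_found: " + required);
            }
        }
    }
}
```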
+    static Set<String> collectIds(PageParams pageParams, Set<String> foundFromResources, Set<String> foundFromDocs) {
+        // If there are no matching resource models, there was no buffering and the models from the docs
+        // are paginated correctly.
+        if (foundFromResources.isEmpty()) {
+            return foundFromDocs;
+        }
+
+        TreeSet<String> allFoundIds = new TreeSet<>(foundFromDocs);
+        allFoundIds.addAll(foundFromResources);
+
+        if (pageParams.getFrom() > 0) {
+            // not the first page so there will be extra results at the front to remove
+            int numToTrimFromFront = Math.min(foundFromResources.size(), pageParams.getFrom());
+            for (int i = 0; i < numToTrimFromFront; i++) {
+                allFoundIds.remove(allFoundIds.first());
+            }
+        }
+
+        // trim down to size removing from the rear
+        while (allFoundIds.size() > pageParams.getSize()) {
+            allFoundIds.remove(allFoundIds.last());
+        }
+
+        return allFoundIds;
Member (davidkyle):
My way of looking at it is this: we have a request for 10 ids and 2 models stored as resource. We query for 14 ids so we can figure out where the resource model ids would fit. We have a result like AAOOOOOOOOOOBB where the As and Bs are the padding. If the model resource ids come before the As then they are outside the ordering and all the Os are returned. Same if the model resource ids come after the Bs. If they are inserted in the middle (ids M & N) we have an ordering like AAOOOOOMOOONOOBB and we take the first 10 after the As. Same for …

I think we have to track the insertion position of the model resource ids. There is a further complication when the search does not return size hits …

Member (Author):
There are no … if that makes sense.

Member (Author):
Let me take your example and run with it, assuming size is 10: AAOOOOOOOOOO. If the resource models are inserted in the middle (ids M & N) we have an ordering like AAOOOOOMOOONOO, since our buffered size only returns that many hits. Returning to the user: OOOOOMOOON. Here are the results for the rest of your example insertion points …

I think 1 and 4 are obvious OK situations. 2 and 3 may seem weird off hand, but they are OK as well, since …

I don't think tracking the insertion position is possible, since we don't know all the page boundaries (without the buffers), and knowing the insertion point does not add anything.

Member (davidkyle):
Thanks for the explanation. From those examples it seems once you have taken the 2 sorted sets and merged them together there are 2 conditions to consider … In this case there are no As to remove, so we trim the set to size from the back. With those observations the code could be simplified to …

Member (Author):
@davidkyle wrote some more test cases, and this solution works great. I modified my code :D
+    }

     static QueryBuilder buildExpandIdsQuery(String[] tokens, Collection<String> tags) {
         BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
             .filter(buildQueryIdExpressionQuery(tokens, TrainedModelConfig.MODEL_ID.getPreferredName()));
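To see what collectIds does with the buffered window, here is a self-contained walk-through of the padding example from the thread above. It is illustrative only: the ids, and the PageParams stand-ins from=2 and size=10, are made up.

```java
import java.util.Set;
import java.util.TreeSet;

// Walk-through of the collectIds trimming logic on a buffered result window.
public final class CollectIdsExample {
    public static void main(String[] args) {
        int from = 2;   // requested page start (hypothetical PageParams)
        int size = 10;  // requested page size

        // 12 doc ids came back from the buffered (from-2, size+2) search window:
        // a1, a2 are front padding; p1 is rear padding.
        Set<String> foundFromDocs = Set.of(
            "a1", "a2", "o1", "o2", "o3", "o4", "o5", "o6", "o7", "o8", "o9", "p1");
        // Two models stored as resources sort into the middle of the page.
        Set<String> foundFromResources = Set.of("m", "n");

        TreeSet<String> allFoundIds = new TreeSet<>(foundFromDocs);
        allFoundIds.addAll(foundFromResources); // 14 ids total

        // Not the first page, so trim the front padding.
        int numToTrimFromFront = Math.min(foundFromResources.size(), from);
        for (int i = 0; i < numToTrimFromFront; i++) {
            allFoundIds.remove(allFoundIds.first()); // drops a1, a2
        }
        // Trim down to size from the rear.
        while (allFoundIds.size() > size) {
            allFoundIds.remove(allFoundIds.last()); // drops p1, o9
        }
        System.out.println(allFoundIds); // [m, n, o1, o2, o3, o4, o5, o6, o7, o8]
    }
}
```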
@@ -517,7 +549,7 @@ private static QueryBuilder buildQueryIdExpressionQuery(String[] tokens, String

     private Set<String> matchedResourceIds(String[] tokens) {
         if (Strings.isAllOrWildcard(tokens)) {
-            return new HashSet<>(MODELS_STORED_AS_RESOURCE);
+            return MODELS_STORED_AS_RESOURCE;
         }

         Set<String> matchedModels = new HashSet<>();
@@ -535,7 +567,7 @@ private Set<String> matchedResourceIds(String[] tokens) {
             }
         }
-        return matchedModels;
+        return Collections.unmodifiableSet(matchedModels);
     }

     private static <T> T handleSearchItem(MultiSearchResponse.Item item,
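A quick sketch of why the unmodifiable wrappers matter once MODELS_STORED_AS_RESOURCE is returned directly. This is illustrative: the constant and the caller here are simplified stand-ins, and lang_ident_model_1 is just an example id.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Returning the shared constant set avoids a defensive copy, but the set must
// then be immutable so no caller can mutate it in place.
public final class UnmodifiableReturnSketch {
    private static final Set<String> MODELS_STORED_AS_RESOURCE =
        Collections.unmodifiableSet(new HashSet<>(Set.of("lang_ident_model_1")));

    static Set<String> matchedResourceIds() {
        return MODELS_STORED_AS_RESOURCE; // shared, no copy
    }

    public static void main(String[] args) {
        Set<String> ids = matchedResourceIds();
        // Callers that need to add ids copy first, as expandIds does:
        Set<String> merged = new HashSet<>(ids);
        merged.add("my-model");
        System.out.println(merged);
        // ids.add("oops"); // would throw UnsupportedOperationException
    }
}
```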
Member:
So, are pageParams guaranteed to be non-null?
Member:
Is the pageParams parameter really necessary? I'd argue that I would expect expandIds to return all the ids rather than a page, as the ids are small compared to a full document.

The default value of PageParams.getSize() is 100. Without setting a page params object you may not realise that limitation is in place and that you are not getting all of your ids. Is there even a mechanism to say you need to get the second page now?

Compare to JobConfigProvider.expandJobIds, which gives you AnomalyDetectorsIndex.CONFIG_INDEX_MAX_RESULTS_WINDOW = 10_000 ids.

It would also simplify this code to remove the pageParams, which is a very good reason to remove it.
Member (Author):
Yes, I think it is. The APIs making calls against this function require paging params (default is from: 0, size: 100). I think this is a sane way of handling larger numbers of items.

I think the unpaged behaviour of expandJobIds is a bug and shows that we don't support good pagination at all with anomaly detection jobs.

Yes, the total count of all the ids found with the id pattern is returned.
Member (Author):
It is not possible to set them as null via the API, since you can only set from and size individually. If they are null from a client's perspective, that just means the client didn't set those parameters (and thus gets the default values).

If an internal caller is making these calls, they should not explicitly set pageParams to null, and if they do, they will receive an error pretty quickly.
Member:
My concern was that with default paging we open ourselves up to bugs such as elastic/kibana#38559.

I assumed getTrainedModel would take paging parameters, in which case paging in expandIds would be redundant, but I see from the way the code is used that is not the case and GetTrainedModelsStatsAction explicitly depends on paging ids.

AD jobs are not paged because when they lived in cluster state all were returned by the GET API. It would have been a breaking change to require paging on that API when we moved to the .ml-config index, and we could not make a breaking change in that release.
Member (Author):
Yes, the APIs GET _ml/trained_models and GET _ml/trained_models/_stats accept paging parameters and use those when expanding the IDs.

Since the models are stored in indices, paging parameters are a must (even for expanding IDs), and we might as well handle them now instead of worrying about adding them later when we reach scaling issues.