Script: Ingest Metadata and CtxMap #88458
…return vt validator, doc matcher only checks metadata map
Adds FieldProperties record that configures how to validate fields. Fields have a type, are writeable or read-only, and nullable or not, and may have an additional validation useful for Set/Enum validation. Splits IngestMetadata from Metadata in preparation for new Metadata subclasses.
… into ingest_ctx_map_field_prop
… into ingest_sm_to_ctx_map
… into ingest_sm_to_ctx_map
…search into ingest_ctx_map_field_prop
Pinging @elastic/es-core-infra (Team:Core/Infra)
Pinging @elastic/es-data-management (Team:Data Management)
@elasticsearchmachine run elasticsearch-ci/part-2
…gest_ctx_map_field_prop
…gest_ctx_map_field_prop
rjernst
left a comment
This looks fine. I don't quite see yet how this will get to what we discussed before (subclasses of Metadata passing the operations they support for keys, i.e. read-only vs read-write), but I know there are more followups. This PR on its own looks like an improvement.
        return null;
    }

    static class IngestMetadata extends Metadata {
Since this is static, let's move it out to a top-level class. It can still be package private.
There is already an IngestMetadata that "Holds the ingest pipelines that are available in the cluster".
I'm going to rename this to IngestDocMetadata when I make it top-level.
    public static Tuple<Map<String, Object>, Map<String, Object>> splitSourceAndMetadata(Map<String, Object> sourceAndMetadata) {
        if (sourceAndMetadata instanceof IngestSourceAndMetadata ingestSourceAndMetadata) {
            return new Tuple<>(new HashMap<>(ingestSourceAndMetadata.source), new HashMap<>(ingestSourceAndMetadata.metadata.getMap()));
    public static Tuple<Map<String, Object>, Map<String, Object>> splitSourceAndMetadata(
Why is this method necessary when the source and metadata are already split internally? Couldn't this be two separate calls to member methods, to get the source map and the Metadata object?
Removed.
    }

    /**
     * Check that all metadata map contains only valid metadata and no extraneous keys and source map contains no metadata
nit: I think this comment is off? there is no source map
Updated.
        put(VERSION, version);
    }

    public ZonedDateTime getTimestamp() {
This can move completely to IngestMetadata right? We shouldn't need it to exist at all on other Metadata subclasses.
It is available in 2 of the 4 write contexts: ingest and update. It is missing in reindex and update by query.
It's here because that allows scripts to do Metadata m = metadata(); m.timestamp. Otherwise, the script would have to name the subtype.
It is possible to add timestamp to reindex and update by query; both have access to the thread pool and can pull a long supplier from it.
We could duck-type the Metadata subclasses just for timestamp.
If it’s relevant to all the other contexts, then it makes sense to stay completely in this class. But I don’t think we should have it defined here but implemented in subclasses.
I'll change the implementation here in the Update PR.
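For readers following the timestamp thread, here is a minimal Java sketch of the trade-off being discussed: an accessor declared on the base type can be called without naming a subtype, but only some contexts can actually supply a value. All class names here (BaseMetadata, IngestLikeMetadata, ReindexLikeMetadata) are hypothetical and do not reflect the actual Elasticsearch classes, and throwing UnsupportedOperationException is an assumption made for illustration only.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical base type: declares the accessor so callers can use it
// through a BaseMetadata reference without casting to a subtype.
abstract class BaseMetadata {
    ZonedDateTime getTimestamp() {
        // Assumption for this sketch: contexts without a timestamp reject the call.
        throw new UnsupportedOperationException("timestamp is not available in this context");
    }
}

// Hypothetical context that has a timestamp (like ingest or update in the discussion).
class IngestLikeMetadata extends BaseMetadata {
    private final ZonedDateTime timestamp;

    IngestLikeMetadata(ZonedDateTime timestamp) {
        this.timestamp = timestamp;
    }

    @Override
    ZonedDateTime getTimestamp() {
        return timestamp;
    }
}

// Hypothetical context without a timestamp (like reindex or update by query).
class ReindexLikeMetadata extends BaseMetadata {
}

class TimestampDemo {
    public static void main(String[] args) {
        ZonedDateTime now = ZonedDateTime.now(ZoneOffset.UTC);
        BaseMetadata m = new IngestLikeMetadata(now);
        // No subtype is named at the call site.
        System.out.println(m.getTimestamp());
    }
}
```

The objection raised in the thread applies to exactly this shape: the method is defined on the base class but only meaningfully implemented in some subclasses.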
…x validate javadoc
jdconrad
left a comment
LGTM. A few minor comments.
        ZonedDateTime timestamp,
        Map<String, Object> source
    ) {
        super(new HashMap<>(source), new IngestDocMetadata(index, id, version, routing, versionType, timestamp));
Since we do a deep copy of source when calling this constructor, do we then need to wrap source in a new HashMap again?
A bunch of tests call into IngestDocument, which calls this constructor. Those tests all pass in a Map.of or singletonMap and fail if we don't perform this copy.
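As background for the copy discussed here: maps created with Map.of (and Collections.singletonMap) are unmodifiable, so any later put throws UnsupportedOperationException unless the map is first copied into a mutable one. A small self-contained sketch, where the class and method names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the PR.
class ImmutableSourceDemo {
    // Returns true if the map accepts a put, false if it rejects mutation.
    static boolean acceptsPut(Map<String, Object> map) {
        try {
            map.put("added", 1);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> immutable = Map.of("field", "value");
        // The unmodifiable map rejects writes.
        System.out.println(acceptsPut(immutable));
        // A HashMap copy of the same entries accepts them.
        System.out.println(acceptsPut(new HashMap<>(immutable)));
    }
}
```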
            copy.put(entry.getKey(), deepCopy(entry.getValue()));
        }
        // TODO(stu): should this check for IngestSourceAndMetadata in addition to Map?
        // TODO(stu): should this check for IngestCtxMap in addition to Map?
Did you mean to leave this as part of the PR?
It shows up because of the rename; it's still a relevant question for now.
This reverts commit a08afb8.
Create a Metadata superclass for ingest and update contexts.
Create a CtxMap superclass for ctx backwards compatibility in ingest and update contexts. script.CtxMap was moved from ingest.IngestSourceAndMetadata.
CtxMap takes a Metadata subclass and validates updates via the FieldPropertys passed in.
Metadata provides typed getters and setters and implements a Map-like interface, making it easy for a class containing CtxMap to implement the full Map interface.
The FieldProperty record configures how to validate fields. Fields have a type, are writeable or read-only, and nullable or not, and may have an additional validation useful for Set/Enum validation.
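To make the validation described above concrete, here is a rough Java sketch of per-field rules (type, nullable, writeable, optional extra validation) applied by a map-like metadata holder. It is an illustration under assumptions, not the code in this PR: FieldRule, ValidatedMetadata, oneOf, and the key names and allowed values used in main are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;

// Hypothetical stand-in for the field-validation record described above.
record FieldRule(Class<?> type, boolean nullable, boolean writable, BiConsumer<String, Object> extendedValidation) {

    // Validates a value for a key. isWrite is true for updates, false for initial population.
    void check(String key, Object value, boolean isWrite) {
        if (isWrite && !writable) {
            throw new IllegalArgumentException(key + " is read-only");
        }
        if (value == null) {
            if (!nullable) {
                throw new IllegalArgumentException(key + " cannot be null");
            }
            return;
        }
        if (!type.isInstance(value)) {
            throw new IllegalArgumentException(key + " must be a " + type.getSimpleName());
        }
        if (extendedValidation != null) {
            extendedValidation.accept(key, value);
        }
    }

    // Extra validator that restricts a value to a fixed set (the Set/Enum case).
    static BiConsumer<String, Object> oneOf(Set<String> allowed) {
        return (key, value) -> {
            if (!allowed.contains(value)) {
                throw new IllegalArgumentException(key + " must be one of " + allowed);
            }
        };
    }
}

// Hypothetical map-like holder that validates every write against its rules.
class ValidatedMetadata {
    private final Map<String, FieldRule> rules;
    private final Map<String, Object> values = new HashMap<>();

    ValidatedMetadata(Map<String, FieldRule> rules, Map<String, Object> initial) {
        this.rules = rules;
        for (Map.Entry<String, Object> e : initial.entrySet()) {
            // Initial values may populate read-only fields, so isWrite is false here.
            ruleFor(e.getKey()).check(e.getKey(), e.getValue(), false);
            values.put(e.getKey(), e.getValue());
        }
    }

    private FieldRule ruleFor(String key) {
        FieldRule rule = rules.get(key);
        if (rule == null) {
            throw new IllegalArgumentException("unknown metadata key " + key);
        }
        return rule;
    }

    Object put(String key, Object value) {
        ruleFor(key).check(key, value, true);
        return values.put(key, value);
    }

    Object get(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        // Key names and allowed values are illustrative only.
        Map<String, FieldRule> rules = Map.of(
            "_version", new FieldRule(Long.class, false, true, null),
            "_version_type", new FieldRule(String.class, true, true, FieldRule.oneOf(Set.of("internal", "external"))),
            "_timestamp", new FieldRule(String.class, false, false, null)
        );
        ValidatedMetadata md = new ValidatedMetadata(rules, Map.of("_version", 1L, "_timestamp", "t0"));
        md.put("_version", 2L);
        System.out.println(md.get("_version"));
    }
}
```

Keeping the rules in a map keyed by field name means a different context could pass a different rule set to the same holder, which is the direction the review comments point toward.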