Skip to content

Conversation

@areek
Copy link
Contributor

@areek areek commented Aug 22, 2016

Currently, bulk item requests can be any ActionRequest, this PR
restricts bulk item requests to DocumentRequest. This simplifies
handling failures during bulk requests. Additionally, a new enum
is added to DocumentRequest to represent the intended operation
to be performed by a document request (create, index, update
and delete), which was previously represented with a mix of strings
and index request operation type.
Now, index request operation type reuses the new enum to specify
whether the request should create or index a document.
Restricting bulk requests to DocumentRequest further simplifies
execution of shard-level bulk operations to use the same failure
handling for index, delete and update operations.
This PR also fixes a bug which executed delete operations twice for
replica copies while executing bulk requests.

relates #19105

Currently, bulk item requests can be any ActionRequest, this commit
restricts bulk item requests to DocumentRequest. This simplifies
handling failures during bulk requests. Additionally, a new enum
is added to DocumentRequest to represent the intended operation
to be performed by a document request. Now, index operation type
also uses the new enum to specify whether the request should
create or index a document.
@areek areek force-pushed the cleanup/transport_bulk branch from ff103b3 to f207ecc Compare August 23, 2016 14:33
This commit refactors execution of shard-level
bulk operations to use the same failure handling
for index, delete and update operations.
@areek areek force-pushed the cleanup/transport_bulk branch from f207ecc to 14908f8 Compare September 1, 2016 18:16
@areek areek force-pushed the cleanup/transport_bulk branch from 1fa265b to 5677cb6 Compare October 3, 2016 18:35
@areek areek force-pushed the cleanup/transport_bulk branch from 5677cb6 to 248ac24 Compare October 3, 2016 20:12
@areek areek force-pushed the cleanup/transport_bulk branch from e515328 to 40b4f39 Compare October 4, 2016 19:05
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through this very carefully and it looks great. It's an amazing restructuring and I'm happy you took it on yourself. I left some minor comments. I also think we can potentially simplify further and make the DocumentRequest inherit from WriteReplicationRequest (and call DocWriteRequest) . I'm not sure but it would be great if you can give it a go.

UpdateRequest updateRequest = new UpdateRequest();
updateRequest.readFrom(in);
request = updateRequest;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an else and throw an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion


OpType(int op) {
this.op = (byte) op;
this.lowercase = this.toString().toLowerCase(Locale.ENGLISH);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we typically use ROOT for these things?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it to ROOT, but there are a few places (e.g. DocWriteResponse) where we use ENGLISH too. maybe we should change it to ROOT as well?

out.writeVInt(requests.size());
for (ActionRequest<?> request : requests) {
for (DocumentRequest<?> request : requests) {
if (request instanceof IndexRequest) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we fold these reads and writes into static methods of DocumentRequest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (request == null) {
continue;
}
String concreteIndex = concreteIndices.getConcreteIndex(request.index()).getName();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

setResponse(item, item.getPrimaryResponse());
BulkItemRequest item = request.items()[requestIndex];
DocumentRequest<?> documentRequest = item.request();
if (ExceptionsHelper.status(e) == RestStatus.CONFLICT) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we stay consistent and use isConflictException? (I know it was like this before)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

location = locationToSync(location, result.getLocation());
WriteResult<? extends DocWriteResponse> writeResult = innerExecuteBulkItemRequest(metaData, indexShard,
request, requestIndex);
if (writeResult.getResponse().getResult() != DocWriteResponse.Result.NOOP) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not check if writeResult.getLocation() != null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to be more strict as only a noop result can be valid with a null location. changed it to be an assert instead.

public void writeTo(StreamOutput out) throws IOException {
out.writeVInt(id);
out.writeString(opType);
out.writeByte(opType.getId());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need bwc here too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bwc added

areek added 3 commits October 5, 2016 17:51
Currently, update action delegates to index and delete actions
for replication using a dedicated transport action. This change
makes update a replication operation, removing the dedicated
transport action. This simplifies bulk execution and removes
duplicate logic for update retries and translation. This
consolidates the interface for single document write requests.

Now on the primary, the update request is translated to
an index or delete request before execution and the translated
request is sent to copies for replication.
@areek areek force-pushed the cleanup/transport_bulk branch from e68723a to eee0d18 Compare October 6, 2016 09:03
@areek
Copy link
Contributor Author

areek commented Oct 6, 2016

Thanks @bleskes for the review :). I addressed all the minor comments.

I also think we can potentially simplify further and make the DocumentRequest inherit from WriteReplicationRequest (and call DocWriteRequest)

I really like the idea of making DocumentRequest inherit from WriteReplicationRequest. While giving this a go, I refactored update operation to be a replication operation (updates are a DocumentWriteRequest but does't use the replication operation instead delegates to index and delete operations for replication). This cleans up the dedicated transport update action (TransportInstanceSingleOperationAction) in favour of reusing replication action and duplicate code for update retry logic and translation. The one big downside is it would be difficult to implement wire bwc with 5.0. WDYT?

@areek areek force-pushed the cleanup/transport_bulk branch 2 times, most recently from 8f01c60 to 02ea4b8 Compare October 6, 2016 18:42
@areek areek force-pushed the cleanup/transport_bulk branch from 02ea4b8 to 42bc2d1 Compare October 6, 2016 18:47
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @areek . Let's leave the merger/inheritance of DocumentRequest and ReplicationRequest alone. None of the solutions seem ideal (update request is not a replication request and the fact it's running on the primary is not a good thing). This PR is great enough and it would be a shame to delay it. We can revisit / evaluate again when have a better idea.

I do have two small asks (next to a question I left in the comments)

return new UpdateResult(translate, indexRequest, retry, cause, null);
} else {
assert translate.getResponseResult() == DocWriteResponse.Result.UPDATED;
update.setGetResult(updateHelper.extractGetResult(updateRequest, updateRequest.concreteIndex(), indexResponse.getVersion(), translate.updatedSourceAsMap(), translate.updateSourceContentType(), indexSourceAsBytes));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we now return always the get result on updates? it seems to be different before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to match the logic in TransportUpdateAction.shardOperation but that caused CI failures, I reverted the change to be the same as update operation in bulk

@bleskes
Copy link
Contributor

bleskes commented Oct 9, 2016

It also seems CI isn't happy.

@areek
Copy link
Contributor Author

areek commented Oct 12, 2016

Thanks @bleskes for the feedback. I updated the PR, addressing your comments and the CI is happy. Could you take a look

@bleskes
Copy link
Contributor

bleskes commented Oct 12, 2016

LGTM. Awesome @areek

@areek areek merged commit 133be66 into elastic:master Oct 12, 2016
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Jan 23, 2017
This is a bespoke backport of elastic#20109 for 5.x:

Currently, bulk item requests can be any ActionRequest, this PR restricts bulk
item requests to DocumentRequest. This simplifies handling failures during bulk
requests. Additionally, a new enum is added to DocumentRequest to represent the
intended operation to be performed by a document request (create, index, update
and delete), which was previously represented with a mix of strings and index
request operation type.

Now, index request operation type reuses the new enum to specify whether the
request should create or index a document. Restricting bulk requests to
DocumentRequest further simplifies execution of shard-level bulk operations to
use the same failure handling for index, delete and update operations. This PR
also fixes a bug which executed delete operations twice for replica copies while
executing bulk requests.

Relates to elastic#19105 and elastic#20109
@lcawl lcawl added :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. and removed :Bulk labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >enhancement v6.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants