Description
Correctly processing the response to a Bulk API call is incredibly important for ensuring data integrity. However, the 2.x Bulk API reference docs only state:
The response to a bulk action is a large JSON structure with the individual results of each action that was performed.
The only description of that "large JSON structure" is in an example curl command at the top of the page, which shows only what a single action's response might look like:
{"took":7,"items":[{"create":{"_index":"test","_type":"type1","_id":"1","_version":1}}]}
There is a ton to infer from that snippet:
- `items` is an array of the same length and order as the sent actions+documents?
- The response to `index` actions will indicate how they were applied? So for a new document the response type will be `create` but for an existing doc it would be...?
- What indicates an error?!
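To make those questions concrete, here is a minimal Python sketch of how a client might process the response. It rests on assumptions the 2.x page does not confirm: that `items` lines up one-to-one, in order, with the sent actions; that each item is a single-key object whose key is the action type; and that a failed item carries an `error` field or a `status` of 400 or above. The function name `failed_items` is hypothetical and not part of any Elasticsearch client.

```python
def failed_items(actions, response):
    """Pair each sent action with its result and return the failures.

    Assumptions (not confirmed by the 2.x docs): items are in the same
    order as the sent actions, each item has exactly one key (the action
    type), and failures carry an "error" field or a status >= 400.
    """
    items = response.get("items", [])
    if len(items) != len(actions):
        raise ValueError("response items do not line up with sent actions")

    failures = []
    for position, (action, item) in enumerate(zip(actions, items)):
        # Unpack the single {action_type: result} pair; raises if the
        # item has more or fewer than one key.
        (action_type, result), = item.items()
        status = result.get("status")
        if "error" in result or (status is not None and status >= 400):
            failures.append((position, action_type, action, result))
    return failures
```

If the ordering assumption is wrong, matching on `_id` would be needed instead of matching on position, which is exactly why the docs should spell this out.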
Other oddity: non-HTTP API?
Same page and small, so I don't think it's worth a separate issue but:
If using the HTTP API...
This is the only hint on the entire page that there's a non-HTTP bulk API. Pretty sure this phrase should just be removed?
Update: Aha! https://www.elastic.co/guide/en/elasticsearch/reference/2.1/breaking_20_removed_features.html#_bulk_udp
Final nitpick: Write consistency means...?
The write consistency docs only say enough shards must be active to satisfy write consistency concerns.
In other distributed systems I associate write consistency with how many nodes must commit a write before the write is considered successfully applied.
Does write consistency depend solely on cluster state (the mere availability of active shards), or does it actually block on the writes to remote hosts?
(Also, with replicas=1, quorum should really mean the primary+replica both must be available, not just one of them... one and quorum should not be the same, but thanks for at least documenting that they are!)
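The arithmetic behind that last point can be sketched in Python. This assumes quorum means a strict majority of shard copies (primary plus replicas), and it encodes the single-replica exception only because the docs say one and quorum behave the same there. The function name `required_active_copies` is hypothetical, and this is not Elasticsearch's actual implementation.

```python
def required_active_copies(level, number_of_replicas):
    """Return how many shard copies must be active for a write to proceed.

    Sketch only: assumes "quorum" is a strict majority of all copies,
    except for the documented single-replica case.
    """
    copies = 1 + number_of_replicas  # the primary plus its replicas
    if level == "one":
        return 1
    if level == "all":
        return copies
    if level == "quorum":
        if number_of_replicas == 1:
            # Documented behaviour: with one replica, quorum acts like "one".
            # A strict majority of 2 copies would be 2.
            return 1
        return copies // 2 + 1
    raise ValueError("unknown write consistency level: %r" % (level,))
```

Note that this only counts copies that must be active. It says nothing about whether the write blocks until those copies have applied it, which is the open question above.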