Description
Elasticsearch version (bin/elasticsearch --version):
5.5.2
Plugins installed: [discovery-ec2, repository-s3]
JVM version (java -version):
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
OS version (uname -a if on a Unix-like system):
Amazon Linux on EC2 instances
Linux 4.9.43-17.38.amzn1.x86_64 #1 SMP Thu Aug 17 00:20:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
Steps to reproduce:
I recently upgraded a few clusters in different environments from 5.2.2 to 5.5.2. Since then, one of the clusters has been hitting timeout failures when creating snapshots to S3. A few snapshots have succeeded, and the other clusters have had no failures, so I know snapshotting does work. However, most runs produce one or more failed shards, all with the same timeout error. So far this has been limited to our production cluster, which has the most and largest indices.
Provide logs (if relevant):
Some data redacted with ... below.
"failures": [
{
"index": "...-2017.09.11",
"index_uuid": "...-2017.09.11",
"shard_id": 4,
"reason": "IndexShardSnapshotFailedException[Failed to perform snapshot (index files)]; nested: IOException[Unable to upload object elasticsearch-snapshots/indices/.../4/__e]; nested: AmazonS3Exception[Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: BB3062E801AD4513)]; ",
"node_id": "...",
"status": "INTERNAL_SERVER_ERROR"
}
],
"failures": [
{
"index": "...-2017.09.10",
"index_uuid": "...-2017.09.10",
"shard_id": 0,
"reason": "IndexShardSnapshotFailedException[Failed to perform snapshot (index files)]; nested: IOException[Unable to upload object elasticsearch-snapshots/indices/.../0/__1]; nested: AmazonS3Exception[Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: 4C030EBC4EB49F51)]; ",
"node_id": "...",
"status": "INTERNAL_SERVER_ERROR"
},
{
"index": "...-2017.08.30",
"index_uuid": "...-2017.08.30",
"shard_id": 0,
"reason": "IndexShardSnapshotFailedException[Failed to write file list]; nested: IOException[Unable to upload object elasticsearch-snapshots/indices/.../0/pending-index-11]; nested: AmazonS3Exception[Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: 1562D61EBA5696BD)]; ",
"node_id": "...",
"status": "INTERNAL_SERVER_ERROR"
}
],
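To check at a glance whether every failed shard hit the same S3 error, the `failures` arrays above can be tallied by error code. This is a minimal sketch, assuming the snapshot response has already been loaded into a Python list of dicts; `summarize_failures` and the `S3_ERROR` pattern are hypothetical helpers written for illustration, not part of Elasticsearch or the AWS SDK, and the sample `reason` string is abbreviated.

```python
import re
from collections import Counter

# Hypothetical pattern: matches the "Error Code: ...; Request ID: ..." tail
# that appears in the AmazonS3Exception text of each failure reason.
S3_ERROR = re.compile(r"Error Code: (?P<code>\w+); Request ID: (?P<request_id>\w+)")


def summarize_failures(failures):
    """Count failed shards per S3 error code found in the 'reason' field."""
    counts = Counter()
    for failure in failures:
        match = S3_ERROR.search(failure.get("reason", ""))
        # Reasons without a recognizable S3 error are grouped as "unknown".
        code = match.group("code") if match else "unknown"
        counts[code] += 1
    return counts


# Abbreviated sample entry modeled on the log output above.
failures = [
    {
        "index": "...-2017.09.11",
        "shard_id": 4,
        "reason": "AmazonS3Exception[... (Service: Amazon S3; Status Code: 400; "
                  "Error Code: RequestTimeout; Request ID: BB3062E801AD4513)]; ",
    },
]

print(summarize_failures(failures))
```

If every entry lands under `RequestTimeout`, that would support the observation that all failed shards share the same timeout error rather than a mix of causes.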