These failures started occurring on 15th October. Since then, many 6.8 BWC tests against 5.6.x versions have failed. For example, look at the failure pattern in:
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.0,nodes=centos-7&&immutable
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.1,nodes=centos-7&&immutable
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.2,nodes=centos-7&&immutable
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.3,nodes=centos-7&&immutable
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.4,nodes=centos-7&&immutable
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.12,nodes=centos-7&&immutable
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.15,nodes=centos-7&&immutable
Not all versions are affected. For example 5.6.13 has been fine:
The errors vary. For example:
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.10,nodes=centos-7&&immutable/210/console is
  java.lang.AssertionError: IndexVersionValue{version=1, seqNo=-2, term=1, location=null}
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.6,nodes=centos-7&&immutable/210/console is
  java.lang.AssertionError: IndexVersionValue{version=1, seqNo=-2, term=1, location=null}
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.2,nodes=centos-7&&immutable/210/console is a suite timeout
- https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.4,nodes=centos-7&&immutable/210/console is a suite timeout
Infra has also noticed that some of the workers are running out of disk space due to huge console logs. For example, https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.8+default-distro+bwc/BWC_VERSION=5.6.6,nodes=centos-7&&immutable/210/consoleText is 352 MB.
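For anyone triaging these builds, here is a rough sketch of how a downloaded console log could be summarised: total size, plus a count of lines matching each of the two failure types listed above. The function name `summarize_console_log` is hypothetical, and the timeout pattern is an assumption, since the exact wording of the suite timeout message in these logs has not been checked.

```python
import re
from typing import Iterable

# Signature copied from the assertion failures listed above.
ASSERTION_RE = re.compile(r"java\.lang\.AssertionError: IndexVersionValue\{")
# Assumption: the timeout message contains the words "suite timeout";
# adjust the pattern to whatever the real log line looks like.
TIMEOUT_RE = re.compile(r"suite timeout", re.IGNORECASE)


def summarize_console_log(lines: Iterable[str]) -> dict:
    """Return the total size in bytes and a count of each known failure signature."""
    summary = {"bytes": 0, "assertion_errors": 0, "suite_timeouts": 0}
    for line in lines:
        # Count UTF-8 encoded bytes so the total approximates the on-disk size.
        summary["bytes"] += len(line.encode("utf-8"))
        if ASSERTION_RE.search(line):
            summary["assertion_errors"] += 1
        if TIMEOUT_RE.search(line):
            summary["suite_timeouts"] += 1
    return summary
```

Because it accepts any iterable of lines, it can be fed an open file handle for a saved consoleText and will stream through it without loading a multi-hundred-megabyte log into memory.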
Was something changed recently that increased the parallelism of 6.8 BWC tests? If so, that probably explains it, as the level of parallelism seems to be beyond what the currently configured workers can cope with.
I'm tagging the distributed team in case the java.lang.AssertionError: IndexVersionValue{version=1, seqNo=-2, term=1, location=null} errors are a worry, and also the core build team in case this is all down to trying to run too much in parallel.