Entries repeatedly logged with sub-second frequency filling up disk space #19164

@krystalcode

Description

Elasticsearch version:

5.0.0-alpha3

JVM version:

The docker image inherits from the official java:8-jre image.

OS version:

Official docker image (https://hub.docker.com/_/elasticsearch/, Debian Jessie) running on a Fedora 23 host.

Description of the problem including expected versus actual behavior:

Certain log entries are repeatedly added to the log file with sub-second frequency, eventually filling up the disk, which can trigger shard migration and possibly node failure. In a matter of hours the log file can grow to several gigabytes. Ironically, one of the log entries exhibiting this behaviour is the warning about low disk space. The log records I have seen added at this frequency are given below.

Expected behaviour is for such entries to be logged less frequently. I certainly don't need to be notified twice per second that my disk usage is over 90%; once every 15 or 30 minutes would suffice.

Would it be an option to allow users to configure how often certain entries are logged? That would require the program to know when an entry of the same type was last logged. I'm not sure how best to accomplish this; one idea would be to keep in memory the timestamp of the last entry of each type and consult it when a new entry of the same type is about to be logged (a rough sketch of this idea follows).
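
For illustration only, here is a minimal sketch of such a throttle. The class and method names are hypothetical and this is not Elasticsearch's actual logging code; it only shows the keep-the-last-timestamp idea described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: suppress a log message if an entry with the same key
// was emitted less than minIntervalMillis ago. Not Elasticsearch code.
public class ThrottledLogger {

    private final Map<String, Long> lastLogged = new ConcurrentHashMap<>();
    private final long minIntervalMillis;

    public ThrottledLogger(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns true if the message was written, false if it was suppressed.
    public boolean warn(String key, String message) {
        long now = System.currentTimeMillis();
        Long last = lastLogged.get(key);
        if (last != null && now - last < minIntervalMillis) {
            return false; // an entry of this type was logged too recently
        }
        lastLogged.put(key, now);
        System.err.println("[WARN] " + message);
        return true;
    }
}
```

With something like `new ThrottledLogger(30 * 60 * 1000)`, the high-watermark warning would appear at most once every 30 minutes per message key instead of several times per second.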

Steps to reproduce:

  1. Run Elasticsearch on a node with more than 90% disk usage.
  2. Observe the log file; the related entry is added 2 times per second.

I do not know how to reproduce the second log entry referenced below; I will file a separate issue if it turns out to be a bug.

Provide logs (if relevant):

The following log entries are written up to 7 times per second.

{"log":"[2016-06-27 16:49:55,313][WARN ][cluster.routing.allocation.decider] [Sabretooth] high disk watermark [90%] exceeded on [ExDIO2orQJm6a5XHEvxxgg][Sabretooth][/usr/share/elasticsearch/data/elasticsearch/nodes/0] free: 4.6gb[7.7%], shards will be relocated away from this node\n","stream":"stdout","time":"2016-06-27T16:49:55.314948647Z"}
{"log":"[2016-06-27 16:49:55,314][INFO ][cluster.routing.allocation.decider] [Sabretooth] rerouting shards: [high disk watermark exceeded on one or more nodes]\n","stream":"stdout","time":"2016-06-27T16:49:55.315047615Z"}

The following log entry is written 1 or 2 times per second.

{"log":"[2016-06-29 14:55:30,597][WARN ][cluster.action.shard     ] [Thor] [.monitoring-data-2][0] received shard failed for target shard [[.monitoring-data-2][0], node[zUY_PHA0SPet5YLXwPZKSA], [P], s[INITIALIZING], a[id=0xG_VO4iQYC-9EArmgdU1w], unassigned_info[[reason=ALLOCATION_FAILED], at[2016-06-29T14:55:30.398Z], failed_attempts[14], details[failed recovery, failure RecoveryFailedException[[.monitoring-data-2][0]: Recovery failed from null into {Thor}{zUY_PHA0SPet5YLXwPZKSA}{172.17.0.3}{172.17.0.3:9300}]; nested: IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: EOFException; ]]], source shard [[.monitoring-data-2][0], node[zUY_PHA0SPet5YLXwPZKSA], [P], s[INITIALIZING], a[id=0xG_VO4iQYC-9EArmgdU1w], unassigned_info[[reason=ALLOCATION_FAILED], at[2016-06-29T14:55:30.398Z], failed_attempts[14], details[failed recovery, failure RecoveryFailedException[[.monitoring-data-2][0]: Recovery failed from null into {Thor}{zUY_PHA0SPet5YLXwPZKSA}{172.17.0.3}{172.17.0.3:9300}]; nested: IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: EOFException; ]]], message [failed recovery], failure [RecoveryFailedException[[.monitoring-data-2][0]: Recovery failed from null into {Thor}{zUY_PHA0SPet5YLXwPZKSA}{172.17.0.3}{172.17.0.3:9300}]; nested: IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: EOFException; ]\n","stream":"stdout","time":"2016-06-29T14:55:30.597899825Z"}
