Contributor

@solsson solsson commented Sep 29, 2018

Fixes #206.

Based on https://github.com/apache/kafka/blob/trunk/bin/kafka-server-stop.sh and https://github.com/apache/kafka/blob/trunk/bin/zookeeper-server-stop.sh, but those scripts don't wait for shutdown to complete, so this PR adds a wait loop.

Got the wait loop from https://stackoverflow.com/questions/17894720/kill-a-process-and-wait-for-the-process-to-exit
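
For readers who haven't opened the linked answer, here is a minimal sketch of the kill-and-wait pattern, not the exact script in this PR. The function name `stop_and_wait`, the 60-second default timeout, and the one-second poll interval are all assumptions made for illustration. Finding the PID is left out; the caller passes it in.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): send SIGTERM to a PID,
# then block until the process has exited or a timeout is reached.
# Usage: stop_and_wait PID [TIMEOUT_SECONDS]
stop_and_wait() {
  pid=$1
  timeout=${2:-60}   # assumed default, tune as needed

  # If the signal cannot be delivered, treat the process as already gone.
  kill -s TERM "$pid" 2>/dev/null || return 0

  waited=0
  # kill -0 delivers no signal; it only checks that the PID still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid still running after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

The timeout is there so that a hung process cannot block the caller forever. A non-zero return lets the caller decide whether to escalate.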

Currently I've only tested the PR locally with little load. @stigok Can you confirm that you no longer get corrupted indices?

@solsson
Contributor Author

solsson commented Sep 29, 2018

The last log entry I see is `INFO [KafkaServer id=0] shut down completed (kafka.server.KafkaServer)`

@solsson
Contributor Author

solsson commented Sep 29, 2018

Maybe Zookeeper doesn't need controlled shutdown. Invoking the script has no visible effect in the logs.

@stigok

stigok commented Sep 30, 2018

I'm unable to reproduce the bad indices. I don't know how I ended up with them in the first place. We've been having a lot of pod restarts and failed probes while running in AKS, so it could've been caused by a lot of different factors.

@stigok

stigok commented Sep 30, 2018

But this PR is certainly a step in the right direction 👍

@solsson solsson merged commit 198666d into master Nov 18, 2018
@stigok

stigok commented Nov 18, 2018

I had bad indexes again after my disks filled up. Maybe that is a "good way" to simulate broken indexes.

  • Configure log.retention.bytes to a value greater than available disk-space
  • Produce enough messages to fill the disk
  • Watch Kafka die
  • Expand disk and expect to see bad indexes
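
As a rough sketch of the first step, the snippet below prints a `log.retention.bytes` line whose value is larger than the free space on a given volume. The function name `retention_line`, the factor of two, and the `/var/lib/kafka/data` path in the usage note are assumptions for illustration. Only the property name comes from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print a log.retention.bytes setting that exceeds
# the space currently available on the volume holding DIR.
# Usage: retention_line DIR
retention_line() {
  dir=$1
  # df -Pk prints POSIX-format output in 1024-byte blocks;
  # column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  # Double the free space so size-based retention cannot kick in
  # before the disk is full (the factor 2 is arbitrary).
  echo "log.retention.bytes=$((avail_kb * 1024 * 2))"
}
```

For example, `retention_line /var/lib/kafka/data` prints a line that could be pasted into the broker configuration before producing messages.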
