
Conversation

solsson commented Jun 25, 2017

The README.md used to say:

If you lose your zookeeper cluster, kafka will be unaware that persisted topics exist. The data is still there, but you need to re-create topics.

But that's a risky assumption if you have partitioning, which we aim to improve with #30.

We can do as suggested in #26

solsson commented Jun 25, 2017

I've tested this with GKE.

How well does a multi-zone cluster work with volumes? I get the feeling we lose the ability to move pods if one zone becomes entirely unavailable, because each volume is created in a single zone and its pod gets affinity to that zone.
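For context, a minimal sketch of where that affinity comes from, assuming GKE's gce-pd provisioner (the class name and type below are illustrative, not from this PR): each dynamically provisioned persistent disk is zonal, so the resulting PV can only be mounted by pods scheduled into that zone.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: broker-disk-example     # illustrative name, not from this repo
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard             # zonal disk; the PV inherits the disk's zone
# A pod bound to a claim from this class gets scheduled to the disk's zone
# on every restart, which is the affinity described above.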

solsson commented Jun 25, 2017

An issue with #32 seems to be that pod deletion is slow. Maybe signals don't reach the Java process.

solsson and others added 2 commits June 26, 2017 13:22

  • Suggest a mix of persistent and ephemeral data to improve reliability across zones, and with the mix of PV and emptyDir there's no reason to make PVs faster than host disks.
  • Use 10GB as it is the minimum for standard disks on GKE.
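A rough sketch of the mix these commits describe, combining a persistent claim and an emptyDir in one StatefulSet. The API version, names, mount paths and command are assumptions for illustration, and the actual commits may split persistent and ephemeral storage differently (for instance across separate StatefulSets).

kind: StatefulSet
apiVersion: apps/v1beta1          # the apps API version current at the time
metadata:
  name: zoo
spec:
  serviceName: zoo
  replicas: 3
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
      - name: zookeeper
        image: solsson/kafka:0.11.0.0-rc2
        command:                  # start script; the image's actual entrypoint may differ
        - ./bin/zookeeper-server-start.sh
        - config/zookeeper.properties
        volumeMounts:
        - name: data              # persistent: must survive pod rescheduling
          mountPath: /var/lib/zookeeper/data
        - name: log               # ephemeral: can be rebuilt, so emptyDir is enough
          mountPath: /var/lib/zookeeper/log
      volumes:
      - name: log
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi           # 10GB is the minimum for standard disks on GKE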
solsson commented Jun 26, 2017

How well does a multi-zone cluster work with volumes? I get the feeling we lose the ability to move pods if one zone becomes entirely unavailable, because each volume is created in a single zone and its pod gets affinity to that zone.

#34 was merged with a potential solution for that. Interesting tests remain: go ahead and kill nodes, etc.

Known remaining issues with this branch, none of them blockers:

  • Pod deletion is slow (maybe that's good for controlled scale-down, but it might indicate an image or command issue).
  • The Prometheus exporter shows no meaningful data.

solsson commented Jun 26, 2017

Tested kubectl logs -f -c zookeeper zoo-1 together with kubectl scale --replicas=1 statefulset zoo, and the logs show no sign that the pod is aware it is about to be terminated.

I also tested locally with docker kill zookeeper-test after

docker run --rm -d --name zookeeper-test --entrypoint ./bin/zookeeper-server-start.sh solsson/kafka:0.11.0.0-rc2@sha256:c1316e0131f4ec83bc645ca2141e4fda94e0d28f4fb5f836e15e37a5e054bdf1 config/zookeeper.properties
docker logs -f zookeeper-test

and no trace of termination handling there either, though shutdown is really fast. Will test on Kafka instead, after the merge, as it has documented graceful shutdown behavior. I will also postpone the metrics troubleshooting, as that too benefits from comparison with Kafka. It might simply be a matter of jmx_exporter config.
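If it does turn out to be the jmx_exporter config, a catch-all rule set like the one below (purely an illustrative debugging config, not what the repo ships) is a quick way to check whether the exporter can see any MBeans at all before writing stricter patterns:

lowercaseOutputName: true
lowercaseOutputLabelNames: true
rules:
- pattern: ".*"     # export every MBean attribute; narrow this down once data shows up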

solsson commented Jun 27, 2017

Switched Kafka to dynamic provisioning in 10543bf
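In practice, dynamic provisioning means the broker StatefulSet only declares a claim template and the cluster's provisioner creates the disk. Roughly like the fragment below, where the size and class name are illustrative and the real values are in 10543bf:

volumeClaimTemplates:
- metadata:
    name: data
  spec:
    storageClassName: standard    # GKE's default class, backed by pd-standard disks
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 10Gi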

solsson commented Jun 27, 2017

Good resource on termination: https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html
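In pod-spec terms the article boils down to something like this fragment (values are illustrative): the kubelet runs any preStop hook, sends SIGTERM to the container's PID 1, and only sends SIGKILL once terminationGracePeriodSeconds has passed.

spec:
  terminationGracePeriodSeconds: 30   # time budget for a clean shutdown (30s is the default)
  containers:
  - name: broker
    image: solsson/kafka:0.11.0.0-rc2
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "5"]     # placeholder; a real hook would drain or announce shutdown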

solsson commented Jun 27, 2017

After 411192d I get properly logged shutdown behavior in Kafka, taking around 15s with negligible load. So the Alpine shell not forwarding signals might have been the issue.
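A minimal sketch of that kind of fix, assuming the stock Kafka start scripts (which exec into the JVM when run in the foreground). This is not necessarily the exact change in 411192d, but it shows the idea of removing the shell wrapper so SIGTERM reaches Java directly:

containers:
- name: broker
  image: solsson/kafka:0.11.0.0-rc2
  command:                        # no "sh -c" wrapper in front of the script
  - ./bin/kafka-server-start.sh
  - config/server.properties
# kafka-server-start.sh execs into the JVM, so the Java process ends up as PID 1
# and receives SIGTERM itself instead of relying on a shell to forward it.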
