Closed
Description
Steps to reproduce
All omdb ops performed on sled g0
- Launch a4x2
- Set clickhouse-policy to both via omdb
- regenerate a blueprint and make it the target
- hyperstop g2 (node with one keeper)
- expunge g2 via omdb
- regenerate a couple blueprints and set as targets
- Ensure the zones get expunged in the blueprints
Evidence
Sled g2 can definitely no longer be reached: I see a log entry about failing to contact it in the nexus node on g3. However, the keeper on g2 is still listed as a member, both in keeper-config.xml and in the output of the clickhouse keeper-client command.
root@oxz_clickhouse_keeper_1d4c8dac:~# clickhouse keeper-client --host [fd00:1122:3344:104::21]
Connected to ZooKeeper at [fd00:1122:3344:104::21]:9181 with session_id 8
Keeper feature flag FILTERED_LIST: enabled
Keeper feature flag MULTI_READ: enabled
Keeper feature flag CHECK_NOT_EXISTS: disabled
/ :) get /keeper/config
server.1=fd00:1122:3344:101::21:9234;participant;1
server.2=fd00:1122:3344:104::21:9234;participant;1
server.3=fd00:1122:3344:103::21:9234;participant;1
server.4=fd00:1122:3344:103::22:9234;participant;1
server.5=fd00:1122:3344:102::21:9234;participant;1
The keeper on sled g2 is server.5
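A quick way to confirm that an expunged keeper is still a raft member is to parse the `get /keeper/config` output and look for its server ID. The sketch below is only an illustration: `KeeperMember` and `parse_keeper_config` are hypothetical helper names, not part of omdb or clickhouse-admin, and the field layout (`server.N=host:port;role;priority`) is assumed from the output shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeeperMember:
    server_id: int
    host: str
    port: int
    role: str
    priority: int


def parse_keeper_config(text: str) -> dict[int, KeeperMember]:
    """Parse lines of the form server.N=host:port;role;priority."""
    members = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("server."):
            continue
        key, value = line.split("=", 1)
        server_id = int(key[len("server."):])
        endpoint, role, priority = value.split(";")
        # The host is an IPv6 address, so only the last colon separates the port.
        host, port = endpoint.rsplit(":", 1)
        members[server_id] = KeeperMember(
            server_id, host, int(port), role, int(priority)
        )
    return members


# Output of `get /keeper/config` copied from the session above.
RAFT_CONFIG = """\
server.1=fd00:1122:3344:101::21:9234;participant;1
server.2=fd00:1122:3344:104::21:9234;participant;1
server.3=fd00:1122:3344:103::21:9234;participant;1
server.4=fd00:1122:3344:103::22:9234;participant;1
server.5=fd00:1122:3344:102::21:9234;participant;1
"""

members = parse_keeper_config(RAFT_CONFIG)
expunged_ids = {5}  # server.5 is the keeper on the expunged sled g2
stale = [sid for sid in sorted(members) if sid in expunged_ids]
print("expunged keepers still in raft config:", stale)
```

With the configuration above, `stale` contains server ID 5, which matches what the keeper-client session shows.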
I then checked that the leader has been committing keeper log entries and that the committed index is increasing.
/ :) lgif
first_log_idx 1
first_log_term 1
last_log_idx 1515
last_log_term 1
last_committed_log_idx 1515
leader_committed_log_idx 1515
target_committed_log_idx 1515
last_snapshot_idx 0
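The `lgif` snapshot above can also be checked programmatically. The following is a minimal sketch, assuming the output is whitespace-separated `name value` pairs as shown; `parse_lgif` is a hypothetical helper, not an existing tool.

```python
def parse_lgif(text: str) -> dict[str, int]:
    """Parse `lgif` output into a mapping of field name to integer value."""
    info = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            info[parts[0]] = int(parts[1])
    return info


# Output of `lgif` copied from the session above.
LGIF = """\
first_log_idx 1
first_log_term 1
last_log_idx 1515
last_log_term 1
last_committed_log_idx 1515
leader_committed_log_idx 1515
target_committed_log_idx 1515
last_snapshot_idx 0
"""

info = parse_lgif(LGIF)
# This node is caught up if it has committed everything the leader has.
caught_up = info["last_committed_log_idx"] == info["leader_committed_log_idx"]
# Entries appended to the log but not yet committed.
uncommitted = info["last_log_idx"] - info["last_committed_log_idx"]
print(f"caught_up={caught_up} uncommitted={uncommitted}")
```

A single snapshot only shows that the node is caught up with the leader. Confirming that the index is increasing requires running `lgif` at least twice and comparing `leader_committed_log_idx` between the samples.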
I then checked crdb to see what the configuration was:
root@[fd00:1122:3344:101::3]:32221/omicron> select * from bp_clickhouse_cluster_config ;
blueprint_id | generation | max_used_server_id | max_used_keeper_id | cluster_name | cluster_secret | highest_seen_keeper_leader_committed_log_index
---------------------------------------+------------+--------------------+--------------------+------------------+--------------------------------------+-------------------------------------------------
16dfac44-0091-453a-b5e0-2e1b8cad2329 | 2 | 3 | 5 | oximeter_cluster | 5b815633-062c-438d-8acc-1858bb059e9e | 0
69cdc490-9a9d-46e9-b0c0-c8661b0b4794 | 2 | 3 | 5 | oximeter_cluster | 5b815633-062c-438d-8acc-1858bb059e9e | 0
79d919a7-13cd-4b47-9e9c-d15515c8532f | 2 | 3 | 5 | oximeter_cluster | 5b815633-062c-438d-8acc-1858bb059e9e | 0
bc23843c-1b2a-49d0-9b7b-224f1ed2e892 | 2 | 3 | 5 | oximeter_cluster | 5b815633-062c-438d-8acc-1858bb059e9e | 0
Interestingly, the highest_seen_keeper_leader_committed_log_index is 0 for all blueprints.
There are also no related rows in inventory:
root@[fd00:1122:3344:101::3]:32221/omicron> select * from inv_clickhouse_keeper_membership;
inv_collection_id | queried_keeper_id | leader_committed_log_index | raft_config
--------------------+-------------------+----------------------------+--------------
(0 rows)
Time: 4ms total (execution 4ms / network 1ms)
root@[fd00:1122:3344:101::3]:32221/omicron>
It appears that retrieving this inventory data from clickhouse-admin-keeper is not working, which results in the failure to modify the keepers.