Commit d3bfc60

Stop replica with SIGKILL if SIGTERM failed
Found that if the previous test leaves a process of the created cluster replica running, then the next run of the same test fails when recreating it [1]:

  [047] replication/election_qsync_stress.test.lua vinyl
  [047]
  [047] [Instance "election_replica2" returns with non-zero exit code: 1]
  [047]
  [047] Last 15 lines of Tarantool Log file [Instance ...
  [047] 2020-11-05 13:19:25.941 [29831] main/114/applier/unix/:/private/tmp/tnt/047_replication/election_replica3.sock box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [047] 2020-11-05 13:19:25.941 [29831] main/103/election_replica2 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [047] 2020-11-05 13:19:25.941 [29831] main/103/election_replica2 F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [047] 2020-11-05 13:19:25.941 [29831] main/103/election_replica2 F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [047] [ fail ]
  [047] Test "replication/election_qsync_stress.test.lua", conf: "vinyl"
  [047] from "fragile" list failed with results file checksum: "133676d72249c570f7124440150a8790", rerunning with server restart ...
  [047] replication/election_qsync_stress.test.lua vinyl [ fail ]
  [047] Test "replication/election_qsync_stress.test.lua", conf: "vinyl"
  [047] from "fragile" list failed with results file checksum: "133676d72249c570f7124440150a8790", rerunning with server restart ...
  [047] replication/election_qsync_stress.test.lua vinyl [ fail ]
  ...

To fix the issue, the replica should be killed with SIGKILL if SIGTERM did not kill its process.

[1] - https://gitlab.com/tarantool/tarantool/-/jobs/831786472#L5060
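The fallback described in the commit message can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the test-run code: `stop_with_escalation` and its `timeout` parameter are hypothetical names, it uses `subprocess.Popen.wait(timeout=...)` (Python 3.3+) rather than the probe with signal 0 used in the patch, and it assumes a POSIX system where SIGKILL is available.

```python
import signal
import subprocess
import sys


def stop_with_escalation(proc, timeout=5.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL.

    `proc` is a subprocess.Popen object. Returns the last signal sent.
    (Hypothetical helper, not part of test-run.)
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return signal.SIGTERM
    except subprocess.TimeoutExpired:
        # SIGTERM was ignored or handling it took too long:
        # SIGKILL cannot be caught, blocked, or ignored.
        proc.send_signal(signal.SIGKILL)
        proc.wait()
        return signal.SIGKILL


if __name__ == "__main__":
    # A child that ignores SIGTERM, standing in for a stuck replica.
    code = (
        "import signal, time;"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
        "print('ready', flush=True);"
        "time.sleep(60)"
    )
    child = subprocess.Popen([sys.executable, "-c", code],
                             stdout=subprocess.PIPE)
    child.stdout.readline()  # wait until the handler is installed
    print(stop_with_escalation(child, timeout=1.0).name)
```

Waiting with a timeout between the two signals gives a well-behaved process the chance to shut down cleanly; the right timeout value depends on the workload and is a guess here.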
1 parent 5224312 commit d3bfc60

1 file changed, +13 -0 lines changed

lib/tarantool_server.py

Lines changed: 13 additions & 0 deletions
@@ -816,6 +816,8 @@ def start(self, silent=True, wait=True, wait_load=True, rais=True, args=[],
         if self.rpl_master:
             os.putenv("MASTER", self.rpl_master.iproto.uri)
         self.logfile_pos = self.logfile
+        self.signal_default = signal.SIGTERM
+        self.signal_kill = signal.SIGKILL
 
         # redirect stdout from tarantoolctl and tarantool
         os.putenv("TEST_WORKDIR", self.vardir)
@@ -980,6 +982,17 @@ def stop(self, silent=True, signal=signal.SIGTERM):
         if self.crash_detector is not None:
             save_join(self.crash_detector)
         self.wait_stop()
+        # check if the process died, otherwise if SIGTERM failed use SIGKILL
+        try:
+            self.process.send_signal(0)
+        except OSError:
+            pass
+        else:
+            if signal == self.signal_default:
+                signal = self.signal_kill
+                color_log('Sending signal {0} ({1}) to process {2}\n'.format(
+                          signal, signame(signal), self.process.pid))
+                self.process.send_signal(signal)
 
         self.status = None
         if re.search(r'^/', str(self._admin.port)):
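The liveness check in the patch relies on signal 0, which delivers nothing and only reports whether the target process can be signalled. A minimal standalone sketch of that probe is given here, assuming a POSIX system. It calls `os.kill` on a raw PID instead of `Popen.send_signal` as the patch does, and `is_alive` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def is_alive(pid):
    """Probe `pid` with signal 0 (hypothetical helper, POSIX only).

    No signal is delivered; the kernel only performs the existence
    and permission checks.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no such process.
        return False
    except PermissionError:
        # EPERM: the process exists but belongs to another user.
        return True
    return True


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    print(is_alive(child.pid))   # child is still sleeping
    child.kill()
    child.wait()                 # reap the child so its PID is released
    print(is_alive(child.pid))
```

One caveat: a process that has exited but has not yet been waited for (a zombie) still passes this probe, so the check is meaningful only in combination with reaping the child.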
