Unfortunately, the fix for #732 doesn't correctly handle master node failover for multi-threaded code.
Traceback (most recent call last):
  # snip
  File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 1870, in blpop
    return self.execute_command('BLPOP', *keys)
  File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 884, in execute_command
    conn.send_command(*args)
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 721, in send_command
    check_health=kwargs.get('check_health', True))
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 692, in send_packed_command
    self.connect()
  File "/usr/local/lib/python3.5/site-packages/redis/sentinel.py", line 44, in connect
    self.connect_to(self.connection_pool.get_master_address())
  File "/usr/local/lib/python3.5/site-packages/redis/sentinel.py", line 107, in get_master_address
    self.disconnect()
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 1241, in disconnect
    connection.disconnect()
  File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 669, in disconnect
    self._sock.close()
AttributeError: 'NoneType' object has no attribute 'close'

Scenario is this:
- multi-threaded process running, using a SentinelConnectionPool to get their separate connections to the Redis master
- Redis master goes down, Sentinel does its job and elects a new master
- Subsequent redis-py call ends up calling SentinelConnectionPool.get_master_address()
- get_master_address() notices that there is a new master and calls self.disconnect() to flush the pool
- ConnectionPool.disconnect() calls disconnect() on each of the connections in the pool.
- Connection.disconnect() sees a matching PID and calls shutdown() on the socket, then sets it back to None
- "Fun" ensues...
There is thread-locking to protect the pool's management of its members, but ConnectionPool.disconnect() is ripping sockets out from other threads in the middle of other operations.
The actual stack trace and error you'll get will vary depending on timing.
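To make the failure mode concrete, here is a minimal sketch that forces one possible interleaving without any Redis involved. ToyConnection and FakeSocket are hypothetical stand-ins, not redis-py classes, and the pause hook exists only to make the timing deterministic. It assumes a disconnect() that checks the socket for None, closes it, and then clears it, which matches the shape of the traceback above but is not copied from redis-py.

```python
import threading


class FakeSocket:
    """Hypothetical stand-in for a real socket."""

    def close(self):
        pass


class ToyConnection:
    """Hypothetical connection: check for None, close, then clear."""

    def __init__(self):
        self._sock = FakeSocket()

    def disconnect(self, pause=None):
        if self._sock is None:
            return
        if pause is not None:
            pause()  # simulate the thread being preempted right here
        self._sock.close()
        self._sock = None


def demonstrate_race():
    conn = ToyConnection()
    checked = threading.Event()
    resume = threading.Event()
    errors = []

    def pause():
        checked.set()           # tell the main thread the None check has passed
        resume.wait(timeout=5)  # wait while the "pool flush" runs

    def owner_thread():
        try:
            conn.disconnect(pause=pause)
        except AttributeError as exc:
            errors.append(str(exc))

    worker = threading.Thread(target=owner_thread)
    worker.start()
    checked.wait(timeout=5)
    conn.disconnect()  # pool-wide flush from a different thread
    resume.set()
    worker.join(timeout=5)
    return errors


if __name__ == "__main__":
    print(demonstrate_race())
```

The worker passes the None check, the flush clears the socket underneath it, and the worker then fails with the same AttributeError as in the traceback. Other interleavings would surface as different errors, which is consistent with the trace varying by timing.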
To fix it in my product code, I'm re-integrating the deferred disconnect using a generation id from PR #784.
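As a rough illustration of the idea, and not the code from PR #784, a generation-based deferred disconnect might look like the sketch below. GenerationPool, ToyConnection, and mark_stale are hypothetical names. The assumption is that a failover only bumps a counter, and a stale connection is closed either while it sits idle in the pool or by the thread that owns it when it is released.

```python
import threading


class ToyConnection:
    """Hypothetical connection that remembers the pool generation it was made in."""

    def __init__(self, generation):
        self.generation = generation
        self.connected = True

    def disconnect(self):
        self.connected = False


class GenerationPool:
    """Hypothetical pool: failover bumps a counter instead of closing sockets."""

    def __init__(self):
        self._lock = threading.Lock()
        self.generation = 0
        self._idle = []

    def mark_stale(self):
        # Called when a new master is detected. In-use connections are untouched.
        with self._lock:
            self.generation += 1

    def get_connection(self):
        with self._lock:
            while self._idle:
                conn = self._idle.pop()
                if conn.generation == self.generation:
                    return conn
                # Stale and idle: no other thread holds it, so closing is safe.
                conn.disconnect()
            return ToyConnection(self.generation)

    def release(self, conn):
        with self._lock:
            if conn.generation != self.generation:
                # Deferred disconnect, performed by the thread that owns the connection.
                conn.disconnect()
                return
            self._idle.append(conn)


if __name__ == "__main__":
    pool = GenerationPool()
    conn = pool.get_connection()
    pool.mark_stale()      # failover happens while conn is in use
    print(conn.connected)  # still connected; nobody pulled the socket away
    pool.release(conn)
    print(conn.connected)  # closed on release by its owner
```

The trade-off in this sketch is that a thread already mid-command keeps talking to the old master until it releases the connection or hits an error, so callers still need their usual retry handling.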
Version: redis-py v3.4.1, redis/sentinel v3.2.13
Platform: Python 3.5.2 on Linux