-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Description
Version: redis-py 4.5.4, Redis version irrelevant
Platform: macOS 13.2.1
Description: There's an inherent memory leak in redis.asyncio.connection.HiredisParser. This is actually hiredis-py's fault, but I suggest that we implement stuff a bit different to workaround this issue until redis/hiredis-py#163 is merged and a new version of hiredis-py is released.
This is a result of #2557.
Inside HiredisParser.on_connect, an hiredis.Reader object is constructed with replyError set to self.parse_error:
redis-py/redis/asyncio/connection.py
Lines 368 to 376 in e1017fd
| kwargs: _HiredisReaderArgs = { | |
| "protocolError": InvalidResponse, | |
| "replyError": self.parse_error, | |
| } | |
| if connection.encoder.decode_responses: | |
| kwargs["encoding"] = connection.encoder.encoding | |
| kwargs["errors"] = connection.encoder.encoding_errors | |
| self._reader = hiredis.Reader(**kwargs) |
At this point, we have the HiredisParser object pointing to the hiredis.Reader object (through _reader).
hiredis.Reader.__init__ reads replyError and sets it within self.replyErrorClass:
Now we have a cyclic reference:
HiredisParser._reader references to hiredis.Reader;
hiredis.Reader.replyErrorClass references to HiredisParser.parse_error (which is a bound method, so its references to self).
However, hiredis.Reader doesn't support garbage collection!
It doesn't declare Py_TPFLAGS_HAVE_GC.
You can also see that it doesn't implement tp_traverse:
https://github.com/redis/hiredis-py/blob/8adb1b3cb38b82cdc73fa2d72879712da1f74e70/src/reader.c#L49
So these objects are never garbage collected, leading to a memory leak.
The non-asyncio variant doesn't suffer from this problem, as its HiredisParser.on_disconnect deletes the reference to hiredis.Reader, breaking the cycle. #2557 removed this from the async variant, which has introduced the problem.
The correct solution is to support GC in hiredis-py, as it can hold references to non-primitive objects. Until a new version is released, I suggest to implement a temporary workaround.
The following is a simple reproduction script:
import asyncio
import gc
import hiredis # Make sure it's installed
import redis.asyncio
import redis.asyncio.connection
async def a():
r = redis.asyncio.Redis.from_url("redis://localhost:6379/0")
await r.get("a")
await r.close()
asyncio.run(a())
gc.collect()
for obj in gc.get_objects():
if isinstance(obj, redis.asyncio.connection.BaseParser):
print(obj)