Wait replica respond before load it in safe_load() #229
lib/preprocessor.py
try:
    result = yaml.safe_load(result)
except AttributeError:
    result = []
TarantoolAsyncConnection returns None at socket.error. There are several comments here:
- How can I reproduce the situation?
- Please, don't use try-except where 'foo' in my_dict or x is None may be used. Exception handling is not a cheap thing.
- As in PR Fix server_stop() routine on broken connections #228 and PR Fix lua_eval() routine on broken connections #230, we should not hide the error and report an empty result to the caller.
- There is nothing called kill_all_servers(), please change or clarify (it is about the commit message).
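The second and third points above can be sketched roughly as follows. This is only an illustration, not the actual test-run code: `EvalError` and `parse_eval_result` are hypothetical names, and the `parse` argument stands in for yaml.safe_load so that the sketch runs without PyYAML.

```python
class EvalError(Exception):
    """Hypothetical error type: the instance gave no response to an eval."""


def parse_eval_result(raw, parse, instance_name, expr):
    """Parse a raw eval response without hiding a missing one.

    raw           -- what the connection returned (None on a broken connection)
    parse         -- stand-in for yaml.safe_load (an assumption of this sketch)
    instance_name -- used only for the error message
    expr          -- the evaluated expression, used only for the error message
    """
    # Explicit check instead of catching AttributeError raised by the parser.
    if raw is None:
        # Report the problem to the caller instead of returning an empty result.
        raise EvalError(
            "no response from instance %r for expression %r"
            % (instance_name, expr)
        )
    return parse(raw)
```

With this shape the caller sees a descriptive error naming the instance and the expression, rather than an empty list that looks like a valid answer.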
Force-pushed from fade4c4 to bc80d8a.
It is unclear how to reproduce the situation. There are also unresolved comments regarding the implementation, so I'll close the PR. If the problem persists, feel free to reopen the PR with a description of how to reproduce the issue (a mangled tarantool implementation or a mangled test is okay) and with an implementation updated according to the comments above. Or file an issue with a reproducer.
Please check the new way to fix the issue. The issue is flaky and cannot be reproduced easily, so currently the only available option is checking the logs of past runs. I hope to find a way to reproduce it and am working on it.
Force-pushed from bc80d8a to 3a8774f.
Force-pushed the old state (due to isaacs/github#361), reopened, force-pushed the new state back.
I looked at the changes. You are again proposing a 'fix' without any reproducer, not even a synthetic one. So I have no information on which to judge whether the change is okay or not.
Force-pushed from 3a8774f to df32290.
Force-pushed from 51182fe to f5686da.
The lua_eval() routine is used to evaluate a Lua command on a given instance. Before running the command, it reconnects to the needed instance and runs the command there. After the command has run, safe_load() is called to parse the result. It was found that safe_load() may sometimes fail with an internal AttributeError exception. This happens because the result of running the command on the instance is not checked for being empty. To fix it, a loop was added that waits for the replica to respond before its response is loaded by safe_load().

Check the following output from gitlab-ci job [1]:

    DEBUG: sending command: test_run:wait_fullmesh(SERVERS)
    DEBUG: test-run received command: config engine
    DEBUG: test-run's response for [config engine]
     | !!python/unicode 'engine': !!python/unicode 'memtx'
     | ...
    DEBUG: test-run received command: eval autobootstrap1 "return box.info.server"
    DEBUG: test-run's response for [eval autobootstrap1 "return box.info.server"]
     | - id: null
     |   lsn: -1
     |   ro: true
     |   uuid: e84d86bd-d2fc-43c1-b832-8146d1f02cd1
     | ...
    DEBUG: test-run received command: eval autobootstrap1 "return box.info.server"
    DEBUG: test-run's response for [eval autobootstrap1 "return box.info.server"]
     | - id: null
     |   lsn: -1
     |   ro: true
     |   uuid: e84d86bd-d2fc-43c1-b832-8146d1f02cd1
     | ...
    DEBUG: test-run received command: eval autobootstrap1 "return box.info.server"
    TarantoolInpector.handle() received the following error:
    Traceback (most recent call last):
      File "test-run/lib/inspector.py", line 100, in handle
        result = self.parser.parse_preprocessor(line)
      File "test-run/lib/preprocessor.py", line 87, in parse_preprocessor
        return self.lua_eval(name, expr[1:-1])
      File "test-run/lib/preprocessor.py", line 404, in lua_eval
        result = yaml.safe_load(result)
      File "/usr/local/lib/python2.7/site-packages/yaml/__init__.py", line 162, in safe_load
        return load(stream, SafeLoader)
      File "/usr/local/lib/python2.7/site-packages/yaml/__init__.py", line 112, in load
        loader = Loader(stream)
      File "/usr/local/lib/python2.7/site-packages/yaml/loader.py", line 34, in __init__
        Reader.__init__(self, stream)
      File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 87, in __init__
        self.determine_encoding()
      File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 126, in determine_encoding
        self.update_raw()
      File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 183, in update_raw
        data = self.stream.read(size)
    AttributeError: 'NoneType' object has no attribute 'read'
    DEBUG: test-run's response for [eval autobootstrap1 "return box.info.server"]
     | error: AttributeError("'NoneType' object has no attribute 'read'",)
     | ...
    Kill all servers ...
    [Instance "autobootstrap1" returns with non-zero exit code: 1]

Here it can be seen that the wait_fullmesh() routine waits for the server in a loop and, after 2 successful checks, fails on the 3rd one.

[1] - https://gitlab.com/tarantool/tarantool/-/jobs/878962286

Closes tarantool/tarantool#5572
Force-pushed from f5686da to b0a2ead.
There have been no responses from the author of the PR for more than three months, and we still need a reproducer, even if it is probabilistic. The point regarding error hiding still stands: we would never know that something went wrong. Aside from this, the new patch makes it unclear whether test-run may loop forever (what if an instance has crashed?). I would not object to some kind of good error reporting that helps us spot a rare problem (collect all helpful information and dump it to the terminal). But attempting to blindly fix such problems is not the way to go: we would just get other problems, like worker hangs. I'll close the PR as stale. Feel free to reopen if you get some kind of reproducer.
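The concern about looping forever, together with the request for good error reporting, could be addressed by a bounded wait along these lines. This is a sketch under assumptions, not the patch from this PR: `NoResponseError`, `wait_for_response`, and the `fetch` callback are hypothetical names, the default timeout and interval are arbitrary, and it relies on Python 3's time.monotonic.

```python
import time


class NoResponseError(Exception):
    """Hypothetical error raised when an instance never responds."""


def wait_for_response(fetch, timeout=10.0, interval=0.1,
                      clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it returns something other than None.

    Gives up once `timeout` seconds have passed and raises an error with
    diagnostics, instead of looping forever or returning an empty result.
    `clock` and `sleep` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            # Surface the failure with enough detail to investigate it.
            raise NoResponseError(
                "no response after %d attempts in %.1f seconds"
                % (attempts, timeout)
            )
        sleep(interval)
```

A crashed instance then produces a visible NoResponseError after the timeout rather than a hung worker, and the message records how many attempts were made.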
After the fix it works like this [2] (file: log/096_replication.log, line: 29111):
[1] - https://gitlab.com/tarantool/tarantool/-/jobs/878962286
[2] - https://gitlab.com/tarantool/tarantool/-/jobs/879929201/artifacts/download
Closes tarantool/tarantool#5572