HDDS-1785. OOM error in Freon due to the concurrency handling #1085
💔 -1 overall
This message was automatically generated.
@elek @iamcaoxudong please review
(force-pushed to pick up recent fixes from |
+1 Thank you @adoroszlai for the fix for this problem.
I really like this approach, as it guarantees predictable names for the created keys/volumes/buckets (using a predefined postfix instead of a random string).
With the validation refactored out of the random key generator, we can run chaos tests easily:
1. Generate 10M keys with Freon.
2. Run the validator (check the existence and content of the generated keys).
3. Kill one datanode and restart it without its persistent data.
4. Wait for the replication.
5. Validate the keys.

And we can repeat steps 3-5...
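To illustrate why predictable names make a separate validation step possible, here is a minimal sketch. It is not the Freon validator: the class `PredictableKeySketch`, its methods, the in-memory `Map` standing in for the object store, and the rule that derives a key's content from its name are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the patch.
public class PredictableKeySketch {

  // Key names are built from a prefix and a counter value, not a random string.
  public static String keyName(String prefix, long index) {
    return prefix + "-" + index;
  }

  // Assumption for this sketch: content can be recomputed from the name.
  public static byte[] expectedContent(String name) {
    return ("data:" + name).getBytes(StandardCharsets.UTF_8);
  }

  // Generator: writes count keys into the stand-in store.
  public static void generate(Map<String, byte[]> store, String prefix, long count) {
    for (long i = 0; i < count; i++) {
      String name = keyName(prefix, i);
      store.put(name, expectedContent(name));
    }
  }

  // Validator: recomputes every name and counts missing or altered keys.
  public static long countInvalid(Map<String, byte[]> store, String prefix, long count) {
    long invalid = 0;
    for (long i = 0; i < count; i++) {
      String name = keyName(prefix, i);
      byte[] actual = store.get(name);
      if (actual == null || !Arrays.equals(actual, expectedContent(name))) {
        invalid++;
      }
    }
    return invalid;
  }

  public static void main(String[] args) {
    Map<String, byte[]> store = new HashMap<>();
    generate(store, "key", 100);
    System.out.println("invalid after generate: " + countInvalid(store, "key", 100));

    store.remove(keyName("key", 42)); // simulate data loss
    System.out.println("invalid after loss: " + countInvalid(store, "key", 100));
  }
}
```

Because the validator only needs the prefix and the key count, it can run at any later time, for example after a datanode has been restarted and replication has finished.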
I tested it in Kubernetes by generating 25M keys; compared to the previous version, I was able to generate the keys... I will merge it soon...
What changes were proposed in this pull request?
Change the concurrency handling in Freon RandomKeyGenerator: workers coordinate the items they create using "global" counters.
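As a rough illustration of workers coordinating through a shared counter, here is a minimal sketch. It is not the code from this patch: `GlobalCounterSketch`, `generate`, and the `created` set that stands in for real key creation are hypothetical. The sketch assumes a fixed number of workers that each claim the next index from an `AtomicLong`, so no per-item task objects have to be queued.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; class and method names are not taken from the patch.
public class GlobalCounterSketch {

  /**
   * Creates totalKeys key names using a fixed number of worker threads.
   * Each worker claims the next index from a shared counter, so only
   * `threads` tasks are ever submitted to the pool.
   */
  public static Set<String> generate(int threads, long totalKeys, String prefix)
      throws InterruptedException {
    AtomicLong nextKey = new AtomicLong(); // the "global" counter
    // Collected only so the example can show the result; a real load
    // generator would not keep every name in memory.
    Set<String> created = ConcurrentHashMap.newKeySet();
    ExecutorService pool = Executors.newFixedThreadPool(threads);

    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        long index;
        // getAndIncrement hands each index to exactly one worker.
        while ((index = nextKey.getAndIncrement()) < totalKeys) {
          created.add(prefix + "-" + index); // stand-in for key creation
        }
      });
    }

    pool.shutdown();
    if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
      pool.shutdownNow();
      throw new IllegalStateException("workers did not finish in time");
    }
    return created;
  }

  public static void main(String[] args) throws InterruptedException {
    Set<String> keys = generate(4, 1000, "key");
    System.out.println(keys.size() + " keys created"); // prints "1000 keys created"
  }
}
```

The same pattern would extend to separate counters for volumes and buckets; that extension is an assumption about how such a design could look, not a description of the merged change.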
https://issues.apache.org/jira/browse/HDDS-1785
How was this patch tested?
Tested with various numbers of volumes, buckets, and threads.