-
-
Notifications
You must be signed in to change notification settings - Fork 244
Add pointers tree to TempSpace class #8421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pointers tree to TempSpace class #8421
Conversation
Necessary for faster search of a free segment of the required size. When using temporary blobs, there are situations when a large number of free segments of a small size accumulate during one transaction.
AFAIU, the second tree contains chains of pointers to the segments of the same size. The code looks over complicated for me, Did you consider to implement exactly what was described and much more simpler - second tree with plain segment pointers ordered by segment size ? |
Even better could be to add map |
Tried to run the test with not patched master:
Note: I've fixed RUN_TEST to avoid overflow and my results is very different:
Then I set page cache to 50000 pages and TempCacheLimit to 1G to avoid on-disk operations until commit,
It shows almost ideal increase of execution time |
Sorry, my mistake, I initially tested on FLD_48 when preparing the resulting data, and then changed it to FLD_50 and remove 64k and 128k records in the test, because even on these values the result is visible.
On the release build, the effect is not so visible on such a sample, so the results are collected on a debug build, this is not entirely correct, but effectively.
Previously I add debug information to the code:
And on a real DB the results were as follows:
It follows that almost all free segments are not reused (the size of free segments was less than 16 bytes), and with each call |
Could you provide results on RELEASE build ? |
My work computer is weaker, plus the OS may matter (I'm testing on Windows), but the results on the release build are as follows:
In the config it was written:
I was trying to figure out why you had such a good increase time without a patch.
Which is approximately (within the margin of error) the same, both without the patch and with the patch. There hasn't been enough time to figure out why the I also tried running the test on another, more powerful computer (with config parameters and a
I am sending a test case in the attached file, but I am not sure that it will run as is, since it uses a modified pytest to simultaneously run two DBMS. |
Finally, I reproduced the issue with original test case: initially I created trigger So, now I can confirm the issue and the fix. Also, I've tried alternative fix (with map <size, position>, as I offered to consider) and it shows almost the same results. I still think that code could be made a bit more understandable and looking how to improve it. |
Let me first explain how I see what this PR is about. Free segments organized into set of doubly-linked lists where each list contains segments of equal size. I consider this as nice way to add an extra index without much overhead. I offer to:
I believe such (or similar) changes makes code much more clear and easy to read and understand. |
Looks like you (@XaBbl4) agreed with proposed changes. So, how we going to proceed ?
Another opinions ? |
I will do this as part of the PR, but after fixing the bug found with this. For certain operations, fb_assert is triggered, there is a reproducible case. |
Much better, thanks! Two notes:
|
Also add consistency check in validate function
Thanks for the notes. I rename the variable, it is indeed correct. Also added a check for the number of segments of the source tree freeSegments to validate function. I think this check is enough. |
@hvlad , any objection to backport into v5? |
No objections, of course |
* Add pointers tree to TempSpace class Necessary for faster search of a free segment of the required size. When using temporary blobs, there are situations when a large number of free segments of a small size accumulate during one transaction. * Replace NULL to nullptr * Refactor class and fix server crash * Rename head to tail for better understanding Also add consistency check in validate function --------- Co-authored-by: Andrey Kravchenko <[email protected]>
The problem is in the linear enumeration of the tree of free segments to find the best segment of the required size.
It appears in cases when a lot of free segments of a very small size appear in the tree, and with each request for
allocateSpace
- this cycle constantly enumerated.This patch proposes a solution by adding a second tree, where the key is the size of the free segment and the payload is a reference to that segment in the first tree.
Test case in which this problem occurs:
After running the test before and after the patch we have:
K_B
, e.g.BEFORE(128000) / BEFORE(64000) = 4.752
), and the more records are changed in one transaction - the time goes to the sky :-)K_A
, e.g.AFTER(128000) / AFTER(64000) = 2.285
) and overall acceleration of execution relative to the variant without the patch (K
, e.g.BEFORE(128000) / AFTER(128000) = 39.343
)