Refactor event handler and improve performance [So far: ~40-50%] #2173
…. Works as expected on server side.
…char* as fn input args
|
As of now only server will compile. Client is TODO. |
|
tl;dr
Off-topic /LTCG /GL stuff
I also tried with /LTCG /GL and had a consistent 8-10% speedup, but for whatever reason with 28 handlers it became 32%. I really suggest we enable /LTCG and /GL when shipping MTA.
The attached file is the not-so-tl;dr version. |
|
Ready for review. A few things to consider:
Macro of doom
I added a macro of doom in BuiltInEventListApply.h. It's ugly, but this way we don't have to maintain the same list in 3 different places. If anyone has a better idea, let me know.
Memory usage?
As I said, it does increase by a not so marginal ~17% per element. I'll see what the performance difference is if I use the hash map for built-in events as well. Or should we sacrifice memory for CPU? Realistically, people have at most 50k elements on the client.
Edit: Okay, so I ran a quick test. With /LTCG /GL the difference between the hash map (lower memory usage) and the array (higher memory usage) is ~10% overall (up to 33%). |
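For readers unfamiliar with the pattern, a "macro of doom" that keeps one list in sync across several places is usually an X-macro. The following is a minimal sketch of that general technique, not the contents of BuiltInEventListApply.h: the macro name, the `Example*` identifiers and the three event names are all placeholders chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical event list. The names are placeholders, not the real
// contents of BuiltInEventListApply.h.
#define EXAMPLE_BUILTIN_EVENT_LIST(X) \
    X(onPlayerJoin)                   \
    X(onPlayerQuit)                   \
    X(onElementDestroy)

// First expansion: one enumerator per event, plus a trailing COUNT.
enum class ExampleEventID : std::size_t
{
#define X(name) name,
    EXAMPLE_BUILTIN_EVENT_LIST(X)
#undef X
    COUNT
};

// Second expansion: a name table in exactly the same order as the enum.
constexpr std::array<std::string_view, static_cast<std::size_t>(ExampleEventID::COUNT)> kExampleEventNames = {
#define X(name) #name,
    EXAMPLE_BUILTIN_EVENT_LIST(X)
#undef X
};

// Map an ID back to its name; the ID doubles as the array index.
inline std::string_view ExampleEventName(ExampleEventID id)
{
    assert(id != ExampleEventID::COUNT);
    return kExampleEventNames[static_cast<std::size_t>(id)];
}
```

Adding an event then means touching only the list macro; the enum and the name table follow automatically. The price is readability, which matches the "ugly" complaint above.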
|
This isn't all, but I don't want to cram too much into the PR. I'll remove the need for |
|
so when i use |
|
Why not use hash instead of std::string in CustomEvent? |
Absolutely. CPU usage is the bottleneck for 99% of players. |
I thought this was a well-known performance tip, but I don't see it mentioned on the addEventHandler wiki page. There are, however, multiple mentions of it in the page I made based on the things I learned about performance (some of which came from MTA devs). For example, ccw suggested changing all event handlers for client-to-server events from root to resourceRoot (you have to change the triggerServerEvent source to resourceRoot too), and doing that gave a large CPU usage reduction on the server. |
Hashes by themselves have collisions, thus I would have to do this. Honestly, doing this would be a pain in the ass. Also, I don't reuse the key, because I only need to do the name => event translation once. |
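To illustrate the collision concern in general terms (this is a sketch, not code from the PR): if events were stored under a bare hash instead of the name, two different names could land on the same key, so the full name still has to be kept and compared on lookup. `WeakHash`, `ExampleEvent` and `ExampleHashKeyedRegistry` are hypothetical names, and the hash is deliberately poor so that a collision is easy to produce.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Deliberately weak hash (sum of bytes) so that collisions are easy to show.
// A real hash collides far less often, but it still can.
inline std::size_t WeakHash(const std::string& s)
{
    std::size_t h = 0;
    for (unsigned char c : s)
        h += c;
    return h;
}

struct ExampleEvent
{
    std::string name;
    int         id;
};

// Keyed by hash only: each bucket must keep the names so that a lookup
// can tell colliding events apart.
class ExampleHashKeyedRegistry
{
public:
    void Add(const std::string& name, int id) { m_buckets[WeakHash(name)].push_back({name, id}); }

    // Returns the event ID, or -1 if the name is unknown.
    int Find(const std::string& name) const
    {
        auto it = m_buckets.find(WeakHash(name));
        if (it == m_buckets.end())
            return -1;
        for (const ExampleEvent& e : it->second)
            if (e.name == name)  // the string compare cannot be skipped
                return e.id;
        return -1;
    }

private:
    std::unordered_map<std::size_t, std::vector<ExampleEvent>> m_buckets;
};
```

Since the string comparison stays either way, and the name is only translated to an event once, a plain `std::unordered_map<std::string, ...>` does the same job with less code, which is consistent with the reasoning in the comment above.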
As soon as this PR gets merged I will make another one to add arguments to limit triggering on children, since that takes a lot of time. Imagine you trigger something on root: the program has to traverse the whole tree and call all related handlers on every element, that's why it's so slow. |
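The cost described here can be sketched with a toy element tree. This models only the downward walk mentioned in the comment and is not MTA's actual implementation; `ExampleElement`, `ExampleTrigger` and `ExampleBuildTree` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct ExampleElement
{
    std::size_t handlerCount = 0;  // handlers attached for the event in question
    std::vector<std::unique_ptr<ExampleElement>> children;
};

struct ExampleTriggerStats
{
    std::size_t elementsVisited = 0;
    std::size_t handlersCalled = 0;
};

// Visit the source element and every descendant, counting the handlers
// that would be called on each one.
inline void ExampleTrigger(const ExampleElement& source, ExampleTriggerStats& stats)
{
    ++stats.elementsVisited;
    stats.handlersCalled += source.handlerCount;
    for (const auto& child : source.children)
        ExampleTrigger(*child, stats);
}

// Build a root with `resources` children, each holding `objectsPerResource` leaves.
inline std::unique_ptr<ExampleElement> ExampleBuildTree(std::size_t resources, std::size_t objectsPerResource)
{
    auto root = std::make_unique<ExampleElement>();
    for (std::size_t r = 0; r < resources; ++r)
    {
        auto resource = std::make_unique<ExampleElement>();
        for (std::size_t o = 0; o < objectsPerResource; ++o)
            resource->children.push_back(std::make_unique<ExampleElement>());
        root->children.push_back(std::move(resource));
    }
    return root;
}
```

In this model, triggering on the root visits every element even when only one handler exists, while triggering on a single resource subtree visits only that subtree. That is the same reason the resourceRoot tip discussed in this thread helps.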
|
And what happens when i use "onClientRender" with anything else than root?? |
|
Render events are special: they are (internally) triggered only on root, and aren't propagated down to children. |
|
Merge this before 1.5.9 please. |
|
If this just needs reviewing perhaps you should request someone to review so that such a beneficial change can be available to players as soon as possible. |
|
Reviewing it is pretty trivial, testing it isn't. |
|
I just tried starting a server with it, using CIT scripts. After starting some of the scripts (during that massive spam of add event debug) it crashes, and it has crashed both times with the same last message in the log: [2021-04-20 17:36:35] startResource: Resource 'CITaccountsClient' started. Here is the public dump file that was created: server_1.5.8-custom_KERNELBASE_00122802_7363_20210420_1735.rsa.zip |
|
@ArranTuna Could you please send me the private dump privately? Over on Discord (Pirulax#6835). |
|
Sup', how is this going? Any ETA? I tested the changes with my gamemode and it worked just fine. I tried to use the unit test, but I could only get the event-sys-test part working, because setedata-times is missing. In terms of memory usage it went up by ~159MB on the server side (32-bit) and ~220MB on the client side. My gamemode has to allocate a lot of memory for models, so this could be a medium to worst case scenario, and tbh it shouldn't affect anything. I can't see a server/client crashing because of this change, and even if that was the case the benefits are just too much. The only way I can test that this really helps on a large scale is when this goes live tho, so I'll be waiting impatiently. Good work Pirulax!
Offtopic:
This means that this:
-- serverside
addEventHandler("my_event", resourceRoot, my_function)
-- clientside
triggerServerEvent("my_event", resourceRoot)
Is faster than this?:
-- serverside
addEventHandler("my_event", root, my_function)
-- clientside
triggerServerEvent("my_event", localPlayer)
Or is it kinda the same and just faster in general than root? If it is slower then I guess I would have to do some rewriting here and there. |
|
Yes that scripting tips thing is faster. Did you test the speed of event handling on this build? Try with tens of thousands of elements created (Just for loop createObject). When I tested with Pirulax it turned out that this PR made triggerEvent take multiple times longer on a server with tens of thousands of elements. |
Just did some tests regarding this. I used r20740 to test, all server-side. Three tests in total, which were:
Test 1
Create and destroy 30k objects with an "onElementDestroy" event handler attached at root. Keep in mind that my server already has over 8000 objects created on the server side, so the total number of objects is around 38k, but only 30k are being created and destroyed.
addEventHandler("onElementDestroy", root, function()
--
end)
local tick = getTickCount()
local objs = {}
for i=1,30000 do
objs[i] = createObject(1337, 0, 0, 3)
end
print("created in:", getTickCount()-tick, "ms")
tick = getTickCount()
for i=1,30000 do
destroyElement(objs[i])
end
print("destroyed in:", getTickCount()-tick, "ms")
= r20740 =
= Custom =
While nobody is on the server it takes a tiny bit longer to create and destroy the objects, but when somebody is connected you can expect a performance boost.
Test 2
Again I created 30k objects, but this time I just trigger an event in another resource by calling triggerEvent 100 times.
local tick = getTickCount()
local objs = {}
for i=1,30000 do
objs[i] = createObject(1337, 0, 0, 3)
end
print("created in:", getTickCount()-tick, "ms")
local tick = getTickCount()
for i=1,100 do
triggerEvent("onPlayerACInfo", resourceRoot, {})
end
print("100 triggerEvent:", getTickCount()-tick, "ms")
== r20740 ==
== Custom ==
As you said, it takes longer for the events to trigger, which is odd.
Last Test
I just triggered all 100 events in a loop. Remember that I have around 8000 objects on the server side.
local tick = getTickCount()
for i=1,100 do
triggerEvent("onPlayerACInfo", resourceRoot, {})
end
print("100 triggerEvent:", getTickCount()-tick, "ms")
== r20740 ==
== Custom ==
No change at all. My conclusion was that this issue only occurs when the objects are created in the same resource where the triggers are being called. Making two separate resources, one that creates 30k objects and one that triggers the 100 events, makes the result look like this:
== Custom ==
Using a timer to call the events (1000ms after creation) didn't help with the results, so it's safe to say that this is resource related. I hope this information is useful in some way.
btw I just tested this:
with 10k calls and sadly it doesn't seem to be the case: the localPlayer method and the resourceRoot method get mostly the same time, as expected, but the value can vary from just 300ms to over 2300ms, kinda strange. It's still 10x better than root, obviously. |
|
I honestly have no clue what causes the performance degradation Arran mentioned. Anyways. I'll look into it. |
I could try to compile it on Linux and test it. It's a very edge case tho, shouldn't we be more flexible? Who will have a single resource with 30k objects anyway? Also, why does it slow down in the first place? In r20740 it goes from ~1ms to ~100ms just because the objects were created in the same resource. It's odd.
There is also this problem: every time I executed that test the results varied, but normally they went down by the same amount. First the triggering took 2300ms, then 2000, then 1500, and so on down to ~300-350ms. I can't think of a reason why that was happening, some caching maybe? I was using r20740 and not the custom version tho. Either way, if I can help with something (I'll try to test on Linux when I can) let me know. I'll probably do some more testing later, no promises tho. |
|
I mean the compiler might not be inlining something as expected. I'll try dropping a __forceinline. |
|
@Pirulax Are there any further updates on the progress of this? |
|
Not really. I'm yet to test it in a realistic way. |
|
This draft pull request is stale because it has been open for at least 90 days with no activity. Please continue on your draft pull request or it will be closed in 30 days automatically. |
|
This draft pull request was closed because it has been marked stale for 30 days with no activity. |
As we all know (at least I do), the event handler system in MTA is a big bottleneck for quite a lot of things:
- setElementData: event calling takes 40-50% of execution (70-80% if called on root)
- onClientWorldSound: since this event is called multiple times a frame it can cause quite bad lag (GTA sounds cause huge FPS drops but don't in SP #1067)
- onClientRender: 200 handlers can take 1 ms to execute on my machine
Pros:
- … BuiltInEvent) with a unique ID assigned to each
Cons:
- … EventIDArray, which would manage event IDs (obviously), so that way we could use a vector<unique_ptr<EventHandlerCollection>> regardless of type (BuiltIn / Custom).
Notes:
Benchmarks: See comment below.
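As a rough illustration of the idea hinted at in the description (events get a unique ID, and handler collections live in a vector indexed by that ID), here is a minimal sketch. It is an assumption about the general shape, not the PR's code: `ExampleEventIDRegistry`, `ExampleHandlerCollection` and `ExampleElementHandlers` are hypothetical names, and how the real EventIDArray behaves is not stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using ExampleHandler = std::function<void()>;

struct ExampleHandlerCollection
{
    std::vector<ExampleHandler> handlers;
};

// Translates an event name to a dense integer ID. The string lookup happens
// here once; afterwards only the ID is used.
class ExampleEventIDRegistry
{
public:
    std::size_t GetOrCreate(const std::string& name)
    {
        // A new name receives the next free index; a known name keeps its ID.
        auto [it, inserted] = m_ids.try_emplace(name, m_ids.size());
        (void)inserted;
        return it->second;
    }

private:
    std::unordered_map<std::string, std::size_t> m_ids;
};

// Per-element storage: slot i holds the handlers for event ID i,
// or null when no handler was added for that event.
class ExampleElementHandlers
{
public:
    void Add(std::size_t eventID, ExampleHandler fn)
    {
        if (eventID >= m_slots.size())
            m_slots.resize(eventID + 1);
        if (!m_slots[eventID])
            m_slots[eventID] = std::make_unique<ExampleHandlerCollection>();
        m_slots[eventID]->handlers.push_back(std::move(fn));
    }

    // Calls every handler for the event and returns how many ran.
    std::size_t Call(std::size_t eventID) const
    {
        if (eventID >= m_slots.size() || !m_slots[eventID])
            return 0;
        for (const auto& fn : m_slots[eventID]->handlers)
            fn();
        return m_slots[eventID]->handlers.size();
    }

private:
    std::vector<std::unique_ptr<ExampleHandlerCollection>> m_slots;
};
```

Indexing by ID replaces a per-call string hash with a bounds check and a pointer load, at the cost of one pointer-sized slot per event ID on every element that has handlers. That is the kind of memory-for-CPU trade-off debated in the conversation.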