Description
I had a look again at pytest performance, and was struck again by how many stat calls pytest performs. This seems to boil down to using the path classes from py for checking file existence, walking directory structures and such. Unfortunately, instances of these classes are passed to plugins, so they aren't purely internal implementation details.
I suggest we think of using plain strings as paths that are passed to plugins. This is obviously a breaking change so can't be done until pytest 6. It's not clear to me how we'd make deprecation warnings for this though :(
To take a concrete example of this being a problem, the test suite for the product I work on calls stat 79k times just for the collect phase. If I monkey patch stat to log the paths there are ~3k unique paths in the output. I can get a little performance boost by monkey patching stat to be cached:
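    import os

    orig_stat = os.stat
    cache = {}

    def monkey_stat(*args, **kwargs):
        # Cache results by path; assumes files don't change during the run.
        a = args[0]
        if a in cache:
            return cache[a]
        r = orig_stat(*args, **kwargs)
        cache[a] = r
        return r

    os.stat = monkey_stat  # install the cached version

That this can improve the performance is pretty silly :P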
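For anyone wanting to reproduce this kind of measurement, here is a rough sketch of how the call counts could be gathered. The name `count_stat_calls` is made up for illustration and is not part of pytest or py. It only sees calls that look up `os.stat` at call time, so code that goes through `os.lstat`, `os.scandir`, or a previously imported reference to `stat` would be missed.

```python
import os
from collections import Counter
from contextlib import contextmanager


@contextmanager
def count_stat_calls():
    """Temporarily wrap os.stat and tally how often each path is stat'ed.

    Hypothetical helper for illustration; not a pytest API.
    """
    counts = Counter()
    orig_stat = os.stat

    def logging_stat(path, *args, **kwargs):
        # File descriptors (ints) are kept as-is; path-like objects are
        # normalised to str/bytes so they can be used as Counter keys.
        key = path if isinstance(path, int) else os.fspath(path)
        counts[key] += 1
        return orig_stat(path, *args, **kwargs)

    os.stat = logging_stat
    try:
        yield counts
    finally:
        # Always restore the real os.stat, even if the body raised.
        os.stat = orig_stat
```

Wrapping the collection phase in `with count_stat_calls() as counts:` and then comparing `sum(counts.values())` with `len(counts)` gives the total versus unique numbers.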
It seems like pytest could just use os.walk once and then use that data for the rest of the run of the program...