Description
This issue details my findings from investigating pylint-dev/pylint#8589 (see that issue for more information and steps to reproduce) after my PR couldn't pass checks.
I found the (quite arcane) cause and some different potential fixes.
Bug
Under rare circumstances, pylint crashes with `RuntimeError: dictionary changed size during iteration`.
Cause
For this bug, during pylint checking, one of the modules in `sys.modules` is a lazy module set up by `importlib.util.LazyLoader`. These 'lazy modules' only execute their contents the first time an attribute is accessed on the module.
The lazy module ends up there when a package that uses lazy loading has to be loaded during pylint execution, which requires the package to be installed and importable. One reason pylint would load the package is that it is listed in pylint's `extension-pkg-whitelist` (as in `pygame-ce`'s case).
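As a minimal sketch of what such a lazy module looks like, the following follows the lazy-import recipe from the `importlib` documentation. The module name `demo_lazy_mod` and the temporary directory are hypothetical and exist only for the demonstration; the real packages set up their lazy modules in their own ways.

```python
import importlib
import importlib.util
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical module, written to a temporary directory for the demo.
tmp = tempfile.mkdtemp()
Path(tmp, "demo_lazy_mod.py").write_text("LOADED = True\n")
sys.path.insert(0, tmp)
importlib.invalidate_caches()


def lazy_import(name):
    """Register a lazy module in sys.modules without executing its body."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only arms the lazy module, runs nothing yet
    return module


module = lazy_import("demo_lazy_mod")
# type() does not count as an attribute access, so the module stays lazy.
lazy_before = type(module) is not types.ModuleType
value = module.LOADED  # first attribute access executes the module body
lazy_after = type(module) is not types.ModuleType
print(lazy_before, value, lazy_after)
```

Until the first attribute access, the object in `sys.modules` is a placeholder whose body has not run; afterwards it behaves like an ordinary module.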
During checking (of any code that needs the package to be loaded), pylint analyzes the `sys` module live (live because there is no Python source to analyze statically) and builds child nodes for its members, including the `sys.modules` dict, in `astroid.raw_building.InspectBuilder.object_build()`.
Since `sys.modules` is a dict, pylint also tries to create nodes for the keys (strings) and values (module objects) in `astroid.nodes.node_classes._create_dict_items()`, by iterating over the dict's `.items()` and calling `const_factory()` on each key and value. The actual live `sys.modules` dict is iterated over, not a copy.
Eventually the iteration reaches the lazy module object. Inside `astroid.nodes.node_classes.const_factory()`, the `.__class__` attribute of the lazy module is accessed, either implicitly by the `isinstance()` assertion or explicitly on the next line.
Since an attribute was accessed on the lazy module object for the first time, its contents get executed. If it happens to import modules and packages while loading, the Python import system puts new modules into the `sys.modules` dict, changing its size.
After the lazy module is fully loaded, execution eventually returns to the loop over `sys.modules.items()`. On the next iteration, Python detects that the dict size has changed and raises `RuntimeError` at that point in the code.
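The sequence above can be reproduced outside of pylint with a small sketch. `scan_live_modules()` is a simplified stand-in for the astroid loop, not astroid's actual code, and the module names `demo_lazy_pkg` and `demo_lazy_dep` are hypothetical files written to a temporary directory.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical modules: the lazy one imports a second module when it loads.
tmp = tempfile.mkdtemp()
Path(tmp, "demo_lazy_dep.py").write_text("VALUE = 1\n")
Path(tmp, "demo_lazy_pkg.py").write_text("import demo_lazy_dep\n")
sys.path.insert(0, tmp)
importlib.invalidate_caches()


def register_lazy(name):
    """Put a lazy module into sys.modules without executing it."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


def scan_live_modules():
    """Simplified stand-in for the astroid loop over the live dict."""
    try:
        for _name, module in sys.modules.items():
            # The first attribute access executes a lazy module, which
            # imports demo_lazy_dep and so grows sys.modules mid-iteration.
            module.__class__
    except RuntimeError as exc:
        return str(exc)
    return None


register_lazy("demo_lazy_pkg")
error = scan_live_modules()
print(error)
```

Note that the exception surfaces in the `for` statement, after the lazy module has finished loading, which matches the unhelpful traceback seen in the real crash.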
Comments
- The packages where the bug was experienced: `metaflow`, `pyjanitor`, `pygame-ce` (lazy module branch), `django`. From what I see, these all use lazy loading and `importlib`.
- The error was raised at the loop instead of when a lazy module was being loaded, making the traceback less useful.
- The condition where a package needed to be loaded during pylint execution made the crash difficult to reproduce. It explains why it often happens in CI checks of the package's repo.
Possible Fixes
- Shallow copy live mutable collections before iterating. Inefficient but simple and sound fix.
- Special-case the `sys.modules` dict for live analysis. This only addresses the common case; a crash is technically still possible.
- Avoid all attribute accesses on `value` inside `const_factory()`. I checked and it is easy: replace the two `value.__class__` with `type(value)` (bypassing attribute access), and use `issubclass()` instead of `isinstance()` (because `isinstance()` uses `__class__`) in one assertion, all in the `const_factory()` code. This keeps the original efficiency but has side effects: it is harder to maintain, `__class__` can no longer be proxied (possibly a good thing?), and `__getattribute__` is no longer invoked by pylint (probably good, but another change in behavior).
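To illustrate the first and third options, here is a rough sketch under the same assumptions as a standalone reproduction: the module names `demo_fix_pkg` and `demo_fix_dep` are hypothetical, and the loops are simplified stand-ins rather than astroid's code.

```python
import importlib
import importlib.util
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical modules: the lazy one imports a second module when it loads.
tmp = tempfile.mkdtemp()
Path(tmp, "demo_fix_dep.py").write_text("VALUE = 1\n")
Path(tmp, "demo_fix_pkg.py").write_text("import demo_fix_dep\n")
sys.path.insert(0, tmp)
importlib.invalidate_caches()


def register_lazy(name):
    """Put a lazy module into sys.modules without executing it."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


lazy = register_lazy("demo_fix_pkg")

# Third option: type() does not go through __getattribute__, so the lazy
# module is never executed and sys.modules keeps its size.
for _name, module in sys.modules.items():
    type(module)
untouched = "demo_fix_dep" not in sys.modules

# First option: iterate over a shallow copy, so imports triggered by
# module.__class__ may grow sys.modules without breaking the loop.
for _name, module in list(sys.modules.items()):
    module.__class__
loaded = "demo_fix_dep" in sys.modules
print(untouched, loaded)
```

The copy costs one extra list per live dict but tolerates any side effect of attribute access, while the `type()` variant avoids the side effect in the first place.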
astroid 3.3.8