Skip to content
This repository was archived by the owner on Feb 26, 2025. It is now read-only.

Conversation

GianlucaFicarelli
Copy link
Contributor

@GianlucaFicarelli GianlucaFicarelli commented Nov 4, 2024

This PR adds async code to allow using entity_management in async code without using threads or subprocesses.

Structure:

  • The new async code is added under entity_management_async.
  • The old sync code is kept under entity_management, but it's automatically generated from the async code.
  • Data have been moved to entity_management_common, because used from both sync and async code.
  • The sync tests are automatically generated from the async tests as well.

Usage:
The async classes are similar to sync classes, but to automatically resolve the lazy attributes (linked Identifiables that need to be retrieved from Nexus) we need to explicitly await them.
Awaiting the same attribute multiple times doesn't have any effect (i.e. the attribute is not retrieved from Nexus multiple times)

Usage example:

>>> import logging
>>> logging.basicConfig(level="DEBUG")

>>> from entity_management_async.simulation import DetailedCircuit
>>> circuit_id = "https://..."
>>> circuit = await DetailedCircuit.from_id(circuit_id)
>>> circuit.as_json_ld()
>>> circuit.instantiated
True
>>> circuit.atlasRelease
AtlasRelease()
>>> circuit.atlasRelease.instantiated
False
>>> atlas_release = await circuit.atlasRelease
>>> circuit.atlasRelease.instantiated
True
>>> atlas_release is circuit.atlasRelease
True
>>> await circuit.atlasRelease.parcellationOntology
>>> circuit.atlasRelease.parcellationOntology.distribution[0].encodingFormat
'application/json'

Pros:

  • The async classes mostly behave like the sync classes
  • Loading from Nexus is more explicit, because it needs to be awaited.

Cons:

  • The user needs to be aware which attributes are Identifiables that need to be loaded from Nexus

Comparison with SQLAlchemy approaches:

  • The approach in this PR resembles awaitable_attrs with the difference that the awaitable attributes are directly accessible without the awaitable_attrs attribute.
  • A different approach, not implemented, is the eager loading strategy using selectinload() eager loader. In our case, if the user knows which sub-entities need to be loaded when calling from_id or from_url, they could be specified and all eagerly retrieved immediately.

Documentation: it should be updated after discussion.

@GianlucaFicarelli GianlucaFicarelli self-assigned this Nov 4, 2024
@codecov-commenter
Copy link

codecov-commenter commented Nov 4, 2024

Codecov Report

Attention: Patch coverage is 78.77719% with 361 lines in your changes missing coverage. Please review.

Project coverage is 80.13%. Comparing base (28ec74a) to head (80bc27a).

Files with missing lines Patch % Lines
entity_management_async/nexus.py 54.33% 100 Missing ⚠️
...ty_management_async/circuit/building/functional.py 0.00% 50 Missing ⚠️
entity_management_async/base.py 90.76% 36 Missing ⚠️
entity_management_async/cli/base.py 0.00% 34 Missing ⚠️
entity_management_async/analysis.py 0.00% 25 Missing ⚠️
entity_management_async/util.py 87.19% 21 Missing ⚠️
entity_management/nexus.py 45.45% 18 Missing ⚠️
entity_management_async/core.py 86.32% 16 Missing ⚠️
entity_management/util.py 50.00% 7 Missing ⚠️
entity_management_async/atlas.py 84.09% 7 Missing ⚠️
... and 14 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
- Coverage   80.98%   80.13%   -0.85%     
==========================================
  Files          24       52      +28     
  Lines        1551     3206    +1655     
==========================================
+ Hits         1256     2569    +1313     
- Misses        295      637     +342     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Base automatically changed from httpx to main November 4, 2024 13:48
@GianlucaFicarelli
Copy link
Contributor Author

More details about possible errors: in the example in the description, when a nested entity is accessed before it's instantiated, a RuntimeError is raised:

>>> circuit.atlasRelease.parcellationOntology
...
RuntimeError: Cannot access not instantiated AtlasRelease.parcellationOntology

but the message can be made clearer, for example:

  • Cannot access the attribute 'parcellationOntology' because AtlasRelease needs to be instantiated
  • Need to instantiate AtlasRelease to access the attribute 'parcellationOntology'
  • ...

@genric
Copy link
Collaborator

genric commented Nov 5, 2024

Minor: there is a bunch of warnings on tox -e check-packaging.
Having to await separately on nested attrs is a major drawback IMHO, not sure if we can do any better...
Maybe it is better to split into two separate python packages rather than having a mix of generated and normal code in one. Then the other package would be completely auto-generated.

@GianlucaFicarelli
Copy link
Contributor Author

Since we are not happy with the async API, I'm thinking to close this PR without merging, while keeping the branch as a prototype for future reference.
I was considering to use this async code in the sonata-cell-position service, but we can postpone the decision because probably the service wouldn't benefit a lot, considering that the endpoint calling nexus is already executed in a separate thread, and most of the time is spent in calls to libsonata, that would need to continue to be executed in subprocesses.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants