Import and expose emodel traces #200

GianlucaFicarelli · 2025-05-27T09:41:18Z

This PR allows to import and expose the traces used to generate an EModel, using the derivation pattern.
Fixes https://github.com/openbraininstitute/prod-platform-architecture/issues/70

The derivations are stored in a generic derivation table with the intention that it can be reused for other entities.
For the moment this association table contains only used_id and generated_id referencing other entities because we don't need other information, but it should be possible to add other fields if needed (for example the type of derivation, or activity, generation, usage, attributes as indicated in https://www.w3.org/TR/prov-dm/#concept-derivation)

The traces are exposed in 3 ways because of the discussion for ion-channel-models in #183 (comment) , but we may decide to keep only one, or some:

expose GET /emodel/<uuid>/electrical-cell-recording)
allow to search by emodel in GET /electrical-cell-recording?generated_emodel__id=<uuid>
always return the list of traces in EModelReadExpanded.electrical_cell-recording, but include only id and type to prevent loading too many nested entities.

We could have another approach, that is to expose a dedicated /derivation endpoint, but it would be more beneficial if we store other information in the derivation itself.

The creation of the derivations can be implemented here or in a separate PR, but this needs more clarifications (cc @jdcourcol ):

should the derivations be created at the same time when the emodel is registered? Are they mandatory? in this case, they could be included in the create schema of EModel, and it would make sense if they are returned also by the read schema (approach 3 above)
should it be possible to create the derivations after the EModel has been registered? In this case, they could be registered through a dedicated endpoint, for example PUT /emodel/<uuid>/electrical-cell-recording that would make sense with the approach 1 above (alternatively, PATCH /emodel/<uuid> but this would make more sense for the attributes of the emodel, and it would need more clarifications on what other fields of EModel can be updated)
should it be possible to delete/recreate the derivations? in this case, in bulk or individually? This can also be decided later.

jdcourcol · 2025-05-27T10:19:26Z

should the derivations be created at the same time when the emodel is registered? Are they mandatory? in this case, they could be included in the create schema of EModel, and it would make sense if they are returned also by the read schema (approach 3 above)

no it is not mandatory. In certain case we may not have it. I would not return it in the read schema.

should it be possible to create the derivations after the EModel has been registered? In this case, they could be registered through a dedicated endpoint, for example PUT /emodel//electrical-cell-recording that would make sense with the approach 1 above (alternatively, PATCH /emodel/ but this would make more sense for the attributes of the emodel, and it would need more clarifications on what other fields of EModel can be updated)

yes we can create the derivation after. it is not a field of the emodel in my opinion. it is a relationship between 2 entities.

should it be possible to delete/recreate the derivations? in this case, in bulk or individually? This can also be decided later.

yes we can delete / recreate them in order to fix errors.

jdcourcol · 2025-05-27T10:23:31Z

app/routers/emodel.py

@@ -10,3 +19,24 @@
 read_many = router.get("")(app.service.emodel.read_many)
 read_one = router.get("/{id_}")(app.service.emodel.read_one)
 create_one = router.post("")(app.service.emodel.create_one)
+
+
+@router.get("/{id_}/electrical-cell-recording")


I would have done /{id_}/derived_from as it sounds more generic to me (since we will have these generic relationships in multiple place). Of course, in that case it returns all things the entity derive from (since we can derive also from other thing than electrical cell recording)

Do we need this endpoint? If the metadata returns a list of id and type then we can fetch the entities using the get endpoint for electrical cell recording.

Do we need this endpoint? If the metadata returns a list of id and type then we can fetch the entities using the get endpoint for electrical cell recording.

This is open for discussion as mentioned in the description and in the other JD's comment (to not return the derivations in the read schema). It mainly depend on the use cases:
Are these derivations usually needed by the clients when fetching the emodels? If not, the join with the derivation table would be useless in these cases.
Is it possible that the client needs only the derivations for a given id, without fetching the entity itself? In this case it could be beneficial to provide a separate endpoint.

That endpoint can also be used to create/update the derivations for a specific entity.

I would have done /{id_}/derived_from as it sounds more generic to me (since we will have these generic relationships in multiple place). Of course, in that case it returns all things the entity derive from (since we can derive also from other thing than electrical cell recording)

We can make the endpoint name generic (we could use the terms "used" and "generated" as indicated in the PROV document, so it could be /emodel/{id_}/used).
However, in this way the schema that specifies the returned resources should be a list of generic Entity, or a list of union of Entities.
Specifically, I think that we need to explicitly define the union of Entities if we want to instantiate models of different types with Pydantic, using a discriminator, but this adds some complexity to the model.
Alternatively, the generic endpoint can return a list of entities with only id and type, and the user should make an additional call to fetch the full entities based on them.

We can make the endpoint name generic (we could use the terms "used" and "generated" as indicated in the PROV document, so it could be /emodel/{id_}/used).

I am not sure it's a good idea to use activity-specific properties for an entity as we may want to introduce activities in the near future. derived-from/derived_from seems the right candidate for this job as it is an entity-related property:
https://www.w3.org/TR/prov-o/#wasDerivedFrom

Do we have multiple types of derivations for emodel?

derived-from/derived_from seems the right candidate for this job as it is an entity-related property: https://www.w3.org/TR/prov-o/#wasDerivedFrom

Agreed, it seems more appropriate.

Do we have multiple types of derivations for emodel?

Not sure for emodels, I think JD was referring to entities in general, to find an approach that can be adopted also in other cases, but it can be useful to consider more specific examples.

There is a charm, I have to admit, in not having to load the metadata to check its derivations or its generations if it's an activity for instance.

As a showcase (number 4) I added a new endpoint GET /{entity_route}/{entity_id}/derived-from that can work with any entity, and allows to filter by type.
This returns a paginated list of BasicEntity, that is a new pydantic schema with only id and type, because returning different types of entities in the same list would require special handling of polymorphism on both the server and the client side as mentioned in a previous comment.
This can be changed if needed, although it would require different queries and different joins to resolve the attributes of different types of entities.

To recap, there are 4 possible ways to retrieve the derivation, do you think that this last one can replace the previous ones for our use cases?
Do we need an additional attribute to differentiate different types of derivations for the same entity? (this could be also added later if needed)

GET /emodel/<uuid>/electrical-cell-recording) that returns a list of ElectricalCellRecording (paginated)

GET /electrical-cell-recording?generated_emodel__id=<uuid> that returns a list of ElectricalCellRecording (paginated)

GET /emodel/<uuid> that returns a list of BasicEntity in EModelReadExpanded.electrical_cell-recording (not paginated)

GET /{entity_route}/{entity_id}/derived-from that returns a list of BasicEntity (paginated)

I would use only number 4.

If there aren't objections I'm going to remove 3 as well, and leave only 4.
The endpoint 3 returns the same result as 4, it would just allow to avoid an additional call.

app/db/model.py

app/routers/derivation.py

eleftherioszisis · 2025-05-30T06:59:21Z

app/filters/electrical_cell_recording.py

 class ElectricalCellRecordingFilter(
    CustomFilter, BrainRegionFilterMixin, SubjectFilterMixin, EntityFilterMixin, NameFilterMixin
 ):
+    generated_emodel: Annotated[
+        _NestedEModelFilter | None,
+        FilterDepends(with_prefix("generated_emodel", _NestedEModelFilter)),
+    ] = None
    order_by: list[str] = ["-creation_date"]  # noqa: RUF012


Given that experimental data is usually used to produce models, isn't it strange allow the filtering of emodels via electrical cell recordings? Then we would probably need this generated_X for all different types of models that use traces. Maybe this can be internal just for the needs of the filter in the service but not exposed to the /electrical-cell-recording endpoint filter?

This is technically similar to the filter approach GET /ion-channel-model?emodel__id__in=['emodel1_id', 'emodel2_id1 ] suggested in #183 (comment)

However, if there aren't objections I'm going to remove:

this filter (approach 2)

and GET /emodel/<uuid>/electrical-cell-recording (approach 1 superseded by 4, using /derived-from)

while still keeping:

EModelReadExpanded.electrical_cell_recordings returns BasicEntity with only id and type (approach 3)

and 4 GET /{entity_route}/{entity_id}/derived-from

On the webapp we need to show a thumbnail for each emodel which we're missing

Not sure if that is an electrical cell recording, or another data source?
If it is indeed an electrical cell recording we probably need the assets of electrical cell recording in the list response of /emodel

Probably @bilalesi knows what data is needed.

By calling:
3. GET /emodel/{entity_id} you would get EModelReadExpanded.electrical_cell_recordings that is a list of BasicEntityRead with id and type
4. GET /{entity_route}/{entity_id}/derived-from (that is generic, but in this case it's GET /emodel/{entity_id}/derived-from) you would get only the list of BasicEntityRead with id and type

With the ids, it's possible to get the full details by calling GET /electrical-cell-recording?id__in={id_1},{id_2}....

Alternatively, EModelReadExpanded.electrical_cell_recordings could return the full details already, but this doesn't scale well because the same EModelReadExpanded is used for both the read_one and read_many endpoints, changed in #177.

3, 4, we run into N+1 problem.

GET /electrical-cell-recording?id__in={id_1} might work but it gets a bit awkward, say we have 30 emodels in the GET /emodel response and we want the electrical cell recordings for them. If there are more than one electrical_cell_recording per emodel Calling GET /electrical-cell-recording?id__in={id_1} ... {id_30} needs more than one call to get all the data, and some processing on the frontend side to group them by emodel.

However, apparently we only need one trace per emodel in the list view, not sure exactly how we choose which one is used for the thumbnail or if we can show any of them, if it's indeed arbitrary then we could nest 1 record only to keep the response small.

The thumbnail is not an experimental trace linked to the e-model. It is a plot of a run of the e-model with a supra-threshold current.
When we have an e-model configuration, we plot for the supra-threshold current with the lowest amplitude from the protocols from the configuration. For the e-models where we didn't have the configuration, we did the thumbnail plotting with a current equal to 130% of threshold if I recall correctly

Ah, thanks for the clarification. That's a separate issue irrelevant to this to the traces conversation , so I created: #205

Looking at the nexus data, there are:

images linked to the emodel: optimisation, traces, scores, thumbnail, parameters_distribution, evo_parameter_density...

images linked to the ElectricalCellRecording (SingleCellExperimentalTrace): ResponseTrace, StimulationTrace...

I understand that the images in 1 are out of scope for this PR, that should only link the emodel with the electrical-cell-recordings.
Is there a page in the UI that shows the traces used by a specific emodel (or more emodels)?
Is there a page that shows the images linked to ElectricalCellRecording/SingleCellExperimentalTrace?

Moreover, in nexus they were images and not distributions, so they aren't present in the current nexus dump. It was discussed before if they should be re-generated, or included in the next dump, but I'm not sure of the decision (cc @jdcourcol )

Is there a page in the UI that shows the traces used by a specific emodel (or more emodels)?

The EModel detail page, but that use case is covered by 3.

Is there a page that shows the images linked to ElectricalCellRecording/SingleCellExperimentalTrace?

The trace detail page

images linked to the emodel: optimisation, traces, scores, thumbnail, parameters_distribution, evo_parameter_density

These should be "ValidationResult" as these are created by the validation of a emodel or a memodel.
Thumbnail is the one that should probably be moved somehow to the thumbnail service.

1. expose GET /emodel/<uuid>/electrical-cell-recording) 2. allow to search by emodel in GET /electrical-cell-recording?generated_emodel__id=<uuid>

* origin/main: Wire in assets to the BrainAtlas and BrainAtlasRead (#207) me model calibration result proposal (#188) Remove legacy distribution metadata from assets during import (#206) sort the returned hierarchy alphabetically (#201)

AurelienJaquier

looks good to me

GianlucaFicarelli changed the title ~~Import emodel traces~~ Import and expose emodel traces May 27, 2025

GianlucaFicarelli and others added 4 commits May 27, 2025 11:44

Import and expose derivation between ElectricalCellRecording and EModel

2ab1de8

Add EModel.electrical_cell_recordings

2cd4ae8

Add EModel.electrical_cell_recordings

792fe6b

Rebase and update migration

c4c8519

GianlucaFicarelli force-pushed the import_emodel_traces branch from 84a5dbe to c4c8519 Compare May 27, 2025 09:46

GianlucaFicarelli requested review from bilalesi, mgeplf, eleftherioszisis and g-bar May 27, 2025 09:51

GianlucaFicarelli self-assigned this May 27, 2025

jdcourcol reviewed May 27, 2025

View reviewed changes

eleftherioszisis reviewed May 27, 2025

View reviewed changes

app/db/model.py Show resolved Hide resolved

Add endpoint /{entity_route}/{entity_id}/generated-by

6604e0d

jdcourcol requested changes May 27, 2025

View reviewed changes

app/routers/derivation.py Outdated Show resolved Hide resolved

Replace generated-by with derived-from

4a94eaf

GianlucaFicarelli requested review from jdcourcol and eleftherioszisis May 27, 2025 14:54

eleftherioszisis reviewed May 30, 2025

View reviewed changes

GianlucaFicarelli added 4 commits May 30, 2025 12:22

Remove 1 and 2

6207667

1. expose GET /emodel/<uuid>/electrical-cell-recording) 2. allow to search by emodel in GET /electrical-cell-recording?generated_emodel__id=<uuid>

Update migration after merge

5be625f

Remove 3

98bd2f5

GianlucaFicarelli requested review from eleftherioszisis and AurelienJaquier June 2, 2025 13:43

AurelienJaquier approved these changes Jun 2, 2025

View reviewed changes

jdcourcol approved these changes Jun 3, 2025

View reviewed changes

jdcourcol merged commit a1d1371 into main Jun 3, 2025
1 check passed

jdcourcol deleted the import_emodel_traces branch June 3, 2025 08:20

Import and expose emodel traces #200

Import and expose emodel traces #200

Uh oh!

Conversation

GianlucaFicarelli commented May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jdcourcol commented May 27, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GianlucaFicarelli May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

eleftherioszisis May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GianlucaFicarelli May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GianlucaFicarelli May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

eleftherioszisis May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GianlucaFicarelli May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

g-bar May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GianlucaFicarelli May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AurelienJaquier left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

GianlucaFicarelli commented May 27, 2025 •

edited

Loading

GianlucaFicarelli May 27, 2025 •

edited

Loading

eleftherioszisis May 27, 2025 •

edited

Loading

GianlucaFicarelli May 27, 2025 •

edited

Loading

GianlucaFicarelli May 27, 2025 •

edited

Loading

eleftherioszisis May 30, 2025 •

edited

Loading

GianlucaFicarelli May 30, 2025 •

edited

Loading

g-bar May 30, 2025 •

edited

Loading

GianlucaFicarelli May 30, 2025 •

edited

Loading