-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Ensure stateless layer instances #4448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
R/layer.r
Outdated
@@ -363,7 +365,9 @@ Layer <- ggproto("Layer", NULL, | |||
}, | |||
|
|||
finish_statistics = function(self, data) { | |||
self$stat$finish_layer(data, self$stat_params) | |||
params <- self$stat_params_computed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is better but not technically 100% safe; what happens if there's an uncaught error between compute_statistic()
and finish_statistic()
? I assume you tried turning these into function arguments, but there's too many functions to thread through?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean that the self$stat_params_computed
field would linger?
I don't see this as such a problem (pragmatically), since it is calculated fresh every time the plot is rendered and the old value will thus never influence any future render... I reset it to NULL
mostly out of principle, not that it has any practical effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is possible to avoid storing the computed params somewhere in the object since it is used in disparate places (and in the case of geom_params
both in ggplot_build()
and ggplot_gtable()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's just the principle that lead to this mistake in the first place — we should never store state in a reference class object because of the risk of it persisting across multiple calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree in general but we do already make an exception for the layer objects which are stateful but we avoid issues by carefully cloning only the stateless parts. So layers are already special
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I confused myself - it is Layout
which is allowed a state, not Layer
...
Hmm...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some thinking I can't come up with a better approach... And while I agree that it is not 100% great, I'd argue that it is 100% safe (within the bounds of what we are discussing).
The two fields carrying state, geom_params_computed
, and stat_params_computed
are never used before they are calculated anew. As such, it doesn't matter if they are reset to NULL
at the end of the rendering because they will always be recalculated and overwritten when a new rendering starts.
The offender in all this is the introduction of state (however inconsequential) to Layer
objects. We cannot avoid some state to be carried around through the different rendering sub-operations. For Coord
and Facet
we have created the Layout
object to manage that state, but we do not have anything similar for managing the state of Layer
objects throughout rendering. Introducing a new stateful class for this seems overkill, so it is either a question of conceding a bit of state to layers or saving them directly in the plot object which doesn't seem better to me...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that reasoning sounds fine to me. I think it just remains to make this code as clear as possible. I'd suggest:
- Giving these new variables a common prefix, so they sort together in the argument list (maybe
state_
)? - Removing the code that resets them to
NULL
, since it doesn't actually do anything, and seems like a red herring that might be confusing in the future
And then add a test to make sure we don't accidentally reintroduce this behaviour in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure — that sounds reasonable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a reminder: coord_sf()
has a similar issue with carrying state through the drawing process, as it needs to record the bounding boxes of all the layers:
Lines 71 to 79 in 5319784
# Allow sf layer to record the bounding boxes of elements | |
record_bbox = function(self, xmin, xmax, ymin, ymax) { | |
bbox <- self$params$bbox | |
bbox$xmin <- min(bbox$xmin, xmin) | |
bbox$xmax <- max(bbox$xmax, xmax) | |
bbox$ymin <- min(bbox$ymin, ymin) | |
bbox$ymax <- max(bbox$ymax, ymax) | |
self$params$bbox <- bbox | |
}, |
Fix #4204
This PR fixed an unnoticed issue (before #4204) where layers kept a new state after rendering which could lead to a change in appearance after the first render.
The issue was introduced when I added the
setup_params()
hook to geoms to make them symmetrical with all other ggproto objects. Because geom params are used in different calls I overwrote the original geom_params field which introduced a state.While fixing this I noticed that
finish_layer()
uses the un-computed stat_params, so I've added the same fix for that as well...The fix is basically to have a second field for the temporary computed params that can be used throughout the rendering. Once it is used the last time it will be reset to
NULL
making layer become borderline stateless again. I say borderline, becausegeom_params
is used in bothggplot_build()
andggplot_gtable()
so calling only the first will leave it with a state. This is not an issue as such since the state is not infectious as it will be recreated every time the plot gets rendered