Skip to content

Ensure stateless layer instances #4448

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 3, 2021
Merged

Conversation

thomasp85
Copy link
Member

Fix #4204

This PR fixed an unnoticed issue (before #4204) where layers kept a new state after rendering which could lead to a change in appearance after the first render.

The issue was introduced when I added the setup_params() hook to geoms to make them symmetrical with all other ggproto objects. Because geom params are used in different calls I overwrote the original geom_params field which introduced a state.

While fixing this I noticed that finish_layer() uses the un-computed stat_params, so I've added the same fix for that as well...

The fix is basically to have a second field for the temporary computed params that can be used throughout the rendering. Once it is used the last time it will be reset to NULL making layer become borderline stateless again. I say borderline, because geom_params is used in both ggplot_build() and ggplot_gtable() so calling only the first will leave it with a state. This is not an issue as such since the state is not infectious as it will be recreated every time the plot gets rendered

@thomasp85 thomasp85 added this to the ggplot2 3.3.4 milestone Apr 26, 2021
@thomasp85 thomasp85 requested a review from hadley April 26, 2021 08:55
R/layer.r Outdated
@@ -363,7 +365,9 @@ Layer <- ggproto("Layer", NULL,
},

finish_statistics = function(self, data) {
self$stat$finish_layer(data, self$stat_params)
params <- self$stat_params_computed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better but not technically 100% safe; what happens if there's an uncaught error between compute_statistic() and finish_statistic()? I assume you tried turning these into function arguments, but there's too many functions to thread through?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean that the self$stat_params_computed field would linger?

I don't see this as such a problem (pragmatically), since it is calculated fresh every time the plot is rendered and the old value will thus never influence any future render... I reset it to NULL mostly out of principle, not that it has any practical effect

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is possible to avoid storing the computed params somewhere in the object since it is used in disparate places (and in the case of geom_params both in ggplot_build() and ggplot_gtable())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's just the principle that lead to this mistake in the first place — we should never store state in a reference class object because of the risk of it persisting across multiple calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in general but we do already make an exception for the layer objects which are stateful but we avoid issues by carefully cloning only the stateless parts. So layers are already special

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I confused myself - it is Layout which is allowed a state, not Layer...

Hmm...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some thinking I can't come up with a better approach... And while I agree that it is not 100% great, I'd argue that it is 100% safe (within the bounds of what we are discussing).

The two fields carrying state, geom_params_computed, and stat_params_computed are never used before they are calculated anew. As such, it doesn't matter if they are reset to NULL at the end of the rendering because they will always be recalculated and overwritten when a new rendering starts.

The offender in all this is the introduction of state (however inconsequential) to Layer objects. We cannot avoid some state to be carried around through the different rendering sub-operations. For Coord and Facet we have created the Layout object to manage that state, but we do not have anything similar for managing the state of Layer objects throughout rendering. Introducing a new stateful class for this seems overkill, so it is either a question of conceding a bit of state to layers or saving them directly in the plot object which doesn't seem better to me...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that reasoning sounds fine to me. I think it just remains to make this code as clear as possible. I'd suggest:

  • Giving these new variables a common prefix, so they sort together in the argument list (maybe state_)?
  • Removing the code that resets them to NULL, since it doesn't actually do anything, and seems like a red herring that might be confusing in the future

And then add a test to make sure we don't accidentally reintroduce this behaviour in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure — that sounds reasonable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a reminder: coord_sf() has a similar issue with carrying state through the drawing process, as it needs to record the bounding boxes of all the layers:

ggplot2/R/coord-sf.R

Lines 71 to 79 in 5319784

# Allow sf layer to record the bounding boxes of elements
record_bbox = function(self, xmin, xmax, ymin, ymax) {
bbox <- self$params$bbox
bbox$xmin <- min(bbox$xmin, xmin)
bbox$xmax <- max(bbox$xmax, xmax)
bbox$ymin <- min(bbox$ymin, ymin)
bbox$ymax <- max(bbox$ymax, ymax)
self$params$bbox <- bbox
},

@thomasp85 thomasp85 requested a review from hadley April 29, 2021 08:12
@thomasp85 thomasp85 merged commit 07ddd0c into master May 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ggsave inproperly rendering geom_line with factored y-variable
3 participants