
Conversation

@kezabelle
Contributor

See #910 for details.
Essentially, try to reduce the number of calls to pformat by storing a string representation of a given context layer and only doing a pformat when we haven't seen it before.

In practice for me on Python 2.7 this makes a massive difference.
In theory the difference is somewhere between the best case I'm presenting and the current pathological situation, because a context which itself contains nested unordered data structures (sets, dictionaries) probably doesn't guarantee a stable output when converted to a string representation, but there's not much that can be done about that, I don't think.

Rendering a user change form in the admin with 50 groups will, in the current pathological case, make 1121 pformat calls at the point where it attempts to put the temp_layer in the context_list, and another 2226 to pformat an individual value within a context layer.

With the proposed changes, the total number of pformat calls becomes 175, which is slightly under the number of templates rendered.

These numbers are only my findings using a single test case, one version of Python, and one operating system (OS X) ... so if anyone wants to try this patch out to independently verify whether or not it has a real-world effect elsewhere, that'd be nice.

@kezabelle
Contributor Author

kezabelle commented Apr 22, 2017

I should add, there's possibly a middleground where the pformat calls are deferred until they're needed in generate_stats instead of happening during collection (_store_template_info), but that would bound the maximum number of calls based on the size of the widget choices (in the problematic widgets like Select etc) so a 20,000 item choice set would still end badly. If this current patch is acceptable, it should mitigate that somewhat, hopefully.
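A rough, hypothetical sketch of that deferred middleground (the class and method names here are illustrative, not the real panel API):

```python
from pprint import pformat

# Hypothetical sketch: store raw context layers during collection and only
# pformat them when stats are generated for display. Collection stays cheap,
# but the total formatting work is still bounded by the size of the data.
class TemplateInfoCollector:
    def __init__(self):
        self._raw_layers = []

    def store_template_info(self, context_layers):
        # No formatting here; just remember the raw layers.
        self._raw_layers.append(list(context_layers))

    def generate_stats(self):
        # The formatting cost is paid once, at display time.
        return [
            [pformat(layer) for layer in layers]
            for layers in self._raw_layers
        ]
```

As the comment notes, this only moves the cost: a 20,000-item choice set would still mean 20,000 pformat calls when the stats are generated.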

@matthiask
Member

(The only failing build on Travis CI was the flake8 check.)

Hey @kezabelle,

Thank you, this looks great!

I'm still worrying a little bit about nested dictionaries; the sorted(...items()) thing will always work for the topmost level, but what about dicts containing dicts, is their stringification somewhat stable or all over the place? Anyway, this wouldn't be a blocker for me, it seems like a micro-optimization.
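To illustrate the concern (a hedged example; set element ordering in a repr can vary between interpreter runs with hash randomization):

```python
# Sorting top-level items gives a stable key for flat dicts built in any
# insertion order...
flat_a = {"b": 2, "a": 1}
flat_b = {"a": 1, "b": 2}
assert str(sorted(flat_a.items())) == str(sorted(flat_b.items()))

# ...but an unordered container nested inside is only stringified, not
# sorted, so str(sorted(nested.items())) still embeds repr({"x", "y"}),
# whose element order is not guaranteed across processes.
nested = {"key": {"x", "y"}}
key = str(sorted(nested.items()))
```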

Is the force_text bit really necessary? Or might it be sufficient to simply use id(value) instead? (That is, are the context dicts copied, or can we maybe use their identity?)

Anyway, that's just nitpicking and if nobody else sees something bad I missed I'll merge this in a few days. Thanks again!

@kezabelle
Contributor Author

I'm still worrying a little bit about nested dictionaries; the sorted(...items()) thing will always work for the topmost level, but what about dicts containing dicts, is their stringification somewhat stable or all over the place.

Yeah, I nodded to that issue in the 'theory' part of the original comment. I wouldn't think it's stable, but it seemed "good enough" from my testing, and certainly better in at least some workloads than the current situation.

might it be sufficient to simply use id(value) instead? That is, are the context dicts copied, or can we maybe use their identity?

This is ... a super good point that I'd entirely overlooked, because I almost never use id().
A cursory test suggests that the dicts themselves, and indeed the Context instances, may well share identity rather than being copies. Only the temp_layer is, by necessity, a copy.
Let me see if I can make that work instead, as it probably offers even better speedups & stability.

@codecov

codecov bot commented Apr 23, 2017

Codecov Report

Merging #933 into master will increase coverage by 0.17%.
The diff coverage is 71.79%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #933      +/-   ##
==========================================
+ Coverage   83.25%   83.42%   +0.17%     
==========================================
  Files          31       31              
  Lines        1660     1671      +11     
  Branches      246      246              
==========================================
+ Hits         1382     1394      +12     
  Misses        199      199              
+ Partials       79       78       -1
Impacted Files Coverage Δ
debug_toolbar/panels/templates/panel.py 88% <71.79%> (+2.03%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e02f132...e0b6f88. Read the comment docs.

@kezabelle
Contributor Author

kezabelle commented Apr 23, 2017

Pushed a new commit on top which, instead of stringifying & hoping for the best, uses the id() of the Context and only re-computes a new context_list (of pformat strings) when it sees a new one (which might be per-widget, per-{% include with x=y only %}, etc.)

This is actually faster still than the previous version (for me), precisely because the Context appears to get re-used a lot. YMMV because of different versions, etc.

edit:

Having thought further on it, using the id() of the Context is probably wrong, as it will remain the same even while pushes and pops happen on it. Using id() on each individual layer (a dict) makes more sense, as pushes will be detected, and pops won't matter. Speed remains improved for me.

edit 2:

Thinking on it further, using id() won't work correctly at all, because it only tells us the memory location of the instance. A dictionary is mutable, and Django widgets seem to make use of that, such that the same id(dict) is seen multiple times but with different key/value contents on subsequent runs. sys.getsizeof would tell us whether the size had changed between examinations of the same dict, but not whether the values themselves had changed (True takes up the same space as False, etc.).
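A tiny illustration of that pitfall:

```python
# Why id() alone is insufficient: if a widget mutates the same context
# dict between renders, the id stays constant while the contents change,
# so an id()-keyed cache would serve a stale formatted version.
layer = {"choice": 1}
first_id = id(layer)

layer["choice"] = 2            # same object, new contents
assert id(layer) == first_id   # id() cannot tell the two states apart
assert layer != {"choice": 1}  # ...but the value really did change
```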

I have one remaining idea to try, using a pair of lists to track the data structures via equality checks. Failing that, it may be that stringifying offers the best trade-off: it saves some/many/most pformat calls, but does a fresh one anyway if there's any chance the data is different (even if only because of an unordered repr).

…et rendering, the number of context values which need prettifying has increased drastically, which can (given enough template contexts to render) cause pages using the template panel to become essentially unresponsive due to the sheer amount of data.

Instead, when first seeing a new context layer, keep a copy of it plus the formatted version, and when subsequently seeing the exact same layer (relying on the implicit __eq__ calls the list's __contains__ is expected to make), re-use the existing formatted version.

This cuts the number of pformat calls by at least half, and ultimately (far) more than that due to re-use of the same result where possible.
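A minimal sketch of what the commit describes, with illustrative names (not the actual panel code):

```python
from pprint import pformat

# Keep parallel lists of seen layers and their formatted output. List
# membership (`in`) compares elements with __eq__, so a layer that is
# equal by value to one seen before reuses the earlier pformat result.
seen_layers = []
formatted = []

def format_layer(layer):
    if layer in seen_layers:              # __contains__ -> __eq__ per element
        return formatted[seen_layers.index(layer)]
    seen_layers.append(dict(layer))       # store a copy, so later in-place
    formatted.append(pformat(layer))      # mutation of `layer` is detected
    return formatted[-1]
```

Because a copy is stored, a widget mutating the same dict in place produces a layer that no longer compares equal, which correctly triggers a fresh pformat rather than serving a stale one.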
@kezabelle
Contributor Author

Right. I've ditched the id() idea (see previous comment edits) because unfortunately it leads to incorrect contexts being rendered.
I've replaced the string version originally proposed in the ticket: instead of stringifying & hoping for the best, I'm now storing the layers in a list and relying on the list's __contains__ doing an __eq__ on each dictionary to determine whether anything ought to be considered changed. I think that's a more sensible solution? Ostensibly this oughtn't to be any worse than the stringifying approach -- both may do more re-renders than absolutely necessary, but that's better than doing fewer!

For posterity's sake, the previous attempts remain in a separate branch:

  • the stringify version is still available in 3a18bff
  • the id(...) attempts are available as 6154046 and 82a7277

@aaugustin
Contributor

Thanks for working on this!

@matthiask matthiask merged commit e0b6f88 into django-commons:master May 2, 2017
@matthiask
Member

Thank you! That's awesome.
