Skip to content

Conversation

@bukka
Copy link
Member

@bukka bukka commented Nov 29, 2022

This PR changes the JSON serialized object protection logic to use in most cases protection on object which should be more performent and reduce the memory usage. It's a done in the way so it is still possible to support nested calls with debug functions and var_export. It means

  1. use json_encode inside __debugInfo on the same object
  2. use var_dump, debug_zval_dump and print_r inside jsonSerialize on the same object.
  3. use var_export inside __debugInfo or jsonSerialize on the same object.

See various test cases that implements those combinations (should be clearer from it).

To support 1st case, the two level protection for JSON serializable object is introduced. Primarily it tries to set protection on an object and only if it's already protected, it falls back to protecting the properties hash table. In most cases it will just use protection on the object and only if called inside __debugInfo, it will use the properties hash table protection. It results in a slight output change if there is a recursion as it catches the recursion on the next level (see diff below for ext/json/tests/bug61978.phpt). I think this should be acceptable and after thinking about it, such output makes maybe even more sense because it actually shows that there's some kind of recursion rather than just a null value but this might be subjective.

To support 2nd case, a special flag on object has been introduced. The reason for that is that it might not be reliable to do protection on the result of debug info when the result is temporary. If I understand the code correctly, it might happen on the reference case. It would also leave more edge cases when all 3 approaches are combined (see test case json_encode_recursion_04.phpt) which couldn't fully work as fallback cover just 2 levels.

To support 3rd case, the protection needs to be again done on the returned properties table. For var_export it is mostly the properties table so it should be mostly reliable. The exception is the SplFixedArray that got changed in #9757 . I think we could still leave this optimization for debug functions but for var_export it might be better to fallback to the old behavior. The thing is that the change for protection in var_export introduced in #9448 broke using var_export in __debugInfo which previously worked and it is in some way a regression. I haven't implemented the changes for SplFixedArray yet as I would like to first hear other opinions about it.

@dstogov @TysonAndre @nikic what do you think about this? Does it make sense to you or do you think it should be done differently?

@bukka
Copy link
Member Author

bukka commented Nov 30, 2022

Just a note that there might be similar issue with FFI and var_export that I'd of course look at too. Basically all XFAIL tests marked here will get resolved.

#define GC_IMMUTABLE (1<<6) /* can't be changed in place */
#define GC_PERSISTENT (1<<7) /* allocated using malloc */
#define GC_PERSISTENT_LOCAL (1<<8) /* persistent, but thread-local */
#define GC_DEBUG_PROTECTED (1<<9) /* whether the protection is for debug functions */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes conflict with IS_OBJ_FREE_CALLED.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry I completely missed that it's reused by obj flags. I just had a quick try to do

diff --git a/Zend/zend_types.h b/Zend/zend_types.h
index 432c79f35a..9c4cf21ec1 100644
--- a/Zend/zend_types.h
+++ b/Zend/zend_types.h
@@ -636,9 +636,9 @@ static zend_always_inline zend_uchar zval_get_type(const zval* pz) {
 
 #define GC_TYPE_MASK                           0x0000000f
 #define GC_FLAGS_MASK                          0x000003f0
-#define GC_INFO_MASK                           0xfffffc00
+#define GC_INFO_MASK                           0xfffff800
 #define GC_FLAGS_SHIFT                         0
-#define GC_INFO_SHIFT                          10
+#define GC_INFO_SHIFT                          11
 
 static zend_always_inline zend_uchar zval_gc_type(uint32_t gc_type_info) {
        return (gc_type_info & GC_TYPE_MASK);
@@ -677,11 +677,11 @@ static zend_always_inline uint32_t zval_gc_info(uint32_t gc_type_info) {
 
 /* zval_gc_flags(zval.value->gc.u.type_info) (common flags) */
 #define GC_NOT_COLLECTABLE                     (1<<4)
-#define GC_PROTECTED                (1<<5) /* used for recursion detection */
-#define GC_IMMUTABLE                (1<<6) /* can't be changed in place */
-#define GC_PERSISTENT               (1<<7) /* allocated using malloc */
-#define GC_PERSISTENT_LOCAL         (1<<8) /* persistent, but thread-local */
-#define GC_DEBUG_PROTECTED          (1<<9) /* whether the protection is for debug functions */
+#define GC_PROTECTED                (1<<5)  /* used for recursion detection */
+#define GC_IMMUTABLE                (1<<6)  /* can't be changed in place */
+#define GC_PERSISTENT               (1<<7)  /* allocated using malloc */
+#define GC_PERSISTENT_LOCAL         (1<<8)  /* persistent, but thread-local */
+#define GC_DEBUG_PROTECTED          (1<<10) /* whether the protection is for debug functions */
 
 #define GC_NULL                                                (IS_NULL         | (GC_NOT_COLLECTABLE << GC_FLAGS_SHIFT))
 #define GC_STRING                                      (IS_STRING       | (GC_NOT_COLLECTABLE << GC_FLAGS_SHIFT))

It's likely missing some things as it's breaking something in GC because it shows leaks so I would need to investigate / debug it more but before I do that and spend more time on this, I wanted to check if you would be fine in general with extending common flags to 11 or you think there's a better solution for this? Basically I just don't want to waste time on something that you think is not the right solution.

--EXTENSIONS--
ffi
--XFAIL--
var_export now expect consistent properties
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduction of segfaults in existing tests is unacceptable of course.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this is just a draft so I would of course fix all the issues before it's ready. I mainly wanted to check if the approach looks fine to you before I finalize it.

@bukka bukka force-pushed the json_serializable_obj_protection branch from c9ff86d to 72ff655 Compare February 10, 2023 20:29
@dstogov
Copy link
Member

dstogov commented Feb 13, 2023

I wouldn't change GC related mask without a very big reason.

As I understood, the main reason of this PR is allowing nested calls of var_dump from jsonSerialize and different variations.

Can this be fixed by remembering and reseting GC_PROTECTED flags at start of all serializers and restoring original value at the end?
Probably, this could fix only a part of the problem.

@bukka
Copy link
Member Author

bukka commented Feb 17, 2023

@dstogov It's actually more about optimizing jsonSerialize that currently uses Z_OBJPROP_P(val) and does protection on hash table. As you know, the problem is that the hashtable needs to be often created and it just requires more resources. We can't currently switch to do protection on object because it would not be possible to call var_dump inside jsonSerialize. This PR will fix that. What I also noticed when I was looking into this is that there are related issues with protection used in __debugInfo and recent changes from Tyson caused some regressions in that area (breaking the current behaviour) so this PR will also fix those cases. It means that we might need otherwise revert his changes linked above which would eliminate some other optimizations.

I'm not sure how remembering GC_PROTECTED could work. If we took just a simple case of calling var_dump inside jsonSerialize, then jsonSerialize would place the GC_PROTECTED on the object and var_dump would see the object as protected but it cannot know that it came from jsonSerialize or from __debugInfo function. What I'm doing in some cases is falling back to hash table but that's not enough for __debugInfo combinations. We need somehow to keep extra info where it came from. Alternatively we could do some heavy lifting in HashTable fallback and extend HashTable in some way that would allow for additional flags but that's just a quick idea and not sure if this would work and didn't cause slowdown. I can look into it if you think it's better way to go? Or if you have different idea, happy to look into it too.

Just for my interest, is the new flag really that problematic? From what I saw it just reduces the GC address counter to 19 bits which should still be sufficient. Or is there anything else that I missed?

@dstogov
Copy link
Member

dstogov commented Feb 20, 2023

@dstogov It's actually more about optimizing jsonSerialize that currently uses Z_OBJPROP_P(val) and does protection on hash table. As you know, the problem is that the hashtable needs to be often created and it just requires more resources.

Can you please show, what particular case is optimized. If we have recursion json_encode is failed anyway, otherwise we anyway need the hashtable. I tried a simple benchmark, but it didn't show any improvement even with callgrind.

<?php
class SerializingTest implements JsonSerializable
{
    public function jsonSerialize(): mixed
    {
        return (object)['a' => (object)['a' => (object)['a' => $this->a]]];
    }
}

$o = new SerializingTest();
for ($i = 0; $i < 1000000; $i++) {
        $s = json_encode($o);
}
var_dump($s);

Just for my interest, is the new flag really that problematic? From what I saw it just reduces the GC address counter to 19 bits which should still be sufficient. Or is there anything else that I missed?

Reducing limits of GC is definitely bad and should be done with extreme care. You actually, didn't change all the GC related limits (like GC_MAX_UNCOMPRESSED). Reducing this single bit from GC address would limit number of "uncomressed" possible GC roots to 256K. So the GC would get significant slowdown, once a php app would create graph with more than 256K object. For some apps this may become a problem.

@bukka
Copy link
Member Author

bukka commented Feb 20, 2023

@dstogov This https://github.com/php/php-src/pull/10020/files#diff-7d42a64925fbe4e298b5fcf928cb3994e4e021774503190b850be7d01d181be9R556-R557 . It's basically to prevent building of properties table when serializing JsonSerializable object. I assume that your example doesn't show any differences because there are no properties defined in it. If you add bunch, you might see some difference potentially.

Essentially this was first reported here: https://bugs.php.net/bug.php?id=81524 . Then Nikita tried to fix it in #7589 but that resulted in significant slow down so it was closed.

@dstogov
Copy link
Member

dstogov commented Feb 20, 2023

OK. Now I finally understood the root of the first problem.
In my test the property table was rebuilt once. On the following tests I see ~5% improvement,

<?php
class SerializingTest implements JsonSerializable
{
    public $a = 1;
    public function jsonSerialize(): mixed
    {
        return (object)['a' => (object)['a' => (object)['a' => $this->a]]];
    }
}

for ($i = 0; $i < 1000000; $i++) {
        $s = json_encode(new SerializingTest());
}
var_dump($s);

Unfortunately, I don't have a good idea about a general fix. I'll try to think a bit more.

@bukka
Copy link
Member Author

bukka commented Mar 24, 2023

@dstogov I have been thinking about an alternative solution for this and got an idea how to possibly resolve it. I think we could keep a single flag for the GC protection but the semantic of that would slightly change if it is applied on object. It would just mean that it was protected by any of the sources (var_dump, json_encode...) but if it is set as protected it would require getting the HashTable (using Z_OBJPROP_P) and then it would do another GC protection check for the specific source. We could use currently unused flags in

uint8_t _unused,
to mark the source assuming this is really free? If it was not protected, it would set the protection on the hash table and carry on. If it was protected that it would mark it as recursion and end. It means that the recursion would be found on the second run but that should not be really a problem IMHO. I will try to depict the logic on the following diagram so it hopefully makes more sense as it is a bit hard to describe it in text.

graph TD;
    CheckProtection --> IsObjGCProtected{Is object GC protected?}
    IsObjGCProtected -->|no| ProtectObj{Protect object using GC flag}
    IsObjGCProtected -->|yes| GetHT{Get prop HashTable from object}
    ProtectObj --> ExecutePC{Execute protected code}
    GetHT --> IsHTProtected{Is HashTable protected with source flag?}
    IsHTProtected -->|no| ProtectHT{Protect object using GC flag}
    IsHTProtected -->|yes| Recursion{End because Recursion}
    ProtectHT --> ExecutePC
    ExecutePC --> IsHTPresent{Is HashTable present?}
    IsHTPresent -->|yes| UnprotectHT{Unprotect HashTable}
    IsHTPresent -->|no| UnprotectObj{Unprotect object}
    UnprotectObj --> End
    UnprotectHT --> End
Loading

For majority of use case that HashTable branch would not need to be used as that would be used only for Recursion or nested interactions so those cases are much less likely.

Does it make sense to you and if so, do you see any issues with that?

@dstogov
Copy link
Member

dstogov commented Mar 27, 2023

I didn't understand your proposal in all details yet, but looking into the block-scheme I think that the behavior of var_dump() and others is going to be clanged. Now the "recursive objects" are going to be processed twice, before the recursion is detected. Right?

@dstogov
Copy link
Member

dstogov commented Mar 27, 2023

As I see, one of the main problems is the space required for the additional "protection" bit(s).

I think it may be possible to reuse hight bits in Z_PROPERTY_GUARD().
Classes that implements __get/__set/__unset set ZEND_ACC_USE_GUARDS flag in ce_flags. This leads to allocation of an additional property that is managed by zend_get_property_guard() function. Only 4 low bits of Z_PROPERTY_GUARD() are used now.

When JsonSerializable is implemented, ZEND_ACC_USE_GUARDS should be set.
Z_PROPERTY_GUARD(zobj->properties_table[zobj->ce->default_properties_count]) & SOME_BIT_ABOVE_4 might be used as recursion protection code for JSON.
zend_get_property_guard() needs to be updated to keep Z_PROPERTY_GUARD() bits during UNDEF->STRING->ARRAY conversion.

@bukka
Copy link
Member Author

bukka commented Mar 28, 2023

Yeah the initial idea was to do it twice. We used to have that behavior in JSON before and it is not such a big issue - it just slightly changes output but it is not such a big problem. But of course it is better not to do that so your proposal is much better! I will give it a try hopefully next or following month as this is kind lower priority thing for me but would like to get it sorted before the first beta. I will ping you then when I have got a working PR. Thanks!

@bukka
Copy link
Member Author

bukka commented Mar 28, 2023

Closing as this solution won't be used in any case.

@bukka bukka closed this Mar 28, 2023
bukka added a commit that referenced this pull request Aug 24, 2023
This PR introduces a new way of recursion protection in JSON, var_dump
and friends. It fixes issue in master for __debugInfo and also improves
perf for jsonSerializable in some cases. More info can be found in
GH-10020.

Closes GH-11812
jorgsowa pushed a commit to jorgsowa/php-src that referenced this pull request Aug 24, 2023
This PR introduces a new way of recursion protection in JSON, var_dump
and friends. It fixes issue in master for __debugInfo and also improves
perf for jsonSerializable in some cases. More info can be found in
phpGH-10020.

Closes phpGH-11812
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants