Skip to content

Conversation

@lmmx
Copy link

@lmmx lmmx commented Oct 12, 2025

Summary

When the preserve_order feature is enabled, JSON objects now use IndexMap instead of HashMap to maintain insertion order. This is visible for objects with more than 32 keys, where halfbrown switches from Vec to HashMap.

Key changes

  • Conditionally define Object type alias based on preserve_order feature
  • Update ObjectHasher to use RandomState for IndexMap compatibility
  • Add ObjectMap type alias to abstract over HashMap/IndexMap
  • Fix FromIterator impls to explicitly use hasher for IndexMap
  • Update all deserializer structs to support both map types
  • Add conditional imports for indexmap throughout codebase

Fixes ordering issue for objects with 33+ keys when preserve_order
feature is enabled.

Success 🎉

Tests pass both with and without the new feature flag

cargo nextest run -F preserve_order -E 'test(preserve_order)'
    Starting 8 tests across 7 binaries (448 tests skipped)
        PASS [   0.003s] simd-json value::borrowed::test::preserve_order_roundtrip_33_keys
        PASS [   0.003s] simd-json value::borrowed::test::preserve_order_32_keys_baseline
        PASS [   0.003s] simd-json value::borrowed::test::preserve_order_33_keys
        PASS [   0.003s] simd-json value::borrowed::test::preserve_order_50_keys
        PASS [   0.002s] simd-json value::owned::test::preserve_order_33_keys
        PASS [   0.003s] simd-json value::owned::test::preserve_order_roundtrip_33_keys
        PASS [   0.003s] simd-json value::owned::test::preserve_order_32_keys_baseline
        PASS [   0.004s] simd-json value::owned::test::preserve_order_50_keys
────────────
     Summary [   0.037s] 8 tests run: 8 passed, 448 skipped

Full suite:

  • No features
     Summary [   0.926s] 448 tests run: 448 passed, 2 skipped
  • -F preserve_order
     Summary [   0.960s] 454 tests run: 454 passed, 2 skipped

Rewrite

Components implemented

  • cmp
  • from
  • de
  • se
  • lazy
  • tape

✔️ Compatible with all features

Status

  • A lot of heavy lifting has been done, some of the initial "feature switching" approach is still in there, most notably in the ObjectMap which is used in the generic deserialize functions. These functions need to work with both ordered and non-ordered types, depending on what the caller wants. So de/serialisation will either need to get additional separate deserialise functions for ordered types, or the existing ones should work generically with both.

Copy link
Member

@Licenser Licenser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This completely ignores the discussion in the issue

@lmmx
Copy link
Author

lmmx commented Oct 12, 2025

Sorry Heinz, what do you mean? I outlined my approach here as 2 steps, and you accepted the PR in value-trait with the first one, which this uses as its follow-on 2nd step. What part did I miss?

Do you mean this comment?

Ja, I prefer a new type for the index map backed value over it switching to allow the use of multiple map backends in the same project without feature flag conflicts or surprising changes in behavior. An alternative to OrderedBorrowedValue would be parameterizing BorrowedValue with a map backend not sure if that would improve things or make then tougher 😅 ultimately I think both are okay.

I took that last sentence to mean that the way I had outlined was okay, is this not what you had in mind?

edit oh, sorry I misread that as saying the approach I had or an alternative parameterising the type would both be okay, I see now you were actually suggesting 2 alternatives. Gotcha

I submitted this PR 6 hours ago and you replied 4 hours ago so I wasn't ignoring your comments, just looks like I implemented the wrong way 😞

@lmmx lmmx marked this pull request as draft October 13, 2025 17:16
@lmmx lmmx force-pushed the indexmap-support branch 9 times, most recently from 450cfed to 75b7c62 Compare October 13, 2025 20:02
@lmmx

This comment was marked as resolved.

@lmmx lmmx force-pushed the indexmap-support branch 3 times, most recently from 18a1c88 to f8c939a Compare October 13, 2025 21:12
@lmmx lmmx marked this pull request as ready for review October 13, 2025 21:16
@lmmx lmmx force-pushed the indexmap-support branch from f8c939a to d99ead1 Compare October 13, 2025 21:16
@lmmx lmmx force-pushed the indexmap-support branch from d99ead1 to 696b542 Compare October 13, 2025 21:31
@lmmx lmmx marked this pull request as draft October 13, 2025 22:20
Copy link
Member

@Licenser Licenser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly we're living in a time where I've to write reviews like this but I'm getting the strong feeling you're throwing an AI at this problem without thinking or reviewing your own PR.

If that's the case please just close the PR, I'm neither interested nor willing in reviewing 7000 LOC AI slop nor taking on the maintenance burden of it

let v_simd_borrowed = to_borrowed_value(d3).expect("to_borrowed_value failed");
assert_eq!(v_simd_borrowed, v_simd_owned);
let v_deserialize: OwnedValue = deserialize(d4).expect("deserialize failed");
let v_deserialize: OwnedValue = deserialize::<OwnedValue, String>(d4).expect("deserialize failed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What change does require adding the types here? I'm somewhat concerned that this might become required for callers and affect the API negatively.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm I agree, I think this came up while iterating on either the new ordered Value serde de.rs impls or ordered/from.rs ones (and making sure they didn't clash and give "duplicate impl" error)

I just tried removing it and it gave this, that was the cargo directive I followed:

error[E0283]: type annotations needed                
    --> src/tests/serde.rs:1011:45                                                                                                                                                                                    
     |                                                                                                                                                                                                                
1011 |             let v_deserialize: OwnedValue = deserialize(d4).expect("deserialize failed");                                                                                                                      
     |                                             ^^^^^^^^^^^ cannot infer type of the type parameter `Key` declared on the function `deserialize`                                                                   
     |                                                                                                     
     = note: cannot satisfy `_: Hash`                
note: required by a bound in `value::deserialize`                                                          
    --> src/value.rs:116:10                                                                                
     |                                                                                                     
113  | pub fn deserialize<'de, Value, Key>(s: &'de mut [u8]) -> Result<Value>                              
     |        ----------- required by a bound in this function                                             
...                                                                                                        
116  |     Key: Hash + Eq + From<&'de str>,                                                                
     |          ^^^^ required by this bound in `deserialize`                                               
help: consider specifying the generic arguments                                                                                                                                                                       
     |                                                                                                                                                                                                                
1011 |             let v_deserialize: OwnedValue = deserialize::<value::owned::Value, Key>(d4).expect("deserialize failed");                                                                                          
     |                                                        ++++++++++++++++++++++++++++                                                                                                                            
                                                     
error[E0283]: type annotations needed                
    --> src/tests/serde.rs:1011:45                                                                                                                                                                                    
     |                                                                                                                                                                                                                
1011 |             let v_deserialize: OwnedValue = deserialize(d4).expect("deserialize failed");                                                                                                                      
     |                                             ^^^^^^^^^^^ cannot infer type of the type parameter `Key` declared on the function `deserialize`                                                                   
     |                                                                                                     
     = note: multiple `impl`s satisfying `_: From<&str>` found in the `proptest` crate:                    
             - impl From<&'static str> for Reason;                                                         
             - impl From<&'static str> for StringParam;                                                    
note: required by a bound in `value::deserialize`                                                          
    --> src/value.rs:116:22                                                                                
     |                                                                                                     
113  | pub fn deserialize<'de, Value, Key>(s: &'de mut [u8]) -> Result<Value>                              
     |        ----------- required by a bound in this function                                             
...                                                                                                        
116  |     Key: Hash + Eq + From<&'de str>,                                                                
     |                      ^^^^^^^^^^^^^^ required by this bound in `deserialize`                         
help: consider specifying the generic arguments                                                                                                                                                                       
     |                                                                                                                                                                                                                
1011 |             let v_deserialize: OwnedValue = deserialize::<value::owned::Value, Key>(d4).expect("deserialize failed");                                                                                          
     |                                                        ++++++++++++++++++++++++++++                                                                                                                                                                                                                                   

Comment on lines +97 to +102
/// Hashmap used for objects
#[cfg(not(feature = "preserve_order"))]
type ObjectMap<K, V> = HashMap<K, V, ObjectHasher>;
/// Hashmap used for objects
#[cfg(feature = "preserve_order")]
type ObjectMap<K, V> = IndexMap<K, V, ObjectHasher>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no.

Copy link
Author

@lmmx lmmx Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware! This is left over from the first rewrite as already discussed.

PR is WIP, hence I put it back in draft mode didn't mean for it to be reviewed yet. See comment

@lmmx
Copy link
Author

lmmx commented Oct 14, 2025

Sadly we're living in a time where I've to write reviews like this but I'm getting the strong feeling you're throwing an AI at this problem without thinking or reviewing your own PR.

If that's the case please just close the PR, I'm neither interested nor willing in reviewing 7000 LOC AI slop nor taking on the maintenance burden of it

I am not "just throwing an AI at this problem without thinking or reviewing your own PR", no. Ouch 🥲

I understand what I'm doing, and the draft status of this PR I put back on it after your initial review was intended to communicate 'Work In Progress', sorry I could've made that more clear.

some of the initial "feature switching" approach is still in there

Specifically this is where the PR description noted that this is WIP

I spent yesterday rewriting from the easy initial proof of concept (simple feature gated approach which swapped the hashmap, that was simple to implement and worked but the wrong design) to the new one (adding the indexmap so it won't interfere with the existing one, that requires more extensive changes). The main difference being it needs all new impls (for the value: cmp, from; for the serde value: de, se).

Code production and editing has been done by me, incrementally moving from the feature gated swapping to the feature gated addition style not automated with an AI agent.

I rolled back the original commits and re-pushed because at midnight I was done for the night having made good progress, not because I was finished/ready to review.

The size of the code additions has mainly come from plain old cp tool copying files (+ rg, sed, vim etc.), as introducing new types requires duplicating impls etc. That makes sense in this context to do, because the super::Value when applied to a submodule under ordered:: can simply refer to the ordered Value type and leave it all the same (e.g. this goes for the cmp and from modules). We already discussed doing this, please let me know if you think the code growth here is an undesirable maintenance burden but simply diffing them against their existing corresponding files should show they are nothing genuinely new. Refactoring said duplicated modules is of course possible to DRY this out, but I felt the first port of call should just be duplicating in standalone modules, and figuring out better ways on further iterations.

I have written a checklist of the components that were successfully ported (all except lazy/tape) because this is a big undertaking. The last thing I did here was notice that I had indeed left the ObjectHasher (as you also noted in review).

  • I was also meaning to ask whether or not making lazy/tape support indexmap should be considered necessary for this PR to land

The last thing I did last night was leave a "note to self" to address that very thing:

Screenshot from 2025-10-14 11-39-33

In short I put this PR back to a draft PR because it is not ready after I initially took it in the wrong direction (as already discussed), and was not expecting you to review. I woke up this morning and the first thought I had upon waking was about the next step to implement, I am absolutely not "not thinking" or not invested in doing this right.

@Licenser
Copy link
Member

Sorry for the rudeness @lmmx

@lmmx
Copy link
Author

lmmx commented Oct 14, 2025

Aha it's all good 😅 🙏

@lmmx lmmx changed the title indexmap support [DRAFT] indexmap support Oct 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants