perf: p-token optimize pubkey cmp #64

ananas-block · 2025-06-29T22:31:34Z

Issue

p-token uses partial_eq to compare pubkeys, which is not optimal.
spl-token uses sol_memcmp, which is better than partial_eq (program/src/processor.rs).

A custom pubkey_eq implementation can improve the performance of many p-token instructions by tens of CU (see the table below).

Instruction	p-token main	sol_memcmp	pubkey_eq	Savings (pubkey_eq - main)
InitializeMint	148 CU	144 CU	143 CU	-5 CU
InitializeAccount	228 CU	227 CU	225 CU	-3 CU
Mint	187 CU	174 CU	148 CU	-39 CU
Transfer	189 CU	175 CU	144 CU	-45 CU
TransferChecked	256 CU	-	186 CU	-70 CU
Batch	1,691 CU	-	1,598 CU	-93 CU

To reproduce the CU measurements, see the tx logs of the transfer, transfer_checked, and batch tests in branches:

Changes:

Introduce pubkey_eq: cast the 32-byte arrays to four u64 chunks, compare them in a loop, and exit early on the first unequal chunk.
Replace pubkey comparisons with pubkey_eq.
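A minimal sketch of the chunked comparison described above, not the PR's exact code. The fragment quoted later in this thread shows the PR building u64 slices with from_raw_parts; this sketch swaps in read_unaligned so that it makes no alignment assumption. The Pubkey alias is an assumption for illustration, and the CU cost of this variant on SBF was not measured here.

```rust
/// A 32-byte public key modeled as a plain byte array (assumed for illustration).
pub type Pubkey = [u8; 32];

/// Compares two pubkeys as four u64 words and returns on the first mismatch.
#[inline(always)]
pub fn pubkey_eq(a: &Pubkey, b: &Pubkey) -> bool {
    let a_ptr = a.as_ptr() as *const u64;
    let b_ptr = b.as_ptr() as *const u64;
    for i in 0..4 {
        // SAFETY: both pointers come from 32-byte arrays, so word offsets
        // 0..4 stay in bounds. read_unaligned has no alignment requirement.
        let (x, y) = unsafe {
            (
                a_ptr.add(i).read_unaligned(),
                b_ptr.add(i).read_unaligned(),
            )
        };
        if x != y {
            return false;
        }
    }
    true
}
```

Only equality is tested, so the byte order of each word does not matter.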

Notes:

u64 chunks perform significantly better than u16, u32, or u128 chunks (see this repo).
I have only run the transfer, transfer_checked, and batch tests (building all tests doesn't work with 32 GB of RAM). It is unlikely that the performance of other instructions decreased with this change, but it is best to double-check.
It may make sense to upstream this to pinocchio for use in AccountInfo::is_owned_by.

Versions:
solana-cargo-build-sbf 2.2.15
platform-tools v1.48
rustc 1.84.1

joncinque · 2025-06-30T14:56:33Z

@febo can you look at this? I thought the compiler did the right thing here, but it might not be the case

febo · 2025-06-30T16:45:41Z

> @febo can you look at this? I thought the compiler did the right thing here, but it might not be the case

Interesting, this is what the compiler should be doing – I will check with @LucasSte. It might depend on what platform tools version is being used to benchmark.

febo · 2025-07-01T09:17:31Z

On a small test program using platform-tools 1.48 (solana cli 2.2.15), pubkey comparisons are done using the memcmp syscall. I am not sure whether the compiler is generating bloated code in the case of p-token.

I am not sure we can use the u64 chunks approach, since it requires the memory to be 8-byte aligned, which we cannot guarantee in every case:

let a_chunks = unsafe { from_raw_parts(a.as_ptr() as *const u64, 4) };
let b_chunks = unsafe { from_raw_parts(b.as_ptr() as *const u64, 4) };
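To make the alignment concern concrete, here is a small sketch that checks whether a pubkey reference sits on an 8-byte boundary. The names AlignedBuf, key_at, and is_8_byte_aligned are hypothetical helpers for illustration and are not part of p-token or pinocchio.

```rust
use std::convert::TryFrom;

/// A 32-byte public key modeled as a plain byte array (assumed for illustration).
pub type Pubkey = [u8; 32];

/// Returns true when the key's address is a multiple of 8.
pub fn is_8_byte_aligned(key: &Pubkey) -> bool {
    (key.as_ptr() as usize) % 8 == 0
}

/// A buffer forced to 8-byte alignment, standing in for account data.
#[repr(C, align(8))]
pub struct AlignedBuf(pub [u8; 64]);

/// Borrows the 32 bytes starting at `offset` as a pubkey reference.
/// Panics if `offset + 32` exceeds the buffer length.
pub fn key_at(buf: &AlignedBuf, offset: usize) -> &Pubkey {
    <&Pubkey>::try_from(&buf.0[offset..offset + 32]).expect("slice is 32 bytes")
}
```

A key at offset 0 or 8 of the buffer passes the check, while a key at offset 4 or 5 does not. Casting the latter to *const u64 and reading through a slice, as in the two lines above, would violate the alignment requirement of from_raw_parts.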

Another interesting aspect: with platform-tools 1.43 (solana cli 2.1.22), CUs improve significantly for the same instruction without any code change, e.g., transfer goes from 189 to 153.

ananas-block · 2025-07-01T17:26:09Z

True regarding alignment; I put the PR in draft.

AccountInfos should be aligned to u64, right?

If so, alignment should be guaranteed for all uses of pubkey_eq except one:

 else if !pubkeys_eq(destination_account_info.key(), &INCINERATOR_ID)

FYI, I got similar CU results in the p-token tests with the following implementation, but it performed significantly worse than this PR's implementation in my benchmark repo over 1k iterations.

#[inline(always)]
pub fn pubkeys_eq(a: &Pubkey, b: &Pubkey) -> bool {
    let a_bytes = a.as_ref();
    let b_bytes = b.as_ref();

    // Compare 8 bytes at a time using safe array slicing
    let a_chunks = [
        &a_bytes[0..8],
        &a_bytes[8..16],
        &a_bytes[16..24],
        &a_bytes[24..32],
    ];
    let b_chunks = [
        &b_bytes[0..8],
        &b_bytes[8..16],
        &b_bytes[16..24],
        &b_bytes[24..32],
    ];

    for i in 0..4 {
        if a_chunks[i] != b_chunks[i] {
            return false;
        }
    }
    true
}

LucasSte · 2025-07-01T18:09:12Z

It is not clear which version of platform tools was used for these benchmarks. The numbers for the latest released version, v1.50, are in the table below. They are lower than the base for the benchmark, but still do not match the optimized version for this PR.

The compiler has a threshold for when to transform a chain of loads and compares into a syscall. Currently, the threshold is 3 sequences, where each sequence consists of two loads and one compare. Thus, 3*3 = 9 CUs, while the syscall overhead is 10 CUs.

That does not account for the argument setup we must do before invoking the syscall: we must adjust the three registers that hold the arguments for memcmp. If we account for that and raise the threshold to 4 sequences (8 loads and 4 compares = 12 CUs), we achieve very good results (see the third column in the table).

Instruction	v1.50	v1.50 + adjustments
InitializeMint	135 CU	121 CU
InitializeAccount	227 CU	176 CU
Mint	170 CU	142 CU
Transfer	176 CU	146 CU
TransferChecked	232 CU	184 CU
Batch	1,598 CU	1,387 CU
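The threshold arithmetic described above can be sketched as follows. The 3 CUs per sequence and the 10 CU syscall overhead come from the comment itself. The cost of 1 CU per argument register is an assumption made here so the numbers line up, and the function names are hypothetical.

```rust
/// CUs for one sequence: two loads plus one compare (from the comment above).
pub const CU_PER_SEQUENCE: u64 = 3;
/// Base overhead of the memcmp syscall (from the comment above).
pub const SYSCALL_OVERHEAD_CU: u64 = 10;
/// ASSUMPTION: 1 CU for each of the three argument registers.
pub const ARG_SETUP_CU: u64 = 3;

/// Cost of comparing `sequences` u64 words inline.
pub fn inline_cost(sequences: u64) -> u64 {
    sequences * CU_PER_SEQUENCE
}

/// Largest number of sequences whose inline cost does not exceed `syscall_cost`.
pub fn max_inline_sequences(syscall_cost: u64) -> u64 {
    syscall_cost / CU_PER_SEQUENCE
}
```

With only the 10 CU overhead, max_inline_sequences gives 3 (9 CUs inline). Adding the assumed 3 CUs of argument setup gives 4 (12 CUs inline), which is the number of u64 words in a 32-byte pubkey.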

I'll update the compiler, and put this improvement in a new release.

When not invoking memcmp, the code the compiler emits for comparison is a chain of u64 comparisons, similar to what this PR proposes.

febo · 2025-07-01T18:24:42Z

> AccountInfos should be aligned to u64, right?

The AccountInfos are aligned, but the account data is not necessarily aligned. For example, the close account instruction compares the close authority pubkey stored on an Account in the validate_owner call, and that close authority pubkey is not 8-byte aligned. The same situation occurs with the freeze authority of a Mint account.

All signers of a Multisig are also not 8-byte aligned.
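The offsets behind these cases can be worked out with a short sketch. It assumes the standard packed SPL Token layouts (Account = 165 bytes, Mint = 82 bytes, Multisig = 355 bytes, optional pubkeys stored as a 4-byte tag followed by 32 bytes) and that the account data buffer itself starts on an 8-byte boundary. The constant and function names are hypothetical.

```rust
/// Account: mint(32) + owner(32) + amount(8) + delegate(4 + 32) + state(1)
/// + is_native(4 + 8) + delegated_amount(8) + close_authority tag(4).
pub const ACCOUNT_CLOSE_AUTHORITY_OFFSET: usize =
    32 + 32 + 8 + (4 + 32) + 1 + (4 + 8) + 8 + 4;

/// Mint: mint_authority(4 + 32) + supply(8) + decimals(1) + is_initialized(1)
/// + freeze_authority tag(4).
pub const MINT_FREEZE_AUTHORITY_OFFSET: usize = (4 + 32) + 8 + 1 + 1 + 4;

/// Multisig: m(1) + n(1) + is_initialized(1), then up to 11 signers.
pub const MULTISIG_SIGNERS_OFFSET: usize = 1 + 1 + 1;

/// Byte offset of the signer at `index` within a Multisig.
pub fn multisig_signer_offset(index: usize) -> usize {
    MULTISIG_SIGNERS_OFFSET + 32 * index
}

/// True when `offset` is a multiple of 8.
pub fn is_8_byte_multiple(offset: usize) -> bool {
    offset % 8 == 0
}
```

Under these assumptions the close authority starts at byte 133, the freeze authority at byte 50, and signer i at byte 3 + 32*i. None of these offsets is a multiple of 8.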

LucasSte · 2025-07-01T18:27:18Z

The fix is here: anza-xyz/llvm-project#160

ananas-block · 2025-07-01T22:51:33Z

> It is not clear which version of platform tools was used for these benchmarks.

solana-cargo-build-sbf 2.2.15
platform-tools v1.48
rustc 1.84.1

ananas-block · 2025-07-01T23:59:27Z

With platform tools 1.50, a transfer with the custom eq is 148 CU, instead of 144 CU with 1.48.
Anyway, I am looking forward to the platform tools release!

perf: optimize-pubkey-eq

2dd56ee

joncinque requested a review from febo June 30, 2025 14:56

ananas-block marked this pull request as draft July 1, 2025 17:26

LucasSte mentioned this pull request Jul 1, 2025

[SOL] Fine tune memcmp threshold anza-xyz/llvm-project#160

Merged

ananas-block closed this Jul 2, 2025

febo mentioned this pull request Jul 9, 2025

Optimize account validation with SIMD discriminator matching and syscall reduction exotic-markets-labs/typhoon#199

Closed

LucasSte mentioned this pull request Jul 22, 2025

[SOL] Enable sink and fold pass for SBF anza-xyz/llvm-project#165

Closed
