-
Notifications
You must be signed in to change notification settings - Fork 300
[DRAFT] RISC-V: Use .insn on (Zkne or Zknd) intrinsics
#1917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f31c152 to
bc825ff
Compare
.insn on Zknd or Zkne intrinsics.insn on Zkne or Zknd intrinsics
bc825ff to
3ad3868
Compare
7ee6e1e to
643b06a
Compare
.insn on Zkne or Zknd intrinsics.insn on (Zkne or Zknd) intrinsics
|
My only problem with this approach is that this will interact badly with the optimizer. LLVM must assume all Also, you can do both the rustc PR and the stdarch PR together now, due to stdarch being a subtree of the rustc repo, rather than a submodule |
I see. Worth experimenting and I'll try that later whether actual AES decryption code is heavily affected.
This is intentional because CI of I actually found that some of the tests ( |
643b06a to
1209865
Compare
**DRAFT: THIS COMMIT SHALL BE SQUASHED INTO THE NEXT COMMIT.** Because changes to rustc are not performed yet, we cannot switch `#[target_feature]` attribute to enable desired features. Instead, this commit just comments out `#[target_feature]`.
|
It seems... even if we use The code below implements 1 and 2 of AES-256 decryption routines.
#![feature(riscv_ext_intrinsics)]
#![allow(clippy::identity_op)]
use core::arch::riscv64::*;
/*
Key Sizes:
AES-256 key:
256-bit key data
== 4 u64 words
AES-256 scheduled key:
128-bit block * (14 + 1) ; where 14 is the number of AES-256 rounds
== 30 u64 words
*/
// Adjust #[target_feature] attributes below depending on the configuration and build.
// #[target_feature(enable = "zknd")]
// #[target_feature(enable = "zknd", enable = "zkne")]
#[unsafe(no_mangle)]
pub unsafe extern "C" fn aes256_key_schedule(key: *const [u64; 4], scheduled_key: *mut [u64; 30]) {
unsafe {
let key: &[u64; 4] = &*key;
let mut rk0 = key[0];
let mut rk1 = key[1];
let mut rk2 = key[2];
let mut rk3 = key[3];
let scheduled_key: &mut [u64; 30] = &mut *scheduled_key;
scheduled_key[0] = rk0;
scheduled_key[1] = rk1;
scheduled_key[2] = rk2;
scheduled_key[3] = rk3;
macro_rules! double_round {
($i: expr) => {
let tmp = aes64ks1i::<$i>(rk3);
rk0 = aes64ks2(tmp, rk0);
rk1 = aes64ks2(rk0, rk1);
let tmp = aes64ks1i::<10>(rk1);
rk2 = aes64ks2(tmp, rk2);
rk3 = aes64ks2(rk2, rk3);
scheduled_key[4 * ($i + 1) + 0] = rk0;
scheduled_key[4 * ($i + 1) + 1] = rk1;
scheduled_key[4 * ($i + 1) + 2] = rk2;
scheduled_key[4 * ($i + 1) + 3] = rk3;
};
}
double_round!(0);
double_round!(1);
double_round!(2);
double_round!(3);
double_round!(4);
double_round!(5);
// Process tail
let tmp = aes64ks1i::<6>(rk3);
rk0 = aes64ks2(tmp, rk0);
rk1 = aes64ks2(rk0, rk1);
scheduled_key[4 * 7 + 0] = rk0;
scheduled_key[4 * 7 + 1] = rk1;
}
}
// #[target_feature(enable = "zknd")]
// #[target_feature(enable = "zknd", enable = "zkne")]
#[unsafe(no_mangle)]
pub unsafe extern "C" fn aes256_key_schedule_for_decryption(
scheduled_key: *const [u64; 30],
scheduled_key_dec: *mut [u64; 30],
) {
unsafe {
let scheduled_key: &[u64; 30] = &*scheduled_key;
let scheduled_key_dec: &mut [u64; 30] = &mut *scheduled_key_dec;
// Copy the first scheduled key as-is.
scheduled_key_dec[0] = scheduled_key[0];
scheduled_key_dec[1] = scheduled_key[1];
// Inverse MixColumns on scheduled keys index 1..=13
macro_rules! invmc {
($i: expr) => {
scheduled_key_dec[$i * 2 + 0] = aes64im(scheduled_key[$i * 2 + 0]);
scheduled_key_dec[$i * 2 + 1] = aes64im(scheduled_key[$i * 2 + 1]);
};
}
invmc!(1);
invmc!(2);
invmc!(3);
invmc!(4);
invmc!(5);
invmc!(6);
invmc!(7);
invmc!(8);
invmc!(9);
invmc!(10);
invmc!(11);
invmc!(12);
invmc!(13);
// Copy the last scheduled key as-is.
scheduled_key_dec[28] = scheduled_key[28];
scheduled_key_dec[29] = scheduled_key[29];
}
} |
Using the inline assembly and `.insn` could avoid current issues regarding intrinsics available when either Zknd or Zkne is available. Note that, coincidental success of the testing process depends on the possibly flaky disassembler behavior (which attempts to disassemble instructions *not* enabled on the target of the output object file) and should be removed now or if anything breaks.
1209865 to
1a47fad
Compare
|
Little Update: Added While RISC-V is normally described as "RISC-V does not have condition flags", by default the Rust compiler assumes that RISC-V inline assembly may overwrite some of vector-related registers. Since we don't touch them, we'd better add it to ensure mixed code don't need saving/restoring such registers. It affects other intrinsics with inline assembly and I'll submit another (non-draft) PR for better efficiency on mixed RVV + non-vector intrinsics. |
|
Also note that, the option |
|
Ok, if it does optimize satisfyingly enough, we can then probably use the asm implementation. Did you send the rustc PR yet? |
|
@sayantn |
|
I have no problems, but typically changes due to review comments don't need attribution, so you really don't need to put any attributions. |
|
We didn't share the code directly but I think we helped each other so... I understand that attribution is completely optional but since you have first examined the possibilities to fix the issue #1765, I want to show your contribution in some way. |
|
Closing this in favor of rust-lang/rust#146798. In that PR, inline assembly got symbolic. |
Using the inline assembly and
.insncould avoid current issues regarding intrinsics available when either Zknd or Zkne is available.Note: this is intentionally incomplete as rustc changes are not performed yet.