-
Couldn't load subscription status.
- Fork 8.1k
Add ARM userspace infrastructure #4974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too deeply familar with ARM internals but I do have some comments.
It's not clear to me how the kernel stack is set up, if this is being carved out of the existing thread stack, need to change this so that the kernel stack is reserved either completely separately (like on x86 where it is between the guard and the thread stack) or carved to a fixed size out of the thread stack, it shouldn't be a function of the thread stack size.
We don't know for sure how big the kernel stack needs to be, I'd suggest 2K as a reasonably safe choice and we can tune it later.
arch/arm/core/thread.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the mask on the function pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it out. It's forcing the mode to ARM mode. If the LSB is 1, it means it's thumb mode.
arch/arm/core/userspace.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
On x86 we didn't need to touch the guard region at all.
Also please note, CONFIG_USERSPACE no longer depends on CONFIG_HW_STACK_PROTECTION and there may be no guard region.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I just saw this. I'll have to fix this piece.
arch/arm/core/userspace.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to just override the system call id with K_SYSCALL_BAD and executing that entry in the dispatch table, with the bad ID as the first arg.
See x86 userspace.S
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that is cleaner. I'll change it to that.
arch/arm/core/fatal.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wanted to double-check, is ssf_ptr really a NANO_ESF?
on x86 it wasn't and I had to copy stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll double check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i still have to fix this. I'll push another change tomorrow to address the stack frame. I think I'll have to copy the svc stack frame or reconstruct it from the pieces I have on hand (like the x86 method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some style comments
arch/arm/core/thread.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we swap this so its:
if (options & K_USER) {
pInitCtx->pc = ((u32_t)_arch_user_mode_enter) & 0xfffffffe;
} else {
pInitCtx->pc = ((u32_t)_thread_entry) & 0xfffffffe;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe setting pInitCtx->pc to either _arch_user_mode_enter or _thread_entry, and masking the bits after the #ifdef block will generate slightly smaller code.
arch/arm/core/swap.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we ifdef protect this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the pop {pc} is nefarious because it literally pops off the stack into the pc, and then voila, you are there. But yeah I can block this off.
include/arch/arm/arch.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - remove space between 0x1 & ;
|
CI failures due to checkpatch being unhappy about whitespace |
include/arch/arm/arch.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor style nit in this function: missing vertical space between variable declaration and after the assembly block, and there's a stray space before the semicolon in the return statement.
include/arch/arm/arch.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work? call_id is the last argument passed by all the _arch_syscall_invoke*() functions, and the assembly implementation of this routine seems to take r0 as call_id (I'm not familiar with the ABI).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! yes this isn't correct and I think will fail since the call_id is in the wrong position
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this prototype seems to be no longer necessary, remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still think we can ditch this prototype unless I'm not noticing something.
include/arch/arm/arch.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One other thing worth considering: system calls are a very hot code path. Here we are spending cycles to marshal 0 values for unused parameters. Are there approaches where unused parameters can just leave undefined values in the registers instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If call_id is the first argument (that will always be set), the others don't need to be passed to the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll see if I can trim this down.
arch/arm/core/swap.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be:
tst r2, #1
beq _oops
tst r2, #0 the Z flag will always be 1.
Another question is, should we need oops for this situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on x86 it is currently not a fatal error to invoke a system call in supervisor mode. it will work its just not a great way to go about using an API.
I don't think we should add any logic or checking to catch this. this is a very hot code path. it will not happen in practice unless the user is doing something really weird.
arch/arm/core/thread.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to add k_thread_abort() before CODE_UNREACHABLE.
Just like what _thread_entry() does to handle the case that thread returns from user_entry()
unexpectedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NACK
that isn't how this works, user_entry should never return here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I got it after reviewing the x86 implementation.
arch/arm/core/userspace.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the stack guard has been configured in __pendsv before _arch_user_mode_enter() being called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just make any adjustments needed to the guard area in C code and not in here?
i also don't have a clear picture on how the guard area, kernel stack, and thread stack are arranged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets complicated, because we have to use the guard region differently based on if we are in userspace or not (regardless of if STACK_GUARD is defined). Basically from lowest to highest address the memory is laid out like:
|-----guard------|---------stack-----------------|
with userspace it'll be:
|------guard-----|-------------privileged stack------|-------unprivileged stack--------|
Added complexity is that the mpu regions have to be aligned. So this means when we do userspace we absolutely have to have the stacks aligned to power of 2 boundaries. This means the guard HAS to come out of the stack, not added to the stack. All stacks have to fit in a power of 2 size. The privileged stack also has to be power of 2.
About to push some fixes for this that changes the mpu regions to work properly for user space
arch/arm/core/userspace.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think this is implemented correctly. r1 appears to contain user_entry when it is supposed to have _thread_entry()'s address in it.
it looks like this instruction will directly enter the user_entry function. What we have to do is setup the stack/registers so that we enter _thread_entry() with user_entry and its three arguments as parameters to that function.
_thread_entry never returns, and calls k_thread_abort() and does other housekeeping if user_entry returns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, looked at wrong spot. this pops r0,r1,r2, and lr. Those are the 3 arguments. And it jumps to the passed in function ptr. Comment is not right. I fixed that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what Andrew mentioned is that the better implementation should be like the following:
_arm_userspace_enter(k_thread_entry_t user_entry,
void *p1, void *p2, void *p3,
u32_t stack_end,
u32_t stack_start)
{
.....
.....
_thread_entry(user_entry, p1, p2, p3);
}
but the current implementation is:
_arm_userspace_enter(k_thread_entry_t user_entry,
void *p1, void *p2, void *p3,
u32_t stack_end,
u32_t stack_start)
{
.....
.....
user_entry(p1, p2, p3);
}
arch/arm/core/userspace.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this magic value of 40 represent?
can we make this a macro instead?
also this looks very very wrong in that you are validating the system call ID before you elevate to privileged mode, which means the checks aren't useful from a security perspective.
you have to do all checking after the software interrupt is invoked, nothing done when the CPU is in user mode can be trusted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can move the code to that. That simplifies some things.
arch/arm/core/swap.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused about how this works..
It seems that upon "svc #3" we land here, and if we are unprivileged, we set mode to privileged, switch to the kernel stack, and return wherever execution was when the "svc #3" call was made.
what's to prevent malicious user code from entering privileged mode anytime they please and then running whatever code they want as a supervisor?
you can't trust the code this returns to...i think what you need to do is re-enable interrupts and handle the system call here in the SVC handler or something like that, which is what we do on x86
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do that unfortunately because we cannot handle nested SVC calls. This would mean that we can't context switch without it faulting.
|
Since we do not have emulator support, sanitycheck can't be used to verify this. I suggest running the following test cases to validate the port. CONFIG_USERSPACE=y should be added to the defconfig for whatever platform you are testing on (ideally, one device with an ARM MPU and one with an NXP MPU) Also test with CONFIG_HW_STACK_PROTECTION enabled and disabled. I reviewed the code some more, I think I found a showstopper with how the privilege elevation is handled. |
arch/arm/core/userspace.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r0 will be stack_obj + stack_info.size here, so memset() will set the memory out of the range of stack_obj.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has this been fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah now we switch over to the priv stack and then memset out the complete user mode stack area. No need to dance around any stored info here.
a199b50 to
3066f83
Compare
|
In case you don't already know it, if you try to build tests/kernel/msgq/msgq_api with CONFIG_USERSPACE=y and run it on frdm_k64f, it falls over with: |
arch/arm/core/swap.S
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to check up front if the caller is unpriv and only allow it to use svc 3 in that case? Otherwise, we're allowing userspace to invoke any of the other svc calls, and some of those might be trusting the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the mode is in the CONTROL reg, and I did this check and bombed out in a previous version of this if the code was privileged. I am on the fence on this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is different - I don't care about preventing a privileged caller from making an arbitrary svc call; I just want to prevent an unprivileged caller from calling anything other than svc 3 (and any other individual svcs that are explicitly whitelisted as being safe for use by userspace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok I see what you mean. This would mean some sort of macro or interface where we define system call numbers and their access mode and then do a lookup to validate it in the actual svc call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's what Stephen is saying.
What we want to prevent is user mode from invoking (for example) SVC #1. That would be a backdoor into the system. On x86 they are different interrupt vectors with different privilege levels, but since this is a common handler on ARM we need to filter this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I think we need to refactor the design.
Right now, we are using SVC for stuff which should never be allowable from user mode: irq_offload() and inducing a kernel panic. irq_offload runs an arbitrary function pointer in interrupt context. kernel panic hoses the entire system (unlike an oops which just kills a thread).
Is there no way to check what the interrupting context's privilege level was?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you know the callers privilege level. My point is that the SVC has always been meant to be called from unprivileged contexts. If we want to add this extra consideration, so be it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that the SVC has always been meant to be called from unprivileged contexts
For the irq_offload() and panic cases, we are using SVC because we want to generate a CPU exception from software, even though these should not be allowable from user mode.
Is there perhaps a better way to do it?
For sake of efficiency, we could check that the CPU is not in user mode in the logic for panic/offload cases and jump to the oops case if the test fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And what do we return if we invoke a syscall from privileged context? We shouldn't allow that either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And what do we return if we invoke a syscall from privileged context? We shouldn't allow that either.
Do we care?
AFAIK on x86 it does work, although it's a terribly inefficient thing to do since you can just call the implementation directly.
I'm fine with leaving the scenario as "undefined behavior" and not doing checks, or only checking in an assertion. I can't conceive on how supervisor mode would be invoking system calls like this unless someone is doing something really weird, and we don't try to prevent malfeasance in supervisor mode anyway.
I don't think it's worth spending cycles checking for this since it is such a hot code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some questions and suggested edits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
passed-in (with a hyphen)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our convention is to use asterisks for H2 headings (equal signs are for H3 headings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ill change it to *
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asterisks here (****)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asterisks here (****)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are the implications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a sentence pointing to the memory placement section later in the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
particular
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I think you're saying,
The stack guard is defined at the bottom (the lowest address) of the stack.
You're use of "or" makes it sound like an alternative rather than clarifying what the "bottom of the stack" means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right. i'll change it to your suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rewrite this sentence a bit, and move it to the end of this paragraph:
This requires a power of two ceiling be computed for the alignment and size
of the stack, and can result is the actual memory allocation of a given stack
being larger than the requested size. For example, a request for a 1500 byte stack
would be aligned and resized to a 2kB boundary and size if a power of two alignment
is required by the MPU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i redid the section. Take a look at the new version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/which/that/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a size implication here too, such as multiples of 32 bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. those implementations use a start/end address register. The only requirement is that start/end are aligned on 32 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence probably needs to be a little more precise.
|
This appears to be "passing" the userspace tests but for the wrong reason. |
|
So the userspace tests need APPLICATION MEMORY set to make sure they can access the shared user data section. Otherwise all they get access to is the thread stack and defined memory domains. |
|
Yes, userspace has CONFIG_APPLICATION_MEMORY=y in its prj.conf. That didn't change. |
Yes. |
Ouch. That sounds like a bad bug in ztest. |
arch/arm/core/cortex_m/mpu/nxp_mpu.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Will fix.
|
This needs a rebase. |
7c1d369 to
ba80f05
Compare
This patch adds a configure_mpu_user_context API and implements the required function placeholders in the NXP and ARM MPU files. Signed-off-by: Andy Gross <[email protected]>
This patch adds support for userspace on ARM architectures. Arch specific calls for transitioning threads to user mode, system calls, and associated handlers. Signed-off-by: Andy Gross <[email protected]>
This patch set implements the APIs and changed required to support the user mode thread support. Signed-off-by: Andy Gross <[email protected]>
This patch adjusts the calculation of the overflow size for the kernel stack tests which read/write to areas below the current user stack. Signed-off-by: Andy Gross <[email protected]>
This patch fixes calculations for the top of the interrupt and main stacks. Due to power of two alignment requirements for certain MPUs, the guard size must be taken into account due to the guard being counted against the initial stack size. Signed-off-by: Andy Gross <[email protected]>
ba80f05 to
c8cf7b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few issues.
This is gonna need a bunch of testing but I'm inclined to merge this soon.
tests/ztest/src/ztest.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is really wrong here.
thread_stack is defined as:
K_THREAD_STACK_DEFINE(thread_stack, CONFIG_ZTEST_STACKSIZE +
CONFIG_TEST_EXTRA_STACKSIZE);
K_THREAD_STACK_SIZEOF(thread_stack), by definition, returns the same value passed to K_THREAD_STACK_DEFINE. It should already be CONFIG_ZTEST_STACKSIZE + CONFIG_TEST_EXTRA_STACKSIZE.
Why are you getting a different result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to change this to not be ARM-specific. ARC port is using this too, for example.
bf0c7cd to
01166c0
Compare
|
addressed Andrew's issues with the ztest change and the docs being ARM specific. The ztest change ended up being unnecessary due to a later addition in the arm specific _new_thread changes. |
This patch adds documentation on the design and implementation of stack objects for architectures which utilize MPU backed stack and memory protection. Signed-off-by: Andy Gross <[email protected]>
01166c0 to
4c59c85
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gonna push this once we have a positive CI result.
This patch set adds the ARM based userspace infrastructure. The patches include all the parts to get system calls, demoting threads to user mode, etc.