Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require store local when storing return sentries #64

Merged
merged 3 commits into from
Aug 20, 2024
Merged

Require store local when storing return sentries #64

merged 3 commits into from
Aug 20, 2024

Conversation

nwf-msr
Copy link
Contributor

@nwf-msr nwf-msr commented Jul 25, 2024

Backwards control-flow arcs should ideally be confined to the stack and register save areas. Conveniently, we have mechanism in the RTOS to identify exactly those areas of memory, with capabilities bearing SL (store local) permission. And, after #54, the ISA has mechanism for identifying backwards control-flow arcs, with the two return sentry types. We should have capability store impose the requirement for SL in the authorizing cap if the cap being stored is a return sentry.

Credit where it's due: this is Robert's idea, originally suggested in my obviously-wrong-in-retrospect
#63 ("Have CJALR create !G sentries?").

@nwf-msr nwf-msr added the maybe-v2 Tracking issues for possible changes for an ISAv2 label Jul 25, 2024
@nwf-msr nwf-msr requested review from davidchisnall and rmn30 July 25, 2024 12:14
Copy link
Collaborator

@rmn30 rmn30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to update the archdoc and change log.

src/cheri_insts.sail Outdated Show resolved Hide resolved
@rmn30
Copy link
Collaborator

rmn30 commented Jul 25, 2024

I wonder if we could also make use of a local forward sentries e.g. for passing a non-capturable callback pointer? If so we could use one of the spare executable otypes and do the same trick.

We no longer throw exceptions for !SL authorities storing !G capabilities, just
clear the tag.
@davidchisnall
Copy link
Collaborator

In the current software model, we don't ever really need to do that because cross-compartment callbacks are not sentries and the fact that clearing G on a sealed thing doesn't clear load-global means that we can already provide no-capture callbacks.

@rmn30
Copy link
Collaborator

rmn30 commented Jul 25, 2024

Would be nice to also update the text that defines store local in sec:perms and mention in sec:sealing.

@vmurali
Copy link
Collaborator

vmurali commented Jul 25, 2024

That was quick :)

Backwards control-flow arcs should ideally be confined to the stack and register
save areas.  Conveniently, we have mechanism in the RTOS to identify exactly
those areas of memory, with capabilities bearing `SL` (store local) permission.
And, after #54, the ISA has
mechanism for identifying backwards control-flow arcs, with the two return
sentry types.  We should have capability store impose the requirement for `SL`
in the authorizing cap if the cap being stored is a return sentry.

Credit where it's due: this is Robert's idea, originally suggested in the
obviously-wrong-in-retrospect
#63 ("Have CJALR create !G
sentries?").

Co-authored-by: Robert Norton <[email protected]>
@nwf-msr nwf-msr requested a review from rmn30 July 29, 2024 15:33
src/cheri_insts.sail Outdated Show resolved Hide resolved
Co-authored-by: Robert Norton <[email protected]>
@vmurali
Copy link
Collaborator

vmurali commented Jul 29, 2024

Regarding the maybe-v2 label - if you are going to commit these changes, then Ibex should also be modified in v1 itself, right?

@nwf-msr
Copy link
Contributor Author

nwf-msr commented Jul 29, 2024

I believe the plan is to leave this PR open and let FV make further progress on CHERIoT and CHERIoT-Ibex in particular as it stands now, since this is merely a nice to have and does not, we think, change the system's security properties.

ETA: to make it more apparent that that's the plan, I've marked this PR as a draft.

@nwf-msr nwf-msr marked this pull request as draft July 29, 2024 19:17
@vmurali
Copy link
Collaborator

vmurali commented Jul 29, 2024

I believe the plan is to leave this PR open and let FV make further progress on CHERIoT and CHERIoT-Ibex in particular as it stands now, since this is merely a nice to have and does not, we think, change the system's security properties.

It actually does change the system's security properties as I discussed offline:

A callee can now (without this PR) stash a return cap on a first call into its global memory, and on a future call, use the stashed return cap to return, unexpectedly, to the first caller. That can be exploited to at least make the compartment of the first caller behave erratically.

This PR prevents this behavior by preventing stashing of return caps.

@vmurali
Copy link
Collaborator

vmurali commented Jul 29, 2024

To make a bolder claim about this PR - this PR obviates the need for shadow stacks to ensure CFI during returns. (One still needs landing pads for CFI during intra-compartment calls)

@davidchisnall
Copy link
Collaborator

Return caps are never visible between compartments, except the return cap to the switcher (which just returns from the current trusted stack frame and so can be invoked anywhere without affecting the security model).

@davidchisnall
Copy link
Collaborator

We still need the shadow stack to handle the error-recovery case.

@vmurali
Copy link
Collaborator

vmurali commented Jul 29, 2024

Return caps are never visible between compartments, except the return cap to the switcher (which just returns from the current trusted stack frame and so can be invoked anywhere without affecting the security model).

It's visible to libraries.

@vmurali
Copy link
Collaborator

vmurali commented Jul 29, 2024

We still need the shadow stack to handle the error-recovery case.

Why ? Maybe you are talking about a different shadow stack (I am not talking about the trusted stack, but rather the standard CFI enforcement technique using shadow stacks to store the return address out-of-band during a call)

@davidchisnall
Copy link
Collaborator

Across compartments, we go via the switcher and so clear the stack. This ensures that the switcher return cap isn’t captured.

Within compartments, we don’t zero the stack and so a stack UAF can rearrange return addresses and make a function return to an earlier return address.

@vmurali
Copy link
Collaborator

vmurali commented Jul 30, 2024

Documenting what we discussed offline: this PR prevents shared libraries from storing return caps and later using old return caps. This reduces the TCB of shared libraries. But the assumption right now is that shared libraries are trusted, which makes this fix less critical.

@PhilDay-CT
Copy link

PhilDay-CT commented Jul 30, 2024

I'm probably missing something obvious, but in the docs it says "Libraries contain code and read-only data but do not have mutable globals" - so how could a library store a return cap ?

BTW since globals in general aren't accessible outside compartments I read this as meaning something special in the context of a library.

@rmn30
Copy link
Collaborator

rmn30 commented Jul 30, 2024

@PhilDay-CT one of the read-only objects could be a pointer (capability) to a writable region. I'm not 100% certain that the tooling actually supports this. At the very least I think you could define an MMIO region and import it in the library.

The most important thing is we should emphasize that library calls are not a proper security boundaries and calling them is effectively the same as importing the code into your compartment. This is confused by the fact that you can and we do use library calls as security boundaries with asymmetric trust as in the case of things like the fast token unseal but this requires special care in the form of manually written assembly so we don't encourage it outside the core RTOS functionality.

@davidchisnall
Copy link
Collaborator

The most important thing is that we should emphasize is that library calls are not a proper security boundaries and calling them is effectively the same as importing the code into your compartment

Specifically, there is no attempt to clear registers or isolate the stack on library calls. This means that malicious code in a library can reach any pointers that you have on your stack.

For example, if you are able to include a malicious compartment and persuade someone to use a malicious library from a victim compartment, your malicious library can walk up the stack and can find anything that you passed as an argument into this compartment. If you pass a callback function, the library can then invoke it and leak data (including just passing your stack and global pointers to the malicious compartment).

Even with this change, if victim compartment B has a stack use-after-free bug and a bug that allows a copy to that location, attacker compartment A could pass a read-only capability to part of its own stack that contains a return capability, and trigger this bug to overwrite a return address in the callee. When the callee returns, it will jump to attacker-controlled code, which will then have access to the victim's stack and globals. Without this change, that bug could also be a copy from a heap object that is set up in a previous call.

This is why I don't believe that this change makes a qualitative difference to the security model. It is useful defence in depth, because it makes it somewhat reduces the set of bugs that become vulnerabilities, but it does not eliminate bug classes. A stack use after free and a copy from an attacker-controlled object to an attacker-controlled offset in the dangling stack pointer remain the primitives that you need to assemble the vulnerability chain to break compartment confidentiality and integrity.

@vmurali
Copy link
Collaborator

vmurali commented Jul 30, 2024

Even with this change, if victim compartment B has a stack use-after-free bug and a bug that allows a copy to that location, attacker compartment A could pass a read-only capability to part of its own stack that contains a return capability, and trigger this bug to overwrite a return address in the callee. When the callee returns, it will jump to attacker-controlled code, which will then have access to the victim's stack and globals.

I do not understand how this attack can work with this PR. How would the library shared between compartments A and B access compartment A's return cap in order to install it into compartment B's stack?

@davidchisnall
Copy link
Collaborator

I do not understand how this attack can work with this PR. How would the library shared between compartments A and B access compartment A's return cap in order to install it into compartment B's stack?

Compartment A passes an argument to compartment B. This can be the return cap itself, or simply a read-only capability to a region on A's stack. The library can then load find where that capability is spilled on B's stack to load the return address and can store it anywhere on B's stack.

@vmurali
Copy link
Collaborator

vmurali commented Jul 30, 2024

Compartment A passes an argument to compartment B. This can be the return cap itself, or simply a read-only capability to a region on A's stack. The library can then load find where that capability is spilled on B's stack to load the return address and can store it anywhere on B's stack.

Thanks for the explanation. Shadow stacks wouldn't have prevented this either as the library could jump to the read-only cap of A's stack when called from B (this can be camouflaged as a call rather than a return to avoid the shadow stack from flagging it). The problem here is that A's caps are installed inside B through the argument itself.

@vmurali
Copy link
Collaborator

vmurali commented Jul 31, 2024

Note that the library can reach beyond the current stack or the current compartment. For instance, let's say an isolated thread 1 operating on an isolated compartment A calls a library function f. When another isolated thread 2 operating on another isolated compartment B calls f later, f can return to an old stashed return pointer in compartment A (compartment A cannot expect such an arbitrary entry into its code).

This is specifically the case I am trying to eliminate with this PR.

@davidchisnall
Copy link
Collaborator

Shadow stacks wouldn't have prevented this either

Exactly. This is why I get twitchy when people talk about CFI as an ISA-level property. An ISA can give you tools for checking properties, but no CFI scheme is robust in the presence of an attacker who can violate memory safety. Within a compartment, the compiler is in the TCB for memory safety of the stack and globals and so anything that doesn’t use the compiler (e.g. inline assembly) can violate CFI. We can constrain how it violates CFI, and we can also make it easier for non-malicious code to defend itself against bugs.

This is specifically the case I am trying to eliminate with this PR.

And I agree that, as defence in depth, this is a benefit with no downsides.

@davidchisnall davidchisnall marked this pull request as ready for review August 20, 2024 16:18
@nwf-msr nwf-msr requested a review from rmn30 August 20, 2024 16:50
@nwf nwf merged commit caf3bd5 into CHERIoT-Platform:main Aug 20, 2024
3 checks passed
@nwf-msr nwf-msr deleted the 202407-require-sl-ret-sentries branch August 20, 2024 17:18
@vmurali
Copy link
Collaborator

vmurali commented Aug 20, 2024

Does this mean the HW has changed too?

@nwf
Copy link
Member

nwf commented Aug 20, 2024

I believe Kunyan has a change to Ibex to land imminently, pending verification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maybe-v2 Tracking issues for possible changes for an ISAv2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants