Selectively delegating some supervisor domain external interrupts #34

Closed
AndybnACT opened this issue Feb 23, 2024 · 24 comments

@AndybnACT

Some TEE OSes may have a requirement to toggle the interrupt enable bit from the REE. For instance, OPTEE calls spin_lock_xsave() to disable both FIQ and IRQ during some critical sections. It seems like the RISC-V port has to upgrade this operation to an SBI call even with smsdia in SMMTT. This is because msdeie is an m-mode register, and SDEIE is only available in mie, which means delegation of this type of interrupt to s-mode is impossible. This is a good security choice as an s-mode domain should not interfere with interrupts belonging to others. Everything related to inter-domain management should be done by the RDSM. However, this may increase the latency or complexity of interrupt management when the relationship between some domains is not symmetric (Linux + OPTEE).

To improve latency without sacrificing security, should we consider the following two changes?

  1. Make SDEIE visible in sie so delegation works.
  2. Add an m-mode MXLEN-wide CSR, e.g. msdeideleg, to selectively determine whether each supervisor domain external interrupt can be delegated.

Interrupts for an inactive s-mode domain only get delegated into the hart if mie.SDEIE is set, and the bit representing the s-mode domain in both msdeie and msdeideleg is set. Interrupts for an inactive domain where msdeie is set but msdeideleg is not still trigger an m-mode interrupt.

This type of delegation does not mean the running s-mode domain should process interrupts from an inactive s-file domain. Instead, this delegation works as a mask-able notification for the running s-mode domain.
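
The routing rule proposed above can be sketched as a small Python model. This is only an illustration of the proposal as worded in this comment, not specified behaviour: the function name and its return shape are hypothetical, the CSRs are modelled as plain integers with one bit per supervisor domain interrupt controller, and it assumes that nothing is signalled at all while mie.SDEIE is clear.

```python
def route_sd_external_interrupts(msdeip, msdeie, msdeideleg, mie_sdeie):
    """Split pending inactive-domain interrupts by destination.

    Returns (to_s_mode, to_m_mode), two bit vectors indexed like msdeip.
    Hypothetical model of the proposal, not of any ratified spec.
    """
    if not mie_sdeie:
        # Assumption: with mie.SDEIE clear, no domain interrupt is signalled.
        return 0, 0
    enabled_pending = msdeip & msdeie
    # Delegated bits become a maskable notification for the active domain.
    to_s_mode = enabled_pending & msdeideleg
    # Everything else still interrupts the RDSM in m-mode.
    to_m_mode = enabled_pending & ~msdeideleg
    return to_s_mode, to_m_mode
```

For example, with domains 1 and 2 pending, domains 0 to 2 enabled, and only domain 1 marked in msdeideleg, domain 1 is routed to s-mode and domain 2 to m-mode.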

@ved-rivos
Collaborator

ved-rivos commented Feb 23, 2024

It seems like the RISC-V port has to upgrade this operation to an SBI call even with smsdia in SMMTT.

That is not necessary. The supervisor domain can mask its own external interrupt globally using sstatus.SIE or locally using sie.SEIE. While clearing SIE disables all interrupts - timer, external, software, LCOFI, etc. - clearing SEIE disables only external interrupts.
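
As a rough illustration of these two masking levels, the sketch below models whether an S-level interrupt is taken while the hart executes in S-mode. The bit positions (sstatus.SIE at bit 1, the supervisor timer interrupt at bit 5, the supervisor external interrupt at bit 9) follow the RISC-V privileged specification; the function name is made up for this example, and M-level interrupts are deliberately ignored.

```python
SSTATUS_SIE = 1 << 1  # global S-mode interrupt enable in sstatus
STI_BIT = 1 << 5      # supervisor timer interrupt (STIE in sie, STIP in sip)
SEI_BIT = 1 << 9      # supervisor external interrupt (SEIE in sie, SEIP in sip)


def s_mode_takes_interrupt(sstatus, sie, sip, bit):
    """True if interrupt `bit` is pending, locally enabled and globally enabled."""
    globally_enabled = bool(sstatus & SSTATUS_SIE)
    locally_enabled_and_pending = bool(sie & sip & bit)
    return globally_enabled and locally_enabled_and_pending
```

Clearing SSTATUS_SIE makes the function return False for every bit, while clearing only SEI_BIT in sie blocks the external interrupt and leaves the timer interrupt deliverable.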

Make SDEIE visible in sie so delegation works.

MSDEI is an M-mode local interrupt that indicates to the RDSM that there are pending external interrupts for one or more of the supervisor domains. This interrupt is used by the scheduler in RDSM to determine if it should schedule a supervisor domain for execution. It is not useful to delegate this interrupt to any supervisor domain.

When a supervisor domain has a supervisor domain interrupt controller directly assigned to it, the RDSM updates msdcfg.SDICN to select that interrupt controller and may clear the bit corresponding to that interrupt controller in msdeie prior to resuming execution of the supervisor domain. Hence, when a supervisor domain is executing on the hart, the bit corresponding to its interrupt controller is 0 in msdeie.
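
A minimal sketch of that switch-in step, assuming the hart's CSRs are modelled as a plain Python dict. The function name and the dict keys are illustrative only, and the optional msdeie clear is applied unconditionally here for simplicity.

```python
def rdsm_switch_in(csrs, sdicn):
    """Return the CSR state the RDSM would set up before resuming the
    supervisor domain whose interrupt controller number is `sdicn`.

    `csrs` is a dict with the hypothetical keys "msdcfg.SDICN" and "msdeie".
    The input dict is left unmodified.
    """
    updated = dict(csrs)
    # Select the active domain's interrupt controller.
    updated["msdcfg.SDICN"] = sdicn
    # Stop signalling the now-active domain's controller through MSDEI.
    updated["msdeie"] = csrs["msdeie"] & ~(1 << sdicn)
    return updated
```

Starting from msdeie = 0b1111 and switching in the domain with controller 2 leaves msdeie = 0b1011, so only the inactive domains can still raise MSDEI.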

@gagachang

gagachang commented Feb 29, 2024

This is because msdeie is an m-mode register, and SDEIE is only available in mie, which means delegation of this type of interrupt to s-mode is impossible. This is a good security choice as an s-mode domain should not interfere with interrupts belonging to others. Everything related to inter-domain management should be done by the RDSM. However, this may increase the latency or complexity of interrupt management when the relationship between some domains is not symmetric (Linux + OPTEE).

I raised a similar question in a previous issue: #5 (comment)
For ARM, OP-TEE has higher permission than Linux. (Let me call this the unilaterally-mistrusting domains model here.)
When the CPU is running the OP-TEE context, ARM can directly inject a Linux interrupt (non-secure interrupt, FIQ) into OP-TEE, without EL3 secure monitor intervention.
For RISC-V, that would mean that when the CPU is running an S-mode supervisor domain, another supervisor domain's interrupt "notification" can be directly injected into the current supervisor domain, without M-mode RDSM intervention.
It can improve the performance of interrupt handling when the system adopts the unilaterally-mistrusting domains model.

In short, I think what @AndybnACT wants is "FIQ in RISC-V".

Make SDEIE visible in sie so delegation works.

I think it is unnecessary to make the whole MSDEIE visible for S-mode supervisor domain, unless you want to control interrupt delegation per domain.
We can just let mip.MSDEIP and mie.MSDEIE (refer to Smmtt section 6.4 - Machine Interrupt registers) be delegated to S-mode supervisor domain by corresponding bits in mideleg.
Active supervisor domain can determine if there is an "FIQ" in the trap handler by

if (sip.SSDEIP == 1 /* any SD interrupt pending */ && sip.SEIP == 0 /* no interrupt pending for the current SD */) {
    /* Yield control to other supervisor domains */
}

FIQ can be enabled/disabled by sie.SSDEIE, so that active supervisor domain only accepts sip.SEIP.

Hi @ved-rivos, what do you think about this?
I remember you mentioned RISC-V should have "mutually-mistrusting SDs".
But since RISC-V is a highly bespoke architecture, some vendors may want to adopt the "unilaterally-mistrusting domains" model.
Then we can define such flexibility in the spec so as not to restrict the system behavior.

@AndybnACT
Author

Hi @gagachang,

I think it is unnecessary to make the whole MSDEIE visible for S-mode supervisor domain, unless you want to control interrupt delegation per domain.

I didn't propose this. S-mode should not see and control MSDEIE. It is a violation of domain security. Interrupt delegation should be controlled by RDSM. The idea I proposed is to give RDSM a way to relate closely coupled domains (Linux + OPTEE), e.g. to decide that interrupts for selected domains can be sent as a notification to an active domain. This kind of notification is maskable within the active domain.

Without this, RDSM has to inject such a notification into the active domain. If this happens when the active domain does not intend to receive interrupts of the related domain (e.g. FIQs are masked off while running OPTEE), then the active domain has to tell RDSM to delay the notification. Such a mechanism may not be trivial to implement in software.

@gagachang

gagachang commented Feb 29, 2024

Hi @AndybnACT

I see. Something was wrong in my last comment.
The mie.MSDEIE should not be controlled by SD, so it should not be delegated directly.

It depends on the implementation of "FIQ".
From Ved's comment (#5 (comment))
FIQ can be a kind of virtual interrupt.
Active SD can mask/unmask that virtual interrupt directly.
(From Andes' OPTEE prototype, we use a fake PLIC interrupt source to be this virtual interrupt.)

When the inactive SD's interrupt traps to RDSM, RDSM can "pend" that virtual interrupt for the active SD, and mret back to the active SD.
FIQ will be taken only when active SD unmasks that virtual interrupt.
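
The pend-then-unmask behaviour described above can be modelled with a toy state machine. This is a sketch under stated assumptions: the class and method names are hypothetical, the virtual interrupt is reduced to a single pending flag and a single enable flag, and taking the interrupt is modelled as immediately acknowledging it.

```python
class ActiveDomain:
    """Toy model of an active SD with one maskable virtual "FIQ" line."""

    def __init__(self):
        self.fiq_pending = False
        self.fiq_enabled = False
        self.taken = 0  # how many times the virtual interrupt was taken

    def _deliver(self):
        # The interrupt is taken only when it is both pending and unmasked.
        if self.fiq_pending and self.fiq_enabled:
            self.fiq_pending = False  # simplification: taking it acknowledges it
            self.taken += 1           # the SD would start its managed exit here

    def rdsm_pend_fiq(self):
        """RDSM trapped an inactive SD's interrupt and pends the virtual one."""
        self.fiq_pending = True
        self._deliver()

    def set_fiq_enabled(self, enabled):
        """The active SD masks or unmasks the virtual interrupt by itself."""
        self.fiq_enabled = enabled
        self._deliver()
```

If the RDSM pends the interrupt while the SD has it masked, nothing is taken until the SD calls set_fiq_enabled(True), which matches the "taken only when the active SD unmasks" behaviour.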

It seems you propose sie.SDEIE (or sip.SDEIP) to be this virtual interrupt, am I correct?

@AndybnACT
Author

We can just let mip.MSDEIP and mie.MSDEIE (refer to Smmtt section 6.4 - Machine Interrupt registers) be delegated to S-mode supervisor domain by corresponding bits in mideleg.
Active supervisor domain can determine if there is an "FIQ" in the trap handler by

I don't think this alone fits in the security model of SMMTT. Delegating the entire SDEI to a domain gives it the ultimate privilege to see or ignore all interrupts on the platform.

So that's the reason why I listed the second proposed change. Let m-mode have fine-grained control of "interrupts from which domain are delegated". Only interrupts from a domain whose bit is set in msdeideleg are sent to, and maskable by, the active domain when SDEIE is delegated in mideleg. Interrupts from other domains work as the original spec defines.

FIQ can be a kind of virtual interrupt.
Active SD can mask/unmask that virtual interrupt directly.

Yes, this is a good way to forward the FIQ. RDSM has to temporarily disable interrupts from the inactive domain source before mret back. I was thinking that the s-mode domain must then make an SBI call to ask RDSM to re-enable it after the domain unmasks FIQs. But in fact if we raise a virtual interrupt in RDSM, then the s-mode domain can trap into the entry point to perform managed-exit once the domain unmasks FIQs.

However, as you mentioned, this would cause extra m-mode round trips on the FIQ delivery path. I have not done much with OPTEE and perhaps this is another topic. But I am wondering what makes OPTEE need managed exit, because Linux doesn't need it.

@gagachang

gagachang commented Feb 29, 2024

Hi @AndybnACT

If I understand correctly, your proposal would look like the following pseudocode of CPU actions:

/* Inactive SDs' interrupt controllers send the signal to CPU while CPU is running active SD. */

if (msdeip[i] && msdeie[i]) {    /* Making mip.MSDEIP = 1 */
    if (mie.MSDEIE) {
        if (msdeideleg[i]) {
            sip.SSDEIP = 1;
            if (sie.SSDEIE) {
                /* Trap to S-mode active SD. Acts as FIQ. */
            } else {
                /* Just pending FIQ since active SD masks FIQ. */
            }
        } else {
            /* Trap to M-mode RDSM */
        }
    }
}

But in fact if we raise a virtual interrupt in RDSM, then the s-mode domain can trap into the entry point to perform managed-exit once the domain unmasks FIQs.

Yes, you got the point.

But I am wondering what makes OPTEE need managed exit, because Linux doesn't need it.

AFAIK it is because, on an SMP system, Linux can migrate a task to another CPU, which corrupts the OP-TEE context if there is no managed exit.
Consider the following scenario with managed exit:

  1. Linux thread A on CPU 0 calls TA.
  2. TA is handling request on CPU 0.
  3. Non-secure interrupt traps to S-mode. OP-TEE saves TA context and yields the control to Linux to handle that NS-interrupt.
  4. Linux scheduler migrates thread A from CPU 0 to CPU 1.
  5. Linux thread A on CPU 1 calls OP-TEE again.
  6. OP-TEE restores previous TA context on CPU 1.

If RDSM directly preempts OP-TEE without managed exit:

  1. Linux thread A on CPU 0 calls TA.
  2. TA is handling request on CPU 0.
  3. Non-secure interrupt traps to RDSM. RDSM saves CPU 0's OP-TEE context and mret to Linux.
  4. Linux scheduler migrates thread A from CPU 0 to CPU 1.
  5. Linux thread A on CPU 1 calls OP-TEE again.
  6. RDSM restores CPU 1's OP-TEE context, which is not the correct context of the previous TA.

Because RDSM has no information about task migration, it will restore the wrong OP-TEE context in this case.
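
The two scenarios above can be replayed in a toy Python model. It is only a sketch of the argument, not of OP-TEE or of any RDSM implementation: contexts are plain strings, the function name is hypothetical, and the initial contents of CPU 1's saved slot are invented to stand for whatever stale state happens to be there.

```python
def run_scenario(managed_exit):
    """Replay steps 1 to 6 and return the TA context resumed on CPU 1."""
    # What a context-unaware RDSM keeps: one saved OP-TEE context per CPU.
    per_cpu_saved = {0: None, 1: "stale-cpu1-context"}
    # What OP-TEE keeps with managed exit: one saved context per caller thread.
    per_thread_saved = {}

    ta_context = "TA-for-thread-A"  # steps 1-2: TA runs on CPU 0 for thread A

    # Step 3: a non-secure interrupt arrives.
    if managed_exit:
        per_thread_saved["A"] = ta_context  # OP-TEE saves the context itself
    else:
        per_cpu_saved[0] = ta_context       # RDSM saves it in CPU 0's slot

    # Step 4: Linux migrates thread A from CPU 0 to CPU 1.
    # Steps 5-6: thread A calls OP-TEE again, now on CPU 1.
    if managed_exit:
        return per_thread_saved["A"]
    return per_cpu_saved[1]
```

With managed exit the model resumes "TA-for-thread-A"; without it, the per-CPU slot of CPU 1 is restored and the TA context saved on CPU 0 is never found.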
You can refer to ARM FF-A section 9.3.1.2 - Non-secure interrupt is signaled after a managed exit (ME) for details.
Note that I am not very familiar with Linux task migration.
If my description is wrong please correct me.

@rsahita
Collaborator

rsahita commented Mar 1, 2024

@gagachang The RDSM should not have any info of any task migration by design - it is the role of the SDSM (the security monitor in the supervisor domain) to activate the right TA context on the pCPU/pHart selected by the host (Linux) and save it on an interrupt that was delegated to it or emulated by the RDSM.

Please see the models described in the cove spec (which is one of the use cases of supervisor domains) - see section "5.3. TSM operation and properties" in this spec

@gagachang

gagachang commented Mar 1, 2024

@gagachang The RDSM should not have any info of any task migration by design - it is the role of the SDSM (the security monitor in the supervisor domain) to activate the right TA context on the pCPU/pHart selected by the host (Linux) and save it on an interrupt that was delegated to it or emulated by the RDSM.

Please see the models described in the cove spec (which is one of the use cases of supervisor domains) - see section "5.3. TSM operation and properties" in this spec

Thanks @rsahita! I checked section 5.3 and it aligns with our thinking here.
Yes, RDSM should have a simple design. Task migration is per OS implementation.

From a paragraph of section 5.3:

On an S-mode interrupt, the TSM hart context is saved by the
TSM and keeps the interrupt pending. The TSM may then TEERET to the host OS/VMM with
explicit information about the interruption provided via the pending interrupt to the OS/VMM.

We are wondering whether we can improve the performance of "notifying the TSM that there is an OS/VMM interrupt".
In other words: "notifying the active SD that there is an inactive SD's interrupt".
We would also like a simple way for the active SD to control whether it wants to receive the notification or not. (Proposed by @AndybnACT)
In current Smmtt design, inactive SDs interrupts always trap to M-mode RDSM (by mip.MSDEIP).
It is RDSM's responsibility to inform active SD to suspend its execution (e.g., by pending active SD's virtual interrupt and mret).
This would cause extra M-mode round trips on this information delivery, as Andy mentioned.

If a vendor wants to deploy the unilaterally-mistrusting domains model (e.g., Linux + OP-TEE), we may want to define a faster way in Smmtt to eliminate those round trips.

@rsahita
Collaborator

rsahita commented Mar 1, 2024

In current Smmtt design, inactive SDs interrupts always trap to M-mode RDSM (by mip.MSDEIP).
It is RDSM's responsibility to inform active SD to suspend its execution (e.g., by pending active SD's virtual interrupt and mret).

right - this is by design, as we do not want one SD to prevent/modify the delivery of any interrupts meant for another SD.

@gagachang

right - this is by design, as we do not want one SD to prevent/modify the delivery of any interrupts meant for another SD.

That is correct for mutually-mistrusting SDs.
Could we have more options (or an extension of Smsdia) for Secure/Non-secure SDs?

@rsahita
Collaborator

rsahita commented Mar 5, 2024

@gagachang can you clarify what does Secure/Non-Secure SDs mean?

@gagachang

@gagachang can you clarify what does Secure/Non-Secure SDs mean?

Hi @rsahita Sure!
They are akin to Normal world and Secure world in ARM TrustZone.
Secure SD = Secure World OS (ex: OP-TEE)
Non-secure SD = Non-secure World OS (Linux)

Secure world has higher system permission than normal world:

  1. Secure OS can directly access Non-secure OS memory space.
  2. Secure OS can map non-secure physical memory into its page tables.
  3. Secure OS can receive Non-secure interrupt notification (i.e., FIQ) without EL3 firmware involvement, and yield control to Non-secure world to handle that FIQ. Avoiding the EL3 firmware round trip gives better interrupt performance.

Bullets 1 and 2 above can be handled as a kind of shared memory in MTT.
But bullet 3 needs interrupt delegation, which is what this issue proposes.

@ved-rivos
Collaborator

I suggest discussing in terms of RISC-V architecture. When a Secure-SD is executing, there might be one of the following interrupts pending for non-secure-SD that may require switching out of the currently executing secure-SD to the non-secure-SD - External Interrupt, Timer Interrupt, other local interrupts such as High priority RAS interrupt, Low priority RAS interrupt, high-power or over-temperature event (not ratified), debug/trace interrupt (not ratified) or other custom interrupts.

In the case of the ARM architecture, I believe FIQ used to represent a higher-priority interrupt than IRQ, but starting with ARMv8 I believe that is no longer true: FIQs are generally designated for use as secure interrupts and so would be delegated exclusively to the secure-SD (or shared somehow between the secure monitor and the secure SD).

While what is being discussed here could possibly provide an external interrupt notification to the secure-SD, the RDSM still needs to provide a notification interrupt to the SD on occurrence of all other forms of interrupts. Given that there is no concept of "Fast" interrupt vs. "Slow" interrupt, I am not sure there is much motivation to add complexity and exposures (covert channels, avenues for denial of service, etc.) by delegating just the external interrupt.

@gagachang

Hi @ved-rivos

According to the ARM documents, FIQ is used for Group 1 interrupts from the other security state [1]. (Group 0 is the group of interrupts that should be handled in EL3.)

External Interrupt, Timer Interrupt, other local interrupts such as High priority RAS interrupt, Low priority RAS interrupt, high-power or over-temperature event (not ratified), debug/trace interrupt (not ratified) or other custom interrupts.

Yes, you're right, RISC-V has many local interrupts, while most ARM interrupts are routed into the GIC and become IRQ/FIQ.
IMHO, local interrupts always belong to the active SD (ensured and context-switched by RDSM); otherwise they must trap to and be dispatched by RDSM.
I am also wondering whether "CLIC, or its interrupt sources" could be selected as well (like we select APLIC by SDICN), but it might be too complex.

I am not sure there is much motivation to add complexity and exposures (covert channels, avenues for denial of service, etc.) by delegating just the external interrupt.

If a platform chooses this model, that implies the active SD is trusted by other SDs.
It is an open option for vendors. Vendors can still choose the mutually-distrusting model.

[1]: https://developer.arm.com/documentation/ihi0069/latest/

@ved-rivos
Collaborator

I hope none of the information you pasted is encumbered information. I am choosing not to visit the links you have pasted here.

For the secure/non-secure SD scenarios we only need to be able to delegate the MSDEIP to the secure SD. We can rename this interrupt to SDEIP and make the corresponding bit writable in mip.

@gagachang

gagachang commented Mar 8, 2024

@ved-rivos

I hope none of the information you pasted is encumbered information.

Are you asking if FIQ has a patent?

For the secure/non-secure SD scenarios we only need to be able to delegate the MSDEIP to the secure SD. We can rename this interrupt to SDEIP and make the corresponding bit writable in mip.

Thanks! I think it is the right direction.

I guess the current design would be:

if (mip.SDEIP && !msdeip[SDICN]) {
    /* There is an interrupt from other SD. */
    if (mideleg[32]) {
        /* Trap to S-mode to notify active SD directly. */
        sip.SDEIP = 1
    } else {
        /* Trap to M-mode, M-mode should notify active SD. */
    }
}

BTW, @AndybnACT proposed msdeideleg to let m-mode have a finer-grained control of "interrupts from which domain is delegated". So not all other SDs' interrupts will be delegated to the active SD. @AndybnACT would you have some comments?

if (mip.SDEIP && !msdeip[SDICN]) {
    /* There is an interrupt from other SDs. */

    /* Assume interrupt is from interrupt controller N, which is bound with another SD. */
    if (msdeideleg[N]) {
        /* Trap to S-mode to notify it directly. */
        sip.SDEIP = 1
    } else {
        /* Trap to M-mode, M-mode should notify active SD. */
    }
}

@gagachang

gagachang commented Mar 8, 2024

When a supervisor domain has a supervisor domain interrupt controller directly assigned to it, the RDSM updates msdcfg.SDICN to select that interrupt controller and may clear the bit corresponding to that interrupt controller in msdeie prior to resuming execution of the supervisor domain. Hence, when a supervisor domain is executing on the hart, the bit corresponding to its interrupt controller is 0 in msdeie.

Hi @ved-rivos
How do you feel about skipping the checks for msdeie[SDICN] and msdeip[SDICN] directly when constructing mip.SDEIP?

may clear the bit corresponding to that interrupt controller in msdeie

Skipping them on the hardware side may improve performance, since RDSM doesn't need to set/clear msdeie on every domain context switch.
Does it make sense?

Rough pseudocode:

/* Current design */
mip.SDEIP = msdeie & msdeip

/* Proposed */
mip.SDEIP = {msdeie[MXLEN - 1 : SDICN + 1], msdeie[SDICN - 1: 0]} & {msdeip[MXLEN - 1 : SDICN + 1], msdeip[SDICN - 1: 0]}
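
The difference between the two forms can be checked with a small Python model. This is a sketch only: the function names are hypothetical, the CSRs are plain integers, and the pseudocode's vector result is read as "mip.SDEIP is 1 if any remaining bit is set". Dropping bit SDICN from both vectors by slicing is modelled as masking that bit out, which also covers the edge cases SDICN = 0 and SDICN = MXLEN - 1.

```python
def sdeip_current(msdeie, msdeip):
    """Current design: any enabled and pending domain sets mip.SDEIP."""
    return int((msdeie & msdeip) != 0)


def sdeip_proposed(msdeie, msdeip, sdicn):
    """Proposed design: the active domain's bit (index sdicn) is ignored."""
    others = ~(1 << sdicn)  # mask with every bit set except bit sdicn
    return int((msdeie & msdeip & others) != 0)
```

If only the active domain's bit is enabled and pending, sdeip_current() returns 1 while sdeip_proposed() returns 0, so the RDSM would not need to clear msdeie[SDICN] on a context switch to get the same effect.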

@AndybnACT
Author

BTW, @AndybnACT proposed msdeideleg to let m-mode have a finer-grained control of "interrupts from which domain is delegated". So not all other SDs' interrupts will be delegated to the active SD. @AndybnACT would you have some comments?

Sorry I don't quite understand the first "if" clause. Shouldn't mip.SDEIP be decided after resolving msdei[pe]? I am trying to write down the pseudo code according to my understanding. It should probably be something like this when the spec is unmodified:

/*  external s-domain interrupt */
if (msdeip & msdeie)
    mip.MSDEIP = 1 

IIUC, what is being suggested is likely:

/*  external s-domain interrupt */
if (msdeip & msdeie) {
    /* mip.SDEIP is writable, so it can be delegated to s-mode */
    mip.SDEIP = 1
}

The one I proposed adds a layer of control. It controls whether each s-domain external interrupt is delegated along with mideleg.SDEIE.

/*  external s-domain interrupt */
if (msdeip & msdeie) {
    mip.SDEIP = 1
}
/* Delegation */
if ((~msdeideleg) & msdeip & msdeie) {
    /* causes trap into m-mode */
}
if (mideleg.SDEIE && (msdeideleg & msdeip & msdeie)) {
    /* delegates to s-mode */
}

I am not sure if this complicates the spec too much. What if msdeie & msdeideleg has the active domain's bit set, and it is delegated? Or does it violate the privileged spec? There should be a cleaner way to do it.

@AndybnACT
Author

Skipping them on the hardware side may improve performance, since RDSM doesn't need to set/clear msdeie on every domain context switch.

I don't think performance is a big concern since msdeie is a hart-private CSR, and it's likely that msdeie would be touched anyway. But I do appreciate it because this makes the spec cleaner.

Since interrupts from the active domain go through mip.SEIP, one may deselect mideleg.SEIE to have the interrupt trap into m-mode. Smsdia introduces a way to do the same thing. The active domain's interrupt causes a trap into m-mode if msdeie has the active domain's bit set.

if ((msdeip & msdeie) & ~(1 << msdcfg.SDICN)) {
        mip.SDEIP = 1
}

@ved-rivos
Collaborator

What I proposed is to simply make mideleg[MSDEIP] writable and allow this interrupt to be delegated to the SD.
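
A minimal sketch of what this means for trap routing, assuming the usual mideleg semantics from the RISC-V privileged specification: while the hart runs in S- or U-mode, an enabled pending interrupt traps to S-mode if its mideleg bit is set and to M-mode otherwise. The function name is hypothetical, and the interrupt number is passed in as a parameter because this comment does not state one.

```python
def trap_target(mideleg, irq):
    """Return "S" or "M" for an enabled, pending interrupt number `irq`
    taken while the hart executes in S-mode or U-mode."""
    delegated = bool((mideleg >> irq) & 1)
    return "S" if delegated else "M"
```

With the bit for the domain external interrupt left at 0, the RDSM keeps handling it as before; setting it lets the active SD receive the interrupt directly and mask it through its own sie bit.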

@gagachang

gagachang commented Mar 10, 2024

What I proposed is to simply make mideleg[MSDEIP] writable and allow this interrupt to be delegated to the SD.

Thanks @ved-rivos ! That's a cleaner way to achieve the feature.

@gagachang

gagachang commented Mar 10, 2024

Smsdia introduces a way to do the same thing.

The active domain's interrupts are already controlled to be delegated or not by mideleg[SEIP].
That's why I felt msdeie[SDICN] and msdeip[SDICN] can be skipped by the ISA when constructing mip.SDEIP.
Otherwise, when we want to delegate the active SD's external interrupts, we must set mideleg[SEIP] and clear msdeie[SDICN] respectively during a domain context switch.

Anyway, if programmers are cautious about this, it might not be a problem.

@rsahita
Collaborator

rsahita commented Mar 19, 2024

FYI @AndybnACT @gagachang PR #40

@rsahita
Collaborator

rsahita commented Mar 21, 2024

closing this as PR #40 has been acked by Andy.
