MCS: Improve constraints on grant via reply

Author: Matthew Brecknell
Proposed: 2022-11-04

Summary

In MCS configurations, the ability to grant capabilities when replying to a call is determined entirely by the grant right on the reply capability. If a server can create reply objects by retyping untyped memory, then it can also grant capabilities to its clients. Consequently, many realistic systems would not have the authority confinement property that we would expect.

This RFC proposes two alternatives:

Both alternatives restrict the server's ability to grant via reply according to whether or not the grant right is present on the receive endpoint capability at the time the server called seL4_Recv.
The first alternative keeps the grant bit on reply object capabilities. Servers that want to grant via reply must have grant rights on both the receive endpoint capability and the reply object capability.
The second alternative removes the grant bit from reply object capabilities. Servers that want to grant via reply only need a grant right on the receive endpoint, but can't make reply object capabilities that mask those grant rights.

Motivation

The seL4 integrity theorems show that seL4, in non-MCS configurations, can enforce access control policies. For example, one can construct a system in which untrusted client and server components can communicate through an endpoint, but cannot write to each other's private memory.

The theorems are conditional: They require system setups that constrain the propagation of authority. In particular, grant rights are too powerful for the access control model, so one of the conditions is that untrusted components cannot grant or be granted capabilities.

We would like to prove similar integrity theorems for MCS configurations, and therefore need appropriate mechanisms to limit capability grant rights between untrusted components. Currently, the only way to prevent an MCS server granting to its clients is to prevent it from retyping untyped memory. That's too severe a limitation. It's also an indirect mechanism, which would be difficult to express in an access control model.

This RFC proposes a more direct mechanism for limiting an MCS server's ability to grant to its clients, and should provide a clearer path to useful integrity theorems for MCS configurations.

Guide-level explanation

We give a brief account of the history that led to the current situation.

In the non-MCS kernel, there are no reply objects. Reply capabilities are generated by the kernel when a call is received through an endpoint, and inserted into a slot in the receiver's TCB. The kernel sets the grant right on the reply capability iff the grant right was set on the endpoint cap the receiver used when it called seL4_Recv.

The intention of the design is that the receiver's endpoint capability controls the receiver's ability to grant capabilities when it replies. The implementation is the grant right on the reply capability, but the kernel ensures that the grant right on the generated reply capability is determined by the grant right on the receiver's endpoint capability. The intention is important, because endpoint rights are crucial to the ability to construct systems with useful access control properties.

MCS changed the object-capability model, introducing reply objects and scheduling contexts. Reply objects make protected procedure call stacks more explicit, allowing scheduling contexts to be passed from caller to receiver and back again, potentially recursively. Reply object capabilities are supplied by the receiver, and the kernel internally links the reply object with the calling thread when a call is received.

With respect to grant rights for replies, MCS reused parts of the implementation from the non-MCS kernel, but did not capture its intention. In both MCS and non-MCS, the ability to grant when replying is controlled by the grant bit on the reply capability. However, the way that bit is set is very different. In MCS, the grant right on a reply object capability is typically controlled by the receiver, if it can create reply objects. The grant right on the receiver's endpoint capability has no effect. Consequently, it is harder to construct MCS systems with useful access control properties.

In this RFC, we propose changes that limit an MCS receiver's ability to grant when replying, according to the rights on the receiver's endpoint capability. The kernel propagates the endpoint grant right via the reply object.

Reference-level explanation

We propose to add a boolean canGrant field to the MCS reply object. It is controlled by the kernel, and is only meaningful when the reply object is associated with either a receiver waiting for a caller, or a caller waiting for a reply. In those cases, the canGrant field indicates whether the capcanGrant bit was set on the receiver's endpoint capability when it called seL4_Recv or seL4_ReplyRecv.

Regardless of whether a caller is waiting on the endpoint, the canGrant field of the reply object is set in receiveIPC, which is part of either seL4_Recv or seL4_ReplyRecv. In receiveIPC, the kernel has access to both the receiver's endpoint capability and its reply object capability, so it can directly copy the capCanGrant flag from the receive endpoint capability to the canGrant field of the reply object.

If there is already a caller waiting, then the send phase of the call completes immediately, and the caller becomes blocked on the reply object.

If the receiver has to block waiting for a caller, then either:

The receiver's IPC is cancelled before a caller arrives, and the reply object becomes dissociated from the receiver. In this case, the canGrant value that was set on the reply object is never used.
The receiver remains blocked until a caller arrives, and the canGrant field of the reply object still reflects the capCanGrant bit that was set on the receiver's endpoint capability. The send phase completes, and the caller becomes blocked on the reply object.

Therefore, regardless of how the caller becomes blocked on the reply object, the canGrant field of the reply object reflects the capCanGrant bit on the endpoint capability used to receive that call. This is the property we'll need to prove a useful integrity theorem.

This RFC includes two alternatives:

The first alternative only makes the change just described. The grant bit on the reply object capability (which is separate from the canGrant field of the reply object) is retained, and acts as a mask against the grant right on the receiver's endpoint capability. Therefore, to grant via a reply, a receiver would need grant rights on both the receive endpoint capability and the reply object capability.
The second alternative makes the change just described, and additionally removes the grant bit on reply object capabilities. To grant via a reply, a receiver only needs a grant right on the receive endpoint, but can't make a reply object capability that masks a grant right conferred by the receive endpoint.

Drawbacks

The two alternatives are compared in the next section.

Both alternatives have the potential to break existing MCS systems that include threads that:

receive calls on endpoint capabilities without grant rights, yet
expect to be able to grant capabilities when replying.

Any such systems would need to be updated to add grant rights to the receive endpoint capabilities.

Unfortunately, the breakage will only become evident when the caller tries to the use the capabilities it should have been granted.

A mitigating factor is that non-MCS systems already require the grant bit to be set appropriately on receive endpoint capabilities, so systems ported from non-MCS to MCS should not be affected.

Rationale and alternatives

We've already covered the rationale, so it only remains to compare the two alternatives.

Retaining the grant bit in reply object capabilities

The first alternative retains the grant bit in the reply object capability. The code change is minimal. Any existing MCS systems that make use of the ability to mask grant rights via the reply object capability will continue to enjoy that ability, so the change does not give any thread permissions it did not already have.

Removing the grant bit in reply object capabilities

The second alternative removes the grant bit in the reply object capability. This results in a simpler implementation and API semantics, but removes a feature that might be used in some MCS systems. There is no corresponding feature in the non-MCS kernel, so the feature removal is only potentially relevant to MCS systems.

Unfortunately, the only reasonable way to remove the feature is silently. This means that an existing MCS system that currently uses the feature to mask the grant rights of reply object capabilities will continue to work (assuming the corresponding receive endpoint capabilities have grant rights), but might include threads which gain permissions that the system designer did not intend. That is, there might be threads that did not previously have the right to grant capabilities when replying, that could gain that right when this alternative is implemented.

The rationale for this alternative is that we think it is unlikely that there are any such systems. We are not aware of any real use cases, and have not been able to construct a hypothetical use case that is not more easily implemented without this feature.

If there are any use cases, we would like to hear about them.

Prior art

The main inspiration for this change is the non-MCS version of seL4, which:

already uses the grant bit on a receiver's endpoint capability to govern the receiver's ability to grant capabilities when replying.
does not provide any facility to remove the grant right from a reply capability.

Unresolved questions

Is there a realistic use case for a grant bit on the reply object capability, that can be used to mask a grant right conferred by the receive endpoint capability?

This is the main factor in choosing between the two alternatives. So far, the discussion on this issue has only shown a hypothetical use case, which did not survive scrutiny, and which for non-MCS systems, was based on a mistaken understanding of the API semantics.

If you do use seL4_CNode_Mint to mask the grant rights of MCS reply object capabilities, we would like to hear from you.

Disposition

The TSC as approved the option of the RFC that removes the grant right on the reply cap (the second alternative). Should there be appear a particular use case for it after all, it should be easy enough to consider another RFC to add it.

The seL4 RFC Book