* [RFC PATCH v2.1 05/30] Documentation/x86: Document SGX permission details
[not found] <20220304093524.397485-1-jarkko@kernel.org>
@ 2022-03-04 9:34 ` Jarkko Sakkinen
2022-03-04 9:35 ` [RFC PATCH v2.1 06/30] x86/sgx: Support VMA permissions more relaxed than enclave permissions Jarkko Sakkinen
2022-03-04 9:35 ` [RFC PATCH v2.1 22/30] Documentation/x86: Introduce enclave runtime management section Jarkko Sakkinen
2 siblings, 0 replies; 3+ messages in thread
From: Jarkko Sakkinen @ 2022-03-04 9:34 UTC (permalink / raw)
To: linux-sgx
Cc: Nathaniel McCallum, Reinette Chatre, Andy Lutomirski,
Jarkko Sakkinen, Dave Hansen, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
H. Peter Anvin, Jonathan Corbet,
open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:DOCUMENTATION
From: Reinette Chatre <reinette.chatre@intel.com>
Provide summary of the various permissions involved in
managing access to enclave pages. This summary documents
the foundation for additions related to runtime managing of
enclave page permissions that is made possible with SGX2.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Documentation/x86/sgx.rst | 28 +++++++++++++++++++++++-----
1 file changed, 23 insertions(+), 5 deletions(-)
diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
index 265568a9292c..89ff924b1480 100644
--- a/Documentation/x86/sgx.rst
+++ b/Documentation/x86/sgx.rst
@@ -71,16 +71,34 @@ The processor tracks EPC pages in a hardware metadata structure called the
which describes the owning enclave, access rights and page type among the other
things.
-EPCM permissions are separate from the normal page tables. This prevents the
-kernel from, for instance, allowing writes to data which an enclave wishes to
-remain read-only. EPCM permissions may only impose additional restrictions on
-top of normal x86 page permissions.
-
For all intents and purposes, the SGX architecture allows the processor to
invalidate all EPCM entries at will. This requires that software be prepared to
handle an EPCM fault at any time. In practice, this can happen on events like
power transitions when the ephemeral key that encrypts enclave memory is lost.
+Details about enclave page permissions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+EPCM permissions are separate from the normal page tables. This prevents the
+kernel from, for instance, allowing writes to data which an enclave wishes
+to remain read-only.
+
+Three permission masks are relevant to SGX:
+
+* EPCM permissions.
+* Page Table Entry (PTE) permissions.
+* Virtual Memory Area (VMA) permissions.
+
+An enclave is only able to access an enclave page if all three permission
+masks enable it to do so.
+
+The relationships between the different permission masks are:
+
+* An SGX VMA can only be created if its permissions are the same or weaker
+ than the EPCM permissions.
+* PTEs are installed to match the EPCM permissions, but not be more
+ relaxed than the VMA permissions.
+
Application interface
=====================
--
2.35.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [RFC PATCH v2.1 06/30] x86/sgx: Support VMA permissions more relaxed than enclave permissions
[not found] <20220304093524.397485-1-jarkko@kernel.org>
2022-03-04 9:34 ` [RFC PATCH v2.1 05/30] Documentation/x86: Document SGX permission details Jarkko Sakkinen
@ 2022-03-04 9:35 ` Jarkko Sakkinen
2022-03-04 9:35 ` [RFC PATCH v2.1 22/30] Documentation/x86: Introduce enclave runtime management section Jarkko Sakkinen
2 siblings, 0 replies; 3+ messages in thread
From: Jarkko Sakkinen @ 2022-03-04 9:35 UTC (permalink / raw)
To: linux-sgx
Cc: Nathaniel McCallum, Reinette Chatre, Jarkko Sakkinen, Dave Hansen,
Thomas Gleixner, Ingo Molnar, Borislav Petkov,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), H. Peter Anvin,
Jonathan Corbet, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:DOCUMENTATION
From: Reinette Chatre <reinette.chatre@intel.com>
=== Summary ===
An SGX VMA can only be created if its permissions are the same or
weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
creation this same rule is again enforced by the page fault handler:
faulted enclave pages are required to have equal or more relaxed
EPCM permissions than the VMA permissions.
On SGX1 systems the additional enforcement in the page fault handler
is redundant and on SGX2 systems it incorrectly prevents access.
On SGX1 systems it is unnecessary to repeat the enforcement of the
permission rule. The rule used during original VMA creation will
ensure that any access attempt will use correct permissions.
With SGX2 the EPCM permissions of a page can change after VMA
creation resulting in the VMA permissions potentially being more
relaxed than the EPCM permissions and the page fault handler
incorrectly blocking valid access attempts.
Enable the VMA's pages to remain accessible while ensuring that
the PTEs are installed to match the EPCM permissions but not be
more relaxed than the VMA permissions.
=== Full Changelog ===
An SGX enclave is an area of memory where parts of an application
can reside. First an enclave is created and loaded (from
non-enclave memory) with the code and data of an application,
then user space can map (mmap()) the enclave memory to
be able to enter the enclave at its defined entry points for
execution within it.
The hardware maintains a secure structure, the Enclave Page Cache Map
(EPCM), that tracks the contents of the enclave. Of interest here is
its tracking of the enclave page permissions. When a page is loaded
into the enclave its permissions are specified and recorded in the
EPCM. In parallel the kernel maintains permissions within the
page table entries (PTEs) and the rule is that PTE permissions
are not allowed to be more relaxed than the EPCM permissions.
A new mapping (mmap()) of enclave memory can only succeed if the
mapping has the same or weaker permissions than the permissions that
were vetted during enclave creation. This is enforced by
sgx_encl_may_map() that is called on the mmap() as well as mprotect()
paths. This rule remains.
One feature of SGX2 is to support the modification of EPCM permissions
after enclave initialization. Enclave pages may thus already be part
of a VMA at the time their EPCM permissions are changed resulting
in the VMA's permissions potentially being more relaxed than the EPCM
permissions.
Allow permissions of existing VMAs to be more relaxed than EPCM
permissions in preparation for dynamic EPCM permission changes
made possible in SGX2. New VMAs that attempt to have more relaxed
permissions than EPCM permissions continue to be unsupported.
Reasons why permissions of existing VMAs are allowed to be more relaxed
than EPCM permissions instead of dynamically changing VMA permissions
when EPCM permissions change are:
1) Changing VMA permissions involve splitting VMAs which is an
operation that can fail. Additionally changing EPCM permissions of
a range of pages could also fail on any of the pages involved.
Handling these error cases causes problems. For example, if an
EPCM permission change fails and the VMA has already been split
then it is not possible to undo the VMA split nor possible to
undo the EPCM permission changes that did succeed before the
failure.
2) The kernel has little insight into the user space where EPCM
permissions are controlled from. For example, a RW page may
be made RO just before it is made RX and splitting the VMAs
while the VMAs may change soon is unnecessary.
Remove the extra permission check called on a page fault
(vm_operations_struct->fault) or during debugging
(vm_operations_struct->access) when loading the enclave page from swap
that ensures that the VMA permissions are not more relaxed than the
EPCM permissions. Since a VMA could only exist if it passed the
original permission checks during mmap() and a VMA may indeed
have more relaxed permissions than the EPCM permissions this extra
permission check is no longer appropriate.
With the permission check removed, ensure that PTEs do
not blindly inherit the VMA permissions but instead the permissions
that the VMA and EPCM agree on. PTEs for writable pages (from VMA
and enclave perspective) are installed with the writable bit set,
reducing the need for this additional flow to the permission mismatch
cases handled next.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Documentation/x86/sgx.rst | 10 +++++++++
arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++----------------
2 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
index 89ff924b1480..5659932728a5 100644
--- a/Documentation/x86/sgx.rst
+++ b/Documentation/x86/sgx.rst
@@ -99,6 +99,16 @@ The relationships between the different permission masks are:
* PTEs are installed to match the EPCM permissions, but not be more
relaxed than the VMA permissions.
+On systems supporting SGX2 EPCM permissions may change while the
+enclave page belongs to a VMA without impacting the VMA permissions.
+This means that a running VMA may appear to allow access to an enclave
+page that is not allowed by its EPCM permissions. For example, when an
+enclave page with RW EPCM permissions is mapped by a RW VMA but is
+subsequently changed to have read-only EPCM permissions. The kernel
+continues to maintain correct access to the enclave page through the
+PTE that will ensure that only access allowed by both the VMA
+and EPCM permissions are permitted.
+
Application interface
=====================
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..20e97d3abdce 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
}
static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
- unsigned long addr,
- unsigned long vm_flags)
+ unsigned long addr)
{
- unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
struct sgx_epc_page *epc_page;
struct sgx_encl_page *entry;
@@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
if (!entry)
return ERR_PTR(-EFAULT);
- /*
- * Verify that the faulted page has equal or higher build time
- * permissions than the VMA permissions (i.e. the subset of {VM_READ,
- * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
- */
- if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
- return ERR_PTR(-EFAULT);
-
/* Entry successfully located. */
if (entry->epc_page) {
if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
@@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
{
unsigned long addr = (unsigned long)vmf->address;
struct vm_area_struct *vma = vmf->vma;
+ unsigned long page_prot_bits;
struct sgx_encl_page *entry;
+ unsigned long vm_prot_bits;
unsigned long phys_addr;
struct sgx_encl *encl;
vm_fault_t ret;
@@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
mutex_lock(&encl->lock);
- entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
+ entry = sgx_encl_load_page(encl, addr);
if (IS_ERR(entry)) {
mutex_unlock(&encl->lock);
@@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
- ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
+ /*
+ * Insert PTE to match the EPCM page permissions ensured to not
+ * exceed the VMA permissions.
+ */
+ vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+ page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+ /*
+ * Add VM_SHARED so that PTE is made writable right away if VMA
+ * and EPCM are writable (no COW in SGX).
+ */
+ page_prot_bits |= (vma->vm_flags & VM_SHARED);
+ ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
+ vm_get_page_prot(page_prot_bits));
if (ret != VM_FAULT_NOPAGE) {
mutex_unlock(&encl->lock);
@@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
* Load an enclave page to EPC if required, and take encl->lock.
*/
static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
- unsigned long addr,
- unsigned long vm_flags)
+ unsigned long addr)
{
struct sgx_encl_page *entry;
for ( ; ; ) {
mutex_lock(&encl->lock);
- entry = sgx_encl_load_page(encl, addr, vm_flags);
+ entry = sgx_encl_load_page(encl, addr);
if (PTR_ERR(entry) != -EBUSY)
break;
@@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
return -EFAULT;
for (i = 0; i < len; i += cnt) {
- entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK,
- vma->vm_flags);
+ entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK);
if (IS_ERR(entry)) {
ret = PTR_ERR(entry);
break;
--
2.35.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [RFC PATCH v2.1 22/30] Documentation/x86: Introduce enclave runtime management section
[not found] <20220304093524.397485-1-jarkko@kernel.org>
2022-03-04 9:34 ` [RFC PATCH v2.1 05/30] Documentation/x86: Document SGX permission details Jarkko Sakkinen
2022-03-04 9:35 ` [RFC PATCH v2.1 06/30] x86/sgx: Support VMA permissions more relaxed than enclave permissions Jarkko Sakkinen
@ 2022-03-04 9:35 ` Jarkko Sakkinen
2 siblings, 0 replies; 3+ messages in thread
From: Jarkko Sakkinen @ 2022-03-04 9:35 UTC (permalink / raw)
To: linux-sgx
Cc: Nathaniel McCallum, Reinette Chatre, Jarkko Sakkinen, Dave Hansen,
Thomas Gleixner, Ingo Molnar, Borislav Petkov,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), H. Peter Anvin,
Jonathan Corbet, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:DOCUMENTATION
From: Reinette Chatre <reinette.chatre@intel.com>
Enclave runtime management is introduced following the pattern
of the section describing enclave building. Provide a brief
summary of enclave runtime management, pointing to the functions
implementing the ioctl()s that will contain details within their
kernel-doc.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Documentation/x86/sgx.rst | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
index 5659932728a5..6c66ce0ec69c 100644
--- a/Documentation/x86/sgx.rst
+++ b/Documentation/x86/sgx.rst
@@ -128,6 +128,22 @@ pages and establish enclave page permissions.
sgx_ioc_enclave_init
sgx_ioc_enclave_provision
+Enclave runtime management
+--------------------------
+
+Systems supporting SGX2 additionally support changes to initialized
+enclaves: modifying enclave page permissions and type, and dynamically
+adding and removing of enclave pages. When an enclave accesses an address
+within its address range that does not have a backing page then a new
+regular page will be dynamically added to the enclave. The enclave is
+still required to run EACCEPT on the new page before it can be used.
+
+.. kernel-doc:: arch/x86/kernel/cpu/sgx/ioctl.c
+ :functions: sgx_ioc_enclave_relax_perm
+ sgx_ioc_enclave_restrict_perm
+ sgx_ioc_enclave_modt
+ sgx_ioc_enclave_remove_pages
+
Enclave vDSO
------------
--
2.35.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-03-04 9:39 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20220304093524.397485-1-jarkko@kernel.org>
2022-03-04 9:34 ` [RFC PATCH v2.1 05/30] Documentation/x86: Document SGX permission details Jarkko Sakkinen
2022-03-04 9:35 ` [RFC PATCH v2.1 06/30] x86/sgx: Support VMA permissions more relaxed than enclave permissions Jarkko Sakkinen
2022-03-04 9:35 ` [RFC PATCH v2.1 22/30] Documentation/x86: Introduce enclave runtime management section Jarkko Sakkinen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).