* [PATCH v5 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags
2024-08-21 10:02 [PATCH v5 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
@ 2024-08-21 10:02 ` Dmitrii Kuvaiskii
2024-08-27 18:19 ` Jarkko Sakkinen
2024-08-21 10:02 ` [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
2024-08-21 10:02 ` [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2 siblings, 1 reply; 9+ messages in thread
From: Dmitrii Kuvaiskii @ 2024-08-21 10:02 UTC (permalink / raw)
To: dave.hansen, jarkko, kai.huang, haitao.huang, reinette.chatre,
linux-sgx, linux-kernel
Cc: mona.vij, kailun.qin, stable
The page reclaimer thread sets SGX_ENC_PAGE_BEING_RECLAIMED flag when
the enclave page is being reclaimed (moved to the backing store). This
flag however has two logical meanings:
1. Don't attempt to load the enclave page (the page is busy), see
__sgx_encl_load_page().
2. Don't attempt to remove the PCMD page corresponding to this enclave
page (the PCMD page is busy), see reclaimer_writing_to_pcmd().
To reflect these two meanings, split SGX_ENCL_PAGE_BEING_RECLAIMED into
two flags: SGX_ENCL_PAGE_BUSY and SGX_ENCL_PAGE_PCMD_BUSY. Currently,
both flags are set only when the enclave page is being reclaimed (by the
page reclaimer thread). A future commit will introduce new cases when
the enclave page is being operated on; these new cases will set only the
SGX_ENCL_PAGE_BUSY flag.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Reviewed-by: Haitao Huang <haitao.huang@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
---
arch/x86/kernel/cpu/sgx/encl.c | 16 +++++++---------
arch/x86/kernel/cpu/sgx/encl.h | 10 ++++++++--
arch/x86/kernel/cpu/sgx/main.c | 4 ++--
3 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 279148e72459..c0a3c00284c8 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -46,10 +46,10 @@ static int sgx_encl_lookup_backing(struct sgx_encl *encl, unsigned long page_ind
* a check if an enclave page sharing the PCMD page is in the process of being
* reclaimed.
*
- * The reclaimer sets the SGX_ENCL_PAGE_BEING_RECLAIMED flag when it
- * intends to reclaim that enclave page - it means that the PCMD page
- * associated with that enclave page is about to get some data and thus
- * even if the PCMD page is empty, it should not be truncated.
+ * The reclaimer sets the SGX_ENCL_PAGE_PCMD_BUSY flag when it intends to
+ * reclaim that enclave page - it means that the PCMD page associated with that
+ * enclave page is about to get some data and thus even if the PCMD page is
+ * empty, it should not be truncated.
*
* Context: Enclave mutex (&sgx_encl->lock) must be held.
* Return: 1 if the reclaimer is about to write to the PCMD page
@@ -77,8 +77,7 @@ static int reclaimer_writing_to_pcmd(struct sgx_encl *encl,
* Stop when reaching the SECS page - it does not
* have a page_array entry and its reclaim is
* started and completed with enclave mutex held so
- * it does not use the SGX_ENCL_PAGE_BEING_RECLAIMED
- * flag.
+ * it does not use the SGX_ENCL_PAGE_PCMD_BUSY flag.
*/
if (addr == encl->base + encl->size)
break;
@@ -91,8 +90,7 @@ static int reclaimer_writing_to_pcmd(struct sgx_encl *encl,
* VA page slot ID uses same bit as the flag so it is important
* to ensure that the page is not already in backing store.
*/
- if (entry->epc_page &&
- (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)) {
+ if (entry->epc_page && (entry->desc & SGX_ENCL_PAGE_PCMD_BUSY)) {
reclaimed = 1;
break;
}
@@ -257,7 +255,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
/* Entry successfully located. */
if (entry->epc_page) {
- if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
+ if (entry->desc & SGX_ENCL_PAGE_BUSY)
return ERR_PTR(-EBUSY);
return entry;
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index f94ff14c9486..b566b8ad5f33 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -22,8 +22,14 @@
/* 'desc' bits holding the offset in the VA (version array) page. */
#define SGX_ENCL_PAGE_VA_OFFSET_MASK GENMASK_ULL(11, 3)
-/* 'desc' bit marking that the page is being reclaimed. */
-#define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3)
+/* 'desc' bit indicating that the page is busy (being reclaimed). */
+#define SGX_ENCL_PAGE_BUSY BIT(2)
+
+/*
+ * 'desc' bit indicating that PCMD page associated with the enclave page is
+ * busy (because the enclave page is being reclaimed).
+ */
+#define SGX_ENCL_PAGE_PCMD_BUSY BIT(3)
struct sgx_encl_page {
unsigned long desc;
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 166692f2d501..e94b09c43673 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -204,7 +204,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
void *va_slot;
int ret;
- encl_page->desc &= ~SGX_ENCL_PAGE_BEING_RECLAIMED;
+ encl_page->desc &= ~(SGX_ENCL_PAGE_BUSY | SGX_ENCL_PAGE_PCMD_BUSY);
va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
list);
@@ -340,7 +340,7 @@ static void sgx_reclaim_pages(void)
goto skip;
}
- encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
+ encl_page->desc |= SGX_ENCL_PAGE_BUSY | SGX_ENCL_PAGE_PCMD_BUSY;
mutex_unlock(&encl_page->encl->lock);
continue;
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v5 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags
2024-08-21 10:02 ` [PATCH v5 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags Dmitrii Kuvaiskii
@ 2024-08-27 18:19 ` Jarkko Sakkinen
0 siblings, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2024-08-27 18:19 UTC (permalink / raw)
To: Dmitrii Kuvaiskii, dave.hansen, kai.huang, haitao.huang,
reinette.chatre, linux-sgx, linux-kernel
Cc: mona.vij, kailun.qin, stable
On Wed Aug 21, 2024 at 1:02 PM EEST, Dmitrii Kuvaiskii wrote:
> The page reclaimer thread sets SGX_ENC_PAGE_BEING_RECLAIMED flag when
> the enclave page is being reclaimed (moved to the backing store). This
> flag however has two logical meanings:
>
> 1. Don't attempt to load the enclave page (the page is busy), see
> __sgx_encl_load_page().
> 2. Don't attempt to remove the PCMD page corresponding to this enclave
> page (the PCMD page is busy), see reclaimer_writing_to_pcmd().
>
> To reflect these two meanings, split SGX_ENCL_PAGE_BEING_RECLAIMED into
> two flags: SGX_ENCL_PAGE_BUSY and SGX_ENCL_PAGE_PCMD_BUSY. Currently,
> both flags are set only when the enclave page is being reclaimed (by the
> page reclaimer thread). A future commit will introduce new cases when
> the enclave page is being operated on; these new cases will set only the
> SGX_ENCL_PAGE_BUSY flag.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
> Reviewed-by: Haitao Huang <haitao.huang@linux.intel.com>
> Acked-by: Kai Huang <kai.huang@intel.com>
> ---
> arch/x86/kernel/cpu/sgx/encl.c | 16 +++++++---------
> arch/x86/kernel/cpu/sgx/encl.h | 10 ++++++++--
> arch/x86/kernel/cpu/sgx/main.c | 4 ++--
> 3 files changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 279148e72459..c0a3c00284c8 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -46,10 +46,10 @@ static int sgx_encl_lookup_backing(struct sgx_encl *encl, unsigned long page_ind
> * a check if an enclave page sharing the PCMD page is in the process of being
> * reclaimed.
> *
> - * The reclaimer sets the SGX_ENCL_PAGE_BEING_RECLAIMED flag when it
> - * intends to reclaim that enclave page - it means that the PCMD page
> - * associated with that enclave page is about to get some data and thus
> - * even if the PCMD page is empty, it should not be truncated.
> + * The reclaimer sets the SGX_ENCL_PAGE_PCMD_BUSY flag when it intends to
> + * reclaim that enclave page - it means that the PCMD page associated with that
> + * enclave page is about to get some data and thus even if the PCMD page is
> + * empty, it should not be truncated.
> *
> * Context: Enclave mutex (&sgx_encl->lock) must be held.
> * Return: 1 if the reclaimer is about to write to the PCMD page
> @@ -77,8 +77,7 @@ static int reclaimer_writing_to_pcmd(struct sgx_encl *encl,
> * Stop when reaching the SECS page - it does not
> * have a page_array entry and its reclaim is
> * started and completed with enclave mutex held so
> - * it does not use the SGX_ENCL_PAGE_BEING_RECLAIMED
> - * flag.
> + * it does not use the SGX_ENCL_PAGE_PCMD_BUSY flag.
> */
> if (addr == encl->base + encl->size)
> break;
> @@ -91,8 +90,7 @@ static int reclaimer_writing_to_pcmd(struct sgx_encl *encl,
> * VA page slot ID uses same bit as the flag so it is important
> * to ensure that the page is not already in backing store.
> */
> - if (entry->epc_page &&
> - (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)) {
> + if (entry->epc_page && (entry->desc & SGX_ENCL_PAGE_PCMD_BUSY)) {
> reclaimed = 1;
> break;
> }
> @@ -257,7 +255,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
>
> /* Entry successfully located. */
> if (entry->epc_page) {
> - if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
> + if (entry->desc & SGX_ENCL_PAGE_BUSY)
> return ERR_PTR(-EBUSY);
>
> return entry;
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index f94ff14c9486..b566b8ad5f33 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -22,8 +22,14 @@
> /* 'desc' bits holding the offset in the VA (version array) page. */
> #define SGX_ENCL_PAGE_VA_OFFSET_MASK GENMASK_ULL(11, 3)
>
> -/* 'desc' bit marking that the page is being reclaimed. */
> -#define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3)
> +/* 'desc' bit indicating that the page is busy (being reclaimed). */
> +#define SGX_ENCL_PAGE_BUSY BIT(2)
> +
> +/*
> + * 'desc' bit indicating that PCMD page associated with the enclave page is
> + * busy (because the enclave page is being reclaimed).
> + */
> +#define SGX_ENCL_PAGE_PCMD_BUSY BIT(3)
>
> struct sgx_encl_page {
> unsigned long desc;
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 166692f2d501..e94b09c43673 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -204,7 +204,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
> void *va_slot;
> int ret;
>
> - encl_page->desc &= ~SGX_ENCL_PAGE_BEING_RECLAIMED;
> + encl_page->desc &= ~(SGX_ENCL_PAGE_BUSY | SGX_ENCL_PAGE_PCMD_BUSY);
>
> va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
> list);
> @@ -340,7 +340,7 @@ static void sgx_reclaim_pages(void)
> goto skip;
> }
>
> - encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
> + encl_page->desc |= SGX_ENCL_PAGE_BUSY | SGX_ENCL_PAGE_PCMD_BUSY;
> mutex_unlock(&encl_page->encl->lock);
> continue;
>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
BR, Jarkko
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
2024-08-21 10:02 [PATCH v5 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
2024-08-21 10:02 ` [PATCH v5 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags Dmitrii Kuvaiskii
@ 2024-08-21 10:02 ` Dmitrii Kuvaiskii
2024-08-21 11:15 ` Huang, Kai
2024-08-27 18:19 ` Jarkko Sakkinen
2024-08-21 10:02 ` [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2 siblings, 2 replies; 9+ messages in thread
From: Dmitrii Kuvaiskii @ 2024-08-21 10:02 UTC (permalink / raw)
To: dave.hansen, jarkko, kai.huang, haitao.huang, reinette.chatre,
linux-sgx, linux-kernel
Cc: mona.vij, kailun.qin, stable, Marcelina Kościelnicka
Imagine an mmap()'d file. Two threads touch the same address at the same
time and fault. Both allocate a physical page and race to install a PTE
for that page. Only one will win the race. The loser frees its page, but
still continues handling the fault as a success and returns
VM_FAULT_NOPAGE from the fault handler.
The same race can happen with SGX. But there's a bug: the loser in the
SGX steers into a failure path. The loser EREMOVE's the winner's EPC
page, then returns SIGBUS, likely killing the app.
Fix the SGX loser's behavior. Check whether another thread already
allocated the page and if yes, return with VM_FAULT_NOPAGE.
The race can be illustrated as follows:
/* /*
* Fault on CPU1 * Fault on CPU2
* on enclave page X * on enclave page X
*/ */
sgx_vma_fault() { sgx_vma_fault() {
xa_load(&encl->page_array) xa_load(&encl->page_array)
== NULL --> == NULL -->
sgx_encl_eaug_page() { sgx_encl_eaug_page() {
... ...
/* /*
* alloc encl_page * alloc encl_page
*/ */
mutex_lock(&encl->lock);
/*
* alloc EPC page
*/
epc_page = sgx_alloc_epc_page(...);
/*
* add page to enclave's xarray
*/
xa_insert(&encl->page_array, ...);
/*
* add page to enclave via EAUG
* (page is in pending state)
*/
/*
* add PTE entry
*/
vmf_insert_pfn(...);
mutex_unlock(&encl->lock);
return VM_FAULT_NOPAGE;
}
}
/*
* All good up to here: enclave page
* successfully added to enclave,
* ready for EACCEPT from user space
*/
mutex_lock(&encl->lock);
/*
* alloc EPC page
*/
epc_page = sgx_alloc_epc_page(...);
/*
* add page to enclave's xarray,
* this fails with -EBUSY as this
* page was already added by CPU2
*/
xa_insert(&encl->page_array, ...);
err_out_shrink:
sgx_encl_free_epc_page(epc_page) {
/*
* remove page via EREMOVE
*
* *BUG*: page added by CPU2 is
* yanked from enclave while it
* remains accessible from OS
* perspective (PTE installed)
*/
/*
* free EPC page
*/
sgx_free_epc_page(epc_page);
}
mutex_unlock(&encl->lock);
/*
* *BUG*: SIGBUS is returned
* for a valid enclave page
*/
return VM_FAULT_SIGBUS;
}
}
Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
Cc: stable@vger.kernel.org
Reported-by: Marcelina Kościelnicka <mwk@invisiblethingslab.com>
Suggested-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
---
arch/x86/kernel/cpu/sgx/encl.c | 36 ++++++++++++++++++++--------------
1 file changed, 21 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c0a3c00284c8..2aa7ced0e4a0 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -337,6 +337,16 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
return VM_FAULT_SIGBUS;
+ mutex_lock(&encl->lock);
+
+ /*
+ * Multiple threads may try to fault on the same page concurrently.
+ * Re-check if another thread has already done that.
+ */
+ encl_page = xa_load(&encl->page_array, PFN_DOWN(addr));
+ if (encl_page)
+ goto done;
+
/*
* Ignore internal permission checking for dynamically added pages.
* They matter only for data added during the pre-initialization
@@ -345,23 +355,23 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
*/
secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
- if (IS_ERR(encl_page))
- return VM_FAULT_OOM;
-
- mutex_lock(&encl->lock);
+ if (IS_ERR(encl_page)) {
+ vmret = VM_FAULT_OOM;
+ goto err_out_unlock;
+ }
epc_page = sgx_encl_load_secs(encl);
if (IS_ERR(epc_page)) {
if (PTR_ERR(epc_page) == -EBUSY)
vmret = VM_FAULT_NOPAGE;
- goto err_out_unlock;
+ goto err_out_encl;
}
epc_page = sgx_alloc_epc_page(encl_page, false);
if (IS_ERR(epc_page)) {
if (PTR_ERR(epc_page) == -EBUSY)
vmret = VM_FAULT_NOPAGE;
- goto err_out_unlock;
+ goto err_out_encl;
}
va_page = sgx_encl_grow(encl, false);
@@ -376,10 +386,6 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
encl_page, GFP_KERNEL);
- /*
- * If ret == -EBUSY then page was created in another flow while
- * running without encl->lock
- */
if (ret)
goto err_out_shrink;
@@ -389,7 +395,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
if (ret)
- goto err_out;
+ goto err_out_eaug;
encl_page->encl = encl;
encl_page->epc_page = epc_page;
@@ -408,20 +414,20 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
mutex_unlock(&encl->lock);
return VM_FAULT_SIGBUS;
}
+done:
mutex_unlock(&encl->lock);
return VM_FAULT_NOPAGE;
-err_out:
+err_out_eaug:
xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
-
err_out_shrink:
sgx_encl_shrink(encl, va_page);
err_out_epc:
sgx_encl_free_epc_page(epc_page);
+err_out_encl:
+ kfree(encl_page);
err_out_unlock:
mutex_unlock(&encl->lock);
- kfree(encl_page);
-
return vmret;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
2024-08-21 10:02 ` [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
@ 2024-08-21 11:15 ` Huang, Kai
2024-08-27 18:19 ` Jarkko Sakkinen
1 sibling, 0 replies; 9+ messages in thread
From: Huang, Kai @ 2024-08-21 11:15 UTC (permalink / raw)
To: Kuvaiskii, Dmitrii, linux-sgx@vger.kernel.org, Chatre, Reinette,
jarkko@kernel.org, linux-kernel@vger.kernel.org,
haitao.huang@linux.intel.com, dave.hansen@linux.intel.com
Cc: stable@vger.kernel.org, mwk@invisiblethingslab.com, Vij, Mona,
Qin, Kailun
On Wed, 2024-08-21 at 03:02 -0700, Dmitrii Kuvaiskii wrote:
> Imagine an mmap()'d file. Two threads touch the same address at the same
> time and fault. Both allocate a physical page and race to install a PTE
> for that page. Only one will win the race. The loser frees its page, but
> still continues handling the fault as a success and returns
> VM_FAULT_NOPAGE from the fault handler.
>
> The same race can happen with SGX. But there's a bug: the loser in the
> SGX steers into a failure path. The loser EREMOVE's the winner's EPC
> page, then returns SIGBUS, likely killing the app.
>
> Fix the SGX loser's behavior. Check whether another thread already
> allocated the page and if yes, return with VM_FAULT_NOPAGE.
>
[...]
> Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
> Cc: stable@vger.kernel.org
> Reported-by: Marcelina Kościelnicka <mwk@invisiblethingslab.com>
> Suggested-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
>
Reviewed-by: Kai Huang <kai.huang@intel.com>
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
2024-08-21 10:02 ` [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
2024-08-21 11:15 ` Huang, Kai
@ 2024-08-27 18:19 ` Jarkko Sakkinen
1 sibling, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2024-08-27 18:19 UTC (permalink / raw)
To: Dmitrii Kuvaiskii, dave.hansen, kai.huang, haitao.huang,
reinette.chatre, linux-sgx, linux-kernel
Cc: mona.vij, kailun.qin, stable, Marcelina Kościelnicka
On Wed Aug 21, 2024 at 1:02 PM EEST, Dmitrii Kuvaiskii wrote:
> Imagine an mmap()'d file. Two threads touch the same address at the same
> time and fault. Both allocate a physical page and race to install a PTE
> for that page. Only one will win the race. The loser frees its page, but
> still continues handling the fault as a success and returns
> VM_FAULT_NOPAGE from the fault handler.
>
> The same race can happen with SGX. But there's a bug: the loser in the
> SGX steers into a failure path. The loser EREMOVE's the winner's EPC
> page, then returns SIGBUS, likely killing the app.
>
> Fix the SGX loser's behavior. Check whether another thread already
> allocated the page and if yes, return with VM_FAULT_NOPAGE.
>
> The race can be illustrated as follows:
>
> /* /*
> * Fault on CPU1 * Fault on CPU2
> * on enclave page X * on enclave page X
> */ */
> sgx_vma_fault() { sgx_vma_fault() {
>
> xa_load(&encl->page_array) xa_load(&encl->page_array)
> == NULL --> == NULL -->
>
> sgx_encl_eaug_page() { sgx_encl_eaug_page() {
>
> ... ...
>
> /* /*
> * alloc encl_page * alloc encl_page
> */ */
> mutex_lock(&encl->lock);
> /*
> * alloc EPC page
> */
> epc_page = sgx_alloc_epc_page(...);
> /*
> * add page to enclave's xarray
> */
> xa_insert(&encl->page_array, ...);
> /*
> * add page to enclave via EAUG
> * (page is in pending state)
> */
> /*
> * add PTE entry
> */
> vmf_insert_pfn(...);
>
> mutex_unlock(&encl->lock);
> return VM_FAULT_NOPAGE;
> }
> }
> /*
> * All good up to here: enclave page
> * successfully added to enclave,
> * ready for EACCEPT from user space
> */
> mutex_lock(&encl->lock);
> /*
> * alloc EPC page
> */
> epc_page = sgx_alloc_epc_page(...);
> /*
> * add page to enclave's xarray,
> * this fails with -EBUSY as this
> * page was already added by CPU2
> */
> xa_insert(&encl->page_array, ...);
>
> err_out_shrink:
> sgx_encl_free_epc_page(epc_page) {
> /*
> * remove page via EREMOVE
> *
> * *BUG*: page added by CPU2 is
> * yanked from enclave while it
> * remains accessible from OS
> * perspective (PTE installed)
> */
> /*
> * free EPC page
> */
> sgx_free_epc_page(epc_page);
> }
>
> mutex_unlock(&encl->lock);
> /*
> * *BUG*: SIGBUS is returned
> * for a valid enclave page
> */
> return VM_FAULT_SIGBUS;
> }
> }
>
> Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
> Cc: stable@vger.kernel.org
> Reported-by: Marcelina Kościelnicka <mwk@invisiblethingslab.com>
> Suggested-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
> ---
> arch/x86/kernel/cpu/sgx/encl.c | 36 ++++++++++++++++++++--------------
> 1 file changed, 21 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index c0a3c00284c8..2aa7ced0e4a0 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -337,6 +337,16 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> return VM_FAULT_SIGBUS;
>
> + mutex_lock(&encl->lock);
> +
> + /*
> + * Multiple threads may try to fault on the same page concurrently.
> + * Re-check if another thread has already done that.
> + */
> + encl_page = xa_load(&encl->page_array, PFN_DOWN(addr));
> + if (encl_page)
> + goto done;
> +
> /*
> * Ignore internal permission checking for dynamically added pages.
> * They matter only for data added during the pre-initialization
> @@ -345,23 +355,23 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> */
> secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
> encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
> - if (IS_ERR(encl_page))
> - return VM_FAULT_OOM;
> -
> - mutex_lock(&encl->lock);
> + if (IS_ERR(encl_page)) {
> + vmret = VM_FAULT_OOM;
> + goto err_out_unlock;
> + }
>
> epc_page = sgx_encl_load_secs(encl);
> if (IS_ERR(epc_page)) {
> if (PTR_ERR(epc_page) == -EBUSY)
> vmret = VM_FAULT_NOPAGE;
> - goto err_out_unlock;
> + goto err_out_encl;
> }
>
> epc_page = sgx_alloc_epc_page(encl_page, false);
> if (IS_ERR(epc_page)) {
> if (PTR_ERR(epc_page) == -EBUSY)
> vmret = VM_FAULT_NOPAGE;
> - goto err_out_unlock;
> + goto err_out_encl;
> }
>
> va_page = sgx_encl_grow(encl, false);
> @@ -376,10 +386,6 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>
> ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
> encl_page, GFP_KERNEL);
> - /*
> - * If ret == -EBUSY then page was created in another flow while
> - * running without encl->lock
> - */
> if (ret)
> goto err_out_shrink;
>
> @@ -389,7 +395,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>
> ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
> if (ret)
> - goto err_out;
> + goto err_out_eaug;
>
> encl_page->encl = encl;
> encl_page->epc_page = epc_page;
> @@ -408,20 +414,20 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> mutex_unlock(&encl->lock);
> return VM_FAULT_SIGBUS;
> }
> +done:
> mutex_unlock(&encl->lock);
> return VM_FAULT_NOPAGE;
>
> -err_out:
> +err_out_eaug:
> xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
> -
> err_out_shrink:
> sgx_encl_shrink(encl, va_page);
> err_out_epc:
> sgx_encl_free_epc_page(epc_page);
> +err_out_encl:
> + kfree(encl_page);
> err_out_unlock:
> mutex_unlock(&encl->lock);
> - kfree(encl_page);
> -
> return vmret;
> }
>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
BR, Jarkko
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race
2024-08-21 10:02 [PATCH v5 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
2024-08-21 10:02 ` [PATCH v5 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags Dmitrii Kuvaiskii
2024-08-21 10:02 ` [PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
@ 2024-08-21 10:02 ` Dmitrii Kuvaiskii
2024-08-21 11:11 ` Huang, Kai
2024-08-27 18:20 ` Jarkko Sakkinen
2 siblings, 2 replies; 9+ messages in thread
From: Dmitrii Kuvaiskii @ 2024-08-21 10:02 UTC (permalink / raw)
To: dave.hansen, jarkko, kai.huang, haitao.huang, reinette.chatre,
linux-sgx, linux-kernel
Cc: mona.vij, kailun.qin, stable
Two enclave threads may try to add and remove the same enclave page
simultaneously (e.g., if the SGX runtime supports both lazy allocation
and MADV_DONTNEED semantics). Consider some enclave page added to the
enclave. User space decides to temporarily remove this page (e.g.,
emulating the MADV_DONTNEED semantics) on CPU1. At the same time, user
space performs a memory access on the same page on CPU2, which results
in a #PF and ultimately in sgx_vma_fault(). Scenario proceeds as
follows:
/*
* CPU1: User space performs
* ioctl(SGX_IOC_ENCLAVE_REMOVE_PAGES)
* on enclave page X
*/
sgx_encl_remove_pages() {
mutex_lock(&encl->lock);
entry = sgx_encl_load_page(encl);
/*
* verify that page is
* trimmed and accepted
*/
mutex_unlock(&encl->lock);
/*
* remove PTE entry; cannot
* be performed under lock
*/
sgx_zap_enclave_ptes(encl);
/*
* Fault on CPU2 on same page X
*/
sgx_vma_fault() {
/*
* PTE entry was removed, but the
* page is still in enclave's xarray
*/
xa_load(&encl->page_array) != NULL ->
/*
* SGX driver thinks that this page
* was swapped out and loads it
*/
mutex_lock(&encl->lock);
/*
* this is effectively a no-op
*/
entry = sgx_encl_load_page_in_vma();
/*
* add PTE entry
*
* *BUG*: a PTE is installed for a
* page in process of being removed
*/
vmf_insert_pfn(...);
mutex_unlock(&encl->lock);
return VM_FAULT_NOPAGE;
}
/*
* continue with page removal
*/
mutex_lock(&encl->lock);
sgx_encl_free_epc_page(epc_page) {
/*
* remove page via EREMOVE
*/
/*
* free EPC page
*/
sgx_free_epc_page(epc_page);
}
xa_erase(&encl->page_array);
mutex_unlock(&encl->lock);
}
Here, CPU1 removed the page. However CPU2 installed the PTE entry on the
same page. This enclave page becomes perpetually inaccessible (until
another SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl). This is because the page is
marked accessible in the PTE entry but is not EAUGed, and any subsequent
access to this page raises a fault: with the kernel believing there to
be a valid VMA, the unlikely error code X86_PF_SGX encountered by code
path do_user_addr_fault() -> access_error() causes the SGX driver's
sgx_vma_fault() to be skipped and user space receives a SIGSEGV instead.
The userspace SIGSEGV handler cannot perform EACCEPT because the page
was not EAUGed. Thus, the user space is stuck with the inaccessible
page.
Fix this race by forcing the fault handler on CPU2 to back off if the
page is currently being removed (on CPU1). This is achieved by
setting SGX_ENCL_PAGE_BUSY flag right-before the first mutex_unlock() in
sgx_encl_remove_pages(). Upon loading the page, CPU2 checks whether this
page is busy, and if yes then CPU2 backs off and waits until the page is
completely removed. After that, any memory access to this page results
in a normal "allocate and EAUG a page on #PF" flow.
Additionally fix a similar race: user space converts a normal enclave
page to a TCS page (via SGX_IOC_ENCLAVE_MODIFY_TYPES) on CPU1, and at
the same time, user space performs a memory access on the same page on
CPU2. This fix is not strictly necessary (this particular race would
indicate a bug in a user space application), but it gives a consistent
rule: if an enclave page is under certain operation by the kernel with
the mapping removed, then other threads trying to access that page are
temporarily blocked and should retry.
Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
---
arch/x86/kernel/cpu/sgx/encl.h | 3 ++-
arch/x86/kernel/cpu/sgx/ioctl.c | 17 +++++++++++++++++
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index b566b8ad5f33..96b11e8fb770 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -22,7 +22,8 @@
/* 'desc' bits holding the offset in the VA (version array) page. */
#define SGX_ENCL_PAGE_VA_OFFSET_MASK GENMASK_ULL(11, 3)
-/* 'desc' bit indicating that the page is busy (being reclaimed). */
+/* 'desc' bit indicating that the page is busy (being reclaimed, removed or
+ * converted to a TCS page). */
#define SGX_ENCL_PAGE_BUSY BIT(2)
/*
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 5d390df21440..ee619f2b3414 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -969,12 +969,22 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl,
/*
* Do not keep encl->lock because of dependency on
* mmap_lock acquired in sgx_zap_enclave_ptes().
+ *
+ * Releasing encl->lock leads to a data race: while CPU1
+ * performs sgx_zap_enclave_ptes() and removes the PTE
+ * entry for the enclave page, CPU2 may attempt to load
+ * this page (because the page is still in enclave's
+ * xarray). To prevent CPU2 from loading the page, mark
+ * the page as busy before unlock and unmark after lock
+ * again.
*/
+ entry->desc |= SGX_ENCL_PAGE_BUSY;
mutex_unlock(&encl->lock);
sgx_zap_enclave_ptes(encl, addr);
mutex_lock(&encl->lock);
+ entry->desc &= ~SGX_ENCL_PAGE_BUSY;
sgx_mark_page_reclaimable(entry->epc_page);
}
@@ -1141,7 +1151,14 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
/*
* Do not keep encl->lock because of dependency on
* mmap_lock acquired in sgx_zap_enclave_ptes().
+ *
+ * Releasing encl->lock leads to a data race: while CPU1
+ * performs sgx_zap_enclave_ptes() and removes the PTE entry
+ * for the enclave page, CPU2 may attempt to load this page
+ * (because the page is still in enclave's xarray). To prevent
+ * CPU2 from loading the page, mark the page as busy.
*/
+ entry->desc |= SGX_ENCL_PAGE_BUSY;
mutex_unlock(&encl->lock);
sgx_zap_enclave_ptes(encl, addr);
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race
2024-08-21 10:02 ` [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
@ 2024-08-21 11:11 ` Huang, Kai
2024-08-27 18:20 ` Jarkko Sakkinen
1 sibling, 0 replies; 9+ messages in thread
From: Huang, Kai @ 2024-08-21 11:11 UTC (permalink / raw)
To: Kuvaiskii, Dmitrii, linux-sgx@vger.kernel.org, Chatre, Reinette,
jarkko@kernel.org, linux-kernel@vger.kernel.org,
haitao.huang@linux.intel.com, dave.hansen@linux.intel.com
Cc: stable@vger.kernel.org, Vij, Mona, Qin, Kailun
[...]
>
> Additionally fix a similar race: user space converts a normal enclave
> page to a TCS page (via SGX_IOC_ENCLAVE_MODIFY_TYPES) on CPU1, and at
> the same time, user space performs a memory access on the same page on
> CPU2. This fix is not strictly necessary (this particular race would
> indicate a bug in a user space application), but it gives a consistent
> rule: if an enclave page is under certain operation by the kernel with
> the mapping removed, then other threads trying to access that page are
> temporarily blocked and should retry.
>
> Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
This also needs another Fixes tag:
Fixes: 45d546b8c109d ("x86/sgx: Support modifying SGX page type")
.. since this patch fixes two functions introduced by two patches.
Maybe it's better to split this into two patches with each have one Fixes tag,
but I think it should also be fine to have two Fixes tags in this one patch
(I've seen that before), because all SGX2 patches were merged together in the
same kernel release.
Reviewed-by: Kai Huang <kai.huang@intel.com>
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race
2024-08-21 10:02 ` [PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2024-08-21 11:11 ` Huang, Kai
@ 2024-08-27 18:20 ` Jarkko Sakkinen
1 sibling, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2024-08-27 18:20 UTC (permalink / raw)
To: Dmitrii Kuvaiskii, dave.hansen, kai.huang, haitao.huang,
reinette.chatre, linux-sgx, linux-kernel
Cc: mona.vij, kailun.qin, stable
On Wed Aug 21, 2024 at 1:02 PM EEST, Dmitrii Kuvaiskii wrote:
> Two enclave threads may try to add and remove the same enclave page
> simultaneously (e.g., if the SGX runtime supports both lazy allocation
> and MADV_DONTNEED semantics). Consider some enclave page added to the
> enclave. User space decides to temporarily remove this page (e.g.,
> emulating the MADV_DONTNEED semantics) on CPU1. At the same time, user
> space performs a memory access on the same page on CPU2, which results
> in a #PF and ultimately in sgx_vma_fault(). Scenario proceeds as
> follows:
>
> /*
> * CPU1: User space performs
> * ioctl(SGX_IOC_ENCLAVE_REMOVE_PAGES)
> * on enclave page X
> */
> sgx_encl_remove_pages() {
>
> mutex_lock(&encl->lock);
>
> entry = sgx_encl_load_page(encl);
> /*
> * verify that page is
> * trimmed and accepted
> */
>
> mutex_unlock(&encl->lock);
>
> /*
> * remove PTE entry; cannot
> * be performed under lock
> */
> sgx_zap_enclave_ptes(encl);
> /*
> * Fault on CPU2 on same page X
> */
> sgx_vma_fault() {
> /*
> * PTE entry was removed, but the
> * page is still in enclave's xarray
> */
> xa_load(&encl->page_array) != NULL ->
> /*
> * SGX driver thinks that this page
> * was swapped out and loads it
> */
> mutex_lock(&encl->lock);
> /*
> * this is effectively a no-op
> */
> entry = sgx_encl_load_page_in_vma();
> /*
> * add PTE entry
> *
> * *BUG*: a PTE is installed for a
> * page in process of being removed
> */
> vmf_insert_pfn(...);
>
> mutex_unlock(&encl->lock);
> return VM_FAULT_NOPAGE;
> }
> /*
> * continue with page removal
> */
> mutex_lock(&encl->lock);
>
> sgx_encl_free_epc_page(epc_page) {
> /*
> * remove page via EREMOVE
> */
> /*
> * free EPC page
> */
> sgx_free_epc_page(epc_page);
> }
>
> xa_erase(&encl->page_array);
>
> mutex_unlock(&encl->lock);
> }
>
> Here, CPU1 removed the page. However CPU2 installed the PTE entry on the
> same page. This enclave page becomes perpetually inaccessible (until
> another SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl). This is because the page is
> marked accessible in the PTE entry but is not EAUGed, and any subsequent
> access to this page raises a fault: with the kernel believing there to
> be a valid VMA, the unlikely error code X86_PF_SGX encountered by code
> path do_user_addr_fault() -> access_error() causes the SGX driver's
> sgx_vma_fault() to be skipped and user space receives a SIGSEGV instead.
> The userspace SIGSEGV handler cannot perform EACCEPT because the page
> was not EAUGed. Thus, the user space is stuck with the inaccessible
> page.
>
> Fix this race by forcing the fault handler on CPU2 to back off if the
> page is currently being removed (on CPU1). This is achieved by
> setting SGX_ENCL_PAGE_BUSY flag right-before the first mutex_unlock() in
> sgx_encl_remove_pages(). Upon loading the page, CPU2 checks whether this
> page is busy, and if yes then CPU2 backs off and waits until the page is
> completely removed. After that, any memory access to this page results
> in a normal "allocate and EAUG a page on #PF" flow.
>
> Additionally fix a similar race: user space converts a normal enclave
> page to a TCS page (via SGX_IOC_ENCLAVE_MODIFY_TYPES) on CPU1, and at
> the same time, user space performs a memory access on the same page on
> CPU2. This fix is not strictly necessary (this particular race would
> indicate a bug in a user space application), but it gives a consistent
> rule: if an enclave page is under certain operation by the kernel with
> the mapping removed, then other threads trying to access that page are
> temporarily blocked and should retry.
>
> Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
> ---
> arch/x86/kernel/cpu/sgx/encl.h | 3 ++-
> arch/x86/kernel/cpu/sgx/ioctl.c | 17 +++++++++++++++++
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index b566b8ad5f33..96b11e8fb770 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -22,7 +22,8 @@
> /* 'desc' bits holding the offset in the VA (version array) page. */
> #define SGX_ENCL_PAGE_VA_OFFSET_MASK GENMASK_ULL(11, 3)
>
> -/* 'desc' bit indicating that the page is busy (being reclaimed). */
> +/* 'desc' bit indicating that the page is busy (being reclaimed, removed or
> + * converted to a TCS page). */
> #define SGX_ENCL_PAGE_BUSY BIT(2)
>
> /*
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5d390df21440..ee619f2b3414 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -969,12 +969,22 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl,
> /*
> * Do not keep encl->lock because of dependency on
> * mmap_lock acquired in sgx_zap_enclave_ptes().
> + *
> + * Releasing encl->lock leads to a data race: while CPU1
> + * performs sgx_zap_enclave_ptes() and removes the PTE
> + * entry for the enclave page, CPU2 may attempt to load
> + * this page (because the page is still in enclave's
> + * xarray). To prevent CPU2 from loading the page, mark
> + * the page as busy before unlock and unmark after lock
> + * again.
> */
> + entry->desc |= SGX_ENCL_PAGE_BUSY;
> mutex_unlock(&encl->lock);
>
> sgx_zap_enclave_ptes(encl, addr);
>
> mutex_lock(&encl->lock);
> + entry->desc &= ~SGX_ENCL_PAGE_BUSY;
>
> sgx_mark_page_reclaimable(entry->epc_page);
> }
> @@ -1141,7 +1151,14 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
> /*
> * Do not keep encl->lock because of dependency on
> * mmap_lock acquired in sgx_zap_enclave_ptes().
> + *
> + * Releasing encl->lock leads to a data race: while CPU1
> + * performs sgx_zap_enclave_ptes() and removes the PTE entry
> + * for the enclave page, CPU2 may attempt to load this page
> + * (because the page is still in enclave's xarray). To prevent
> + * CPU2 from loading the page, mark the page as busy.
> */
> + entry->desc |= SGX_ENCL_PAGE_BUSY;
> mutex_unlock(&encl->lock);
>
> sgx_zap_enclave_ptes(encl, addr);
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
BR, Jarkko
^ permalink raw reply [flat|nested] 9+ messages in thread