public inbox for linux-sgx@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "Dmitrii Kuvaiskii" <dmitrii.kuvaiskii@intel.com>,
	<dave.hansen@linux.intel.com>, <kai.huang@intel.com>,
	<haitao.huang@linux.intel.com>, <reinette.chatre@intel.com>,
	<linux-sgx@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: <mona.vij@intel.com>, <kailun.qin@intel.com>, <stable@vger.kernel.org>
Subject: Re: [PATCH v4 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race
Date: Wed, 17 Jul 2024 13:38:59 +0300	[thread overview]
Message-ID: <D2RQZSM3MMVN.8DFKF3GGGTWE@kernel.org> (raw)
In-Reply-To: <20240705074524.443713-4-dmitrii.kuvaiskii@intel.com>

On Fri Jul 5, 2024 at 10:45 AM EEST, Dmitrii Kuvaiskii wrote:
> Two enclave threads may try to add and remove the same enclave page
> simultaneously (e.g., if the SGX runtime supports both lazy allocation
> and MADV_DONTNEED semantics). Consider some enclave page added to the
> enclave. User space decides to temporarily remove this page (e.g.,
> emulating the MADV_DONTNEED semantics) on CPU1. At the same time, user
> space performs a memory access on the same page on CPU2, which results
> in a #PF and ultimately in sgx_vma_fault(). Scenario proceeds as
> follows:
>
> /*
>  * CPU1: User space performs
>  * ioctl(SGX_IOC_ENCLAVE_REMOVE_PAGES)
>  * on enclave page X
>  */
> sgx_encl_remove_pages() {
>
>   mutex_lock(&encl->lock);
>
>   entry = sgx_encl_load_page(encl);
>   /*
>    * verify that page is
>    * trimmed and accepted
>    */
>
>   mutex_unlock(&encl->lock);
>
>   /*
>    * remove PTE entry; cannot
>    * be performed under lock
>    */
>   sgx_zap_enclave_ptes(encl);
>                                  /*
>                                   * Fault on CPU2 on same page X
>                                   */
>                                  sgx_vma_fault() {
>                                    /*
>                                     * PTE entry was removed, but the
>                                     * page is still in enclave's xarray
>                                     */
>                                    xa_load(&encl->page_array) != NULL ->
>                                    /*
>                                     * SGX driver thinks that this page
>                                     * was swapped out and loads it
>                                     */
>                                    mutex_lock(&encl->lock);
>                                    /*
>                                     * this is effectively a no-op
>                                     */
>                                    entry = sgx_encl_load_page_in_vma();
>                                    /*
>                                     * add PTE entry
>                                     *
>                                     * *BUG*: a PTE is installed for a
>                                     * page in process of being removed
>                                     */
>                                    vmf_insert_pfn(...);
>
>                                    mutex_unlock(&encl->lock);
>                                    return VM_FAULT_NOPAGE;
>                                  }
>   /*
>    * continue with page removal
>    */
>   mutex_lock(&encl->lock);
>
>   sgx_encl_free_epc_page(epc_page) {
>     /*
>      * remove page via EREMOVE
>      */
>     /*
>      * free EPC page
>      */
>     sgx_free_epc_page(epc_page);
>   }
>
>   xa_erase(&encl->page_array);
>
>   mutex_unlock(&encl->lock);
> }
>
> Here, CPU1 removed the page. However CPU2 installed the PTE entry on the
> same page. This enclave page becomes perpetually inaccessible (until
> another SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl). This is because the page is
> marked accessible in the PTE entry but is not EAUGed, and any subsequent
> access to this page raises a fault: with the kernel believing there to
> be a valid VMA, the unlikely error code X86_PF_SGX encountered by code
> path do_user_addr_fault() -> access_error() causes the SGX driver's
> sgx_vma_fault() to be skipped and user space receives a SIGSEGV instead.
> The userspace SIGSEGV handler cannot perform EACCEPT because the page
> was not EAUGed. Thus, the user space is stuck with the inaccessible
> page.
>
> Fix this race by forcing the fault handler on CPU2 to back off if the
> page is currently being removed (on CPU1). This is achieved by
> setting SGX_ENCL_PAGE_BUSY flag right-before the first mutex_unlock() in
> sgx_encl_remove_pages(). Upon loading the page, CPU2 checks whether this
> page is busy, and if yes then CPU2 backs off and waits until the page is
> completely removed. After that, any memory access to this page results
> in a normal "allocate and EAUG a page on #PF" flow.
>
> Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/ioctl.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5d390df21440..02441883401d 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -1141,7 +1141,14 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
>  		/*
>  		 * Do not keep encl->lock because of dependency on
>  		 * mmap_lock acquired in sgx_zap_enclave_ptes().
> +		 *
> +		 * Releasing encl->lock leads to a data race: while CPU1
> +		 * performs sgx_zap_enclave_ptes() and removes the PTE entry
> +		 * for the enclave page, CPU2 may attempt to load this page
> +		 * (because the page is still in enclave's xarray). To prevent
> +		 * CPU2 from loading the page, mark the page as busy.
>  		 */
> +		entry->desc |= SGX_ENCL_PAGE_BUSY;
>  		mutex_unlock(&encl->lock);
>  
>  		sgx_zap_enclave_ptes(encl, addr);

Ditto.

BR, Jarkko

  parent reply	other threads:[~2024-07-17 10:39 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-05  7:45 [PATCH v4 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
2024-07-05  7:45 ` [PATCH v4 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags Dmitrii Kuvaiskii
2024-07-10 15:15   ` Haitao Huang
2024-07-17 10:36   ` Jarkko Sakkinen
2024-08-12  8:12     ` Dmitrii Kuvaiskii
2024-08-15 18:29       ` Jarkko Sakkinen
2024-07-17 10:37   ` Jarkko Sakkinen
2024-08-12  8:16     ` Dmitrii Kuvaiskii
2024-08-15 18:30       ` Jarkko Sakkinen
2024-07-25  2:00   ` Huang, Kai
2024-07-05  7:45 ` [PATCH v4 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
2024-07-17 10:38   ` Jarkko Sakkinen
2024-08-12  8:21     ` Dmitrii Kuvaiskii
2024-08-15 18:31       ` Jarkko Sakkinen
2024-07-25  0:52   ` Huang, Kai
2024-07-05  7:45 ` [PATCH v4 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2024-07-10 15:16   ` Haitao Huang
2024-07-17 10:38   ` Jarkko Sakkinen [this message]
2024-08-12  8:25     ` Dmitrii Kuvaiskii
2024-08-15 18:34       ` Jarkko Sakkinen
2024-08-15 18:37         ` Jarkko Sakkinen
2024-07-25  1:21   ` Huang, Kai
2024-08-09  9:35     ` Dmitrii Kuvaiskii
2024-08-09 11:19       ` Huang, Kai
2024-08-12  8:32         ` Dmitrii Kuvaiskii
2024-08-12 10:34           ` Huang, Kai
2024-07-05  9:03 ` [PATCH v4 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows Jarkko Sakkinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D2RQZSM3MMVN.8DFKF3GGGTWE@kernel.org \
    --to=jarkko@kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmitrii.kuvaiskii@intel.com \
    --cc=haitao.huang@linux.intel.com \
    --cc=kai.huang@intel.com \
    --cc=kailun.qin@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mona.vij@intel.com \
    --cc=reinette.chatre@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox