From: Jarkko Sakkinen <jarkko@kernel.org>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: dave.hansen@linux.intel.com, tglx@linutronix.de, bp@alien8.de,
luto@kernel.org, mingo@redhat.com, linux-sgx@vger.kernel.org,
x86@kernel.org, haitao.huang@intel.com, hpa@zytor.com,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH V3 4/5] x86/sgx: Fix race between reclaimer and page fault handler
Date: Fri, 13 May 2022 17:40:14 +0300 [thread overview]
Message-ID: <Yn5tznK08SeWxd4S@iki.fi> (raw)
In-Reply-To: <ed20a5db516aa813873268e125680041ae11dfcf.1652389823.git.reinette.chatre@intel.com>
On Thu, May 12, 2022 at 02:51:00PM -0700, Reinette Chatre wrote:
> Haitao reported encountering a WARN triggered by the ENCLS[ELDU]
> instruction faulting with a #GP.
>
> The WARN is encountered when the reclaimer evicts a range of
> pages from the enclave when the same pages are faulted back right away.
>
> Consider two enclave pages (ENCLAVE_A and ENCLAVE_B)
> sharing a PCMD page (PCMD_AB). ENCLAVE_A is in the
> enclave memory and ENCLAVE_B is in the backing store. PCMD_AB contains
> just one entry, that of ENCLAVE_B.
>
> Scenario proceeds where ENCLAVE_A is being evicted from the enclave
> while ENCLAVE_B is faulted in.
>
> sgx_reclaim_pages() {
>
> ...
>
> /*
> * Reclaim ENCLAVE_A
> */
> mutex_lock(&encl->lock);
> /*
> * Get a reference to ENCLAVE_A's
> * shmem page where enclave page
> * encrypted data will be stored
> * as well as a reference to the
> * enclave page's PCMD data page,
> * PCMD_AB.
> * Release mutex before writing
> * any data to the shmem pages.
> */
> sgx_encl_get_backing(...);
> encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
> mutex_unlock(&encl->lock);
>
> /*
> * Fault ENCLAVE_B
> */
>
> sgx_vma_fault() {
>
> mutex_lock(&encl->lock);
> /*
> * Get reference to
> * ENCLAVE_B's shmem page
> * as well as PCMD_AB.
> */
> sgx_encl_get_backing(...)
> /*
> * Load page back into
> * enclave via ELDU.
> */
> /*
> * Release reference to
> * ENCLAVE_B' shmem page and
> * PCMD_AB.
> */
> sgx_encl_put_backing(...);
> /*
> * PCMD_AB is found empty so
> * it and ENCLAVE_B's shmem page
> * are truncated.
> */
> /* Truncate ENCLAVE_B backing page */
> sgx_encl_truncate_backing_page();
> /* Truncate PCMD_AB */
> sgx_encl_truncate_backing_page();
>
> mutex_unlock(&encl->lock);
>
> ...
> }
> mutex_lock(&encl->lock);
> encl_page->desc &=
> ~SGX_ENCL_PAGE_BEING_RECLAIMED;
> /*
> * Write encrypted contents of
> * ENCLAVE_A to ENCLAVE_A shmem
> * page and its PCMD data to
> * PCMD_AB.
> */
> sgx_encl_put_backing(...)
>
> /*
> * Reference to PCMD_AB is
> * dropped and it is truncated.
> * ENCLAVE_A's PCMD data is lost.
> */
> mutex_unlock(&encl->lock);
> }
>
> What happens next depends on whether it is ENCLAVE_A being faulted
> in or ENCLAVE_B being evicted - but both end up with ENCLS[ELDU] faulting
> with a #GP.
>
> If ENCLAVE_A is faulted then at the time sgx_encl_get_backing() is called
> a new PCMD page is allocated and providing the empty PCMD data for
> ENCLAVE_A would cause ENCLS[ELDU] to #GP
>
> If ENCLAVE_B is evicted first then a new PCMD_AB would be allocated by the
> reclaimer but later when ENCLAVE_A is faulted the ENCLS[ELDU] instruction
> would #GP during its checks of the PCMD value and the WARN would be
> encountered.
>
> Noting that the reclaimer sets SGX_ENCL_PAGE_BEING_RECLAIMED at the time
> it obtains a reference to the backing store pages of an enclave page it
> is in the process of reclaiming, fix the race by only truncating the PCMD
> page after ensuring that no page sharing the PCMD page is in the process
> of being reclaimed.
>
> Cc: stable@vger.kernel.org
> Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
> Reported-by: Haitao Huang <haitao.huang@intel.com>
> Tested-by: Haitao Huang <haitao.huang@intel.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Declare "addr" and "entry" within loop. (Dave)
> - Fix incorrect error return when enclave page not found to belong
> to enclave. Continue search instead of returning failure. (Dave)
> - Add Haitao's "Tested-by" tag.
> - Rename pcmd_page_in_use() to reclaimer_writing_to_pcmd() to be less
> generic. (Dave)
> - Improve function comments to be clear about it testing for PCMD
> page soon becoming non-empty, also add context info to kernel-doc
> to indicate that enclave mutex must be held. (Dave)
>
> Changes since RFC v1:
> - New patch.
>
> arch/x86/kernel/cpu/sgx/encl.c | 94 +++++++++++++++++++++++++++++++++-
> 1 file changed, 93 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 5104a428b72c..243f3bd78145 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -12,6 +12,92 @@
> #include "encls.h"
> #include "sgx.h"
>
> +#define PCMDS_PER_PAGE (PAGE_SIZE / sizeof(struct sgx_pcmd))
> +/*
> + * 32 PCMD entries share a PCMD page. PCMD_FIRST_MASK is used to
> + * determine the page index associated with the first PCMD entry
> + * within a PCMD page.
> + */
> +#define PCMD_FIRST_MASK GENMASK(4, 0)
> +
> +/**
> + * reclaimer_writing_to_pcmd() - Query if any enclave page associated with
> + * a PCMD page is in process of being reclaimed.
> + * @encl: Enclave to which PCMD page belongs
> + * @start_addr: Address of enclave page using first entry within the PCMD page
> + *
> + * When an enclave page is reclaimed some Paging Crypto MetaData (PCMD) is
> + * stored. The PCMD data of a reclaimed enclave page contains enough
> + * information for the processor to verify the page at the time
> + * it is loaded back into the Enclave Page Cache (EPC).
> + *
> + * The backing storage to which enclave pages are reclaimed is laid out as
> + * follows:
> + * Encrypted enclave pages:SECS page:PCMD pages
> + *
> + * Each PCMD page contains the PCMD metadata of
> + * PAGE_SIZE/sizeof(struct sgx_pcmd) enclave pages.
> + *
> + * A PCMD page can only be truncated if it is (a) empty, and (b) not in the
> + * process of getting data (and thus soon being non-empty). (b) is tested with
> + * a check if an enclave page sharing the PCMD page is in the process of being
> + * reclaimed.
> + *
> + * The reclaimer sets the SGX_ENCL_PAGE_BEING_RECLAIMED flag when it
> + * intends to reclaim that enclave page - it means that the PCMD page
> + * associated with that enclave page is about to get some data and thus
> + * even if the PCMD page is empty, it should not be truncated.
> + *
> + * Context: Enclave mutex (&sgx_encl->lock) must be held.
> + * Return: 1 if the reclaimer is about to write to the PCMD page
> + * 0 if the reclaimer has no intention to write to the PCMD page
> + */
> +static int reclaimer_writing_to_pcmd(struct sgx_encl *encl,
> + unsigned long start_addr)
> +{
> + int reclaimed = 0;
> + int i;
> +
> + /*
> + * PCMD_FIRST_MASK is based on number of PCMD entries within
> + * PCMD page being 32.
> + */
> + BUILD_BUG_ON(PCMDS_PER_PAGE != 32);
> +
> + for (i = 0; i < PCMDS_PER_PAGE; i++) {
> + struct sgx_encl_page *entry;
> + unsigned long addr;
> +
> + addr = start_addr + i * PAGE_SIZE;
> +
> + /*
> + * Stop when reaching the SECS page - it does not
> + * have a page_array entry and its reclaim is
> + * started and completed with enclave mutex held so
> + * it does not use the SGX_ENCL_PAGE_BEING_RECLAIMED
> + * flag.
> + */
> + if (addr == encl->base + encl->size)
> + break;
> +
> + entry = xa_load(&encl->page_array, PFN_DOWN(addr));
> + if (!entry)
> + continue;
> +
> + /*
> + * VA page slot ID uses same bit as the flag so it is important
> + * to ensure that the page is not already in backing store.
> + */
> + if (entry->epc_page &&
> + (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)) {
> + reclaimed = 1;
> + break;
> + }
> + }
> +
> + return reclaimed;
> +}
> +
> /*
> * Calculate byte offset of a PCMD struct associated with an enclave page. PCMD's
> * follow right after the EPC data in the backing storage. In addition to the
> @@ -47,6 +133,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
> struct sgx_encl *encl = encl_page->encl;
> pgoff_t page_index, page_pcmd_off;
> + unsigned long pcmd_first_page;
> struct sgx_pageinfo pginfo;
> struct sgx_backing b;
> bool pcmd_page_empty;
> @@ -58,6 +145,11 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> else
> page_index = PFN_DOWN(encl->size);
>
> + /*
> + * Address of enclave page using the first entry within the PCMD page.
> + */
> + pcmd_first_page = PFN_PHYS(page_index & ~PCMD_FIRST_MASK) + encl->base;
> +
> page_pcmd_off = sgx_encl_get_backing_page_pcmd_offset(encl, page_index);
>
> ret = sgx_encl_get_backing(encl, page_index, &b);
> @@ -99,7 +191,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>
> sgx_encl_truncate_backing_page(encl, page_index);
>
> - if (pcmd_page_empty)
> + if (pcmd_page_empty && !reclaimer_writing_to_pcmd(encl, pcmd_first_page))
> sgx_encl_truncate_backing_page(encl, PFN_DOWN(page_pcmd_off));
>
> return ret;
> --
> 2.25.1
>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
BR, Jarkko
next prev parent reply other threads:[~2022-05-13 14:44 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-12 21:50 [PATCH V3 0/5] SGX shmem backing store issue Reinette Chatre
2022-05-12 21:50 ` [PATCH V3 1/5] x86/sgx: Disconnect backing page references from dirty status Reinette Chatre
2022-05-13 14:39 ` Jarkko Sakkinen
2022-05-12 21:50 ` [PATCH V3 2/5] x86/sgx: Mark PCMD page as dirty when modifying contents Reinette Chatre
2022-05-12 21:50 ` [PATCH V3 3/5] x86/sgx: Obtain backing storage page with enclave mutex held Reinette Chatre
2022-05-13 14:39 ` Jarkko Sakkinen
2022-05-12 21:51 ` [PATCH V3 4/5] x86/sgx: Fix race between reclaimer and page fault handler Reinette Chatre
2022-05-13 14:40 ` Jarkko Sakkinen [this message]
2022-05-12 21:51 ` [PATCH V3 5/5] x86/sgx: Ensure no data in PCMD page after truncate Reinette Chatre
2022-05-13 14:39 ` Jarkko Sakkinen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yn5tznK08SeWxd4S@iki.fi \
--to=jarkko@kernel.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=haitao.huang@intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-sgx@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=reinette.chatre@intel.com \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox