All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "Dmitrii Kuvaiskii" <dmitrii.kuvaiskii@intel.com>,
	<dave.hansen@linux.intel.com>, <kai.huang@intel.com>,
	<haitao.huang@linux.intel.com>, <reinette.chatre@intel.com>,
	<linux-sgx@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: <mona.vij@intel.com>, <kailun.qin@intel.com>
Subject: Re: [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows
Date: Mon, 29 Apr 2024 16:06:39 +0300	[thread overview]
Message-ID: <D0WMNTCRUN00.TQHC8O6X6WI2@kernel.org> (raw)
In-Reply-To: <20240429104330.3636113-1-dmitrii.kuvaiskii@intel.com>

On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote:
> SGX runtimes such as Gramine may implement EDMM-based lazy allocation of
> enclave pages and may support MADV_DONTNEED semantics [1]. The former
> implies #PF-based page allocation, and the latter implies the usage of
> SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl.
>
> A trivial program like below (run under Gramine and with EDMM enabled)
> stresses these two flows in the SGX driver and hangs:
>
> /* repeatedly touch different enclave pages at random and mix with
>  * `madvise(MADV_DONTNEED)` to stress EAUG/EREMOVE flows */
> static void* thread_func(void* arg) {
>     size_t num_pages = 0xA000 / page_size;
>     for (int i = 0; i < 5000; i++) {
>         size_t page = get_random_ulong() % num_pages;
>         char data = READ_ONCE(((char*)arg)[page * page_size]);
>
>         page = get_random_ulong() % num_pages;
>         madvise(arg + page * page_size, page_size, MADV_DONTNEED);
>     }
> }
>
> addr = mmap(NULL, 0xA000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS, -1, 0);
> pthread_t threads[16];
> for (int i = 0; i < 16; i++)
>     pthread_create(&threads[i], NULL, thread_func, addr);

I'm not convinced that kernel is the problem here but it could be also
how Gramine is implemented.

So maybe you could make a better case of that. The example looks a bit
artificial to me.

>
> This program uncovers two data races in the SGX driver. The remaining
> patches describe and fix these races.
>
> I performed several stress tests to verify that there are no other data
> races (at least with the test program above):
>
> - On Icelake server with 128GB of PRMRR (EPC), without madvise(). This
>   stresses the first data race. A Gramine SGX test suite running in the
>   background for additional stressing. Result: 1,000 runs without hangs
>   (result without the first bug fix: hangs every time).
> - On Icelake server with 128GB of PRMRR (EPC), with madvise(). This
>   stresses the second data race. A Gramine SGX test suite running in the
>   background for additional stressing. Result: 1,000 runs without hangs
>   (result with the first bug fix but without the second bug fix: hangs
>   approx. once in 50 runs).
> - On Icelake server with 4GB of PRMRR (EPC), with madvise(). This
>   additionally stresses the enclave page swapping flows. Two Gramine SGX
>   test suites running in the background for additional stressing of
>   swapping (I observe 100% CPU utilization from ksgxd which confirms that
>   swapping happens). Result: 1,000 runs without hangs.
>
> (Sorry for the previous copy of this email, accidentally sent to
> stable@vger.kernel.org. Failed to use `--suppress-cc` during a test send.)
>
> Dmitrii Kuvaiskii (2):
>   x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
>   x86/sgx: Resolve EREMOVE page vs EAUG page data race
>
>  arch/x86/kernel/cpu/sgx/encl.c  | 10 +++++++---
>  arch/x86/kernel/cpu/sgx/encl.h  |  3 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c |  1 +
>  3 files changed, 11 insertions(+), 3 deletions(-)

BR, Jarkko

  parent reply	other threads:[~2024-04-29 13:06 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-29 10:43 [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
2024-04-29 10:43 ` [PATCH 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
2024-04-29 13:04   ` Jarkko Sakkinen
2024-04-29 13:22     ` Jarkko Sakkinen
2024-04-29 13:24       ` Jarkko Sakkinen
2024-04-30 14:37     ` Dmitrii Kuvaiskii
2024-05-10 23:47       ` Reinette Chatre
2024-04-29 10:43 ` [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2024-04-29 13:11   ` Jarkko Sakkinen
2024-04-30 14:38     ` Dmitrii Kuvaiskii
2024-05-10 23:47       ` Reinette Chatre
2024-04-29 13:06 ` Jarkko Sakkinen [this message]
2024-04-30 14:35   ` [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D0WMNTCRUN00.TQHC8O6X6WI2@kernel.org \
    --to=jarkko@kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmitrii.kuvaiskii@intel.com \
    --cc=haitao.huang@linux.intel.com \
    --cc=kai.huang@intel.com \
    --cc=kailun.qin@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mona.vij@intel.com \
    --cc=reinette.chatre@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.