public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows
@ 2024-04-29 10:43 Dmitrii Kuvaiskii
  2024-04-29 10:43 ` [PATCH 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Dmitrii Kuvaiskii @ 2024-04-29 10:43 UTC
  To: dave.hansen, jarkko, kai.huang, haitao.huang, reinette.chatre,
	linux-sgx, linux-kernel
  Cc: mona.vij, kailun.qin

SGX runtimes such as Gramine may implement EDMM-based lazy allocation of
enclave pages and may support MADV_DONTNEED semantics [1]. The former
implies #PF-based page allocation, and the latter implies the usage of
SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl.
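
For readers unfamiliar with the semantics being emulated, here is a minimal
sketch of how lazy allocation and MADV_DONTNEED behave for a private
anonymous mapping on plain Linux, outside of SGX. It only illustrates the
user-visible contract (first touch faults a page in, MADV_DONTNEED discards
it, the next touch sees a zero-filled page); it does not exercise the SGX
driver. The function name dontneed_zeroes_page() is hypothetical and not
part of Gramine or the kernel.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a private anonymous page reads back as zero after
 * madvise(MADV_DONTNEED), 0 if it does not, and -1 on a setup error.
 * Illustration only: plain Linux memory, not enclave memory. */
static int dontneed_zeroes_page(void) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    char *addr = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    /* First touch: the page fault handler allocates the page lazily. */
    addr[0] = 42;

    /* Discard the page; the mapping itself stays valid. */
    if (madvise(addr, (size_t)page_size, MADV_DONTNEED) != 0) {
        munmap(addr, (size_t)page_size);
        return -1;
    }

    /* Next touch faults in a fresh zero-filled page. */
    int zeroed = (*(volatile char *)addr == 0);
    munmap(addr, (size_t)page_size);
    return zeroed;
}
```

Under an SGX runtime with EDMM, the same contract is what maps onto the
#PF-based allocation and SGX_IOC_ENCLAVE_REMOVE_PAGES flows described above.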

A trivial program like the one below (run under Gramine with EDMM
enabled) stresses these two flows in the SGX driver and hangs:

/* repeatedly touch different enclave pages at random and mix with
 * `madvise(MADV_DONTNEED)` to stress EAUG/EREMOVE flows */
static void* thread_func(void* arg) {
    size_t num_pages = 0xA000 / page_size;
    for (int i = 0; i < 5000; i++) {
        /* touch a random page; may trigger #PF-based allocation */
        size_t page = get_random_ulong() % num_pages;
        char data = READ_ONCE(((char*)arg)[page * page_size]);
        (void)data;

        /* discard another random page */
        page = get_random_ulong() % num_pages;
        madvise((char*)arg + page * page_size, page_size, MADV_DONTNEED);
    }
    return NULL;
}

addr = mmap(NULL, 0xA000, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
pthread_t threads[16];
for (int i = 0; i < 16; i++)
    pthread_create(&threads[i], NULL, thread_func, addr);

This program uncovers two data races in the SGX driver. The two patches
in this series describe and fix these races.

I performed several stress tests to verify that there are no other data
races (at least with the test program above):

- On an Icelake server with 128GB of PRMRR (EPC), without madvise(). This
  stresses the first data race. A Gramine SGX test suite ran in the
  background for additional stress. Result: 1,000 runs without hangs
  (without the first bug fix: hangs every time).
- On an Icelake server with 128GB of PRMRR (EPC), with madvise(). This
  stresses the second data race. A Gramine SGX test suite ran in the
  background for additional stress. Result: 1,000 runs without hangs
  (with the first bug fix but without the second: hangs approx. once in
  50 runs).
- On an Icelake server with 4GB of PRMRR (EPC), with madvise(). This
  additionally stresses the enclave page swapping flows. Two Gramine SGX
  test suites ran in the background to put additional stress on swapping
  (I observed 100% CPU utilization from ksgxd, which confirms that
  swapping happened). Result: 1,000 runs without hangs.
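
Counting hangs over many runs, as in the results above, can be automated
with a small harness. The following is only a sketch of one way to do it,
not the tooling actually used for these tests: it runs a command repeatedly,
kills it if it exceeds a timeout, and counts each kill as a hang. The names
run_with_timeout() and count_hangs(), and the choice of timeout, are
hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Runs the command in argv (NULL-terminated). Returns 0 if the child
 * terminated on its own within timeout_sec, 1 if it had to be killed
 * (treated as a hang), and -1 on error. The child's exit status is not
 * inspected, so a failed exec also counts as "terminated". */
static int run_with_timeout(char *const argv[], int timeout_sec) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    /* Poll every 100 ms until the child terminates or time runs out. */
    const struct timespec tick = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };
    for (int ticks = 0; ticks < timeout_sec * 10; ticks++) {
        int status;
        pid_t ret = waitpid(pid, &status, WNOHANG);
        if (ret == pid)
            return 0;
        if (ret < 0)
            return -1;
        nanosleep(&tick, NULL);
    }

    /* Timed out: kill the child and reap it. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return 1;
}

/* Repeats the command `runs` times. Returns the number of hangs,
 * or -1 if any run could not be started or waited for. */
static int count_hangs(char *const argv[], int runs, int timeout_sec) {
    int hangs = 0;
    for (int i = 0; i < runs; i++) {
        int ret = run_with_timeout(argv, timeout_sec);
        if (ret < 0)
            return -1;
        hangs += ret;
    }
    return hangs;
}
```

A caller would pass the command line that launches the test program under
the SGX runtime, with runs set to 1000 and a timeout comfortably above the
normal run time of the test.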

(Sorry for the previous copy of this email, accidentally sent to
stable@vger.kernel.org. Failed to use `--suppress-cc` during a test send.)

Dmitrii Kuvaiskii (2):
  x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
  x86/sgx: Resolve EREMOVE page vs EAUG page data race

 arch/x86/kernel/cpu/sgx/encl.c  | 10 +++++++---
 arch/x86/kernel/cpu/sgx/encl.h  |  3 +++
 arch/x86/kernel/cpu/sgx/ioctl.c |  1 +
 3 files changed, 11 insertions(+), 3 deletions(-)

-- 
2.34.1



end of thread, other threads:[~2024-05-10 23:47 UTC | newest]

Thread overview: 13+ messages
2024-04-29 10:43 [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows Dmitrii Kuvaiskii
2024-04-29 10:43 ` [PATCH 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS Dmitrii Kuvaiskii
2024-04-29 13:04   ` Jarkko Sakkinen
2024-04-29 13:22     ` Jarkko Sakkinen
2024-04-29 13:24       ` Jarkko Sakkinen
2024-04-30 14:37     ` Dmitrii Kuvaiskii
2024-05-10 23:47       ` Reinette Chatre
2024-04-29 10:43 ` [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race Dmitrii Kuvaiskii
2024-04-29 13:11   ` Jarkko Sakkinen
2024-04-30 14:38     ` Dmitrii Kuvaiskii
2024-05-10 23:47       ` Reinette Chatre
2024-04-29 13:06 ` [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows Jarkko Sakkinen
2024-04-30 14:35   ` Dmitrii Kuvaiskii
