From: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: linux-sgx@vger.kernel.org, Haitao Huang <haitao.huang@linux.intel.com>
Subject: Re: [PATCH v2] x86/sgx: Fix deadlock and race conditions between fork() and EPC reclaim
Date: Fri, 10 Apr 2020 16:22:45 +0300 [thread overview]
Message-ID: <20200410132245.GA49170@linux.intel.com> (raw)
In-Reply-To: <20200409191353.GB8919@linux.intel.com>
On Thu, Apr 09, 2020 at 12:13:53PM -0700, Sean Christopherson wrote:
> On Mon, Apr 06, 2020 at 08:15:56PM +0300, Jarkko Sakkinen wrote:
> > On Mon, Apr 06, 2020 at 08:10:29PM +0300, Jarkko Sakkinen wrote:
> > > On Mon, Apr 06, 2020 at 07:36:38AM -0700, Sean Christopherson wrote:
> > > > On Sat, Apr 04, 2020 at 04:12:02AM +0300, Jarkko Sakkinen wrote:
> > > > > On Fri, Apr 03, 2020 at 04:42:39PM -0700, Sean Christopherson wrote:
> > > > > > On Fri, Apr 03, 2020 at 12:35:50PM +0300, Jarkko Sakkinen wrote:
> > > > > > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > > > > > @@ -221,12 +224,16 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
> > > > > > > return ret;
> > > > > > > }
> > > > > > >
> > > > > > > + /*
> > > > > > > + * The page reclaimer uses list version for synchronization instead of
> > > > > > > + * synchronize_scru() because otherwise we could conflict with
> > > > > > > + * dup_mmap().
> > > > > > > + */
> > > > > > > spin_lock(&encl->mm_lock);
> > > > > > > list_add_rcu(&encl_mm->list, &encl->mm_list);
> > > > > >
> > > > > > You dropped the smp_wmb().
> > > > >
> > > > > As I said to you in my review x86 pipeline does not reorder writes.
> > > >
> > > > And as I pointed out in this thread, smp_wmb() is a _compiler_ barrier if
> > > > and only if CONFIG_SMP=y. The compiler can reorder list_add_rcu() and
> > > > mm_list_version++ because from it's perspective there is no dependency
> > > > between the two. And that's entirely true except for the SMP case where
> > > > the consumer of mm_list_version is relying on the list to be updated before
> > > > the version changes.
> > >
> > > I see.
> > >
> > > So why not change the variable volatile given that x86 is the only
> > > arch that this code gets used?
> >
> > Please note that I'm fully aware of
> >
> > https://www.kernel.org/doc/html/latest/process/volatile-considered-harmful.html
> >
> > Just wondering. Anyway, I'll add smp_wmb() back since it is safe play
> > in terms of acceptance.
>
> Because volatile is overkill and too heavy-handed for what is needed here.
> E.g. if SMP=n, then there are no barriers needed whatsover, as a CPU can't
> race with itself.
>
> Even for SMP=y, volatile is too much as it prevents optimization that would
> otherwise be allowed. For the smp_wmb() it doesn't matter much, if at all,
> since the list and version are updated inside of a critical section, i.e.
> barriered on both sides so the compiler can't do much optimization anyways.
>
> But for the smp_rmb() case, the compiler is free to cache the version in a
> register early on in the function, it just needs to make sure that the
> access is done before starting the list walk. If encl->mm_list_version
> were volatile, the compiler would not be allowed to do such an optimization
> as it'd be required to access memory exactly when it's referenced in code.
>
> This is easily visible in the compiled code, as encl->mm_list_version is
> only read from memory once per iteration (in the unlikely case that there
> are multiple iterations). It takes the do-while loop, which appears to
> read encl->mm_list_version twice per iteration:
>
> do {
> mm_list_version = encl->mm_list_version;
>
> <walk list>
>
> } while (unlikely(encl->mm_list_version != mm_list_version));
>
>
> And turns it into an optimized loop that loads encl->mm_list_version the
> minimum number of times. If encl->mm_list_version list were volatile, the
> compiler would not be allowed to cache it in %rax.
>
> mov 0x58(%r12),%rax // Load from encl->mm_list_version
> lea 0x40(%r12),%rbx // Interleaved to optimize ALU vs LD
> and $0xfffffffffffff000,%rbp // Interleaved to optimize ALU vs LD
> mov %rax,0x8(%rsp) // mm_list_version = encl->mm_list_version
>
>
> walk_mm_list:
> <blah blah blah>
>
> mov 0x58(%r12),%rax // Load from encl->mm_list_version
> cmp 0x8(%rsp),%rax // Compare against mm_list_version
> jne update_mm_list_version
>
> <happy path>
> ret
>
> update_mm_list_version:
> mov %rax,0x8(%rsp) // mm_list_version = encl->mm_list_version
> jmpq 0xffffffff8102e460 <walk_mm_list>
Thanks and check v4 so that I can squash it :-)
/Jarkko
prev parent reply other threads:[~2020-04-10 13:22 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-03 9:35 [PATCH v2] x86/sgx: Fix deadlock and race conditions between fork() and EPC reclaim Jarkko Sakkinen
2020-04-03 23:33 ` Jarkko Sakkinen
2020-04-03 23:35 ` Sean Christopherson
2020-04-04 1:12 ` Jarkko Sakkinen
2020-04-06 14:31 ` Sean Christopherson
2020-04-06 14:44 ` Sean Christopherson
2020-04-03 23:42 ` Sean Christopherson
2020-04-04 1:12 ` Jarkko Sakkinen
2020-04-06 14:36 ` Sean Christopherson
2020-04-06 17:10 ` Jarkko Sakkinen
2020-04-06 17:15 ` Jarkko Sakkinen
2020-04-09 19:13 ` Sean Christopherson
2020-04-10 13:22 ` Jarkko Sakkinen [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200410132245.GA49170@linux.intel.com \
--to=jarkko.sakkinen@linux.intel.com \
--cc=haitao.huang@linux.intel.com \
--cc=linux-sgx@vger.kernel.org \
--cc=sean.j.christopherson@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).