From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Nicholas Piggin <npiggin@gmail.com>, paulmck <paulmck@kernel.org>,
Alan Stern <stern@rowland.harvard.edu>
Cc: Anton Blanchard <anton@ozlabs.org>, Arnd Bergmann <arnd@arndb.de>,
linux-arch <linux-arch@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>, x86 <x86@kernel.org>
Subject: Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
Date: Thu, 16 Jul 2020 14:58:41 -0400 (EDT) [thread overview]
Message-ID: <595582123.17106.1594925921537.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <1370747990.15974.1594915396143.JavaMail.zimbra@efficios.com>
----- On Jul 16, 2020, at 12:03 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
> ----- On Jul 16, 2020, at 11:46 AM, Mathieu Desnoyers
> mathieu.desnoyers@efficios.com wrote:
>
>> ----- On Jul 16, 2020, at 12:42 AM, Nicholas Piggin npiggin@gmail.com wrote:
>>> I should be more complete here, especially since I was complaining
>>> about unclear barrier comment :)
>>>
>>>
>>> CPU0 CPU1
>>> a. user stuff 1. user stuff
>>> b. membarrier() 2. enter kernel
>>> c. smp_mb() 3. smp_mb__after_spinlock(); // in __schedule
>>> d. read rq->curr 4. rq->curr switched to kthread
>>> e. is kthread, skip IPI 5. switch_to kthread
>>> f. return to user 6. rq->curr switched to user thread
>>> g. user stuff 7. switch_to user thread
>>> 8. exit kernel
>>> 9. more user stuff
>>>
>>> What you're really ordering is a, g vs 1, 9 right?
>>>
>>> In other words, 9 must see a if it sees g, g must see 1 if it saw 9,
>>> etc.
>>>
>>> Userspace does not care where the barriers are exactly or what kernel
>>> memory accesses might be being ordered by them, so long as there is a
>>> mb somewhere between a and g, and 1 and 9. Right?
>>
>> This is correct.
>
> Actually, sorry, the above is not quite right. It's been a while
> since I looked into the details of membarrier.
>
> The smp_mb() at the beginning of membarrier() needs to be paired with a
> smp_mb() _after_ rq->curr is switched back to the user thread, so the
> memory barrier is between store to rq->curr and following user-space
> accesses.
>
> The smp_mb() at the end of membarrier() needs to be paired with the
> smp_mb__after_spinlock() at the beginning of schedule, which is
> between accesses to userspace memory and switching rq->curr to kthread.
>
> As to *why* this ordering is needed, I'd have to dig through additional
> scenarios from https://lwn.net/Articles/573436/. Or maybe Paul remembers ?
Thinking further about this, I'm beginning to consider that maybe we have been
overly cautious by requiring memory barriers before and after store to rq->curr.
If CPU0 observes a CPU1's rq->curr->mm which differs from its own process (current)
while running the membarrier system call, it necessarily means that CPU1 had
to issue smp_mb__after_spinlock when entering the scheduler, between any user-space
loads/stores and update of rq->curr.
Requiring a memory barrier between update of rq->curr (back to current process's
thread) and following user-space memory accesses does not seem to guarantee
anything more than what the initial barrier at the beginning of __schedule already
provides, because the guarantees are only about accesses to user-space memory.
Therefore, with the memory barrier at the beginning of __schedule, just observing that
CPU1's rq->curr differs from current should guarantee that a memory barrier was issued
between any sequentially consistent instructions belonging to the current process on
CPU1.
Or am I missing/misremembering an important point here ?
Thanks,
Mathieu
>
> Thanks,
>
> Mathieu
>
>
>> Note that the accesses to user-space memory can be
>> done either by user-space code or kernel code, it doesn't matter.
>> However, in order to be considered as happening before/after
>> either membarrier or the matching compiler barrier, kernel code
>> needs to have causality relationship with user-space execution,
>> e.g. user-space does a system call, or returns from a system call.
>>
>> In the case of io_uring, submitting a request or returning from waiting
>> on request completion appear to provide this causality relationship.
>>
>> Thanks,
>>
>> Mathieu
>>
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
WARNING: multiple messages have this Message-ID (diff)
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Nicholas Piggin <npiggin@gmail.com>, paulmck <paulmck@kernel.org>,
Alan Stern <stern@rowland.harvard.edu>
Cc: linux-arch <linux-arch@vger.kernel.org>,
Arnd Bergmann <arnd@arndb.de>,
Peter Zijlstra <peterz@infradead.org>, x86 <x86@kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>, Andy Lutomirski <luto@kernel.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
Date: Thu, 16 Jul 2020 14:58:41 -0400 (EDT) [thread overview]
Message-ID: <595582123.17106.1594925921537.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <1370747990.15974.1594915396143.JavaMail.zimbra@efficios.com>
----- On Jul 16, 2020, at 12:03 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
> ----- On Jul 16, 2020, at 11:46 AM, Mathieu Desnoyers
> mathieu.desnoyers@efficios.com wrote:
>
>> ----- On Jul 16, 2020, at 12:42 AM, Nicholas Piggin npiggin@gmail.com wrote:
>>> I should be more complete here, especially since I was complaining
>>> about unclear barrier comment :)
>>>
>>>
>>> CPU0 CPU1
>>> a. user stuff 1. user stuff
>>> b. membarrier() 2. enter kernel
>>> c. smp_mb() 3. smp_mb__after_spinlock(); // in __schedule
>>> d. read rq->curr 4. rq->curr switched to kthread
>>> e. is kthread, skip IPI 5. switch_to kthread
>>> f. return to user 6. rq->curr switched to user thread
>>> g. user stuff 7. switch_to user thread
>>> 8. exit kernel
>>> 9. more user stuff
>>>
>>> What you're really ordering is a, g vs 1, 9 right?
>>>
>>> In other words, 9 must see a if it sees g, g must see 1 if it saw 9,
>>> etc.
>>>
>>> Userspace does not care where the barriers are exactly or what kernel
>>> memory accesses might be being ordered by them, so long as there is a
>>> mb somewhere between a and g, and 1 and 9. Right?
>>
>> This is correct.
>
> Actually, sorry, the above is not quite right. It's been a while
> since I looked into the details of membarrier.
>
> The smp_mb() at the beginning of membarrier() needs to be paired with a
> smp_mb() _after_ rq->curr is switched back to the user thread, so the
> memory barrier is between store to rq->curr and following user-space
> accesses.
>
> The smp_mb() at the end of membarrier() needs to be paired with the
> smp_mb__after_spinlock() at the beginning of schedule, which is
> between accesses to userspace memory and switching rq->curr to kthread.
>
> As to *why* this ordering is needed, I'd have to dig through additional
> scenarios from https://lwn.net/Articles/573436/. Or maybe Paul remembers ?
Thinking further about this, I'm beginning to consider that maybe we have been
overly cautious by requiring memory barriers before and after store to rq->curr.
If CPU0 observes a CPU1's rq->curr->mm which differs from its own process (current)
while running the membarrier system call, it necessarily means that CPU1 had
to issue smp_mb__after_spinlock when entering the scheduler, between any user-space
loads/stores and update of rq->curr.
Requiring a memory barrier between update of rq->curr (back to current process's
thread) and following user-space memory accesses does not seem to guarantee
anything more than what the initial barrier at the beginning of __schedule already
provides, because the guarantees are only about accesses to user-space memory.
Therefore, with the memory barrier at the beginning of __schedule, just observing that
CPU1's rq->curr differs from current should guarantee that a memory barrier was issued
between any sequentially consistent instructions belonging to the current process on
CPU1.
Or am I missing/misremembering an important point here ?
Thanks,
Mathieu
>
> Thanks,
>
> Mathieu
>
>
>> Note that the accesses to user-space memory can be
>> done either by user-space code or kernel code, it doesn't matter.
>> However, in order to be considered as happening before/after
>> either membarrier or the matching compiler barrier, kernel code
>> needs to have causality relationship with user-space execution,
>> e.g. user-space does a system call, or returns from a system call.
>>
>> In the case of io_uring, submitting a request or returning from waiting
>> on request completion appear to provide this causality relationship.
>>
>> Thanks,
>>
>> Mathieu
>>
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2020-07-16 18:58 UTC|newest]
Thread overview: 118+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-10 1:56 [RFC PATCH 0/7] mmu context cleanup, lazy tlb cleanup, Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 1/7] asm-generic: add generic MMU versions of mmu context functions Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 2/7] arch: use asm-generic mmu context for no-op implementations Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 3/7] mm: introduce exit_lazy_tlb Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 9:42 ` Peter Zijlstra
2020-07-10 9:42 ` Peter Zijlstra
2020-07-10 14:02 ` Mathieu Desnoyers
2020-07-10 14:02 ` Mathieu Desnoyers
2020-07-10 17:04 ` Andy Lutomirski
2020-07-10 17:04 ` Andy Lutomirski
2020-07-13 4:45 ` Nicholas Piggin
2020-07-13 4:45 ` Nicholas Piggin
2020-07-13 13:47 ` Nicholas Piggin
2020-07-13 13:47 ` Nicholas Piggin
2020-07-13 14:13 ` Mathieu Desnoyers
2020-07-13 14:13 ` Mathieu Desnoyers
2020-07-13 15:48 ` Andy Lutomirski
2020-07-13 15:48 ` Andy Lutomirski
2020-07-13 16:37 ` Nicholas Piggin
2020-07-13 16:37 ` Nicholas Piggin
2020-07-16 4:15 ` Nicholas Piggin
2020-07-16 4:15 ` Nicholas Piggin
2020-07-16 4:42 ` Nicholas Piggin
2020-07-16 4:42 ` Nicholas Piggin
2020-07-16 15:46 ` Mathieu Desnoyers
2020-07-16 15:46 ` Mathieu Desnoyers
2020-07-16 16:03 ` Mathieu Desnoyers
2020-07-16 16:03 ` Mathieu Desnoyers
2020-07-16 18:58 ` Mathieu Desnoyers [this message]
2020-07-16 18:58 ` Mathieu Desnoyers
2020-07-16 21:24 ` Alan Stern
2020-07-16 21:24 ` Alan Stern
2020-07-17 13:39 ` Mathieu Desnoyers
2020-07-17 13:39 ` Mathieu Desnoyers
2020-07-17 14:51 ` Alan Stern
2020-07-17 14:51 ` Alan Stern
2020-07-17 15:39 ` Mathieu Desnoyers
2020-07-17 15:39 ` Mathieu Desnoyers
2020-07-17 16:11 ` Alan Stern
2020-07-17 16:11 ` Alan Stern
2020-07-17 16:22 ` Mathieu Desnoyers
2020-07-17 16:22 ` Mathieu Desnoyers
2020-07-17 17:44 ` Alan Stern
2020-07-17 17:44 ` Alan Stern
2020-07-17 17:52 ` Mathieu Desnoyers
2020-07-17 17:52 ` Mathieu Desnoyers
2020-07-17 0:00 ` Nicholas Piggin
2020-07-17 0:00 ` Nicholas Piggin
2020-07-16 5:18 ` Andy Lutomirski
2020-07-16 5:18 ` Andy Lutomirski
2020-07-16 6:06 ` Nicholas Piggin
2020-07-16 6:06 ` Nicholas Piggin
2020-07-16 8:50 ` Peter Zijlstra
2020-07-16 8:50 ` Peter Zijlstra
2020-07-16 10:03 ` Nicholas Piggin
2020-07-16 10:03 ` Nicholas Piggin
2020-07-16 11:00 ` peterz
2020-07-16 11:00 ` peterz
2020-07-16 15:34 ` Mathieu Desnoyers
2020-07-16 15:34 ` Mathieu Desnoyers
2020-07-16 23:26 ` Nicholas Piggin
2020-07-16 23:26 ` Nicholas Piggin
2020-07-17 13:42 ` Mathieu Desnoyers
2020-07-17 13:42 ` Mathieu Desnoyers
2020-07-20 3:03 ` Nicholas Piggin
2020-07-20 3:03 ` Nicholas Piggin
2020-07-20 16:46 ` Mathieu Desnoyers
2020-07-20 16:46 ` Mathieu Desnoyers
2020-07-21 10:04 ` Nicholas Piggin
2020-07-21 10:04 ` Nicholas Piggin
2020-07-21 13:11 ` Mathieu Desnoyers
2020-07-21 13:11 ` Mathieu Desnoyers
2020-07-21 14:30 ` Nicholas Piggin
2020-07-21 14:30 ` Nicholas Piggin
2020-07-21 15:06 ` peterz
2020-07-21 15:06 ` peterz
2020-07-21 15:15 ` Mathieu Desnoyers
2020-07-21 15:15 ` Mathieu Desnoyers
2020-07-21 15:19 ` Peter Zijlstra
2020-07-21 15:19 ` Peter Zijlstra
2020-07-21 15:22 ` Mathieu Desnoyers
2020-07-21 15:22 ` Mathieu Desnoyers
2020-07-10 1:56 ` [RFC PATCH 5/7] lazy tlb: introduce lazy mm refcount helper functions Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 9:48 ` Peter Zijlstra
2020-07-10 9:48 ` Peter Zijlstra
2020-07-10 1:56 ` [RFC PATCH 6/7] lazy tlb: allow lazy tlb mm switching to be configurable Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 7/7] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Nicholas Piggin
2020-07-10 1:56 ` Nicholas Piggin
2020-07-10 9:35 ` Peter Zijlstra
2020-07-10 9:35 ` Peter Zijlstra
2020-07-13 4:58 ` Nicholas Piggin
2020-07-13 4:58 ` Nicholas Piggin
2020-07-13 15:59 ` Andy Lutomirski
2020-07-13 15:59 ` Andy Lutomirski
2020-07-13 16:48 ` Nicholas Piggin
2020-07-13 16:48 ` Nicholas Piggin
2020-07-13 18:18 ` Andy Lutomirski
2020-07-13 18:18 ` Andy Lutomirski
2020-07-14 5:04 ` Nicholas Piggin
2020-07-14 5:04 ` Nicholas Piggin
2020-07-14 6:31 ` Nicholas Piggin
2020-07-14 6:31 ` Nicholas Piggin
2020-07-14 12:46 ` Andy Lutomirski
2020-07-14 12:46 ` Andy Lutomirski
2020-07-14 13:23 ` Peter Zijlstra
2020-07-14 13:23 ` Peter Zijlstra
2020-07-16 2:26 ` Nicholas Piggin
2020-07-16 2:26 ` Nicholas Piggin
2020-07-16 2:35 ` Nicholas Piggin
2020-07-16 2:35 ` Nicholas Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=595582123.17106.1594925921537.JavaMail.zimbra@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=anton@ozlabs.org \
--cc=arnd@arndb.de \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=luto@kernel.org \
--cc=npiggin@gmail.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=stern@rowland.harvard.edu \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.