From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Anton Blanchard <anton@ozlabs.org>, Arnd Bergmann <arnd@arndb.de>,
linux-arch <linux-arch@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
Andy Lutomirski <luto@amacapital.net>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>, x86 <x86@kernel.org>,
Jens Axboe <axboe@kernel.dk>
Subject: Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
Date: Mon, 20 Jul 2020 12:46:30 -0400 (EDT) [thread overview]
Message-ID: <2055788870.20749.1595263590675.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <1595213677.kxru89dqy2.astroid@bobo.none>
----- On Jul 19, 2020, at 11:03 PM, Nicholas Piggin npiggin@gmail.com wrote:
> Excerpts from Mathieu Desnoyers's message of July 17, 2020 11:42 pm:
>> ----- On Jul 16, 2020, at 7:26 PM, Nicholas Piggin npiggin@gmail.com wrote:
>> [...]
>>>
>>> membarrier does replace barrier instructions on remote CPUs, which do
>>> order accesses performed by the kernel on the user address space. So
>>> membarrier should too I guess.
>>>
>>> Normal process context accesses like read(2) will do so because they
>>> don't get filtered out from IPIs, but kernel threads using the mm may
>>> not.
>>
>> But it should not be an issue, because membarrier's ordering is only with
>> respect
>> to submit and completion of io_uring requests, which are performed through
>> system calls from the context of user-space threads, which are called from the
>> right mm.
>
> Is that true? Can io completions be written into an address space via a
> kernel thread? I don't know the io_uring code well but it looks like
> that's asynchonously using the user mm context.
Indeed, the io completion appears to be signaled asynchronously between kernel
and user-space. Therefore, both kernel and userspace code need to have proper
memory barriers in place to signal completion, otherwise user-space could read
garbage after it notices completion of a read.
I did not review the entire io_uring implementation, but the publish side
for completion appears to be:
static void __io_commit_cqring(struct io_ring_ctx *ctx)
{
struct io_rings *rings = ctx->rings;
/* order cqe stores with ring update */
smp_store_release(&rings->cq.tail, ctx->cached_cq_tail);
if (wq_has_sleeper(&ctx->cq_wait)) {
wake_up_interruptible(&ctx->cq_wait);
kill_fasync(&ctx->cq_fasync, SIGIO, POLL_IN);
}
}
The store-release on tail should be paired with a load_acquire on the
reader-side (it's called "read_barrier()" in the code):
tools/io_uring/queue.c:
static int __io_uring_get_cqe(struct io_uring *ring,
struct io_uring_cqe **cqe_ptr, int wait)
{
struct io_uring_cq *cq = &ring->cq;
const unsigned mask = *cq->kring_mask;
unsigned head;
int ret;
*cqe_ptr = NULL;
head = *cq->khead;
do {
/*
* It's necessary to use a read_barrier() before reading
* the CQ tail, since the kernel updates it locklessly. The
* kernel has the matching store barrier for the update. The
* kernel also ensures that previous stores to CQEs are ordered
* with the tail update.
*/
read_barrier();
if (head != *cq->ktail) {
*cqe_ptr = &cq->cqes[head & mask];
break;
}
if (!wait)
break;
ret = io_uring_enter(ring->ring_fd, 0, 1,
IORING_ENTER_GETEVENTS, NULL);
if (ret < 0)
return -errno;
} while (1);
return 0;
}
So as far as membarrier memory ordering dependencies are concerned, it relies
on the store-release/load-acquire dependency chain in the completion queue to
order against anything that was done prior to the completed requests.
What is in-flight while the requests are being serviced provides no memory
ordering guarantee whatsoever.
> How about other memory accesses via kthread_use_mm? Presumably there is
> still ordering requirement there for membarrier,
Please provide an example case with memory accesses via kthread_use_mm where
ordering matters to support your concern.
> so I really think
> it's a fragile interface with no real way for the user to know how
> kernel threads may use its mm for any particular reason, so membarrier
> should synchronize all possible kernel users as well.
I strongly doubt so, but perhaps something should be clarified in the documentation
if you have that feeling.
Thanks,
Mathieu
>
> Thanks,
> Nick
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2020-07-20 16:46 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-10 1:56 [RFC PATCH 0/7] mmu context cleanup, lazy tlb cleanup, Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 1/7] asm-generic: add generic MMU versions of mmu context functions Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 2/7] arch: use asm-generic mmu context for no-op implementations Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 3/7] mm: introduce exit_lazy_tlb Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode Nicholas Piggin
2020-07-10 9:42 ` Peter Zijlstra
2020-07-10 14:02 ` Mathieu Desnoyers
2020-07-10 17:04 ` Andy Lutomirski
2020-07-13 4:45 ` Nicholas Piggin
2020-07-13 13:47 ` Nicholas Piggin
2020-07-13 14:13 ` Mathieu Desnoyers
2020-07-13 15:48 ` Andy Lutomirski
2020-07-13 16:37 ` Nicholas Piggin
2020-07-16 4:15 ` Nicholas Piggin
2020-07-16 4:42 ` Nicholas Piggin
2020-07-16 15:46 ` Mathieu Desnoyers
2020-07-16 16:03 ` Mathieu Desnoyers
2020-07-16 18:58 ` Mathieu Desnoyers
2020-07-16 21:24 ` Alan Stern
2020-07-17 13:39 ` Mathieu Desnoyers
2020-07-17 14:51 ` Alan Stern
2020-07-17 15:39 ` Mathieu Desnoyers
2020-07-17 16:11 ` Alan Stern
2020-07-17 16:22 ` Mathieu Desnoyers
2020-07-17 17:44 ` Alan Stern
2020-07-17 17:52 ` Mathieu Desnoyers
2020-07-17 0:00 ` Nicholas Piggin
2020-07-16 5:18 ` Andy Lutomirski
2020-07-16 6:06 ` Nicholas Piggin
2020-07-16 8:50 ` Peter Zijlstra
2020-07-16 10:03 ` Nicholas Piggin
2020-07-16 11:00 ` peterz
2020-07-16 15:34 ` Mathieu Desnoyers
2020-07-16 23:26 ` Nicholas Piggin
2020-07-17 13:42 ` Mathieu Desnoyers
2020-07-20 3:03 ` Nicholas Piggin
2020-07-20 16:46 ` Mathieu Desnoyers [this message]
2020-07-21 10:04 ` Nicholas Piggin
2020-07-21 13:11 ` Mathieu Desnoyers
2020-07-21 14:30 ` Nicholas Piggin
2020-07-21 15:06 ` peterz
2020-07-21 15:15 ` Mathieu Desnoyers
2020-07-21 15:19 ` Peter Zijlstra
2020-07-21 15:22 ` Mathieu Desnoyers
2020-07-10 1:56 ` [RFC PATCH 5/7] lazy tlb: introduce lazy mm refcount helper functions Nicholas Piggin
2020-07-10 9:48 ` Peter Zijlstra
2020-07-10 1:56 ` [RFC PATCH 6/7] lazy tlb: allow lazy tlb mm switching to be configurable Nicholas Piggin
2020-07-10 1:56 ` [RFC PATCH 7/7] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Nicholas Piggin
2020-07-10 9:35 ` Peter Zijlstra
2020-07-13 4:58 ` Nicholas Piggin
2020-07-13 15:59 ` Andy Lutomirski
2020-07-13 16:48 ` Nicholas Piggin
2020-07-13 18:18 ` Andy Lutomirski
2020-07-14 5:04 ` Nicholas Piggin
2020-07-14 6:31 ` Nicholas Piggin
2020-07-14 12:46 ` Andy Lutomirski
2020-07-14 13:23 ` Peter Zijlstra
2020-07-16 2:26 ` Nicholas Piggin
2020-07-16 2:35 ` Nicholas Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2055788870.20749.1595263590675.JavaMail.zimbra@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=anton@ozlabs.org \
--cc=arnd@arndb.de \
--cc=axboe@kernel.dk \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=luto@amacapital.net \
--cc=luto@kernel.org \
--cc=npiggin@gmail.com \
--cc=peterz@infradead.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).