linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Anton Blanchard <anton@ozlabs.org>, Arnd Bergmann <arnd@arndb.de>,
	 linux-arch <linux-arch@vger.kernel.org>,
	 linux-kernel <linux-kernel@vger.kernel.org>,
	 linux-mm <linux-mm@kvack.org>,
	 linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	 Andy Lutomirski <luto@amacapital.net>,
	 Andy Lutomirski <luto@kernel.org>,
	 Peter Zijlstra <peterz@infradead.org>, x86 <x86@kernel.org>,
	 Jens Axboe <axboe@kernel.dk>
Subject: Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
Date: Mon, 20 Jul 2020 12:46:30 -0400 (EDT)	[thread overview]
Message-ID: <2055788870.20749.1595263590675.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <1595213677.kxru89dqy2.astroid@bobo.none>

----- On Jul 19, 2020, at 11:03 PM, Nicholas Piggin npiggin@gmail.com wrote:

> Excerpts from Mathieu Desnoyers's message of July 17, 2020 11:42 pm:
>> ----- On Jul 16, 2020, at 7:26 PM, Nicholas Piggin npiggin@gmail.com wrote:
>> [...]
>>> 
>>> membarrier does replace barrier instructions on remote CPUs, which do
>>> order accesses performed by the kernel on the user address space. So
>>> membarrier should too I guess.
>>> 
>>> Normal process context accesses like read(2) will do so because they
>>> don't get filtered out from IPIs, but kernel threads using the mm may
>>> not.
>> 
>> But it should not be an issue, because membarrier's ordering is only with
>> respect
>> to submit and completion of io_uring requests, which are performed through
>> system calls from the context of user-space threads, which are called from the
>> right mm.
> 
> Is that true? Can io completions be written into an address space via a
> kernel thread? I don't know the io_uring code well but it looks like
> that's asynchonously using the user mm context.

Indeed, the io completion appears to be signaled asynchronously between kernel
and user-space. Therefore, both kernel and userspace code need to have proper
memory barriers in place to signal completion, otherwise user-space could read
garbage after it notices completion of a read.

I did not review the entire io_uring implementation, but the publish side
for completion appears to be:

static void __io_commit_cqring(struct io_ring_ctx *ctx)
{
        struct io_rings *rings = ctx->rings;

        /* order cqe stores with ring update */
        smp_store_release(&rings->cq.tail, ctx->cached_cq_tail);

        if (wq_has_sleeper(&ctx->cq_wait)) {
                wake_up_interruptible(&ctx->cq_wait);
                kill_fasync(&ctx->cq_fasync, SIGIO, POLL_IN);
        }
}

The store-release on tail should be paired with a load_acquire on the
reader-side (it's called "read_barrier()" in the code):

tools/io_uring/queue.c:

static int __io_uring_get_cqe(struct io_uring *ring,
                              struct io_uring_cqe **cqe_ptr, int wait)
{
        struct io_uring_cq *cq = &ring->cq;
        const unsigned mask = *cq->kring_mask;
        unsigned head;
        int ret;

        *cqe_ptr = NULL;
        head = *cq->khead;
        do {
                /*
                 * It's necessary to use a read_barrier() before reading
                 * the CQ tail, since the kernel updates it locklessly. The
                 * kernel has the matching store barrier for the update. The
                 * kernel also ensures that previous stores to CQEs are ordered
                 * with the tail update.
                 */
                read_barrier();
                if (head != *cq->ktail) {
                        *cqe_ptr = &cq->cqes[head & mask];
                        break;
                }
                if (!wait)
                        break;
                ret = io_uring_enter(ring->ring_fd, 0, 1,
                                        IORING_ENTER_GETEVENTS, NULL);
                if (ret < 0)
                        return -errno;
        } while (1);

        return 0;
}

So as far as membarrier memory ordering dependencies are concerned, it relies
on the store-release/load-acquire dependency chain in the completion queue to
order against anything that was done prior to the completed requests.

What is in-flight while the requests are being serviced provides no memory
ordering guarantee whatsoever.

> How about other memory accesses via kthread_use_mm? Presumably there is
> still ordering requirement there for membarrier,

Please provide an example case with memory accesses via kthread_use_mm where
ordering matters to support your concern.

> so I really think
> it's a fragile interface with no real way for the user to know how
> kernel threads may use its mm for any particular reason, so membarrier
> should synchronize all possible kernel users as well.

I strongly doubt so, but perhaps something should be clarified in the documentation
if you have that feeling.

Thanks,

Mathieu

> 
> Thanks,
> Nick

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


  reply	other threads:[~2020-07-20 16:46 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-10  1:56 [RFC PATCH 0/7] mmu context cleanup, lazy tlb cleanup, Nicholas Piggin
2020-07-10  1:56 ` [RFC PATCH 1/7] asm-generic: add generic MMU versions of mmu context functions Nicholas Piggin
2020-07-10  1:56 ` [RFC PATCH 2/7] arch: use asm-generic mmu context for no-op implementations Nicholas Piggin
2020-07-10  1:56 ` [RFC PATCH 3/7] mm: introduce exit_lazy_tlb Nicholas Piggin
2020-07-10  1:56 ` [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode Nicholas Piggin
2020-07-10  9:42   ` Peter Zijlstra
2020-07-10 14:02   ` Mathieu Desnoyers
2020-07-10 17:04   ` Andy Lutomirski
2020-07-13  4:45     ` Nicholas Piggin
2020-07-13 13:47       ` Nicholas Piggin
2020-07-13 14:13         ` Mathieu Desnoyers
2020-07-13 15:48           ` Andy Lutomirski
2020-07-13 16:37             ` Nicholas Piggin
2020-07-16  4:15           ` Nicholas Piggin
2020-07-16  4:42             ` Nicholas Piggin
2020-07-16 15:46               ` Mathieu Desnoyers
2020-07-16 16:03                 ` Mathieu Desnoyers
2020-07-16 18:58                   ` Mathieu Desnoyers
2020-07-16 21:24                     ` Alan Stern
2020-07-17 13:39                       ` Mathieu Desnoyers
2020-07-17 14:51                         ` Alan Stern
2020-07-17 15:39                           ` Mathieu Desnoyers
2020-07-17 16:11                             ` Alan Stern
2020-07-17 16:22                               ` Mathieu Desnoyers
2020-07-17 17:44                                 ` Alan Stern
2020-07-17 17:52                                   ` Mathieu Desnoyers
2020-07-17  0:00                     ` Nicholas Piggin
2020-07-16  5:18             ` Andy Lutomirski
2020-07-16  6:06               ` Nicholas Piggin
2020-07-16  8:50               ` Peter Zijlstra
2020-07-16 10:03                 ` Nicholas Piggin
2020-07-16 11:00                   ` peterz
2020-07-16 15:34                     ` Mathieu Desnoyers
2020-07-16 23:26                     ` Nicholas Piggin
2020-07-17 13:42                       ` Mathieu Desnoyers
2020-07-20  3:03                         ` Nicholas Piggin
2020-07-20 16:46                           ` Mathieu Desnoyers [this message]
2020-07-21 10:04                             ` Nicholas Piggin
2020-07-21 13:11                               ` Mathieu Desnoyers
2020-07-21 14:30                                 ` Nicholas Piggin
2020-07-21 15:06                               ` peterz
2020-07-21 15:15                                 ` Mathieu Desnoyers
2020-07-21 15:19                                   ` Peter Zijlstra
2020-07-21 15:22                                     ` Mathieu Desnoyers
2020-07-10  1:56 ` [RFC PATCH 5/7] lazy tlb: introduce lazy mm refcount helper functions Nicholas Piggin
2020-07-10  9:48   ` Peter Zijlstra
2020-07-10  1:56 ` [RFC PATCH 6/7] lazy tlb: allow lazy tlb mm switching to be configurable Nicholas Piggin
2020-07-10  1:56 ` [RFC PATCH 7/7] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Nicholas Piggin
2020-07-10  9:35   ` Peter Zijlstra
2020-07-13  4:58     ` Nicholas Piggin
2020-07-13 15:59   ` Andy Lutomirski
2020-07-13 16:48     ` Nicholas Piggin
2020-07-13 18:18       ` Andy Lutomirski
2020-07-14  5:04         ` Nicholas Piggin
2020-07-14  6:31           ` Nicholas Piggin
2020-07-14 12:46             ` Andy Lutomirski
2020-07-14 13:23               ` Peter Zijlstra
2020-07-16  2:26               ` Nicholas Piggin
2020-07-16  2:35               ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2055788870.20749.1595263590675.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=anton@ozlabs.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).