From: Will Deacon
Subject: Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending
Date: Wed, 15 Jul 2020 15:48:06 +0100
Message-ID: <20200715144806.GA3443108@google.com>
References: <20200715104246.GA3143299@google.com>
To: Palmer Dabbelt
Cc: kernel@esmil.dk, guoren@kernel.org, linux-riscv@lists.infradead.org,
    Arnd Bergmann, Paul Walmsley, linux-arch@vger.kernel.org

On Wed, Jul 15, 2020 at 07:03:49AM -0700, Palmer Dabbelt wrote:
> On Wed, 15 Jul 2020 03:42:46 PDT (-0700), Will Deacon wrote:
> > Hmm. Although I _think_ something like the diff below ought to work,
> > are you sure you want to be doing MMIO writes in preemptible context?
> > Setting '.disable_locking = true' in 'sifive_gpio_regmap_config'
> > implies to me that you should be handling the locking within the
> > driver itself, and all the other regmap writes are protected by
> > '&gc->bgpio_lock'.
>
> I guess my goal here was to avoid fixing the drivers: it's one thing if
> it's just broken SiFive drivers, as they're all a bit crusty, but this
> is blowing up for me in the 8250 driver on QEMU as well.  At that point
> I figured there'd be an endless stream of bugs around this and I'd
> rather just.

Right, and my patch should solve that.
> > Given that riscv is one of the few architectures needing an
> > implementation of mmiowb(), doing MMIO in a preemptible section seems
> > especially dangerous as you have no way to ensure completion of the
> > writes without adding an mmiowb() to the CPU migration path (i.e.
> > context switch).
>
> I was going to just stick one in our context switching code
> unconditionally.  While we could go track cumulative writes outside the
> locks, the mmiowb is essentially free for us because the one RISC-V
> implementation treats all fences the same way, so the subsequent
> store_release would hold all this up anyway.
>
> I think the right thing to do is to add some sort of arch hook right
> about here
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index cfd71d61aa3c..14b4f8b7433f 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3212,6 +3212,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
>  	prev_state = prev->state;
>  	vtime_task_switch(prev);
>  	perf_event_task_sched_in(prev, current);
> +	finish_arch_pre_release(prev);
>  	finish_task(prev);
>  	finish_lock_switch(rq);
>  	finish_arch_post_lock_switch();
>
> but I was just going to stick it in switch_to for now... :).  I guess
> we could also roll the fence up into yet another one-off primitive for
> the scheduler, something like

What does the above get you over switch_to()?
> > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > index 9439ff037b2d..5698fca3bf56 100644
> > --- a/include/asm-generic/mmiowb.h
> > +++ b/include/asm-generic/mmiowb.h
> > @@ -27,7 +27,7 @@
> >  #include
> >
> >  DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > -#define __mmiowb_state()	this_cpu_ptr(&__mmiowb_state)
> > +#define __mmiowb_state()	raw_cpu_ptr(&__mmiowb_state)
> >  #else
> >  #define __mmiowb_state()	arch_mmiowb_state()
> >  #endif	/* arch_mmiowb_state */
> > @@ -35,7 +35,9 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> >  static inline void mmiowb_set_pending(void)
> >  {
> >  	struct mmiowb_state *ms = __mmiowb_state();
> > -	ms->mmiowb_pending = ms->nesting_count;
> > +
> > +	if (likely(ms->nesting_count))
> > +		ms->mmiowb_pending = ms->nesting_count;
>
> Ya, that's one of the earlier ideas I had, but I decided it doesn't
> actually do anything: if we're scheduleable then we know that pending
> and count are zero, thus the check isn't necessary.  It made sense late
> last night and still does this morning, but I haven't had my coffee yet.

What it does is prevent preemptible writeX() from trashing the state on
another CPU, so I think it's a valid fix. I agree that it doesn't help you
if you need mmiowb(), but then that _really_ should only be needed if
you're holding a spinlock. If you're doing concurrent lockless MMIO you
deserve all the pain you get.

I don't get why you think the patch does nothing, as it will operate as
expected if writeX() is called with preemption disabled, which is the
common case.

> I'm kind of tempted to just declare "mmiowb() is fast on RISC-V, so
> let's do it unconditionally everywhere it's necessary".  IIRC that's
> essentially true on the existing implementation, as it'll get rolled up
> to any upcoming fence anyway.
> It seems like building any real machine that relies on the orderings
> provided by mmiowb is going to have an infinite rabbit hole of bugs
> anyway, so in that case we'd just rely on the hardware to elide the now
> unnecessary fences, so we'd just be throwing static code size at this
> wacky memory model and then forgetting about it.

If you can do that, that's obviously the best approach.

> I'm going to send out a patch set that does all the work I think is
> necessary to avoid fixing up the various drivers, with the accounting
> code to avoid mmiowbs all over our port.  I'm not sure I'm going to
> like it, but I guess we can argue as to exactly how ugly it is :)

Ok.

Will