public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Stephen Boyd <sboyd@codeaurora.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Krzysztof Kozlowski <k.kozlowski@samsung.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>
Subject: Re: [PATCH v2] ARM: Don't use complete() during __cpu_die
Date: Fri, 13 Feb 2015 15:38:09 +0000	[thread overview]
Message-ID: <20150213153809.GF10496@leverpostej> (raw)
In-Reply-To: <54DA6E92.3090109@codeaurora.org>

On Tue, Feb 10, 2015 at 08:48:18PM +0000, Stephen Boyd wrote:
> On 02/10/15 07:14, Mark Rutland wrote:
> > On Tue, Feb 10, 2015 at 01:24:08AM +0000, Stephen Boyd wrote:
> >> On 02/05/15 08:11, Russell King - ARM Linux wrote:
> >>> On Thu, Feb 05, 2015 at 06:29:18AM -0800, Paul E. McKenney wrote:
> >>>> Works for me, assuming no hidden uses of RCU in the IPI code.  ;-)
> >>> Sigh... I kind'a new it wouldn't be this simple.  The gic code which
> >>> actually raises the IPI takes a raw spinlock, so it's not going to be
> >>> this simple - there's a small theoretical window where we have taken
> >>> this lock, written the register to send the IPI, and then dropped the
> >>> lock - the update to the lock to release it could get lost if the
> >>> CPU power is quickly cut at that point.
> >> Hm.. at first glance it would seem like a similar problem exists with
> >> the completion variable. But it seems that we rely on the call to
> >> complete() fom the dying CPU to synchronize with wait_for_completion()
> >> on the killing CPU via the completion's wait.lock.
> >>
> >> void complete(struct completion *x)
> >> {
> >>         unsigned long flags;
> >>
> >>         spin_lock_irqsave(&x->wait.lock, flags);
> >>         x->done++;
> >>         __wake_up_locked(&x->wait, TASK_NORMAL, 1);
> >>         spin_unlock_irqrestore(&x->wait.lock, flags);
> >> }
> >>
> >> and
> >>
> >> static inline long __sched
> >> do_wait_for_common(struct completion *x,
> >>                   long (*action)(long), long timeout, int state)
> >>                         ...
> >> 			spin_unlock_irq(&x->wait.lock);
> >> 			timeout = action(timeout);
> >> 			spin_lock_irq(&x->wait.lock);
> >>
> >>
> >> so the power can't really be cut until the killing CPU sees the lock
> >> released either explicitly via the second cache flush in cpu_die() or
> >> implicitly via hardware.
> > That sounds about right, though surely cache flush is irrelevant w.r.t.
> > publishing of the unlock? The dsb(ishst) in the unlock path will ensure
> > that the write is visibile prior to the second flush_cache_louis().
> 
> Ah right. I was incorrectly thinking that the CPU had already disabled
> coherency at this point.
> 
> >
> > That said, we _do_ need to flush the cache prior to the CPU being
> > killed, or we can lose any (shared) dirty cache lines the CPU owns. In
> > the presence of dirty cacheline migration we need to be sure the CPU to
> > be killed doesn't acquire any lines prior to being killed (i.e. its
> > caches need to be off and flushed). Given that I don't think it's
> > feasible to perform an IPI.
> 
> The IPI/completion sounds nice because it allows the killing CPU to
> schedule and do other work until the dying CPU notifies that it's almost
> dead.

Ok. My main concern was that it's not sufficient to know that a CPU is
ready to die, but I guess it may signal that it is close.

> > I think we need to move the synchronisation down into the
> > cpu_ops::{cpu_die,cpu_kill} implementations, so that we can have the
> > dying CPU signal readiness after it has disabled and flushed its caches.
> >
> > If the CPU can kill itself and we can query the state of the CPU, then
> > the dying CPU needs to do nothing, and cpu_kill can just poll until it
> > is dead. If the CPU needs to be killed from another CPU, it can update a
> > (cacheline-padded) percpu variable that cpu_kill can poll (cleaning
> > before each read).
> 
> How about a hybrid approach where we send the IPI from generic cpu_die()
> and then do the cacheline-padded bit poll + invalidate and bit set? That
> way we get the benefit of not doing that poll until we really need to
> and if we need to do it at all.

That sounds sane to me.

> cpu_kill | cpu_die | IPI | bit poll 
> ---------+---------+-----+----------
>     Y    |    Y    |  Y  |   N
>     N    |    Y    |  Y  |   Y
>     Y    |    N    |  ?  |   ?    <-- Is this a valid configuration?
>     N    |    N    |  N  |   N    <-- Hotplug should be disabled
> 
> 
> If the hardware doesn't have a synchronization method in row 1 we can
> expose the bit polling functionality to the ops so that they can set and
> poll the bit. It looks like rockchip would need this because we just
> pull the power in cpu_kill without any synchronization. Unfortunately
> this is starting to sound like a fairly large patch to backport.

Oh, fun.

> Aside: For that last row we really should be setting cpu->hotpluggable
> in struct cpu based on what cpu_ops::cpu_disable returns (from what I
> can tell we use that op to indicate if a CPU can be hotplugged).
> 
> Double Aside: It seems that exynos has a bug with coherency.
> exynos_cpu_die() calls v7_exit_coherency_flush() and then calls
> exynos_cpu_power_down() which may call of_machine_is_compatible() and
> that function will grab and release a kref which uses ldrex/strex for
> atomics after we've disabled coherency in v7_exit_coherency_flush().
> 
> >> Maybe we can do the same thing here by using a
> >> spinlock for synchronization between the IPI handler and the dying CPU?
> >> So lock/unlock around the IPI sending from the dying CPU and then do a
> >> lock/unlock on the killing CPU before continuing.
> >>
> >> It would be nice if we didn't have to do anything at all though so
> >> perhaps we can make it a nop on configs where there isn't a big little
> >> switcher. Yeah it's some ugly coupling between these two pieces of code,
> >> but I'm not sure how we can do better.
> > I'm missing something here. What does the switcher have to do with this?
> 
> The switcher is the reason we have a spinlock in gic_raise_softirq().
> That's the problem that Russell was mentioning.

Ah. Thanks for the pointer.

Thanks,
Mark.

  parent reply	other threads:[~2015-02-13 15:38 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-05 10:14 [PATCH v2] ARM: Don't use complete() during __cpu_die Krzysztof Kozlowski
2015-02-05 10:50 ` Russell King - ARM Linux
2015-02-05 11:00   ` Krzysztof Kozlowski
2015-02-05 11:08     ` Russell King - ARM Linux
2015-02-05 11:28   ` Mark Rutland
2015-02-05 11:30     ` Russell King - ARM Linux
2015-02-05 14:29   ` Paul E. McKenney
2015-02-05 16:11     ` Russell King - ARM Linux
2015-02-05 17:02       ` Paul E. McKenney
2015-02-05 17:34         ` Russell King - ARM Linux
2015-02-05 17:54           ` Paul E. McKenney
2015-02-10  1:24       ` Stephen Boyd
2015-02-10  1:37         ` Paul E. McKenney
2015-02-10  2:05           ` Stephen Boyd
2015-02-10  3:05             ` Paul E. McKenney
2015-02-10 15:14         ` Mark Rutland
2015-02-10 20:48           ` Stephen Boyd
2015-02-10 21:04             ` Stephen Boyd
2015-02-10 21:15               ` Russell King - ARM Linux
2015-02-10 21:49                 ` Stephen Boyd
2015-02-10 22:05                   ` Stephen Boyd
2015-02-13 15:52               ` Mark Rutland
2015-02-13 16:27                 ` Russell King - ARM Linux
2015-02-13 17:21                   ` Mark Rutland
2015-02-13 17:30                     ` Russell King - ARM Linux
2015-02-13 16:28                 ` Stephen Boyd
2015-02-13 15:38             ` Mark Rutland [this message]
2015-02-10 20:58           ` Russell King - ARM Linux
2015-02-10 15:41         ` Russell King - ARM Linux
2015-02-10 18:33           ` Stephen Boyd
2015-02-25 12:56       ` Russell King - ARM Linux
2015-02-25 16:47         ` Nicolas Pitre
2015-02-25 17:00           ` Russell King - ARM Linux
2015-02-25 18:13             ` Nicolas Pitre
2015-02-25 20:16               ` Nicolas Pitre
2015-02-26  1:05                 ` Paul E. McKenney
2015-03-22 23:30                   ` Paul E. McKenney
2015-03-23 12:55                     ` Russell King - ARM Linux
2015-03-23 13:21                       ` Paul E. McKenney
2015-03-23 14:00                         ` Russell King - ARM Linux
2015-03-23 15:37                           ` Paul E. McKenney
2015-03-23 16:56                             ` Paul E. McKenney
2015-02-26 19:14           ` Daniel Thompson
2015-02-26 19:47             ` Nicolas Pitre
2015-02-05 10:53 ` Mark Rutland
2015-02-05 10:59   ` Krzysztof Kozlowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150213153809.GF10496@leverpostej \
    --to=mark.rutland@arm.com \
    --cc=Catalin.Marinas@arm.com \
    --cc=Will.Deacon@arm.com \
    --cc=arnd@arndb.de \
    --cc=b.zolnierkie@samsung.com \
    --cc=k.kozlowski@samsung.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=m.szyprowski@samsung.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=sboyd@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox