From: Frederic Weisbecker <fweisbec@gmail.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: peterz@infradead.org, tglx@linutronix.de, mingo@kernel.org,
tj@kernel.org, rusty@rustcorp.com.au, akpm@linux-foundation.org,
hch@infradead.org, mgorman@suse.de, riel@redhat.com, bp@suse.de,
rostedt@goodmis.org, mgalbraith@suse.de, ego@linux.vnet.ibm.com,
paulmck@linux.vnet.ibm.com, oleg@redhat.com, rjw@rjwysocki.net,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"
Date: Fri, 23 May 2014 15:22:53 +0200 [thread overview]
Message-ID: <20140523132250.GA1768@localhost.localdomain> (raw)
In-Reply-To: <20140523101216.17924.62447.stgit@srivatsabhat.in.ibm.com>
On Fri, May 23, 2014 at 03:42:20PM +0530, Srivatsa S. Bhat wrote:
> During CPU offline, stop-machine is used to take control over all the online
> CPUs (via the per-cpu stopper thread) and then run take_cpu_down() on the CPU
> that is to be taken offline.
>
> But stop-machine itself has several stages: _PREPARE, _DISABLE_IRQ, _RUN etc.
> The important thing to note here is that the _DISABLE_IRQ stage comes much
> later after starting stop-machine, and hence there is a large window where
> other CPUs can send IPIs to the CPU going offline. As a result, we can
> encounter a scenario as depicted below, which causes IPIs to be sent to the
> CPU going offline, and that CPU notices them *after* it has gone offline,
> triggering the "IPI-to-offline-CPU" warning from the smp-call-function code.
>
>
> CPU 1 CPU 2
> (Online CPU) (CPU going offline)
>
> Enter _PREPARE stage Enter _PREPARE stage
>
> Enter _DISABLE_IRQ stage
>
>
> =
> Got a device interrupt, | Didn't notice the IPI
> and the interrupt handler | since interrupts were
> called smp_call_function() | disabled on this CPU.
> and sent an IPI to CPU 2. |
> =
>
>
> Enter _DISABLE_IRQ stage
>
>
> Enter _RUN stage Enter _RUN stage
>
> =
> Busy loop with interrupts | Invoke take_cpu_down()
> disabled. | and take CPU 2 offline
> =
>
>
> Enter _EXIT stage Enter _EXIT stage
>
> Re-enable interrupts Re-enable interrupts
>
> The pending IPI is noted
> immediately, but alas,
> the CPU is offline at
> this point.
>
>
>
> So, as we can observe from this scenario, the IPI was sent when CPU 2 was
> still online, and hence it was perfectly legal. But unfortunately it was
> noted only after CPU 2 went offline, resulting in the warning from the
> IPI handling code. In other words, the fault was not at the sender, but
> at the receiver side - and if we look closely, the real bug is in the
> stop-machine sequence itself.
>
> The problem here is that the CPU going offline disabled its local interrupts
> (by entering _DISABLE_IRQ phase) *before* the other CPUs. And that's the
> reason why it was not able to respond to the IPI before going offline.
>
> A simple solution to this problem is to ensure that the CPU going offline
> disables its interrupts only *after* the other CPUs do the same thing.
> To achieve this, split the _DISABLE_IRQ state into 2 parts:
>
> 1st part: MULTI_STOP_DISABLE_IRQ_INACTIVE, where only the non-active CPUs
> (i.e., the "other" CPUs) disable their interrupts.
>
> 2nd part: MULTI_STOP_DISABLE_IRQ_ACTIVE, where the active CPU (i.e., the
> CPU going offline) disables its interrupts.
>
> With this in place, the CPU going offline will always be the last one to
> disable interrupts. After this step, no further IPIs can be sent to the
> outgoing CPU, since all the other CPUs would be executing the stop-machine
> code with interrupts disabled. And by the time stop-machine ends, the CPU
> would have gone offline and disappeared from the cpu_online_mask, and hence
> future invocations of smp_call_function() and friends will automatically
> prune that CPU out. Thus, we can guarantee that no CPU will end up
> *inadvertently* sending IPIs to an offline CPU.
>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
>
> kernel/stop_machine.c | 39 ++++++++++++++++++++++++++++++++++-----
> 1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 01fbae5..288f7fe 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -130,8 +130,10 @@ enum multi_stop_state {
> MULTI_STOP_NONE,
> /* Awaiting everyone to be scheduled. */
> MULTI_STOP_PREPARE,
> - /* Disable interrupts. */
> - MULTI_STOP_DISABLE_IRQ,
> + /* Disable interrupts on CPUs not in ->active_cpus mask. */
> + MULTI_STOP_DISABLE_IRQ_INACTIVE,
> + /* Disable interrupts on CPUs in ->active_cpus mask. */
> + MULTI_STOP_DISABLE_IRQ_ACTIVE,
> /* Run the function */
> MULTI_STOP_RUN,
> /* Exit */
> @@ -189,12 +191,39 @@ static int multi_cpu_stop(void *data)
> do {
> /* Chill out and ensure we re-read multi_stop_state. */
> cpu_relax();
> +
> + /*
> + * We use 2 separate stages to disable interrupts, namely
> + * _INACTIVE and _ACTIVE, to ensure that the inactive CPUs
> + * disable their interrupts first, followed by the active CPUs.
> + *
> + * This is done to avoid a race in the CPU offline path, which
> + * can lead to receiving IPIs on the outgoing CPU *after* it
> + * has gone offline.
> + *
> + * During CPU offline, we don't want the other CPUs to send
> + * IPIs to the active_cpu (the outgoing CPU) *after* it has
> + * disabled interrupts (because, then it will notice the IPIs
> + * only after it has gone offline). We can prevent this by
> + * making the other CPUs disable their interrupts first - that
> + * way, they will run the stop-machine code with interrupts
> + * disabled, and hence won't send IPIs after that point.
> + */
> +
> if (msdata->state != curstate) {
> curstate = msdata->state;
> switch (curstate) {
> - case MULTI_STOP_DISABLE_IRQ:
> - local_irq_disable();
> - hard_irq_disable();
> + case MULTI_STOP_DISABLE_IRQ_INACTIVE:
> + if (!is_active) {
> + local_irq_disable();
> + hard_irq_disable();
> + }
> + break;
> + case MULTI_STOP_DISABLE_IRQ_ACTIVE:
> + if (is_active) {
> + local_irq_disable();
> + hard_irq_disable();
> + }
Do we actually need that now that we are flushing the ipi queue on CPU dying?
> break;
> case MULTI_STOP_RUN:
> if (is_active)
>
next prev parent reply other threads:[~2014-05-23 13:22 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-23 10:11 [PATCH v6 0/3] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue Srivatsa S. Bhat
2014-05-23 10:12 ` [PATCH v6 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU Srivatsa S. Bhat
2014-05-23 10:12 ` [PATCH v6 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU" Srivatsa S. Bhat
2014-05-23 13:22 ` Frederic Weisbecker [this message]
2014-05-23 14:45 ` Srivatsa S. Bhat
2014-05-23 15:04 ` Frederic Weisbecker
2014-05-23 15:24 ` Srivatsa S. Bhat
2014-05-23 15:12 ` Peter Zijlstra
2014-05-23 15:18 ` Srivatsa S. Bhat
2014-05-23 15:31 ` Peter Zijlstra
2014-05-23 15:33 ` Srivatsa S. Bhat
2014-05-23 15:37 ` Srivatsa S. Bhat
2014-05-23 15:48 ` Peter Zijlstra
2014-05-23 15:53 ` Srivatsa S. Bhat
2014-05-23 17:05 ` Srivatsa S. Bhat
2014-05-23 15:21 ` Peter Zijlstra
2014-05-23 15:31 ` Srivatsa S. Bhat
2014-05-23 10:12 ` [PATCH v6 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline Srivatsa S. Bhat
2014-05-23 13:27 ` Frederic Weisbecker
2014-05-23 14:47 ` Srivatsa S. Bhat
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140523132250.GA1768@localhost.localdomain \
--to=fweisbec@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=bp@suse.de \
--cc=ego@linux.vnet.ibm.com \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mgalbraith@suse.de \
--cc=mgorman@suse.de \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=rjw@rjwysocki.net \
--cc=rostedt@goodmis.org \
--cc=rusty@rustcorp.com.au \
--cc=srivatsa.bhat@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox