linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jan H. Schönherr" <jschoenh@amazon.de>
To: Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, "Joerg Roedel" <jroedel@suse.de>,
	"Borislav Petkov" <bp@alien8.de>,
	linux-kernel@vger.kernel.org,
	"Lai Jiangshan" <laijs@cn.fujitsu.com>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Anton Blanchard" <anton@samba.org>,
	"Matt Wilson" <msw@amazon.com>,
	"Jan H. Schönherr" <jschoenh@amazon.de>
Subject: [PATCH] sched: fix cpu_active_mask/cpu_online_mask race
Date: Mon, 27 Jul 2015 20:21:28 +0200	[thread overview]
Message-ID: <1438021288-12335-1-git-send-email-jschoenh@amazon.de> (raw)
In-Reply-To: <20150720152744.GA7010@nazgul.tnic>

From: Jan H. Schönherr <jschoenh@amazon.de>

There is a race condition in SMP bootup code, which may result in

    WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:4418
    workqueue_cpu_up_callback()
or
    kernel BUG at kernel/smpboot.c:135!

It can be triggered with a bit of luck in Linux guests running on
busy hosts.

CPU0                        CPUn
====                        ====

_cpu_up()
  __cpu_up()
                            start_secondary()
                              set_cpu_online()
                                cpumask_set_cpu(cpu,
                                           to_cpumask(cpu_online_bits));
  cpu_notify(CPU_ONLINE)
    <do stuff, see below>
                                cpumask_set_cpu(cpu,
                                           to_cpumask(cpu_active_bits));

During the various CPU_ONLINE callbacks CPUn is online but not active.
Several things can go wrong at that point, depending on the scheduling
of tasks on CPU0.

Variant 1:

  cpu_notify(CPU_ONLINE)
    workqueue_cpu_up_callback()
      rebind_workers()
        set_cpus_allowed_ptr()

  This call fails because it requires an active CPU; rebind_workers()
  ends with a warning:

    WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:4418
    workqueue_cpu_up_callback()

Variant 2:

  cpu_notify(CPU_ONLINE)
    smpboot_thread_call()
      smpboot_unpark_threads()
       ..
        __kthread_unpark()
          __kthread_bind()
          wake_up_state()
           ..
            select_task_rq()
              select_fallback_rq()

  The ->wake_cpu of the unparked thread is not allowed, making a call
  to select_fallback_rq() necessary. Then, select_fallback_rq() cannot
  find an allowed, active CPU and promptly resets the allowed CPUs, so
  that the task in question ends up on CPU0.

  When those unparked tasks are eventually executed, they run
  immediately into a BUG:

    kernel BUG at kernel/smpboot.c:135!

Just changing the order in which the online/active bits are set (and
adding some memory barriers), would solve the two issues above.
However, it would change the order of operations back to the one
before commit 6acbfb96976f ("sched: Fix hotplug vs.
set_cpus_allowed_ptr()"), thus, reintroducing that particular problem.

Going further back into history, we have at least the following commits
touching this topic:
- commit 2baab4e90495 ("sched: Fix select_fallback_rq() vs
  cpu_active/cpu_online")
- commit 5fbd036b552f ("sched: Cleanup cpu_active madness")

Together, these give us the following non-working solutions:
- secondary CPU sets active before online, because active is assumed to
  be a subset of online;
- secondary CPU sets online before active, because the primary CPU
  assumes that an online CPU is also active;
- secondary CPU sets online and waits for primary CPU to set active,
  because it might deadlock.

Commit 875ebe940d77 ("powerpc/smp: Wait until secondaries are active &
online") introduces an arch-specific solution to this arch-independent
problem.

Now, go for a more general solution without explicit waiting and simply
set active twice: once on the secondary CPU after online was set and
once on the primary CPU after online was seen.

Fixes: 6acbfb96976f ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78b4bad10..155bb4a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5433,6 +5433,7 @@ static int sched_cpu_active(struct notifier_block *nfb,
 	case CPU_STARTING:
 		set_cpu_rq_start_time();
 		return NOTIFY_OK;
+	case CPU_ONLINE:
 	case CPU_DOWN_FAILED:
 		set_cpu_active((long)hcpu, true);
 		return NOTIFY_OK;

  reply	other threads:[~2015-07-27 18:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-16  9:17 [PATCH] x86/smpboot: Check for cpu_active on cpu initialization Joerg Roedel
2015-07-20 14:46 ` Borislav Petkov
2015-07-20 15:02   ` Joerg Roedel
2015-07-20 15:10     ` Borislav Petkov
2015-07-20 15:18       ` Joerg Roedel
2015-07-20 15:27         ` Borislav Petkov
2015-07-27 18:21           ` Jan H. Schönherr [this message]
2015-07-30 16:00             ` [PATCH] sched: fix cpu_active_mask/cpu_online_mask race Peter Zijlstra
2015-07-30 19:17               ` [PATCH v2] " Jan H. Schönherr

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1438021288-12335-1-git-send-email-jschoenh@amazon.de \
    --to=jschoenh@amazon.de \
    --cc=anton@samba.org \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jroedel@suse.de \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=msw@amazon.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).