All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, fweisbec@gmail.com,
	tglx@linutronix.de
Subject: Re: RCU used on incoming CPU before rcu_cpu_starting() called
Date: Wed, 8 Mar 2017 15:41:52 -0800	[thread overview]
Message-ID: <20170308234152.GA14819@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170308221656.GA11949@linux.vnet.ibm.com>

On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> Hello!
> 
> I am seeing the following splat in rcutorture testing of v4.11-rc1:
> 
> [   30.694013] =============================
> [   30.694013] WARNING: suspicious RCU usage
> [   30.694013] 4.11.0-rc1+ #1 Not tainted
> [   30.694013] -----------------------------
> [   30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
> [   30.694013] 
> [   30.694013] other info that might help us debug this:
> [   30.694013] 
> [   30.694013] 
> [   30.694013] RCU used illegally from offline CPU!
> [   30.694013] rcu_scheduler_active = 2, debug_locks = 0
> [   30.694013] no locks held by swapper/1/0.
> [   30.694013] 
> [   30.694013] stack backtrace:
> [   30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> [   30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> [   30.694013] Call Trace:
> [   30.694013]  dump_stack+0x67/0x99
> [   30.694013]  lockdep_rcu_suspicious+0xe7/0x120
> [   30.694013]  get_work_pool+0x82/0x90
> [   30.694013]  __queue_work+0x70/0x5f0
> [   30.694013]  queue_work_on+0x33/0x70
> [   30.694013]  clear_sched_clock_stable+0x33/0x40
> [   30.694013]  early_init_intel+0xe7/0x2f0
> [   30.694013]  init_intel+0x11/0x350
> [   30.694013]  identify_cpu+0x344/0x5a0
> [   30.694013]  identify_secondary_cpu+0x18/0x80
> [   30.694013]  smp_store_cpu_info+0x39/0x40
> [   30.694013]  start_secondary+0x4e/0x100
> [   30.694013]  start_cpu+0x14/0x14
> 
> Here is the relevant code from x86's smp_callin():
> 
> 	/*
> 	 * Save our processor parameters. Note: this information
> 	 * is needed for clock calibration.
> 	 */
> 	smp_store_cpu_info(cpuid);
> 
> 	/*
> 	 * Get our bogomips.
> 	 * Update loops_per_jiffy in cpu_data. Previous call to
> 	 * smp_store_cpu_info() stored a value that is close but not as
> 	 * accurate as the value just calculated.
> 	 */
> 	calibrate_delay();
> 	cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> 	pr_debug("Stack at about %p\n", &cpuid);
> 
> 	/*
> 	 * This must be done before setting cpu_online_mask
> 	 * or calling notify_cpu_starting.
> 	 */
> 	set_cpu_sibling_map(raw_smp_processor_id());
> 	wmb();
> 
> 	notify_cpu_starting(cpuid);
> 
> The problem is that smp_store_cpu_info() indirectly invokes
> schedule_work(), which wants to use RCU.  But RCU isn't informed
> of the incoming CPU until the call to notify_cpu_starting(), which
> causes lockdep to complain bitterly about the use of RCU by the
> premature call to schedule_work().
> 
> I considered just moving the notify_cpu_starting() earlier in the
> sequence, but the comments make it seem like this would not be
> a wise choice.
> 
> Any suggestions?

And I should hasten to mention my default approach, which would be
to provide a Kconfig option that causes notify_cpu_starting() to
skip calling rcu_cpu_starting(), and to add a call to this function
in x86's smp_callin() just before the call to smp_store_cpu_info().
But this seemed a bit crude, so I figured I should check with you
guys before taking that approach.

But what is the fun in that?  Please see below for the corresponding
lightly tested patch.  ;-)

							Thanx, Paul

------------------------------------------------------------------------

commit aa805dad9ae66cc4f5106d004b6bb5102fd84434
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Mar 8 15:32:29 2017 -0800

    cpu: Move RCU earlier in x86 CPU-online procedure
    
    Running rcutorture on v4.11-rc1 results in the following splat on x86
    on kernels built with both CPU hotplug and lockdep:
    
    [   30.694013] =============================
    [   30.694013] WARNING: suspicious RCU usage
    [   30.694013] 4.11.0-rc1+ #1 Not tainted
    [   30.694013] -----------------------------
    [   30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
    [   30.694013]
    [   30.694013] other info that might help us debug this:
    [   30.694013]
    [   30.694013]
    [   30.694013] RCU used illegally from offline CPU!
    [   30.694013] rcu_scheduler_active = 2, debug_locks = 0
    [   30.694013] no locks held by swapper/1/0.
    [   30.694013]
    [   30.694013] stack backtrace:
    [   30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
    [   30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    [   30.694013] Call Trace:
    [   30.694013]  dump_stack+0x67/0x99
    [   30.694013]  lockdep_rcu_suspicious+0xe7/0x120
    [   30.694013]  get_work_pool+0x82/0x90
    [   30.694013]  __queue_work+0x70/0x5f0
    [   30.694013]  queue_work_on+0x33/0x70
    [   30.694013]  clear_sched_clock_stable+0x33/0x40
    [   30.694013]  early_init_intel+0xe7/0x2f0
    [   30.694013]  init_intel+0x11/0x350
    [   30.694013]  identify_cpu+0x344/0x5a0
    [   30.694013]  identify_secondary_cpu+0x18/0x80
    [   30.694013]  smp_store_cpu_info+0x39/0x40
    [   30.694013]  start_secondary+0x4e/0x100
    [   30.694013]  start_cpu+0x14/0x14
    
    This is caused by the fact that smp_store_cpu_info() indirectly invokes
    schedule_work(), which wants to use RCU.  But RCU isn't informed of the
    incoming CPU until the later call to notify_cpu_starting(), which causes
    lockdep to complain bitterly about the use of RCU by the premature call
    to schedule_work().  It is said to be unwise to move the call to
    notify_cpu_starting() to precede that to smp_store_cpu_info(), so this
    commit adds a ARCH_RCU_ONLINE_EARLY Kconfig option that is selected by
    the x86 architecture.  This option suppresses the call to rcu_cpu_starting()
    from notify_cpu_starting(), and a call to rcu_cpu_starting() is added
    preceding the call to smp_store_cpu_info().
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a294ee..bb42b76ec2dd 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -63,6 +63,7 @@ config X86
 	select ARCH_MIGHT_HAVE_ACPI_PDC		if ACPI
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
+	select ARCH_RCU_ONLINE_EARLY
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index bd1f1ad35284..ca7386d04194 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -188,6 +188,12 @@ static void smp_callin(void)
 	apic_ap_setup();
 
 	/*
+	 * Enable RCU readers on this CPU, needed for workqueues,
+	 * which are used in smp_store_cpu_info().
+	 */
+	rcu_cpu_starting(cpuid);
+
+	/*
 	 * Save our processor parameters. Note: this information
 	 * is needed for clock calibration.
 	 */
diff --git a/init/Kconfig b/init/Kconfig
index c859c993c26f..31c2d5e08f16 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -543,6 +543,9 @@ config RCU_STALL_COMMON
 	  the tiny variants to disable RCU CPU stall warnings, while
 	  making these warnings mandatory for the tree variants.
 
+config ARCH_RCU_ONLINE_EARLY
+	bool
+
 config CONTEXT_TRACKING
        bool
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f7c063239fa5..9d98f2785fa0 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -851,7 +851,8 @@ void notify_cpu_starting(unsigned int cpu)
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
 	enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
 
-	rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
+	if (!IS_ENABLED(CONFIG_ARCH_RCU_ONLINE_EARLY))
+		rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
 	while (st->state < target) {
 		st->state++;
 		cpuhp_invoke_callback(cpu, st->state, true, NULL);

  reply	other threads:[~2017-03-08 23:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-08 22:16 RCU used on incoming CPU before rcu_cpu_starting() called Paul E. McKenney
2017-03-08 23:41 ` Paul E. McKenney [this message]
2017-03-09  3:55   ` Frederic Weisbecker
2017-03-09  5:24     ` Paul E. McKenney
2017-03-09 13:08 ` Thomas Gleixner
2017-03-09 15:12   ` Peter Zijlstra
2017-03-09 15:29     ` Paul E. McKenney
2017-03-09 15:50       ` Paul E. McKenney
2017-03-20  8:32         ` Tomeu Vizoso
2017-03-20 12:50           ` Paul E. McKenney
2017-03-09 15:24   ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170308234152.GA14819@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.