From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: paulmck@linux.vnet.ibm.com
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>,
Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>
Subject: WARNING: at kernel/rcutree.c:1558 rcu_do_batch+0x386/0x3a0(), during CPU hotplug
Date: Wed, 12 Sep 2012 18:06:20 +0530
Message-ID: <505081C4.2050600@linux.vnet.ibm.com>
In-Reply-To: <20120719171550.GL2507@linux.vnet.ibm.com>

On 07/19/2012 10:45 PM, Paul E. McKenney wrote:
> On Thu, Jul 19, 2012 at 05:39:30PM +0530, Srivatsa S. Bhat wrote:
>> Hi Paul,
>>
>> While running a CPU hotplug stress test on v3.5-rc7+
>> (mainline commit 8a7298b7805ab) I hit this warning.
>> I haven't tried to debug this yet...
>>
>> Line number 1550 maps to:
>>
>> WARN_ON_ONCE(cpu_is_offline(smp_processor_id()));
>>
>> inside rcu_do_batch().
>
> Hello, Srivatsa,
>
> I believe that you need commit a16b7a69 (Prevent __call_rcu() from
> invoking RCU core on offline CPUs), which is currently in -tip, queued
> for 3.6. Please see below for the patch.
>
> Does this help?
>

Hi Paul,

I am occasionally hitting the cpu_is_offline() warning in rcu_do_batch()
(see the two examples below) while testing CPU hotplug on Thomas' smp/hotplug
branch in -tip. That branch does contain the commit that you mentioned above.

The stack traces suggest that we are not hitting this from the __call_rcu()
path. So I guess this needs a different fix?

Regards,
Srivatsa S. Bhat

[ 53.882344] smpboot: CPU 7 is now offline
[ 53.891072] CPU 12 MCA banks CMCI:6 CMCI:8
[ 53.895621] CPU 15 MCA banks CMCI:2 CMCI:3 CMCI:5
[ 53.914738] Broke affinity for irq 81
[ 53.917769] do_IRQ: 8.211 No irq handler for vector (irq -1)
[ 53.917769] ------------[ cut here ]------------
[ 53.917769] WARNING: at kernel/rcutree.c:1558 rcu_do_batch+0x386/0x3a0()
[ 53.917769] Hardware name: IBM System x -[7870C4Q]-
[ 53.917769] Modules linked in: ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse loop dm_mod iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm cdc_ether usbnet ioatdma lpc_ich mfd_core crc32c_intel microcode mii pcspkr i2c_i801 shpchp i2c_core serio_raw bnx2 tpm_tis dca tpm pci_hotplug i7core_edac tpm_bios edac_core sg rtc_cmos button uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif edd ext3 mbcache jbd fan processor mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal thermal_sys hwmon
[ 53.917769] Pid: 47, comm: migration/8 Not tainted 3.6.0-rc1-tglx-hotplug-0.0.0.28.36b5ec9-default #1
[ 53.917769] Call Trace:
[ 53.917769] <IRQ> [<ffffffff810e7806>] ? rcu_do_batch+0x386/0x3a0
[ 53.917769] [<ffffffff810e7806>] ? rcu_do_batch+0x386/0x3a0
[ 53.917769] [<ffffffff8104338a>] warn_slowpath_common+0x7a/0xb0
[ 53.917769] [<ffffffff810433d5>] warn_slowpath_null+0x15/0x20
[ 53.917769] [<ffffffff810e7806>] rcu_do_batch+0x386/0x3a0
[ 53.917769] [<ffffffff810ac8a0>] ? trace_hardirqs_on_caller+0x70/0x1b0
[ 53.917769] [<ffffffff810ac9ed>] ? trace_hardirqs_on+0xd/0x10
[ 53.917769] [<ffffffff810e9143>] __rcu_process_callbacks+0x1a3/0x200
[ 53.917769] [<ffffffff810e9228>] rcu_process_callbacks+0x88/0x240
[ 53.917769] [<ffffffff8104dc79>] __do_softirq+0x159/0x400
[ 53.917769] [<ffffffff814c627c>] call_softirq+0x1c/0x30
[ 53.917769] [<ffffffff810044f5>] do_softirq+0x95/0xd0
[ 53.917769] [<ffffffff8104d745>] irq_exit+0xe5/0x100
[ 53.917769] [<ffffffff81003c14>] do_IRQ+0x64/0xe0
[ 53.917769] [<ffffffff814bc12f>] common_interrupt+0x6f/0x6f
[ 53.917769] <EOI> [<ffffffff810cfe7a>] ? stop_machine_cpu_stop+0xda/0x130
[ 53.917769] [<ffffffff810cfda0>] ? stop_one_cpu_nowait+0x50/0x50
[ 53.917769] [<ffffffff810cfab9>] cpu_stopper_thread+0xd9/0x1b0
[ 53.917769] [<ffffffff814bbe4f>] ? _raw_spin_unlock_irqrestore+0x3f/0x80
[ 53.917769] [<ffffffff810cf9e0>] ? res_counter_init+0x50/0x50
[ 53.917769] [<ffffffff810ac95d>] ? trace_hardirqs_on_caller+0x12d/0x1b0
[ 53.917769] [<ffffffff810ac9ed>] ? trace_hardirqs_on+0xd/0x10
[ 53.917769] [<ffffffff810cf9e0>] ? res_counter_init+0x50/0x50
[ 53.917769] [<ffffffff8106deae>] kthread+0xde/0xf0
[ 53.917769] [<ffffffff814c6184>] kernel_thread_helper+0x4/0x10
[ 53.917769] [<ffffffff814bc1f0>] ? retint_restore_args+0x13/0x13
[ 53.917769] [<ffffffff8106ddd0>] ? __init_kthread_worker+0x70/0x70
[ 53.917769] [<ffffffff814c6180>] ? gs_change+0x13/0x13
[ 53.917769] ---[ end trace f60a282810c4ce78 ]---
[ 54.170634] smpboot: CPU 8 is now offline
[ 54.192259] NOHZ: local_softirq_pending 200
[ 54.197936] smpboot: CPU 9 is now offline
[ 54.219707] NOHZ: local_softirq_pending 200
[ 54.225795] smpboot: CPU 10 is now offline
---
[ 372.482434] smpboot: CPU 11 is now offline
[ 372.534211] smpboot: CPU 12 is now offline
[ 372.539786] CPU 13 MCA banks CMCI:6 CMCI:8
[ 372.582474] smpboot: CPU 13 is now offline
[ 372.591006] CPU 14 MCA banks CMCI:6 CMCI:8
[ 372.629745] ------------[ cut here ]------------
[ 372.633736] WARNING: at kernel/rcutree.c:1558 rcu_do_batch+0x386/0x3a0()
[ 372.633736] Hardware name: IBM System x -[7870C4Q]-
[ 372.633736] Modules linked in: ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse loop dm_mod iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel microcode serio_raw tpm_tis i7core_edac cdc_ether usbnet ioatdma mii lpc_ich pcspkr edac_core mfd_core bnx2 shpchp pci_hotplug i2c_i801 i2c_core dca tpm sg tpm_bios rtc_cmos button uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif edd ext3 mbcache jbd fan processor mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal thermal_sys hwmon
[ 372.633736] Pid: 8625, comm: migration/14 Not tainted 3.6.0-rc1-tglx-hotplug-0.0.0.28.36b5ec9-default #1
[ 372.633736] Call Trace:
[ 372.633736] <IRQ> [<ffffffff810e7806>] ? rcu_do_batch+0x386/0x3a0
[ 372.633736] [<ffffffff810e7806>] ? rcu_do_batch+0x386/0x3a0
[ 372.633736] [<ffffffff8104338a>] warn_slowpath_common+0x7a/0xb0
[ 372.633736] [<ffffffff810433d5>] warn_slowpath_null+0x15/0x20
[ 372.633736] [<ffffffff810e7806>] rcu_do_batch+0x386/0x3a0
[ 372.633736] [<ffffffff810ac8a0>] ? trace_hardirqs_on_caller+0x70/0x1b0
[ 372.633736] [<ffffffff810ac9ed>] ? trace_hardirqs_on+0xd/0x10
[ 372.633736] [<ffffffff810e9143>] __rcu_process_callbacks+0x1a3/0x200
[ 372.633736] [<ffffffff810e9228>] rcu_process_callbacks+0x88/0x240
[ 372.633736] [<ffffffff8104dc79>] __do_softirq+0x159/0x400
[ 372.633736] [<ffffffff814c627c>] call_softirq+0x1c/0x30
[ 372.633736] [<ffffffff810044f5>] do_softirq+0x95/0xd0
[ 372.633736] [<ffffffff8104d745>] irq_exit+0xe5/0x100
[ 372.633736] [<ffffffff81028df9>] smp_apic_timer_interrupt+0x69/0xa0
[ 372.633736] [<ffffffff814c5aef>] apic_timer_interrupt+0x6f/0x80
[ 372.633736] <EOI> [<ffffffff810cfe7a>] ? stop_machine_cpu_stop+0xda/0x130
[ 372.633736] [<ffffffff810cfda0>] ? stop_one_cpu_nowait+0x50/0x50
[ 372.633736] [<ffffffff810cfab9>] cpu_stopper_thread+0xd9/0x1b0
[ 372.633736] [<ffffffff814bbe4f>] ? _raw_spin_unlock_irqrestore+0x3f/0x80
[ 372.633736] [<ffffffff810cf9e0>] ? res_counter_init+0x50/0x50
[ 372.633736] [<ffffffff810ac95d>] ? trace_hardirqs_on_caller+0x12d/0x1b0
[ 372.633736] [<ffffffff810ac9ed>] ? trace_hardirqs_on+0xd/0x10
[ 372.633736] [<ffffffff810cf9e0>] ? res_counter_init+0x50/0x50
[ 372.633736] [<ffffffff8106deae>] kthread+0xde/0xf0
[ 372.633736] [<ffffffff814c6184>] kernel_thread_helper+0x4/0x10
[ 372.633736] [<ffffffff814bc1f0>] ? retint_restore_args+0x13/0x13
[ 372.633736] [<ffffffff8106ddd0>] ? __init_kthread_worker+0x70/0x70
[ 372.633736] [<ffffffff814c6180>] ? gs_change+0x13/0x13
[ 372.633736] ---[ end trace a4296a31284c846d ]---
[ 372.883063] smpboot: CPU 14 is now offline
[ 372.892721] CPU 15 MCA banks CMCI:6 CMCI:8
[ 372.907250] smpboot: CPU 15 is now offline
[ 372.911545] SMP alternatives: lockdep: fixing up alternatives
[ 372.917292] SMP alternatives: switching to UP code
[ 372.941917] SMP alternatives: lockdep: fixing up alternatives