public inbox for linux-kernel@vger.kernel.org
* 2.6.34-rc2: cpu hotplug test failure on x86_64
@ 2010-03-20 17:17 Sachin Sant
  2010-03-20 17:29 ` Peter Zijlstra
  0 siblings, 1 reply; 4+ messages in thread
From: Sachin Sant @ 2010-03-20 17:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar

Running cpu hotplug tests on an x86_64 box results in
the following BUG.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [<ffffffff81037388>] amd_pmu_cpu_offline+0x38/0x67
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu1/online
CPU 0
Modules linked in: ipv6 fuse loop dm_mod sg bnx2 rtc_cmos mptctl rtc_core i2c_piix4 rtc_lib serio_raw pcspkr shpchp button i2c_core k8temp pci_hotplug ohci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd ext3 jbd fan thermal processor thermal_sys hwmon mptsas mptscsih mptbase scsi_transport_sas scsi_mod

Pid: 7657, comm: bash Not tainted 2.6.34-rc2-autotest #1 Server Blade/BladeCenter LS21 -[79716AA]-
RIP: 0010:[<ffffffff81037388>]  [<ffffffff81037388>] amd_pmu_cpu_offline+0x38/0x67
RSP: 0018:ffff880129e11d88  EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88000608b6f0 RCX: ffffffff8178f3f0
RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffffffff81930f94
RBP: ffff880129e11d98 R08: 0000000000000000 R09: ffff880129e11ca8
R10: 0000000000000000 R11: 0000000000018600 R12: 00000000fffffffd
R13: ffffffff8179f4e0 R14: 0000000000000001 R15: 0000000000000007
FS:  00007f7e4c9346f0(0000) GS:ffff880006000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000004 CR3: 0000000129f51000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 7657, threadinfo ffff880129e10000, task ffff88012850b1e0)
Stack:
 ffff880129e11d98 0000000000000000 ffff880129e11da8 ffffffff81388ab3
<0> ffff880129e11de8 ffffffff8139474e 0000000000000001 ffffffff8184c5e0
<0> 0000000000000001 0000000000000000 0000000000000000 0000000000000001
Call Trace:
 [<ffffffff81388ab3>] x86_pmu_notifier+0x51/0x58
 [<ffffffff8139474e>] notifier_call_chain+0x33/0x5b
 [<ffffffff810838e0>] raw_notifier_call_chain+0xf/0x11
 [<ffffffff8137d5b1>] _cpu_down+0x1ed/0x2f3
 [<ffffffff8107c5ce>] ? __create_workqueue_key+0x204/0x22c
 [<ffffffff8137d6f0>] cpu_down+0x39/0x53
 [<ffffffff8137f58e>] store_online+0x2c/0x6f
 [<ffffffff812c3707>] sysdev_store+0x1b/0x1d
 [<ffffffff8116bfc0>] sysfs_write_file+0xdf/0x114
 [<ffffffff8111810e>] vfs_write+0xae/0x16a
 [<ffffffff8111828e>] sys_write+0x47/0x6e
 [<ffffffff810299ab>] system_call_fastpath+0x16/0x1b
Code: b6 80 00 01 76 4f 48 63 c7 48 c7 c3 f0 b6 00 00 48 c7 c7 94 0f 93 81 48 03 1c c5 c0 2e 84 81 e8 00 a2 35 00 48 8b 93 28 07 00 00 <8b> 42 04 ff c8 85 c0 89 42 04 75 0c 48 8b bb 28 07 00 00 e8 ae
RIP  [<ffffffff81037388>] amd_pmu_cpu_offline+0x38/0x67
 RSP <ffff880129e11d88>
CR2: 0000000000000004
---[ end trace d44efb4255454e5f ]---

The problem seems to have been introduced in 2.6.34-rc1-git8 (397104793...).
I haven't tried a git bisect yet. The following two commits
modified the code in perf_event_amd.c.

34538ee77b39a12702e0f4c3ed9e8fa2dd5eb92c
3f6da3905398826d85731247e7fbcf53400c18bd

Will try reverting them to check if that helps.

Thanks
-Sachin


-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------



* Re: 2.6.34-rc2: cpu hotplug test failure on x86_64
  2010-03-20 17:17 2.6.34-rc2: cpu hotplug test failure on x86_64 Sachin Sant
@ 2010-03-20 17:29 ` Peter Zijlstra
  2010-03-20 20:53   ` Rafael J. Wysocki
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2010-03-20 17:29 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linux-kernel, Ingo Molnar

On Sat, 2010-03-20 at 22:47 +0530, Sachin Sant wrote:
> Running cpu hotplug tests on an x86_64 box results in
> the following BUG.
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> IP: [<ffffffff81037388>] amd_pmu_cpu_offline+0x38/0x67

I guess something like the below might work.

---
 arch/x86/kernel/cpu/perf_event_amd.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 358a8e3..0189af4 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -345,6 +345,8 @@ static void amd_pmu_cpu_offline(int cpu)
 		return;
 
 	cpuhw = &per_cpu(cpu_hw_events, cpu);
+	if (!cpuhw)
+		return;
 
 	raw_spin_lock(&amd_nb_lock);
 




* Re: 2.6.34-rc2: cpu hotplug test failure on x86_64
  2010-03-20 17:29 ` Peter Zijlstra
@ 2010-03-20 20:53   ` Rafael J. Wysocki
  2010-03-21  7:48     ` Sachin Sant
  0 siblings, 1 reply; 4+ messages in thread
From: Rafael J. Wysocki @ 2010-03-20 20:53 UTC (permalink / raw)
  To: Peter Zijlstra, Sachin Sant; +Cc: linux-kernel, Ingo Molnar

On Saturday 20 March 2010, Peter Zijlstra wrote:
> On Sat, 2010-03-20 at 22:47 +0530, Sachin Sant wrote:
> > Running cpu hotplug tests on an x86_64 box results in
> > the following BUG.
> > 
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> > IP: [<ffffffff81037388>] amd_pmu_cpu_offline+0x38/0x67
> 
> I guess something like the below might work.

Not really.  Sachin, please try this patch instead:
https://patchwork.kernel.org/patch/87189/

Rafael


* Re: 2.6.34-rc2: cpu hotplug test failure on x86_64
  2010-03-20 20:53   ` Rafael J. Wysocki
@ 2010-03-21  7:48     ` Sachin Sant
  0 siblings, 0 replies; 4+ messages in thread
From: Sachin Sant @ 2010-03-21  7:48 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Peter Zijlstra, linux-kernel, Ingo Molnar

Rafael J. Wysocki wrote:
> On Saturday 20 March 2010, Peter Zijlstra wrote:
>> On Sat, 2010-03-20 at 22:47 +0530, Sachin Sant wrote:
>>> Running cpu hotplug tests on an x86_64 box results in
>>> the following BUG.
>>>
>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>>> IP: [<ffffffff81037388>] amd_pmu_cpu_offline+0x38/0x67
>>
>> I guess something like the below might work.
>
> Not really.  Sachin, please try this patch instead:
> https://patchwork.kernel.org/patch/87189/

Works for me. Thanks Rafael.

Regards
-Sachin

