* 2.6.30-rc2 hangs in get_measured_perf on tigerton
@ 2009-04-15 6:01 Zhang, Yanmin
2009-04-15 6:49 ` Zhang, Yanmin
2009-04-15 11:14 ` Rusty Russell
0 siblings, 2 replies; 7+ messages in thread
From: Zhang, Yanmin @ 2009-04-15 6:01 UTC (permalink / raw)
To: Rusty Russell; +Cc: LKML, Venkatesh Pallipadi
My machine hung with kernel 2.6.30-rc2 when a script read
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
The oops happens in get_measured_perf:
cur.aperf.whole = readin.aperf.whole -
per_cpu(drv_data, cpu)->saved_aperf;
This is because per_cpu(drv_data, cpu) is NULL.
So get_measured_perf should check whether per_cpu(drv_data, cpu) == NULL
and return 0 if it is.
Other functions have such a check.
yanmin
--------------sys log------------------
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
PGD a7dd88067 PUD a7ccf5067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
CPU 0
Modules linked in: video output
Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server
RIP: 0010:[<ffffffff8021af75>] [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
RSP: 0018:ffff880a7d56de20 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000
RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001
RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0
R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000
R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88004d200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task ffff880a7d4d18c0)
Stack:
ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95
0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a
0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547
Call Trace:
[<ffffffff803efd54>] ? kobject_get+0x12/0x17
[<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57
[<ffffffff80650547>] ? do_dbs_timer+0x147/0x272
[<ffffffff80650400>] ? do_dbs_timer+0x0/0x272
[<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5
[<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e
[<ffffffff8024736f>] ? worker_thread+0x0/0x1f5
[<ffffffff80249f0d>] ? kthread+0x54/0x83
[<ffffffff8020c87a>] ? child_rip+0xa/0x20
[<ffffffff80249eb9>] ? kthread+0x0/0x83
[<ffffffff8020c870>] ? child_rip+0x0/0x20
Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48 c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b 58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28
RIP [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
RSP <ffff880a7d56de20>
CR2: 0000000000000020
---[ end trace 2b8fac9a49e19ad4 ]---
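The faulting address in the log (CR2 = 0x20, with RAX = 0) matches the displacement in the faulting instruction: the bytes `<48> 2b 58 20` decode to a subtract from 0x20(%rax), and the later `48 2b 46 28` uses displacement 0x28. A minimal userspace sketch, assuming an LP64 target and assuming the struct starts with two pointer members (those two members are not shown anywhere in this thread, and `mock_acpi_cpufreq_data` is a hypothetical stand-in name), shows that saved_aperf and saved_mperf would then sit at exactly those offsets, which is what a read through a NULL drv_data pointer would touch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace mock of the driver's per-CPU data structure.
 * ASSUMPTION: the two leading pointer members are not shown in this
 * thread; void * stands in for whatever pointer types the driver uses.
 * The remaining members follow the order visible in the patch context.
 */
struct mock_acpi_cpufreq_data {
	void *acpi_data;		/* assumed member */
	void *freq_table;		/* assumed member */
	unsigned int max_freq;
	unsigned int resume;
	unsigned int cpu_feature;
	uint64_t saved_aperf, saved_mperf;
};

/* Address touched by a read of ->saved_aperf through pointer value `base`. */
uintptr_t saved_aperf_addr(uintptr_t base)
{
	return base + offsetof(struct mock_acpi_cpufreq_data, saved_aperf);
}

/* Address touched by a read of ->saved_mperf through pointer value `base`. */
uintptr_t saved_mperf_addr(uintptr_t base)
{
	return base + offsetof(struct mock_acpi_cpufreq_data, saved_mperf);
}

/* With a NULL base, the addresses equal the member offsets (LP64 only). */
void check_oops_offsets(void)
{
	assert(saved_aperf_addr(0) == 0x20);	/* matches CR2 in the log */
	assert(saved_mperf_addr(0) == 0x28);	/* matches the next subtract */
}
```

Calling check_oops_offsets() on an x86-64 build should pass silently; on a 32-bit build the offsets differ and the assertions would fire, so this is only a consistency check against the log, not proof of the layout.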

* Re: 2.6.30-rc2 hangs in get_measured_perf on tigerton
2009-04-15 6:01 2.6.30-rc2 hangs in get_measured_perf on tigerton Zhang, Yanmin
@ 2009-04-15 6:49 ` Zhang, Yanmin
2009-04-15 11:14 ` Rusty Russell
1 sibling, 0 replies; 7+ messages in thread
From: Zhang, Yanmin @ 2009-04-15 6:49 UTC (permalink / raw)
To: Rusty Russell; +Cc: LKML, Venkatesh Pallipadi
On Wed, 2009-04-15 at 14:01 +0800, Zhang, Yanmin wrote:
> My machine hung with kernel 2.6.30-rc2 when a script read
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
>
> The oops happens in get_measured_perf:
>
> cur.aperf.whole = readin.aperf.whole -
> per_cpu(drv_data, cpu)->saved_aperf;
>
> This is because per_cpu(drv_data, cpu) is NULL.
>
> So get_measured_perf should check whether per_cpu(drv_data, cpu) == NULL
> and return 0 if it is.
>
> Other functions have such a check.
How about the patch below? I tested it on my tigerton machine.
---
--- linux-2.6.30-rc2/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c	2009-04-15 02:24:38.000000000 -0400
+++ linux-2.6.30-rc2_cpufreqbug/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c	2009-04-15 02:31:37.000000000 -0400
@@ -277,6 +277,9 @@ static unsigned int get_measured_perf(st
 	unsigned int perf_percent;
 	unsigned int retval;
 
+	if (unlikely(per_cpu(drv_data, cpu) == NULL))
+		return 0;
+
 	if (smp_call_function_single(cpu, read_measured_perf_ctrs, &readin, 1))
 		return 0;
 
>
> yanmin

* Re: 2.6.30-rc2 hangs in get_measured_perf on tigerton
2009-04-15 6:01 2.6.30-rc2 hangs in get_measured_perf on tigerton Zhang, Yanmin
2009-04-15 6:49 ` Zhang, Yanmin
@ 2009-04-15 11:14 ` Rusty Russell
2009-04-15 13:31 ` Pallipadi, Venkatesh
[not found] ` <B5B0CFF685D7DF46A05CF1678CFB42ED20E09E32@orsmsx505.amr.corp.intel.com>
1 sibling, 2 replies; 7+ messages in thread
From: Rusty Russell @ 2009-04-15 11:14 UTC (permalink / raw)
To: Zhang, Yanmin
Cc: LKML, Venkatesh Pallipadi, Denis Sadykov, cpufreq, linux-acpi
On Wed, 15 Apr 2009 03:31:23 pm Zhang, Yanmin wrote:
> My machine hung with kernel 2.6.30-rc2 when a script read
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
>
> The oops happens in get_measured_perf:
>
> cur.aperf.whole = readin.aperf.whole -
> per_cpu(drv_data, cpu)->saved_aperf;
>
> This is because per_cpu(drv_data, cpu) is NULL.
>
> So get_measured_perf should check whether per_cpu(drv_data, cpu) == NULL
> and return 0 if it is.
>
> Other functions have such a check.
Possibly true, but I can't see that get_measured_perf() ever did.
Unless there's something subtle with preemption no longer being disabled
inside that function...
Cc'd the experts.
Thanks,
Rusty.

* RE: 2.6.30-rc2 hangs in get_measured_perf on tigerton
2009-04-15 11:14 ` Rusty Russell
@ 2009-04-15 13:31 ` Pallipadi, Venkatesh
[not found] ` <B5B0CFF685D7DF46A05CF1678CFB42ED20E09E32@orsmsx505.amr.corp.intel.com>
1 sibling, 0 replies; 7+ messages in thread
From: Pallipadi, Venkatesh @ 2009-04-15 13:31 UTC (permalink / raw)
To: Rusty Russell, Zhang, Yanmin
Cc: LKML, Denis Sadykov, cpufreq@vger.kernel.org, linux-acpi@vger.kernel.org
>-----Original Message-----
>From: cpufreq-owner@vger.kernel.org
>[mailto:cpufreq-owner@vger.kernel.org] On Behalf Of Rusty Russell
>Sent: Wednesday, April 15, 2009 4:14 AM
>To: Zhang, Yanmin
>Cc: LKML; Pallipadi, Venkatesh; Denis Sadykov;
>cpufreq@vger.kernel.org; linux-acpi@vger.kernel.org
>Subject: Re: 2.6.30-rc2 hangs in get_measured_perf on tigerton
>
>On Wed, 15 Apr 2009 03:31:23 pm Zhang, Yanmin wrote:
>> My machine hung with kernel 2.6.30-rc2 when a script read
>> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
>>
>> The oops happens in get_measured_perf:
>>
>> cur.aperf.whole = readin.aperf.whole -
>> per_cpu(drv_data, cpu)->saved_aperf;
>>
>> This is because per_cpu(drv_data, cpu) is NULL.
>>
>> So get_measured_perf should check whether per_cpu(drv_data, cpu) == NULL
>> and return 0 if it is.
>>
>> Other functions have such a check.
>
>Possibly true, but I can't see that get_measured_perf() ever did.
>
>Unless there's something subtle with preemption no longer
>being disabled
>inside that function...
>
Checking for NULL and returning is not an option. We need to look at the
average current frequency on all CPUs to make the correct next-frequency
decision.
Also, per_cpu drv_data should be set for all CPUs. I will poke at this a
bit and get back...
Thanks,
Venki
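To make the point about needing a measurement from every CPU concrete, here is a minimal userspace sketch of how an average frequency can be derived from APERF/MPERF counter deltas. It assumes the usual interpretation that the ratio of the two deltas, scaled by the maximum frequency, gives the average frequency since the previous sample. The names `perf_counters` and `measured_freq_khz` are hypothetical, and the overflow handling is a simplification rather than the driver's own code:

```c
#include <assert.h>
#include <stdint.h>

/* One snapshot of the two counters (hypothetical helper type). */
struct perf_counters {
	uint64_t aperf;
	uint64_t mperf;
};

/*
 * Return the average frequency in kHz since the previous snapshot and
 * store the new snapshot in *saved. Unsigned subtraction keeps the
 * deltas correct even if a counter wrapped between the two samples.
 */
unsigned int measured_freq_khz(struct perf_counters *saved,
			       struct perf_counters now,
			       unsigned int max_freq_khz)
{
	uint64_t d_aperf = now.aperf - saved->aperf;
	uint64_t d_mperf = now.mperf - saved->mperf;
	uint64_t perf_percent;

	*saved = now;

	/* Scale both deltas down so that d_aperf * 100 cannot overflow. */
	while (d_aperf > UINT64_MAX / 100) {
		d_aperf >>= 1;
		d_mperf >>= 1;
	}
	if (d_mperf == 0)
		return 0;	/* no reference cycles elapsed: nothing to report */

	perf_percent = d_aperf * 100 / d_mperf;
	return (unsigned int)((uint64_t)max_freq_khz * perf_percent / 100);
}

/* Usage: half as many APERF ticks as MPERF ticks means half of max. */
void measured_freq_example(void)
{
	struct perf_counters saved = { 1000, 2000 };
	struct perf_counters now = { 1500, 3000 };

	assert(measured_freq_khz(&saved, now, 2000000) == 1000000);
}
```

Because each call depends on that CPU's previous snapshot, every CPU needs its own saved pair, which is why a CPU without per-CPU data cannot simply be skipped without losing its contribution.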
[parent not found: <B5B0CFF685D7DF46A05CF1678CFB42ED20E09E32@orsmsx505.amr.corp.intel.com>]

* [PATCH] x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
[not found] ` <B5B0CFF685D7DF46A05CF1678CFB42ED20E09E32@orsmsx505.amr.corp.intel.com>
@ 2009-04-15 17:37 ` Pallipadi, Venkatesh
2009-04-16 1:20 ` Zhang, Yanmin
2009-04-18 4:11 ` Len Brown
0 siblings, 2 replies; 7+ messages in thread
From: Pallipadi, Venkatesh @ 2009-04-15 17:37 UTC (permalink / raw)
To: Zhang, Yanmin
Cc: Rusty Russell, LKML, cpufreq@vger.kernel.org, linux-acpi@vger.kernel.org, lenb
The patch below fixes the problem.
Yanmin, can you please verify?
Len, please push this patch along to Linus.
Thanks,
Venki

Fix for a regression that was introduced by earlier commit

commit 18b2646fe3babeb40b34a0c1751e0bf5adfdc64c
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date: Mon Apr 6 11:26:08 2009 -0700

The regression caused the error below on systems with software
coordination, where per_cpu acpi data is not initialized for
secondary CPUs in a P-state domain.

On Tue, 2009-04-14 at 23:01 -0700, Zhang, Yanmin wrote:
> My machine hung with kernel 2.6.30-rc2 when a script read
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
>
> The oops happens in get_measured_perf:
>
> cur.aperf.whole = readin.aperf.whole -
> per_cpu(drv_data, cpu)->saved_aperf;
>
> This is because per_cpu(drv_data, cpu) is NULL.
>
> So get_measured_perf should check whether per_cpu(drv_data, cpu) == NULL
> and return 0 if it is.

--------------sys log------------------
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
PGD a7dd88067 PUD a7ccf5067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
CPU 0
Modules linked in: video output
Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server
RIP: 0010:[<ffffffff8021af75>] [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
RSP: 0018:ffff880a7d56de20 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000
RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001
RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0
R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000
R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88004d200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task ffff880a7d4d18c0)
Stack:
ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95
0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a
0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547
Call Trace:
[<ffffffff803efd54>] ? kobject_get+0x12/0x17
[<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57
[<ffffffff80650547>] ? do_dbs_timer+0x147/0x272
[<ffffffff80650400>] ? do_dbs_timer+0x0/0x272
[<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5
[<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e
[<ffffffff8024736f>] ? worker_thread+0x0/0x1f5
[<ffffffff80249f0d>] ? kthread+0x54/0x83
[<ffffffff8020c87a>] ? child_rip+0xa/0x20
[<ffffffff80249eb9>] ? kthread+0x0/0x83
[<ffffffff8020c870>] ? child_rip+0x0/0x20
Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48 c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b 58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28
RIP [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
RSP <ffff880a7d56de20>
CR2: 0000000000000020
---[ end trace 2b8fac9a49e19ad4 ]---

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
index 837c2c4..d722b8d 100644
--- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -68,11 +68,16 @@ struct acpi_cpufreq_data {
 	unsigned int max_freq;
 	unsigned int resume;
 	unsigned int cpu_feature;
-	u64 saved_aperf, saved_mperf;
 };
 
 static DEFINE_PER_CPU(struct acpi_cpufreq_data *, drv_data);
 
+struct acpi_msr_data {
+	u64 saved_aperf, saved_mperf;
+};
+
+static DEFINE_PER_CPU(struct acpi_msr_data, msr_data);
+
 DEFINE_TRACE(power_mark);
 
 /* acpi_perf_data is a pointer to percpu data. */
@@ -281,11 +286,11 @@ static unsigned int get_measured_perf(struct cpufreq_policy *policy,
 		return 0;
 
 	cur.aperf.whole = readin.aperf.whole -
-				per_cpu(drv_data, cpu)->saved_aperf;
+				per_cpu(msr_data, cpu).saved_aperf;
 	cur.mperf.whole = readin.mperf.whole -
-				per_cpu(drv_data, cpu)->saved_mperf;
-	per_cpu(drv_data, cpu)->saved_aperf = readin.aperf.whole;
-	per_cpu(drv_data, cpu)->saved_mperf = readin.mperf.whole;
+				per_cpu(msr_data, cpu).saved_mperf;
+	per_cpu(msr_data, cpu).saved_aperf = readin.aperf.whole;
+	per_cpu(msr_data, cpu).saved_mperf = readin.mperf.whole;
 
 #ifdef __i386__
 	/*
-- 
1.6.0.6
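The idea behind the patch can be modelled in plain userspace C. This is a minimal sketch, not the kernel code: ordinary arrays indexed by CPU number stand in for DEFINE_PER_CPU, and `NR_CPUS_MODEL`, `model_cpu_init` and `aperf_delta` are hypothetical names. It shows that a per-CPU pointer is only valid on the CPU whose init path set it, while a per-CPU struct with static storage is usable on every CPU:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4		/* hypothetical CPU count for the model */

struct drv_data_model {
	unsigned int max_freq;
};

struct msr_data_model {
	uint64_t saved_aperf, saved_mperf;
};

/* Before the patch: a per-CPU pointer, NULL until some init code sets it. */
struct drv_data_model *drv_data[NR_CPUS_MODEL];

/* After the patch: a per-CPU struct, zero-initialized for every CPU. */
struct msr_data_model msr_data[NR_CPUS_MODEL];

static struct drv_data_model policy_cpu_data = { .max_freq = 2000000 };

/* Models init running only for one CPU of a software-coordinated domain. */
void model_cpu_init(unsigned int policy_cpu)
{
	drv_data[policy_cpu] = &policy_cpu_data;
}

/* Safe on any CPU: no pointer is dereferenced to reach the saved value. */
uint64_t aperf_delta(unsigned int cpu, uint64_t aperf_now)
{
	uint64_t delta = aperf_now - msr_data[cpu].saved_aperf;

	msr_data[cpu].saved_aperf = aperf_now;
	return delta;
}
```

After model_cpu_init(0), drv_data[1] is still NULL, so the old `drv_data[1]->saved_aperf` access would fault, whereas aperf_delta(1, ...) works without any per-CPU setup.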

* Re: [PATCH] x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
2009-04-15 17:37 ` [PATCH] x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf Pallipadi, Venkatesh
@ 2009-04-16 1:20 ` Zhang, Yanmin
2009-04-18 4:11 ` Len Brown
1 sibling, 0 replies; 7+ messages in thread
From: Zhang, Yanmin @ 2009-04-16 1:20 UTC (permalink / raw)
To: Pallipadi, Venkatesh
Cc: Rusty Russell, LKML, cpufreq@vger.kernel.org, linux-acpi@vger.kernel.org, lenb
On Wed, 2009-04-15 at 10:37 -0700, Pallipadi, Venkatesh wrote:
> Patch below fixes the problem.
> Yanmin can you please verify.
It does work.
> Len please push this patch along to linus.
>
> Thanks,
> Venki

* Re: [PATCH] x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
2009-04-15 17:37 ` [PATCH] x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf Pallipadi, Venkatesh
2009-04-16 1:20 ` Zhang, Yanmin
@ 2009-04-18 4:11 ` Len Brown
1 sibling, 0 replies; 7+ messages in thread
From: Len Brown @ 2009-04-18 4:11 UTC (permalink / raw)
To: Pallipadi, Venkatesh
Cc: Zhang, Yanmin, Rusty Russell, LKML, cpufreq@vger.kernel.org, linux-acpi@vger.kernel.org
applied.
thanks,
Len Brown, Intel Open Source Technology Center