public inbox for linux-kernel@vger.kernel.org
* RE: [discuss] [OOPS] powernow on smp dual core amd64
@ 2005-06-13 21:47 Langsdorf, Mark
  2005-06-13 22:20 ` Tom Duffy
  0 siblings, 1 reply; 15+ messages in thread
From: Langsdorf, Mark @ 2005-06-13 21:47 UTC (permalink / raw)
  To: Tom Duffy; +Cc: Linux Kernel Mailing List, discuss

[-- Attachment #1: Type: text/plain, Size: 3660 bytes --]

> powernow-k8: Found 8 AMD Athlon 64 / Opteron processors 
> (version 1.40.2)
> powernow-k8:    0 : fid 0xe (2200 MHz), vid 0x8 (1350 mV)
> powernow-k8:    1 : fid 0xc (2000 MHz), vid 0xa (1300 mV)
> powernow-k8:    2 : fid 0xa (1800 MHz), vid 0xc (1250 mV)
> cpu_init done, current fid 0xe, vid 0x8
> powernow-k8:    0 : fid 0xe (2200 MHz), vid 0x8 (1350 mV)
> powernow-k8:    1 : fid 0xc (2000 MHz), vid 0xa (1300 mV)
> powernow-k8:    2 : fid 0xa (1800 MHz), vid 0xc (1250 mV)
> cpu_init done, current fid 0xe, vid 0x8
> powernow-k8:    0 : fid 0xe (2200 MHz), vid 0x8 (1350 mV)
> powernow-k8:    1 : fid 0xc (2000 MHz), vid 0xa (1300 mV)
> powernow-k8:    2 : fid 0xa (1800 MHz), vid 0xc (1250 mV)
> cpu_init done, current fid 0xe, vid 0x8
> powernow-k8:    0 : fid 0xe (2200 MHz), vid 0x8 (1350 mV)
> powernow-k8:    1 : fid 0xc (2000 MHz), vid 0xa (1300 mV)
> powernow-k8:    2 : fid 0xa (1800 MHz), vid 0xc (1250 mV)
> cpu_init done, current fid 0xe, vid 0x8

>  sdb:<3>powernowk8_get() cpu is 1
> Unable to handle kernel NULL pointer dereference at 
> 0000000000000024 RIP: 
> <ffffffff8011d94c>{query_current_values_with_pending_wait+60} 
> sdb1 sdb2
> 
> PGD 3fe74067 PUD 3fd28067 PMD 0
> Oops: 0002 [1] SMP
> CPU 1
> Modules linked in: mptscsih mptbase sd_mod scsi_mod
> Pid: 25, comm: events/7 Not tainted 2.6.12-rc6andro
> RIP: 0010:[<ffffffff8011d94c>] 
> <ffffffff8011d94c>{query_current_values_with_pending_wait+60}
> RSP: 0000:ffff81007fddbdc8  EFLAGS: 00010202
> RAX: 000000000000000e RBX: 0000000000000000 RCX: 00000000c0010042
> RDX: 0000000000000008 RSI: 0000000000000001 RDI: 0000000000000000
> RBP: 0000000000000080 R08: ffff81007fdda000 R09: ffff81003fd421f0
> R10: 000000000000001c R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000283 R15: ffffffff80112630
> FS:  000000000057d850(0000) GS:ffffffff80498180(0000) 
> knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000024 CR3: 000000003f415000 CR4: 
> 00000000000006e0 Process events/7 (pid: 25, threadinfo 
> ffff81007fdda000, task ffff81003fd421f0)
> Stack: 0000000000000080 ffffffff8011dea1 0000000000000001 
> ffff81003fd91430
>        ffff81003fd91400 ffffffff802e2b90 0000000000000000 
> 0000000000000003
>        ffff81007fddbe48 0000000000000001
> Call Trace:<ffffffff8011dea1>{powernowk8_get+145} 
> <ffffffff802e2b90>{cpufreq_get+96}
>        <ffffffff8011266a>{handle_cpufreq_delayed_get+58} 
> <ffffffff80148eec>{worker_thread+476}
>        <ffffffff801326d0>{default_wake_function+0} 
> <ffffffff80130733>{__wake_up_common+67}
>        <ffffffff80148d10>{worker_thread+0} 
> <ffffffff8014d7a9>{kthread+217}
>        <ffffffff80133be0>{schedule_tail+64} 
> <ffffffff8010f5b7>{child_rip+8}
>        <ffffffff8011d4f0>{flat_send_IPI_mask+0} 
> <ffffffff8014d6d0>{kthread+0}
>        <ffffffff8010f5af>{child_rip+0}
> 
> Code: 89 47 24 89 57 20 31 c0 48 83 c4 08 c3 66 66 66 90 66 
> 66 90 RIP 
> <ffffffff8011d94c>{query_current_values_with_pending_wait+60} 
> RSP <ffff81007fddbdc8>
> CR2: 0000000000000024
>  <5>Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0 

Okay, I think I have figured this out.  During initialization,
the cpufreq infrastructure only initializes the first core of
each processor.  When a request comes in for the second core,
its data structure is uninitialized and we get the NULL pointer
dereference.

The solution is to assign the pointer to the data structure for
the first core to all the other cores.
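
To illustrate the idea outside the kernel, here is a rough user-space
sketch. It is not the driver code: struct pn_data, core_siblings() and
the two cpu_init_* helpers are made-up stand-ins, and it assumes the two
cores of a package are numbered adjacently (0/1, 2/3, ...), which may
not hold on every system.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8            /* matches "Found 8 ... processors" in the log */
#define CORES_PER_PACKAGE 2  /* assumption: siblings are numbered adjacently */

/* Hypothetical, trimmed-down stand-in for struct powernow_k8_data. */
struct pn_data {
    unsigned int currvid;
    unsigned int currfid;
};

static struct pn_data *powernow_data[NR_CPUS];

/* Hypothetical stand-in for cpu_core_map[cpu]: bitmask of the cores
 * that share a package with 'cpu'. */
static unsigned int core_siblings(int cpu)
{
    int first = cpu - (cpu % CORES_PER_PACKAGE);
    return ((1u << CORES_PER_PACKAGE) - 1u) << first;
}

/* Old behaviour: only the CPU that ran cpu_init gets a pointer. */
static void cpu_init_old(int cpu, struct pn_data *data)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    powernow_data[cpu] = data;
}

/* Patched behaviour: every core in the package shares the pointer. */
static void cpu_init_patched(int cpu, struct pn_data *data)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    unsigned int mask = core_siblings(cpu);
    for (int i = 0; i < NR_CPUS; i++) {
        if (mask & (1u << i))
            powernow_data[i] = data;
    }
}

/* Returns the cached fid, or -1 where the kernel would have oopsed. */
static int get_fid(int cpu)
{
    struct pn_data *data = powernow_data[cpu];
    return data ? (int)data->currfid : -1;
}
```

With cpu_init_old(0, &d), get_fid(1) hits the NULL slot; with
cpu_init_patched(0, &d), CPUs 0 and 1 both see the same data while
CPU 2 stays unset until its own package is initialized.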

Tom, could you try this patch and see if it helps?

-Mark Langsdorf
AMD, Inc.

[-- Attachment #2: jhpn-2.6.12-rc6.patch --]
[-- Type: application/octet-stream, Size: 837 bytes --]

--- linux-2.6.12-rc6/arch/i386/kernel/cpu/cpufreq/powernow-k8.c.old	2005-06-12 17:41:55.123651184 -0500
+++ linux-2.6.12-rc6/arch/i386/kernel/cpu/cpufreq/powernow-k8.c	2005-06-12 17:46:32.780440936 -0500
@@ -44,7 +44,7 @@
 
 #define PFX "powernow-k8: "
 #define BFX PFX "BIOS error: "
-#define VERSION "version 1.40.2"
+#define VERSION "version 1.40.4"
 #include "powernow-k8.h"
 
 /* serialize freq changes  */
@@ -978,7 +978,7 @@
 {
 	struct powernow_k8_data *data;
 	cpumask_t oldmask = CPU_MASK_ALL;
-	int rc;
+	int rc, i;
 
 	if (!check_supported_cpu(pol->cpu))
 		return -ENODEV;
@@ -1064,7 +1064,9 @@
 	printk("cpu_init done, current fid 0x%x, vid 0x%x\n",
 	       data->currfid, data->currvid);
 
-	powernow_data[pol->cpu] = data;
+	for_each_cpu_mask(i, cpu_core_map[pol->cpu]) {
+		powernow_data[i] = data;
+	}
 
 	return 0;
 

* RE: [discuss] [OOPS] powernow on smp dual core amd64
@ 2005-06-13 22:44 Langsdorf, Mark
  2005-06-13 22:47 ` Andi Kleen
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Langsdorf, Mark @ 2005-06-13 22:44 UTC (permalink / raw)
  To: Tom Duffy; +Cc: discuss, Linux Kernel Mailing List

> On Mon, 2005-06-13 at 16:47 -0500, Langsdorf, Mark wrote:
> > Okay, I think I have figured this out.  During initialization, the 
> > cpufreq infrastructure only initializes the first core of each 
> > processor.  When a request comes in for the second core, its data 
> > structure is uninitialized and we get the NULL pointer dereference.
> > 
> > The solution is to assign the pointer to the data structure for the 
> > first core to all the other cores.
> > 
> > Tom, could you try this patch and see if it helps?
> 
> Yes!  It fixed the panic.  I get much further.

Great, I'll test that some more then submit it.

> Unfortunately, after starting cpuspeed daemon, I get this:

It looks like it's happening sometime after cpuspeed starts.
Could you disable cpuspeed and see if the problem still
occurs? 

> Starting cpuspeed: [  OK  ]
> Starting pcmcia:  Starting PCMCIA services:
> CPU 6: Machine Check Exception:                4 Bank 4: 
> b200000000070f0f
> TSC 4129a3d70d

> Code:  Bad RIP value.
> RIP [<00000000000000ff>] RSP <ffff81003fe63fa0>
> CR2: 00000000000000ff
>  <0>Kernel panic - not syncing: Oops

Andi said that "Something tried to access a physical memory 
address that was not mapped in the CPU."  Andi, is this
related to the bug that you thought might have been fixed
in 2.6.12-rc6-git4?

-Mark Langsdorf
AMD, Inc.


[parent not found: <84EA05E2CA77634C82730353CBE3A84301CFC14B@SAUSEXMB1.amd.com>]
* RE: [discuss] [OOPS] powernow on smp dual core amd64
@ 2005-06-10 19:48 Langsdorf, Mark
  2005-06-10 20:01 ` Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Langsdorf, Mark @ 2005-06-10 19:48 UTC (permalink / raw)
  To: Tom Duffy, Andi Kleen; +Cc: discuss, linux-kernel

It looks like the crash is caused by an invalid
pointer dereference in 
query_current_values_with_pending_wait(), which
implies that powernowk8_get() was called with an
invalid CPU number.
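
The register dump seems consistent with a NULL data pointer rather
than a wild one. A rough sketch of the arithmetic, assuming the
"Code:" bytes start at the faulting RIP: "89 47 24" decodes as a 32-bit
store of %eax to 0x24(%rdi), and with RDI = 0 that lands on 0x24, which
is what CR2 reports. The struct below is a hypothetical mirror of the
start of struct powernow_k8_data; its field order is an assumption
picked to match the two stores, not copied from powernow-k8.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the leading fields of struct powernow_k8_data;
 * the field order is an assumption chosen to match the stores below. */
struct pn_k8_layout {
    uint32_t cpu;
    uint32_t numps, batps;
    uint32_t rvo, irt, vidmvs, vstable, plllock;
    uint32_t currvid;  /* expected at 0x20 */
    uint32_t currfid;  /* expected at 0x24 */
};

struct mov_store {
    int reg;      /* source register number (0 = eax, 2 = edx) */
    int base;     /* base register number (7 = rdi, no REX prefix) */
    int8_t disp;  /* signed 8-bit displacement */
};

/* Decode "89 /r" (MOV r/m32, r32) with mod == 01 (disp8) and no SIB byte.
 * Returns 0 on success, -1 if the bytes are not of that form. */
static int decode_mov_store(const uint8_t insn[3], struct mov_store *out)
{
    if (insn[0] != 0x89 || (insn[1] >> 6) != 1 || (insn[1] & 7) == 4)
        return -1;
    out->reg = (insn[1] >> 3) & 7;
    out->base = insn[1] & 7;
    out->disp = (int8_t)insn[2];
    return 0;
}

/* Address written by the store, given the base register value. */
static uint64_t store_address(uint64_t base_value, const struct mov_store *m)
{
    return base_value + (uint64_t)(int64_t)m->disp;
}
```

Feeding it { 0x89, 0x47, 0x24 } and a base of 0 (RDI in the dump) gives
a store address of 0x24; the second instruction, "89 57 20", is the
matching store of %edx to offset 0x20. RAX = 0xe and RDX = 0x8 also
line up with the "current fid 0xe, vid 0x8" values seen elsewhere in
the thread, so these look like the currfid/currvid writes.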

Andi, what will happen if you do
set_cpus_allowed(current, cpumask_of_cpu(cpu)) when
cpu isn't in the range of online CPUs?  There's
supposed to be a check to prevent an invalid
pointer access from happening but it's failing for 
some reason.

-Mark Langsdorf
AMD, Inc.

> -----Original Message-----
> From: Tom Duffy [mailto:tduffy@sun.com] 
> Sent: Friday, June 10, 2005 1:47 PM
> To: Andi Kleen
> Cc: discuss@x86-64.org; linux-kernel@vger.kernel.org
> Subject: Re: [discuss] [OOPS] powernow on smp dual core amd64
> 
> 
> On Fri, 2005-06-10 at 18:53 +0200, Andi Kleen wrote:
> > On Thu, Jun 09, 2005 at 04:46:19PM -0700, Tom Duffy wrote:
> > > Got this panic when I recently upgraded my BIOS so that it
> > > supports k8 powernow on SMP dual-core.
> > 
> > 2.6.12-rc has a dual core aware powernow k8 driver.
> 
> Despite the name of the kernel, it is based on 2.6.12-rc6.
> 
> Here is the panic on bootup.
> 
> Unable to handle kernel NULL pointer dereference at 
> 0000000000000024 RIP: 
> <ffffffff8011dadc>{query_current_values_with_pending_wait+60}
> PGD 3f255067 PUD 7fe7e067 PMD 0
> Oops: 0002 [1] SMP
> CPU 1
> Modules linked in: mptscsih(U) mptbase(U) sd_mod scsi_mod
> Pid: 33, comm: events/7 Not tainted 2.6.11-1.1381_FC5smp
> RIP: 0010:[<ffffffff8011dadc>]  sdb1 sdb2 
> <ffffffff8011dadc>{query_current_values_with_pending_wait+60}
> RSP: 0018:ffff81007fd9fdc8  EFLAGS: 00010202
> RAX: 000000000000000e RBX: 0000000000000000 RCX: 00000000c0010042
> RDX: 0000000000000008 RSI: 0000000000000001 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffff81007fd9e000 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000080
> R13: 0000000000000000 R14: 0000000000000292 R15: ffffffff80112950
> FS:  00000000005a5858(0000) GS:ffffffff8050ec00(0000) 
> knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000024 CR3: 000000007fd76000 CR4: 
> 00000000000006e0 Process events/7 (pid: 33, threadinfo 
> ffff81007fd9e000, task ffff81007fd43070)
> Stack: 0000000000000000 ffffffff8011e0b1 0000000000000001 
> ffff81007fa02d10
>        ffff81007fa02d40 ffffffff802e6f23 0000000000000000 
> 0000000000000003
>        0000000000000001 0000000000000020
> Call Trace:<ffffffff8011e0b1>{powernowk8_get+129} 
> <ffffffff802e6f23>{cpufreq_get+115}
>        <ffffffff8011298a>{handle_cpufreq_delayed_get+58} 
> <ffffffff8014b9dc>{worker_thread+476}
>        <ffffffff80134710>{default_wake_function+0} 
> <ffffffff801326a3>{__wake_up_common+67}
>        <ffffffff8014b800>{worker_thread+0} 
> <ffffffff80150469>{kthread+217}
>        <ffffffff80135c90>{schedule_tail+64} 
> <ffffffff8010f76b>{child_rip+8}
>        <ffffffff8011d680>{flat_send_IPI_mask+0} 
> <ffffffff80150390>{kthread+0}
>        <ffffffff8010f763>{child_rip+0}
> 
> Code: 89 47 24 89 57 20 31 c0 48 83 c4 08 c3 66 66 66 90 66 
> 66 90 RIP 
> <ffffffff8011dadc>{query_current_values_with_pending_wait+60} 
> RSP <ffff81007fd9fdc8>
> CR2: 0000000000000024
>  <3>Debug: sleeping function called from invalid context at 
> include/linux/rwsem.h:43 in_atomic():0, irqs_disabled():1
> 
> Call Trace:<ffffffff8013abc5>{profile_task_exit+21} 
> <ffffffff8013bfe2>{do_exit+34}
>        <ffffffff80267378>{do_unblank_screen+40} 
> <ffffffff80124286>{do_page_fault+1926}
>        <ffffffff8035c032>{thread_return+0} 
> <ffffffff8035c084>{thread_return+82}
>        <ffffffff8013433d>{activate_task+141} 
> <ffffffff80112950>{handle_cpufreq_delayed_get+0}
>        <ffffffff8010f5b5>{error_exit+0} 
> <ffffffff80112950>{handle_cpufreq_delayed_get+0}
>        <ffffffff8011dadc>{query_current_values_with_pending_wait+60}
>        <ffffffff8011e0b1>{powernowk8_get+129} 
> <ffffffff802e6f23>{cpufreq_get+115}
>        <ffffffff8011298a>{handle_cpufreq_delayed_get+58} 
> <ffffffff8014b9dc>{worker_thread+476}
>        <ffffffff80134710>{default_wake_function+0} 
> <ffffffff801326a3>{__wake_up_common+67}
>        <ffffffff8014b800>{worker_thread+0} 
> <ffffffff80150469>{kthread+217}
>        <ffffffff80135c90>{schedule_tail+64} 
> <ffffffff8010f76b>{child_rip+8}
>        <ffffffff8011d680>{flat_send_IPI_mask+0} 
> <ffffffff80150390>{kthread+0}
>        <ffffffff8010f763>{child_rip+0}
> 
> -tduffy
> 


* [OOPS] powernow on smp dual core amd64
@ 2005-06-09 23:46 Tom Duffy
  2005-06-10 16:53 ` [discuss] " Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Duffy @ 2005-06-09 23:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: discuss

[-- Attachment #1: Type: text/plain, Size: 1890 bytes --]

Got this panic when I recently upgraded my BIOS so that it supports k8
powernow on SMP dual-core.

Unable to handle kernel NULL pointer dereference at 0000000000000024 RIP:
<ffffffff8011dadc>{query_current_values_with_pending_wait+60}
PGD 0
Oops: 0002 [1] SMP
CPU 1
Modules linked in: nls_utf8 e1000(U) parport_pc lp parport sr_mod autofs4 md5 ipv6 sunrpc usb_storage pcmcia yenta_socket rsrc_nonstatic pcmcia_core xfs exportfs video button battery ac ohci_hcd ehci_hcd i2c_nforce2 i2c_core shpchp usbnet mii dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod sata_nv libata qla2322 qla2xxx scsi_transport_fc mptscsih(U) mptbase(U) sd_mod scsi_mod
Pid: 26, comm: events/0 Not tainted 2.6.11-1.1369_FC4smp
RIP: 0010:[<ffffffff8011dadc>] <ffffffff8011dadc>{query_current_values_with_pending_wait+60}
RSP: 0000:ffff81003fca3dc8  EFLAGS: 00010202
RAX: 000000000000000e RBX: 0000000000000000 RCX: 00000000c0010042
RDX: 0000000000000008 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffff81003fca2000 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000292 R15: ffffffff80112950
FS:  00002aaaaadfd6e0(0000) GS:ffffffff80510700(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000024 CR3: 000000003e8c8000 CR4: 00000000000006e0
Process events/0 (pid: 26, threadinfo ffff81003fca2000, task ffff81007fd9b840)
Stack: 0000000000000000 ffffffff8011e0b1 0000000000000001 ffff81003feb1e00
       ffff81003feb1e30 ffffffff802e7d03 0000000000000000 0000000000000003
       0000000000000001 0000000000000020
Call Trace:<ffffffff8011e0b1>{powernowk8_get+129} <ffffffff802e7d03>{cpufreq_get+115}
       <ffffffff8011298a>{handle_cpufreq_delayed_get+58} <ffffffff8014b9dc>{worker_thread+476}
       <ffffffff80134710>{default_wake_function+0}

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


end of thread, other threads:[~2005-06-14 18:21 UTC | newest]

Thread overview: 15+ messages
2005-06-13 21:47 [discuss] [OOPS] powernow on smp dual core amd64 Langsdorf, Mark
2005-06-13 22:20 ` Tom Duffy
2005-06-13 22:38   ` Andi Kleen
2005-06-13 23:17   ` Zachary Amsden
2005-06-13 23:34     ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2005-06-13 22:44 Langsdorf, Mark
2005-06-13 22:47 ` Andi Kleen
2005-06-13 22:58 ` Tom Duffy
2005-06-13 23:35   ` Andi Kleen
2005-06-14 18:19 ` Tom Duffy
     [not found] <84EA05E2CA77634C82730353CBE3A84301CFC14B@SAUSEXMB1.amd.com>
2005-06-13 21:27 ` Tom Duffy
2005-06-10 19:48 Langsdorf, Mark
2005-06-10 20:01 ` Andi Kleen
2005-06-09 23:46 Tom Duffy
2005-06-10 16:53 ` [discuss] " Andi Kleen
2005-06-10 18:46   ` Tom Duffy
