netdev.vger.kernel.org archive mirror
* 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
@ 2006-08-12 10:07 Rafael J. Wysocki
  2006-08-12 12:28 ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2006-08-12 10:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Stephen Hemminger, netdev

Hi,

On 2.6.18-rc3-mm2 with hotfixes I get things like the appended one on attempts
to suspend to disk.  It occurs while devices are being suspended and is fairly
reproducible.

Greetings,
Rafael


Suspending device 0000:01:00.0
Suspending device 0000:02:02.0
Suspending device 0000:02:01.4
Suspending device 0000:02:01.3
Suspending device 0000:02:01.2
Suspending device 0000:02:01.1
Suspending device 0000:02:01.0
Suspending device 0000:02:00.0
skge Ram read data parity error
skge Ram write data parity error
skge eth0: receive queue parity error
skge <NULL>: receive queue parity error
skge 0000:02:00.0: PCI error cmd=0x110 status=0x2b0
general protection fault: 0000 [1] PREEMPT
last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/subsystem_device
CPU 0
Modules linked in: ide_cd cdrom usbserial asus_acpi thermal ipv6 processor fan button battery ac af_packet snd_pcm_oss snd_mixer_oss snd_seq
snd_seq_device bcm43xx ieee80211softmac ieee80211 ieee80211_crypt pcmcia firmware_class ohci1394 ieee1394 skge yenta_socket rsrc_nonstatic
pcmcia_core usbhid ff_memless snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc ehci_hcd ohci_hcd
i2c_nforce2 i2c_core parport_pc lp parport dm_mod
Pid: 4, comm: events/0 Not tainted 2.6.18-rc3-mm2 #17
RIP: 0010:[<ffffffff88107287>]  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
RSP: 0018:ffffffff80621e70  EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000040
RDX: ffff81005addf128 RSI: ffffffff80621eec RDI: ffff81005addeb60
RBP: ffffffff80621ed0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000040 R11: 0000000000000000 R12: ffff81005addf0a0
R13: 0000000000000000 R14: ffff810057fe9180 R15: 00000000ffffffff
FS:  00002b4b98df4b00(0000) GS:ffffffff808c2000(0000) knlGS:00000000558b4d00
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002adeb0d7d0b0 CR3: 0000000025147000 CR4: 00000000000006e0
Process events/0 (pid: 4, threadinfo ffff810037f44000, task ffff810037fef100)
Stack:  ffffffff80621eb0 ffffffff80621eec ffff81005addeb60 ffff81005ad61488
 ffff81005addf128 000000400000000a 00000001008e6a25 0000000000000000
 ffff81005addeb60 0000000000000000 00000001008e6a25 00000000ffffffff
Call Trace:
 [<ffffffff8040b1ba>] net_rx_action+0xba/0x1f0
 [<ffffffff80233640>] __do_softirq+0x70/0xf0
 [<ffffffff8020aa7c>] call_softirq+0x1c/0x30
DWARF2 unwinder stuck at call_softirq+0x1c/0x30
Leftover inexact backtrace:
 <IRQ> [<ffffffff8020ca4d>] do_softirq+0x3d/0xb0
 [<ffffffff8023349e>] irq_exit+0x4e/0x60
 [<ffffffff8020cbf5>] do_IRQ+0x135/0x140
 [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
 [<ffffffff8020a266>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff80233367>] local_bh_enable_ip+0xe7/0x110
 [<ffffffff804718b9>] _spin_unlock_bh+0x39/0x40
 [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
 [<ffffffff80427c8b>] rt_cache_flush+0xab/0x100
 [<ffffffff8045a1c9>] fib_netdev_event+0xa9/0xc0
 [<ffffffff8023c2af>] notifier_call_chain+0x2f/0x50
 [<ffffffff8023c4b9>] raw_notifier_call_chain+0x9/0x10
 [<ffffffff80409789>] netdev_state_change+0x29/0x40
 [<ffffffff80415122>] linkwatch_run_queue+0x162/0x190
 [<ffffffff8041517a>] linkwatch_event+0x2a/0x40
 [<ffffffff8023fd72>] run_workqueue+0xc2/0x120
 [<ffffffff80415150>] linkwatch_event+0x0/0x40
 [<ffffffff8023fff1>] worker_thread+0x121/0x160
 [<ffffffff80229370>] default_wake_function+0x0/0x10
 [<ffffffff8023fed0>] worker_thread+0x0/0x160
 [<ffffffff802436f9>] kthread+0xd9/0x110
 [<ffffffff8024b1ad>] trace_hardirqs_on+0x11d/0x150
 [<ffffffff8020a706>] child_rip+0x8/0x12
 [<ffffffff80471e5b>] _spin_unlock_irq+0x2b/0x60
 [<ffffffff8020a2c0>] restore_args+0x0/0x30
 [<ffffffff80243620>] kthread+0x0/0x110
 [<ffffffff8020a6fe>] child_rip+0x0/0x12


Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
RIP  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
 RSP <ffffffff80621e70>
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 10:07 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend Rafael J. Wysocki
@ 2006-08-12 12:28 ` Andrew Morton
  2006-08-12 13:39   ` Jeff Garzik
  2006-08-12 14:31   ` Rafael J. Wysocki
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2006-08-12 12:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: LKML, Stephen Hemminger, netdev

On Sat, 12 Aug 2006 12:07:42 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> Hi,
> 
> On 2.6.18-rc3-mm2 with hotfixes I get things like the appended one on attempts
> to suspend to disk.  It occurs while devices are being suspended and is fairly
> reproducible.
> 
> Greetings,
> Rafael
> 
> 
> Suspending device 0000:01:00.0
> Suspending device 0000:02:02.0
> Suspending device 0000:02:01.4
> Suspending device 0000:02:01.3
> Suspending device 0000:02:01.2
> Suspending device 0000:02:01.1
> Suspending device 0000:02:01.0
> Suspending device 0000:02:00.0
> skge Ram read data parity error
> skge Ram write data parity error
> skge eth0: receive queue parity error
> skge <NULL>: receive queue parity error
> skge 0000:02:00.0: PCI error cmd=0x110 status=0x2b0
> general protection fault: 0000 [1] PREEMPT
> last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/subsystem_device
> CPU 0
> Modules linked in: ide_cd cdrom usbserial asus_acpi thermal ipv6 processor fan button battery ac af_packet snd_pcm_oss snd_mixer_oss snd_seq
> snd_seq_device bcm43xx ieee80211softmac ieee80211 ieee80211_crypt pcmcia firmware_class ohci1394 ieee1394 skge yenta_socket rsrc_nonstatic
> pcmcia_core usbhid ff_memless snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc ehci_hcd ohci_hcd
> i2c_nforce2 i2c_core parport_pc lp parport dm_mod
> Pid: 4, comm: events/0 Not tainted 2.6.18-rc3-mm2 #17
> RIP: 0010:[<ffffffff88107287>]  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
> RSP: 0018:ffffffff80621e70  EFLAGS: 00010202
> RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000040

RAX doesn't look good.

> RDX: ffff81005addf128 RSI: ffffffff80621eec RDI: ffff81005addeb60
> RBP: ffffffff80621ed0 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000040 R11: 0000000000000000 R12: ffff81005addf0a0
> R13: 0000000000000000 R14: ffff810057fe9180 R15: 00000000ffffffff
> FS:  00002b4b98df4b00(0000) GS:ffffffff808c2000(0000) knlGS:00000000558b4d00
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00002adeb0d7d0b0 CR3: 0000000025147000 CR4: 00000000000006e0
> Process events/0 (pid: 4, threadinfo ffff810037f44000, task ffff810037fef100)
> Stack:  ffffffff80621eb0 ffffffff80621eec ffff81005addeb60 ffff81005ad61488
>  ffff81005addf128 000000400000000a 00000001008e6a25 0000000000000000
>  ffff81005addeb60 0000000000000000 00000001008e6a25 00000000ffffffff
> Call Trace:
>  [<ffffffff8040b1ba>] net_rx_action+0xba/0x1f0
>  [<ffffffff80233640>] __do_softirq+0x70/0xf0
>  [<ffffffff8020aa7c>] call_softirq+0x1c/0x30
> DWARF2 unwinder stuck at call_softirq+0x1c/0x30
> Leftover inexact backtrace:
>  <IRQ> [<ffffffff8020ca4d>] do_softirq+0x3d/0xb0
>  [<ffffffff8023349e>] irq_exit+0x4e/0x60
>  [<ffffffff8020cbf5>] do_IRQ+0x135/0x140
>  [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
>  [<ffffffff8020a266>] ret_from_intr+0x0/0xf
>  <EOI> [<ffffffff80233367>] local_bh_enable_ip+0xe7/0x110
>  [<ffffffff804718b9>] _spin_unlock_bh+0x39/0x40
>  [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
>  [<ffffffff80427c8b>] rt_cache_flush+0xab/0x100
>  [<ffffffff8045a1c9>] fib_netdev_event+0xa9/0xc0
>  [<ffffffff8023c2af>] notifier_call_chain+0x2f/0x50
>  [<ffffffff8023c4b9>] raw_notifier_call_chain+0x9/0x10
>  [<ffffffff80409789>] netdev_state_change+0x29/0x40
>  [<ffffffff80415122>] linkwatch_run_queue+0x162/0x190
>  [<ffffffff8041517a>] linkwatch_event+0x2a/0x40
>  [<ffffffff8023fd72>] run_workqueue+0xc2/0x120
>  [<ffffffff80415150>] linkwatch_event+0x0/0x40
>  [<ffffffff8023fff1>] worker_thread+0x121/0x160
>  [<ffffffff80229370>] default_wake_function+0x0/0x10
>  [<ffffffff8023fed0>] worker_thread+0x0/0x160
>  [<ffffffff802436f9>] kthread+0xd9/0x110
>  [<ffffffff8024b1ad>] trace_hardirqs_on+0x11d/0x150
>  [<ffffffff8020a706>] child_rip+0x8/0x12
>  [<ffffffff80471e5b>] _spin_unlock_irq+0x2b/0x60
>  [<ffffffff8020a2c0>] restore_args+0x0/0x30
>  [<ffffffff80243620>] kthread+0x0/0x110
>  [<ffffffff8020a6fe>] child_rip+0x0/0x12
> Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
> RIP  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
>  RSP <ffffffff80621e70>
>  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

ksymoops says:

Code;  ffffffff88107287 <_end+7ac9287/7efc2000>
00000000 <_EIP>:
Code;  ffffffff88107287 <_end+7ac9287/7efc2000>   <=====
   0:   44                        inc    %esp   <=====
Code;  ffffffff88107288 <_end+7ac9288/7efc2000>
   1:   8b 28                     mov    (%eax),%ebp
Code;  ffffffff8810728a <_end+7ac928a/7efc2000>
   3:   c7 45 d0 00 00 00 00      movl   $0x0,0xffffffd0(%ebp)
Code;  ffffffff88107291 <_end+7ac9291/7efc2000>
   a:   45                        inc    %ebp
Code;  ffffffff88107292 <_end+7ac9292/7efc2000>
   b:   85 ed                     test   %ebp,%ebp
Code;  ffffffff88107294 <_end+7ac9294/7efc2000>
   d:   0f 89 29 fb ff ff         jns    fffffb3c <_EIP+0xfffffb3c>
Code;  ffffffff8810729a <_end+7ac929a/7efc2000>
  13:   e9 00 00 00 00            jmp    18 <_EIP+0x18>

So even if we didn't deref a kfree'd pointer, we're about to.

It would be good if you could poke around in gdb, work out exactly which
statement it's oopsing at, please.
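
For context on why RAX stands out: 0x6b is the byte the slab debugging code writes over freed objects (POISON_FREE), and on x86-64 an address whose upper bits are not a sign extension of bit 47 is non-canonical, so dereferencing it raises a general protection fault rather than a page fault. A minimal userspace sketch of both checks follows. The helper names are made up for illustration, and the canonical test assumes 48-bit virtual addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects when slab debugging is enabled. */
#define POISON_FREE 0x6b

/* Hypothetical helper: true if every byte of the value is the free-poison byte. */
static bool looks_like_freed_slab(uint64_t v)
{
	for (int i = 0; i < 8; i++) {
		if (((v >> (8 * i)) & 0xff) != POISON_FREE)
			return false;
	}
	return true;
}

/*
 * Hypothetical helper: an address is canonical when bits 63..47 are
 * all 0 or all 1 (assumption: 48-bit virtual addresses).
 */
static bool is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47; /* the 17 uppermost bits */

	return top == 0 || top == 0x1ffff;
}
```

With the values from the oops, looks_like_freed_slab() is true for RAX (0x6b6b6b6b6b6b6b6b) and is_canonical() is false for it, while RDI (0xffff81005addeb60) is an ordinary canonical kernel address.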



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 12:28 ` Andrew Morton
@ 2006-08-12 13:39   ` Jeff Garzik
  2006-08-12 14:32     ` Rafael J. Wysocki
  2006-08-12 14:31   ` Rafael J. Wysocki
  1 sibling, 1 reply; 18+ messages in thread
From: Jeff Garzik @ 2006-08-12 13:39 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Rafael J. Wysocki, LKML, Stephen Hemminger, netdev

Andrew Morton wrote:
> It would be good if you could poke around in gdb, work out exactly which
> statement it's oopsing at, please.

I'm also interested to know if the problem goes away when you disable 
preempt...

	Jeff




* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 12:28 ` Andrew Morton
  2006-08-12 13:39   ` Jeff Garzik
@ 2006-08-12 14:31   ` Rafael J. Wysocki
  2006-08-12 16:12     ` Edgar E. Iglesias
  1 sibling, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2006-08-12 14:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, Stephen Hemminger, netdev

On Saturday 12 August 2006 14:28, Andrew Morton wrote:
> On Sat, 12 Aug 2006 12:07:42 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > Hi,
> > 
> > On 2.6.18-rc3-mm2 with hotfixes I get things like the appended one on attempts
> > to suspend to disk.  It occurs while devices are being suspended and is fairly
> > reproducible.
> > 
> > Greetings,
> > Rafael
> > 
> > 
> > Suspending device 0000:01:00.0
> > Suspending device 0000:02:02.0
> > Suspending device 0000:02:01.4
> > Suspending device 0000:02:01.3
> > Suspending device 0000:02:01.2
> > Suspending device 0000:02:01.1
> > Suspending device 0000:02:01.0
> > Suspending device 0000:02:00.0
> > skge Ram read data parity error
> > skge Ram write data parity error
> > skge eth0: receive queue parity error
> > skge <NULL>: receive queue parity error

This stuff comes from the interrupt handler which apparently races with
something.

> > skge 0000:02:00.0: PCI error cmd=0x110 status=0x2b0
> > general protection fault: 0000 [1] PREEMPT
> > last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/subsystem_device
> > CPU 0
> > Modules linked in: ide_cd cdrom usbserial asus_acpi thermal ipv6 processor fan button battery ac af_packet snd_pcm_oss snd_mixer_oss snd_seq
> > snd_seq_device bcm43xx ieee80211softmac ieee80211 ieee80211_crypt pcmcia firmware_class ohci1394 ieee1394 skge yenta_socket rsrc_nonstatic
> > pcmcia_core usbhid ff_memless snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc ehci_hcd ohci_hcd
> > i2c_nforce2 i2c_core parport_pc lp parport dm_mod
> > Pid: 4, comm: events/0 Not tainted 2.6.18-rc3-mm2 #17
> > RIP: 0010:[<ffffffff88107287>]  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
> > RSP: 0018:ffffffff80621e70  EFLAGS: 00010202
> > RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000040
> 
> RAX doesn't look good.

Yup.

> > RDX: ffff81005addf128 RSI: ffffffff80621eec RDI: ffff81005addeb60
> > RBP: ffffffff80621ed0 R08: 0000000000000001 R09: 0000000000000000
> > R10: 0000000000000040 R11: 0000000000000000 R12: ffff81005addf0a0
> > R13: 0000000000000000 R14: ffff810057fe9180 R15: 00000000ffffffff
> > FS:  00002b4b98df4b00(0000) GS:ffffffff808c2000(0000) knlGS:00000000558b4d00
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 00002adeb0d7d0b0 CR3: 0000000025147000 CR4: 00000000000006e0
> > Process events/0 (pid: 4, threadinfo ffff810037f44000, task ffff810037fef100)
> > Stack:  ffffffff80621eb0 ffffffff80621eec ffff81005addeb60 ffff81005ad61488
> >  ffff81005addf128 000000400000000a 00000001008e6a25 0000000000000000
> >  ffff81005addeb60 0000000000000000 00000001008e6a25 00000000ffffffff
> > Call Trace:
> >  [<ffffffff8040b1ba>] net_rx_action+0xba/0x1f0
> >  [<ffffffff80233640>] __do_softirq+0x70/0xf0
> >  [<ffffffff8020aa7c>] call_softirq+0x1c/0x30
> > DWARF2 unwinder stuck at call_softirq+0x1c/0x30
> > Leftover inexact backtrace:
> >  <IRQ> [<ffffffff8020ca4d>] do_softirq+0x3d/0xb0
> >  [<ffffffff8023349e>] irq_exit+0x4e/0x60
> >  [<ffffffff8020cbf5>] do_IRQ+0x135/0x140
> >  [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
> >  [<ffffffff8020a266>] ret_from_intr+0x0/0xf
> >  <EOI> [<ffffffff80233367>] local_bh_enable_ip+0xe7/0x110
> >  [<ffffffff804718b9>] _spin_unlock_bh+0x39/0x40
> >  [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
> >  [<ffffffff80427c8b>] rt_cache_flush+0xab/0x100
> >  [<ffffffff8045a1c9>] fib_netdev_event+0xa9/0xc0
> >  [<ffffffff8023c2af>] notifier_call_chain+0x2f/0x50
> >  [<ffffffff8023c4b9>] raw_notifier_call_chain+0x9/0x10
> >  [<ffffffff80409789>] netdev_state_change+0x29/0x40
> >  [<ffffffff80415122>] linkwatch_run_queue+0x162/0x190
> >  [<ffffffff8041517a>] linkwatch_event+0x2a/0x40
> >  [<ffffffff8023fd72>] run_workqueue+0xc2/0x120
> >  [<ffffffff80415150>] linkwatch_event+0x0/0x40
> >  [<ffffffff8023fff1>] worker_thread+0x121/0x160
> >  [<ffffffff80229370>] default_wake_function+0x0/0x10
> >  [<ffffffff8023fed0>] worker_thread+0x0/0x160
> >  [<ffffffff802436f9>] kthread+0xd9/0x110
> >  [<ffffffff8024b1ad>] trace_hardirqs_on+0x11d/0x150
> >  [<ffffffff8020a706>] child_rip+0x8/0x12
> >  [<ffffffff80471e5b>] _spin_unlock_irq+0x2b/0x60
> >  [<ffffffff8020a2c0>] restore_args+0x0/0x30
> >  [<ffffffff80243620>] kthread+0x0/0x110
> >  [<ffffffff8020a6fe>] child_rip+0x0/0x12
> > Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
> > RIP  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
> >  RSP <ffffffff80621e70>
> >  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
> 
> ksymoops says:
> 
> Code;  ffffffff88107287 <_end+7ac9287/7efc2000>
> 00000000 <_EIP>:
> Code;  ffffffff88107287 <_end+7ac9287/7efc2000>   <=====
>    0:   44                        inc    %esp   <=====
> Code;  ffffffff88107288 <_end+7ac9288/7efc2000>
>    1:   8b 28                     mov    (%eax),%ebp
> Code;  ffffffff8810728a <_end+7ac928a/7efc2000>
>    3:   c7 45 d0 00 00 00 00      movl   $0x0,0xffffffd0(%ebp)
> Code;  ffffffff88107291 <_end+7ac9291/7efc2000>
>    a:   45                        inc    %ebp
> Code;  ffffffff88107292 <_end+7ac9292/7efc2000>
>    b:   85 ed                     test   %ebp,%ebp
> Code;  ffffffff88107294 <_end+7ac9294/7efc2000>
>    d:   0f 89 29 fb ff ff         jns    fffffb3c <_EIP+0xfffffb3c>
> Code;  ffffffff8810729a <_end+7ac929a/7efc2000>
>   13:   e9 00 00 00 00            jmp    18 <_EIP+0x18>
> 
> So even if we didn't deref a kfree'd pointer, we're about to.

Hm, but the code should be 64-bit?

> It would be good if you could poke around in gdb, work out exactly which
> statement it's oopsing at, please.

(gdb) l *skge_poll+0x547
0x5287 is in skge_poll (skge.c:2719).
2714                    struct skge_rx_desc *rd = e->desc;
2715                    struct sk_buff *skb;
2716                    u32 control;
2717
2718                    rmb();
2719                    control = rd->control;
2720                    if (control & BMU_OWN)
2721                            break;
2722
2723                    skb = skge_rx_get(skge, e, control, rd->status, rd->csum2);


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 13:39   ` Jeff Garzik
@ 2006-08-12 14:32     ` Rafael J. Wysocki
  2006-08-12 19:34       ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2006-08-12 14:32 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Andrew Morton, LKML, Stephen Hemminger, netdev

On Saturday 12 August 2006 15:39, Jeff Garzik wrote:
> Andrew Morton wrote:
> > It would be good if you could poke around in gdb, work out exactly which
> > statement it's oopsing at, please.
> 
> I'm also interested to know if the problem goes away when you disable 
> preempt...

That will take some time to test. :-)

Rafael


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 14:31   ` Rafael J. Wysocki
@ 2006-08-12 16:12     ` Edgar E. Iglesias
  2006-08-12 17:13       ` Rafael J. Wysocki
  0 siblings, 1 reply; 18+ messages in thread
From: Edgar E. Iglesias @ 2006-08-12 16:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, LKML, Stephen Hemminger, netdev

On Sat, Aug 12, 2006 at 04:31:18PM +0200, Rafael J. Wysocki wrote:
> On Saturday 12 August 2006 14:28, Andrew Morton wrote:
> > On Sat, 12 Aug 2006 12:07:42 +0200
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > Hi,
> > > 
> > > On 2.6.18-rc3-mm2 with hotfixes I get things like the appended one on attempts
> > > to suspend to disk.  It occurs while devices are being suspended and is fairly
> > > reproducible.
> > > 
> > > Greetings,
> > > Rafael
> > > 
> > > 
> > > Suspending device 0000:01:00.0
> > > Suspending device 0000:02:02.0
> > > Suspending device 0000:02:01.4
> > > Suspending device 0000:02:01.3
> > > Suspending device 0000:02:01.2
> > > Suspending device 0000:02:01.1
> > > Suspending device 0000:02:01.0
> > > Suspending device 0000:02:00.0
> > > skge Ram read data parity error
> > > skge Ram write data parity error
> > > skge eth0: receive queue parity error
> > > skge <NULL>: receive queue parity error
> 
> This stuff comes from the interrupt handler which apparently races with
> something.

Maybe the skge driver is not doing netif_poll_disable before clearing the rx 
ring at suspend/down?

Best regards
-- 
        Programmer
        Edgar E. Iglesias <edgar.iglesias@axis.com> 46.46.272.1946


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 16:12     ` Edgar E. Iglesias
@ 2006-08-12 17:13       ` Rafael J. Wysocki
  2006-08-12 18:16         ` Edgar E. Iglesias
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2006-08-12 17:13 UTC (permalink / raw)
  To: Edgar E. Iglesias; +Cc: Andrew Morton, LKML, Stephen Hemminger, netdev

On Saturday 12 August 2006 18:12, Edgar E. Iglesias wrote:
> On Sat, Aug 12, 2006 at 04:31:18PM +0200, Rafael J. Wysocki wrote:
> > On Saturday 12 August 2006 14:28, Andrew Morton wrote:
> > > On Sat, 12 Aug 2006 12:07:42 +0200
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > Hi,
> > > > 
> > > > On 2.6.18-rc3-mm2 with hotfixes I get things like the appended one on attempts
> > > > to suspend to disk.  It occurs while devices are being suspended and is fairly
> > > > reproducible.
> > > > 
> > > > Greetings,
> > > > Rafael
> > > > 
> > > > 
> > > > Suspending device 0000:01:00.0
> > > > Suspending device 0000:02:02.0
> > > > Suspending device 0000:02:01.4
> > > > Suspending device 0000:02:01.3
> > > > Suspending device 0000:02:01.2
> > > > Suspending device 0000:02:01.1
> > > > Suspending device 0000:02:01.0
> > > > Suspending device 0000:02:00.0
> > > > skge Ram read data parity error
> > > > skge Ram write data parity error
> > > > skge eth0: receive queue parity error
> > > > skge <NULL>: receive queue parity error
> > 
> > This stuff comes from the interrupt handler which apparently races with
> > something.
> 
> Maybe the skge driver is not doing netif_poll_disable before clearing the rx 
> ring at suspend/down?

Apparently it doesn't.

At least netif_poll_disable is not referenced anywhere in skge.c.

Greetings,
Rafael


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 17:13       ` Rafael J. Wysocki
@ 2006-08-12 18:16         ` Edgar E. Iglesias
  2006-08-12 19:32           ` Rafael J. Wysocki
  2006-08-13  5:56           ` Stephen Hemminger
  0 siblings, 2 replies; 18+ messages in thread
From: Edgar E. Iglesias @ 2006-08-12 18:16 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, LKML, Stephen Hemminger, netdev

On Sat, Aug 12, 2006 at 07:13:01PM +0200, Rafael J. Wysocki wrote:
> Apparently it doesn't.

Hi, could you try and see if this helps?

Best regards
-- 
        Programmer
        Edgar E. Iglesias <edgar.iglesias@axis.com> 46.46.272.1946

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@axis.com>

diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index 7de9a07..accefab 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -2211,6 +2211,7 @@ static int skge_up(struct net_device *de
 	skge_write8(hw, Q_ADDR(rxqaddr[port], Q_CSR), CSR_START | CSR_IRQ_CL_F);
 	skge_led(skge, LED_MODE_ON);
 
+	netif_poll_enable(dev);
 	return 0;
 
  free_rx_ring:
@@ -2279,6 +2280,7 @@ static int skge_down(struct net_device *
 
 	skge_led(skge, LED_MODE_OFF);
 
+	netif_poll_disable(dev);	
 	skge_tx_clean(skge);
 	skge_rx_clean(skge);
 



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 18:16         ` Edgar E. Iglesias
@ 2006-08-12 19:32           ` Rafael J. Wysocki
  2006-08-13  5:56           ` Stephen Hemminger
  1 sibling, 0 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2006-08-12 19:32 UTC (permalink / raw)
  To: Edgar E. Iglesias; +Cc: Andrew Morton, LKML, Stephen Hemminger, netdev

On Saturday 12 August 2006 20:16, Edgar E. Iglesias wrote:
> On Sat, Aug 12, 2006 at 07:13:01PM +0200, Rafael J. Wysocki wrote:
> > Apparently it doesn't.
> 
> Hi, could you try and see if this helps?

With the patch I can't reproduce the problem.  I sometimes get the error
messages from the interrupt handler, but then it doesn't blow up in
skge_poll(), so I think the patch helps.

Thanks,
Rafael
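
A rough userspace model of why the patch appears to close the race: in kernels of this era netif_poll_disable() waits until it can take the same "scheduled for polling" bit that the interrupt path sets before queueing a poll, so once it returns, no poll is running and none can be scheduled until netif_poll_enable() clears the bit. The sketch below uses C11 atomics and invented names (model_dev, model_*). It is not the kernel code, and the busy-wait stands in for the kernel's sleeping loop.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the "scheduled for polling" bit in a net device's state. */
struct model_dev {
	atomic_flag rx_sched;
};

/* Interrupt path: returns true if the caller may queue the device for polling. */
static bool model_rx_schedule_prep(struct model_dev *dev)
{
	return !atomic_flag_test_and_set(&dev->rx_sched);
}

/* Poll path: called when a poll pass has finished. */
static void model_rx_complete(struct model_dev *dev)
{
	atomic_flag_clear(&dev->rx_sched);
}

/* Down/suspend path: wait until no poll is scheduled, then keep the bit held. */
static void model_poll_disable(struct model_dev *dev)
{
	while (atomic_flag_test_and_set(&dev->rx_sched))
		; /* the kernel sleeps here; the model just spins */
}

/* Up/resume path: allow polling again. */
static void model_poll_enable(struct model_dev *dev)
{
	atomic_flag_clear(&dev->rx_sched);
}
```

In this model, freeing the receive ring is only safe between model_poll_disable() and model_poll_enable(), which mirrors where the patch places netif_poll_disable() ahead of skge_rx_clean().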


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 14:32     ` Rafael J. Wysocki
@ 2006-08-12 19:34       ` Rafael J. Wysocki
  0 siblings, 0 replies; 18+ messages in thread
From: Rafael J. Wysocki @ 2006-08-12 19:34 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Andrew Morton, LKML, Stephen Hemminger, netdev

On Saturday 12 August 2006 16:32, Rafael J. Wysocki wrote:
> On Saturday 12 August 2006 15:39, Jeff Garzik wrote:
> > Andrew Morton wrote:
> > > It would be good if you could poke around in gdb, work out exactly which
> > > statement it's oopsing at, please.
> > 
> > I'm also interested to know if the problem goes away when you disable 
> > preempt...
> 
> That will take some time to test. :-)

It's also reproducible with PREEMPT disabled.


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-12 18:16         ` Edgar E. Iglesias
  2006-08-12 19:32           ` Rafael J. Wysocki
@ 2006-08-13  5:56           ` Stephen Hemminger
  1 sibling, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2006-08-13  5:56 UTC (permalink / raw)
  To: Edgar E. Iglesias; +Cc: Rafael J. Wysocki, Andrew Morton, LKML, netdev

On Sat, 12 Aug 2006 20:16:03 +0200
"Edgar E. Iglesias" <edgar.iglesias@axis.com> wrote:

> On Sat, Aug 12, 2006 at 07:13:01PM +0200, Rafael J. Wysocki wrote:
> > Apparently it doesn't.
> 
> Hi, could you try and see if this helps?
> 
> Best regards

That looks good, but needs a few more changes for full safety.
Kind of like the sky2 changes needed to get Mac Mini to work.

The machines I have with skge boards don't suspend right, but that is because
of other problems.


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
@ 2006-08-13  8:53 Chuck Ebbert
  2006-08-13 17:38 ` Andrew Morton
  2006-08-14  0:21 ` Keith Owens
  0 siblings, 2 replies; 18+ messages in thread
From: Chuck Ebbert @ 2006-08-13  8:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rafael J. Wysocki, Stephen Hemminger, linux-kernel, linux-netdev,
	Keith Owens

In-Reply-To: <20060812052853.f9e5d648.akpm@osdl.org>

On Sat, 12 Aug 2006 05:28:53 -0700, Andrew Morton wrote:

> > general protection fault: 0000 [1] PREEMPT
> > last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/subsystem_device
> > CPU 0
> > Modules linked in: ide_cd cdrom usbserial asus_acpi thermal ipv6 processor fan button battery ac af_packet snd_pcm_oss snd_mixer_oss snd_seq
> > snd_seq_device bcm43xx ieee80211softmac ieee80211 ieee80211_crypt pcmcia firmware_class ohci1394 ieee1394 skge yenta_socket rsrc_nonstatic
> > pcmcia_core usbhid ff_memless snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc ehci_hcd ohci_hcd
> > i2c_nforce2 i2c_core parport_pc lp parport dm_mod
> > Pid: 4, comm: events/0 Not tainted 2.6.18-rc3-mm2 #17
> > RIP: 0010:[<ffffffff88107287>]  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
> > RSP: 0018:ffffffff80621e70  EFLAGS: 00010202
> > RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000040
> 
> RAX doesn't look good.
> 
> > RDX: ffff81005addf128 RSI: ffffffff80621eec RDI: ffff81005addeb60
> > RBP: ffffffff80621ed0 R08: 0000000000000001 R09: 0000000000000000
> > R10: 0000000000000040 R11: 0000000000000000 R12: ffff81005addf0a0
> > R13: 0000000000000000 R14: ffff810057fe9180 R15: 00000000ffffffff
> > FS:  00002b4b98df4b00(0000) GS:ffffffff808c2000(0000) knlGS:00000000558b4d00
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 00002adeb0d7d0b0 CR3: 0000000025147000 CR4: 00000000000006e0
> > Process events/0 (pid: 4, threadinfo ffff810037f44000, task ffff810037fef100)
> > Stack:  ffffffff80621eb0 ffffffff80621eec ffff81005addeb60 ffff81005ad61488
> >  ffff81005addf128 000000400000000a 00000001008e6a25 0000000000000000
> >  ffff81005addeb60 0000000000000000 00000001008e6a25 00000000ffffffff
> > Call Trace:
> >  [<ffffffff8040b1ba>] net_rx_action+0xba/0x1f0
> >  [<ffffffff80233640>] __do_softirq+0x70/0xf0
> >  [<ffffffff8020aa7c>] call_softirq+0x1c/0x30
> > DWARF2 unwinder stuck at call_softirq+0x1c/0x30
> > Leftover inexact backtrace:
> >  <IRQ> [<ffffffff8020ca4d>] do_softirq+0x3d/0xb0
> >  [<ffffffff8023349e>] irq_exit+0x4e/0x60
> >  [<ffffffff8020cbf5>] do_IRQ+0x135/0x140
> >  [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
> >  [<ffffffff8020a266>] ret_from_intr+0x0/0xf
> >  <EOI> [<ffffffff80233367>] local_bh_enable_ip+0xe7/0x110
> >  [<ffffffff804718b9>] _spin_unlock_bh+0x39/0x40
> >  [<ffffffff80427b9e>] rt_run_flush+0x8e/0xd0
> >  [<ffffffff80427c8b>] rt_cache_flush+0xab/0x100
> >  [<ffffffff8045a1c9>] fib_netdev_event+0xa9/0xc0
> >  [<ffffffff8023c2af>] notifier_call_chain+0x2f/0x50
> >  [<ffffffff8023c4b9>] raw_notifier_call_chain+0x9/0x10
> >  [<ffffffff80409789>] netdev_state_change+0x29/0x40
> >  [<ffffffff80415122>] linkwatch_run_queue+0x162/0x190
> >  [<ffffffff8041517a>] linkwatch_event+0x2a/0x40
> >  [<ffffffff8023fd72>] run_workqueue+0xc2/0x120
> >  [<ffffffff80415150>] linkwatch_event+0x0/0x40
> >  [<ffffffff8023fff1>] worker_thread+0x121/0x160
> >  [<ffffffff80229370>] default_wake_function+0x0/0x10
> >  [<ffffffff8023fed0>] worker_thread+0x0/0x160
> >  [<ffffffff802436f9>] kthread+0xd9/0x110
> >  [<ffffffff8024b1ad>] trace_hardirqs_on+0x11d/0x150
> >  [<ffffffff8020a706>] child_rip+0x8/0x12
> >  [<ffffffff80471e5b>] _spin_unlock_irq+0x2b/0x60
> >  [<ffffffff8020a2c0>] restore_args+0x0/0x30
> >  [<ffffffff80243620>] kthread+0x0/0x110
> >  [<ffffffff8020a6fe>] child_rip+0x0/0x12
> > Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
> > RIP  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
> >  RSP <ffffffff80621e70>
>
> ksymoops says:
> 
> Code;  ffffffff88107287 <_end+7ac9287/7efc2000>
> 00000000 <_EIP>:
> Code;  ffffffff88107287 <_end+7ac9287/7efc2000>   <=====
>    0:   44                        inc    %esp   <=====
> Code;  ffffffff88107288 <_end+7ac9288/7efc2000>
>    1:   8b 28                     mov    (%eax),%ebp

0x44 is a REX prefix in 64-bit mode, so somehow ksymoops got it
wrong and gave you an i386-mode decode instead of 64-bit mode.
Did you run it on an i386 machine and it assumed i386? Maybe you
need to use "-a x86-64"?  (I can't make it work on my setup.)

So it's really "mov (%r8),%ebp" if I am reading the manual right.

-- 
Chuck
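
To make the prefix handling concrete, here is a small sketch that decodes the first three bytes of the Code: line (44 8b 28) as a REX prefix, a one-byte opcode and a ModRM byte. The struct and function names are invented for illustration, and SIB and displacement handling is deliberately left out, so only the register numbers are meaningful.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of interest from a REX prefix plus the ModRM byte after the opcode. */
struct modrm_regs {
	bool wide;    /* REX.W: 64-bit operand size */
	unsigned mod; /* ModRM.mod (0..3) */
	unsigned reg; /* ModRM.reg extended by REX.R (0..15) */
	unsigned rm;  /* ModRM.rm extended by REX.B (0..15) */
};

/* In 64-bit mode bytes 0x40..0x4f are REX prefixes, not inc/dec. */
static bool is_rex(uint8_t b)
{
	return (b & 0xf0) == 0x40;
}

/* Hypothetical helper: decode a "REX, one-byte opcode, ModRM" sequence. */
static bool decode_rex_modrm(const uint8_t insn[3], struct modrm_regs *out)
{
	uint8_t rex = insn[0];
	uint8_t modrm = insn[2];

	if (!is_rex(rex))
		return false;

	out->wide = (rex & 0x08) != 0;
	out->mod = modrm >> 6;
	out->reg = ((modrm >> 3) & 7) | ((rex & 0x04) ? 8 : 0); /* REX.R */
	out->rm = (modrm & 7) | ((rex & 0x01) ? 8 : 0);         /* REX.B */
	return true;
}
```

For 44 8b 28 this gives mod = 0, reg = 13 and rm = 0. Opcode 8b loads a 32-bit register from memory, so REX.R here selects r13d as the destination and the address comes from rax, the register that holds 0x6b6b6b6b6b6b6b6b in the oops.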



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-13  8:53 Chuck Ebbert
@ 2006-08-13 17:38 ` Andrew Morton
  2006-08-14  0:21 ` Keith Owens
  1 sibling, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2006-08-13 17:38 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Rafael J. Wysocki, Stephen Hemminger, linux-kernel, linux-netdev,
	Keith Owens

On Sun, 13 Aug 2006 04:53:09 -0400
Chuck Ebbert <76306.1226@compuserve.com> wrote:

> > > Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
> > > RIP  [<ffffffff88107287>] :skge:skge_poll+0x547/0x570
> > >  RSP <ffffffff80621e70>
> >
> > ksymoops says:
> > 
> > Code;  ffffffff88107287 <_end+7ac9287/7efc2000>
> > 00000000 <_EIP>:
> > Code;  ffffffff88107287 <_end+7ac9287/7efc2000>   <=====
> >    0:   44                        inc    %esp   <=====
> > Code;  ffffffff88107288 <_end+7ac9288/7efc2000>
> >    1:   8b 28                     mov    (%eax),%ebp
> 
> 0x44 is a REX prefix in 64-bit mode, so somehow ksymoops got it
> wrong and gave you an i386-mode decode instead of 64-bit mode.
> Did you run it on an i386 machine and it assumed i386? Maybe you
> need to use "-a x86-64"?  (I can't make it work on my setup.)
> 
> So it's really "mov (%r8),%ebp" if I am reading the manual right.

I don't know what ksymoops's problem is.  I noticed that without `-a' it
gave x86 code so I gave it `-a i386:x86-64' and didn't bother to read the
output ;) Seems that nothing I can do will persuade it to not treat this as
i386 code.


* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-13  8:53 Chuck Ebbert
  2006-08-13 17:38 ` Andrew Morton
@ 2006-08-14  0:21 ` Keith Owens
  2006-08-14  0:35   ` Andrew Morton
  1 sibling, 1 reply; 18+ messages in thread
From: Keith Owens @ 2006-08-14  0:21 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Andrew Morton, Rafael J. Wysocki, Stephen Hemminger, linux-kernel,
	linux-netdev

Chuck Ebbert (on Sun, 13 Aug 2006 04:53:09 -0400) wrote:
>In-Reply-To: <20060812052853.f9e5d648.akpm@osdl.org>
>
>On Sat, 12 Aug 2006 05:28:53 -0700, Andrew Morton wrote:
>
>> > Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
>>
>> ksymoops says:
>> 
>> Code;  ffffffff88107287 <_end+7ac9287/7efc2000>
>> 00000000 <_EIP>:
>> Code;  ffffffff88107287 <_end+7ac9287/7efc2000>   <=====
>>    0:   44                        inc    %esp   <=====
>> Code;  ffffffff88107288 <_end+7ac9288/7efc2000>
>>    1:   8b 28                     mov    (%eax),%ebp
>
>0x44 is a REX prefix in 64-bit mode, so somehow ksymoops got it
>wrong and gave you an i386-mode decode instead of 64-bit mode.
>Did you run it on an i386 machine and it assumed i386? Maybe you
>need to use "-a x86-64"?  (I can't make it work on my setup.)
>
>So it's really "mov (%r8),%ebp" if I am reading the manual right.

ksymoops -VKLMO -t elf64-x86-64 -a i386:x86-64

ksymoops 2.4.11 on i686 2.6.16.21-0.13-smp.  Options used
     -V (specified)
     -K (specified)
     -L (specified)
     -O (specified)
     -M (specified)
     -t elf64-x86-64 -a i386:x86-64

Warning (merge_maps): no symbols in merged map
Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9

Code;  0000000000000000 No symbols available
0000000000000000 <_RIP>:
Code;  0000000000000000 No symbols available
   0:   44 8b 28                  mov    (%rax),%r13d
Code;  0000000000000003 No symbols available
   3:   c7 45 d0 00 00 00 00      movl   $0x0,0xffffffffffffffd0(%rbp)
Code;  000000000000000a No symbols available
   a:   45 85 ed                  test   %r13d,%r13d
Code;  000000000000000d No symbols available
   d:   0f 89 29 fb ff ff         jns    fffffffffffffb3c <_RIP+0xfffffffffffffb3c>
Code;  0000000000000013 No symbols available
  13:   e9 00 00 00 00            jmpq   18 <_RIP+0x18>
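
The first instruction in that listing can be checked by hand.  Below is
a minimal sketch in Python, assuming only the one instruction form seen
here (optional REX prefix, opcode 0x8b, ModRM with mod=00 and a plain
base register).  The name decode_mov_load is made up for illustration,
and this is nowhere near a general x86-64 decoder.  In REX byte 0x44
only the R bit is set, and R extends the ModRM reg field (the
destination), not the base register.  That gives "mov (%rax),%r13d";
"mov (%r8),%ebp" would need REX.B instead, i.e. prefix 0x41.

```python
# Hand-decode "REX? 8b /r" with mod=00 and a plain base register.
# Illustrative only: every other encoding is rejected.

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"] + [
    f"r{n}d" for n in range(8, 16)
]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + [
    f"r{n}" for n in range(8, 16)
]


def decode_mov_load(code: bytes) -> str:
    """Return AT&T syntax for a 64-bit-mode 'mov (base),reg' load."""
    pos = 0
    rex = 0
    # In 64-bit mode, bytes 0x40..0x4f are REX prefixes (0100WRXB).
    if 0x40 <= code[pos] <= 0x4F:
        rex = code[pos]
        pos += 1
    if code[pos] != 0x8B:
        raise ValueError("only opcode 0x8b (mov r, r/m) is handled")
    modrm = code[pos + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod=00 with rm=100 means a SIB byte follows, rm=101 means
    # RIP-relative addressing; neither is handled in this sketch.
    if mod != 0 or rm in (4, 5):
        raise ValueError("only mod=00 with a plain base register is handled")
    rex_w = (rex >> 3) & 1  # 64-bit operand size
    rex_r = (rex >> 2) & 1  # extends ModRM.reg
    rex_b = rex & 1         # extends ModRM.rm
    dest = (REG64 if rex_w else REG32)[reg | (rex_r << 3)]
    base = REG64[rm | (rex_b << 3)]
    return f"mov (%{base}),%{dest}"


if __name__ == "__main__":
    print(decode_mov_load(bytes.fromhex("448b28")))
```

The register dump in the original report has RAX: 6b6b6b6b6b6b6b6b and
RIP pointing at this instruction, so the load through %rax is the
faulting access.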



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-14  0:21 ` Keith Owens
@ 2006-08-14  0:35   ` Andrew Morton
  2006-08-14  0:54     ` Keith Owens
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2006-08-14  0:35 UTC (permalink / raw)
  To: Keith Owens
  Cc: Chuck Ebbert, Rafael J. Wysocki, Stephen Hemminger, linux-kernel,
	linux-netdev

On Mon, 14 Aug 2006 10:21:55 +1000
Keith Owens <kaos@ocs.com.au> wrote:

> ksymoops -VKLMO -t elf64-x86-64 -a i386:x86-64

box:/home/akpm> ksymoops -VKLMO -t elf64-x86-64 -a i386:x86-64 < x
ksymoops 2.4.11 on x86_64 2.6.17-rc5.  Options used
     -V (specified)
     -K (specified)
     -L (specified)
     -O (specified)
     -M (specified)
     -t elf64-x86-64 -a i386:x86-64

Warning (merge_maps): no symbols in merged map
CPU 0
...
 [<ffffffff80471e5b>] _spin_unlock_irq+0x2b/0x60
 [<ffffffff8020a2c0>] restore_args+0x0/0x30
 [<ffffffff80243620>] kthread+0x0/0x110
 [<ffffffff8020a6fe>] child_rip+0x0/0x12
Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
Error (Oops_bfd_perror): /tmp/ksymoops.0lrVNY Invalid bfd target

box:/home/akpm> rpm -qi ksymoops 
Name        : ksymoops                     Relocations: (not relocatable)
Version     : 2.4.11                            Vendor: (none)
Release     : 1                             Build Date: Sat Jan  8 05:43:45 2005
Install Date: Wed Jun 28 16:59:45 2006      Build Host: ocs3.ocs.com.au
Group       : Utilities/System              Source RPM: ksymoops-2.4.11-1.src.rpm
Size        : 542288                           License: GPL
Signature   : (none)
Summary     : Kernel oops and error message decoder
Description :
The Linux kernel produces error messages that contain machine specific
numbers which are meaningless for debugging.  ksymoops reads machine
specific files and the error log and converts the addresses to
meaningful symbols and offsets.



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-14  0:35   ` Andrew Morton
@ 2006-08-14  0:54     ` Keith Owens
  2006-08-14  1:06       ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Keith Owens @ 2006-08-14  0:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chuck Ebbert, Rafael J. Wysocki, Stephen Hemminger, linux-kernel,
	linux-netdev

Andrew Morton (on Sun, 13 Aug 2006 17:35:03 -0700) wrote:
>On Mon, 14 Aug 2006 10:21:55 +1000
>Keith Owens <kaos@ocs.com.au> wrote:
>
>> ksymoops -VKLMO -t elf64-x86-64 -a i386:x86-64
>
>box:/home/akpm> ksymoops -VKLMO -t elf64-x86-64 -a i386:x86-64 < x
>ksymoops 2.4.11 on x86_64 2.6.17-rc5.  Options used
>     -V (specified)
>     -K (specified)
>     -L (specified)
>     -O (specified)
>     -M (specified)
>     -t elf64-x86-64 -a i386:x86-64
>
>Warning (merge_maps): no symbols in merged map
>CPU 0
>...
> [<ffffffff80471e5b>] _spin_unlock_irq+0x2b/0x60
> [<ffffffff8020a2c0>] restore_args+0x0/0x30
> [<ffffffff80243620>] kthread+0x0/0x110
> [<ffffffff8020a6fe>] child_rip+0x0/0x12
>Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
>Error (Oops_bfd_perror): /tmp/ksymoops.0lrVNY Invalid bfd target
>
>box:/home/akpm> rpm -qi ksymoops 
>Name        : ksymoops                     Relocations: (not relocatable)
>Version     : 2.4.11                            Vendor: (none)
>Release     : 1                             Build Date: Sat Jan  8 05:43:45 2005
>Install Date: Wed Jun 28 16:59:45 2006      Build Host: ocs3.ocs.com.au
>Group       : Utilities/System              Source RPM: ksymoops-2.4.11-1.src.rpm

Back in 2000 there were a lot of version problems between ksymoops and
libbfd and libiberty, so I statically link against these libraries when
I build the rpm.  You have an i386 version of ksymoops, which was built
against an i386-only version of libbfd, so it does not support the
elf64-x86-64 target.  Grab the ksymoops src.rpm and rebuild on x86_64, or use
a binary rpm from an x86_64 distribution.



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-14  0:54     ` Keith Owens
@ 2006-08-14  1:06       ` Andrew Morton
  2006-08-14  1:10         ` Keith Owens
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2006-08-14  1:06 UTC (permalink / raw)
  To: Keith Owens
  Cc: Chuck Ebbert, Rafael J. Wysocki, Stephen Hemminger, linux-kernel,
	linux-netdev

On Mon, 14 Aug 2006 10:54:21 +1000
Keith Owens <kaos@ocs.com.au> wrote:

> >Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
> >Error (Oops_bfd_perror): /tmp/ksymoops.0lrVNY Invalid bfd target
> >
> >box:/home/akpm> rpm -qi ksymoops 
> >Name        : ksymoops                     Relocations: (not relocatable)
> >Version     : 2.4.11                            Vendor: (none)
> >Release     : 1                             Build Date: Sat Jan  8 05:43:45 2005
> >Install Date: Wed Jun 28 16:59:45 2006      Build Host: ocs3.ocs.com.au
> >Group       : Utilities/System              Source RPM: ksymoops-2.4.11-1.src.rpm
> 
> Back in 2000 there were a lot of version problems between ksymoops and
> libbfd and libiberty, so I statically link against these libraries when
> I build the rpm.  You have an i386 version of ksymoops, which was built
> against an i386-only version of libbfd, so it does not support the
> elf64-x86-64 target.  Grab the ksymoops src.rpm and rebuild on x86_64, or use
> a binary rpm from an x86_64 distribution.

But would such a binary be able to decode i386 oopses?

ftp://ftp.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/ksymoops-2.4.11-1.src.rpm
fails to build, btw.  Had to do s/Copyright/License/ in the spec file.



* Re: 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend
  2006-08-14  1:06       ` Andrew Morton
@ 2006-08-14  1:10         ` Keith Owens
  0 siblings, 0 replies; 18+ messages in thread
From: Keith Owens @ 2006-08-14  1:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chuck Ebbert, Rafael J. Wysocki, Stephen Hemminger, linux-kernel,
	linux-netdev

Andrew Morton (on Sun, 13 Aug 2006 18:06:02 -0700) wrote:
>On Mon, 14 Aug 2006 10:54:21 +1000
>Keith Owens <kaos@ocs.com.au> wrote:
>
>> >Code: 44 8b 28 c7 45 d0 00 00 00 00 45 85 ed 0f 89 29 fb ff ff e9
>> >Error (Oops_bfd_perror): /tmp/ksymoops.0lrVNY Invalid bfd target
>> >
>> >box:/home/akpm> rpm -qi ksymoops 
>> >Name        : ksymoops                     Relocations: (not relocatable)
>> >Version     : 2.4.11                            Vendor: (none)
>> >Release     : 1                             Build Date: Sat Jan  8 05:43:45 2005
>> >Install Date: Wed Jun 28 16:59:45 2006      Build Host: ocs3.ocs.com.au
>> >Group       : Utilities/System              Source RPM: ksymoops-2.4.11-1.src.rpm
>> 
>> Back in 2000 there were a lot of version problems between ksymoops and
>> libbfd and libiberty, so I statically link against these libraries when
>> I build the rpm.  You have an i386 version of ksymoops, which was built
>> against an i386-only version of libbfd, so it does not support the
>> elf64-x86-64 target.  Grab the ksymoops src.rpm and rebuild on x86_64, or use
>> a binary rpm from an x86_64 distribution.
>
>But would such a binary be able to decode i386 oopses?

It depends on your versions of bfdutils and binutils.  ksymoops does
not decode the object itself, it uses bfd and objdump to do the work.
FWIW, the version of ksymoops in SUSE Linux 10.0 for x86_64 will handle
both i386 and x86_64.
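
Since the decoding is done by bfd and objdump anyway, the Code: bytes
can also be fed to objdump directly.  A rough sketch in Python, assuming
a binutils objdump in PATH that was built with x86-64 support; the
helper names code_line_to_bytes and disassemble are made up for
illustration.  It writes the bytes to a temporary file and disassembles
it as a raw binary with "-D -b binary -m i386:x86-64" (use "-m i386"
for a 32-bit oops).

```python
import re
import shutil
import subprocess
import tempfile


def code_line_to_bytes(line: str) -> bytes:
    """Turn an oops 'Code: 44 8b 28 ...' line into raw bytes."""
    text = line.split("Code:", 1)[1]
    # Some oops formats bracket the faulting byte, e.g. <44>; the word
    # boundaries let the regex pick the hex pair out of the brackets.
    pairs = re.findall(r"\b[0-9a-fA-F]{2}\b", text)
    return bytes(int(p, 16) for p in pairs)


def disassemble(code: bytes, machine: str = "i386:x86-64") -> str:
    """Disassemble raw bytes with the system objdump and return its output."""
    objdump = shutil.which("objdump")
    if objdump is None:
        raise RuntimeError("objdump not found in PATH")
    with tempfile.NamedTemporaryFile(suffix=".bin") as tmp:
        tmp.write(code)
        tmp.flush()
        result = subprocess.run(
            [objdump, "-D", "-b", "binary", "-m", machine, tmp.name],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout
```

Usage would be along the lines of
print(disassemble(code_line_to_bytes(oops_line))).  The Code: line is
cut off after the trailing e9, so the last instruction in the output
will be incomplete.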

>ftp://ftp.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/ksymoops-2.4.11-1.src.rpm
>fails to build, btw.  Had to do s/Copyright/License/ in the spec file.

Ah, the joys of changing RPM standards.



end of thread, other threads:[~2006-08-14  1:10 UTC | newest]

Thread overview: 18+ messages
2006-08-12 10:07 2.6.18-rc3-mm2 (+ hotfixes): GPF related to skge on suspend Rafael J. Wysocki
2006-08-12 12:28 ` Andrew Morton
2006-08-12 13:39   ` Jeff Garzik
2006-08-12 14:32     ` Rafael J. Wysocki
2006-08-12 19:34       ` Rafael J. Wysocki
2006-08-12 14:31   ` Rafael J. Wysocki
2006-08-12 16:12     ` Edgar E. Iglesias
2006-08-12 17:13       ` Rafael J. Wysocki
2006-08-12 18:16         ` Edgar E. Iglesias
2006-08-12 19:32           ` Rafael J. Wysocki
2006-08-13  5:56           ` Stephen Hemminger
  -- strict thread matches above, loose matches on Subject: below --
2006-08-13  8:53 Chuck Ebbert
2006-08-13 17:38 ` Andrew Morton
2006-08-14  0:21 ` Keith Owens
2006-08-14  0:35   ` Andrew Morton
2006-08-14  0:54     ` Keith Owens
2006-08-14  1:06       ` Andrew Morton
2006-08-14  1:10         ` Keith Owens
