* networking oops after resume from s2ram (2.6.28-rc6)
From: Marcin Slusarz @ 2008-11-28 21:15 UTC
To: LKML; +Cc: netdev

Hi

Sometimes after resume from s2ram networking doesn't work, so I restart it by
/etc/init.d/net.eth1 restart. Recently it started to lock up my box completely,
but today it oopsed only (and killed my keyboard, so I had to save dmesg with
mouse :D).

It looks like it tries to use netconsole without working network interface...

[ 1621.013789] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 1621.013793] IP: [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba
[ 1621.013802] PGD 16880067 PUD 16894067 PMD 0
[ 1621.013806] Oops: 0000 [#1] PREEMPT
[ 1621.013808] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
[ 1621.013812] Dumping ftrace buffer:
[ 1621.013814] (ftrace buffer empty)
[ 1621.013816] CPU 0
[ 1621.013819] Pid: 6725, comm: ip Not tainted 2.6.28-rc6-stack+mode #51
[ 1621.013821] RIP: 0010:[<ffffffff803f5db9>] [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba
[ 1621.013825] RSP: 0018:ffff8800168c38b8 EFLAGS: 00010006
[ 1621.013827] RAX: 000000000000007f RBX: ffff88003e9c7090 RCX: 0000000000000001
[ 1621.013829] RDX: 0000000000000001 RSI: ffff88003e8b2840 RDI: ffff88003e95d700
[ 1621.013832] RBP: ffff8800168c3918 R08: 0000000000000002 R09: 0000000000000000
[ 1621.013834] R10: 0000000000000006 R11: 0000000000000000 R12: ffff88003e95d700
[ 1621.013836] R13: ffff88003e9c7090 R14: 0000000000000000 R15: 0000000000000001
[ 1621.013838] FS: 00007f47e467f6f0(0000) GS:ffffffff80738180(0000) knlGS:0000000000000000
[ 1621.013841] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1621.013843] CR2: 0000000000000000 CR3: 000000001689a000 CR4: 00000000000006e0
[ 1621.013845] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1621.013847] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1621.013850] Process ip (pid: 6725, threadinfo ffff8800168c2000, task ffff88001688da30)
[ 1621.013851] Stack:
[ 1621.013853] ffff880026921800 0000000080737c20 ffff88003e8b2000 ffff88003e8b2840
[ 1621.013856] ffff88003e87cf00 ffff88003e8b2000 0000000000000000 ffff88003e9ec120
[ 1621.013860] 0000000000000082 ffff88003e8b2000 0000000000000000 0000000000000001
[ 1621.013864] Call Trace:
[ 1621.013866] [<ffffffff804f5a28>] netpoll_send_skb+0xcd/0x196
[ 1621.013871] [<ffffffff804f5fe7>] netpoll_send_udp+0x202/0x20e
[ 1621.013874] [<ffffffff80400081>] write_msg+0x80/0xbf
[ 1621.013879] [<ffffffff8023220d>] __call_console_drivers+0x58/0x69
[ 1621.013884] [<ffffffff8023227f>] _call_console_drivers+0x61/0x66
[ 1621.013887] [<ffffffff802323b5>] release_console_sem+0x131/0x1d4
[ 1621.013890] [<ffffffff80232a08>] vprintk+0x389/0x3b8
[ 1621.013894] [<ffffffff80232a9e>] printk+0x67/0x69
[ 1621.013897] [<ffffffff804e7785>] ? dev_set_rx_mode+0x19/0x2e
[ 1621.013902] [<ffffffff80252e95>] ? __lock_acquire+0x6dd/0x73b
[ 1621.013905] [<ffffffff80251379>] ? mark_held_locks+0x52/0x72
[ 1621.013908] [<ffffffff802361f5>] ? local_bh_enable_ip+0xba/0xd6
[ 1621.013912] [<ffffffff803f2a07>] skge_up+0x7c/0x88a
[ 1621.013916] [<ffffffff804e7795>] ? dev_set_rx_mode+0x29/0x2e
[ 1621.013920] [<ffffffff802361f5>] ? local_bh_enable_ip+0xba/0xd6
[ 1621.013923] [<ffffffff804eb050>] dev_open+0x72/0xa6
[ 1621.013926] [<ffffffff804e9211>] dev_change_flags+0xa8/0x167
[ 1621.013929] [<ffffffff80527683>] devinet_ioctl+0x26a/0x5df
[ 1621.013933] [<ffffffff8052845a>] inet_ioctl+0x92/0xaa
[ 1621.013936] [<ffffffff804de260>] sock_ioctl+0x18c/0x1b8
[ 1621.013939] [<ffffffff8029ed28>] vfs_ioctl+0x2a/0x77
[ 1621.013943] [<ffffffff8029f110>] do_vfs_ioctl+0x39b/0x3e5
[ 1621.013946] [<ffffffff802479ad>] ? up_read+0x26/0x2b
[ 1621.013950] [<ffffffff8020b36c>] ? sysret_check+0x27/0x62
[ 1621.013953] [<ffffffff8029f19c>] sys_ioctl+0x42/0x65
[ 1621.013956] [<ffffffff8020b33b>] system_call_fastpath+0x16/0x1b
[ 1621.013960] Code: 48 c1 f8 03 0f b7 52 04 69 c0 cd cc cc cc 8d 44 30 ff ff c2 39 d0 0f 8c 00 03 00 00 48 8b 75 b8 4c 8b ae 98 00 00 00 4d 8b 75 08 <41> 83 3e 00 79 04 0f 0b eb fe 4d 89 65 10 41 8b 44 24 68 45 31
[ 1621.013988] RIP [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba
[ 1621.013991] RSP <ffff8800168c38b8>
[ 1621.013993] CR2: 0000000000000000
[ 1621.013995] ---[ end trace 50a82573cfe89df3 ]---
[ 1621.013998] note: ip[6725] exited with preempt_count 3

Full dmesg and config can be found at:
http://www.kadu.net/~joi/kernel/2008.11.28/skge-net.oops/

Marcin
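
The Code: line in the oops above can be cross-checked by hand: the byte in angle brackets is the first byte of the faulting instruction, so counting the bytes before it gives the address where the dump starts. A minimal sketch in Python (not part of the thread; `faulting_byte` is a hypothetical helper name), using only values copied from the oops:

```python
import re

# Copied verbatim from the "Code:" line of the oops above.
CODE_LINE = ("Code: 48 c1 f8 03 0f b7 52 04 69 c0 cd cc cc cc 8d 44 30 ff ff c2 "
             "39 d0 0f 8c 00 03 00 00 48 8b 75 b8 4c 8b ae 98 00 00 00 4d 8b 75 08 "
             "<41> 83 3e 00 79 04 0f 0b eb fe 4d 89 65 10 41 8b 44 24 68 45 31")

# Copied from the "RIP:" line of the oops above.
RIP = 0xffffffff803f5db9


def faulting_byte(code_line):
    """Return (index, value) of the byte the kernel wrapped in <..>."""
    tokens = code_line.split()[1:]  # drop the leading "Code:" label
    for index, token in enumerate(tokens):
        match = re.fullmatch(r"<([0-9a-f]{2})>", token)
        if match:
            return index, int(match.group(1), 16)
    raise ValueError("no <..> marker found in Code: line")


index, value = faulting_byte(CODE_LINE)
first_dumped = RIP - index  # address of the first byte shown on the Code: line

print(f"{index} bytes precede the fault; faulting opcode byte is {value:#04x}")
print(f"code dump starts at {first_dumped:016x}")
```

The computed start address can be compared against the annotated disassembly posted later in the thread: the instruction listed at that address should begin with the same bytes as the Code: line (48 c1 f8 03).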
* Re: networking oops after resume from s2ram (2.6.28-rc6)
From: Andrew Morton @ 2008-11-30 3:36 UTC
To: Marcin Slusarz; +Cc: LKML, netdev

On Fri, 28 Nov 2008 22:15:40 +0100 Marcin Slusarz <marcin.slusarz@gmail.com> wrote:

> Hi
>
> Sometimes after resume from s2ram networking doesn't work, so I restart it by
> /etc/init.d/net.eth1 restart. Recently it started to lock up my box completely,
> but today it oopsed only (and killed my keyboard, so I had to save dmesg with
> mouse :D).
>
> It looks like it tries to use netconsole without working network interface...
>
> [ 1621.013789] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [ 1621.013793] IP: [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba
> [ 1621.013802] PGD 16880067 PUD 16894067 PMD 0
> [ 1621.013806] Oops: 0000 [#1] PREEMPT
> [ 1621.013808] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> [ 1621.013812] Dumping ftrace buffer:
> [ 1621.013814] (ftrace buffer empty)
> [ 1621.013816] CPU 0
> [ 1621.013819] Pid: 6725, comm: ip Not tainted 2.6.28-rc6-stack+mode #51
> [ 1621.013821] RIP: 0010:[<ffffffff803f5db9>] [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba

skge driver went splat, I guess.

It would be fun to try using this:


From: Arjan van de Ven <arjan@infradead.org>

We're struggling all the time to figure out where the code came from that
oopsed.. The script below (an adaptation of a script used by
kerneloops.org) can help developers quite a bit, at least for non-module
cases.

It works and looks like this:

[/home/arjan/linux]$ dmesg | perl scripts/markup_oops.pl vmlinux
 {
 	struct agp_memory *memory;

 	memory = agp_allocate_memory(agp_bridge, pg_count, type);
 c055c10f:	89 c2                	mov    %eax,%edx
 	if (memory == NULL)
 c055c111:	74 19                	je     c055c12c <agp_allocate_memory_wrap+0x30>
 /* This function must only be called when current_controller != NULL */
 static void agp_insert_into_pool(struct agp_memory * temp)
 {
 	struct agp_memory *prev;
 	prev = agp_fe.current_controller->pool;
 c055c113:	a1 ec dc 8f c0       	mov    0xc08fdcec,%eax
*c055c118:	8b 40 10             	mov    0x10(%eax),%eax <----- faulting instruction

 	if (prev != NULL) {
 c055c11b:	85 c0                	test   %eax,%eax
 c055c11d:	74 05                	je     c055c124 <agp_allocate_memory_wrap+0x28>
 		prev->prev = temp;
 c055c11f:	89 50 04             	mov    %edx,0x4(%eax)
 		temp->next = prev;
 c055c122:	89 02                	mov    %eax,(%edx)
 	}
 	agp_fe.current_controller->pool = temp;
 c055c124:	a1 ec dc 8f c0       	mov    0xc08fdcec,%eax
 c055c129:	89 50 10             	mov    %edx,0x10(%eax)
 	if (memory == NULL)
 		return NULL;
 	agp_insert_into_pool(memory);

so in this case, we faulted while dereferencing agp_fe.current_controller
pointer, and we get to see exactly which function and line it affects...
Personally I find this very useful, and I can see value for having this
script in the kernel for more-than-just-me to use.

Caveats:
* It only works for oopses not-in-modules
* It only works nicely for kernels compiled with CONFIG_DEBUG_INFO
* It's not very fast.
* It only works on x86

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/markup_oops.pl | 162 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 162 insertions(+)

diff -puN /dev/null scripts/markup_oops.pl
--- /dev/null
+++ a/scripts/markup_oops.pl
@@ -0,0 +1,162 @@
+#!/usr/bin/perl -w
+
+# Copyright 2008, Intel Corporation
+#
+# This file is part of the Linux kernel
+#
+# This program file is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the
+# Free Software Foundation; version 2 of the License.
+#
+# Authors:
+# Arjan van de Ven <arjan@linux.intel.com>
+
+
+my $vmlinux_name = $ARGV[0];
+
+#
+# Step 1: Parse the oops to find the EIP value
+#
+
+my $target = "0";
+while (<STDIN>) {
+	if ($_ =~ /EIP: 0060:\[\<([a-z0-9]+)\>\]/) {
+		$target = $1;
+	}
+}
+
+if ($target =~ /^f8/) {
+	print "This script does not work on modules ... \n";
+	exit;
+}
+
+if ($target eq "0") {
+	print "No oops found!\n";
+	print "Usage: \n";
+	print " dmesg | perl scripts/markup_oops.pl vmlinux\n";
+	exit;
+}
+
+my $counter = 0;
+my $state = 0;
+my $center = 0;
+my @lines;
+
+sub InRange {
+	my ($address, $target) = @_;
+	my $ad = "0x".$address;
+	my $ta = "0x".$target;
+	my $delta = hex($ad) - hex($ta);
+
+	if (($delta > -4096) && ($delta < 4096)) {
+		return 1;
+	}
+	return 0;
+}
+
+
+
+# first, parse the input into the lines array, but to keep size down,
+# we only do this for 4Kb around the sweet spot
+
+my $filename;
+
+open(FILE, "objdump -dS $vmlinux_name |") || die "Cannot start objdump";
+
+while (<FILE>) {
+	my $line = $_;
+	chomp($line);
+	if ($state == 0) {
+		if ($line =~ /^([a-f0-9]+)\:/) {
+			if (InRange($1, $target)) {
+				$state = 1;
+			}
+		}
+	} else {
+		if ($line =~ /^([a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]+)\:/) {
+			my $val = $1;
+			if (!InRange($val, $target)) {
+				last;
+			}
+			if ($val eq $target) {
+				$center = $counter;
+			}
+		}
+		$lines[$counter] = $line;
+
+		$counter = $counter + 1;
+	}
+}
+
+close(FILE);
+
+if ($counter == 0) {
+	print "No matching code found \n";
+	exit;
+}
+
+if ($center == 0) {
+	print "No matching code found \n";
+	exit;
+}
+
+my $start;
+my $finish;
+my $codelines = 0;
+my $binarylines = 0;
+# now we go up and down in the array to find how much we want to print
+
+$start = $center;
+
+while ($start > 1) {
+	$start = $start - 1;
+	my $line = $lines[$start];
+	if ($line =~ /^([a-f0-9]+)\:/) {
+		$binarylines = $binarylines + 1;
+	} else {
+		$codelines = $codelines + 1;
+	}
+	if ($codelines > 10) {
+		last;
+	}
+	if ($binarylines > 20) {
+		last;
+	}
+}
+
+
+$finish = $center;
+$codelines = 0;
+$binarylines = 0;
+while ($finish < $counter) {
+	$finish = $finish + 1;
+	my $line = $lines[$finish];
+	if ($line =~ /^([a-f0-9]+)\:/) {
+		$binarylines = $binarylines + 1;
+	} else {
+		$codelines = $codelines + 1;
+	}
+	if ($codelines > 10) {
+		last;
+	}
+	if ($binarylines > 20) {
+		last;
+	}
+}
+
+
+my $i;
+
+my $fulltext = "";
+$i = $start;
+while ($i < $finish) {
+	if ($i == $center) {
+		$fulltext = $fulltext . "*$lines[$i] <----- faulting instruction\n";
+	} else {
+		$fulltext = $fulltext . " $lines[$i]\n";
+	}
+	$i = $i +1;
+}
+
+print $fulltext;
+
_
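
The windowing step of the script (keep only the disassembly within 4 KiB of the faulting address) can be sketched as follows. This is a Python rendering for illustration only, not the posted Perl: `in_range` and `lines_near` are hypothetical names, the last sample line is made up to fall outside the window, and unlike the Perl loop this version also keeps the first in-window line.

```python
import re

WINDOW = 4096  # same +/- 4 KiB window as InRange() in the Perl script

ADDRESS_LINE = re.compile(r"^([0-9a-f]+):")  # objdump instruction lines start with "addr:"


def in_range(address_hex, target_hex, window=WINDOW):
    """True if address lies strictly within `window` bytes of target."""
    delta = int(address_hex, 16) - int(target_hex, 16)
    return -window < delta < window


def lines_near(disassembly, target_hex):
    """Collect lines from the first in-window address up to the first out-of-window one."""
    kept = []
    collecting = False
    for line in disassembly:
        match = ADDRESS_LINE.match(line)
        if match:
            near = in_range(match.group(1), target_hex)
            if collecting and not near:
                break  # walked past the window, stop reading
            collecting = collecting or near
        if collecting:
            kept.append(line)  # interleaved source lines are kept too
    return kept


sample = [
    "ffffffff803f5db5:\t4d 8b 75 08\tmov    0x8(%r13),%r14",
    "\tBUG_ON(td->control & BMU_OWN);",
    "ffffffff803f5db9:\t41 83 3e 00\tcmpl   $0x0,(%r14)",
    "ffffffff80400081:\t90\tnop",  # hypothetical far-away line, outside the window
]

for line in lines_near(sample, "ffffffff803f5db9"):
    print(line)
```

Python integers are arbitrary precision, so the same comparison works unchanged for 32-bit and 64-bit addresses.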
* Re: networking oops after resume from s2ram (2.6.28-rc6)
From: Marcin Slusarz @ 2008-11-30 12:59 UTC
To: Andrew Morton; +Cc: LKML, netdev, Arjan van de Ven

On Sat, Nov 29, 2008 at 07:36:56PM -0800, Andrew Morton wrote:
> On Fri, 28 Nov 2008 22:15:40 +0100 Marcin Slusarz <marcin.slusarz@gmail.com> wrote:
>
> > Hi
> >
> > Sometimes after resume from s2ram networking doesn't work, so I restart it by
> > /etc/init.d/net.eth1 restart. Recently it started to lock up my box completely,
> > but today it oopsed only (and killed my keyboard, so I had to save dmesg with
> > mouse :D).
> >
> > It looks like it tries to use netconsole without working network interface...
> >
> > [ 1621.013789] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > [ 1621.013793] IP: [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba
> > [ 1621.013802] PGD 16880067 PUD 16894067 PMD 0
> > [ 1621.013806] Oops: 0000 [#1] PREEMPT
> > [ 1621.013808] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> > [ 1621.013812] Dumping ftrace buffer:
> > [ 1621.013814] (ftrace buffer empty)
> > [ 1621.013816] CPU 0
> > [ 1621.013819] Pid: 6725, comm: ip Not tainted 2.6.28-rc6-stack+mode #51
> > [ 1621.013821] RIP: 0010:[<ffffffff803f5db9>] [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba
>
> skge driver went splat, I guess.
>
> It would be fun to try using this:

I had to "add support" for x86_64 oopses. Patch at the end of mail.

>
> From: Arjan van de Ven <arjan@infradead.org>
>
> We're struggling all the time to figure out where the code came from that
> oopsed.. The script below (an adaptation of a script used by
> kerneloops.org) can help developers quite a bit, at least for non-module
> cases.
>
> It works and looks like this:
>
> [/home/arjan/linux]$ dmesg | perl scripts/markup_oops.pl vmlinux
> (...)
> diff -puN /dev/null scripts/markup_oops.pl
> --- /dev/null
> +++ a/scripts/markup_oops.pl
> (...)

ffffffff803f5db9
 	u64 map;

 	if (skb_padto(skb, ETH_ZLEN))
 		return NETDEV_TX_OK;

 	if (unlikely(skge_avail(&skge->tx_ring) < skb_shinfo(skb)->nr_frags + 1))
 ffffffff803f5d75:	41 8b 94 24 a8 00 00 	mov    0xa8(%r12),%edx
 ffffffff803f5d7c:	00
 ffffffff803f5d7d:	48 2b 41 08          	sub    0x8(%rcx),%rax
 ffffffff803f5d81:	49 03 94 24 b0 00 00 	add    0xb0(%r12),%rdx
 ffffffff803f5d88:	00
 ffffffff803f5d89:	b9 01 00 00 00       	mov    $0x1,%ecx
 ffffffff803f5d8e:	48 c1 f8 03          	sar    $0x3,%rax
 ffffffff803f5d92:	0f b7 52 04          	movzwl 0x4(%rdx),%edx
 ffffffff803f5d96:	69 c0 cd cc cc cc    	imul   $0xcccccccd,%eax,%eax
 ffffffff803f5d9c:	8d 44 30 ff          	lea    -0x1(%rax,%rsi,1),%eax
 ffffffff803f5da0:	ff c2                	inc    %edx
 ffffffff803f5da2:	39 d0                	cmp    %edx,%eax
 ffffffff803f5da4:	0f 8c 00 03 00 00    	jl     ffffffff803f60aa <skge_xmit_frame+0x3a9>
 		return NETDEV_TX_BUSY;

 	e = skge->tx_ring.to_use;
 ffffffff803f5daa:	48 8b 75 b8          	mov    -0x48(%rbp),%rsi
 ffffffff803f5dae:	4c 8b ae 98 00 00 00 	mov    0x98(%rsi),%r13
 	td = e->desc;
 ffffffff803f5db5:	4d 8b 75 08          	mov    0x8(%r13),%r14
 	BUG_ON(td->control & BMU_OWN);
*ffffffff803f5db9:	41 83 3e 00          	cmpl   $0x0,(%r14) <----- faulting instruction
 ffffffff803f5dbd:	79 04                	jns    ffffffff803f5dc3 <skge_xmit_frame+0xc2>
 ffffffff803f5dbf:	0f 0b                	ud2a
 ffffffff803f5dc1:	eb fe                	jmp    ffffffff803f5dc1 <skge_xmit_frame+0xc0>
 	e->skb = skb;
 ffffffff803f5dc3:	4d 89 65 10          	mov    %r12,0x10(%r13)
 	return skb->data_len;
 }

 static inline unsigned int skb_headlen(const struct sk_buff *skb)
 {
 	return skb->len - skb->data_len;
 ffffffff803f5dc7:	41 8b 44 24 68       	mov    0x68(%r12),%eax
 }

 static inline dma_addr_t

---
From: Marcin Slusarz <marcin.slusarz@gmail.com>
Subject: [PATCH] markup_oops.pl: "add support" for x86_64

Find instruction pointer in x86_64 oopses.

-w removed because it spammed with:
"Hexadecimal number > 0xffffffff non-portable at scripts/markup_oops.pl line 52, <FILE> line 383."

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
---
 scripts/markup_oops.pl | 5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/scripts/markup_oops.pl b/scripts/markup_oops.pl
index 700a7a6..7038679 100644
--- a/scripts/markup_oops.pl
+++ b/scripts/markup_oops.pl
@@ -1,4 +1,4 @@
-#!/usr/bin/perl -w
+#!/usr/bin/perl
 
 # Copyright 2008, Intel Corporation
 #
@@ -23,6 +23,9 @@ while (<STDIN>) {
 	if ($_ =~ /EIP: 0060:\[\<([a-z0-9]+)\>\]/) {
 		$target = $1;
 	}
+	if ($_ =~ /RIP: 0010:\[\<([a-z0-9]+)\>\]/) {
+		$target = $1;
+	}
 }
 
 if ($target =~ /^f8/) {
-- 
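
Both oops formats could also be matched with a single pattern instead of two separate checks. A minimal sketch in Python, not the posted Perl, assuming the segment selector is always four hex digits as in the 0060 and 0010 values seen in this thread; `IP_LINE` and `find_target` are hypothetical names, and the EIP sample line is constructed for illustration from the faulting address in Arjan's example output.

```python
import re

# Matches "EIP: 0060:[<c055c118>]" (32-bit) and "RIP: 0010:[<ffffffff803f5db9>]" (64-bit).
# Assumption: the selector before the address is always four hex digits.
IP_LINE = re.compile(r"\b[ER]IP: [0-9a-f]{4}:\[<([0-9a-f]+)>\]")


def find_target(oops_lines):
    """Return the faulting address as a hex string, or None if no oops is found."""
    target = None
    for line in oops_lines:
        match = IP_LINE.search(line)
        if match:
            target = match.group(1)  # like the Perl loop, the last match wins
    return target


oops = [
    "[ 1621.013793] IP: [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba",
    "[ 1621.013821] RIP: 0010:[<ffffffff803f5db9>] [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba",
    "[ 1621.013988] RIP [<ffffffff803f5db9>] skge_xmit_frame+0xb8/0x3ba",
]

target = find_target(oops)
print(target)
print(int(target, 16) > 0xffffffff)  # 64-bit address, the case that made perl -w complain
```

Only the line carrying the segment selector matches; the shorter "IP:" and trailing "RIP" lines of the same oops are deliberately ignored, which mirrors what the two Perl regexes do.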