From: Stephen Hemminger <shemminger@vyatta.com>
To: Rene Mayrhofer <rene.mayrhofer@gibraltar.at>
Cc: netdev@vger.kernel.org, Richard Leitner <leitner@esys.at>
Subject: Re: Kernel oops on setting sky2 interfaces down
Date: Tue, 21 Jul 2009 09:58:53 -0700
Message-ID: <20090721095853.30f4fbda@nehalam>
In-Reply-To: <4A65EC3F.4050400@gibraltar.at>

On Tue, 21 Jul 2009 18:26:39 +0200
Rene Mayrhofer <rene.mayrhofer@gibraltar.at> wrote:

> Hi everybody,
> 
> [Please CC me in replies, I am not currently subscribed to this list.]
> 
> I have a fully reproducible kernel oops in the sky2 module in kernel
> 2.6.28.10. The kernel is a vanilla 2.6.28.10 (and I can't switch to
> anything newer at this time because of missing squashfs-lzma support),
> patched with PaX, netfilter-layer7, squashfs (with LZMA), and IMQ. The
> base system is a Debian Lenny with some updates from testing/unstable.
> 
> Whenever interfaces using the sky2 module (this box has 8 network
> interfaces in a 19" rack appliance) go down, the oops occurs:

It looks like the device is disappearing from the PCI bus when it is
brought down: the error status values in the log below are 0xffffffff
and 0xffff.  Can you reproduce this with 2.6.30.2 or 2.6.31-rc3?
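
For reference, a read from a PCI device that has stopped responding
normally completes with every bit set, so an all-ones register value is
more likely a missing device than a real error report. A minimal Python
sketch of that check follows; the helper name reads_all_ones is made up
for illustration and is not part of sky2, and the three values are the
ones printed in the quoted log:

```python
def reads_all_ones(value: int, width_bits: int = 32) -> bool:
    """Return True if every bit of a register read of the given width is set."""
    mask = (1 << width_bits) - 1
    return (value & mask) == mask


# (label, value, register width in bits), copied from the quoted log
readings = [
    ("error interrupt status", 0xFFFFFFFF, 32),
    ("PCI hardware error", 0xFFFF, 16),
    ("PCI Express error", 0xFFFFFFFF, 32),
]

for label, value, width in readings:
    verdict = "all ones, device likely gone" if reads_all_ones(value, width) else "plausible status"
    print(f"{label}: {value:#x} -> {verdict}")
```

If that is what happened here, the parity and segmentation errors
reported right after it are just the set bits of a bogus all-ones read,
not separate faults.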


> [~]# ifdown -a --exclude=lo
> [ 1535.000069] sky2 0000:01:00.0: error interrupt status=0xffffffff
> [ 1535.006649] sky2 0000:01:00.0: PCI hardware error (0xffff)
> [ 1535.012608] sky2 0000:01:00.0: PCI Express error (0xffffffff)
> [ 1535.018821] sky2 wan: ram data read parity error
> [ 1535.023827] sky2 wan: ram data write parity error
> [ 1535.028913] sky2 wan: MAC parity error
> [ 1535.032992] sky2 wan: RX parity error
> [ 1535.036983] sky2 wan: TCP segmentation error
> [ 1535.041655] general protection fault: 0000 [#1] PREEMPT SMP
> [ 1535.045601] last sysfs file:
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
> 
> [ 1535.045601] Modules linked in: xt_multiport cpufreq_userspace xt_DSCP
> xt_length xt_mark xt_dscp xt_MARK xt_CONNMARK xt_comment xt_policy
> ipt_REDIRECT ip6t_LOG xt_tcpudp ip6table_mangle iptable_mangle
> ip6table_filter ip6_tables sit tunnel4 8021q garp stp llc ipt_LOG
> xt_limit xt_state iptable_nat iptable_filter ip_tables x_tables dm_mod
> p4_clockmod speedstep_lib freq_table tun imq nf_nat_ftp nf_nat
> nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack
> nf_defrag_ipv4 ipv6 evdev parport_pc parport serio_raw pcspkr i2c_i801
> i2c_core iTCO_wdt rng_core intel_agp agpgart squashfs sqlzma unlzma loop
> aufs exportfs nls_utf8 nls_cp437 ide_generic sd_mod ide_gd_mod
> ata_generic pata_acpi ata_piix piix ide_pci_generic ide_core skge sky2
> thermal_sys
> [ 1535.045601]
> [ 1535.045601] Pid: 9960, comm: mv Not tainted (2.6.28.10 #2)
> [ 1535.045601] EIP: 0060:[<f808085a>] EFLAGS: 00010286 CPU: 0
> [ 1535.045601] EIP is at sky2_mac_intr+0x22/0x9d [sky2]
> [ 1535.045601] EAX: f8090f88 EBX: 00000001 ECX: 00000008 EDX: 000000ff
> [ 1535.045601] ESI: 00000000 EDI: f682cb80 EBP: 00000080 ESP: f5f13ed4
> [ 1535.045601]  DS: 0068 ES: 0068 FS: 00d8 GS: 0033 SS: 0068
> [ 1535.045601] Process mv (pid: 9960, ti=f5f12000 task=f4a961c0
> task.ti=f5f12000)
> [ 1535.045601] Stack:
> [ 1535.045601]  ff08340b f682cb88 ffffffff ffffffff f712b800 f80839d6
> 00000040 f682cb88
> [ 1535.045601]  00000000 00000001 f682cb80 c082111a 00000000 00000000
> 00000003 f7014b80
> [ 1535.045601]  c0a604e8 00000246 f7014b80 c0838f21 00000000 c0a604e8
> 00000101 c1d10124
> [ 1535.045601] Call Trace:
> [ 1535.045601]  [<f80839d6>] sky2_poll+0x1cb/0xbed [sky2]
> [ 1535.045601]  [<c082111a>] __wake_up+0x29/0x39
> [ 1535.045601]  [<c0a604e8>] _spin_unlock_irqrestore+0x22/0x39
> [ 1535.045601]  [<c0838f21>] __queue_work+0x4d/0x5a
> [ 1535.045601]  [<c0a604e8>] _spin_unlock_irqrestore+0x22/0x39
> [ 1535.045601]  [<c09eda45>] net_rx_action+0xb8/0x1f6
> [ 1535.045601]  [<c082f954>] __do_softirq+0x95/0x142
> [ 1535.045601]  [<c082fa49>] do_softirq+0x48/0x57
> [ 1535.045601]  [<c082fbc9>] irq_exit+0x3b/0x78
> [ 1535.045601]  [<c081218f>] smp_apic_timer_interrupt+0x75/0x7f
> [ 1535.045601]  [<c0804f48>] apic_timer_interrupt+0x28/0x30
> [ 1535.045601]  [<c0a60000>] rwsem_down_failed_common+0xa4/0x175
> [ 1535.045601] Code: c0 83 c4 14 5b 5e 5f 5d c3 55 89 d5 57 89 c7 56 53
> 89 d3 c1 e5 07 83 ec 04 8b 74 90 30 8d 85 08 0f 00 00 03 07 8a 10 88 54
> 24 03 <f6> 86 0d 05 00 00 02 74 12 0f b6 c2 50 56 68 84 5b 08 f8 e8 cd
> [ 1535.045601] EIP: [<f808085a>] sky2_mac_intr+0x22/0x9d [sky2] SS:ESP
> 0068:f5f13ed4
> [ 1535.302490] Kernel panic - not syncing: Fatal exception in interrupt
> [ 1535.309412] Rebooting in 30 seconds..
> 
> 
> Or even when doing it more slowly, interface by interface:
> 
> [~]# ifdown tun6to4; cat /proc/net/dev | cut -d: -f1 | grep -v Inter |
> grep -v face | sort -u | while read iface; do echo $iface; ifdown
> $iface; sleep 3s; done
> hb
> 
> lo
> 
> dmz
> 
> lan
> [ 1127.000261] sky2 0000:04:00.0: error interrupt status=0xffffffff
> [ 1127.007348] sky2 0000:04:00.0: PCI hardware error (0xffff)
> [ 1127.013745] sky2 0000:04:00.0: PCI Express error (0xffffffff)
> [ 1127.020468] sky2 lan: ram data read parity error
> [ 1127.025834] sky2 lan: ram data write parity error
> [ 1127.031302] sky2 lan: MAC parity error
> [ 1127.035671] sky2 lan: RX parity error
> [ 1127.039910] sky2 lan: TCP segmentation error
> [ 1127.045079] general protection fault: 0000 [#1] PREEMPT SMP
> [ 1127.048879] last sysfs file:
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
> 
> [ 1127.048879] Modules linked in: xt_multiport cpufreq_userspace xt_DSCP
> xt_length xt_mark xt_dscp xt_MARK xt_CONNMARK xt_comment xt_policy
> ipt_REDIRECT ip6t_LOG xt_tcpudp ip6table_mangle iptable_mangle
> ip6table_filter ip6_tables sit tunnel4 8021q garp stp llc ipt_LOG
> xt_limit xt_state iptable_nat iptable_filter ip_tables x_tables dm_mod
> p4_clockmod speedstep_lib freq_table tun imq nf_nat_ftp nf_nat
> nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4 nf_conntrack
> nf_defrag_ipv4 ipv6 evdev parport_pc parport pcspkr serio_raw i2c_i801
> i2c_core iTCO_wdt rng_core intel_agp agpgart squashfs sqlzma unlzma loop
> aufs exportfs nls_utf8 nls_cp437 ide_generic sd_mod ide_gd_mod
> ata_generic pata_acpi ata_piix piix ide_pci_generic ide_core skge sky2
> thermal_sys
> [ 1127.048879]
> [ 1127.048879] Pid: 20150, comm: rndc Not tainted (2.6.28.10 #2)
> [ 1127.048879] EIP: 0060:[<f808085a>] EFLAGS: 00010286 CPU: 0
> [ 1127.048879] EIP is at sky2_mac_intr+0x22/0x9d [sky2]
> [ 1127.048879] EAX: f80d8f88 EBX: 00000001 ECX: 00000008 EDX: 000000ff
> [ 1127.048879] ESI: 00000000 EDI: f68c2a80 EBP: 00000080 ESP: eb83fb38
> [ 1127.048879]  DS: 0068 ES: 0068 FS: 00d8 GS: 0000 SS: 0068
> [ 1127.048879] Process rndc (pid: 20150, ti=eb83e000 task=f695bb00
> task.ti=eb83e000)
> [ 1127.048879] Stack:
> [ 1127.048879]  ff08340b f68c2a88 ffffffff ffffffff f712c000 f80839d6
> 00000040 f68c2a88
> [ 1127.048879]  c0a78d54 f70344e0 f68c2a80 f695bb00 c0a78d54 c0a604e8
> c1d10980 c0a78d54
> [ 1127.048879]  c0827013 00000000 0000000f 00000246 f70344e0 00000102
> c0be5180 c0832dc6
> [ 1127.048879] Call Trace:
> [ 1127.048879]  [<f80839d6>] sky2_poll+0x1cb/0xbed [sky2]
> [ 1127.048879]  [<c0a604e8>] _spin_unlock_irqrestore+0x22/0x39
> [ 1127.048879]  [<c0827013>] try_to_wake_up+0x158/0x162
> [ 1127.048879]  [<c0832dc6>] process_timeout+0x0/0x5
> [ 1127.048879]  [<c09eda45>] net_rx_action+0xb8/0x1f6
> [ 1127.048879]  [<c082f954>] __do_softirq+0x95/0x142
> [ 1127.048879]  [<c082fa49>] do_softirq+0x48/0x57
> [ 1127.048879]  [<c082fbc9>] irq_exit+0x3b/0x78
> [ 1127.048879]  [<c081218f>] smp_apic_timer_interrupt+0x75/0x7f
> [ 1127.048879]  [<c0804f48>] apic_timer_interrupt+0x28/0x30
> [ 1127.048879]  [<c0867764>] get_page_from_freelist+0x2b8/0x3df
> [ 1127.048879]  [<c0867ae0>] __alloc_pages_internal+0x98/0x37f
> [ 1127.048879]  [<c0862ee0>] find_lock_page+0x10/0x43
> [ 1127.048879]  [<c0a60555>] _spin_unlock+0x10/0x23
> [ 1127.048879]  [<c086fb70>] __do_fault+0xaa/0x3bc
> [ 1127.048879]  [<c08718e1>] handle_mm_fault+0x54a/0xbfa
> [ 1127.048879]  [<c0a60555>] _spin_unlock+0x10/0x23
> [ 1127.048879]  [<c089442d>] __d_lookup+0xfa/0x116
> [ 1127.048879]  [<c088cb78>] do_lookup+0x53/0x153
> [ 1127.048879]  [<c0893375>] dput+0x16/0xfc
> [ 1127.048879]  [<c088eb25>] __link_path_walk+0xb01/0xbfb
> [ 1127.048879]  [<c0a60555>] _spin_unlock+0x10/0x23
> [ 1127.048879]  [<c086f05e>] kmap_high+0x17c/0x186
> [ 1127.048879]  [<c0819b76>] default_spin_lock_flags+0x5/0x7
> [ 1127.048879]  [<c081a64b>] do_page_fault+0x335/0x86e
> [ 1127.048879]  [<c0a60555>] _spin_unlock+0x10/0x23
> [ 1127.048879]  [<c0870a91>] unmap_vmas+0x498/0x6ab
> [ 1127.048879]  [<c087321e>] free_pgtables+0x7d/0x93
> [ 1127.048879]  [<c086d42e>] vma_prio_tree_insert+0x17/0x7f
> [ 1127.048879]  [<c0874a45>] vma_link+0x51/0x73
> [ 1127.048879]  [<c0a60555>] _spin_unlock+0x10/0x23
> [ 1127.048879]  [<c0874a5f>] vma_link+0x6b/0x73
> [ 1127.048879]  [<c08763b8>] mmap_region+0x475/0x58c
> [ 1127.048879]  [<c08767a4>] do_mmap_pgoff+0x2d5/0x326
> [ 1127.048879]  [<c08081db>] sys_mmap2+0x62/0x77
> [ 1127.048879]  [<c08081e9>] sys_mmap2+0x70/0x77
> [ 1127.048879]  [<c081a316>] do_page_fault+0x0/0x86e
> [ 1127.048879]  [<c0a60805>] error_code+0x75/0x80
> [ 1127.048879] Code: c0 83 c4 14 5b 5e 5f 5d c3 55 89 d5 57 89 c7 56 53
> 89 d3 c1 e5 07 83 ec 04 8b 74 90 30 8d 85 08 0f 00 00 03 07 8a 10 88 54
> 24 03 <f6> 86 0d 05 00 00 02 74 12 0f b6 c2 50 56 68 84 5b 08 f8 e8 cd
> [ 1127.048879] EIP: [<f808085a>] sky2_mac_intr+0x22/0x9d [sky2] SS:ESP
> 0068:eb83fb38
> [ 1127.470534] Kernel panic - not syncing: Fatal exception in interrupt
> [ 1127.478035] Rebooting in 30 seconds..
> 
> 
> 
> It seems that the oops occurs when the last network interface using the
> sky2 module goes down, although I am not completely certain about this.
> I am also fairly sure that the other patches applied to 2.6.28.10 are
> not at fault, as the same kernel works perfectly well on different
> hardware (which is not using the sky2 NIC module).
> 
> Attached are the lspci -v output and the kernel config.
> 
> Any hints on what may be wrong would be highly appreciated. I am able to
> try patches to sky2 and/or give remote ssh access to the box (although
> it will be offline for 5 minutes after triggering the oops...).
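
Both oopses quoted above fault at the same place, sky2_mac_intr+0x22,
called from sky2_poll. A minimal Python sketch for pulling the
symbol+offset/size fields out of such lines, so that two traces can be
compared mechanically, might look like this. The parse_symbol helper is
hypothetical, and the pattern only covers plain symbol names like the
ones in this log:

```python
import re

# Matches "func+0xOFF/0xSIZE" with an optional " [module]" suffix.
SYMBOL_RE = re.compile(
    r"(?P<func>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_symbol(line):
    """Return (function, offset, size, module) from an oops line, or None."""
    match = SYMBOL_RE.search(line)
    if match is None:
        return None
    return (
        match["func"],
        int(match["offset"], 16),
        int(match["size"], 16),
        match["module"],  # None when the symbol is not in a module
    )


# The faulting location is identical in both quoted oopses.
first = parse_symbol("[ 1535.045601] EIP is at sky2_mac_intr+0x22/0x9d [sky2]")
second = parse_symbol("[ 1127.048879] EIP is at sky2_mac_intr+0x22/0x9d [sky2]")
print(first, first == second)
```

The same helper works on the Call Trace lines, for example to confirm
that sky2_poll is the caller in both traces.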

Try later kernels.  


-- 
