public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* softirq oops from b44_poll
@ 2011-11-21 13:58 Xander Hover
  2011-11-21 23:17 ` Peter P Waskiewicz Jr
  0 siblings, 1 reply; 10+ messages in thread
From: Xander Hover @ 2011-11-21 13:58 UTC (permalink / raw)
  To: peter.p.waskiewicz.jr; +Cc: linux-kernel

Hi all,

I noticed the small discussion about the b44_poll OOPS and
I also have a uni-processor PC with a broadcom network device (b44)
that causes similar kernel OOPSes.

Here is a (reproducible) trace that still shows up in kernel 3.1.1:

 ------------[ cut here ]------------
    WARNING: at kernel/softirq.c:159 local_bh_enable+0x32/0x79()
    Hardware name: Dimension 2400
    Modules linked in: snd_seq_midi snd_emu10k1_synth snd_emux_synth
snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event
snd_seq snd_pcm_oss snd_mixer_oss bnep rfcomm cryptd aes_i586
aes_generic ecb btusb bluetooth rfkill ppdev snd_emu10k1 snd_rawmidi
snd_ac97_codec ac97_bus snd_pcm snd_seq_device snd_timer
snd_page_alloc dcdbas snd_util_mem parport_pc snd_hwdep snd parport
emu10k1_gp rtc_cmos gameport i2c_i801
    Pid: 0, comm: swapper Not tainted 3.1.1-gentoo #1
    Call Trace:
     [<c1022970>] warn_slowpath_common+0x65/0x7a
     [<c102699e>] ? local_bh_enable+0x32/0x79
     [<c1022994>] warn_slowpath_null+0xf/0x13
     [<c102699e>] local_bh_enable+0x32/0x79
     [<c134bfd8>] destroy_conntrack+0x7c/0x9b
     [<c134890b>] nf_conntrack_destroy+0x1f/0x26
     [<c132e3a6>] skb_release_head_state+0x74/0x83
     [<c132e286>] __kfree_skb+0xb/0x6b
     [<c132e30a>] consume_skb+0x24/0x26
     [<c127c925>] b44_poll+0xaa/0x449
     [<c1333ca1>] net_rx_action+0x3f/0xea
     [<c1026a44>] __do_softirq+0x5f/0xd5
     [<c10269e5>] ? local_bh_enable+0x79/0x79
     <IRQ>  [<c1026c32>] ? irq_exit+0x34/0x8d
     [<c1003628>] ? do_IRQ+0x74/0x87
     [<c13f5329>] ? common_interrupt+0x29/0x30
     [<c1006e18>] ? default_idle+0x29/0x3e
     [<c10015a7>] ? cpu_idle+0x2f/0x5d
     [<c13e91c5>] ? rest_init+0x79/0x7b
     [<c15c66a9>] ? start_kernel+0x297/0x29c
     [<c15c60b0>] ? i386_start_kernel+0xb0/0xb7
    ---[ end trace 583f33bb1aa207a9 ]---


However if I apply the following patch this error does not show up anymore:


diff --git a/drivers/net/ethernet/broadcom/b44.c
b/drivers/net/ethernet/broadcom/b44.c
index 4cf835d..3fb66d0 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -608,7 +608,7 @@ static void b44_tx(struct b44 *bp)
                                 skb->len,
                                 DMA_TO_DEVICE);
                rp->skb = NULL;
-               dev_kfree_skb(skb);
+               dev_kfree_skb_irq(skb);
        }

        bp->tx_cons = cons;
--
1.7.8.rc3


This is a "works for me..", however it may very well be that this only
suppresses the symptoms instead of solving the root cause.
I got this idea from a previous patch (see commit log below) which did
the exact opposite from this patch.

--------------
commit 4924f44b97a034dbf44c14b709b0b0907ee23f04
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Mon Jul 4 17:57:10 2011 -0700

    b44: use dev_kfree_skb() in b44_tx()

    b44_tx() is run from softirq handler, it can use dev_kfree_skb() instead
    of dev_kfree_skb_irq()
--------------

Any thoughts?

Kind regards,

Xander Hover

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* softirq oops from b44_poll
@ 2011-11-07 20:56 Josh Boyer
  2011-11-08  6:21 ` Peter P Waskiewicz Jr
  0 siblings, 1 reply; 10+ messages in thread
From: Josh Boyer @ 2011-11-07 20:56 UTC (permalink / raw)
  To: Gary Zambrano, netdev; +Cc: linux-kernel, kernel-team

Hi all,

We've had two reports of a WARN_ON being spit out from kernel/softirq.c
that seem fairly related in symptoms.  Both seem to involved b44_poll
either during the middle of some disk I/O.  An example of the output is
here:

:WARNING: at kernel/softirq.c:159 _local_bh_enable_ip+0x44/0x8e()
:Hardware name: Vostro 1500                     
:Modules linked in: fuse lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack sunrpc uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec
snd_hwdep snd_seq snd_seq_device snd_pcm dell_wmi sparse_keymap dell_laptop
joydev dcdbas microcode r852 sm_common nand nand_ids b44 nand_ecc r592 mtd ssb
mii memstick arc4 i2c_i801 iTCO_wdt iTCO_vendor_support iwl3945 iwl_legacy
mac80211 cfg80211 rfkill snd_timer snd soundcore snd_page_alloc firewire_ohci
firewire_core crc_itu_t uas usb_storage sdhci_pci sdhci mmc_core nouveau ttm
drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video [last unloaded:
scsi_wait_scan]
:Pid: 1511, comm: nepomukservices Not tainted 3.1.0-1.fc16.x86_64 #1
:Call Trace:
: <IRQ>  [<ffffffff81057a56>] warn_slowpath_common+0x83/0x9b
: [<ffffffff81057a88>] warn_slowpath_null+0x1a/0x1c
: [<ffffffff8105d462>] _local_bh_enable_ip+0x44/0x8e
: [<ffffffff8105d4ba>] local_bh_enable_ip+0xe/0x10
: [<ffffffff814b5af4>] _raw_spin_unlock_bh+0x15/0x17
: [<ffffffffa03cc969>] destroy_conntrack+0x9d/0xdc [nf_conntrack]
: [<ffffffff813fa083>] nf_conntrack_destroy+0x19/0x1b
: [<ffffffff813ce4ed>] skb_release_head_state+0xa7/0xef
: [<ffffffff813ce2f1>] __kfree_skb+0x13/0x83
: [<ffffffff813ce3b7>] consume_skb+0x56/0x6b
: [<ffffffffa02e48c4>] b44_poll+0xaf/0x3ec [b44]
: [<ffffffff813d8137>] net_rx_action+0xa9/0x1b8
: [<ffffffffa02e202e>] ? br32+0x19/0x1d [b44]
: [<ffffffff8105d6b3>] __do_softirq+0xc9/0x1b5
: [<ffffffff81027719>] ? ack_APIC_irq+0x15/0x17
: [<ffffffff814be32c>] call_softirq+0x1c/0x30
: [<ffffffff81010b45>] do_softirq+0x46/0x81
: [<ffffffff8105d97b>] irq_exit+0x57/0xb1
: [<ffffffff814bec0e>] do_IRQ+0x8e/0xa5
: [<ffffffff814b5d2e>] common_interrupt+0x6e/0x6e
: <EOI>  [<ffffffff814bc1f4>] ? sysret_audit+0x16/0x20

You can find the original bug reports in the URLs below.  This has happened
on two different machines, one 32-bit and another 64-bit.  I'm fairly sure
both reports are the same issue, but I haven't a clue what that issue might
be at the moment.

Thoughts?

https://bugzilla.redhat.com/show_bug.cgi?id=749856
https://bugzilla.redhat.com/show_bug.cgi?id=741117

josh


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-11-23  8:14 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-21 13:58 softirq oops from b44_poll Xander Hover
2011-11-21 23:17 ` Peter P Waskiewicz Jr
2011-11-22 11:21   ` Xander Hover
2011-11-22 13:43   ` Xander Hover
2011-11-22 20:54   ` David Miller
2011-11-22 23:16     ` Xander Hover
2011-11-23  8:14       ` Peter P Waskiewicz Jr
2011-11-23  8:13     ` Peter P Waskiewicz Jr
  -- strict thread matches above, loose matches on Subject: below --
2011-11-07 20:56 Josh Boyer
2011-11-08  6:21 ` Peter P Waskiewicz Jr

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox