netdev.vger.kernel.org archive mirror
From: Ganesh Venkatesan <ganesh.venkatesan@gmail.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Andrew Morton <akpm@osdl.org>,
	netdev@oss.sgi.com, hejianj@cn.ibm.com,
	linuxppc64-dev@lists.linuxppc.org.sgi.com, anton@samba.org,
	jgarzik@pobox.com
Subject: Re: Fw: [Bugme-new] [Bug 4628] New: Test server hang while running rhr (network) test on RHEL4 with kernel 2.6.12-rc1-mm4
Date: Mon, 16 May 2005 10:43:02 -0700
Message-ID: <5fc59ff3050516104367a8d5cd@mail.gmail.com>
In-Reply-To: <E1DXdL8-0005mE-00@gondolin.me.apana.org.au>

Jian:

Could you try the e100 from
http://prdownloads.sourceforge.net/e1000/e100-3.4.8.tar.gz?download?
This release (e100 3.4.8) has a fix for the problem you've encountered.
Specifically, the driver uses netif_poll_{enable|disable} to avoid the
race. The relevant hunks:

 static int e100_up(struct nic *nic)
 {
@@ -1688,13 +1753,18 @@ static int e100_up(struct nic *nic)
        if((err = e100_hw_init(nic)))
                goto err_clean_cbs;
        e100_set_multicast_list(nic->netdev);
-       e100_start_receiver(nic);
+       e100_start_receiver(nic, 0);
        mod_timer(&nic->watchdog, jiffies);
        if((err = request_irq(nic->pdev->irq, e100_intr, SA_SHIRQ,
                nic->netdev->name, nic->netdev)))
                goto err_no_irq;
-       e100_enable_irq(nic);
        netif_wake_queue(nic->netdev);
+#ifdef CONFIG_E100_NAPI
+       netif_poll_enable(nic->netdev);
+       /* enable ints _after_ enabling poll, preventing a race between
+        * disable ints+schedule */
+#endif
+       e100_enable_irq(nic);
        return 0;

 err_no_irq:
@@ -1708,11 +1778,15 @@ err_rx_clean_list:

 static void e100_down(struct nic *nic)
 {
+#ifdef CONFIG_E100_NAPI
+       /* wait here for poll to complete */
+       netif_poll_disable(nic->netdev);
+#endif
+       netif_stop_queue(nic->netdev);
        e100_hw_reset(nic);
        free_irq(nic->pdev->irq, nic->netdev);
        del_timer_sync(&nic->watchdog);
        netif_carrier_off(nic->netdev);
-       netif_stop_queue(nic->netdev);
        e100_clean_cbs(nic);
        e100_rx_clean_list(nic);
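
To make the ordering in the hunks above easier to follow, here is a
minimal userspace sketch of the handshake. It assumes the 2.6-era
helpers behave like a test-and-set on a single "poll scheduled" bit:
the interrupt path claims the bit before poll runs, poll clears it when
done, and netif_poll_disable() waits until it can claim the bit itself.
All toy_* names are hypothetical and are not kernel or e100 functions;
the real helper sleeps and retries, while the sketch makes one attempt
so the steps stay deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the kernel struct. */
struct toy_dev {
    atomic_flag rx_sched;   /* models the "poll scheduled" bit */
};

/* Interrupt path: true means the caller may now run poll. */
static bool toy_rx_schedule_prep(struct toy_dev *dev)
{
    return !atomic_flag_test_and_set(&dev->rx_sched);
}

/* End of a poll run: release the bit. */
static void toy_rx_complete(struct toy_dev *dev)
{
    atomic_flag_clear(&dev->rx_sched);
}

/* One attempt at disabling poll; false means "poll is busy, wait". */
static bool toy_poll_try_disable(struct toy_dev *dev)
{
    return !atomic_flag_test_and_set(&dev->rx_sched);
}

/* Up path: allow poll to be scheduled again. */
static void toy_poll_enable(struct toy_dev *dev)
{
    atomic_flag_clear(&dev->rx_sched);
}

/* Walks the orderings; returns 0 on success, else the failing step. */
int toy_demo(void)
{
    struct toy_dev dev = { .rx_sched = ATOMIC_FLAG_INIT };

    /* 1. An interrupt schedules poll. */
    if (!toy_rx_schedule_prep(&dev))
        return 1;
    /* 2. The down path cannot disable poll while it is running. */
    if (toy_poll_try_disable(&dev))
        return 2;
    /* 3. Poll finishes; now the down path gets the bit. */
    toy_rx_complete(&dev);
    if (!toy_poll_try_disable(&dev))
        return 3;
    /* 4. A late interrupt cannot start poll, so ring cleanup is safe. */
    if (toy_rx_schedule_prep(&dev))
        return 4;
    /* 5. The up path re-enables poll before interrupts are enabled. */
    toy_poll_enable(&dev);
    if (!toy_rx_schedule_prep(&dev))
        return 5;
    toy_rx_complete(&dev);
    return 0;
}
```

Steps 2 through 4 correspond to e100_down() calling
netif_poll_disable() before e100_rx_clean_list(); step 5 corresponds to
e100_up() calling netif_poll_enable() before e100_enable_irq().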


ganesh.


On 5/16/05, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > Might be a bug in the e100 driver, might not be.
> >
> > I assume this is the
> >
> >        BUG_ON(skb->list != NULL);
> 
> It certainly is a bug in e100.
> 
> e100_tx_timeout -> e100_down -> e100_rx_clean_list
> 
> is racing against
> 
> e100_poll -> e100_rx_clean -> e100_rx_indicate
> 
> e100_rx_clean/e100_rx_indicate takes an skb off the RX ring and
> while it's being processed e100_rx_clean_list comes along and
> frees it.
> 
> From a quick check similar problems may exist in other drivers that
> have lockless ->poll() functions with RX rings.
> 
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> 
>
