From: Ganesh Venkatesan <ganesh.venkatesan@gmail.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Andrew Morton <akpm@osdl.org>,
	netdev@oss.sgi.com, hejianj@cn.ibm.com,
	linuxppc64-dev@lists.linuxppc.org.sgi.com, anton@samba.org,
	jgarzik@pobox.com
Subject: Re: Fw: [Bugme-new] [Bug 4628] New: Test server hang while running rhr (network) test on RHEL4 with kernel 2.6.12-rc1-mm4
Date: Mon, 16 May 2005 10:43:02 -0700
Message-ID: <5fc59ff3050516104367a8d5cd@mail.gmail.com>
In-Reply-To: <E1DXdL8-0005mE-00@gondolin.me.apana.org.au>

Jian:

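Could you try e100 3.4.8? It is available from
http://prdownloads.sourceforge.net/e1000/e100-3.4.8.tar.gz?download
and has a fix for the problem you've encountered. Specifically, this
driver uses netif_poll_{enable|disable} to avoid the race: e100_down()
now waits for any in-progress poll to complete before the RX list is
cleaned, as the excerpt below shows.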

 static int e100_up(struct nic *nic)
 {
@@ -1688,13 +1753,18 @@ static int e100_up(struct nic *nic)
        if((err = e100_hw_init(nic)))
                goto err_clean_cbs;
        e100_set_multicast_list(nic->netdev);
-       e100_start_receiver(nic);
+       e100_start_receiver(nic, 0);
        mod_timer(&nic->watchdog, jiffies);
        if((err = request_irq(nic->pdev->irq, e100_intr, SA_SHIRQ,
                nic->netdev->name, nic->netdev)))
                goto err_no_irq;
-       e100_enable_irq(nic);
        netif_wake_queue(nic->netdev);
+#ifdef CONFIG_E100_NAPI
+       netif_poll_enable(nic->netdev);
+       /* enable ints _after_ enabling poll, preventing a race between
+        * disable ints+schedule */
+#endif
+       e100_enable_irq(nic);
        return 0;

 err_no_irq:
@@ -1708,11 +1778,15 @@ err_rx_clean_list:

 static void e100_down(struct nic *nic)
 {
+#ifdef CONFIG_E100_NAPI
+       /* wait here for poll to complete */
+       netif_poll_disable(nic->netdev);
+#endif
+       netif_stop_queue(nic->netdev);
        e100_hw_reset(nic);
        free_irq(nic->pdev->irq, nic->netdev);
        del_timer_sync(&nic->watchdog);
        netif_carrier_off(nic->netdev);
-       netif_stop_queue(nic->netdev);
        e100_clean_cbs(nic);
        e100_rx_clean_list(nic);
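
For readers who want to see the ordering outside the kernel, here is a minimal user-space sketch of the idea in the excerpt above: teardown first takes the "poll scheduled" bit, waiting for any in-flight poll to release it, and only then frees the ring. Everything named toy_* is hypothetical and is not e100 or kernel code. It assumes a single atomic flag is an adequate stand-in for the bit that netif_poll_disable() waits on, and it uses pthreads and C11 atomics in place of softirq context.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_RING_SIZE 8

/* Hypothetical stand-in for struct nic; none of these names come from e100. */
struct toy_nic {
    atomic_flag rx_sched;    /* plays the role of the "poll scheduled" bit */
    atomic_bool stop;        /* asks the poller thread to exit */
    atomic_long polls_done;  /* number of completed poll passes */
    unsigned *ring;          /* plays the role of the RX ring */
};

/* Take the bit ourselves, waiting until any in-flight poll has released it. */
static void toy_poll_disable(struct toy_nic *nic)
{
    while (atomic_flag_test_and_set(&nic->rx_sched))
        sched_yield();
}

/* Release the bit so that poll passes may run again. */
static void toy_poll_enable(struct toy_nic *nic)
{
    atomic_flag_clear(&nic->rx_sched);
}

/* One poll pass; it runs only if it can take the bit. */
static bool toy_poll_once(struct toy_nic *nic)
{
    if (atomic_flag_test_and_set(&nic->rx_sched))
        return false;                  /* disabled, or a pass is running */
    assert(nic->ring != NULL);         /* ring must not be freed under us */
    for (int i = 0; i < TOY_RING_SIZE; i++)
        nic->ring[i]++;                /* touch every ring entry */
    atomic_fetch_add(&nic->polls_done, 1);
    atomic_flag_clear(&nic->rx_sched);
    return true;
}

static void *toy_poller(void *arg)
{
    struct toy_nic *nic = arg;

    while (!atomic_load(&nic->stop))
        if (!toy_poll_once(nic))
            sched_yield();
    return NULL;
}

/* Same ordering as the patched e100_down(): quiesce poll, then free. */
static void toy_down(struct toy_nic *nic)
{
    toy_poll_disable(nic);
    free(nic->ring);
    nic->ring = NULL;
}

/* Returns the number of completed poll passes, or -1 on setup failure. */
static long toy_run(void)
{
    struct toy_nic nic = { .rx_sched = ATOMIC_FLAG_INIT };
    pthread_t tid;

    atomic_init(&nic.stop, false);
    atomic_init(&nic.polls_done, 0);
    nic.ring = calloc(TOY_RING_SIZE, sizeof(*nic.ring));
    if (!nic.ring)
        return -1;

    toy_poll_disable(&nic);            /* poll starts out disabled */
    if (pthread_create(&tid, NULL, toy_poller, &nic) != 0) {
        free(nic.ring);
        return -1;
    }
    toy_poll_enable(&nic);             /* "up": allow polling */
    while (atomic_load(&nic.polls_done) == 0)
        sched_yield();                 /* let at least one pass complete */

    toy_down(&nic);                    /* "down": wait for poll, then free */
    atomic_store(&nic.stop, true);
    pthread_join(tid, NULL);
    return nic.ring == NULL ? atomic_load(&nic.polls_done) : -1;
}
```

Calling toy_run() from a main() built with -pthread exercises one up/down cycle. If toy_down() freed the ring without calling toy_poll_disable() first, a poll pass could still be walking the ring while it is released, which is the shape of the race described in the quoted message below.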


ganesh.


On 5/16/05, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > Might be a bug in the e100 driver, might not be.
> >
> > I assume this is the
> >
> >        BUG_ON(skb->list != NULL);
> 
> It certainly is a bug in e100.
> 
> e100_tx_timeout -> e100_down -> e100_rx_clean_list
> 
> is racing against
> 
> e100_poll -> e100_rx_clean -> e100_rx_indicate
> 
> e100_rx_clean/e100_rx_indicate takes an skb off the RX ring and
> while it's being processed e100_rx_clean_list comes along and
> frees it.
> 
> From a quick check similar problems may exist in other drivers that
> have lockless ->poll() functions with RX rings.
> 
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> 
>
