From: Jeff Garzik <jgarzik@pobox.com>
To: Andreas Henriksson <andreas@fjortis.info>
Cc: Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua>,
	netdev@oss.sgi.com, Marcelo Tosatti <marcelo@conectiva.com.br>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: [PATCH] Re: fealnx oopses
Date: Fri, 26 Mar 2004 14:33:18 -0500
Message-ID: <4064857E.2050603@pobox.com>
In-Reply-To: <20040326192211.GA15319@scream.fjortis.info>

[-- Attachment #1: Type: text/plain, Size: 2797 bytes --]

Andreas Henriksson wrote:
> On Fri, Mar 26, 2004 at 12:14:57PM +0200, Denis Vlasenko wrote:
> 
> <snip>
> 
>>BTW, my box is indeed slow and low on RAM, this fits.
>>
> 
> 
> I have only been looking at problems with races between the interrupt
> handler and the rest of the driver code. There might be a bunch of
> problems with failed memory allocations that haven't bitten me.
> 
> 
>>Regarding your patch: I looked in start_tx(). Apart from latent
>>bug in commented out part of code:
>>	next = (struct fealnx *) np->cur_tx_copy.next_desc_logical;
>>which must be
>>	next = (struct fealnx_desc *) np->cur_tx_copy->next_desc_logical;
>>I can't see anything racy there. The function just submits more
>>tx buffers for the card, it never touches cur_tx or cur_tx->skbuff...
> 
> 
> Francois Romieu explains the race in a comment to the bug I opened in
> the bugzilla.
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=1902#c1
> 
> The problem is that really_tx_count and similar parts of the private
> structure aren't atomically updated and both the interrupt handler and
> the start_tx function update them.
> (they are regular integers instead of atomic_t)
> 
> 
>>If I miss something and it indeed races with interrupt, you
>>definitely need to add
>>        spin_lock(&np->lock);
>>...
>>        spin_unlock(&np->lock);
>>around interrupt handler body or at least around tx part
>>of it, or else your patch is incomplete (race will still
>>be possible on SMP).
>>
> 
> 
> I came to the conclusion yesterday that there should be a spinlock in
> the interrupt handler, but it won't affect me at all since I don't
> have SMP (nor preempt) so I'll leave it for now anyway.
> 
> 
>>Anyway, I applied your patch and flooded with UDP again.
>>My box did not oops. Unfortunately, it did not oops when
>>I reverted back to the old, presumably buggy driver either. I cannot
>>reproduce it anymore with the old driver! Bad. :(
> 
> 
> I haven't been able to reproduce a kernel panic with my patch either.
> And I've been transferring terabytes of traffic during the last weeks (or
> maybe it's months, well anyway.. I've done enough testing to say that
> the card works "good enough" in this machine at least).
> And I've even tried your udp test..
> 
> Although I now have the myson/fealnx card in my p3-900 (256MB)
> workstation instead of the old p-166 (40MB) which served as a gateway before.
> It might just be that it's harder to trigger on newer/bigger machines.
> Maybe I should power up my p-166 again.. I actually have 2 of these
> cards so I can have one in each machine.. :)

Well really, somebody needs to port Donald Becker's myson driver to 2.6 
APIs...  I would like to get rid of fealnx, or somebody needs to spend a 
decent amount of time fixing it.

Does the attached patch fix the issue?

	Jeff



[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 1037 bytes --]

===== drivers/net/fealnx.c 1.34 vs edited =====
--- 1.34/drivers/net/fealnx.c	Sun Mar 14 01:54:58 2004
+++ edited/drivers/net/fealnx.c	Fri Mar 26 14:31:07 2004
@@ -1303,14 +1303,15 @@
 	/* for the last tx descriptor */
 	np->tx_ring[i - 1].next_desc = np->tx_ring_dma;
 	np->tx_ring[i - 1].next_desc_logical = &np->tx_ring[0];
-
-	return;
 }
 
 
 static int start_tx(struct sk_buff *skb, struct net_device *dev)
 {
 	struct netdev_private *np = dev->priv;
+	unsigned long flags;
+
+	spin_lock_irqsave(&np->lock, flags);
 
 	np->cur_tx_copy->skbuff = skb;
 
@@ -1377,6 +1378,7 @@
 	writel(0, dev->base_addr + TXPDR);
 	dev->trans_start = jiffies;
 
+	spin_unlock_irqrestore(&np->lock, flags);
 	return 0;
 }
 
@@ -1423,6 +1425,8 @@
 	unsigned int num_tx = 0;
 	int handled = 0;
 
+	spin_lock(&np->lock);
+
 	writel(0, dev->base_addr + IMR);
 
 	ioaddr = dev->base_addr;
@@ -1564,6 +1568,8 @@
 		       dev->name, readl(ioaddr + ISR));
 
 	writel(np->imrvalue, ioaddr + IMR);
+
+	spin_unlock(&np->lock);
 
 	return IRQ_RETVAL(handled);
 }
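
Editor's note: the patch above makes start_tx() and the interrupt
handler take np->lock, so the two paths can no longer interleave their
updates to plain-integer fields such as really_tx_count. The block
below is only a userspace sketch of that idea, assuming POSIX threads.
It is not driver code: a pthread mutex stands in for the kernel
spinlock, two threads stand in for the transmit path and the interrupt
handler, and struct fake_priv, submit_path(), complete_path() and
run_demo() are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the driver's private structure. */
struct fake_priv {
	pthread_mutex_t lock;   /* plays the role of np->lock */
	long really_tx_count;   /* plain integer, not an atomic type */
	long iterations;        /* how many updates each path performs */
};

/* Analogue of start_tx(): account for one more queued buffer. */
static void *submit_path(void *arg)
{
	struct fake_priv *np = arg;
	long i;

	for (i = 0; i < np->iterations; i++) {
		pthread_mutex_lock(&np->lock);
		np->really_tx_count++;
		pthread_mutex_unlock(&np->lock);
	}
	return NULL;
}

/* Analogue of the tx part of the interrupt handler: retire one buffer. */
static void *complete_path(void *arg)
{
	struct fake_priv *np = arg;
	long i;

	for (i = 0; i < np->iterations; i++) {
		pthread_mutex_lock(&np->lock);
		np->really_tx_count--;
		pthread_mutex_unlock(&np->lock);
	}
	return NULL;
}

/*
 * Run both paths concurrently and return the final counter.
 * Every read-modify-write happens under the lock, so the equal number
 * of increments and decrements must cancel out exactly.
 */
long run_demo(long iterations)
{
	struct fake_priv np = { .really_tx_count = 0, .iterations = iterations };
	pthread_t submitter, completer;
	int rc;

	rc = pthread_mutex_init(&np.lock, NULL);
	assert(rc == 0);
	rc = pthread_create(&submitter, NULL, submit_path, &np);
	assert(rc == 0);
	rc = pthread_create(&completer, NULL, complete_path, &np);
	assert(rc == 0);

	pthread_join(submitter, NULL);
	pthread_join(completer, NULL);
	pthread_mutex_destroy(&np.lock);

	return np.really_tx_count;
}
```

Calling run_demo() with any iteration count should return 0. If the
lock/unlock pairs are removed, the two unsynchronized read-modify-write
sequences can lose updates and the result may drift away from 0, which
is the kind of corruption discussed in this thread. The analogy is
loose in one respect: in the kernel, the non-interrupt path must also
disable local interrupts (hence spin_lock_irqsave() in start_tx()),
something a userspace mutex has no equivalent for.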
