netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] e100: Fix the TX workqueue race
@ 2010-04-23 14:34 Alan Cox
  2010-04-23 16:20 ` Jeff Garzik
  2010-04-23 23:31 ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Alan Cox @ 2010-04-23 14:34 UTC (permalink / raw)
  To: netdev, e1000-devel

I'd assumed someone would have picked up on this and fixed it using rtnl_lock
as was suggested but it seems to have fallen through the cracks ?

Anyway this is I assume what was meant ?

---

Nothing stops the workqueue being left to run in parallel with close or a
few other operations. This causes double unmaps and the like.

See kerneloops.org #1041230 for an example

Signed-off-by: Alan Cox <alan@linux.intel.com>
---

 drivers/net/e100.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)


diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 3e8d000..859e833 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2280,8 +2280,13 @@ static void e100_tx_timeout_task(struct work_struct *work)
 
 	netif_printk(nic, tx_err, KERN_DEBUG, nic->netdev,
 		     "scb.status=0x%02X\n", ioread8(&nic->csr->scb.status));
-	e100_down(netdev_priv(netdev));
-	e100_up(netdev_priv(netdev));
+
+	rtnl_lock();
+	if (netif_running(dev)) {
+		e100_down(netdev_priv(netdev));
+		e100_up(netdev_priv(netdev));
+	}
+	rtnl_unlock();
 }
 
 static int e100_loopback_test(struct nic *nic, enum loopback loopback_mode)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-23 14:34 [PATCH] e100: Fix the TX workqueue race Alan Cox
@ 2010-04-23 16:20 ` Jeff Garzik
  2010-04-23 23:31 ` David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Garzik @ 2010-04-23 16:20 UTC (permalink / raw)
  To: Alan Cox; +Cc: e1000-devel, netdev

On 04/23/2010 10:34 AM, Alan Cox wrote:
> I'd assumed someone would have picked up on this and fixed it using rtnl_lock
> as was suggested but it seems to have fallen through the cracks ?
>
> Anyway this is I assume what was meant ?
>
> ---
>
> Nothing stops the workqueue being left to run in parallel with close or a
> few other operations. This causes double unmaps and the like.
>
> See kerneloops.org #1041230 for an example
>
> Signed-off-by: Alan Cox<alan@linux.intel.com>

Acked-by: Jeff Garzik <jgarzik@redhat.com>

Glad someone finally fixed this, it has bugged me for years...



------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-23 14:34 [PATCH] e100: Fix the TX workqueue race Alan Cox
  2010-04-23 16:20 ` Jeff Garzik
@ 2010-04-23 23:31 ` David Miller
  2010-04-23 23:35   ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2010-04-23 23:31 UTC (permalink / raw)
  To: alan; +Cc: netdev, e1000-devel

From: Alan Cox <alan@linux.intel.com>
Date: Fri, 23 Apr 2010 15:34:43 +0100

> I'd assumed someone would have picked up on this and fixed it using rtnl_lock
> as was suggested but it seems to have fallen through the cracks ?
> 
> Anyway this is I assume what was meant ?

I hope this doesn't deadlock with linkwatch, as that's generally
a problem we hit with trying to take RTNL from workqueues in
the networking.

Linkwatch takes the RTNL lock, and then can make calls into the driver
in it's main work loop.

But since you don't hold any driver locks here (you can't as if we did
we couldn't take the RTNL lock here at all) so it should be OK.

I'll apply this to net-2.6, thanks Alan.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-23 23:31 ` David Miller
@ 2010-04-23 23:35   ` David Miller
  2010-04-24 10:36     ` Alan Cox
  2010-04-24 11:11     ` Alan Cox
  0 siblings, 2 replies; 9+ messages in thread
From: David Miller @ 2010-04-23 23:35 UTC (permalink / raw)
  To: alan; +Cc: netdev, e1000-devel

From: David Miller <davem@davemloft.net>
Date: Fri, 23 Apr 2010 16:31:27 -0700 (PDT)

> I'll apply this to net-2.6, thanks Alan.

Nevermind...

Doesn't apply to net-2.6, but even when I fix that up it doesn't
even compile.  There is no 'dev' variable present etc.

You even use a combination of "dev" and "netdev" in the resulting
code block.

If it doesn't even build, I doubt it's been tested either.

Please resolve this and get some testing on it, thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-23 23:35   ` David Miller
@ 2010-04-24 10:36     ` Alan Cox
  2010-04-25  3:00       ` David Miller
  2010-04-24 11:11     ` Alan Cox
  1 sibling, 1 reply; 9+ messages in thread
From: Alan Cox @ 2010-04-24 10:36 UTC (permalink / raw)
  To: David Miller; +Cc: e1000-devel, netdev

On Fri, 23 Apr 2010 16:35:45 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: David Miller <davem@davemloft.net>
> Date: Fri, 23 Apr 2010 16:31:27 -0700 (PDT)
> 
> > I'll apply this to net-2.6, thanks Alan.
> 
> Nevermind...
> 
> Doesn't apply to net-2.6, but even when I fix that up it doesn't
> even compile.  There is no 'dev' variable present etc.
> 
> You even use a combination of "dev" and "netdev" in the resulting
> code block.
> 
> If it doesn't even build, I doubt it's been tested either.

Puzzling as it came from a building -next tree. Will see whats happened
next week if I get time, but I'm afraid net stuff isn't a priority - in
fact its disappointing that having diagnosed a bug months ago (which
was the hard bit) and posted a test patch months ago the maintainers
haven't fixed it.

Alan

------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-23 23:35   ` David Miller
  2010-04-24 10:36     ` Alan Cox
@ 2010-04-24 11:11     ` Alan Cox
  2010-04-25  2:58       ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Alan Cox @ 2010-04-24 11:11 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, e1000-devel

On Fri, 23 Apr 2010 16:35:45 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: David Miller <davem@davemloft.net>
> Date: Fri, 23 Apr 2010 16:31:27 -0700 (PDT)
> 
> > I'll apply this to net-2.6, thanks Alan.
> 
> Nevermind...
> 
> Doesn't apply to net-2.6, but even when I fix that up it doesn't
> even compile.  There is no 'dev' variable present etc.
> 
> You even use a combination of "dev" and "netdev" in the resulting
> code block.
> 
> If it doesn't even build, I doubt it's been tested either.

No idea why it won't apply - I guess net has diverged from -next in
this area. Other problem is not typing "stg ref" before "stg export"

(If it doesn't apply I'll look at it next weekend when -next and -net ought to be back in sync ?)


commit 526fb792b745da7c9532725a1a6ecc83a01110cf
Author: Alan Cox <alan@linux.intel.com>
Date:   Sat Apr 24 12:09:23 2010 +0100

    e100: Fix the TX workqueue race
    
    Nothing stops the workqueue being left to run in parallel with close or a
    few other operations. This causes double unmaps and the like.
    
    See kerneloops.org #1041230 for an example
    
    Signed-off-by: Alan Cox <alan@linux.intel.com>

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 3e8d000..ef97bfc 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -168,6 +168,7 @@
 #include <linux/ethtool.h>
 #include <linux/string.h>
 #include <linux/firmware.h>
+#include <linux/rtnetlink.h>
 #include <asm/unaligned.h>
 
 
@@ -2280,8 +2281,13 @@ static void e100_tx_timeout_task(struct work_struct *work)
 
 	netif_printk(nic, tx_err, KERN_DEBUG, nic->netdev,
 		     "scb.status=0x%02X\n", ioread8(&nic->csr->scb.status));
-	e100_down(netdev_priv(netdev));
-	e100_up(netdev_priv(netdev));
+
+	rtnl_lock();
+	if (netif_running(netdev)) {
+		e100_down(netdev_priv(netdev));
+		e100_up(netdev_priv(netdev));
+	}
+	rtnl_unlock();
 }
 
 static int e100_loopback_test(struct nic *nic, enum loopback loopback_mode)

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-24 11:11     ` Alan Cox
@ 2010-04-25  2:58       ` David Miller
  2010-04-25  4:10         ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2010-04-25  2:58 UTC (permalink / raw)
  To: alan; +Cc: e1000-devel, netdev

From: Alan Cox <alan@linux.intel.com>
Date: Sat, 24 Apr 2010 12:11:27 +0100

> No idea why it won't apply - I guess net has diverged from -next in
> this area. Other problem is not typing "stg ref" before "stg export"

It has, the debug print statement above the lines you are changing are
completely different.

Please generate this patch against net-2.6 so I can apply it, thanks
Alan.

------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-24 10:36     ` Alan Cox
@ 2010-04-25  3:00       ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2010-04-25  3:00 UTC (permalink / raw)
  To: alan; +Cc: netdev, e1000-devel

From: Alan Cox <alan@linux.intel.com>
Date: Sat, 24 Apr 2010 11:36:29 +0100

> Puzzling as it came from a building -next tree. Will see whats
> happened next week if I get time, but I'm afraid net stuff isn't a
> priority - in fact its disappointing that having diagnosed a bug
> months ago (which was the hard bit) and posted a test patch months
> ago the maintainers haven't fixed it.

It's disappointing to me that someone as experienced and skilled
as yourself can't generate a clean patch which is 1) against
the appropriate tree for a bug fix and 2) actually compiles.

Or is this too much to ask? :-)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] e100: Fix the TX workqueue race
  2010-04-25  2:58       ` David Miller
@ 2010-04-25  4:10         ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2010-04-25  4:10 UTC (permalink / raw)
  To: alan; +Cc: e1000-devel, netdev

From: David Miller <davem@davemloft.net>
Date: Sat, 24 Apr 2010 19:58:59 -0700 (PDT)

> Please generate this patch against net-2.6 so I can apply it, thanks
> Alan.

Nevermind, I took care of this for you.

------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-04-25  4:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-23 14:34 [PATCH] e100: Fix the TX workqueue race Alan Cox
2010-04-23 16:20 ` Jeff Garzik
2010-04-23 23:31 ` David Miller
2010-04-23 23:35   ` David Miller
2010-04-24 10:36     ` Alan Cox
2010-04-25  3:00       ` David Miller
2010-04-24 11:11     ` Alan Cox
2010-04-25  2:58       ` David Miller
2010-04-25  4:10         ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).