* [PATCH] e100: Fix the TX workqueue race
@ 2010-04-23 14:34 Alan Cox
2010-04-23 16:20 ` Jeff Garzik
2010-04-23 23:31 ` David Miller
0 siblings, 2 replies; 9+ messages in thread
From: Alan Cox @ 2010-04-23 14:34 UTC (permalink / raw)
To: netdev, e1000-devel
I'd assumed someone would have picked up on this and fixed it using rtnl_lock
as was suggested but it seems to have fallen through the cracks ?
Anyway this is I assume what was meant ?
---
Nothing stops the workqueue being left to run in parallel with close or a
few other operations. This causes double unmaps and the like.
See kerneloops.org #1041230 for an example
Signed-off-by: Alan Cox <alan@linux.intel.com>
---
drivers/net/e100.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 3e8d000..859e833 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2280,8 +2280,13 @@ static void e100_tx_timeout_task(struct work_struct *work)
netif_printk(nic, tx_err, KERN_DEBUG, nic->netdev,
"scb.status=0x%02X\n", ioread8(&nic->csr->scb.status));
- e100_down(netdev_priv(netdev));
- e100_up(netdev_priv(netdev));
+
+ rtnl_lock();
+ if (netif_running(dev)) {
+ e100_down(netdev_priv(netdev));
+ e100_up(netdev_priv(netdev));
+ }
+ rtnl_unlock();
}
static int e100_loopback_test(struct nic *nic, enum loopback loopback_mode)
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] e100: Fix the TX workqueue race
2010-04-23 14:34 [PATCH] e100: Fix the TX workqueue race Alan Cox
@ 2010-04-23 16:20 ` Jeff Garzik
2010-04-23 23:31 ` David Miller
1 sibling, 0 replies; 9+ messages in thread
From: Jeff Garzik @ 2010-04-23 16:20 UTC (permalink / raw)
To: Alan Cox; +Cc: e1000-devel, netdev
On 04/23/2010 10:34 AM, Alan Cox wrote:
> I'd assumed someone would have picked up on this and fixed it using rtnl_lock
> as was suggested but it seems to have fallen through the cracks ?
>
> Anyway this is I assume what was meant ?
>
> ---
>
> Nothing stops the workqueue being left to run in parallel with close or a
> few other operations. This causes double unmaps and the like.
>
> See kerneloops.org #1041230 for an example
>
> Signed-off-by: Alan Cox<alan@linux.intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Glad someone finally fixed this, it has bugged me for years...
------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] e100: Fix the TX workqueue race
2010-04-23 14:34 [PATCH] e100: Fix the TX workqueue race Alan Cox
2010-04-23 16:20 ` Jeff Garzik
@ 2010-04-23 23:31 ` David Miller
2010-04-23 23:35 ` David Miller
1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2010-04-23 23:31 UTC (permalink / raw)
To: alan; +Cc: netdev, e1000-devel
From: Alan Cox <alan@linux.intel.com>
Date: Fri, 23 Apr 2010 15:34:43 +0100
> I'd assumed someone would have picked up on this and fixed it using rtnl_lock
> as was suggested but it seems to have fallen through the cracks ?
>
> Anyway this is I assume what was meant ?
I hope this doesn't deadlock with linkwatch, as that's generally
a problem we hit with trying to take RTNL from workqueues in
the networking.
Linkwatch takes the RTNL lock, and then can make calls into the driver
in it's main work loop.
But since you don't hold any driver locks here (you can't as if we did
we couldn't take the RTNL lock here at all) so it should be OK.
I'll apply this to net-2.6, thanks Alan.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] e100: Fix the TX workqueue race
2010-04-23 23:31 ` David Miller
@ 2010-04-23 23:35 ` David Miller
2010-04-24 10:36 ` Alan Cox
2010-04-24 11:11 ` Alan Cox
0 siblings, 2 replies; 9+ messages in thread
From: David Miller @ 2010-04-23 23:35 UTC (permalink / raw)
To: alan; +Cc: netdev, e1000-devel
From: David Miller <davem@davemloft.net>
Date: Fri, 23 Apr 2010 16:31:27 -0700 (PDT)
> I'll apply this to net-2.6, thanks Alan.
Nevermind...
Doesn't apply to net-2.6, but even when I fix that up it doesn't
even compile. There is no 'dev' variable present etc.
You even use a combination of "dev" and "netdev" in the resulting
code block.
If it doesn't even build, I doubt it's been tested either.
Please resolve this and get some testing on it, thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] e100: Fix the TX workqueue race
2010-04-23 23:35 ` David Miller
@ 2010-04-24 10:36 ` Alan Cox
2010-04-25 3:00 ` David Miller
2010-04-24 11:11 ` Alan Cox
1 sibling, 1 reply; 9+ messages in thread
From: Alan Cox @ 2010-04-24 10:36 UTC (permalink / raw)
To: David Miller; +Cc: e1000-devel, netdev
On Fri, 23 Apr 2010 16:35:45 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 23 Apr 2010 16:31:27 -0700 (PDT)
>
> > I'll apply this to net-2.6, thanks Alan.
>
> Nevermind...
>
> Doesn't apply to net-2.6, but even when I fix that up it doesn't
> even compile. There is no 'dev' variable present etc.
>
> You even use a combination of "dev" and "netdev" in the resulting
> code block.
>
> If it doesn't even build, I doubt it's been tested either.
Puzzling as it came from a building -next tree. Will see whats happened
next week if I get time, but I'm afraid net stuff isn't a priority - in
fact its disappointing that having diagnosed a bug months ago (which
was the hard bit) and posted a test patch months ago the maintainers
haven't fixed it.
Alan
------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] e100: Fix the TX workqueue race
2010-04-24 10:36 ` Alan Cox
@ 2010-04-25 3:00 ` David Miller
0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2010-04-25 3:00 UTC (permalink / raw)
To: alan; +Cc: netdev, e1000-devel
From: Alan Cox <alan@linux.intel.com>
Date: Sat, 24 Apr 2010 11:36:29 +0100
> Puzzling as it came from a building -next tree. Will see whats
> happened next week if I get time, but I'm afraid net stuff isn't a
> priority - in fact its disappointing that having diagnosed a bug
> months ago (which was the hard bit) and posted a test patch months
> ago the maintainers haven't fixed it.
It's disappointing to me that someone as experienced and skilled
as yourself can't generate a clean patch which is 1) against
the appropriate tree for a bug fix and 2) actually compiles.
Or is this too much to ask? :-)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] e100: Fix the TX workqueue race
2010-04-23 23:35 ` David Miller
2010-04-24 10:36 ` Alan Cox
@ 2010-04-24 11:11 ` Alan Cox
2010-04-25 2:58 ` David Miller
1 sibling, 1 reply; 9+ messages in thread
From: Alan Cox @ 2010-04-24 11:11 UTC (permalink / raw)
To: David Miller; +Cc: netdev, e1000-devel
On Fri, 23 Apr 2010 16:35:45 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 23 Apr 2010 16:31:27 -0700 (PDT)
>
> > I'll apply this to net-2.6, thanks Alan.
>
> Nevermind...
>
> Doesn't apply to net-2.6, but even when I fix that up it doesn't
> even compile. There is no 'dev' variable present etc.
>
> You even use a combination of "dev" and "netdev" in the resulting
> code block.
>
> If it doesn't even build, I doubt it's been tested either.
No idea why it won't apply - I guess net has diverged from -next in
this area. Other problem is not typing "stg ref" before "stg export"
(If it doesn't apply I'll look at it next weekend when -next and -net ought to be back in sync ?)
commit 526fb792b745da7c9532725a1a6ecc83a01110cf
Author: Alan Cox <alan@linux.intel.com>
Date: Sat Apr 24 12:09:23 2010 +0100
e100: Fix the TX workqueue race
Nothing stops the workqueue being left to run in parallel with close or a
few other operations. This causes double unmaps and the like.
See kerneloops.org #1041230 for an example
Signed-off-by: Alan Cox <alan@linux.intel.com>
diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 3e8d000..ef97bfc 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -168,6 +168,7 @@
#include <linux/ethtool.h>
#include <linux/string.h>
#include <linux/firmware.h>
+#include <linux/rtnetlink.h>
#include <asm/unaligned.h>
@@ -2280,8 +2281,13 @@ static void e100_tx_timeout_task(struct work_struct *work)
netif_printk(nic, tx_err, KERN_DEBUG, nic->netdev,
"scb.status=0x%02X\n", ioread8(&nic->csr->scb.status));
- e100_down(netdev_priv(netdev));
- e100_up(netdev_priv(netdev));
+
+ rtnl_lock();
+ if (netif_running(netdev)) {
+ e100_down(netdev_priv(netdev));
+ e100_up(netdev_priv(netdev));
+ }
+ rtnl_unlock();
}
static int e100_loopback_test(struct nic *nic, enum loopback loopback_mode)
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2010-04-25 4:10 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-23 14:34 [PATCH] e100: Fix the TX workqueue race Alan Cox
2010-04-23 16:20 ` Jeff Garzik
2010-04-23 23:31 ` David Miller
2010-04-23 23:35 ` David Miller
2010-04-24 10:36 ` Alan Cox
2010-04-25 3:00 ` David Miller
2010-04-24 11:11 ` Alan Cox
2010-04-25 2:58 ` David Miller
2010-04-25 4:10 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).