* [PATCH] sky2: Lock transmit queue while disabling device
@ 2009-12-31 10:55 Mike McCormack
2009-12-31 15:58 ` Michael Breuer
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Mike McCormack @ 2009-12-31 10:55 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, flyboy, dhazelton, mbreuer, "netdev
netif_device_detach() does not take the tx_lock, so it's
possible that a call to sky2_xmit_frame is still in
progress after netif_device_detach() is complete.
Take netif_tx_lock() to make sure all transmits have
stopped while we're disabling the devices and that
no other CPU is still transmitting a frame after
we've disabling the device.
Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
drivers/net/sky2.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index faa4841..8ae8520 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -3176,7 +3176,9 @@ static void sky2_reset(struct sky2_hw *hw)
static void sky2_detach(struct net_device *dev)
{
if (netif_running(dev)) {
+ netif_tx_lock(dev);
netif_device_detach(dev); /* stop txq */
+ netif_tx_unlock(dev);
sky2_down(dev);
}
}
--
1.5.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 10:55 [PATCH] sky2: Lock transmit queue while disabling device Mike McCormack
@ 2009-12-31 15:58 ` Michael Breuer
2009-12-31 16:15 ` Daniel Hazelton
2009-12-31 16:33 ` Berck Nash
` (2 subsequent siblings)
3 siblings, 1 reply; 20+ messages in thread
From: Michael Breuer @ 2009-12-31 15:58 UTC (permalink / raw)
To: Mike McCormack; +Cc: Stephen Hemminger, netdev, flyboy, dhazelton
This doesn't solve the issue on my system. Just ran a test - same
errors. I don't think that the under-load errors I'm seeing involve
sky2_detatch. I'm thinking it's related to the dhcp AF_INTERNET fix that
solved the lockup under load - that something outside of sky2 is
corrupting an skb.
On 12/31/2009 5:55 AM, Mike McCormack wrote:
> netif_device_detach() does not take the tx_lock, so it's
> possible that a call to sky2_xmit_frame is still in
> progress after netif_device_detach() is complete.
>
> Take netif_tx_lock() to make sure all transmits have
> stopped while we're disabling the devices and that
> no other CPU is still transmitting a frame after
> we've disabling the device.
>
> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>
> Signed-off-by: Mike McCormack<mikem@ring3k.org>
> ---
> drivers/net/sky2.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index faa4841..8ae8520 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -3176,7 +3176,9 @@ static void sky2_reset(struct sky2_hw *hw)
> static void sky2_detach(struct net_device *dev)
> {
> if (netif_running(dev)) {
> + netif_tx_lock(dev);
> netif_device_detach(dev); /* stop txq */
> + netif_tx_unlock(dev);
> sky2_down(dev);
> }
> }
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 15:58 ` Michael Breuer
@ 2009-12-31 16:15 ` Daniel Hazelton
0 siblings, 0 replies; 20+ messages in thread
From: Daniel Hazelton @ 2009-12-31 16:15 UTC (permalink / raw)
To: Michael Breuer; +Cc: Mike McCormack, Stephen Hemminger, netdev, flyboy
On Thursday 31 December 2009 10:58:15 am Michael Breuer wrote:
> This doesn't solve the issue on my system. Just ran a test - same
> errors. I don't think that the under-load errors I'm seeing involve
> sky2_detatch. I'm thinking it's related to the dhcp AF_INTERNET fix that
> solved the lockup under load - that something outside of sky2 is
> corrupting an skb.
and I am unable to try because I have never had the time - or the free machine
- to narrow down what, exactly is causing a similar crash here. It appears,
however, to be caused by some a**hole out there deciding to point a botnet at
that server...
(at least... that's what has been going on each time the machine has gone
down...)
DRH
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 10:55 [PATCH] sky2: Lock transmit queue while disabling device Mike McCormack
2009-12-31 15:58 ` Michael Breuer
@ 2009-12-31 16:33 ` Berck Nash
2009-12-31 18:51 ` Jarek Poplawski
2010-01-07 4:27 ` [PATCH] sky2: Lock transmit queue while disabling device David Miller
3 siblings, 0 replies; 20+ messages in thread
From: Berck Nash @ 2009-12-31 16:33 UTC (permalink / raw)
To: Mike McCormack; +Cc: Stephen Hemminger, netdev, dhazelton, mbreuer
Thanks. I'll patch and report back. Since I'm not able to reproduce
this oops on demand, I'll just have to run it for a week or so and see
if it comes comes back. I've been getting one every few days, so a week
should be enough to see if this works.
Mike McCormack wrote:
> netif_device_detach() does not take the tx_lock, so it's
> possible that a call to sky2_xmit_frame is still in
> progress after netif_device_detach() is complete.
>
> Take netif_tx_lock() to make sure all transmits have
> stopped while we're disabling the devices and that
> no other CPU is still transmitting a frame after
> we've disabling the device.
>
> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>
> Signed-off-by: Mike McCormack <mikem@ring3k.org>
> ---
> drivers/net/sky2.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index faa4841..8ae8520 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -3176,7 +3176,9 @@ static void sky2_reset(struct sky2_hw *hw)
> static void sky2_detach(struct net_device *dev)
> {
> if (netif_running(dev)) {
> + netif_tx_lock(dev);
> netif_device_detach(dev); /* stop txq */
> + netif_tx_unlock(dev);
> sky2_down(dev);
> }
> }
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 10:55 [PATCH] sky2: Lock transmit queue while disabling device Mike McCormack
2009-12-31 15:58 ` Michael Breuer
2009-12-31 16:33 ` Berck Nash
@ 2009-12-31 18:51 ` Jarek Poplawski
2009-12-31 23:51 ` Mike McCormack
2010-01-07 4:27 ` [PATCH] sky2: Lock transmit queue while disabling device David Miller
3 siblings, 1 reply; 20+ messages in thread
From: Jarek Poplawski @ 2009-12-31 18:51 UTC (permalink / raw)
To: Mike McCormack
Cc: Stephen Hemminger, netdev, flyboy, dhazelton, mbreuer,
Linux Netdev List
Mike McCormack wrote, On 12/31/2009 11:55 AM:
> netif_device_detach() does not take the tx_lock, so it's
> possible that a call to sky2_xmit_frame is still in
> progress after netif_device_detach() is complete.
>
> Take netif_tx_lock() to make sure all transmits have
> stopped while we're disabling the devices and that
> no other CPU is still transmitting a frame after
> we've disabling the device.
>
> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
Could you give some scenario of the oops/fix?
Btw, even if it worked, you should use netif_tx_lock_bh
version considering sky2_detach use contexts, I guess.
Jarek P.
>
> Signed-off-by: Mike McCormack <mikem@ring3k.org>
> ---
> drivers/net/sky2.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index faa4841..8ae8520 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -3176,7 +3176,9 @@ static void sky2_reset(struct sky2_hw *hw)
> static void sky2_detach(struct net_device *dev)
> {
> if (netif_running(dev)) {
> + netif_tx_lock(dev);
> netif_device_detach(dev); /* stop txq */
> + netif_tx_unlock(dev);
> sky2_down(dev);
> }
> }
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 18:51 ` Jarek Poplawski
@ 2009-12-31 23:51 ` Mike McCormack
2010-01-01 3:06 ` Berck E. Nash
2010-01-01 18:31 ` Jarek Poplawski
0 siblings, 2 replies; 20+ messages in thread
From: Mike McCormack @ 2009-12-31 23:51 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: Stephen Hemminger, netdev, flyboy, dhazelton, mbreuer
Hi Jarek,
This is based on my analysis of the oops at:
http://bugzilla.kernel.org/show_bug.cgi?id=14925
Specifically:
>>> [ 8673.345873] sky2 eth0: receiver hang detected
>>> [ 8673.350368] sky2 eth0: disabling interface
>>> [ 8673.354749] BUG: unable to handle kernel NULL pointer dereference at
>>> 0000000000000010
>>> [ 8673.359748] IP: [<ffffffffa00373d3>] sky2_xmit_frame+0x321/0x5d8
>>> [sky2]
netif_device_detach() does not guarantee that all transmits have completed
after it returns.
CPU 1 stack will look like:
dev_queue_xmit()
HARD_TX_LOCK() -> __netif_tx_lock()
...
dev_hard_start_xmit()
ops->ndo_start_xmit() -> sky2_xmit_frame()
sky2_xmit_frame() pushing skb to hardware
use NULL tx_ring here
CPU 2 stack will look like:
sky2_restart()
rtnl_lock()
sky2_detach()
netif_device_detach()
sky2_down()
printk("sky2 eth0: disabling interface")
...
sky2_free_buffers(sky2);
sky2->tx_ring = NULL;
...
Another way to solve the problem would be to take the transmit lock in
netif_device_detach() to make sure that any in progress transmits have
completed before returning.
Note that most of these backtraces are using the nvidia binary only
module. This may change the timings and make the sky2 race more likely,
or be involved in the "tx timeout" condition that triggers a sky2_restart().
Will test with netif_tx_lock_bh and resubmit.
thanks,
Mike
Jarek Poplawski wrote:
> Mike McCormack wrote, On 12/31/2009 11:55 AM:
>
>> netif_device_detach() does not take the tx_lock, so it's
>> possible that a call to sky2_xmit_frame is still in
>> progress after netif_device_detach() is complete.
>>
>> Take netif_tx_lock() to make sure all transmits have
>> stopped while we're disabling the devices and that
>> no other CPU is still transmitting a frame after
>> we've disabling the device.
>>
>> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>
> Could you give some scenario of the oops/fix?
> Btw, even if it worked, you should use netif_tx_lock_bh
> version considering sky2_detach use contexts, I guess.
>
> Jarek P.
>
>> Signed-off-by: Mike McCormack <mikem@ring3k.org>
>> ---
>> drivers/net/sky2.c | 2 ++
>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
>> index faa4841..8ae8520 100644
>> --- a/drivers/net/sky2.c
>> +++ b/drivers/net/sky2.c
>> @@ -3176,7 +3176,9 @@ static void sky2_reset(struct sky2_hw *hw)
>> static void sky2_detach(struct net_device *dev)
>> {
>> if (netif_running(dev)) {
>> + netif_tx_lock(dev);
>> netif_device_detach(dev); /* stop txq */
>> + netif_tx_unlock(dev);
>> sky2_down(dev);
>> }
>> }
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 23:51 ` Mike McCormack
@ 2010-01-01 3:06 ` Berck E. Nash
2010-01-01 6:42 ` Stephen Hemminger
2010-01-01 18:31 ` Jarek Poplawski
1 sibling, 1 reply; 20+ messages in thread
From: Berck E. Nash @ 2010-01-01 3:06 UTC (permalink / raw)
To: Mike McCormack
Cc: Jarek Poplawski, Stephen Hemminger, netdev, dhazelton, mbreuer
[-- Attachment #1: Type: text/plain, Size: 3059 bytes --]
Well, that didn't fix it. Oops attached, looks pretty much the same to me.
Mike McCormack wrote:
> Hi Jarek,
>
> This is based on my analysis of the oops at:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14925
>
> Specifically:
>
>>>> [ 8673.345873] sky2 eth0: receiver hang detected
>>>> [ 8673.350368] sky2 eth0: disabling interface
>>>> [ 8673.354749] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000010
>>>> [ 8673.359748] IP: [<ffffffffa00373d3>] sky2_xmit_frame+0x321/0x5d8
>>>> [sky2]
>
> netif_device_detach() does not guarantee that all transmits have completed
> after it returns.
>
> CPU 1 stack will look like:
>
> dev_queue_xmit()
> HARD_TX_LOCK() -> __netif_tx_lock()
> ...
> dev_hard_start_xmit()
> ops->ndo_start_xmit() -> sky2_xmit_frame()
> sky2_xmit_frame() pushing skb to hardware
> use NULL tx_ring here
>
>
> CPU 2 stack will look like:
>
> sky2_restart()
> rtnl_lock()
> sky2_detach()
> netif_device_detach()
> sky2_down()
> printk("sky2 eth0: disabling interface")
> ...
> sky2_free_buffers(sky2);
> sky2->tx_ring = NULL;
> ...
>
> Another way to solve the problem would be to take the transmit lock in
> netif_device_detach() to make sure that any in progress transmits have
> completed before returning.
>
> Note that most of these backtraces are using the nvidia binary only
> module. This may change the timings and make the sky2 race more likely,
> or be involved in the "tx timeout" condition that triggers a sky2_restart().
>
> Will test with netif_tx_lock_bh and resubmit.
>
> thanks,
>
> Mike
>
>
>
>
> Jarek Poplawski wrote:
>> Mike McCormack wrote, On 12/31/2009 11:55 AM:
>>
>>> netif_device_detach() does not take the tx_lock, so it's
>>> possible that a call to sky2_xmit_frame is still in
>>> progress after netif_device_detach() is complete.
>>>
>>> Take netif_tx_lock() to make sure all transmits have
>>> stopped while we're disabling the devices and that
>>> no other CPU is still transmitting a frame after
>>> we've disabling the device.
>>>
>>> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>> Could you give some scenario of the oops/fix?
>> Btw, even if it worked, you should use netif_tx_lock_bh
>> version considering sky2_detach use contexts, I guess.
>>
>> Jarek P.
>>
>>> Signed-off-by: Mike McCormack <mikem@ring3k.org>
>>> ---
>>> drivers/net/sky2.c | 2 ++
>>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
>>> index faa4841..8ae8520 100644
>>> --- a/drivers/net/sky2.c
>>> +++ b/drivers/net/sky2.c
>>> @@ -3176,7 +3176,9 @@ static void sky2_reset(struct sky2_hw *hw)
>>> static void sky2_detach(struct net_device *dev)
>>> {
>>> if (netif_running(dev)) {
>>> + netif_tx_lock(dev);
>>> netif_device_detach(dev); /* stop txq */
>>> + netif_tx_unlock(dev);
>>> sky2_down(dev);
>>> }
>>> }
>>
>
[-- Attachment #2: sky2crash2.txt --]
[-- Type: text/plain, Size: 5379 bytes --]
[ 5768.704033] sky2 eth0: receiver hang detected
[ 5768.708579] sky2 eth0: disabling interface
[ 5768.712928] BUG: unable to handle kernel NULL pointer dereference at 0000000000000ad0
[ 5768.717776] IP: [<ffffffffa003d46f>] sky2_xmit_frame+0x321/0x5d8 [sky2]
[ 5768.726935] PGD beaa3067 PUD ba837067 PMD 0
[ 5768.731121] Oops: 0002 [#1] SMP
[ 5768.731121] last sysfs file: /sys/devices/platform/coretemp.0/temp1_label
[ 5768.740188] CPU 0
[ 5768.742247] Modules linked in: nvidia(P) nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc nls_cp437 msdos fat kvm_intel kvm fuse snd_rtctimer usbhid hwmon_vid tuner_simple tuner_types wm8775 snd_hda_codec_realtek tda9887 tda8290 snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_pcm tuner snd_seq_dummy cx25840 ivtv i2c_algo_bit cx2341x snd_seq_oss uhci_hcd snd_seq_midi_event ehci_hcd v4l2_common i2c_i801 snd_seq videodev snd_timer v4l1_compat v4l2_compat_ioctl32 snd_seq_device tveeprom snd floppy sky2 usbcore soundcore snd_page_alloc [last unloaded: nvidia]
[ 5768.794811] Pid: 4, comm: ksoftirqd/0 Tainted: P 2.6.32.2 #9 P5W DH Deluxe
[ 5768.801019] RIP: 0010:[<ffffffffa003d46f>] [<ffffffffa003d46f>] sky2_xmit_frame+0x321/0x5d8 [sky2]
[ 5768.808600] RSP: 0018:ffff880001603df8 EFLAGS: 00010206
[ 5768.817679] RAX: 00000000000002b0 RBX: ffff8800bd184540 RCX: 0000000000000ac0
[ 5768.822147] RDX: 0000000000000000 RSI: 000000000000008c RDI: 0000000000000ac0
[ 5768.831325] RBP: ffff880001603e48 R08: 0000000000000001 R09: 0000000000000000
[ 5768.835840] R10: 000000000000001e R11: 0000000000000d7f R12: ffff880006a40ec8
[ 5768.844917] R13: ffff8800be922e00 R14: 0000000000560056 R15: 000000009553807e
[ 5768.853995] FS: 0000000000000000(0000) GS:ffff880001600000(0000) knlGS:0000000000000000
[ 5768.859584] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 5768.867708] CR2: 0000000000000ad0 CR3: 00000000ba8d8000 CR4: 00000000000026f0
[ 5768.872155] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5768.881235] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5768.888900] Process ksoftirqd/0 (pid: 4, threadinfo ffff8800bf8b4000, task ffff8800bf8a8650)
[ 5768.895027] Stack:
[ 5768.899363] ffff88004c485ec0 ffff88009553807e ffff8800bd184000 0000004281229811
[ 5768.904093] <0> ffff880001603e48 ffff880006a40ec8 ffff88004c485ec0 ffffffff813ead30
[ 5768.913174] <0> ffff8800bd184000 ffff8800beb1dec0 ffff880001603e98 ffffffff81230fa0
[ 5768.922327] Call Trace:
[ 5768.922327] <IRQ>
[ 5768.926596] [<ffffffff81230fa0>] dev_hard_start_xmit+0x21c/0x2b7
[ 5768.931393] [<ffffffff8123f422>] sch_direct_xmit+0x5e/0x154
[ 5768.935678] [<ffffffff8123f5d4>] __qdisc_run+0xbc/0xd5
[ 5768.944755] [<ffffffff8122ecd1>] net_tx_action+0xbb/0x10e
[ 5768.949643] [<ffffffff8103d292>] __do_softirq+0x91/0x11b
[ 5768.956054] [<ffffffff8100be9c>] call_softirq+0x1c/0x28
[ 5768.958733] <EOI>
[ 5768.963566] [<ffffffff8100d907>] do_softirq+0x33/0x6b
[ 5768.967800] [<ffffffff8103cd9a>] ksoftirqd+0x60/0xd7
[ 5768.972644] [<ffffffff8103cd3a>] ? ksoftirqd+0x0/0xd7
[ 5768.976939] [<ffffffff8104a563>] kthread+0x7a/0x82
[ 5768.981727] [<ffffffff8100bd9a>] child_rip+0xa/0x20
[ 5768.986004] [<ffffffff8104a4e9>] ? kthread+0x0/0x82
[ 5768.990805] [<ffffffff8100bd90>] ? child_rip+0x0/0x20
[ 5768.995081] Code: 06 00 00 00 00 89 08 66 c7 40 04 00 00 c6 40 06 01 c6 40 07 9f 41 0f b7 c6 48 89 c7 48 c1 e0 03 48 c1 e7 05 48 89 f9 48 03 4b 20 <4c> 89 79 10 48 c7 41 08 01 00 00 00 8b 75 cc 89 71 18 48 03 7b
[ 5769.018044] RIP [<ffffffffa003d46f>] sky2_xmit_frame+0x321/0x5d8 [sky2]
[ 5769.025816] RSP <ffff880001603df8>
[ 5769.027123] CR2: 0000000000000ad0
[ 5769.033031] ---[ end trace 90bf20a10331c8d8 ]---
[ 5769.037702] Kernel panic - not syncing: Fatal exception in interrupt
[ 5769.044106] Pid: 4, comm: ksoftirqd/0 Tainted: P D 2.6.32.2 #9
[ 5769.050724] Call Trace:
[ 5769.053213] <IRQ> [<ffffffff812975ed>] panic+0x75/0x11c
[ 5769.058677] [<ffffffff8100e9a7>] oops_end+0x81/0x8e
[ 5769.063712] [<ffffffff81026413>] no_context+0x1ee/0x1fd
[ 5769.069067] [<ffffffff8102ee15>] ? walk_tg_tree+0x5e/0x74
[ 5769.074605] [<ffffffff81026594>] __bad_area_nosemaphore+0x172/0x195
[ 5769.081044] [<ffffffff810265c5>] bad_area_nosemaphore+0xe/0x10
[ 5769.087032] [<ffffffff810267ff>] do_page_fault+0x114/0x252
[ 5769.092659] [<ffffffff810307de>] ? update_shares+0x26/0x57
[ 5769.098291] [<ffffffff81299bff>] page_fault+0x1f/0x30
[ 5769.103489] [<ffffffffa003d46f>] ? sky2_xmit_frame+0x321/0x5d8 [sky2]
[ 5769.110097] [<ffffffffa003d254>] ? sky2_xmit_frame+0x106/0x5d8 [sky2]
[ 5769.116706] [<ffffffff81230fa0>] dev_hard_start_xmit+0x21c/0x2b7
[ 5769.122863] [<ffffffff8123f422>] sch_direct_xmit+0x5e/0x154
[ 5769.128600] [<ffffffff8123f5d4>] __qdisc_run+0xbc/0xd5
[ 5769.133933] [<ffffffff8122ecd1>] net_tx_action+0xbb/0x10e
[ 5769.139468] [<ffffffff8103d292>] __do_softirq+0x91/0x11b
[ 5769.144929] [<ffffffff8100be9c>] call_softirq+0x1c/0x28
[ 5769.150291] <EOI> [<ffffffff8100d907>] do_softirq+0x33/0x6b
[ 5769.156125] [<ffffffff8103cd9a>] ksoftirqd+0x60/0xd7
[ 5769.161240] [<ffffffff8103cd3a>] ? ksoftirqd+0x0/0xd7
[ 5769.166430] [<ffffffff8104a563>] kthread+0x7a/0x82
[ 5769.171370] [<ffffffff8100bd9a>] child_rip+0xa/0x20
[ 5769.176398] [<ffffffff8104a4e9>] ? kthread+0x0/0x82
[ 5769.181416] [<ffffffff8100bd90>] ? child_rip+0x0/0x20
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-01 3:06 ` Berck E. Nash
@ 2010-01-01 6:42 ` Stephen Hemminger
0 siblings, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2010-01-01 6:42 UTC (permalink / raw)
To: Berck E. Nash
Cc: Mike McCormack, Jarek Poplawski, Stephen Hemminger, netdev,
dhazelton, mbreuer
Berck E. Nash wrote:
> Well, that didn't fix it. Oops attached, looks pretty much the same to me.
>
>
Not surprised really. What Mike saw is a real race, but unlikely because
by the time timeout
happens the device is going to be quiet (no more traffic).
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 23:51 ` Mike McCormack
2010-01-01 3:06 ` Berck E. Nash
@ 2010-01-01 18:31 ` Jarek Poplawski
2010-01-04 2:44 ` Berck E. Nash
1 sibling, 1 reply; 20+ messages in thread
From: Jarek Poplawski @ 2010-01-01 18:31 UTC (permalink / raw)
To: Mike McCormack; +Cc: Stephen Hemminger, netdev, flyboy, dhazelton, mbreuer
On Fri, Jan 01, 2010 at 08:51:23AM +0900, Mike McCormack wrote:
> Another way to solve the problem would be to take the transmit lock in
> netif_device_detach() to make sure that any in progress transmits have
> completed before returning.
Yes, it seems this lock might be needed around this place, but I'd
like to check another idea: if it's not about awakening too early.
Berck, could you try this patch?
Thanks,
Jarek P.
---
drivers/net/sky2.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 1c01b96..31f1629 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1844,7 +1844,7 @@ static void sky2_tx_complete(struct sky2_port *sky2, u16 done)
sky2->tx_cons = idx;
smp_mb();
- if (tx_avail(sky2) > MAX_SKB_TX_LE + 4)
+ if (tx_avail(sky2) > MAX_SKB_TX_LE + 4 && netif_device_present(dev))
netif_wake_queue(dev);
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-01 18:31 ` Jarek Poplawski
@ 2010-01-04 2:44 ` Berck E. Nash
2010-01-04 13:49 ` [PATCH] sky2: Fix oops in sky2_xmit_frame() after TX timeout Jarek Poplawski
0 siblings, 1 reply; 20+ messages in thread
From: Berck E. Nash @ 2010-01-04 2:44 UTC (permalink / raw)
To: Jarek Poplawski
Cc: Mike McCormack, Stephen Hemminger, netdev, dhazelton, mbreuer
Jarek Poplawski wrote:
> Yes, it seems this lock might be needed around this place, but I'd
> like to check another idea: if it's not about awakening too early.
> Berck, could you try this patch?
Okay, after running with that for several days I have not gotten the
oops. That doesn't mean that I won't tomorrow, but I've gotten several
of these:
[45621.704025] sky2 eth0: hung mac 124:21 fifo 195 (127:122)
[45621.704027] sky2 eth0: receiver hang detected
[45621.708524] sky2 eth0: disabling interface
[45621.715229] sky2 eth0: enabling interface
[45624.862111] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow
control both
[61024.704036] sky2 eth0: hung mac 124:75 fifo 195 (133:128)
[61024.704039] sky2 eth0: receiver hang detected
[61024.708487] sky2 eth0: disabling interface
[61024.714791] sky2 eth0: enabling interface
[61027.864288] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow
control both
And it hasn't crashed. The "receiver hang detected" would often (but
not always) be followed by the oops before.
Berck
> ---
>
> drivers/net/sky2.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index 1c01b96..31f1629 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -1844,7 +1844,7 @@ static void sky2_tx_complete(struct sky2_port *sky2, u16 done)
> sky2->tx_cons = idx;
> smp_mb();
>
> - if (tx_avail(sky2) > MAX_SKB_TX_LE + 4)
> + if (tx_avail(sky2) > MAX_SKB_TX_LE + 4 && netif_device_present(dev))
> netif_wake_queue(dev);
> }
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH] sky2: Fix oops in sky2_xmit_frame() after TX timeout
2010-01-04 2:44 ` Berck E. Nash
@ 2010-01-04 13:49 ` Jarek Poplawski
2010-01-04 18:26 ` Stephen Hemminger
0 siblings, 1 reply; 20+ messages in thread
From: Jarek Poplawski @ 2010-01-04 13:49 UTC (permalink / raw)
To: David Miller
Cc: Berck E. Nash, Mike McCormack, Stephen Hemminger, netdev,
dhazelton, mbreuer
On Sun, Jan 03, 2010 at 07:44:28PM -0700, Berck E. Nash wrote:
> Jarek Poplawski wrote:
> > Yes, it seems this lock might be needed around this place, but I'd
> > like to check another idea: if it's not about awakening too early.
> > Berck, could you try this patch?
>
> Okay, after running with that for several days I have not gotten the
> oops. That doesn't mean that I won't tomorrow, but I've gotten several
> of these:
>
> [45621.704025] sky2 eth0: hung mac 124:21 fifo 195 (127:122)
> [45621.704027] sky2 eth0: receiver hang detected
> [45621.708524] sky2 eth0: disabling interface
> [45621.715229] sky2 eth0: enabling interface
> [45624.862111] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow
> control both
> [61024.704036] sky2 eth0: hung mac 124:75 fifo 195 (133:128)
> [61024.704039] sky2 eth0: receiver hang detected
> [61024.708487] sky2 eth0: disabling interface
> [61024.714791] sky2 eth0: enabling interface
> [61027.864288] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow
> control both
>
> And it hasn't crashed. The "receiver hang detected" would often (but
> not always) be followed by the oops before.
>
> Berck
OK, here it is with some cosmetics; let David decide if it needs more
testing.
Thanks everybody,
Jarek P.
------------->
During TX timeout procedure dev could be awaken too early, e.g. by
sky2_complete_tx() called from sky2_down(). Then sky2_xmit_frame()
can run while buffers are freed causing an oops. This patch fixes it
by adding netif_device_present() test in sky2_tx_complete().
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=14925
With debugging by: Mike McCormack <mikem@ring3k.org>
Reported-by: Berck E. Nash <flyboy@gmail.com>
Tested-by: Berck E. Nash <flyboy@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---
drivers/net/sky2.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 1c01b96..2f32fab 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1844,7 +1844,9 @@ static void sky2_tx_complete(struct sky2_port *sky2, u16 done)
sky2->tx_cons = idx;
smp_mb();
- if (tx_avail(sky2) > MAX_SKB_TX_LE + 4)
+ /* Wake unless it's detached, and called e.g. from sky2_down() */
+ if (tx_avail(sky2) > MAX_SKB_TX_LE + 4 &&
+ likely(netif_device_present(dev)))
netif_wake_queue(dev);
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Fix oops in sky2_xmit_frame() after TX timeout
2010-01-04 13:49 ` [PATCH] sky2: Fix oops in sky2_xmit_frame() after TX timeout Jarek Poplawski
@ 2010-01-04 18:26 ` Stephen Hemminger
2010-01-04 18:48 ` [PATCH v2] " Jarek Poplawski
0 siblings, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2010-01-04 18:26 UTC (permalink / raw)
To: Jarek Poplawski
Cc: David Miller, Berck E. Nash, Mike McCormack, netdev, dhazelton,
mbreuer
On Mon, 4 Jan 2010 13:49:04 +0000
Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Sun, Jan 03, 2010 at 07:44:28PM -0700, Berck E. Nash wrote:
> > Jarek Poplawski wrote:
> > > Yes, it seems this lock might be needed around this place, but I'd
> > > like to check another idea: if it's not about awakening too early.
> > > Berck, could you try this patch?
> >
> > Okay, after running with that for several days I have not gotten the
> > oops. That doesn't mean that I won't tomorrow, but I've gotten several
> > of these:
> >
> > [45621.704025] sky2 eth0: hung mac 124:21 fifo 195 (127:122)
> > [45621.704027] sky2 eth0: receiver hang detected
> > [45621.708524] sky2 eth0: disabling interface
> > [45621.715229] sky2 eth0: enabling interface
> > [45624.862111] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow
> > control both
> > [61024.704036] sky2 eth0: hung mac 124:75 fifo 195 (133:128)
> > [61024.704039] sky2 eth0: receiver hang detected
> > [61024.708487] sky2 eth0: disabling interface
> > [61024.714791] sky2 eth0: enabling interface
> > [61027.864288] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow
> > control both
> >
> > And it hasn't crashed. The "receiver hang detected" would often (but
> > not always) be followed by the oops before.
> >
> > Berck
>
> OK, here it is with some cosmetics; let David decide if it needs more
> testing.
>
> Thanks everybody,
> Jarek P.
> ------------->
>
> During TX timeout procedure dev could be awaken too early, e.g. by
> sky2_complete_tx() called from sky2_down(). Then sky2_xmit_frame()
> can run while buffers are freed causing an oops. This patch fixes it
> by adding netif_device_present() test in sky2_tx_complete().
>
> Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=14925
>
> With debugging by: Mike McCormack <mikem@ring3k.org>
>
> Reported-by: Berck E. Nash <flyboy@gmail.com>
> Tested-by: Berck E. Nash <flyboy@gmail.com>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
>
> ---
>
> drivers/net/sky2.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index 1c01b96..2f32fab 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -1844,7 +1844,9 @@ static void sky2_tx_complete(struct sky2_port *sky2, u16 done)
> sky2->tx_cons = idx;
> smp_mb();
>
> - if (tx_avail(sky2) > MAX_SKB_TX_LE + 4)
> + /* Wake unless it's detached, and called e.g. from sky2_down() */
> + if (tx_avail(sky2) > MAX_SKB_TX_LE + 4 &&
> + likely(netif_device_present(dev)))
> netif_wake_queue(dev);
> }
>
The likely() seems unnecessary noise here. It can't possibly matter that
much.
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2] sky2: Fix oops in sky2_xmit_frame() after TX timeout
2010-01-04 18:26 ` Stephen Hemminger
@ 2010-01-04 18:48 ` Jarek Poplawski
0 siblings, 0 replies; 20+ messages in thread
From: Jarek Poplawski @ 2010-01-04 18:48 UTC (permalink / raw)
To: Stephen Hemminger
Cc: David Miller, Berck E. Nash, Mike McCormack, netdev, dhazelton,
mbreuer
On Mon, Jan 04, 2010 at 10:26:37AM -0800, Stephen Hemminger wrote:
> > - if (tx_avail(sky2) > MAX_SKB_TX_LE + 4)
> > + /* Wake unless it's detached, and called e.g. from sky2_down() */
> > + if (tx_avail(sky2) > MAX_SKB_TX_LE + 4 &&
> > + likely(netif_device_present(dev)))
> > netif_wake_queue(dev);
> > }
> >
>
> The likely() seems unnecessary noise here. It can't possibly matter that
> much.
OK. Thanks,
Jarek P.
------------> (take 2)
During TX timeout procedure dev could be awoken too early, e.g. by
sky2_complete_tx() called from sky2_down(). Then sky2_xmit_frame()
can run while buffers are freed causing an oops. This patch fixes it
by adding netif_device_present() test in sky2_tx_complete().
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=14925
With debugging by: Mike McCormack <mikem@ring3k.org>
Reported-by: Berck E. Nash <flyboy@gmail.com>
Tested-by: Berck E. Nash <flyboy@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---
drivers/net/sky2.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 1c01b96..7650f73 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1844,7 +1844,8 @@ static void sky2_tx_complete(struct sky2_port *sky2, u16 done)
sky2->tx_cons = idx;
smp_mb();
- if (tx_avail(sky2) > MAX_SKB_TX_LE + 4)
+ /* Wake unless it's detached, and called e.g. from sky2_down() */
+ if (tx_avail(sky2) > MAX_SKB_TX_LE + 4 && netif_device_present(dev))
netif_wake_queue(dev);
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2009-12-31 10:55 [PATCH] sky2: Lock transmit queue while disabling device Mike McCormack
` (2 preceding siblings ...)
2009-12-31 18:51 ` Jarek Poplawski
@ 2010-01-07 4:27 ` David Miller
2010-01-07 6:35 ` Jarek Poplawski
3 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2010-01-07 4:27 UTC (permalink / raw)
To: mikem; +Cc: shemminger, netdev, flyboy, dhazelton, mbreuer, "netdev
From: Mike McCormack <mikem@ring3k.org>
Date: Thu, 31 Dec 2009 19:55:31 +0900
>
> netif_device_detach() does not take the tx_lock, so it's
> possible that a call to sky2_xmit_frame is still in
> progress after netif_device_detach() is complete.
>
> Take netif_tx_lock() to make sure all transmits have
> stopped while we're disabling the devices and that
> no other CPU is still transmitting a frame after
> we've disabling the device.
>
> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>
> Signed-off-by: Mike McCormack <mikem@ring3k.org>
Applied to net-next-2.6
Stephen has asked for some further refinements, once that is
all sorted we can think about backporting this to net-2.6
and -stable if necessary.
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-07 4:27 ` [PATCH] sky2: Lock transmit queue while disabling device David Miller
@ 2010-01-07 6:35 ` Jarek Poplawski
2010-01-07 8:01 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Jarek Poplawski @ 2010-01-07 6:35 UTC (permalink / raw)
To: David Miller; +Cc: mikem, shemminger, netdev, flyboy, dhazelton, mbreuer
On 07-01-2010 05:27, David Miller wrote:
> From: Mike McCormack <mikem@ring3k.org>
> Date: Thu, 31 Dec 2009 19:55:31 +0900
>
>> netif_device_detach() does not take the tx_lock, so it's
>> possible that a call to sky2_xmit_frame is still in
>> progress after netif_device_detach() is complete.
>>
>> Take netif_tx_lock() to make sure all transmits have
>> stopped while we're disabling the devices and that
>> no other CPU is still transmitting a frame after
>> we've disabling the device.
>>
>> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>>
>> Signed-off-by: Mike McCormack <mikem@ring3k.org>
>
> Applied to net-next-2.6
>
> Stephen has asked for some further refinements, once that is
> all sorted we can think about backporting this to net-2.6
> and -stable if necessary.
>
David, I'm not sure you chose the right (working) patch from this
thread. Please, reconsider this:
Subject: [PATCH v2] sky2: Fix oops in sky2_xmit_frame() after TX timeout
Date: Mon, 4 Jan 2010 19:48:41 +0100
http://permalink.gmane.org/gmane.linux.network/148160
Thanks,
Jarek P.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-07 6:35 ` Jarek Poplawski
@ 2010-01-07 8:01 ` David Miller
2010-01-07 8:15 ` Jarek Poplawski
0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2010-01-07 8:01 UTC (permalink / raw)
To: jarkao2; +Cc: mikem, shemminger, netdev, flyboy, dhazelton, mbreuer
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Thu, 7 Jan 2010 06:35:46 +0000
> On 07-01-2010 05:27, David Miller wrote:
>> From: Mike McCormack <mikem@ring3k.org>
>> Date: Thu, 31 Dec 2009 19:55:31 +0900
>>
>>> netif_device_detach() does not take the tx_lock, so it's
>>> possible that a call to sky2_xmit_frame is still in
>>> progress after netif_device_detach() is complete.
>>>
>>> Take netif_tx_lock() to make sure all transmits have
>>> stopped while we're disabling the devices and that
>>> no other CPU is still transmitting a frame after
>>> we've disabling the device.
>>>
>>> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
>>>
>>> Signed-off-by: Mike McCormack <mikem@ring3k.org>
>>
>> Applied to net-next-2.6
>>
>> Stephen has asked for some further refinements, once that is
>> all sorted we can think about backporting this to net-2.6
>> and -stable if necessary.
>>
>
> David, I'm not sure you chose the right (working) patch from this
> thread. Please, reconsider this:
> Subject: [PATCH v2] sky2: Fix oops in sky2_xmit_frame() after TX timeout
> Date: Mon, 4 Jan 2010 19:48:41 +0100
> http://permalink.gmane.org/gmane.linux.network/148160
These are two seperate crashes and two seperate sets of patches.
Mike's patch fixes crashes due to races when bringing the
device down or suspending it.
The other patch you reference is handling something different,
a crash that happens while the device is staying up.
Right?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-07 8:01 ` David Miller
@ 2010-01-07 8:15 ` Jarek Poplawski
2010-01-07 8:19 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Jarek Poplawski @ 2010-01-07 8:15 UTC (permalink / raw)
To: David Miller; +Cc: mikem, shemminger, netdev, flyboy, dhazelton, mbreuer
On Thu, Jan 07, 2010 at 12:01:11AM -0800, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Thu, 7 Jan 2010 06:35:46 +0000
>
> > On 07-01-2010 05:27, David Miller wrote:
> >> From: Mike McCormack <mikem@ring3k.org>
> >> Date: Thu, 31 Dec 2009 19:55:31 +0900
> >>
> >>> netif_device_detach() does not take the tx_lock, so it's
> >>> possible that a call to sky2_xmit_frame is still in
> >>> progress after netif_device_detach() is complete.
> >>>
> >>> Take netif_tx_lock() to make sure all transmits have
> >>> stopped while we're disabling the devices and that
> >>> no other CPU is still transmitting a frame after
> >>> we've disabling the device.
> >>>
> >>> Proposed fix for "sky2 panic under load" reported by Berck E. Nash.
> >>>
> >>> Signed-off-by: Mike McCormack <mikem@ring3k.org>
> >>
> >> Applied to net-next-2.6
> >>
> >> Stephen has asked for some further refinements, once that is
> >> all sorted we can think about backporting this to net-2.6
> >> and -stable if necessary.
> >>
> >
> > David, I'm not sure you chose the right (working) patch from this
> > thread. Please, reconsider this:
> > Subject: [PATCH v2] sky2: Fix oops in sky2_xmit_frame() after TX timeout
> > Date: Mon, 4 Jan 2010 19:48:41 +0100
> > http://permalink.gmane.org/gmane.linux.network/148160
>
> These are two seperate crashes and two seperate sets of patches.
>
> Mike's patch fixes crashes due to races when bringing the
> device down or suspending it.
>
> The other patch you reference is handling something different,
> a crash that happens while the device is staying up.
>
> Right?
Maybe right, but Mike's patch changelog above writes:
"Proposed fix for "sky2 panic under load" reported by Berck E. Nash.",
and maybe I missed something, but I can't find any report which
acknowledges this patch fixed this or down/suspending problem.
Jarek P.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-07 8:15 ` Jarek Poplawski
@ 2010-01-07 8:19 ` David Miller
2010-01-07 13:48 ` Stephen Hemminger
2010-01-07 18:08 ` Stephen Hemminger
0 siblings, 2 replies; 20+ messages in thread
From: David Miller @ 2010-01-07 8:19 UTC (permalink / raw)
To: jarkao2; +Cc: mikem, shemminger, netdev, flyboy, dhazelton, mbreuer
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Thu, 7 Jan 2010 08:15:01 +0000
> Maybe right, but Mike's patch changelog above writes:
> "Proposed fix for "sky2 panic under load" reported by Berck E. Nash.",
> and maybe I missed something, but I can't find any report which
> acknowledges this patch fixed this or down/suspending problem.
Certainly by my reading Mike's patch does in fact fix a race
between device detach and the TX handler, so regardless of
whether it fixes actual reported user problems it should
at a minimum go into net-next-2.6
And that's _exactly_ what I did, and how I analyzed his work.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-07 8:19 ` David Miller
@ 2010-01-07 13:48 ` Stephen Hemminger
2010-01-07 18:08 ` Stephen Hemminger
1 sibling, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2010-01-07 13:48 UTC (permalink / raw)
To: David Miller; +Cc: jarkao2, mikem, netdev, flyboy, dhazelton, mbreuer
On Thu, 07 Jan 2010 00:19:23 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Thu, 7 Jan 2010 08:15:01 +0000
>
> > Maybe right, but Mike's patch changelog above writes:
> > "Proposed fix for "sky2 panic under load" reported by Berck E. Nash.",
> > and maybe I missed something, but I can't find any report which
> > acknowledges this patch fixed this or down/suspending problem.
>
> Certainly by my reading Mike's patch does in fact fix a race
> between device detach and the TX handler, so regardless of
> whether it fixes actual reported user problems it should
> at a minimum go into net-next-2.6
>
> And that's _exactly_ what I did, and how I analyzed his work.
No objections, thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] sky2: Lock transmit queue while disabling device
2010-01-07 8:19 ` David Miller
2010-01-07 13:48 ` Stephen Hemminger
@ 2010-01-07 18:08 ` Stephen Hemminger
1 sibling, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2010-01-07 18:08 UTC (permalink / raw)
To: David Miller; +Cc: jarkao2, mikem, netdev, flyboy, dhazelton, mbreuer
On Thu, 07 Jan 2010 00:19:23 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Thu, 7 Jan 2010 08:15:01 +0000
>
> > Maybe right, but Mike's patch changelog above writes:
> > "Proposed fix for "sky2 panic under load" reported by Berck E. Nash.",
> > and maybe I missed something, but I can't find any report which
> > acknowledges this patch fixed this or down/suspending problem.
>
> Certainly by my reading Mike's patch does in fact fix a race
> between device detach and the TX handler, so regardless of
> whether it fixes actual reported user problems it should
> at a minimum go into net-next-2.6
>
> And that's _exactly_ what I did, and how I analyzed his work.
No objections, thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2010-01-07 18:08 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-31 10:55 [PATCH] sky2: Lock transmit queue while disabling device Mike McCormack
2009-12-31 15:58 ` Michael Breuer
2009-12-31 16:15 ` Daniel Hazelton
2009-12-31 16:33 ` Berck Nash
2009-12-31 18:51 ` Jarek Poplawski
2009-12-31 23:51 ` Mike McCormack
2010-01-01 3:06 ` Berck E. Nash
2010-01-01 6:42 ` Stephen Hemminger
2010-01-01 18:31 ` Jarek Poplawski
2010-01-04 2:44 ` Berck E. Nash
2010-01-04 13:49 ` [PATCH] sky2: Fix oops in sky2_xmit_frame() after TX timeout Jarek Poplawski
2010-01-04 18:26 ` Stephen Hemminger
2010-01-04 18:48 ` [PATCH v2] " Jarek Poplawski
2010-01-07 4:27 ` [PATCH] sky2: Lock transmit queue while disabling device David Miller
2010-01-07 6:35 ` Jarek Poplawski
2010-01-07 8:01 ` David Miller
2010-01-07 8:15 ` Jarek Poplawski
2010-01-07 8:19 ` David Miller
2010-01-07 13:48 ` Stephen Hemminger
2010-01-07 18:08 ` Stephen Hemminger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).