* 2.6.23-rc4-mm1: e1000e napi lockup
@ 2007-09-07 7:19 Jiri Slaby
2007-09-07 8:03 ` David Miller
2007-09-09 9:58 ` Jiri Slaby
0 siblings, 2 replies; 11+ messages in thread
From: Jiri Slaby @ 2007-09-07 7:19 UTC (permalink / raw)
To: Andrew Morton; +Cc: netdev, e1000-devel, Auke Kok, David S. Miller
Hi,
I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e driver.
napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
regards,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 7:19 2.6.23-rc4-mm1: e1000e napi lockup Jiri Slaby
@ 2007-09-07 8:03 ` David Miller
2007-09-07 16:24 ` Kok, Auke
2007-09-09 9:58 ` Jiri Slaby
1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2007-09-07 8:03 UTC (permalink / raw)
To: jirislaby; +Cc: akpm, netdev, e1000-devel, auke-jan.h.kok
From: Jiri Slaby <jirislaby@gmail.com>
Date: Fri, 07 Sep 2007 09:19:30 +0200
> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e driver.
> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
Yes, the semantics changed slightly in the net-2.6.24 tree the
other week and someone needs to fix it up.
The netif_napi_add() implicitly does a napi_disable() call. Device
open must explicitly napi_enable() and device close must explicitly
napi_disable(), and if done elsewhere these calls must be strictly
balanced.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 8:03 ` David Miller
@ 2007-09-07 16:24 ` Kok, Auke
2007-09-07 23:31 ` Jeff Garzik
0 siblings, 1 reply; 11+ messages in thread
From: Kok, Auke @ 2007-09-07 16:24 UTC (permalink / raw)
To: David Miller, jirislaby; +Cc: akpm, netdev, e1000-devel
David Miller wrote:
> From: Jiri Slaby <jirislaby@gmail.com>
> Date: Fri, 07 Sep 2007 09:19:30 +0200
>
>> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e driver.
>> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
>
> Yes, the semantics changed slightly in the net-2.6.24 tree the
> other week and someone needs to fix it up.
>
> The netif_napi_add() implicitly does a napi_disable() call. Device
> open must explicitly napi_enable() and device close must explicitly
> napi_disable(), and if done elsewhere these calls must be strictly
> balanced.
I'll fix it... it's my patch that adds the new napi code to it and I need to get
it ready for the merge window anyway.
thanks for testing.
Auke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 16:24 ` Kok, Auke
@ 2007-09-07 23:31 ` Jeff Garzik
2007-09-07 23:40 ` Kok, Auke
0 siblings, 1 reply; 11+ messages in thread
From: Jeff Garzik @ 2007-09-07 23:31 UTC (permalink / raw)
To: David Miller; +Cc: e1000-devel, netdev, Kok, Auke, jirislaby, akpm
Kok, Auke wrote:
> David Miller wrote:
>> From: Jiri Slaby <jirislaby@gmail.com>
>> Date: Fri, 07 Sep 2007 09:19:30 +0200
>>
>>> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e
>>> driver.
>>> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
>>
>> Yes, the semantics changed slightly in the net-2.6.24 tree the
>> other week and someone needs to fix it up.
>>
>> The netif_napi_add() implicitly does a napi_disable() call. Device
>> open must explicitly napi_enable() and device close must explicitly
>> napi_disable(), and if done elsewhere these calls must be strictly
>> balanced.
>
> I'll fix it... it's my patch that adds the new napi code to it and I
> need to get it ready for the merge window anyway.
well.... since its close to the merge window opening, we could see what
happens if DaveM pulls branch 'upstream' of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
That should make this class of pre-merge-window annoyance go away.
Jeff
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 23:31 ` Jeff Garzik
@ 2007-09-07 23:40 ` Kok, Auke
2007-09-07 23:49 ` Jeff Garzik
0 siblings, 1 reply; 11+ messages in thread
From: Kok, Auke @ 2007-09-07 23:40 UTC (permalink / raw)
To: Jeff Garzik; +Cc: e1000-devel, netdev, akpm, David Miller, jirislaby
Jeff Garzik wrote:
> Kok, Auke wrote:
>> David Miller wrote:
>>> From: Jiri Slaby <jirislaby@gmail.com>
>>> Date: Fri, 07 Sep 2007 09:19:30 +0200
>>>
>>>> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e
>>>> driver.
>>>> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
>>> Yes, the semantics changed slightly in the net-2.6.24 tree the
>>> other week and someone needs to fix it up.
>>>
>>> The netif_napi_add() implicitly does a napi_disable() call. Device
>>> open must explicitly napi_enable() and device close must explicitly
>>> napi_disable(), and if done elsewhere these calls must be strictly
>>> balanced.
>> I'll fix it... it's my patch that adds the new napi code to it and I
>> need to get it ready for the merge window anyway.
>
> well.... since its close to the merge window opening, we could see what
> happens if DaveM pulls branch 'upstream' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
>
> That should make this class of pre-merge-window annoyance go away.
If I do that now I get a big merge conflict:
$ git-pull . net-2.6.24
100% (22583/22583) done
Removed Documentation/networking/NAPI_HOWTO.txt
Auto-merged drivers/net/8139cp.c
Auto-merged drivers/net/8139too.c
CONFLICT (content): Merge conflict in drivers/net/8139too.c
Auto-merged drivers/net/Kconfig
Auto-merged drivers/net/Makefile
Auto-merged drivers/net/cxgb3/cxgb3_main.c
Auto-merged drivers/net/cxgb3/sge.c
Auto-merged drivers/net/fs_enet/fs_enet-main.c
Auto-merged drivers/net/gianfar.h
Auto-merged drivers/net/ibmveth.c
CONFLICT (content): Merge conflict in drivers/net/ibmveth.c
Auto-merged drivers/net/ibmveth.h
Auto-merged drivers/net/myri10ge/myri10ge.c
Auto-merged drivers/net/netxen/netxen_nic_main.c
Auto-merged drivers/net/pasemi_mac.c
CONFLICT (content): Merge conflict in drivers/net/pasemi_mac.c
Auto-merged drivers/net/pasemi_mac.h
Auto-merged drivers/net/pcnet32.c
Auto-merged drivers/net/ps3_gelic_net.c
Auto-merged drivers/net/qla3xxx.c
Auto-merged drivers/net/s2io.c
CONFLICT (content): Merge conflict in drivers/net/s2io.c
Auto-merged drivers/net/s2io.h
Auto-merged drivers/net/sb1250-mac.c
Auto-merged drivers/net/sky2.c
Auto-merged drivers/net/sky2.h
Auto-merged drivers/net/tsi108_eth.c
Auto-merged drivers/net/wireless/rtl8187_dev.c
Auto-merged drivers/net/xen-netfront.c
Removed include/net/tcp_ecn.h
Automatic merge failed; fix conflicts and then commit the result.
for e1000e the fixup is just a 1-line change from the previous -mm napi fixup
patch, so I don't really care (it's peanuts), but people might want to start
looking into the above conflicts ;)
Cheers,
Auke
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 23:40 ` Kok, Auke
@ 2007-09-07 23:49 ` Jeff Garzik
2007-09-07 23:52 ` Kok, Auke
0 siblings, 1 reply; 11+ messages in thread
From: Jeff Garzik @ 2007-09-07 23:49 UTC (permalink / raw)
To: Kok, Auke; +Cc: e1000-devel, netdev, akpm, David Miller, jirislaby
Kok, Auke wrote:
> Jeff Garzik wrote:
>> Kok, Auke wrote:
>>> David Miller wrote:
>>>> From: Jiri Slaby <jirislaby@gmail.com>
>>>> Date: Fri, 07 Sep 2007 09:19:30 +0200
>>>>
>>>>> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e
>>>>> driver.
>>>>> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on
>>>>> boot.
>>>> Yes, the semantics changed slightly in the net-2.6.24 tree the
>>>> other week and someone needs to fix it up.
>>>>
>>>> The netif_napi_add() implicitly does a napi_disable() call. Device
>>>> open must explicitly napi_enable() and device close must explicitly
>>>> napi_disable(), and if done elsewhere these calls must be strictly
>>>> balanced.
>>> I'll fix it... it's my patch that adds the new napi code to it and I
>>> need to get it ready for the merge window anyway.
>>
>> well.... since its close to the merge window opening, we could see
>> what happens if DaveM pulls branch 'upstream' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
>>
>> That should make this class of pre-merge-window annoyance go away.
>
> If I do that now I get a big merge conflict:
oh you are _guaranteed_ conflicts. most of that is NAPI-area code that
got changed by both.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 23:49 ` Jeff Garzik
@ 2007-09-07 23:52 ` Kok, Auke
0 siblings, 0 replies; 11+ messages in thread
From: Kok, Auke @ 2007-09-07 23:52 UTC (permalink / raw)
To: Jeff Garzik; +Cc: David Miller, jirislaby, akpm, netdev, e1000-devel
Jeff Garzik wrote:
> Kok, Auke wrote:
>> Jeff Garzik wrote:
>>> Kok, Auke wrote:
>>>> David Miller wrote:
>>>>> From: Jiri Slaby <jirislaby@gmail.com>
>>>>> Date: Fri, 07 Sep 2007 09:19:30 +0200
>>>>>
>>>>>> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e
>>>>>> driver.
>>>>>> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on
>>>>>> boot.
>>>>> Yes, the semantics changed slightly in the net-2.6.24 tree the
>>>>> other week and someone needs to fix it up.
>>>>>
>>>>> The netif_napi_add() implicitly does a napi_disable() call. Device
>>>>> open must explicitly napi_enable() and device close must explicitly
>>>>> napi_disable(), and if done elsewhere these calls must be strictly
>>>>> balanced.
>>>> I'll fix it... it's my patch that adds the new napi code to it and I
>>>> need to get it ready for the merge window anyway.
>>> well.... since its close to the merge window opening, we could see
>>> what happens if DaveM pulls branch 'upstream' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
>>>
>>> That should make this class of pre-merge-window annoyance go away.
>> If I do that now I get a big merge conflict:
>
> oh you are _guaranteed_ conflicts. most of that is NAPI-area code that
> got changed by both.
actually that's the only thing it was, and fixing it up was trivial (took me
about 3 minutes). it was 3x the napi code and once a struct indent change...
I'll have a new e1000e napi patch for andrew in a sec.
Auke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-07 7:19 2.6.23-rc4-mm1: e1000e napi lockup Jiri Slaby
2007-09-07 8:03 ` David Miller
@ 2007-09-09 9:58 ` Jiri Slaby
2007-09-09 22:50 ` Kok, Auke
1 sibling, 1 reply; 11+ messages in thread
From: Jiri Slaby @ 2007-09-09 9:58 UTC (permalink / raw)
Cc: Andrew Morton, netdev, e1000-devel, Auke Kok, David S. Miller
On 09/07/2007 09:19 AM, Jiri Slaby wrote:
> Hi,
>
> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e driver.
> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
Ok, after these changes:
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index c1c64e2..f8ec537 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -1693,10 +1693,7 @@ quit_polling:
if (adapter->itr_setting & 3)
e1000_set_itr(adapter);
netif_rx_complete(poll_dev, napi);
- if (test_bit(__E1000_DOWN, &adapter->state))
- atomic_dec(&adapter->irq_sem);
- else
- e1000_irq_enable(adapter);
+ e1000_irq_enable(adapter);
return 0;
}
@@ -4257,7 +4254,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
/* tell the stack to leave us alone until e1000_open() is called */
netif_carrier_off(netdev);
netif_stop_queue(netdev);
- napi_disable(&adapter->napi);
strcpy(netdev->name, "eth%d");
err = register_netdev(netdev);
I still have problems with the driver. When I do `ip link set eth0 up', ksoftirq
runs with 100 % cpu time, so I think you endlessly re-schedule some timer (or
the new napi layer?)
regards,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-09 9:58 ` Jiri Slaby
@ 2007-09-09 22:50 ` Kok, Auke
2007-09-10 6:24 ` Jiri Slaby
0 siblings, 1 reply; 11+ messages in thread
From: Kok, Auke @ 2007-09-09 22:50 UTC (permalink / raw)
To: Jiri Slaby
Cc: no To-header on input, Andrew Morton, netdev, e1000-devel,
Auke Kok, David S. Miller
Jiri Slaby wrote:
> On 09/07/2007 09:19 AM, Jiri Slaby wrote:
>> Hi,
>>
>> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e driver.
>> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
>
> Ok, after these changes:
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index c1c64e2..f8ec537 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -1693,10 +1693,7 @@ quit_polling:
> if (adapter->itr_setting & 3)
> e1000_set_itr(adapter);
> netif_rx_complete(poll_dev, napi);
> - if (test_bit(__E1000_DOWN, &adapter->state))
> - atomic_dec(&adapter->irq_sem);
> - else
> - e1000_irq_enable(adapter);
> + e1000_irq_enable(adapter);
> return 0;
> }
>
> @@ -4257,7 +4254,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
> /* tell the stack to leave us alone until e1000_open() is called */
> netif_carrier_off(netdev);
> netif_stop_queue(netdev);
> - napi_disable(&adapter->napi);
>
> strcpy(netdev->name, "eth%d");
> err = register_netdev(netdev);
>
>
> I still have problems with the driver. When I do `ip link set eth0 up', ksoftirq
> runs with 100 % cpu time, so I think you endlessly re-schedule some timer (or
> the new napi layer?)
something changed in the logic and e1000e apparently does something wrong. I'll
look into it on monday and resubmit a fixup patch (see robert olsson's mail as
well discussing this issue)
Auke
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-09 22:50 ` Kok, Auke
@ 2007-09-10 6:24 ` Jiri Slaby
2007-09-10 6:31 ` Kok, Auke
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Slaby @ 2007-09-10 6:24 UTC (permalink / raw)
To: Kok, Auke; +Cc: Andrew Morton, netdev, e1000-devel, David S. Miller
Kok, Auke napsal(a):
> Jiri Slaby wrote:
>> I still have problems with the driver. When I do `ip link set eth0
>> up', ksoftirq
>> runs with 100 % cpu time, so I think you endlessly re-schedule some
>> timer (or
>> the new napi layer?)
>
> something changed in the logic and e1000e apparently does something
> wrong. I'll look into it on monday and resubmit a fixup patch (see
> robert olsson's mail as well discussing this issue)
he's posted me a patch, but no change on my side :(.
Anyway, I'm going away of the box on monday (today). will be back on fri to
test patches :).
thanks,
--
http://www.fi.muni.cz/~xslaby/ Jiri Slaby
faculty of informatics, masaryk university, brno, cz
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
2007-09-10 6:24 ` Jiri Slaby
@ 2007-09-10 6:31 ` Kok, Auke
0 siblings, 0 replies; 11+ messages in thread
From: Kok, Auke @ 2007-09-10 6:31 UTC (permalink / raw)
To: Jiri Slaby; +Cc: Andrew Morton, netdev, e1000-devel, David S. Miller
Jiri Slaby wrote:
> Kok, Auke napsal(a):
>> Jiri Slaby wrote:
>>> I still have problems with the driver. When I do `ip link set eth0
>>> up', ksoftirq
>>> runs with 100 % cpu time, so I think you endlessly re-schedule some
>>> timer (or
>>> the new napi layer?)
>> something changed in the logic and e1000e apparently does something
>> wrong. I'll look into it on monday and resubmit a fixup patch (see
>> robert olsson's mail as well discussing this issue)
>
> he's posted me a patch, but no change on my side :(.
that's what I feared/thought as well when I saw his patch - e1000e and e1000 are
slightly different so the exit condition will be as well.
> Anyway, I'm going away of the box on monday (today). will be back on fri to
> test patches :).
I'll be on top of it. Since e1000e will be merged I need to get this workign
properly before the merge window opens - no pressure :)
Auke
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-09-10 6:31 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-07 7:19 2.6.23-rc4-mm1: e1000e napi lockup Jiri Slaby
2007-09-07 8:03 ` David Miller
2007-09-07 16:24 ` Kok, Auke
2007-09-07 23:31 ` Jeff Garzik
2007-09-07 23:40 ` Kok, Auke
2007-09-07 23:49 ` Jeff Garzik
2007-09-07 23:52 ` Kok, Auke
2007-09-09 9:58 ` Jiri Slaby
2007-09-09 22:50 ` Kok, Auke
2007-09-10 6:24 ` Jiri Slaby
2007-09-10 6:31 ` Kok, Auke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).