* eHEA driver issues from net-2.6.24
From: Andrew Theurer @ 2007-08-22 21:55 UTC
To: David S. Miller, Jan-Bernd Themann; +Cc: netdev
In testing the new NAPI improvements on ehea, I get the following:
kernel BUG at include/linux/netdevice.h:318!
cpu 0x1: Vector: 700 (Program Check) at [c00000000f613ac0]
pc: d000000000091054: .ehea_poll+0x1e8/0x1334 [ehea]
lr: c0000000003fe394: .net_rx_action+0x1b8/0x254
sp: c00000000f613d40
msr: 8000000000029032
current = 0xc0000005aff9e8b0
paca = 0xc0000000006e1300
pid = 0, comm = swapper
kernel BUG at include/linux/netdevice.h:318!
enter ? for help
[c00000000f613e40] c0000000003fe394 .net_rx_action+0x1b8/0x254
[c00000000f613ef0] c000000000057b70 .__do_softirq+0xa8/0x164
[c00000000f613f90] c000000000024438 .call_do_softirq+0x14/0x24
[c000000b8ffbf9f0] c00000000000bd30 .do_softirq+0x68/0xac
[c000000b8ffbfa80] c000000000057cc4 .irq_exit+0x54/0x6c
[c000000b8ffbfb00] c00000000000c358 .do_IRQ+0x170/0x1ac
[c000000b8ffbfb90] c000000000004780 hardware_interrupt_entry+0x18/0x98
--- Exception: 501 (Hardware Interrupt) at c000000000010bdc
.cpu_idle+0x114/0x1e0
[c000000b8ffbfe80] c000000000010bd0 .cpu_idle+0x108/0x1e0 (unreliable)
[c000000b8ffbff00] c000000000026db0 .start_secondary+0x160/0x184
[c000000b8ffbff90] c000000000008364 .start_secondary_prolog+0xc/0x10
I'm a little confused if the port_napi_enable() is being called when the
device is initialized, but then again, this is all new to me (should it
be called in ehea_open?). I see it called on some reset routines, but
not on the first initialization.
Also, on this code, in ehea_sense_port_attr()
	/* Number of default QPs */
	if (use_mcs)
		port->num_def_qps = cb0->num_default_qps;
	else
		port->num_def_qps = 1;
When using napi, since we have multi-queue napi support now, wouldn't we
want to use all the default qps instead of 1?
Thanks,
-Andrew
* Re: eHEA driver issues from net-2.6.24
From: David Miller @ 2007-08-22 22:03 UTC
To: habanero; +Cc: ossthema, netdev
From: Andrew Theurer <habanero@us.ibm.com>
Date: Wed, 22 Aug 2007 16:55:03 -0500
Thanks for finally getting to test this, I thought nobody
would test this until it got merged into 2.6.24 :-/
> kernel BUG at include/linux/netdevice.h:318!
> enter ? for help
> [c00000000f613e40] c0000000003fe394 .net_rx_action+0x1b8/0x254
> [c00000000f613ef0] c000000000057b70 .__do_softirq+0xa8/0x164
> [c00000000f613f90] c000000000024438 .call_do_softirq+0x14/0x24
> [c000000b8ffbf9f0] c00000000000bd30 .do_softirq+0x68/0xac
> [c000000b8ffbfa80] c000000000057cc4 .irq_exit+0x54/0x6c
> [c000000b8ffbfb00] c00000000000c358 .do_IRQ+0x170/0x1ac
> [c000000b8ffbfb90] c000000000004780 hardware_interrupt_entry+0x18/0x98
> --- Exception: 501 (Hardware Interrupt) at c000000000010bdc
> .cpu_idle+0x114/0x1e0
> [c000000b8ffbfe80] c000000000010bd0 .cpu_idle+0x108/0x1e0 (unreliable)
> [c000000b8ffbff00] c000000000026db0 .start_secondary+0x160/0x184
> [c000000b8ffbff90] c000000000008364 .start_secondary_prolog+0xc/0x10
>
> I'm a little confused if the port_napi_enable() is being called when the
> device is initialized, but then again, this is all new to me (should it
> be called in ehea_open?). I see it called on some reset routines, but
> not on the first initialization.
This is similar to the problem that Arnaldo hit a few minutes
ago in the VIA Rhine driver.
You can only make a napi_enable() call when there has been
a previous napi_disable().
One way to fix this would be to forcefully napi_disable() on
all the per-port NAPI structs at the beginning of ehea_open(),
which should set things up to satisfy the pre-condition of the
napi_enable() calls.
You'll need to audit the entire driver to make sure this invariant
is held properly.
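The pre-condition described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel or ehea code: struct model_napi, model_napi_disable(), model_napi_enable() and model_open() are hypothetical names, and a single boolean stands in for the NAPI "scheduled" state bit. The model assumes that enabling is legal only while that bit is held by an earlier disable, and it returns false at the point where the real kernel would be expected to hit a BUG_ON().

```c
#include <assert.h>   /* only needed by callers that assert on the results */
#include <stdbool.h>

/* Hypothetical stand-in for a per-port NAPI struct: one state flag only. */
struct model_napi {
	bool sched;	/* models the "scheduled" state bit */
};

/*
 * Model of a disable call: take ownership of the state bit.
 * (The real call may have to wait for a running poll; not modeled here.)
 */
static void model_napi_disable(struct model_napi *n)
{
	n->sched = true;
}

/*
 * Model of an enable call: legal only if the bit is currently held,
 * i.e. a disable happened before. Returns false where the kernel
 * would be expected to BUG.
 */
static bool model_napi_enable(struct model_napi *n)
{
	if (!n->sched)
		return false;
	n->sched = false;
	return true;
}

/*
 * Hypothetical open path following the suggestion above: force every
 * per-port struct into the disabled state first, then enable each one.
 */
static bool model_open(struct model_napi *ports, int nports)
{
	for (int i = 0; i < nports; i++)
		model_napi_disable(&ports[i]);

	for (int i = 0; i < nports; i++)
		if (!model_napi_enable(&ports[i]))
			return false;

	return true;
}
```

In this model, calling model_napi_enable() on a freshly zeroed struct fails, while model_open() succeeds because every enable is preceded by a disable. Whether a forced disable in ehea_open() is the right fix for the real driver still depends on the audit mentioned above.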
> Also, on this code, in ehea_sense_port_attr()
>
> 	/* Number of default QPs */
> 	if (use_mcs)
> 		port->num_def_qps = cb0->num_default_qps;
> 	else
> 		port->num_def_qps = 1;
>
>
> When using napi, since we have multi-queue napi support now, wouldn't we
> want to use all the default qps instead of 1?
I don't know how this hardware works, you tell me :-)
* Re: eHEA driver issues from net-2.6.24
From: Andrew Theurer @ 2007-08-22 22:20 UTC
To: David Miller; +Cc: ossthema, netdev
David Miller wrote:
> From: Andrew Theurer <habanero@us.ibm.com>
> Date: Wed, 22 Aug 2007 16:55:03 -0500
>
> Thanks for finally getting to test this, I thought nobody
> would test this until it got merged into 2.6.24 :-/
>
>
>> kernel BUG at include/linux/netdevice.h:318!
>> enter ? for help
>> [c00000000f613e40] c0000000003fe394 .net_rx_action+0x1b8/0x254
>> [c00000000f613ef0] c000000000057b70 .__do_softirq+0xa8/0x164
>> [c00000000f613f90] c000000000024438 .call_do_softirq+0x14/0x24
>> [c000000b8ffbf9f0] c00000000000bd30 .do_softirq+0x68/0xac
>> [c000000b8ffbfa80] c000000000057cc4 .irq_exit+0x54/0x6c
>> [c000000b8ffbfb00] c00000000000c358 .do_IRQ+0x170/0x1ac
>> [c000000b8ffbfb90] c000000000004780 hardware_interrupt_entry+0x18/0x98
>> --- Exception: 501 (Hardware Interrupt) at c000000000010bdc
>> .cpu_idle+0x114/0x1e0
>> [c000000b8ffbfe80] c000000000010bd0 .cpu_idle+0x108/0x1e0 (unreliable)
>> [c000000b8ffbff00] c000000000026db0 .start_secondary+0x160/0x184
>> [c000000b8ffbff90] c000000000008364 .start_secondary_prolog+0xc/0x10
>>
>> I'm a little confused if the port_napi_enable() is being called when the
>> device is initialized, but then again, this is all new to me (should it
>> be called in ehea_open?). I see it called on some reset routines, but
>> not on the first initialization.
>>
>
> This is similar to the problem that Arnaldo hit a few minutes
> ago in the VIA Rhine driver.
>
> You can only make a napi_enable() call when there has been
> a previous napi_disable().
>
> One way to fix this would be to forcefully napi_disable() on
> all the per-port NAPI structs at the beginning of ehea_open(),
> which should set things up to satisfy the pre-condition of the
> napi_enable() calls.
>
OK, I'll try this.
> You'll need to audit the entire driver to make sure this invariant
> is held properly.
>
>
>> Also, on this code, in ehea_sense_port_attr()
>>
>> 	/* Number of default QPs */
>> 	if (use_mcs)
>> 		port->num_def_qps = cb0->num_default_qps;
>> 	else
>> 		port->num_def_qps = 1;
>>
>>
>> When using napi, since we have multi-queue napi support now, wouldn't we
>> want to use all the default qps instead of 1?
>>
>
> I don't know how this hardware works, you tell me :-)
>
Heh, I don't know it well, either. Maybe Jan-Bernd can chime in.
Thanks for your help,
-Andrew
* Re: eHEA driver issues from net-2.6.24
From: Jan-Bernd Themann @ 2007-08-23 6:55 UTC
To: Andrew Theurer; +Cc: David Miller, netdev
On Thursday 23 August 2007 00:20, Andrew Theurer wrote:
> David Miller wrote:
> > From: Andrew Theurer <habanero@us.ibm.com>
> > Date: Wed, 22 Aug 2007 16:55:03 -0500
> >
> > Thanks for finally getting to test this, I thought nobody
> > would test this until it got merged into 2.6.24 :-/
> >
Yes, sorry for the delay.
> >
> >> kernel BUG at include/linux/netdevice.h:318!
> >> enter ? for help
> >> [c00000000f613e40] c0000000003fe394 .net_rx_action+0x1b8/0x254
> >> [c00000000f613ef0] c000000000057b70 .__do_softirq+0xa8/0x164
> >> [c00000000f613f90] c000000000024438 .call_do_softirq+0x14/0x24
> >> [c000000b8ffbf9f0] c00000000000bd30 .do_softirq+0x68/0xac
> >> [c000000b8ffbfa80] c000000000057cc4 .irq_exit+0x54/0x6c
> >> [c000000b8ffbfb00] c00000000000c358 .do_IRQ+0x170/0x1ac
> >> [c000000b8ffbfb90] c000000000004780 hardware_interrupt_entry+0x18/0x98
> >> --- Exception: 501 (Hardware Interrupt) at c000000000010bdc
> >> .cpu_idle+0x114/0x1e0
> >> [c000000b8ffbfe80] c000000000010bd0 .cpu_idle+0x108/0x1e0 (unreliable)
> >> [c000000b8ffbff00] c000000000026db0 .start_secondary+0x160/0x184
> >> [c000000b8ffbff90] c000000000008364 .start_secondary_prolog+0xc/0x10
> >>
> >> I'm a little confused if the port_napi_enable() is being called when the
> >> device is initialized, but then again, this is all new to me (should it
> >> be called in ehea_open?). I see it called on some reset routines, but
> >> not on the first initialization.
> >>
> >
> > This is similar to the problem that Arnaldo hit a few minutes
> > ago in the VIA Rhine driver.
> >
> > You can only make a napi_enable() call when there has been
> > a previous napi_disable().
> >
> > One way to fix this would be to forcefully napi_disable() on
> > all the per-port NAPI structs at the beginning of ehea_open(),
> > which should set things up to satisfy the pre-condition of the
> > napi_enable() calls.
> >
> OK, I'll try this.
Let me fix this. I'll try to get it done today.
> > You'll need to audit the entire driver to make sure this invariant
> > is held properly.
> >
> >
> >> Also, on this code, in ehea_sense_port_attr()
> >>
> > >> 	/* Number of default QPs */
> > >> 	if (use_mcs)
> > >> 		port->num_def_qps = cb0->num_default_qps;
> > >> 	else
> > >> 		port->num_def_qps = 1;
> >>
> >>
> >> When using napi, since we have multi-queue napi support now, wouldn't we
> >> want to use all the default qps instead of 1?
> >>
> >
> > I don't know how this hardware works, you tell me :-)
> >
> Heh, I don't know it well, either. Maybe Jan-Bernd can chime in.
We'd like to keep the possibility to switch back to a single queue for now.
However, we could enable multi-queue support by default now.
I'll include this in the patch.
>
> Thanks for your help,
>
> -Andrew
>
>
* Re: eHEA driver issues from net-2.6.24
From: David Miller @ 2007-08-23 8:17 UTC
To: ossthema; +Cc: habanero, netdev
From: Jan-Bernd Themann <ossthema@de.ibm.com>
Date: Thu, 23 Aug 2007 08:55:29 +0200
> We'd like to keep the possibility to switch back to a single queue
> for now.
Please do not do this, we already have way too much configurability
out there.
If you have the physical hardware queues enabled, use multiqueue napi
support.
If you add a knob to use or not use multi-napi, this makes life
more miserable for your users and your driver more complicated
and harder to maintain.
* Re: eHEA driver issues from net-2.6.24
From: Jan-Bernd Themann @ 2007-08-23 7:56 UTC
To: David Miller; +Cc: habanero, netdev
Hi David,
On Thursday 23 August 2007 10:17, David Miller wrote:
> From: Jan-Bernd Themann <ossthema@de.ibm.com>
> Date: Thu, 23 Aug 2007 08:55:29 +0200
>
> > We'd like to keep the possibility to switch back to a single queue
> > for now.
>
> Please do not do this, we already have way too much configurability
> out there.
OK, we decided to remove the switch for kernel 2.6.24.
Regards,
Jan-Bernd