* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery [not found] ` <20071107222446.GM4239@austin.ibm.com> @ 2007-11-07 22:45 ` Kok, Auke 2007-11-07 23:21 ` Linas Vepstas 2007-11-08 1:15 ` Stephen Hemminger 0 siblings, 2 replies; 6+ messages in thread From: Kok, Auke @ 2007-11-07 22:45 UTC (permalink / raw) To: Linas Vepstas Cc: wenxiong, e1000-devel, Brandeburg, Jesse, john.ronciak, jeffrey.t.kirsher, Jeff Garzik, NetDev, 'Stephen Hemminger' [adding netdev, jeff G to the Cc] Linas Vepstas wrote: > On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote: >> Linas Vepstas wrote: >>> If a PCI bus error is encountered during device open, the >>> error recovery routines will attempt to close the device. >>> If napi has not yet been enabled, the napi disable in the >>> close will hang. >>> >>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com> >>> >>> ---- >>> The "elegence" of this solution is arguable: one could >>> say its "better" to perform this check in e1000_down(). >>> However, doing so will disrupt a commonly used path, >>> whereas here, the hack is in the infrequently used >>> error path, and thus less intrusive. >>> >>> drivers/net/e1000/e1000_main.c | 9 ++++++++- >>> 1 file changed, 8 insertions(+), 1 deletion(-) >>> >>> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c >>> =================================================================== >>> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c 2007-11-07 15:04:45.000000000 -0600 >>> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c 2007-11-07 15:04:52.000000000 -0600 >>> @@ -5268,8 +5268,15 @@ static pci_ers_result_t e1000_io_error_d >>> >>> netif_device_detach(netdev); >>> >>> - if (netif_running(netdev)) >>> + if (netif_running(netdev)) { >>> +#ifdef CONFIG_E1000_NAPI >>> + /* e1000_down will spinlock in napi_disable() if we >>> + * catch an error before napi_enable() was called. */ >>> + if (test_bit(NAPI_STATE_SCHED, &adapter->napi.state)) >>> + napi_enable(&adapter->napi); >>> +#endif >>> e1000_down(adapter); >>> + } >>> pci_disable_device(pdev); >>> >>> /* Request a slot slot reset. */ >> I think this is OK, but it's quite awful looking if you ask me. > > Yeah, ... > > There are several alternatives: below are two. If you > find one to be more appealing.. could you use it? Consider them > to be "signed-off-by"; I have not actually compiled or tested > either of them. I'm not a particular fan of putting extra state tracking in the driver for something we could extract from the napi subsystem already. Jeff, Stephen, can't we have a generic napi_enabled() inline in netdevice.h that tests for NAPI_STATE_SCHED ? I wonder if there isn't something in the PCI error recovery missing the point and we can solve this problem better for all drivers somehow. Auke > > > drivers/net/e1000/e1000_main.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c > =================================================================== > --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c 2007-11-07 16:11:36.000000000 -0600 > +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c 2007-11-07 16:15:17.000000000 -0600 > @@ -633,7 +633,11 @@ e1000_down(struct e1000_adapter *adapter > set_bit(__E1000_DOWN, &adapter->flags); > > #ifdef CONFIG_E1000_NAPI > - napi_disable(&adapter->napi); > + /* napi_disable() will spinlock if we are in the > + * pci error recovery path, and caught a pci > + * error before napi_enable() was called. */ > + if (!test_bit(NAPI_STATE_SCHED, &adapter->napi.state)) > + napi_disable(&adapter->napi); > #endif > e1000_irq_disable(adapter); > > and here's another: > > drivers/net/e1000/e1000.h | 1 + > drivers/net/e1000/e1000_main.c | 7 ++++++- > 2 files changed, 7 insertions(+), 1 deletion(-) > > Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000.h > =================================================================== > --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000.h 2007-09-26 15:06:56.000000000 -0500 > +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000.h 2007-11-07 16:19:16.000000000 -0600 > @@ -301,6 +301,7 @@ struct e1000_adapter { > struct e1000_rx_ring *rx_ring; /* One per active queue */ > #ifdef CONFIG_E1000_NAPI > struct napi_struct napi; > + int napi_enabled; > struct net_device *polling_netdev; /* One per active queue */ > #endif > int num_tx_queues; > Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c > =================================================================== > --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c 2007-11-07 16:16:11.000000000 -0600 > +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c 2007-11-07 16:22:30.000000000 -0600 > @@ -545,6 +545,7 @@ int e1000_up(struct e1000_adapter *adapt > > #ifdef CONFIG_E1000_NAPI > napi_enable(&adapter->napi); > + adapter->napi_enabled = 1; > #endif > e1000_irq_enable(adapter); > > @@ -633,7 +634,10 @@ e1000_down(struct e1000_adapter *adapter > set_bit(__E1000_DOWN, &adapter->flags); > > #ifdef CONFIG_E1000_NAPI > - napi_disable(&adapter->napi); > + if (adapter->napi_enabled) { > + napi_disable(&adapter->napi); > + adapter->napi_enabled = 0; > + } > #endif > e1000_irq_disable(adapter); > > @@ -1437,6 +1441,7 @@ e1000_open(struct net_device *netdev) > > #ifdef CONFIG_E1000_NAPI > napi_enable(&adapter->napi); > + adapter->napi_enabled = 1; > #endif > > e1000_irq_enable(adapter); ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery 2007-11-07 22:45 ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Kok, Auke @ 2007-11-07 23:21 ` Linas Vepstas 2007-11-09 17:02 ` Ingo Oeser 2007-11-08 1:15 ` Stephen Hemminger 1 sibling, 1 reply; 6+ messages in thread From: Linas Vepstas @ 2007-11-07 23:21 UTC (permalink / raw) To: Kok, Auke Cc: Jeff Garzik, e1000-devel, NetDev, Brandeburg, Jesse, john.ronciak, jeffrey.t.kirsher, 'Stephen Hemminger', wenxiong On Wed, Nov 07, 2007 at 02:45:18PM -0800, Kok, Auke wrote: > [adding netdev, jeff G to the Cc] > > Linas Vepstas wrote: > > On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote: > >> Linas Vepstas wrote: > >>> If a PCI bus error is encountered during device open, the > >>> error recovery routines will attempt to close the device. > >>> If napi has not yet been enabled, the napi disable in the > >>> close will hang. > >>> > >>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com> > >>> > >>> ---- > >>> The "elegence" of this solution is arguable: one could > >>> say its "better" to perform this check in e1000_down(). > >>> However, doing so will disrupt a commonly used path, > >>> whereas here, the hack is in the infrequently used > >>> error path, and thus less intrusive. > >>> > >>> drivers/net/e1000/e1000_main.c | 9 ++++++++- > >>> 1 file changed, 8 insertions(+), 1 deletion(-) > >>> > >> I think this is OK, but it's quite awful looking if you ask me. > > > > Yeah, ... > > > > There are several alternatives: below are two. If you > > find one to be more appealing.. could you use it? Consider them > > to be "signed-off-by"; I have not actually compiled or tested > > either of them. > > I'm not a particular fan of putting extra state tracking in the driver for > something we could extract from the napi subsystem already. > > Jeff, Stephen, can't we have a generic napi_enabled() inline in netdevice.h that > tests for NAPI_STATE_SCHED ? Like this? include/linux/netdevice.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) Index: linux-2.6.23-rc8-mm1/include/linux/netdevice.h =================================================================== --- linux-2.6.23-rc8-mm1.orig/include/linux/netdevice.h 2007-09-26 15:07:05.000000000 -0500 +++ linux-2.6.23-rc8-mm1/include/linux/netdevice.h 2007-11-07 17:14:50.000000000 -0600 @@ -384,6 +384,18 @@ static inline void napi_enable(struct na clear_bit(NAPI_STATE_SCHED, &n->state); } +/** + * napi_enabled_p - return non-zero if napi enabled + * @n: napi context + * + * Mnemonic: _p stands for "predicate", returning a yes/no + * answer to the question. + */ +static inline int napi_enabled_p(struct napi_struct *n) +{ + return !test_bit(NAPI_STATE_SCHED, &n->state); +} + /* * The DEVICE structure. * Actually, this whole structure is a big mistake. It mixes I/O > I wonder if there isn't something in the PCI error recovery missing the point and > we can solve this problem better for all drivers somehow. Well, there's also scsi, which doesn't use napi :-) For the most part, error recovery is a fairly cut-n-paste set of steps. However, I don't quite have enough confidence to say "yea verily, all network adapters will use these same steps." --linas ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery 2007-11-07 23:21 ` Linas Vepstas @ 2007-11-09 17:02 ` Ingo Oeser 2007-11-09 22:40 ` Linas Vepstas 0 siblings, 1 reply; 6+ messages in thread From: Ingo Oeser @ 2007-11-09 17:02 UTC (permalink / raw) To: Linas Vepstas Cc: Kok, Auke, Jeff Garzik, e1000-devel, NetDev, Brandeburg, Jesse, john.ronciak, jeffrey.t.kirsher, 'Stephen Hemminger', wenxiong Hi Linas, Linas Vepstas schrieb: > Index: linux-2.6.23-rc8-mm1/include/linux/netdevice.h > =================================================================== > --- linux-2.6.23-rc8-mm1.orig/include/linux/netdevice.h 2007-09-26 15:07:05.000000000 -0500 > +++ linux-2.6.23-rc8-mm1/include/linux/netdevice.h 2007-11-07 17:14:50.000000000 -0600 > @@ -384,6 +384,18 @@ static inline void napi_enable(struct na > clear_bit(NAPI_STATE_SCHED, &n->state); > } > > +/** > + * napi_enabled_p - return non-zero if napi enabled > + * @n: napi context > + * > + * Mnemonic: _p stands for "predicate", returning a yes/no > + * answer to the question. Call it "is_napi_enabled()" an nobody will ask :-) > + */ > +static inline int napi_enabled_p(struct napi_struct *n) And please make it return "bool" instead of "int". Best Regards Ingo Oeser ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery 2007-11-09 17:02 ` Ingo Oeser @ 2007-11-09 22:40 ` Linas Vepstas 2007-11-10 16:55 ` Stephen Hemminger 0 siblings, 1 reply; 6+ messages in thread From: Linas Vepstas @ 2007-11-09 22:40 UTC (permalink / raw) To: Ingo Oeser Cc: Kok, Auke, wenxiong, e1000-devel, Brandeburg, Jesse, john.ronciak, jeffrey.t.kirsher, Jeff Garzik, NetDev, 'Stephen Hemminger' On Fri, Nov 09, 2007 at 06:02:34PM +0100, Ingo Oeser wrote: > Linas Vepstas schrieb: > > + * napi_enabled_p - return non-zero if napi enabled > > + * > > + * Mnemonic: _p stands for "predicate", returning a yes/no > > + * answer to the question. > > Call it "is_napi_enabled()" an nobody will ask :-) Heh. The suffix _p is standard coding style for lisp/scheme and first-order logic interpreters. This was my lame attempt to introduce it to the kernel. I guess that lame duck won't fly. --linas ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery 2007-11-09 22:40 ` Linas Vepstas @ 2007-11-10 16:55 ` Stephen Hemminger 0 siblings, 0 replies; 6+ messages in thread From: Stephen Hemminger @ 2007-11-10 16:55 UTC (permalink / raw) To: Linas Vepstas Cc: Kok, Auke, Jeff Garzik, e1000-devel, NetDev, Brandeburg, Jesse, Ingo Oeser, john.ronciak, jeffrey.t.kirsher, wenxiong On Fri, 9 Nov 2007 16:40:05 -0600 linas@austin.ibm.com (Linas Vepstas) wrote: > On Fri, Nov 09, 2007 at 06:02:34PM +0100, Ingo Oeser wrote: > > Linas Vepstas schrieb: > > > + * napi_enabled_p - return non-zero if napi enabled > > > + * > > > + * Mnemonic: _p stands for "predicate", returning a yes/no > > > + * answer to the question. > > > > Call it "is_napi_enabled()" an nobody will ask :-) > > Heh. The suffix _p is standard coding style for lisp/scheme > and first-order logic interpreters. This was my lame attempt > to introduce it to the kernel. I guess that lame duck won't fly. > > --linas > NO. it reeks of hungarian notation. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery 2007-11-07 22:45 ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Kok, Auke 2007-11-07 23:21 ` Linas Vepstas @ 2007-11-08 1:15 ` Stephen Hemminger 1 sibling, 0 replies; 6+ messages in thread From: Stephen Hemminger @ 2007-11-08 1:15 UTC (permalink / raw) To: Kok, Auke Cc: Jeff Garzik, e1000-devel, NetDev, Brandeburg, Jesse, john.ronciak, jeffrey.t.kirsher, Linas Vepstas, wenxiong On Wed, 07 Nov 2007 14:45:18 -0800 "Kok, Auke" <auke-jan.h.kok@intel.com> wrote: > [adding netdev, jeff G to the Cc] > > Linas Vepstas wrote: > > On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote: > >> Linas Vepstas wrote: > >>> If a PCI bus error is encountered during device open, the > >>> error recovery routines will attempt to close the device. > >>> If napi has not yet been enabled, the napi disable in the > >>> close will hang. > >>> > >>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com> > >>> > >>> ---- > >>> The "elegence" of this solution is arguable: one could > >>> say its "better" to perform this check in e1000_down(). > >>> However, doing so will disrupt a commonly used path, > >>> whereas here, the hack is in the infrequently used > >>> error path, and thus less intrusive. > >>> Same problem might be possible in suspend/resume. The problem with testing napi_enabled is that state can get changed by racing irq. -- Stephen Hemminger <shemminger@linux-foundation.org> ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-11-10 16:55 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20071107211935.GH4239@austin.ibm.com>
[not found] ` <20071107212404.GA15251@austin.ibm.com>
[not found] ` <47323319.10709@intel.com>
[not found] ` <20071107222446.GM4239@austin.ibm.com>
2007-11-07 22:45 ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Kok, Auke
2007-11-07 23:21 ` Linas Vepstas
2007-11-09 17:02 ` Ingo Oeser
2007-11-09 22:40 ` Linas Vepstas
2007-11-10 16:55 ` Stephen Hemminger
2007-11-08 1:15 ` Stephen Hemminger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).