netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
       [not found]     ` <20071107222446.GM4239@austin.ibm.com>
@ 2007-11-07 22:45       ` Kok, Auke
  2007-11-07 23:21         ` Linas Vepstas
  2007-11-08  1:15         ` Stephen Hemminger
  0 siblings, 2 replies; 6+ messages in thread
From: Kok, Auke @ 2007-11-07 22:45 UTC (permalink / raw)
  To: Linas Vepstas
  Cc: wenxiong, e1000-devel, Brandeburg, Jesse, john.ronciak,
	jeffrey.t.kirsher, Jeff Garzik, NetDev,
	'Stephen Hemminger'

[adding netdev, jeff G to the Cc]

Linas Vepstas wrote:
> On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote:
>> Linas Vepstas wrote:
>>> If a PCI bus error is encountered during device open, the
>>> error recovery routines will attempt to close the device.
>>> If napi has not yet been enabled, the napi disable in the
>>> close will hang. 
>>>
>>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
>>>
>>> ----
>>> The "elegence" of this solution is arguable: one could
>>> say its "better" to perform this check in e1000_down().
>>> However, doing so will disrupt a commonly used path,
>>> whereas here, the hack is in the infrequently used
>>> error path, and thus less intrusive. 
>>>
>>>  drivers/net/e1000/e1000_main.c |    9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c
>>> ===================================================================
>>> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c	2007-11-07 15:04:45.000000000 -0600
>>> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c	2007-11-07 15:04:52.000000000 -0600
>>> @@ -5268,8 +5268,15 @@ static pci_ers_result_t e1000_io_error_d
>>>  
>>>  	netif_device_detach(netdev);
>>>  
>>> -	if (netif_running(netdev))
>>> +	if (netif_running(netdev)) {
>>> +#ifdef CONFIG_E1000_NAPI
>>> +		/* e1000_down will spinlock in napi_disable() if we
>>> +		 * catch an error before napi_enable() was called. */
>>> +		if (test_bit(NAPI_STATE_SCHED, &adapter->napi.state))
>>> +			napi_enable(&adapter->napi);
>>> +#endif
>>>  		e1000_down(adapter);
>>> +	}
>>>  	pci_disable_device(pdev);
>>>  
>>>  	/* Request a slot slot reset. */
>> I think this is OK, but it's quite awful looking if you ask me.
> 
> Yeah, ... 
> 
> There are several alternatives: below are two. If you
> find one to be more appealing.. could you use it? Consider them
> to be "signed-off-by"; I have not actually compiled or tested
> either of them.

I'm not a particular fan of putting extra state tracking in the driver for
something we could extract from the napi subsystem already.

Jeff, Stephen, can't we have a generic napi_enabled() inline in netdevice.h that
tests for NAPI_STATE_SCHED ?

I wonder if there isn't something in the PCI error recovery missing the point and
we can solve this problem better for all drivers somehow.

Auke



> 
> 
>  drivers/net/e1000/e1000_main.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c	2007-11-07 16:11:36.000000000 -0600
> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c	2007-11-07 16:15:17.000000000 -0600
> @@ -633,7 +633,11 @@ e1000_down(struct e1000_adapter *adapter
>  	set_bit(__E1000_DOWN, &adapter->flags);
>  
>  #ifdef CONFIG_E1000_NAPI
> -	napi_disable(&adapter->napi);
> +	/* napi_disable() will spinlock if we are in the
> +	 * pci error recovery path, and caught a pci
> +	 * error before napi_enable() was called. */
> +	if (!test_bit(NAPI_STATE_SCHED, &adapter->napi.state))
> +		napi_disable(&adapter->napi);
>  #endif
>  	e1000_irq_disable(adapter);
>  
> and here's another:
> 
>  drivers/net/e1000/e1000.h      |    1 +
>  drivers/net/e1000/e1000_main.c |    7 ++++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000.h
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000.h	2007-09-26 15:06:56.000000000 -0500
> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000.h	2007-11-07 16:19:16.000000000 -0600
> @@ -301,6 +301,7 @@ struct e1000_adapter {
>  	struct e1000_rx_ring *rx_ring;      /* One per active queue */
>  #ifdef CONFIG_E1000_NAPI
>  	struct napi_struct napi;
> +	int napi_enabled;
>  	struct net_device *polling_netdev;  /* One per active queue */
>  #endif
>  	int num_tx_queues;
> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c	2007-11-07 16:16:11.000000000 -0600
> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c	2007-11-07 16:22:30.000000000 -0600
> @@ -545,6 +545,7 @@ int e1000_up(struct e1000_adapter *adapt
>  
>  #ifdef CONFIG_E1000_NAPI
>  	napi_enable(&adapter->napi);
> +	adapter->napi_enabled = 1;
>  #endif
>  	e1000_irq_enable(adapter);
>  
> @@ -633,7 +634,10 @@ e1000_down(struct e1000_adapter *adapter
>  	set_bit(__E1000_DOWN, &adapter->flags);
>  
>  #ifdef CONFIG_E1000_NAPI
> -	napi_disable(&adapter->napi);
> +	if (adapter->napi_enabled) {
> +		napi_disable(&adapter->napi);
> +		adapter->napi_enabled = 0;
> +	}
>  #endif
>  	e1000_irq_disable(adapter);
>  
> @@ -1437,6 +1441,7 @@ e1000_open(struct net_device *netdev)
>  
>  #ifdef CONFIG_E1000_NAPI
>  	napi_enable(&adapter->napi);
> +	adapter->napi_enabled = 1;
>  #endif
>  
>  	e1000_irq_enable(adapter);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
  2007-11-07 22:45       ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Kok, Auke
@ 2007-11-07 23:21         ` Linas Vepstas
  2007-11-09 17:02           ` Ingo Oeser
  2007-11-08  1:15         ` Stephen Hemminger
  1 sibling, 1 reply; 6+ messages in thread
From: Linas Vepstas @ 2007-11-07 23:21 UTC (permalink / raw)
  To: Kok, Auke
  Cc: Jeff Garzik, e1000-devel, NetDev, Brandeburg,  Jesse,
	john.ronciak, jeffrey.t.kirsher, 'Stephen Hemminger',
	wenxiong

On Wed, Nov 07, 2007 at 02:45:18PM -0800, Kok, Auke wrote:
> [adding netdev, jeff G to the Cc]
> 
> Linas Vepstas wrote:
> > On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote:
> >> Linas Vepstas wrote:
> >>> If a PCI bus error is encountered during device open, the
> >>> error recovery routines will attempt to close the device.
> >>> If napi has not yet been enabled, the napi disable in the
> >>> close will hang. 
> >>>
> >>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
> >>>
> >>> ----
> >>> The "elegence" of this solution is arguable: one could
> >>> say its "better" to perform this check in e1000_down().
> >>> However, doing so will disrupt a commonly used path,
> >>> whereas here, the hack is in the infrequently used
> >>> error path, and thus less intrusive. 
> >>>
> >>>  drivers/net/e1000/e1000_main.c |    9 ++++++++-
> >>>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>>
> >> I think this is OK, but it's quite awful looking if you ask me.
> > 
> > Yeah, ... 
> > 
> > There are several alternatives: below are two. If you
> > find one to be more appealing.. could you use it? Consider them
> > to be "signed-off-by"; I have not actually compiled or tested
> > either of them.
> 
> I'm not a particular fan of putting extra state tracking in the driver for
> something we could extract from the napi subsystem already.
> 
> Jeff, Stephen, can't we have a generic napi_enabled() inline in netdevice.h that
> tests for NAPI_STATE_SCHED ?

Like this?

 include/linux/netdevice.h |   12 ++++++++++++
 1 file changed, 12 insertions(+)

Index: linux-2.6.23-rc8-mm1/include/linux/netdevice.h
===================================================================
--- linux-2.6.23-rc8-mm1.orig/include/linux/netdevice.h	2007-09-26 15:07:05.000000000 -0500
+++ linux-2.6.23-rc8-mm1/include/linux/netdevice.h	2007-11-07 17:14:50.000000000 -0600
@@ -384,6 +384,18 @@ static inline void napi_enable(struct na
 	clear_bit(NAPI_STATE_SCHED, &n->state);
 }
 
+/**
+ *	napi_enabled_p - return non-zero if napi enabled
+ *	@n: napi context
+ * 
+ * Mnemonic: _p stands for "predicate", returning a yes/no
+ * answer to the question.
+ */
+static inline int napi_enabled_p(struct napi_struct *n)
+{
+	return !test_bit(NAPI_STATE_SCHED, &n->state);
+}
+
 /*
  *	The DEVICE structure.
  *	Actually, this whole structure is a big mistake.  It mixes I/O


> I wonder if there isn't something in the PCI error recovery missing the point and
> we can solve this problem better for all drivers somehow.

Well, there's also scsi, which doesn't use napi :-)
For the most part, error recovery is a fairly cut-n-paste
set of steps. However, I don't quite have enough confidence
to say "yea verily, all network adapters will use these 
same steps."

--linas

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
  2007-11-07 22:45       ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Kok, Auke
  2007-11-07 23:21         ` Linas Vepstas
@ 2007-11-08  1:15         ` Stephen Hemminger
  1 sibling, 0 replies; 6+ messages in thread
From: Stephen Hemminger @ 2007-11-08  1:15 UTC (permalink / raw)
  To: Kok, Auke
  Cc: Jeff Garzik, e1000-devel, NetDev, Brandeburg,  Jesse,
	john.ronciak, jeffrey.t.kirsher, Linas Vepstas, wenxiong

On Wed, 07 Nov 2007 14:45:18 -0800
"Kok, Auke" <auke-jan.h.kok@intel.com> wrote:

> [adding netdev, jeff G to the Cc]
> 
> Linas Vepstas wrote:
> > On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote:
> >> Linas Vepstas wrote:
> >>> If a PCI bus error is encountered during device open, the
> >>> error recovery routines will attempt to close the device.
> >>> If napi has not yet been enabled, the napi disable in the
> >>> close will hang. 
> >>>
> >>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
> >>>
> >>> ----
> >>> The "elegence" of this solution is arguable: one could
> >>> say its "better" to perform this check in e1000_down().
> >>> However, doing so will disrupt a commonly used path,
> >>> whereas here, the hack is in the infrequently used
> >>> error path, and thus less intrusive. 
> >>>

Same problem might be possible in suspend/resume.
The problem with testing napi_enabled is that state can get changed
by racing irq.

-- 
Stephen Hemminger <shemminger@linux-foundation.org>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
  2007-11-07 23:21         ` Linas Vepstas
@ 2007-11-09 17:02           ` Ingo Oeser
  2007-11-09 22:40             ` Linas Vepstas
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Oeser @ 2007-11-09 17:02 UTC (permalink / raw)
  To: Linas Vepstas
  Cc: Kok, Auke, Jeff Garzik, e1000-devel, NetDev, Brandeburg,  Jesse,
	john.ronciak, jeffrey.t.kirsher, 'Stephen Hemminger',
	wenxiong

Hi Linas,

Linas Vepstas schrieb:
> Index: linux-2.6.23-rc8-mm1/include/linux/netdevice.h
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/include/linux/netdevice.h	2007-09-26 15:07:05.000000000 -0500
> +++ linux-2.6.23-rc8-mm1/include/linux/netdevice.h	2007-11-07 17:14:50.000000000 -0600
> @@ -384,6 +384,18 @@ static inline void napi_enable(struct na
>  	clear_bit(NAPI_STATE_SCHED, &n->state);
>  }
>  
> +/**
> + *	napi_enabled_p - return non-zero if napi enabled
> + *	@n: napi context
> + * 
> + * Mnemonic: _p stands for "predicate", returning a yes/no
> + * answer to the question.

Call it "is_napi_enabled()" an nobody will ask :-)

> + */
> +static inline int napi_enabled_p(struct napi_struct *n)

And please make it return "bool" instead of "int".


Best Regards

Ingo Oeser

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
  2007-11-09 17:02           ` Ingo Oeser
@ 2007-11-09 22:40             ` Linas Vepstas
  2007-11-10 16:55               ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Linas Vepstas @ 2007-11-09 22:40 UTC (permalink / raw)
  To: Ingo Oeser
  Cc: Kok, Auke, wenxiong, e1000-devel, Brandeburg, Jesse, john.ronciak,
	jeffrey.t.kirsher, Jeff Garzik, NetDev,
	'Stephen Hemminger'

On Fri, Nov 09, 2007 at 06:02:34PM +0100, Ingo Oeser wrote:
> Linas Vepstas schrieb:
> > + *	napi_enabled_p - return non-zero if napi enabled
> > + * 
> > + * Mnemonic: _p stands for "predicate", returning a yes/no
> > + * answer to the question.
> 
> Call it "is_napi_enabled()" an nobody will ask :-)

Heh. The suffix _p is standard coding style for lisp/scheme
and first-order logic interpreters.  This was my lame attempt
to introduce it to the kernel. I guess that lame duck won't fly.

--linas


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
  2007-11-09 22:40             ` Linas Vepstas
@ 2007-11-10 16:55               ` Stephen Hemminger
  0 siblings, 0 replies; 6+ messages in thread
From: Stephen Hemminger @ 2007-11-10 16:55 UTC (permalink / raw)
  To: Linas Vepstas
  Cc: Kok, Auke, Jeff Garzik, e1000-devel, NetDev, Brandeburg,  Jesse,
	Ingo Oeser, john.ronciak, jeffrey.t.kirsher, wenxiong

On Fri, 9 Nov 2007 16:40:05 -0600
linas@austin.ibm.com (Linas Vepstas) wrote:

> On Fri, Nov 09, 2007 at 06:02:34PM +0100, Ingo Oeser wrote:
> > Linas Vepstas schrieb:
> > > + *	napi_enabled_p - return non-zero if napi enabled
> > > + * 
> > > + * Mnemonic: _p stands for "predicate", returning a yes/no
> > > + * answer to the question.
> > 
> > Call it "is_napi_enabled()" an nobody will ask :-)
> 
> Heh. The suffix _p is standard coding style for lisp/scheme
> and first-order logic interpreters.  This was my lame attempt
> to introduce it to the kernel. I guess that lame duck won't fly.
> 
> --linas
> 

NO.  it reeks of hungarian notation.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2007-11-10 16:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20071107211935.GH4239@austin.ibm.com>
     [not found] ` <20071107212404.GA15251@austin.ibm.com>
     [not found]   ` <47323319.10709@intel.com>
     [not found]     ` <20071107222446.GM4239@austin.ibm.com>
2007-11-07 22:45       ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Kok, Auke
2007-11-07 23:21         ` Linas Vepstas
2007-11-09 17:02           ` Ingo Oeser
2007-11-09 22:40             ` Linas Vepstas
2007-11-10 16:55               ` Stephen Hemminger
2007-11-08  1:15         ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).