netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Kok, Auke" <auke-jan.h.kok@intel.com>
To: Linas Vepstas <linas@austin.ibm.com>
Cc: wenxiong@us.ibm.com, e1000-devel@lists.sourceforge.net,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	john.ronciak@intel.com, jeffrey.t.kirsher@intel.com,
	Jeff Garzik <jeff@garzik.org>, NetDev <netdev@vger.kernel.org>,
	'Stephen Hemminger' <shemminger@linux-foundation.org>
Subject: Re: [PATCH 2/2]: e1000: avoid lockup durig error recovery
Date: Wed, 07 Nov 2007 14:45:18 -0800	[thread overview]
Message-ID: <47323FFE.2060008@intel.com> (raw)
In-Reply-To: <20071107222446.GM4239@austin.ibm.com>

[adding netdev, jeff G to the Cc]

Linas Vepstas wrote:
> On Wed, Nov 07, 2007 at 01:50:17PM -0800, Kok, Auke wrote:
>> Linas Vepstas wrote:
>>> If a PCI bus error is encountered during device open, the
>>> error recovery routines will attempt to close the device.
>>> If napi has not yet been enabled, the napi disable in the
>>> close will hang. 
>>>
>>> Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
>>>
>>> ----
>>> The "elegence" of this solution is arguable: one could
>>> say its "better" to perform this check in e1000_down().
>>> However, doing so will disrupt a commonly used path,
>>> whereas here, the hack is in the infrequently used
>>> error path, and thus less intrusive. 
>>>
>>>  drivers/net/e1000/e1000_main.c |    9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c
>>> ===================================================================
>>> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c	2007-11-07 15:04:45.000000000 -0600
>>> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c	2007-11-07 15:04:52.000000000 -0600
>>> @@ -5268,8 +5268,15 @@ static pci_ers_result_t e1000_io_error_d
>>>  
>>>  	netif_device_detach(netdev);
>>>  
>>> -	if (netif_running(netdev))
>>> +	if (netif_running(netdev)) {
>>> +#ifdef CONFIG_E1000_NAPI
>>> +		/* e1000_down will spinlock in napi_disable() if we
>>> +		 * catch an error before napi_enable() was called. */
>>> +		if (test_bit(NAPI_STATE_SCHED, &adapter->napi.state))
>>> +			napi_enable(&adapter->napi);
>>> +#endif
>>>  		e1000_down(adapter);
>>> +	}
>>>  	pci_disable_device(pdev);
>>>  
>>>  	/* Request a slot slot reset. */
>> I think this is OK, but it's quite awful looking if you ask me.
> 
> Yeah, ... 
> 
> There are several alternatives: below are two. If you
> find one to be more appealing.. could you use it? Consider them
> to be "signed-off-by"; I have not actually compiled or tested
> either of them.

I'm not a particular fan of putting extra state tracking in the driver for
something we could extract from the napi subsystem already.

Jeff, Stephen, can't we have a generic napi_enabled() inline in netdevice.h that
tests for NAPI_STATE_SCHED ?

I wonder if there isn't something in the PCI error recovery missing the point and
we can solve this problem better for all drivers somehow.

Auke



> 
> 
>  drivers/net/e1000/e1000_main.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c	2007-11-07 16:11:36.000000000 -0600
> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c	2007-11-07 16:15:17.000000000 -0600
> @@ -633,7 +633,11 @@ e1000_down(struct e1000_adapter *adapter
>  	set_bit(__E1000_DOWN, &adapter->flags);
>  
>  #ifdef CONFIG_E1000_NAPI
> -	napi_disable(&adapter->napi);
> +	/* napi_disable() will spinlock if we are in the
> +	 * pci error recovery path, and caught a pci
> +	 * error before napi_enable() was called. */
> +	if (!test_bit(NAPI_STATE_SCHED, &adapter->napi.state))
> +		napi_disable(&adapter->napi);
>  #endif
>  	e1000_irq_disable(adapter);
>  
> and here's another:
> 
>  drivers/net/e1000/e1000.h      |    1 +
>  drivers/net/e1000/e1000_main.c |    7 ++++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000.h
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000.h	2007-09-26 15:06:56.000000000 -0500
> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000.h	2007-11-07 16:19:16.000000000 -0600
> @@ -301,6 +301,7 @@ struct e1000_adapter {
>  	struct e1000_rx_ring *rx_ring;      /* One per active queue */
>  #ifdef CONFIG_E1000_NAPI
>  	struct napi_struct napi;
> +	int napi_enabled;
>  	struct net_device *polling_netdev;  /* One per active queue */
>  #endif
>  	int num_tx_queues;
> Index: linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.23-rc8-mm1.orig/drivers/net/e1000/e1000_main.c	2007-11-07 16:16:11.000000000 -0600
> +++ linux-2.6.23-rc8-mm1/drivers/net/e1000/e1000_main.c	2007-11-07 16:22:30.000000000 -0600
> @@ -545,6 +545,7 @@ int e1000_up(struct e1000_adapter *adapt
>  
>  #ifdef CONFIG_E1000_NAPI
>  	napi_enable(&adapter->napi);
> +	adapter->napi_enabled = 1;
>  #endif
>  	e1000_irq_enable(adapter);
>  
> @@ -633,7 +634,10 @@ e1000_down(struct e1000_adapter *adapter
>  	set_bit(__E1000_DOWN, &adapter->flags);
>  
>  #ifdef CONFIG_E1000_NAPI
> -	napi_disable(&adapter->napi);
> +	if (adapter->napi_enabled) {
> +		napi_disable(&adapter->napi);
> +		adapter->napi_enabled = 0;
> +	}
>  #endif
>  	e1000_irq_disable(adapter);
>  
> @@ -1437,6 +1441,7 @@ e1000_open(struct net_device *netdev)
>  
>  #ifdef CONFIG_E1000_NAPI
>  	napi_enable(&adapter->napi);
> +	adapter->napi_enabled = 1;
>  #endif
>  
>  	e1000_irq_enable(adapter);

       reply	other threads:[~2007-11-07 22:45 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20071107211935.GH4239@austin.ibm.com>
     [not found] ` <20071107212404.GA15251@austin.ibm.com>
     [not found]   ` <47323319.10709@intel.com>
     [not found]     ` <20071107222446.GM4239@austin.ibm.com>
2007-11-07 22:45       ` Kok, Auke [this message]
2007-11-07 23:21         ` [PATCH 2/2]: e1000: avoid lockup durig error recovery Linas Vepstas
2007-11-09 17:02           ` Ingo Oeser
2007-11-09 22:40             ` Linas Vepstas
2007-11-10 16:55               ` Stephen Hemminger
2007-11-08  1:15         ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47323FFE.2060008@intel.com \
    --to=auke-jan.h.kok@intel.com \
    --cc=e1000-devel@lists.sourceforge.net \
    --cc=jeff@garzik.org \
    --cc=jeffrey.t.kirsher@intel.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=john.ronciak@intel.com \
    --cc=linas@austin.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=shemminger@linux-foundation.org \
    --cc=wenxiong@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).