Re: e1000_down and tx_timeout worker race cleaning the transmit buffers

From: Auke Kok <sofar@foo-projects.org>
To: Shaw <shawvrana@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>,
	Michael Chan <mchan@broadcom.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	netdev@vger.kernel.org, auke-jan.h.kok@intel.com,
	davem@davemloft.net, jgarzik@pobox.com
Subject: Re: e1000_down and tx_timeout worker race cleaning the transmit buffers
Date: Wed, 26 Apr 2006 21:55:03 -0700
Message-ID: <44504EA7.5050901@foo-projects.org>
In-Reply-To: <7bb8b8de0604261714h2471420xa06bb6639ddb6cea@mail.gmail.com>

Shaw wrote:
> On 4/21/06, Andy Gospodarek <andy@greyhouse.net> wrote:
>> On 4/21/06, Michael Chan <mchan@broadcom.com> wrote:
>>> On Fri, 2006-04-21 at 16:01 -0400, Andy Gospodarek wrote:
>>>
>>>> I just hate to see extra resources used to solve problems that good
>>>> coding can solve (not that my suggestion is necessarily a 'good' one),
>>>> so I was trying to think of a way to resolve this without explicitly
>>>> adding another workqueue.
>>> If you don't want to add another workqueue, then look at tg3, bnx2, and
>>> one of the smc drivers on how to effectively wait for the driver's
>>> workqueue task to finish without deadlocking with linkwatch_event.
>>>
>> I agree 100%.  I just hope others can manage to figure that out too.
> 
> Ok, here's another attempt.  The goal here is to serialize attempts to
> clean the tx and rx buffers, and ensure that e1000_close is called
> after the tx_timeout_task has completed running and/or that the task
> is safe to run after e1000_close has run.
> 
> I'm concerned about the addition of the netif_running check to
> e1000_down.  While something like this is needed, I'm not familiar
> enough w/ the code to know if this is okay.
> All explanations and comments are greatly appreciated.

I apologise for not getting back to you on this earlier, but Jesse Brandeburg and I 
have been digging into this for two days and are making good progress. One of 
the main fixes is that we're taking out a watchdog reset task completely 
and doing down/up cycles instead, which removes a large portion of the race 
conditions at this stage (the tx_timeout triggers a watchdog reset, 
which can happen during an e1000_down and cause a double free interrupt or a 
double allocation).
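
To illustrate the kind of race discussed in this thread, here is a minimal user-space sketch, not e1000 code. It assumes two paths (standing in for e1000_down and the tx_timeout task) that may both try to clean the same transmit buffers. The names tx_ring, ring_fill and ring_clean are hypothetical. It does not serialize the two cleaners with a lock; it uses a different technique, where each cleaner claims a slot with an atomic exchange before freeing it, so a second pass over the ring finds nothing left to free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define RING_SIZE 8

/* Hypothetical stand-in for a transmit ring; not the e1000 structures. */
struct tx_ring {
	_Atomic(void *) buf[RING_SIZE];
	atomic_int freed;	/* counts buffers actually released */
};

/* Allocate one buffer per slot. Returns 0 on success, -1 on failure. */
int ring_fill(struct tx_ring *ring)
{
	assert(ring != NULL);
	atomic_init(&ring->freed, 0);
	for (int i = 0; i < RING_SIZE; i++)
		atomic_init(&ring->buf[i], (void *)NULL);
	for (int i = 0; i < RING_SIZE; i++) {
		void *p = malloc(64);
		if (p == NULL)
			return -1;
		atomic_store(&ring->buf[i], p);
	}
	return 0;
}

/*
 * Release every buffer still owned by the ring. The exchange hands each
 * pointer to exactly one caller, so two cleaners running back to back
 * (or concurrently) never free the same buffer twice.
 */
void ring_clean(struct tx_ring *ring)
{
	assert(ring != NULL);
	for (int i = 0; i < RING_SIZE; i++) {
		void *p = atomic_exchange(&ring->buf[i], (void *)NULL);
		if (p != NULL) {
			free(p);
			atomic_fetch_add(&ring->freed, 1);
		}
	}
}
```

Calling ring_clean twice on a filled ring releases each buffer once. A real driver also has to deal with DMA unmapping, interrupts and the hardware state, which this sketch leaves out entirely.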

We are now working on removing the last race, which is between the ioctl path 
and the ifdown/ifup path. The remaining race location is in the ethtool test, 
which does all sorts of funny low-level driver stuff that can seriously OOPS 
if you run ethtool tests while bringing your interface up or down.

While I appreciate patches ;^) I think we're on a better path by making these 
cleanups and actually reducing the amount of code substantially in places. I 
hope to be able to push something out for RFC soon. An added benefit will be 
that we're dropping a whole bunch of irq operations where we didn't need them 
(soft resets).

Cheers,

Auke

Thread overview: 20+ messages
     [not found] <200604201035.00100.shaw@vranix.com>
2006-04-20 23:36 ` e1000_down and tx_timeout worker race cleaning the transmit buffers Herbert Xu
2006-04-20 23:51   ` Herbert Xu
2006-04-20 22:36     ` Michael Chan
2006-04-21  1:27       ` Herbert Xu
2006-04-21  1:33         ` Herbert Xu
2006-04-21  0:10           ` Michael Chan
2006-04-21  2:37             ` Herbert Xu
2006-04-21  2:40               ` Herbert Xu
2006-04-21  1:24                 ` Michael Chan
2006-04-21 13:27                   ` Andy Gospodarek
2006-04-21 15:28                     ` Michael Chan
2006-04-21 20:01                       ` Andy Gospodarek
2006-04-21 19:00                         ` Michael Chan
2006-04-21 20:46                           ` Andy Gospodarek
2006-04-27  0:14                             ` Shaw
2006-04-27  4:55                               ` Auke Kok [this message]
2006-04-29 21:57                                 ` Shaw Vrana
2006-04-21  2:42             ` Shaw Vrana
2006-04-21  1:33               ` Michael Chan
2006-04-21  3:05             ` shaw
