From: Jeff Garzik <jgarzik@pobox.com>
To: Grant Grundler <grundler@parisc-linux.org>,
	Andrew Morton <akpm@osdl.org>
Cc: netdev@vger.kernel.org, Val Henson <val.henson@gmail.com>
Subject: Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
Date: Thu, 08 Jun 2006 10:43:04 -0400
Message-ID: <44883778.8000209@pobox.com>
In-Reply-To: <20060531195234.GA4967@colo.lackof.org>


(CC'ing our newly minted tulip maintainer, Val)

Grant Grundler wrote:
> Jeff,
> SLES10 testing exposed an MCA that was confirmed to be a DMA IO TLB miss.
> This means tulip device was attempting to DMA to memory that was already
> unmapped. The test was crashing in the "ifconfig down" step when a 4-port
> tulip card was under this work load:
> 
> while :
> do
> 	ifconfig eth24 up
> 	ifconfig eth25 up
> 	ifconfig eth26 up
> 	ifconfig eth27 up
> 	# Pound all four interfaces with ethtool
> 	for i in `seq 1000`
> 	do
> 		ethtool eth24 &>/dev/null
> 		ethtool eth25 &>/dev/null
> 		ethtool eth26 &>/dev/null
> 		ethtool eth27 &>/dev/null
> 	done
> 
> 	# Bring interfaces down
> 	echo "bringing eth24-27 down"
> 	ifconfig eth24 down
> 	ifconfig eth25 down
> 	ifconfig eth26 down
> 	ifconfig eth27 down
> 
> 	sleep 5
> done
> 
> 
> [ And yes, I know tulip doesn't support ethtool. Don't ask.
>   It's still a sore point at the moment. Just consider it 
>   a delay loop or use "sleep 5" instead. ]
> 
> The real "network load" comes from another box(en) running 4 instances
> of "ping -f -s 1450 192.168.x.y" where "x.y" is the subnet/IP of eth24-27.
> The parisc and ia64 machines will crash in minutes.
> 
> I believe the problem is a race condition between an interrupt coming
> in and the tulip_down() code path. Moving the "free_irq()" to before
> tulip_down() call fixes the problem. I've been able to run the above
> test for several hours now.

NAK.  This is a band-aid, and it creates new problems even as it tries 
to solve the original one.

Calling free_irq() while the chip is still active is a bad idea: the 
chip can still raise an interrupt that no handler will acknowledge, 
creating a screaming-interrupt situation.  Shared interrupts are a 
concrete example of how this won't work.

Perhaps cp_close() in 8139cp.c could be an example of a good ordering? 
It stops the chip, syncs irqs, frees irq, then frees [thus unmapping] 
the rings.

	Jeff






Thread overview: 30+ messages
2006-05-31 19:52 PATCH 2.6.17-rc5 tulip free_irq() called too late Grant Grundler
2006-06-08 14:43 ` Jeff Garzik [this message]
2006-06-08 15:22   ` Grant Grundler
2006-06-08 15:32     ` Grant Grundler
2006-06-08 15:38       ` Jeff Garzik
2006-06-08 15:47         ` Grant Grundler
2006-06-08 15:32     ` Jeff Garzik
2006-06-08 15:36       ` Grant Grundler
2006-06-08 17:01   ` Grant Grundler
2006-06-13 23:55     ` PATCHv3 " Grant Grundler
2006-06-14  0:06       ` Valerie Henson
2006-06-14  0:33       ` Jeff Garzik
2006-06-14  4:44         ` Grant Grundler
2006-06-14 13:05           ` Kyle McMartin
2006-06-14 14:54             ` Grant Grundler
2006-06-14 15:03           ` Jeff Garzik
2006-06-14 18:14             ` Grant Grundler
2006-06-14 19:51               ` Jeff Garzik
2006-06-14 22:25                 ` Grant Grundler
2006-06-14 20:47               ` Francois Romieu
2006-06-14 22:30                 ` Grant Grundler
2006-06-15 20:30                   ` Francois Romieu
2006-06-16  5:47                     ` Grant Grundler
2006-06-16  7:32                       ` Jeff Garzik
2006-06-16 15:25                         ` Grant Grundler
     [not found]                         ` <20060616152400.GA7868@colo.lackof.org>
     [not found]                           ` <4492CE98.50900@pobox.com>
2006-06-16 16:06                             ` Grant Grundler
2006-06-16 16:16                               ` Jeff Garzik
2006-06-22  0:43       ` Valerie Henson
2006-06-23  5:00         ` Grant Grundler
2006-06-26 22:31           ` [PATCH] Fix tulip shutdown DMA/irq race Valerie Henson
