From: Ingo Molnar <mingo@elte.hu>
To: Chuck Ebbert <cebbert@redhat.com>
Cc: "Marcin Ślusarz" <marcin.slusarz@gmail.com>,
"Jarek Poplawski" <jarkao2@o2.pl>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Linus Torvalds" <torvalds@linux-foundation.org>,
"Jean-Baptiste Vignaud" <vignaud@xandmail.fr>,
linux-kernel <linux-kernel@vger.kernel.org>,
shemminger <shemminger@linux-foundation.org>,
linux-net <linux-net@vger.kernel.org>,
netdev <netdev@vger.kernel.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Alan Cox" <alan@lxorguk.ukuu.org.uk>
Subject: Re: 2.6.20->2.6.21 - networking dies after random time
Date: Mon, 6 Aug 2007 21:08:12 +0200
Message-ID: <20070806190812.GD26868@elte.hu>
In-Reply-To: <46B75DD4.5080709@redhat.com>
* Chuck Ebbert <cebbert@redhat.com> wrote:
> Before, they would print:
>
> eth0: transmit timed out, tx_status 00 status e601.
> diagnostics: net 0ccc media 8880 dma 0000003a fifo 0000
> eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
> Flags; bus-master 1, dirty 295757(13) current 295757(13)
> Transmit list 00000000 vs. f7150a20.
> 0: @f7150200 length 80000070 status 0c010070
> 1: @f71502a0 length 80000070 status 0c010070
> 2: @f7150340 length 8000005c status 0c01005c
>
> Now they just work, apparently...
could you please try the patch below? If this doesn't do the trick then i
guess we need to revert that change.
Ingo
------------>
(take 2)
Subject: genirq: fix simple and fasteoi irq handlers
After the "genirq: do not mask interrupts by default" patch, interrupts
should be masked not immediately upon a disable request, but only after
they actually happen. But handle_simple_irq() and handle_fasteoi_irq()
can skip this masking one or more times if the irq is currently being
serviced (IRQ_INPROGRESS), possibly disrupting a driver's work.
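The flow this patch aims for can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the kernel code:
the struct, the MODEL_* flag values and the model_irq_*() helpers are all
hypothetical names invented here, and locking, eoi and the real irq_chip
callbacks are left out.

```c
#include <assert.h>

/* Hypothetical flag values; they only mirror the names used in the patch. */
enum {
	MODEL_INPROGRESS = 1 << 0,
	MODEL_DISABLED   = 1 << 1,
	MODEL_PENDING    = 1 << 2,
};

struct irq_model {
	unsigned int status;
	int masked;	/* 1 while the line is masked at the simulated chip */
	int handled;	/* number of completed handler runs */
};

/*
 * Entry half of the patched flow: if the irq is already being serviced
 * or is disabled, mark it pending and mask it instead of just leaving.
 * Returns 1 if the caller should run the action, 0 otherwise.
 */
int model_irq_enter(struct irq_model *d)
{
	if (d->status & (MODEL_INPROGRESS | MODEL_DISABLED)) {
		d->status |= MODEL_PENDING;
		d->masked = 1;
		return 0;
	}
	d->status |= MODEL_INPROGRESS;
	d->status &= ~MODEL_PENDING;
	return 1;
}

/*
 * Exit half of the patched flow: clear IRQ_INPROGRESS and unmask again,
 * unless the irq was disabled in the meantime.
 */
void model_irq_exit(struct irq_model *d)
{
	d->handled++;
	d->status &= ~MODEL_INPROGRESS;
	if (!(d->status & MODEL_DISABLED))
		d->masked = 0;
}

/* Small self-check: a second irq arriving mid-service gets masked. */
void model_selfcheck(void)
{
	struct irq_model d = { 0, 0, 0 };

	assert(model_irq_enter(&d) == 1);	/* first irq runs the action */
	assert(model_irq_enter(&d) == 0);	/* second one is masked */
	assert(d.masked == 1);
	model_irq_exit(&d);			/* service ends, line unmasked */
	assert(d.masked == 0);
}
```

In this model the second arrival does not run the action; it only leaves
the line masked until model_irq_exit() unmasks it, which is the behaviour
the two "+" hunks below add to both handlers.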
Finding the main cause of the problems here, pinpointing the broken patch
and writing the first patch that can fix this were all done by Marcin
Slusarz. Additional test patches by Thomas Gleixner and Ingo Molnar, tested
by Marcin Slusarz, helped to narrow down the possible causes even more.
Thanks.
PS: this patch fixes only one evident error here, but there could be
more places affected by the above-mentioned change in irq handling.
PS 2:
After rethinking, IMHO, these are the two most probable scenarios here:
1. After a hw resend there could be a conflict between the retriggered
edge type irq and the next level type one: e.g. if this level type
irq (io_apic is enabled then) is triggered while the retriggered irq is
being serviced (IRQ_INPROGRESS), there is a goto out with eoi, and
probably the next such level irqs are triggered in a loop, so there is
probably a kind of flood in io_apic until servicing of the retriggered
edge irq has ended.
2. There is something wrong with ioapic_retrigger_irq (less probable,
because this would probably be seen with 'normal' edge retriggers too,
but on the other hand, they could be less common).
So, if it is #1, this fixed patch should work.
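Scenario #1 can be illustrated with a toy model. It is only a sketch of the
hypothesis, assuming a level type line that stays asserted for the whole
time a handler is running; it does not reproduce real io_apic behaviour,
and count_redeliveries() and its parameters are hypothetical names made up
for this illustration.

```c
/*
 * Count how many times an asserted level type irq is delivered again
 * while a handler is still running for `ticks` steps.
 *
 * mask_on_busy == 0: old path, "goto out" with eoi only, line unmasked.
 * mask_on_busy == 1: patched path, line is masked before the eoi.
 */
int count_redeliveries(int ticks, int mask_on_busy)
{
	int asserted = 1;	/* device keeps the level line active */
	int masked = 0;
	int deliveries = 0;
	int t;

	for (t = 0; t < ticks; t++) {
		if (asserted && !masked) {
			deliveries++;		/* irq fires again mid-service */
			if (mask_on_busy)
				masked = 1;	/* patched: stop the loop here */
		}
	}
	return deliveries;
}
```

With ticks = 5 the unmasked variant returns 5 and the masked variant
returns 1, which is the difference between the suspected flood and a
single deferred irq.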
But, since level types don't need these retriggers too much, I think
this "don't mask interrupts by default" idea should be rethought:
is there enough gain to risk such hard-to-diagnose errors?
So, IMHO, there should be at least a possibility to turn this off for
level types in config (it should be a visible option, so people could
find & try this before writing for help or changing a network card).
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
---
diff -Nurp 2.6.23-rc1-/kernel/irq/chip.c 2.6.23-rc1/kernel/irq/chip.c
--- 2.6.23-rc1-/kernel/irq/chip.c 2007-07-09 01:32:17.000000000 +0200
+++ 2.6.23-rc1/kernel/irq/chip.c 2007-08-05 21:49:46.000000000 +0200
@@ -295,12 +295,11 @@ handle_simple_irq(unsigned int irq, stru
 	spin_lock(&desc->lock);
-	if (unlikely(desc->status & IRQ_INPROGRESS))
-		goto out_unlock;
 	kstat_cpu(cpu).irqs[irq]++;
 	action = desc->action;
-	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
+	if (unlikely(!action || (desc->status & (IRQ_INPROGRESS |
+						 IRQ_DISABLED)))) {
 		if (desc->chip->mask)
 			desc->chip->mask(irq);
 		desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
@@ -318,6 +317,8 @@ handle_simple_irq(unsigned int irq, stru
 	spin_lock(&desc->lock);
 	desc->status &= ~IRQ_INPROGRESS;
+	if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
+		desc->chip->unmask(irq);
 out_unlock:
 	spin_unlock(&desc->lock);
 }
@@ -392,18 +393,16 @@ handle_fasteoi_irq(unsigned int irq, str
 	spin_lock(&desc->lock);
-	if (unlikely(desc->status & IRQ_INPROGRESS))
-		goto out;
-
 	desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
 	kstat_cpu(cpu).irqs[irq]++;
 	/*
-	 * If its disabled or no action available
+	 * If it's running, disabled or no action available
 	 * then mask it and get out of here:
 	 */
 	action = desc->action;
-	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
+	if (unlikely(!action || (desc->status & (IRQ_INPROGRESS |
+						 IRQ_DISABLED)))) {
 		desc->status |= IRQ_PENDING;
 		if (desc->chip->mask)
 			desc->chip->mask(irq);
@@ -420,6 +419,8 @@ handle_fasteoi_irq(unsigned int irq, str
 	spin_lock(&desc->lock);
 	desc->status &= ~IRQ_INPROGRESS;
+	if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
+		desc->chip->unmask(irq);
 out:
 	desc->chip->eoi(irq);