netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Marcin Ślusarz" <marcin.slusarz@gmail.com>
To: "Ingo Molnar" <mingo@elte.hu>
Cc: "Jarek Poplawski" <jarkao2@o2.pl>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Jean-Baptiste Vignaud" <vignaud@xandmail.fr>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	shemminger <shemminger@linux-foundation.org>,
	linux-net <linux-net@vger.kernel.org>,
	netdev <netdev@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Alan Cox" <alan@lxorguk.ukuu.org.uk>
Subject: Re: 2.6.20->2.6.21 - networking dies after random time
Date: Tue, 7 Aug 2007 09:46:36 +0200	[thread overview]
Message-ID: <4bacf17f0708070046o14403089v8376a4544f72fec3@mail.gmail.com> (raw)
In-Reply-To: <20070806070300.GA4509@elte.hu>

2007/8/6, Ingo Molnar <mingo@elte.hu>:
> (..)
> please try Jarek's second patch too - there was a missing unmask.
>
>         Ingo
>
> -------------->
> Subject: genirq: fix simple and fasteoi irq handlers
> From: Jarek Poplawski <jarkao2@o2.pl>
>
> After the "genirq: do not mask interrupts by default" patch interrupts
> should be disabled not immediately upon request, but after they happen.
> But, handle_simple_irq() and handle_fasteoi_irq() can skip this once or
> more if an irq is just serviced (IRQ_INPROGRESS), possibly disrupting a
> driver's work.
>
> The main reason of problems here, pointing the broken patch and making
> the first patch which can fix this was done by Marcin Slusarz.
> Additional test patches of Thomas Gleixner and Ingo Molnar tested by
> Marcin Slusarz helped to narrow possible reasons even more. Thanks.
>
> PS: this patch fixes only one evident error here, but there could be
> more places affected by above-mentioned change in irq handling.
>
> PS 2:
> After rethinking, IMHO, there are two most probable scenarios here:
>
> 1. After hw resend there could be a conflict between retriggered
> edge type irq and the next level type one: e.g. if this level type
> irq (io_apic is enabled then) is triggered while retriggered irq is
> serviced (IRQ_INPROGRESS) there is goto out with eoi, and probably
> the next such levels are triggered and looping, so probably kind of
> flood in io_apic until this retriggered edge service has ended.
> 2. There is something wrong with ioapic_retrigger_irq (less probable
> because this should be probably seen with 'normal' edge retriggers,
> but on the other hand, they could be less common).
>
> So, if there is #1, this fixed patch should work.
>
> But, since level types don't need this retriggers too much I think
> this "don't mask interrupts by default" idea should be rethinked:
> is there enough gain to risk such hard to diagnose errors?
>
> So, IMHO, there should be at least possibility to turn this off for
> level types in config (it should be a visible option, so people could
> find & try this before writing for help or changing a network card).
>
>
> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
>
> ---
>
> diff -Nurp 2.6.23-rc1-/kernel/irq/chip.c 2.6.23-rc1/kernel/irq/chip.c
> --- 2.6.23-rc1-/kernel/irq/chip.c       2007-07-09 01:32:17.000000000 +0200
> +++ 2.6.23-rc1/kernel/irq/chip.c        2007-08-05 21:49:46.000000000 +0200
> @@ -295,12 +295,11 @@ handle_simple_irq(unsigned int irq, stru
>
>         spin_lock(&desc->lock);
>
> -       if (unlikely(desc->status & IRQ_INPROGRESS))
> -               goto out_unlock;
>         kstat_cpu(cpu).irqs[irq]++;
>
>         action = desc->action;
> -       if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> +       if (unlikely(!action || (desc->status & (IRQ_INPROGRESS |
> +                                                IRQ_DISABLED)))) {
>                 if (desc->chip->mask)
>                         desc->chip->mask(irq);
>                 desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
> @@ -318,6 +317,8 @@ handle_simple_irq(unsigned int irq, stru
>
>         spin_lock(&desc->lock);
>         desc->status &= ~IRQ_INPROGRESS;
> +       if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
> +               desc->chip->unmask(irq);
>  out_unlock:
>         spin_unlock(&desc->lock);
>  }
> @@ -392,18 +393,16 @@ handle_fasteoi_irq(unsigned int irq, str
>
>         spin_lock(&desc->lock);
>
> -       if (unlikely(desc->status & IRQ_INPROGRESS))
> -               goto out;
> -
>         desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
>         kstat_cpu(cpu).irqs[irq]++;
>
>         /*
> -        * If its disabled or no action available
> +        * If it's running, disabled or no action available
>          * then mask it and get out of here:
>          */
>         action = desc->action;
> -       if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> +       if (unlikely(!action || (desc->status & (IRQ_INPROGRESS |
> +                                                IRQ_DISABLED)))) {
>                 desc->status |= IRQ_PENDING;
>                 if (desc->chip->mask)
>                         desc->chip->mask(irq);
> @@ -420,6 +419,8 @@ handle_fasteoi_irq(unsigned int irq, str
>
>         spin_lock(&desc->lock);
>         desc->status &= ~IRQ_INPROGRESS;
> +       if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
> +               desc->chip->unmask(irq);
>  out:
>         desc->chip->eoi(irq);
>
>
Network card still locks up (tested on 2.6.22.1). I had to upload more
data than usual (~350 MB vs ~1-100 MB) to trigger that bug but it
might be a coincidence...

Marcin

  parent reply	other threads:[~2007-08-07  7:46 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-29  8:50 2.6.20->2.6.21 - networking dies after random time Jean-Baptiste Vignaud
2007-06-29 15:07 ` Jarek Poplawski
2007-07-23  5:44   ` Marcin Ślusarz
2007-07-23  8:53     ` Jarek Poplawski
2007-07-24  7:18     ` Jarek Poplawski
2007-07-24  8:05     ` Ingo Molnar
2007-07-24  9:42       ` Ingo Molnar
2007-07-24 19:30         ` Linus Torvalds
2007-07-24 20:04           ` Ingo Molnar
2007-07-25  0:19             ` Thomas Gleixner
2007-07-25  7:23               ` Jarek Poplawski
2007-07-25 13:57               ` Jarek Poplawski
2007-07-25 14:46                 ` Alan Cox
2007-07-26 12:44                   ` [PATCH][netdrvr] lib8390: comment on locking by Alan Cox " Jarek Poplawski
2007-07-26 12:47                     ` Alan Cox
2007-07-30 19:47                     ` Jeff Garzik
2007-07-30  8:46                   ` Ingo Molnar
2007-07-30 13:05                     ` Alan Cox
2007-07-26  7:16               ` Marcin Ślusarz
2007-07-26  8:13                 ` Jarek Poplawski
2007-07-26  8:10                   ` Thomas Gleixner
2007-07-26  8:31                     ` Ingo Molnar
2007-07-26  8:55                       ` Jarek Poplawski
2007-07-26  9:12                         ` Ingo Molnar
2007-07-30  7:29                           ` Marcin Ślusarz
2007-07-30  8:49                             ` Ingo Molnar
2007-08-01  7:24                               ` Marcin Ślusarz
2007-08-01  7:27                                 ` Ingo Molnar
2007-08-06  6:58                                   ` Marcin Ślusarz
2007-07-31 13:20                             ` Jarek Poplawski
2007-08-06  7:00                               ` Marcin Ślusarz
2007-08-06  7:03                                 ` Ingo Molnar
2007-08-06 17:43                                   ` Chuck Ebbert
2007-08-06 19:08                                     ` Ingo Molnar
2007-08-09 14:50                                       ` [RFC] " Jarek Poplawski
     [not found]                                         ` <p738x8kg0dp.fsf@bingen.suse.de>
2007-08-09 15:30                                           ` Jarek Poplawski
2007-08-07 10:09                                     ` Jarek Poplawski
2007-08-07  7:46                                   ` Marcin Ślusarz [this message]
2007-08-07  8:23                                     ` Jarek Poplawski
     [not found]                                       ` <4bacf17f0708070237w19d184b3p7f74b53612edb9a6@mail.gmail.com>
2007-08-07  9:52                                         ` Jarek Poplawski
2007-08-07 12:13                                           ` Jarek Poplawski
2007-08-07 12:55                                             ` Jarek Poplawski
2007-08-08 11:11                                               ` Marcin Ślusarz
2007-08-08 11:09                                             ` Marcin Ślusarz
2007-08-08 11:42                                               ` Jarek Poplawski
2007-08-08 11:53                                                 ` Jarek Poplawski
2007-08-09  9:19                                                 ` [patch (testing)] " Jarek Poplawski
     [not found]                                                   ` <4bacf17f0708092333n17e0ba19jf2c769531610868d@mail.gmail.com>
2007-08-10  7:10                                                     ` Jarek Poplawski
2007-08-10 10:43                                                       ` Marcin Ślusarz
2007-08-10 11:37                                                         ` Jarek Poplawski
2007-07-31 15:58                             ` [patch] genirq: temporary fix for level-triggered IRQ resend Ingo Molnar
2007-07-31 16:00                               ` Ingo Molnar
2007-08-08 11:00                                 ` Jarek Poplawski
2007-08-02 17:03                               ` Gabriel C
2007-08-02 20:11                                 ` Ingo Molnar
2007-08-03  6:07                                   ` [patch] genirq: fix simple and fasteoi irq handlers Jarek Poplawski
2007-08-03  8:04                                     ` Ingo Molnar
2007-08-03  8:46                                       ` Ingo Molnar
2007-08-03  9:10                                       ` Jarek Poplawski
2007-08-03 11:57                                         ` Marcin Ślusarz
2007-08-03 12:26                                           ` Jarek Poplawski
2007-08-06  7:05                                     ` Marcin Ślusarz
2007-08-06  6:07                                   ` [patch (take 2)] " Jarek Poplawski
2007-08-06  6:14                                     ` Ingo Molnar
2007-08-06  7:07                                       ` Marcin Ślusarz
2007-08-06  7:19                                       ` Jarek Poplawski
2007-07-26  9:11                     ` 2.6.20->2.6.21 - networking dies after random time Jarek Poplawski
2007-07-26  8:19                   ` Jarek Poplawski
2007-07-26  8:16                 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2007-08-08  8:59 Jean-Baptiste Vignaud
2007-08-08  9:30 ` Jarek Poplawski
2007-08-08 12:16 ` Jarek Poplawski
2007-08-07 17:16 Jean-Baptiste Vignaud
2007-08-08  7:21 ` Jarek Poplawski
2007-08-08  7:36   ` Jarek Poplawski
2007-08-07  9:21 Jean-Baptiste Vignaud
2007-08-07  9:44 ` Jarek Poplawski
2007-08-07  8:10 Jean-Baptiste Vignaud
2007-08-07  9:05 ` Jarek Poplawski
2007-08-06 20:42 Jean-Baptiste Vignaud
2007-08-06 21:19 ` Chuck Ebbert
2007-08-07  7:26   ` Jarek Poplawski
2007-08-06 21:30 ` Al Boldi
2007-08-06 19:36 Jean-Baptiste Vignaud
2007-06-26 14:24 Jean-Baptiste Vignaud
2007-06-27 10:17 ` Jarek Poplawski
     [not found] <4bacf17f0706161435g1bb7c08bpd427901f64d57fa@mail.gmail.com>
2007-06-18 11:08 ` Jarek Poplawski
2007-06-18 15:10   ` Stephen Hemminger
2007-06-19  5:27     ` Jarek Poplawski
2007-06-19  5:50     ` Jarek Poplawski
2007-06-22  8:56       ` Marcin Ślusarz
2007-06-22 13:32         ` Jarek Poplawski
     [not found]           ` <4bacf17f0706252310w155fc4d7v1bf12319a650559a@mail.gmail.com>
2007-06-26  8:08             ` Jarek Poplawski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4bacf17f0708070046o14403089v8376a4544f72fec3@mail.gmail.com \
    --to=marcin.slusarz@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=jarkao2@o2.pl \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-net@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=shemminger@linux-foundation.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vignaud@xandmail.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).