From: Thomas Gleixner <tglx@linutronix.de>
To: Andi Kleen <andi@firstfloor.org>
Cc: x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
jesse.brandeburg@intel.com,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] Prevent nested interrupts when the IRQ stack is near overflowing v2
Date: Thu, 25 Mar 2010 14:17:59 +0100 (CET) [thread overview]
Message-ID: <alpine.LFD.2.00.1003251338130.3147@localhost.localdomain> (raw)
In-Reply-To: <20100325121141.GO20695@one.firstfloor.org>
On Thu, 25 Mar 2010, Andi Kleen wrote:
> On Thu, Mar 25, 2010 at 12:09:10PM +0100, Thomas Gleixner wrote:
> > On Thu, 25 Mar 2010, Andi Kleen wrote:
> > > On Thu, Mar 25, 2010 at 02:46:42AM +0100, Thomas Gleixner wrote:
> > > > 3) Why does the NIC driver code not set IRQF_DISABLED in the first
> > > > place? AFAICT the network drivers just kick off NAPI, so whats the
> > > > point to run those handlers with IRQs enabled at all ?
> > >
> > > I think the idea was to minimize irq latency for other interrupts
> >
> > So what's the point ? Is the irq handler of that card so long running,
> > that it causes trouble ?
>
> I believe it's more like that you have a lot of them (even with interrupt
> mitigation and NAPI polling) and they have to scale up the work to handle
> high bandwidths, so it ends up with a lot of time in them anyways,
> ending with skew in the timers and similar issues.
Timekeeping is an absolute non issue here. The time keeping code can
cope with pretty large delays and there is nothing which
> Anyways my goal here was simply to have a least intrusive fix, not
> change semantics for everyone.
>
> Yes it could work out this. Or it could not. I'm not sure, so I chose
> the safe option.
Oh it's safe because it solves the problem on that particular machine
and workload?
No, it's not safe at all.
- it does not prevent a driver reenabling interrupts
- it will cause real large latencies when the interrupt which
happens to trigger that check is one of the long running
ones. Are you going to deal with the "oh we see >500us" latencies
bugreports ?
- worst case it'll brick your machine if the interrupt which
happens to trigger that check relies on irq enabled semantics
(e.g. missing jiffies update). Yes, such a driver is broken by
design, but we have such crap lurking. So much for not changing
semantics.
> > So what's the point of running a well written (short) interrupt
> > handler with interrupts enabled ? Nothing at all. It just makes us
> > deal with crap like stacks overflowing for no good reason.
>
> Ok so you're just proposing to always set IRQF_DISABLED?
For any sane written driver, yes. We have crap irq handlers which need
to be fixed, so we can't enforce it right away. See above.
> I don't know what problems it would cause. My fear is that any latency problems
> would be answered with a "move to RT, moron" from your mouth though.
Oh well.
> > > Anyways if such a thing was done it would be a long term project
> > > and that short term fix would be still needed.
> >
> > Your patch is not a fix, It's a lousy, horrible and unreliable
> > workaround. It's not fixing the root cause of the problem at hand.
>
> It fixes the bug in a minimally intrusive way.
It papers over the problem. We already know that the NIC driver floods
the machine with interrupts, so why are you insisting that we need to
bandaid that problem ?
The minimal intrusive way is a one liner in that very driver code and
if it causes problems for that very driver then we don't fix them with
adding a callback in the generic interrupt code path.
The message which we would send out with applying that band aid would
be simply: Go ahead driver writers and let your handlers run as long
as they want, we'll safe you in 99.9% of the cases and we'll happily
go and debug the 0.1% of completely undebuggable shit which will
result out of that.
No thanks,
tglx
next prev parent reply other threads:[~2010-03-25 13:18 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-24 19:02 [PATCH] Prevent nested interrupts when the IRQ stack is near overflowing v2 Andi Kleen
2010-03-24 20:42 ` Thomas Gleixner
2010-03-24 23:08 ` Thomas Gleixner
2010-03-25 0:36 ` Andi Kleen
2010-03-25 1:46 ` Thomas Gleixner
2010-03-25 9:37 ` Andi Kleen
2010-03-25 11:09 ` Thomas Gleixner
2010-03-25 12:11 ` Andi Kleen
2010-03-25 13:17 ` Thomas Gleixner [this message]
2010-03-25 13:32 ` Andi Kleen
2010-03-25 14:16 ` Thomas Gleixner
2010-03-25 15:38 ` Andi Kleen
2010-03-25 16:06 ` Alan Cox
2010-03-25 16:13 ` Linus Torvalds
2010-03-25 16:17 ` Linus Torvalds
2010-03-25 16:27 ` Ingo Molnar
2010-03-25 16:33 ` Ingo Molnar
2010-03-25 18:27 ` Andi Kleen
2010-03-26 4:55 ` Eric W. Biederman
2010-03-25 16:52 ` Thomas Gleixner
2010-03-25 17:47 ` Peter Zijlstra
2010-03-25 18:01 ` Linus Torvalds
2010-03-25 18:21 ` Peter Zijlstra
2010-03-25 18:23 ` Peter Zijlstra
2010-03-25 18:44 ` Andi Kleen
2010-03-25 19:01 ` Ingo Molnar
2010-03-25 18:29 ` Ingo Molnar
2010-03-25 19:10 ` Linus Torvalds
2010-03-25 19:42 ` David Miller
2010-03-25 20:40 ` Linus Torvalds
2010-03-26 3:33 ` David Miller
2010-03-25 20:51 ` Ingo Molnar
2010-03-25 20:53 ` Linus Torvalds
2010-03-26 6:10 ` Andi Kleen
2010-03-25 10:50 ` Alan Cox
2010-03-25 11:16 ` Thomas Gleixner
2010-03-25 11:59 ` Alan Cox
2010-03-25 12:00 ` Andi Kleen
2010-03-25 11:57 ` Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.1003251338130.3147@localhost.localdomain \
--to=tglx@linutronix.de \
--cc=andi@firstfloor.org \
--cc=jesse.brandeburg@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox