From: Gary Hade <garyhade@us.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gary Hade <garyhade@us.ibm.com>,
mingo@elte.hu, mingo@redhat.com, tglx@linutronix.de,
hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org,
lcm@us.ibm.com
Subject: Re: [PATCH 2/3] [BUGFIX] x86/x86_64: fix CPU offlining triggered inactive device IRQ interrruption
Date: Mon, 13 Apr 2009 14:09:13 -0700
Message-ID: <20090413210913.GC8393@us.ibm.com>
In-Reply-To: <m1ljq5r7lw.fsf@fess.ebiederm.org>

On Sun, Apr 12, 2009 at 12:32:11PM -0700, Eric W. Biederman wrote:
> Gary Hade <garyhade@us.ibm.com> writes:
>
> > Impact: Eliminates a race that can leave the system in an
> > unusable state
> >
> > During rapid offlining of multiple CPUs there is a chance
> > that an IRQ affinity move destination CPU will be offlined
> > before the IRQ affinity move initiated during the offlining
> > of a previous CPU completes. This can happen when the device
> > is not very active and thus fails to generate the IRQ that is
> > needed to complete the IRQ affinity move before the move
> > destination CPU is offlined. When this happens there is an
> > -EBUSY return from __assign_irq_vector() during the offlining
> > of the IRQ move destination CPU which prevents initiation of
> > a new IRQ affinity move operation to an online CPU. This
> > leaves the IRQ affinity set to an offlined CPU.
> >
> > I have been able to reproduce the problem on some of our
> > systems using the following script. When the system is idle
> > the problem often reproduces during the first CPU offlining
> > sequence.
>
> Ok. I have had a chance to think through what your patches
> are doing and it is assuming the broken logic in cpu_down is correct
> and patching over some but not all of the problems.
>
> First the problem is not migrating irqs when IRR is set.
When the device is very active, a printk in __target_IO_APIC_irq()
immediately prior to
    io_apic_modify(apic, 0x10 + pin*2, reg);
intermittently displays 'reg' values indicating that the
Remote IRR bit is set.

With PATCH 3/3 the same printk displays no 'reg' values
indicating that the Remote IRR bit is set _and_ the IRQ
interruption problem disappears.

This is what led me to very strongly believe that the
problem was caused by writing the I/O redirection table
register while the Remote IRR bit was set.
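
For reference, a minimal user-space sketch of the check that such a
printk implies: testing the Remote IRR bit in the low 32 bits of an
I/O redirection table entry. The bit positions are assumed to follow
the 82093AA I/O APIC layout (Remote IRR = bit 14). The macro and
helper names are illustrative only and are not the kernel's.

```c
#include <stdint.h>

/* Low dword of an I/O redirection table entry (assumed 82093AA layout). */
#define RTE_REMOTE_IRR  (1u << 14)  /* level IRQ accepted, EOI still pending */
#define RTE_TRIGGER_LVL (1u << 15)  /* 1 = level-triggered, 0 = edge */
#define RTE_MASKED      (1u << 16)  /* 1 = interrupt masked */

/* Hypothetical helper: returns 1 if Remote IRR is set in 'reg', else 0. */
static inline int rte_remote_irr_set(uint32_t reg)
{
	return (reg & RTE_REMOTE_IRR) != 0;
}

/*
 * Register index of the low dword for 'pin', matching the
 * 0x10 + pin*2 expression in the io_apic_modify() call above.
 */
static inline unsigned int rte_low_index(unsigned int pin)
{
	return 0x10 + pin * 2;
}
```

With these helpers, a 'reg' value such as 0x0000c031 (level-triggered,
vector 0x31) would be reported as having Remote IRR set, while
0x00008031 would not.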
> The general
> problem is that the state machines in most ioapics are fragile and
> can get confused if you reprogram them at any point when an irq can
> come in.
IRQs are masked [from fixup_irqs() when offlining a CPU, from
ack_apic_level() when not offlining a CPU] during the reprogramming.
Does this not help avoid the issue? Sorry if this is a naive
question.
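
To make the question concrete, here is a toy model of a
mask / rewrite / unmask sequence on one redirection entry. It is a
sketch under stated assumptions, not the fixup_irqs() or
ack_apic_level() code: the register is a plain variable rather than
the real index/data window, the bit layout is assumed to be the
82093AA one, and all names are hypothetical.

```c
#include <stdint.h>

#define RTE_VECTOR_MASK 0xffu       /* bits 0-7: vector */
#define RTE_REMOTE_IRR  (1u << 14)  /* assumed 82093AA layout */
#define RTE_MASKED      (1u << 16)

/* Simulated low dword of a single redirection entry. */
static uint32_t fake_rte_low;

static void fake_rte_write(uint32_t v) { fake_rte_low = v; }
static uint32_t fake_rte_read(void)    { return fake_rte_low; }

/*
 * Mask the entry, change its vector, then restore the previous
 * mask state. Every other bit, including Remote IRR, is written
 * back unchanged.
 */
static void retarget_masked(uint8_t new_vector)
{
	uint32_t reg = fake_rte_read();
	int was_masked = (reg & RTE_MASKED) != 0;

	fake_rte_write(reg | RTE_MASKED);          /* block new deliveries */

	reg = fake_rte_read();
	reg = (reg & ~RTE_VECTOR_MASK) | new_vector;
	fake_rte_write(reg);                       /* rewrite while masked */

	if (!was_masked)
		fake_rte_write(reg & ~RTE_MASKED); /* unmask again */
}
```

In this model masking only blocks new deliveries. A Remote IRR bit
that was already set before the mask is still set when the entry is
rewritten, which is the situation the printk above was catching.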
> In the middle of an interrupt handler is the one time we
> know interrupts can not come in.
>
> To really fix this problem we need to do two things.
> 1) Track when irqs that can not be migrated from process context are
> on a cpu, and deny cpu hot-unplug.
> 2) Modify every interrupt that can be safely migrated in interrupt context
> to migrate irqs in interrupt context so no one encounters this problem
> in practice.
>
> We can update MSIs and do a pci read to know when the update has made it
> to a device. Multi MSI is a disaster but I won't go there.
>
> In lowest priority delivery mode when the irq is not changing domain but
> just changing the set of possible cpus the interrupt can be delivered to.
>
> And then of course all of the fun iommus that remap irqs.
Sounds non-trivial.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc