From: ebiederm@xmission.com (Eric W. Biederman)
To: Gary Hade <garyhade@us.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>,
mingo@elte.hu, mingo@redhat.com, tglx@linutronix.de,
hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org,
lcm@us.ibm.com
Subject: Re: [PATCH 3/3] [BUGFIX] x86/x86_64: fix IRQ migration triggered active device IRQ interrruption
Date: Wed, 29 Apr 2009 10:46:29 -0700 [thread overview]
Message-ID: <m13abr2w0a.fsf@fess.ebiederm.org> (raw)
In-Reply-To: <20090429171719.GA7385@us.ibm.com> (Gary Hade's message of "Wed, 29 Apr 2009 10:17:19 -0700")
Gary Hade <garyhade@us.ibm.com> writes:
>> > This didn't help. Using 2.6.30-rc3 plus your patch both bugs
>> > are unfortunately still present.
>>
>> You could offline the cpus? I know when I tested it on my
>> laptop I could not offline the cpus.
>
> Eric, I'm sorry! This was due to my stupid mistake. When I
> went to apply your patch I included --dry-run to test it but
> apparently got distracted and never actually ran patch(1)
> without --dry-run. <SIGH>
>
> So, I just rebuilt after _really_ applying the patch and got
> the following result, which is probably what you intended.
Ok. Good to see.
>> >> I propose detecting the cases that we know are safe to migrate in
>> >> process context, aka logical delivery with less than 8 cpus aka "flat"
>> >> routing mode and modifying the code so that those work in process
>> >> context and simply deny cpu hotplug in all of the rest of the cases.
>> >
>> > Humm, are you suggesting that CPU offlining/onlining would not
>> > be possible at all on systems with >8 logical CPUs (i.e. most
>> > of our systems) or would this just force users to separately
>> > migrate IRQ affinities away from a CPU (e.g. by shutting down
>> > the irqbalance daemon and writing to /proc/irq/<irq>/smp_affinity)
>> > before attempting to offline it?
>>
>> A separate migration, for those hard to handle irqs.
>>
>> The newest systems have iommus that irqs go through or are using MSIs
>> for the important irqs, and as such can be migrated in process
>> context. So this is not a restriction for future systems.
>
> I understand your concerns but we need a solution for the
> earlier systems that does NOT remove or cripple the existing
> CPU hotplug functionality. If you can come up with a way to
> retain CPU hotplug function while doing all IRQ migration in
> interrupt context I would certainly be willing to try to find
> some time to help test and debug your changes on our systems.
Well that is ultimately what I am looking towards.
How do we move to a system that works by design, instead of
one whose design goals are completely conflicting?
Thinking about it, we should be able to preemptively migrate
irqs in the hook I am using that denies cpu hotplug.
If they don't migrate after a short while I expect we should
still fail, but that would relieve some of the pain and certainly
prevent a non-working system.
There are little bits we can tweak, like special casing irqs that
no one is using.
My preference here is that I would rather deny cpu hot-unplug than
have the non-working-system problems that you have seen.
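For completeness, the manual workaround Gary mentions (moving affinities away via /proc before offlining the cpu) looks roughly like the following. The cpu number, cpu count, and irq are hypothetical, and the actual /proc and /sys writes need root on a real machine, so they are shown as comments:

```shell
CPU=2          # cpu we want to offline (hypothetical)
NCPUS=4        # online cpu count (hypothetical)

# Build a hex affinity mask covering every cpu except $CPU.
ALL=$(( (1 << NCPUS) - 1 ))
MASK=$(printf '%x' $(( ALL & ~(1 << CPU) )) )
echo "$MASK"   # for CPU=2, NCPUS=4 this is "b" (cpus 0,1,3)

# On a live system (as root), for each irq currently targeting $CPU:
#   echo $MASK > /proc/irq/<irq>/smp_affinity
# then offline the cpu:
#   echo 0 > /sys/devices/system/cpu/cpu$CPU/online
```

Stopping irqbalance first, as Gary notes, keeps the daemon from moving the affinities back while the cpu is being taken down.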
All of that said I have some questions about your hardware.
- How many sockets and how many cores do you have?
- How many irqs do you have?
- Do you have an iommu that irqs can go through?
If you have <= 8 cores this problem is totally solvable.
Other cases may be but I don't know what the tradeoffs are.
For very large systems we don't have enough irqs without
running in physical flat mode, which makes things
even more of a challenge.
It may also be that your ioapics don't have the bugs that
intel and amd ioapics have and we could have a way to recognize
high quality ioapics.
Eric
Thread overview: 17+ messages
2009-04-08 21:07 [PATCH 3/3] [BUGFIX] x86/x86_64: fix IRQ migration triggered active device IRQ interrruption Gary Hade
2009-04-08 22:03 ` Yinghai Lu
2009-04-08 23:08 ` Gary Hade
2009-04-11 6:46 ` Yinghai Lu
2009-04-13 19:37 ` Gary Hade
2009-04-13 20:17 ` Eric W. Biederman
2009-04-28 0:05 ` Gary Hade
2009-04-28 10:27 ` Eric W. Biederman
2009-04-29 0:44 ` Gary Hade
2009-04-29 1:44 ` Eric W. Biederman
2009-04-29 17:17 ` Gary Hade
2009-04-29 17:46 ` Eric W. Biederman [this message]
2009-04-30 18:15 ` Gary Hade
2009-04-30 21:17 ` Gary Hade
2009-05-24 0:24 ` Yinghai Lu
2009-04-10 21:39 ` Gary Hade
2009-04-11 7:35 ` Yinghai Lu