From: Gary Hade <garyhade@us.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gary Hade <garyhade@us.ibm.com>,
Yinghai Lu <yhlu.kernel@gmail.com>,
mingo@elte.hu, mingo@redhat.com, tglx@linutronix.de,
hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org,
lcm@us.ibm.com
Subject: Re: [PATCH 3/3] [BUGFIX] x86/x86_64: fix IRQ migration triggered active device IRQ interrruption
Date: Thu, 30 Apr 2009 11:15:46 -0700
Message-ID: <20090430181546.GA7257@us.ibm.com>
In-Reply-To: <m13abr2w0a.fsf@fess.ebiederm.org>

On Wed, Apr 29, 2009 at 10:46:29AM -0700, Eric W. Biederman wrote:
> Gary Hade <garyhade@us.ibm.com> writes:
>
> >> > This didn't help. Using 2.6.30-rc3 plus your patch both bugs
> >> > are unfortunately still present.
> >>
> >> You could offline the cpus? I know when I tested it on my
> >> laptop I could not offline the cpus.
> >
> > Eric, I'm sorry! This was due to my stupid mistake. When I
> > went to apply your patch I included --dry-run to test it but
> > apparently got distracted and never actually ran patch(1)
> > without --dry-run. <SIGH>
> >
> > So, I just rebuilt after _really_ applying the patch and got
> > the following result, which appears to be what you intended.
>
> Ok. Good to see.
>
> >> >> I propose detecting the cases that we know are safe to migrate in
> >> >> process context, aka logical delivery with less than 8 cpus aka "flat"
> >> >> routing mode and modifying the code so that those work in process
> >> >> context and simply deny cpu hotplug in all of the rest of the cases.
> >> >
> >> > Humm, are you suggesting that CPU offlining/onlining would not
> >> > be possible at all on systems with >8 logical CPUs (i.e. most
> >> > of our systems) or would this just force users to separately
> >> > migrate IRQ affinities away from a CPU (e.g. by shutting down
> >> > the irqbalance daemon and writing to /proc/irq/<irq>/smp_affinity)
> >> > before attempting to offline it?
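For reference, the manual step mentioned above (moving an IRQ's affinity off a CPU before offlining it) comes down to clearing one bit in the hex mask that /proc/irq/<irq>/smp_affinity holds. A minimal sketch, assuming the usual comma-separated 32-bit hex group format; the helper names are made up for illustration, and the sketch only computes the new mask string, it does not write to /proc:

```python
def parse_mask(mask_str):
    """Parse an smp_affinity style hex mask such as 'ff' or '00000003,00000001'."""
    return int(mask_str.strip().replace(",", ""), 16)


def format_mask(value, ncpus):
    """Format a mask as comma-separated 32-bit hex groups, most significant first."""
    ngroups = max(1, (ncpus + 31) // 32)
    groups = [(value >> (32 * i)) & 0xFFFFFFFF for i in range(ngroups)]
    return ",".join("%08x" % g for g in reversed(groups))


def mask_without_cpu(mask_str, cpu, ncpus):
    """Return the mask string with the bit for `cpu` cleared.

    Raises ValueError if that would leave no CPU at all, since an empty
    affinity mask is not something an IRQ can be moved to.
    """
    new_value = parse_mask(mask_str) & ~(1 << cpu)
    if new_value == 0:
        raise ValueError("IRQ would have no CPU left to run on")
    return format_mask(new_value, ncpus)


if __name__ == "__main__":
    # Hypothetical 8-CPU box, IRQ currently allowed on all CPUs, offlining CPU 3.
    # The resulting string is what root would write to /proc/irq/<irq>/smp_affinity.
    print(mask_without_cpu("ff", 3, 8))
```

Whether the write then takes effect right away is a separate matter, and is the subject of the rest of this thread.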
> >>
> >> A separate migration, for those hard to handle irqs.
> >>
> >> The newest systems have iommus that irqs go through or are using MSIs
> >> for the important irqs, and as such can be migrated in process
> >> context. So this is not a restriction for future systems.
> >
> > I understand your concerns but we need a solution for the
> > earlier systems that does NOT remove or cripple the existing
> > CPU hotplug functionality. If you can come up with a way to
> > retain CPU hotplug function while doing all IRQ migration in
> > interrupt context I would certainly be willing to try to find
> > some time to help test and debug your changes on our systems.
>
> Well that is ultimately what I am looking towards.
>
> How do we move to a system that works by design, instead of
> one with design goals that are completely conflicting?
>
> Thinking about it, we should be able to preemptively migrate
> irqs in the hook I am using that denies cpu hotplug.
>
> If they don't migrate after a short while I expect we should
> still fail but that would relieve some of the pain, and certainly
> prevent a non-working system.
>
> There are little bits we can tweak like special casing irqs that
> no-one is using.
>
> My preference here is that I would rather deny cpu hot-unplug than
> have the non-working system problems that you have seen.
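To make the proposed policy concrete (ask the IRQs to move, wait a short while, and deny the unplug if any are still on the CPU), here is a rough user-space sketch in Python. It is only an illustration of the decision logic, not kernel code and not Eric's patch; every name in it is hypothetical, and the two callbacks stand in for whatever actually reads and changes IRQ affinity:

```python
import time


def irqs_targeting(cpu, affinities):
    """Return the sorted IRQ numbers whose affinity mask (an int) includes `cpu`."""
    return sorted(irq for irq, mask in affinities.items() if mask & (1 << cpu))


def may_offline(cpu, read_affinities, request_migration,
                timeout=1.0, poll=0.05,
                clock=time.monotonic, sleep=time.sleep):
    """Request migration of every IRQ on `cpu`, then wait up to `timeout` seconds.

    read_affinities() must return a dict {irq: mask}; request_migration(irq, cpu)
    asks for that IRQ to leave the CPU. Both are assumptions of this sketch.
    Returns (True, []) if the CPU ends up IRQ-free, else (False, stuck_irqs).
    """
    for irq in irqs_targeting(cpu, read_affinities()):
        request_migration(irq, cpu)
    deadline = clock() + timeout
    while True:
        stuck = irqs_targeting(cpu, read_affinities())
        if not stuck:
            return True, []
        if clock() >= deadline:
            # Deny the unplug rather than risk a non-working system.
            return False, stuck
        sleep(poll)
```

The point of returning the list of stuck IRQs is that a denial can then say which IRQs have to be dealt with by hand.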
>
> All of that said I have some questions about your hardware.
> - How many sockets and how many cores do you have?
The largest is the x3950 M2 with up to 16 sockets and
96 cores in currently supported configurations and I
expect that there could be at least double those numbers
in the future.
http://www-03.ibm.com/systems/x/hardware/enterprise/x3950m2/index.html
> - How many irqs do you have?
On the single node x3950 M2 that I have been using with
all of its 7 PCIe slots vacant I see:
[root@elm3c160 ~]# cat /proc/interrupts | wc -l
21
Up to 4 nodes are currently supported and I expect
that there could be at least double that number in
the future.
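
Note that wc -l counts every line of /proc/interrupts, including the CPU header row and the non-numbered summary rows (NMI, LOC, ERR and so on), so 21 is an upper bound on the number of numbered IRQs. A minimal Python sketch that counts only the numbered lines; the function name and the sample text are made up for illustration and are not output from elm3c160:

```python
import re


def count_numbered_irqs(text):
    """Count lines of /proc/interrupts style text whose first field is 'N:'."""
    return sum(1 for line in text.splitlines() if re.match(r"\s*\d+:", line))


# Hypothetical sample, used only when /proc/interrupts cannot be read.
SAMPLE = """\
           CPU0       CPU1
  0:        123          0   IO-APIC-edge      timer
  8:          1          0   IO-APIC-edge      rtc0
NMI:          0          0   Non-maskable interrupts
LOC:      45678      45670   Local timer interrupts
"""

if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(count_numbered_irqs(text))
```
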
> - Do you have an iommu that irqs can go through?
Only a subset of our systems (e.g. x460, x3850, x3950
w/Calgary iommu) have this.
>
> If you have <= 8 cores this problem is totally solvable.
Dreamer :-)
>
> Other cases may be but I don't know what the tradeoffs are.
> For very large systems we don't have enough irqs without
> running in physical flat mode, which makes things
> even more of a challenge.
>
> It may also be that your ioapics don't have the bugs that
> intel and amd ioapics have and we could have a way to recognize
> high quality ioapics.
I believe all our System x boxes have Intel and AMD ioapics.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc