From: ebiederm@xmission.com (Eric W. Biederman)
To: Gary Hade <garyhade@us.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>,
	mingo@elte.hu, mingo@redhat.com, tglx@linutronix.de,
	hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org,
	lcm@us.ibm.com
Subject: Re: [PATCH 3/3] [BUGFIX] x86/x86_64: fix IRQ migration triggered active device IRQ interrruption
Date: Wed, 29 Apr 2009 10:46:29 -0700
Message-ID: <m13abr2w0a.fsf@fess.ebiederm.org>
In-Reply-To: <20090429171719.GA7385@us.ibm.com> (Gary Hade's message of "Wed, 29 Apr 2009 10:17:19 -0700")

Gary Hade <garyhade@us.ibm.com> writes:

>> > This didn't help.  Using 2.6.30-rc3 plus your patch both bugs
>> > are unfortunately still present.
>> 
>> You could offline the cpus?  I know when I tested it on my
>> laptop I could not offline the cpus.
>
> Eric, I'm sorry!  This was due to my stupid mistake.  When I
> went to apply your patch I included --dry-run to test it but
> apparently got distracted and never actually ran patch(1)
> without --dry-run. <SIGH>
>
> So, I just rebuilt after _really_ applying the patch and got
> the following result which is probably what you intended.

Ok.  Good to see.

>> >> I propose detecting the cases that we know are safe to migrate in
>> >> process context, aka logical delivery with less than 8 cpus aka "flat"
>> >> routing mode and modifying the code so that those work in process
>> >> context and simply deny cpu hotplug in all of the rest of the cases.
>> >
>> > Humm, are you suggesting that CPU offlining/onlining would not
>> > be possible at all on systems with >8 logical CPUs (i.e. most
>> > of our systems) or would this just force users to separately
>> > migrate IRQ affinities away from a CPU (e.g. by shutting down
>> > the irqbalance daemon and writing to /proc/irq/<irq>/smp_affinity)
>> > before attempting to offline it?
>> 
>> A separate migration, for those hard to handle irqs.
>> 
>> The newest systems have iommus that irqs go through or are using MSIs
>> for the important irqs, and as such can be migrated in process
>> context.  So this is not a restriction for future systems.
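For readers unfamiliar with the separate migration step mentioned above, here
is a minimal userspace sketch of it in C. It assumes at most 32 CPUs, so the
mask fits the single hex word that /proc/irq/<irq>/smp_affinity accepts. The
helper names affinity_mask_without() and set_irq_affinity() are hypothetical
and not part of any kernel or library API. Writing the file needs root, and
the write can fail for irqs that cannot be moved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: bitmask of CPUs 0..ncpus-1 with `cpu` cleared.
 * Assumes ncpus <= 32 so the mask fits in one 32-bit word. */
uint32_t affinity_mask_without(unsigned int ncpus, unsigned int cpu)
{
    assert(ncpus >= 1 && ncpus <= 32 && cpu < ncpus);
    uint32_t all = (ncpus == 32) ? UINT32_MAX
                                 : ((UINT32_C(1) << ncpus) - 1);
    return all & ~(UINT32_C(1) << cpu);
}

/* Hypothetical helper: write a hex CPU mask to /proc/irq/<irq>/smp_affinity.
 * Returns 0 on success, -1 on failure (not root, no such irq, or the
 * kernel rejected the mask). */
int set_irq_affinity(unsigned int irq, uint32_t mask)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity", irq);

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%x\n", (unsigned int)mask) > 0;
    if (fclose(f) != 0)   /* the write error may only surface on close */
        ok = 0;
    return ok ? 0 : -1;
}
```

On a 4-cpu box, affinity_mask_without(4, 2) yields 0xb, i.e. cpus 0, 1 and 3.
The irqbalance daemon would need to be stopped first, as Gary notes, or it may
rewrite the mask.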
>
> I understand your concerns but we need a solution for the
> earlier systems that does NOT remove or cripple the existing
> CPU hotplug functionality.  If you can come up with a way to
> retain CPU hotplug function while doing all IRQ migration in
> interrupt context I would certainly be willing to try to find
> some time to help test and debug your changes on our systems.

Well that is ultimately what I am looking towards.

How do we move to a system that works by design, instead of
one with completely conflicting design goals?

Thinking about it, we should be able to preemptively migrate
irqs in the hook I am using that denies cpu hotplug.

If they don't migrate after a short while I expect we should
still fail but that would relieve some of the pain, and certainly
prevent a non-working system.
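The idea in the two paragraphs above can be sketched as a small userspace
model in C. This is not kernel code and not the hook from the patch under
discussion. struct irq_model, try_offline_cpu() and the poll-count notion of
"a short while" are all assumptions made for illustration. Unused irqs are
moved immediately, in line with the special casing mentioned next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one irq, for illustration only. */
struct irq_model {
    int cpu;           /* CPU this irq currently targets */
    bool in_use;       /* false: nobody uses it, safe to move at once */
    int polls_needed;  /* polls until a requested migration completes */
};

/*
 * Try to move every irq off `cpu` before it goes offline.
 * Returns true if the offline may proceed, false if it must be denied
 * because some irq did not migrate within max_polls extra polls.
 */
bool try_offline_cpu(struct irq_model *irqs, size_t n, int cpu,
                     int fallback_cpu, int max_polls)
{
    assert(fallback_cpu != cpu);

    for (int poll = 0; poll <= max_polls; poll++) {
        size_t remaining = 0;

        for (size_t i = 0; i < n; i++) {
            struct irq_model *irq = &irqs[i];

            if (irq->cpu != cpu)
                continue;               /* not our problem */
            if (!irq->in_use || irq->polls_needed == 0) {
                irq->cpu = fallback_cpu; /* migration completed */
                continue;
            }
            irq->polls_needed--;        /* still waiting on this irq */
            remaining++;
        }
        if (remaining == 0)
            return true;
    }
    return false;                       /* deny the hot-unplug */
}
```

In this model a denied request leaves the stuck irq where it was, so the
system keeps working and the caller can retry later.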

There are little bits we can tweak like special casing irqs that
no-one is using.

My preference here is that I would rather deny cpu hot-unplug than
have the non-working system problems that you have seen.

All of that said I have some questions about your hardware.
- How many sockets and how many cores per socket do you have?
- How many irqs do you have?
- Do you have an iommu that irqs can go through?

If you have <= 8 cores this problem is totally solvable.
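The 8-cpu limit comes from flat logical mode using an 8-bit destination field
as a bitmap with one bit per cpu. A minimal sketch in C, assuming cpu N is
assigned logical id bit N (the function name flat_logical_dest() is
hypothetical and is not the kernel's API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAT_MODE_MAX_CPUS 8  /* one bit per cpu in an 8-bit field */

/*
 * Build a flat-logical-mode destination bitmap for the given cpus.
 * Returns 0 and stores the bitmap in *dest on success, or -1 (leaving
 * *dest untouched) if any cpu cannot be addressed in flat mode.
 */
int flat_logical_dest(const unsigned int *cpus, size_t n, uint8_t *dest)
{
    uint8_t mask = 0;

    assert(dest != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cpus[i] >= FLAT_MODE_MAX_CPUS)
            return -1;                  /* needs another routing mode */
        mask |= (uint8_t)(1u << cpus[i]);
    }
    *dest = mask;
    return 0;
}
```

Targeting cpus 0 and 3 gives 0x09. Any cpu numbered 8 or higher is rejected,
which is the point where the other routing modes come in.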

Other cases may be, but I don't know what the tradeoffs are.
For very large systems we don't have enough irqs unless we
limit ourselves to running in physical flat mode, which makes
things even more of a challenge.

It may also be that your ioapics don't have the bugs that
Intel and AMD ioapics have, and we could have a way to recognize
high quality ioapics.

Eric
