public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Scott Wood <scottwood@freescale.com>
To: Paul Mackerras <paulus@samba.org>
Cc: Alexander Graf <agraf@suse.de>, <kvm@vger.kernel.org>,
	<kvm-ppc@vger.kernel.org>
Subject: Re: [PATCH 3/9] KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller
Date: Fri, 15 Feb 2013 21:57:06 -0600	[thread overview]
Message-ID: <1360987026.6960.15@snotra> (raw)
In-Reply-To: <20130216025614.GB21288@iris.ozlabs.ibm.com> (from paulus@samba.org on Fri Feb 15 20:56:14 2013)

On 02/15/2013 08:56:14 PM, Paul Mackerras wrote:
> I have no particular objection to the device control API per se, but
> I have two objections to using it as the primary interface to the XICS
> emulation.
> 
> First, I dislike the magical side-effect where creating a device of a
> particular type (e.g. MPIC or XICS) automatically attaches it to the
> interrupt lines of the vcpus.  I prefer an explicit request to do
> in-kernel interrupt control.

OK.  This is device-specific behavior, so you could define it  
differently for XICS than MPIC.  I suppose we could change it for MPIC  
as well, to leave an opening for the unlikely case where we'd want to  
model an MPIC that isn't directly connected to the CPUs.

How is the explicit request made in this patchset?

> Secondly, it means that we are completely abandoning any attempt to
> define an abstract or generic interface to in-kernel interrupt
> controller emulations.  Each device will have its own unique set of
> attribute groups and its own unique userspace code to drive it, with
> no commonality between them.

Yes.  I am unconvinced that such an abstraction is well-advised  
(especially after seeing existing "generic" interfaces that are clearly  
APIC-oriented).  This isn't like normal driver interfaces where we're  
abstracting away hardware differences to let generic code use a  
device.  Userspace knows what kind of device it wants, and how it wants  
it to integrate with the rest of the emulated system.  We'd have to go  
out of our way to apply the abstraction on *both* ends.  What do we get  
from that other than a chance that the abstraction leaks?  What  
significant code actually becomes common?  kvm_set_irq() is just a thin  
wrapper around the ioctl.

> > >We have live migration working in qemu for
> > >pSeries guests with in-kernel XICS emulation using this interface.
> > >If you're not doing live migration,
> >
> > We don't yet, but would prefer not to assume that it'll never  
> happen.
> >
> > >> for interrupt injection, what if there's a race with the user
> > >changing
> > >> other flags via MMIO?  Maybe this isn't an issue with XICS, but
> > >this is
> > >> being presented as a generic API.
> > >
> > >They're not used while the guest is running, as I said, but even if
> > >they were, there is appropriate locking in there to handle any  
> races.
> >
> > OK, KVM_IRQ_LINE is still used for interrupt injection.  I was
> > hoping to avoid going through a standardized interface that forces a
> > global interrupt numberspace.
> 
> Why?

The standardized interface doesn't make things any easier (as noted  
above, the caller is already mpic-specific code), and we'd have to come  
up with a scheme for flattening our interrupt numberspace (rather than  
introduce new attribute groups for things like IPI and timer  
interrupts).  It may still be necessary when it comes to irqfd,  
though...

> > How do MSIs get injected?
> 
> Just like other interrupts - from the point of view of the interrupt
> controller they're edge-triggered interrupt sources.

Ah right, I guess this is all set up via hcalls for XICS.

With MPIC exposing its registers via the device control api, everything  
just works -- the PCI device generates a write to the MPIC's memory  
region, the QEMU MPIC stub sends the write to the kernel as for any  
other MMIO access (this passthrough is also useful for debugging), the  
in-kernel MPIC sees the write to the "generate an MSI" register and  
does its thing.  Compare that to all special the MSI code for APIC...  
:-)

> > BTW, do you have any plans regarding irqfd?
> 
> I'm going to look at that next.

Likewise...  We should probably coordinate our efforts so that at least  
the de-APICization of the code only has to get done once.

> > What about interrupt controllers that allow multiple destinations?
> 
> The destination can be an identifier for a group of vcpus, or even a
> bitmap -- that's why I made it 32 bits.

So you can have single delivery, or be limited to 32 vcpus, or have to  
implement some destination ID allocation scheme (which is more state  
that needs to be accessible somehow).

> > More than 256 priorities?  Different "levels" of output (normal,
> > critical, machine check)?  Programmable vector numbers?  Active
> > high/low control?
> 
> There are plenty of bits free in the 64 bits per source that I have
> allowed.  We can accommodate those things.

MPIC vector numbers take up 16 of the bits.  The architected interrupt  
level field is 8 bits, though only a handful of values are actually  
needed.  Add a couple binary flags, and it gets pretty tight if a third  
type of interrupt controller starts wanting something new.

>  (BTW, I think having more
> than 256 priorities would be insane - do you know of any actual
> example that does?)

No, but hardware designers have been known to do insane things.

> > The per-vcpu state isn't even part of this AFAICT.  It's an
> > XICS-specific ONE_REG -- which is fine, but all that's left of the
> > "generic" API is the get/set sources which is an imperfect match to
> > our per-IRQ state and it's not clear how an implementation should
> > extend it.
> 
> Yes, the names of the bitfields in the ICP state word are
> XICS-specific, but the concepts are pretty generic - current processor
> priority, identifier for interrupt awaiting service, pending IPI
> request priority, pending interrupt request priority.

We don't have separate concepts of "pending IPI request priority" and  
"pending interrupt request priority".  There can be multiple interrupts  
awaiting service (or even in service, if different priorities).  We  
have both "current task priority" (which is a user-set mask-by-priority  
register) and the priority of the highest-prio in-service interrupt --  
which would "current processor priorty" be?  Etc.

-Scott

  reply	other threads:[~2013-02-16  3:57 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-14 23:59 [PATCH 0/9] In-kernel XICS interrupt controller emulation Paul Mackerras
2013-02-14 23:59 ` [PATCH 1/9] KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls Paul Mackerras
2013-03-21  8:52   ` Alexander Graf
2013-04-04  5:37     ` Paul Mackerras
2013-04-04  9:49       ` Alexander Graf
2013-04-04 22:38         ` Paul Mackerras
2013-04-19 15:16           ` Alexander Graf
2013-02-15  0:00 ` [PATCH 2/9] KVM: PPC: Remove unused argument to kvmppc_core_dequeue_external Paul Mackerras
2013-03-21  8:58   ` Alexander Graf
2013-02-15  0:01 ` [PATCH 3/9] KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller Paul Mackerras
2013-02-15 20:05   ` Scott Wood
2013-02-15 23:18     ` Paul Mackerras
2013-02-15 23:59       ` Scott Wood
2013-02-16  2:56         ` Paul Mackerras
2013-02-16  3:57           ` Scott Wood [this message]
2013-02-16  4:51             ` Paul Mackerras
2013-02-18 22:43               ` Scott Wood
2013-02-20  0:41                 ` Paul Mackerras
2013-02-20  1:01                   ` Scott Wood
2013-02-20 19:58           ` Marcelo Tosatti
2013-02-21  0:20             ` Scott Wood
2013-02-21  1:09               ` Marcelo Tosatti
2013-02-21  1:45                 ` Scott Wood
2013-02-24 14:08               ` Gleb Natapov
2013-02-25  0:59             ` Paul Mackerras
2013-03-21  9:20               ` Alexander Graf
2013-02-15  0:01 ` [PATCH 4/9] KVM: PPC: Book3S: Generalize interfaces to interrupt controller emulation Paul Mackerras
2013-02-15  0:02 ` [PATCH 5/9] KVM: PPC: Book3S: Facilities to save/restore XICS presentation ctrler state Paul Mackerras
2013-02-15  0:02 ` [PATCH 6/9] KVM: PPC: Book3S HV: Speed up wakeups of CPUs on HV KVM Paul Mackerras
2013-02-15  0:03 ` [PATCH 7/9] KVM: PPC: Book3S HV: Add support for real mode ICP in XICS emulation Paul Mackerras
2013-02-15  0:03 ` [PATCH 8/9] KVM: PPC: Book3S HV: Improve real-mode handling of external interrupts Paul Mackerras
2013-02-15  0:04 ` [PATCH 9/9] KVM: PPC: Book3S: Add support for ibm,int-on/off RTAS calls Paul Mackerras

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360987026.6960.15@snotra \
    --to=scottwood@freescale.com \
    --cc=agraf@suse.de \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox