qemu-devel.nongnu.org archive mirror
From: Alex Williamson <alex.williamson@redhat.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: aafabbri <aafabbri@cisco.com>,
	Alexey Kardashevskiy <aik@au1.ibm.com>,
	kvm@vger.kernel.org, Paul Mackerras <pmac@au1.ibm.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	David Gibson <dwg@au1.ibm.com>, chrisw <chrisw@sous-sol.org>,
	iommu <iommu@lists.linux-foundation.org>,
	Avi Kivity <avi@redhat.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	benve@cisco.com
Subject: Re: [Qemu-devel] kvm PCI assignment & VFIO ramblings
Date: Wed, 24 Aug 2011 08:47:46 -0600
Message-ID: <1314197268.2859.177.camel@bling.home>
In-Reply-To: <1314143508.30478.72.camel@pasglop>

On Wed, 2011-08-24 at 09:51 +1000, Benjamin Herrenschmidt wrote:
> > > For us the most simple and logical approach (which is also what pHyp
> > > uses and what Linux handles well) is really to expose a given PCI host
> > > bridge per group to the guest. Believe it or not, it makes things
> > > easier :-)
> > 
> > I'm all for easier.  Why does exposing the bridge use fewer bus numbers
> > than emulating a bridge?
> 
> Because a host bridge doesn't look like a PCI to PCI bridge at all for
> us. It's an entirely separate domain with its own bus number space
> (unlike most x86 setups).

Ok, I missed the "host" bridge.
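
To make the bus-number point concrete, here is a minimal sketch of how a
per-host-bridge domain changes addressing. It is not QEMU or kernel code;
PciAddress and parse_pci_address are hypothetical names, and the
"[dddd:]bb:dd.f" hex notation is assumed from the usual Linux style of
writing PCI addresses.

```python
from typing import NamedTuple


class PciAddress(NamedTuple):
    domain: int    # PCI domain; one per host bridge in this model
    bus: int       # 0-255, numbered independently inside each domain
    device: int    # 0-31
    function: int  # 0-7


def parse_pci_address(text: str) -> PciAddress:
    """Parse '[dddd:]bb:dd.f' (hex fields); a missing domain means domain 0."""
    parts = text.split(":")
    if len(parts) == 2:
        domain_s, bus_s, devfn = "0", parts[0], parts[1]
    elif len(parts) == 3:
        domain_s, bus_s, devfn = parts
    else:
        raise ValueError(f"malformed PCI address: {text!r}")
    dev_s, sep, fn_s = devfn.partition(".")
    if not sep:
        raise ValueError(f"missing function number: {text!r}")
    addr = PciAddress(int(domain_s, 16), int(bus_s, 16),
                      int(dev_s, 16), int(fn_s, 16))
    in_range = (0 <= addr.domain <= 0xFFFF and 0 <= addr.bus <= 0xFF
                and 0 <= addr.device <= 0x1F and 0 <= addr.function <= 0x7)
    if not in_range:
        raise ValueError(f"field out of range: {text!r}")
    return addr


# Same bus/device/function behind two host bridges: distinct devices,
# and neither consumes a bus number in the other's domain.
a = parse_pci_address("0000:00:19.0")
b = parse_pci_address("0001:00:19.0")
assert a[1:] == b[1:] and a != b
```

With a PCI-to-PCI bridge, by contrast, everything below the bridge shares the
parent's single 0-255 bus number space.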

> In fact we have some problems afaik in qemu today with the concept of
> PCI domains, for example, I think qemu has assumptions about a single
> shared IO space domain which isn't true for us (each PCI host bridge
> provides a distinct IO space domain starting at 0). We'll have to fix
> that, but it's not a huge deal.

Yep, I've seen similar on ia64 systems.
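
A toy model of the difference, purely as a sketch: IoSpace and HostBridge are
hypothetical classes (nothing here is taken from QEMU), and the 64K size is
only an assumption for the example. Each host bridge owns an IO space that
starts at 0, so the same port number can be claimed once per bridge, while a
single shared space would reject the second claim.

```python
class IoSpace:
    """One IO port space; port addresses start at 0."""

    def __init__(self, size=0x10000):
        self.size = size
        self.regions = []  # list of (start, end_exclusive, owner)

    def map(self, start, length, owner):
        end = start + length
        if start < 0 or length <= 0 or end > self.size:
            raise ValueError("region outside IO space")
        for s, e, other in self.regions:
            if start < e and s < end:
                raise ValueError(f"IO range overlaps {other}")
        self.regions.append((start, end, owner))

    def lookup(self, port):
        """Return the owner of the region containing port, or None."""
        for s, e, owner in self.regions:
            if s <= port < e:
                return owner
        return None


class HostBridge:
    """A host bridge with its own, private IO space."""

    def __init__(self, name):
        self.name = name
        self.io = IoSpace()


# Two bridges, each with a device at IO port 0x1000: no conflict.
phb0, phb1 = HostBridge("phb0"), HostBridge("phb1")
phb0.io.map(0x1000, 0x100, "devA")
phb1.io.map(0x1000, 0x100, "devB")
assert phb0.io.lookup(0x1000) == "devA"
assert phb1.io.lookup(0x1000) == "devB"
```

Mapping both devices into one shared IoSpace at 0x1000 raises ValueError,
which is roughly the single-domain assumption being described.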

> So for each "group" we'd expose in the guest an entirely separate PCI
> domain space with its own IO, MMIO etc... spaces, handed off from a
> single device-tree "host bridge" which doesn't itself appear in the
> config space, doesn't need any emulation of any config space etc...
> 
> > On x86, I want to maintain that our default assignment is at the device
> > level.  A user should be able to pick single or multiple devices from
> > across several groups and have them all show up as individual,
> > hotpluggable devices on bus 0 in the guest.  Not surprisingly, we've
> > also seen cases where users try to attach a bridge to the guest,
> > assuming they'll get all the devices below the bridge, so I'd be in
> > favor of making this "just work" if possible too, though we may have to
> > prevent hotplug of those.
> >
> > Given the device requirement on x86 and since everything is a PCI device
> > on x86, I'd like to keep a qemu command line something like -device
> > vfio,host=00:19.0.  I assume that some of the iommu properties, such as
> > dma window size/address, will be query-able through an architecture
> > specific (or general if possible) ioctl on the vfio group fd.  I hope
> > that will help the specification, but I don't fully understand what all
> > remains.  Thanks,
> 
> Well, for iommu there's a couple of different issues here but yes,
> basically on one side we'll have some kind of ioctl to know what segment
> of the device(s) DMA address space is assigned to the group and we'll
> need to represent that to the guest via a device-tree property in some
> kind of "parent" node of all the devices in that group.
> 
> We -might- be able to implement some kind of hotplug of individual
> devices of a group under such a PHB (PCI Host Bridge), I don't know for
> sure yet, some of that PAPR stuff is pretty arcane, but basically, for
> all intents and purposes, we really want a group to be represented as a
> PHB in the guest.
> 
> We cannot arbitrarily have individual devices of separate groups be
> represented in the guest as siblings on a single simulated PCI bus.

I think the vfio kernel layer we're describing easily supports both.
This is just a matter of adding qemu-vfio code to expose different
topologies based on group iommu capabilities and mapping mode.  Thanks,
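
To sketch what "both" could mean on the qemu side: the function below is
hypothetical (build_guest_topology is not an existing qemu-vfio interface) and
heavily simplified, giving each host device its own guest slot and ignoring
multifunction devices. It lays out the same groups either as individual
devices on bus 0 of one guest domain, or as one guest domain (host bridge) per
group.

```python
def build_guest_topology(groups, per_group_phb):
    """Map host devices to guest (domain, bus, slot) tuples.

    groups: dict of group id -> list of host device address strings.
    per_group_phb: True  -> one guest PCI domain per group;
                   False -> all devices as siblings on bus 0 of domain 0.
    """
    layout = {}
    if per_group_phb:
        for domain, gid in enumerate(sorted(groups)):
            for slot, host in enumerate(groups[gid]):
                if slot >= 32:
                    raise ValueError("more than 32 devices on one bus")
                layout[host] = (domain, 0, slot)
    else:
        slot = 0
        for gid in sorted(groups):
            for host in groups[gid]:
                if slot >= 32:
                    raise ValueError("more than 32 devices on one bus")
                layout[host] = (0, 0, slot)
                slot += 1
    return layout


# Example host addresses are made up for illustration.
groups = {1: ["00:19.0"], 2: ["01:00.0", "02:00.0"]}

flat = build_guest_topology(groups, per_group_phb=False)
assert flat == {"00:19.0": (0, 0, 0), "01:00.0": (0, 0, 1), "02:00.0": (0, 0, 2)}

by_phb = build_guest_topology(groups, per_group_phb=True)
assert by_phb == {"00:19.0": (0, 0, 0), "01:00.0": (1, 0, 0), "02:00.0": (1, 0, 1)}
```

Which mode applies would depend on the group's iommu capabilities and mapping
mode; how those are reported is left open here.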

Alex
