qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: chrisw@redhat.com, aik@ozlabs.ru, rusty@rustcorp.com.au,
	agraf@suse.de, qemu-devel@nongnu.org, B08248@freescale.com,
	iommu@lists.linux-foundation.org, joerg.roedel@amd.com,
	scottwood@freescale.com, dwmw2@infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [Qemu-devel] [RFC] Device isolation infrastructure v2
Date: Fri, 16 Dec 2011 12:40:26 +1100	[thread overview]
Message-ID: <20111216014026.GA4921@truffala.fritz.box> (raw)
In-Reply-To: <1323972307.2437.19.camel@x201.home>

On Thu, Dec 15, 2011 at 11:05:07AM -0700, Alex Williamson wrote:
> On Thu, 2011-12-15 at 17:25 +1100, David Gibson wrote:
> > Here's the second spin of my preferred approach to handling grouping
> > of devices for safe assignment to guests.
> > 
> > Changes since v1:
> >  * Many name changes and file moves for improved consistency
> >  * Bugfixes and cleanups
> >  * The interface to the next layer up is considerably fleshed out,
> >    although it still needs work.
> >  * Example initialization of groups for p5ioc2 and p7ioc.
> > 
> > TODO:
> >  * Need sample initialization of groups for intel and/or amd iommus
> 
> I think this very well might imposed significant bloat for those
> implementations.  On POWER you typically don't have singleton groups,
> while it's the norm on x86.  I don't know that either intel or amd iommu
> code have existing structures that they can simply tack the group
> pointer to.

Actually, I think they can probably just use the group pointer in the
struct device.  Each PCI function will typically allocate a new group
and put the pointer in the struct device and no-where else.  Devices
hidden under bridges copy the pointer from the bridge parent instead.
I will have to check the unplug path to ensure we can manage the group
lifetime properly, of course.

>  Again, this is one of the reasons that I think the current
> vfio implementation is the right starting point.  We keep groups within
> vfio, imposing zero overhead for systems not making use of it and only
> require iommu drivers to implement a trivial function to opt-in.  As we
> start to make groups more pervasive in the dma layer, independent of
> userspace driver exposure, we can offload pieces to the core.  Starting
> with it in the core and hand waving some future use that we don't plan
> to implement right now seems like the wrong direction.

Well, I think we must agree to disagree here; I think treating groups
as identifiable objects is worthwhile.  That said, I am looking for
ways to whittle down the overhead when they're not in use.

> >  * Use of sysfs attributes to control group permission is probably a
> >    mistake.  Although it seems a bit odd, registering a chardev for
> >    each group is probably better, because perms can be set from udev
> >    rules, just like everything else.
> 
> I agree, this is a horrible mistake.  Reinventing file permissions via
> sysfs is bound to be broken and doesn't account for selinux permissions.
> Again, I know you don't like aspects of the vfio group management, but
> it gets this right imho.

Yeah.  I came up with this because I was trying to avoid registering a
device whose only purpose was to act as a permissioned "handle" on the
group.  But it is a better approach, despite that.  I just wanted to
send out the new patches for comment without waiting to do that
rework.

> >  * Need more details of what the binder structure will need to
> >    contain.
> >  * Handle complete removal of groups.
> >  * Clarify what will need to happen on the hot unplug path.
> 
> We're still also removing devices from the driver model, this means
> drivers like vfio need to re-implement a lot of the driver core for
> driving each individual device in the group,

Ah, so, that's a bit I've yet to flesh out.  The subtle distinction is
that we prevent driver _matching_ not driver _binding.  It's
intentionally still possible to explicitly bind drivers to devices in
the group, by bypassing the automatic match mechanism.

I'm intending that when a group is bound, the binder struct will
(optionally) specify a driver (or several, for different bus types)
which will be "force bound" to all the devices in the group.

> and as you indicate, it's
> unclear what happens in the hotplug path

It's clear enough in concept.  I just have to work out exactly where
we need to call d_i_dev_remove(), and whether we'll need some sort of
callback to the bridge / iommu code.

> and I wonder if things like
> suspend/resume will also require non-standard support.  I really prefer
> attaching individual bus drivers to devices using the standard
> bind/unbind mechanisms.  I have a hard time seeing how this is an
> improvement from vfio.  Thanks,
> 
> Alex
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

  parent reply	other threads:[~2011-12-16  1:40 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-15  6:25 [Qemu-devel] [RFC] Device isolation infrastructure v2 David Gibson
2011-12-15  6:25 ` [Qemu-devel] [PATCH 1/3] device_isolation: Infrastructure for managing device isolation groups David Gibson
2011-12-15  6:25 ` [Qemu-devel] [PATCH 2/3] device_isolation: Support isolation on POWER p5ioc2 bridges David Gibson
2011-12-15  6:25 ` [Qemu-devel] [PATCH 3/3] device_isolation: Support isolation on POWER p7ioc (IODA) bridges David Gibson
2011-12-15 18:05 ` [Qemu-devel] [RFC] Device isolation infrastructure v2 Alex Williamson
2011-12-15 22:39   ` Alex Williamson
2011-12-16  1:40   ` David Gibson [this message]
2011-12-16  4:49     ` Alex Williamson
2011-12-16  6:00       ` David Gibson
2011-12-16 14:53   ` Joerg Roedel
2011-12-19  0:11     ` David Gibson
2011-12-19 15:41       ` Joerg Roedel
2011-12-21  3:32         ` David Gibson
2011-12-21  4:30           ` Alex Williamson
2011-12-21  6:12             ` Aaron Fabbri
2012-01-25  3:13             ` David Gibson
2012-01-25 23:44               ` Alex Williamson
2012-01-30 23:22                 ` David Gibson
2011-12-21 16:46           ` Joerg Roedel
2011-12-19 15:46       ` David Woodhouse
2011-12-19 22:31         ` David Gibson
2011-12-19 22:56           ` David Woodhouse
2011-12-20  0:25             ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111216014026.GA4921@truffala.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=B08248@freescale.com \
    --cc=agraf@suse.de \
    --cc=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=chrisw@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joerg.roedel@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rusty@rustcorp.com.au \
    --cc=scottwood@freescale.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).