From: Alex Williamson <alex.williamson@redhat.com>
To: David Gibson <dwg@au1.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	joerg.roedel@amd.com, dwmw2@infradead.org,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	chrisw@redhat.com, agraf@suse.de, scottwood@freescale.com,
	B08248@freescale.com
Subject: Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
Date: Wed, 30 Nov 2011 23:48:30 -0700
Message-ID: <1322722110.26545.68.camel@bling.home>
In-Reply-To: <20111201000337.GA5427@truffala.fritz.box>

On Thu, 2011-12-01 at 11:03 +1100, David Gibson wrote:
> On Tue, Nov 29, 2011 at 10:25:51PM -0700, Alex Williamson wrote:
> > On Wed, 2011-11-30 at 15:51 +1100, Benjamin Herrenschmidt wrote:
> > > On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:
> > > 
> > > > > +static ssize_t show_iommu_group(struct device *dev,
> > > > > +				struct device_attribute *attr, char *buf)
> > > > > +{
> > > > > +	unsigned int groupid;
> > > > > +
> > > > > +	if (iommu_device_group(dev, &groupid))
> > > > > +		return 0;
> > > > > +
> > > > > +	return sprintf(buf, "%u", groupid);
> > > > > +}
> > > > > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> > > > 
> > > > Hrm.  Assuming the group ID is an unsigned int seems dangerous to me.
> > > > More seriously, we really want these to be unique across the whole
> > > > system, but they're allocated by the iommu driver which can't
> > > > guarantee that if it's not the only one present.  Seems to me it would
> > > > be safer to have an actual iommu_group structure allocated for each
> > > > group, and use the pointer to it as the ID to hand around (with NULL
> > > > meaning "no iommu" / untranslated).  The structure could contain a
> > > > more human readable - or more relevant to platform documentation - ID
> > > > where appropriate.
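
Something like this is what I read David as suggesting (purely
illustrative; none of it exists in the patch):

struct iommu_group {
	struct kobject kobj;	/* sysfs object; the pointer itself is
				 * the system-wide unique handle */
	void *iommu_data;	/* iommu driver private data */
	const char *name;	/* human/platform-readable ID, where the
				 * platform documentation provides one */
};
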
> > 
> > Note that iommu drivers are registered per bus_type, so the unique pair
> > is {bus_type, groupid}, which seems sufficient for vfio.
> 
> Hrm.  That's... far from obvious.  And still breaks down if we have two
> separate iommus on the same bus type (e.g. two independent PCI host
> bridges with inbuilt IOMMUs).

See bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops).  It
wasn't long ago that we had a global iommu_ops.  Are heterogeneous
iommus using different drivers on the same bus_type really a possibility
on your system, or are we dealing with hypothetical situations?  You'd
currently need to multiplex iommu_ops callbacks between the drivers
yourself.
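
For reference, registration today is one iommu_ops per bus_type, e.g.
from the VT-d driver's init path (existing symbols, shown only to
illustrate the single hook point):

	bus_set_iommu(&pci_bus_type, &intel_iommu_ops);

A second iommu driver wanting the same bus_type has nowhere to hook
in, which is why you'd be multiplexing callbacks yourself.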

> > > Don't forget that to keep sanity, we really want to expose the groups
> > > via sysfs (per-group dir with symlinks to the devices).
> > > 
> > > I'm working with Alexey on providing an in-kernel powerpc-specific API
> > > to expose the PE stuff to whatever's going to interface to VFIO to
> > > create the groups, though we can eventually collapse that. The idea is
> > > that on non-PE capable bridges (old style), I would make a single group
> > > per host bridge.
> > 
> > If your non-PE capable bridges aren't actually providing isolation,
> > they should return -ENODEV from the device_group() callback, and vfio
> > will then ignore them.
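
That is, something like this in the power iommu driver (sketch only;
the pe_* helpers are made up, the signature is the one this patch
adds):

static int power_iommu_device_group(struct device *dev,
				    unsigned int *groupid)
{
	/* Old-style bridge, no PE isolation: opt out entirely so
	 * callers such as vfio skip this device. */
	if (!pe_capable(dev))
		return -ENODEV;

	*groupid = pe_number(dev);
	return 0;
}
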
> > 
> > > In addition, Alex, I noticed that you still have the domain stuff there,
> > > which is fine I suppose, we could make it a requirement on power that
> > > you only put a single group in a domain... but the API is still to put
> > > individual devices in a domain, not groups, and that somewhat sucks.
> > > 
> > > You could "fix" that by having some kind of ->domain_enable() or
> > > whatever that's used to "activate" the domain and verifies that it
> > > contains entire groups but that looks like a pointless way to complicate
> > > both the API and the implementation.
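
(For concreteness, the shape being described, and dismissed, would be
roughly the below.  Everything here is hypothetical, including the
assumption that a domain tracks its attached devices:)

static int domain_enable(struct iommu_domain *domain)
{
	struct device *dev;

	/* Refuse to activate unless every attached device's group is
	 * fully contained in the domain. */
	list_for_each_entry(dev, &domain->devices, domain_node)
		if (!domain_contains_group_of(domain, dev))
			return -EBUSY;

	return 0;
}
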
> > 
> > Right, groups are currently just a way to identify dependent sets, not a
> > unit of work.
> 
> I'm not quite sure what you mean by "unit of work".  Assigning groups
> as a unit generally makes more sense to me than assigning devices
> individually and only being able to use them once the group is
> complete, particularly when things are hotplugged into groups.

iommu_attach_device() takes a device as its "unit of work", not a group.
If a group is in use by a userspace driver and a new physical device
hotplugged into the host falls into that group, we need device-level
granularity to add it to the existing domain.
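
Concretely, a host-side hotplug path could look like this (sketch;
find_assigned_domain() is an invented vfio-side lookup keyed by the
{bus_type, groupid} pair from above, the two iommu_* calls are the
real API):

static int group_hotplug_notify(struct device *dev)
{
	unsigned int groupid;
	struct iommu_domain *domain;

	if (iommu_device_group(dev, &groupid))
		return 0;	/* no isolation info for this device */

	domain = find_assigned_domain(dev->bus, groupid);
	if (!domain)
		return 0;	/* group not in use by a userspace driver */

	/* device-level granularity is exactly what's needed here */
	return iommu_attach_device(domain, dev);
}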

I'm not sure if we're getting into VM usage with "assigning" terminology
above.  You're free to architect qemu however you want on POWER to make
groups be the assignable unit to a guest.  On x86, an individual device
is the assignable unit for a guest.  Devices in the group that aren't
assigned will still need to be held by vfio; they'll just sit unused.
Thanks,

Alex



Thread overview: 37+ messages
2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
2011-10-21 19:56 ` [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry Alex Williamson
2011-11-30  2:42   ` David Gibson
2011-11-30  4:51     ` Benjamin Herrenschmidt
2011-11-30  5:25       ` Alex Williamson
2011-11-30  9:23         ` Benjamin Herrenschmidt
2011-12-01  0:06           ` David Gibson
2011-12-01  6:20           ` Alex Williamson
2011-12-01  0:03         ` David Gibson
2011-12-01  0:52           ` Chris Wright
2011-12-01  0:57             ` David Gibson
2011-12-01  1:04               ` Chris Wright
2011-12-01  1:50                 ` Benjamin Herrenschmidt
2011-12-01  2:00                   ` David Gibson
2011-12-01  2:05                   ` Chris Wright
2011-12-01  7:28                     ` Alex Williamson
2011-12-01 14:02                     ` Yoder Stuart-B08248
2011-12-01  6:48           ` Alex Williamson [this message]
2011-12-01 10:33             ` David Woodhouse
2011-12-01 14:34               ` Alex Williamson
2011-12-01 21:46                 ` Benjamin Herrenschmidt
2011-12-01 22:37                   ` Alex Williamson
2011-12-01 23:14                 ` David Woodhouse
2011-12-07  6:20                   ` Benjamin Herrenschmidt
2011-12-01 21:32             ` Benjamin Herrenschmidt
2011-10-21 19:56 ` [PATCH 2/4] intel-iommu: Implement iommu_device_group Alex Williamson
2011-11-08 17:23   ` Roedel, Joerg
2011-11-10 15:22     ` David Woodhouse
2011-10-21 19:56 ` [PATCH 3/4] amd-iommu: " Alex Williamson
2011-10-21 19:56 ` [PATCH 4/4] iommu: Add option to group multi-function devices Alex Williamson
2011-12-01  0:11   ` David Gibson
2011-10-21 20:34 ` [PATCH 0/4] iommu: iommu_ops group interface Woodhouse, David
2011-10-21 21:16   ` Alex Williamson
2011-10-21 22:39     ` Woodhouse, David
2011-10-21 22:34   ` Alex Williamson
2011-10-27 16:31 ` Alex Williamson
2011-11-15 15:51 ` Roedel, Joerg
