public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/4] iommu: iommu_ops group interface
@ 2011-10-21 19:55 Alex Williamson
  2011-10-21 19:56 ` [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry Alex Williamson
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 19:55 UTC (permalink / raw)
  To: joerg.roedel, dwmw2, iommu
  Cc: linux-kernel, chrisw, agraf, dwg, scottwood, B08248, benh,
	alex.williamson

IOMMUs can't always distinguish transactions from each individual
device in a system.  Sometimes this is by design (such as powerpc
partitionable endpoints), other times by topology (PCIe-to-PCI
bridges masking downstream devices).  We call these sets of
indistinguishable devices "groups".

In order to support secure userspace drivers, like vfio, we need
an interface to expose the device-to-group relationship.  This
allows us to create policies ensuring that userspace controls all
of the devices in the group before allowing individual device
access.
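The policy described above can be sketched in plain userspace C. This is a hypothetical illustration, not part of the series: the `dev_info` structure and `group_is_safe()` helper are invented here, with the group numbers standing in for the values the new iommu_group sysfs attribute would report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace sketch: before granting access to one device,
 * verify that every device sharing its IOMMU group is already owned
 * (e.g. bound to a vfio-style driver) by the caller.  Group IDs stand
 * in for values read from each device's iommu_group sysfs attribute. */
struct dev_info {
	unsigned int group;	/* value of the device's iommu_group file */
	int owned;		/* nonzero if userspace controls this device */
};

static int group_is_safe(const struct dev_info *devs, size_t ndevs,
			 unsigned int group)
{
	size_t i;

	for (i = 0; i < ndevs; i++)
		if (devs[i].group == group && !devs[i].owned)
			return 0;	/* an unowned device shares the group */
	return 1;	/* all members owned (or group has no members) */
}
```

A group is only "safe" once every one of its members is controlled by the caller; a single unowned sibling is enough to deny access.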

This series implements the iommu_ops API interface and sysfs
interface for exposing groups to userspace.  This also includes
the intel-iommu and amd-iommu backend implementations.  It's
intended that the vfio driver will make use of these interfaces
to support generic device assignment for virtual machines.  See
git://github.com/awilliam/linux-vfio.git (vfio-ng) for a working
example using this interface.

Patches are based on Joerg's next branch to pick up the per-bus iommu_ops
support.

Note that amd-iommu is untested; I'm still working on setting up
an AMD-Vi capable system.  Thanks,

Alex

---

Alex Williamson (4):
      iommu: Add option to group multi-function devices
      amd-iommu: Implement iommu_device_group
      intel-iommu: Implement iommu_device_group
      iommu: Add iommu_device_group callback and iommu_group sysfs entry


 Documentation/kernel-parameters.txt |    4 ++
 arch/ia64/include/asm/iommu.h       |    2 +
 arch/ia64/kernel/pci-dma.c          |    1 +
 arch/x86/include/asm/iommu.h        |    1 +
 arch/x86/kernel/pci-dma.c           |   11 ++++++
 drivers/iommu/amd_iommu.c           |   21 ++++++++++++
 drivers/iommu/intel-iommu.c         |   47 +++++++++++++++++++++++++++
 drivers/iommu/iommu.c               |   61 +++++++++++++++++++++++++++++++++++
 include/linux/iommu.h               |    7 ++++
 9 files changed, 154 insertions(+), 1 deletions(-)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
@ 2011-10-21 19:56 ` Alex Williamson
  2011-11-30  2:42   ` David Gibson
  2011-10-21 19:56 ` [PATCH 2/4] intel-iommu: Implement iommu_device_group Alex Williamson
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 19:56 UTC (permalink / raw)
  To: joerg.roedel, dwmw2, iommu
  Cc: linux-kernel, chrisw, agraf, dwg, scottwood, B08248, benh,
	alex.williamson

An IOMMU group is a set of devices for which the IOMMU cannot
distinguish transactions.  For PCI devices, a group often occurs
when a PCI bridge is involved.  Transactions from any device
behind the bridge appear to be sourced from the bridge itself.
We leave it to the IOMMU driver to define the grouping constraints
for its platform.

Using this new interface, the group for a device can be retrieved
using the iommu_device_group() callback.  Users will compare the
value returned against the value returned for other devices to
determine whether they are part of the same group.  Devices with
no group are not translated by the IOMMU.  There should be no
expectations about the group numbers as they may be arbitrarily
assigned by the IOMMU driver and may not be persistent across boots.

We also provide a sysfs interface to the group numbers here so
that userspace can understand IOMMU dependencies between devices
for managing safe userspace drivers.
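A userspace consumer of the new attribute only needs to read a bare decimal number and compare it between devices. The sketch below is illustrative, not from the series: the `read_group()` helper is invented here, and the sysfs path a real caller would pass (e.g. under /sys/bus/pci/devices/) is assumed, not exercised.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sketch: read a device's iommu_group attribute, which per
 * this patch contains a bare decimal group number.  A missing attribute
 * means the device is not translated by an IOMMU. */
static int read_group(const char *path, unsigned int *groupid)
{
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;	/* no attribute: not behind an IOMMU */
	if (fscanf(f, "%u", groupid) == 1)
		ret = 0;	/* parsed a group number */
	fclose(f);
	return ret;
}
```

Two devices belong to the same group exactly when both reads succeed and return equal values; the numbers themselves carry no other meaning and may change across boots.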

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 drivers/iommu/iommu.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h |    7 ++++++
 2 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2fb2963..10615ad 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -25,8 +25,60 @@
 #include <linux/errno.h>
 #include <linux/iommu.h>
 
+static ssize_t show_iommu_group(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	unsigned int groupid;
+
+	if (iommu_device_group(dev, &groupid))
+		return 0;
+
+	return sprintf(buf, "%u\n", groupid);
+}
+static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
+
+static int add_remove_iommu_group(struct device *dev, void *data)
+{
+	unsigned int groupid;
+	int add = *(int *)data;
+
+	if (iommu_device_group(dev, &groupid) == 0) {
+		if (add)
+			return device_create_file(dev, &dev_attr_iommu_group);
+		else
+			device_remove_file(dev, &dev_attr_iommu_group);
+	}
+
+	return 0;
+}
+
+static int iommu_device_notifier(struct notifier_block *nb,
+				 unsigned long action, void *data)
+{
+	struct device *dev = data;
+	int add;
+
+	if (action == BUS_NOTIFY_ADD_DEVICE) {
+		add = 1;
+		return add_remove_iommu_group(dev, &add);
+	} else if (action == BUS_NOTIFY_DEL_DEVICE) {
+		add = 0;
+		return add_remove_iommu_group(dev, &add);
+	}
+
+	return 0;
+}
+
+static struct notifier_block iommu_device_nb = {
+	.notifier_call = iommu_device_notifier,
+};
+
 static void iommu_bus_init(struct bus_type *bus, struct iommu_ops *ops)
 {
+	int add = 1;
+
+	bus_register_notifier(bus, &iommu_device_nb);
+	bus_for_each_dev(bus, NULL, &add, add_remove_iommu_group);
 }
 
 /**
@@ -186,3 +238,12 @@ int iommu_unmap(struct iommu_domain *domain, unsigned long iova, int gfp_order)
 	return domain->ops->unmap(domain, iova, gfp_order);
 }
 EXPORT_SYMBOL_GPL(iommu_unmap);
+
+int iommu_device_group(struct device *dev, unsigned int *groupid)
+{
+	if (iommu_present(dev->bus) && dev->bus->iommu_ops->device_group)
+		return dev->bus->iommu_ops->device_group(dev, groupid);
+
+	return -ENODEV;
+}
+EXPORT_SYMBOL_GPL(iommu_device_group);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 432acc4..93617e7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -61,6 +61,7 @@ struct iommu_ops {
 				    unsigned long iova);
 	int (*domain_has_cap)(struct iommu_domain *domain,
 			      unsigned long cap);
+	int (*device_group)(struct device *dev, unsigned int *groupid);
 };
 
 extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops);
@@ -81,6 +82,7 @@ extern int iommu_domain_has_cap(struct iommu_domain *domain,
 				unsigned long cap);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 					iommu_fault_handler_t handler);
+extern int iommu_device_group(struct device *dev, unsigned int *groupid);
 
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
@@ -179,6 +181,11 @@ static inline void iommu_set_fault_handler(struct iommu_domain *domain,
 {
 }
 
+static inline int iommu_device_group(struct device *dev, unsigned int *groupid)
+{
+	return -ENODEV;
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #endif /* __LINUX_IOMMU_H */


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/4] intel-iommu: Implement iommu_device_group
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
  2011-10-21 19:56 ` [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry Alex Williamson
@ 2011-10-21 19:56 ` Alex Williamson
  2011-11-08 17:23   ` Roedel, Joerg
  2011-10-21 19:56 ` [PATCH 3/4] amd-iommu: " Alex Williamson
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 19:56 UTC (permalink / raw)
  To: joerg.roedel, dwmw2, iommu
  Cc: linux-kernel, chrisw, agraf, dwg, scottwood, B08248, benh,
	alex.williamson

We generally have BDF granularity for devices, so we just need
to make sure devices aren't hidden behind PCIe-to-PCI bridges.
We can then make up a group number that's simply the concatenated
seg|bus|dev|fn so we don't have to track them (not that users
should depend on that).
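The concatenation amounts to packing the PCI address into one 32-bit value. The patch does this with a union of u8/u16 fields; on a little-endian host that layout is equivalent to the explicit shifts in this standalone sketch (the `pci_groupid()` helper is invented here for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the group-number encoding: concatenate PCI segment, bus and
 * devfn into a single 32-bit value.  Distinctness is all that matters;
 * callers must not depend on the numeric layout. */
static uint32_t pci_groupid(uint16_t segment, uint8_t bus, uint8_t devfn)
{
	return ((uint32_t)segment << 16) |
	       ((uint32_t)bus << 8) |
	       devfn;
}
```

Because segment, bus and devfn occupy disjoint bit ranges, two devices get the same group number only when they resolve to the same seg/bus/devfn after the bridge walk, which is exactly the "cannot differentiate" condition.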

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 drivers/iommu/intel-iommu.c |   44 +++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2d53c3d..d23d66a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3901,6 +3901,49 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain,
 	return 0;
 }
 
+/* Group numbers are arbitrary.  Devices with the same group number
+ * indicate the iommu cannot differentiate between them.  To avoid
+ * tracking used groups we just use the seg|bus|devfn of the lowest
+ * level at which we're able to differentiate devices. */
+static int intel_iommu_device_group(struct device *dev, unsigned int *groupid)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_dev *bridge;
+	union {
+		struct {
+			u8 devfn;
+			u8 bus;
+			u16 segment;
+		} pci;
+		u32 group;
+	} id;
+
+	if (iommu_no_mapping(dev))
+		return -ENODEV;
+
+	id.pci.segment = pci_domain_nr(pdev->bus);
+	id.pci.bus = pdev->bus->number;
+	id.pci.devfn = pdev->devfn;
+
+	if (!device_to_iommu(id.pci.segment, id.pci.bus, id.pci.devfn))
+		return -ENODEV;
+
+	bridge = pci_find_upstream_pcie_bridge(pdev);
+	if (bridge) {
+		if (pci_is_pcie(bridge)) {
+			id.pci.bus = bridge->subordinate->number;
+			id.pci.devfn = 0;
+		} else {
+			id.pci.bus = bridge->bus->number;
+			id.pci.devfn = bridge->devfn;
+		}
+	}
+
+	*groupid = id.group;
+
+	return 0;
+}
+
 static struct iommu_ops intel_iommu_ops = {
 	.domain_init	= intel_iommu_domain_init,
 	.domain_destroy = intel_iommu_domain_destroy,
@@ -3910,6 +3953,7 @@ static struct iommu_ops intel_iommu_ops = {
 	.unmap		= intel_iommu_unmap,
 	.iova_to_phys	= intel_iommu_iova_to_phys,
 	.domain_has_cap = intel_iommu_domain_has_cap,
+	.device_group	= intel_iommu_device_group,
 };
 
 static void __devinit quirk_iommu_rwbf(struct pci_dev *dev)


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 3/4] amd-iommu: Implement iommu_device_group
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
  2011-10-21 19:56 ` [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry Alex Williamson
  2011-10-21 19:56 ` [PATCH 2/4] intel-iommu: Implement iommu_device_group Alex Williamson
@ 2011-10-21 19:56 ` Alex Williamson
  2011-10-21 19:56 ` [PATCH 4/4] iommu: Add option to group multi-function devices Alex Williamson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 19:56 UTC (permalink / raw)
  To: joerg.roedel, dwmw2, iommu
  Cc: linux-kernel, chrisw, agraf, dwg, scottwood, B08248, benh,
	alex.williamson

Just use the amd_iommu_alias_table directly.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 drivers/iommu/amd_iommu.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4ee277a..1d82b63 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2773,6 +2773,18 @@ static int amd_iommu_domain_has_cap(struct iommu_domain *domain,
 	return 0;
 }
 
+static int amd_iommu_device_group(struct device *dev, unsigned int *groupid)
+{
+	struct iommu_dev_data *dev_data = dev->archdata.iommu;
+
+	if (!dev_data)
+		return -ENODEV;
+
+	*groupid = amd_iommu_alias_table[dev_data->devid];
+
+	return 0;
+}
+
 static struct iommu_ops amd_iommu_ops = {
 	.domain_init = amd_iommu_domain_init,
 	.domain_destroy = amd_iommu_domain_destroy,
@@ -2782,6 +2794,7 @@ static struct iommu_ops amd_iommu_ops = {
 	.unmap = amd_iommu_unmap,
 	.iova_to_phys = amd_iommu_iova_to_phys,
 	.domain_has_cap = amd_iommu_domain_has_cap,
+	.device_group = amd_iommu_device_group,
 };
 
 /*****************************************************************************


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 4/4] iommu: Add option to group multi-function devices
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
                   ` (2 preceding siblings ...)
  2011-10-21 19:56 ` [PATCH 3/4] amd-iommu: " Alex Williamson
@ 2011-10-21 19:56 ` Alex Williamson
  2011-12-01  0:11   ` David Gibson
  2011-10-21 20:34 ` [PATCH 0/4] iommu: iommu_ops group interface Woodhouse, David
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 19:56 UTC (permalink / raw)
  To: joerg.roedel, dwmw2, iommu
  Cc: linux-kernel, chrisw, agraf, dwg, scottwood, B08248, benh,
	alex.williamson

The option iommu=group_mf tells the iommu driver to expose all
functions of a multi-function PCI device as the same
iommu_device_group.  This is useful for preventing individual functions
from being exposed to userspace as independent devices, since there are
often hidden dependencies between them.  Virtual functions are not
affected by this option.
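The folding itself is just masking the function bits out of devfn. The sketch below mirrors the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC encoding (slot in bits 7:3, function in bits 2:0); the `group_mf_devfn()` wrapper is invented here for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* devfn encoding as used by the kernel's PCI macros:
 * bits 7:3 = slot (device), bits 2:0 = function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* group_mf folding: map every function of a device onto function 0,
 * i.e. clear the low three bits of devfn. */
static uint8_t group_mf_devfn(uint8_t devfn)
{
	return PCI_DEVFN(PCI_SLOT(devfn), 0);	/* same as devfn & ~0x07 */
}
```

After folding, all functions of one slot produce the same devfn, so the seg|bus|devfn group numbers collapse to a single group per device, which is the intended effect of iommu=group_mf.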

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 Documentation/kernel-parameters.txt |    4 +++-
 arch/ia64/include/asm/iommu.h       |    2 ++
 arch/ia64/kernel/pci-dma.c          |    1 +
 arch/x86/include/asm/iommu.h        |    1 +
 arch/x86/kernel/pci-dma.c           |   11 +++++++++++
 drivers/iommu/amd_iommu.c           |   10 +++++++++-
 drivers/iommu/intel-iommu.c         |    3 +++
 7 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 614d038..db78b6f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1036,7 +1036,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 		nomerge
 		forcesac
 		soft
-		pt	[x86, IA-64]
+		pt		[x86, IA-64]
+		group_mf	[x86, IA-64]
+
 
 	io7=		[HW] IO7 for Marvel based alpha systems
 			See comment before marvel_specify_io7 in
diff --git a/arch/ia64/include/asm/iommu.h b/arch/ia64/include/asm/iommu.h
index 745e095..196b7b2 100644
--- a/arch/ia64/include/asm/iommu.h
+++ b/arch/ia64/include/asm/iommu.h
@@ -11,8 +11,10 @@ extern int force_iommu, no_iommu;
 extern int iommu_detected;
 #ifdef	CONFIG_DMAR
 extern int iommu_pass_through;
+extern int iommu_group_mf;
 #else
 #define iommu_pass_through	(0)
+#define iommu_group_mf		(0)
 #endif
 extern void iommu_dma_init(void);
 extern void machvec_init(const char *name);
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index f6b1ff0..7b024c7 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -33,6 +33,7 @@ int force_iommu __read_mostly;
 #endif
 
 int iommu_pass_through;
+int iommu_group_mf;
 
 /* Dummy device used for NULL arguments (normally ISA). Better would
    be probably a smaller DMA mask, but this is bug-to-bug compatible
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index 345c99c..dffc38e 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -5,6 +5,7 @@ extern struct dma_map_ops nommu_dma_ops;
 extern int force_iommu, no_iommu;
 extern int iommu_detected;
 extern int iommu_pass_through;
+extern int iommu_group_mf;
 
 /* 10 seconds */
 #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index b49d00d..3b730fb 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -44,6 +44,15 @@ int iommu_detected __read_mostly = 0;
  */
 int iommu_pass_through __read_mostly;
 
+/*
+ * Group multi-function PCI devices into a single device-group for the
+ * iommu_device_group interface.  This tells the iommu driver to pretend
+ * it cannot distinguish between functions of a device, exposing only one
+ * group for the device.  Useful for disallowing use of individual PCI
+ * functions from userspace drivers.
+ */
+int iommu_group_mf __read_mostly;
+
 extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
 
 /* Dummy device used for NULL arguments (normally ISA). */
@@ -168,6 +177,8 @@ static __init int iommu_setup(char *p)
 #endif
 		if (!strncmp(p, "pt", 2))
 			iommu_pass_through = 1;
+		if (!strncmp(p, "group_mf", 8))
+			iommu_group_mf = 1;
 
 		gart_parse_options(p);
 
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1d82b63..6f75536 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2776,11 +2776,19 @@ static int amd_iommu_domain_has_cap(struct iommu_domain *domain,
 static int amd_iommu_device_group(struct device *dev, unsigned int *groupid)
 {
 	struct iommu_dev_data *dev_data = dev->archdata.iommu;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 devid;
 
 	if (!dev_data)
 		return -ENODEV;
 
-	*groupid = amd_iommu_alias_table[dev_data->devid];
+	if (pdev->is_virtfn || !iommu_group_mf)
+		devid = dev_data->devid;
+	else
+		devid = calc_devid(pdev->bus->number,
+				   PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));
+
+	*groupid = amd_iommu_alias_table[devid];
 
 	return 0;
 }
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d23d66a..a5e1a2f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3939,6 +3939,9 @@ static int intel_iommu_device_group(struct device *dev, unsigned int *groupid)
 		}
 	}
 
+	if (!pdev->is_virtfn && iommu_group_mf)
+		id.pci.devfn = PCI_DEVFN(PCI_SLOT(id.pci.devfn), 0);
+
 	*groupid = id.group;
 
 	return 0;


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/4] iommu: iommu_ops group interface
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
                   ` (3 preceding siblings ...)
  2011-10-21 19:56 ` [PATCH 4/4] iommu: Add option to group multi-function devices Alex Williamson
@ 2011-10-21 20:34 ` Woodhouse, David
  2011-10-21 21:16   ` Alex Williamson
  2011-10-21 22:34   ` Alex Williamson
  2011-10-27 16:31 ` Alex Williamson
  2011-11-15 15:51 ` Roedel, Joerg
  6 siblings, 2 replies; 37+ messages in thread
From: Woodhouse, David @ 2011-10-21 20:34 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joerg.roedel@amd.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org


On Fri, 2011-10-21 at 13:55 -0600, Alex Williamson wrote:
> IOMMUs can't always distinguish transactions from each individual
> device in a system.  Sometimes this is by design (such as powerpc
> partitionable endpoints), other times by topology (PCIe-to-PCI
> bridges masking downstream devices).  We call these sets of
> indistinguishable devices "groups". 

Other times it's because a multi-function PCIe device is broken and does
all its DMA from function zero.... like some Ricoh devices seen in
laptops. Can you handle quirks to "group" those too?

-- 
                   Sent with MeeGo's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/4] iommu: iommu_ops group interface
  2011-10-21 20:34 ` [PATCH 0/4] iommu: iommu_ops group interface Woodhouse, David
@ 2011-10-21 21:16   ` Alex Williamson
  2011-10-21 22:39     ` Woodhouse, David
  2011-10-21 22:34   ` Alex Williamson
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 21:16 UTC (permalink / raw)
  To: Woodhouse, David
  Cc: joerg.roedel@amd.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org

On Fri, 2011-10-21 at 20:34 +0000, Woodhouse, David wrote:
> On Fri, 2011-10-21 at 13:55 -0600, Alex Williamson wrote:
> > IOMMUs can't always distinguish transactions from each individual
> > device in a system.  Sometimes this is by design (such as powerpc
> > partitionable endpoints), other times by topology (PCIe-to-PCI
> > bridges masking downstream devices).  We call these sets of
> > indistinguishable devices "groups". 
> 
> Other times it's because a multi-function PCIe device is broken and does
> all its DMA from function zero.... like some Ricoh devices seen in
> laptops. Can you handle quirks to "group" those too?

I could add:

drivers/pci/quirks.c: int pci_iommu_group_mf_quirk(struct pci_dev *)

That's used just like iommu_group_mf to always blacklist specific
devices.  Let me know VID/DID if you have them.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/4] iommu: iommu_ops group interface
  2011-10-21 20:34 ` [PATCH 0/4] iommu: iommu_ops group interface Woodhouse, David
  2011-10-21 21:16   ` Alex Williamson
@ 2011-10-21 22:34   ` Alex Williamson
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-10-21 22:34 UTC (permalink / raw)
  To: Woodhouse, David
  Cc: joerg.roedel@amd.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org

On Fri, 2011-10-21 at 15:16 -0600, Alex Williamson wrote:
> On Fri, 2011-10-21 at 20:34 +0000, Woodhouse, David wrote:
> > On Fri, 2011-10-21 at 13:55 -0600, Alex Williamson wrote:
> > > IOMMUs can't always distinguish transactions from each individual
> > > device in a system.  Sometimes this is by design (such as powerpc
> > > partitionable endpoints), other times by topology (PCIe-to-PCI
> > > bridges masking downstream devices).  We call these sets of
> > > indistinguishable devices "groups". 
> > 
> > Other times it's because a multi-function PCIe device is broken and does
> > all its DMA from function zero.... like some Ricoh devices seen in
> > laptops. Can you handle quirks to "group" those too?
> 
> I could add:
> 
> drivers/pci/quirks.c: int pci_iommu_group_mf_quirk(struct pci_dev *)
> 
> That's used just like iommu_group_mf to always blacklist specific
> devices.  Let me know VID/DID if you have them.  Thanks,

I guess just a bit and a quirk is easiest, something like below
(quirking an 82576 just as an example).  Thanks,

Alex

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 6f75536..de158d9 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2782,11 +2782,11 @@ static int amd_iommu_device_group(struct device *dev, unsigned int *groupid)
 	if (!dev_data)
 		return -ENODEV;
 
-	if (pdev->is_virtfn || !iommu_group_mf)
-		devid = dev_data->devid;
-	else
+	if (pdev->group_mf || (iommu_group_mf && !pdev->is_virtfn))
 		devid = calc_devid(pdev->bus->number,
 				   PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));
+	else
+		devid = dev_data->devid;
 
 	*groupid = amd_iommu_alias_table[devid];
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a5e1a2f..fba0256 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3939,7 +3939,7 @@ static int intel_iommu_device_group(struct device *dev, unsigned int *groupid)
 		}
 	}
 
-	if (!pdev->is_virtfn && iommu_group_mf)
+	if (pdev->group_mf || (iommu_group_mf && !pdev->is_virtfn))
 		id.pci.devfn = PCI_DEVFN(PCI_SLOT(id.pci.devfn), 0);
 
 	*groupid = id.group;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 1196f61..4764645 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3009,3 +3009,9 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 
 	return -ENOTTY;
 }
+
+static void __devinit quirk_iommu_group_mf(struct pci_dev *dev)
+{
+	dev->group_mf = 1;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x10c9, quirk_iommu_group_mf);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8c230cb..d763027 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -322,6 +322,7 @@ struct pci_dev {
 	unsigned int    is_hotplug_bridge:1;
 	unsigned int    __aer_firmware_first_valid:1;
 	unsigned int	__aer_firmware_first:1;
+	unsigned int	group_mf:1;	/* per device iommu=group_mf */
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
 




^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/4] iommu: iommu_ops group interface
  2011-10-21 21:16   ` Alex Williamson
@ 2011-10-21 22:39     ` Woodhouse, David
  0 siblings, 0 replies; 37+ messages in thread
From: Woodhouse, David @ 2011-10-21 22:39 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joerg.roedel@amd.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org


On Fri, 2011-10-21 at 15:16 -0600, Alex Williamson wrote:
> 
> 
> drivers/pci/quirks.c: int pci_iommu_group_mf_quirk(struct pci_dev *)
> 
> That's used just like iommu_group_mf to always blacklist specific
> devices.  Let me know VID/DID if you have them.  Thanks, 

Don't remember the ID offhand, but there are reports in Red Hat bugzilla
of Ricoh multifunction devices on some (HP, I think) laptops which cause
streams of DMA errors unless the sdhci driver is blacklisted...

-- 
                   Sent with MeeGo's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/4] iommu: iommu_ops group interface
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
                   ` (4 preceding siblings ...)
  2011-10-21 20:34 ` [PATCH 0/4] iommu: iommu_ops group interface Woodhouse, David
@ 2011-10-27 16:31 ` Alex Williamson
  2011-11-15 15:51 ` Roedel, Joerg
  6 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-10-27 16:31 UTC (permalink / raw)
  To: joerg.roedel
  Cc: dwmw2, iommu, linux-kernel, chrisw, agraf, dwg, scottwood, B08248,
	benh

On Fri, 2011-10-21 at 13:55 -0600, Alex Williamson wrote:
> IOMMUs can't always distinguish transactions from each individual
> device in a system.  Sometimes this is by design (such as powerpc
> partitionable endpoints), other times by topology (PCIe-to-PCI
> bridges masking downstream devices).  We call these sets of
> indistinguishable devices "groups".
> 
> In order to support secure userspace drivers, like vfio, we need
> an interface to expose the device-to-group relationship.  This
> allows us to create policies ensuring that userspace controls all
> of the devices in the group before allowing individual device
> access.
> 
> This series implements the iommu_ops API interface and sysfs
> interface for exposing groups to userspace.  This also includes
> the intel-iommu and amd-iommu backend implementations.  It's
> intended that the vfio driver will make use of these interfaces
> to support generic device assignment for virtual machines.  See
> git://github.com/awilliam/linux-vfio.git (vfio-ng) for a working
> example using this interface.
> 
> Patches are based on Joerg's next branch to pick up the per-bus iommu_ops
> support.
> 
> Note that amd-iommu is untested; I'm still working on setting up
> an AMD-Vi capable system.  Thanks,

Now tested, even seems to work :)

Alex

> ---
> 
> Alex Williamson (4):
>       iommu: Add option to group multi-function devices
>       amd-iommu: Implement iommu_device_group
>       intel-iommu: Implement iommu_device_group
>       iommu: Add iommu_device_group callback and iommu_group sysfs entry
> 
> 
>  Documentation/kernel-parameters.txt |    4 ++
>  arch/ia64/include/asm/iommu.h       |    2 +
>  arch/ia64/kernel/pci-dma.c          |    1 +
>  arch/x86/include/asm/iommu.h        |    1 +
>  arch/x86/kernel/pci-dma.c           |   11 ++++++
>  drivers/iommu/amd_iommu.c           |   21 ++++++++++++
>  drivers/iommu/intel-iommu.c         |   47 +++++++++++++++++++++++++++
>  drivers/iommu/iommu.c               |   61 +++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h               |    7 ++++
>  9 files changed, 154 insertions(+), 1 deletions(-)




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/4] intel-iommu: Implement iommu_device_group
  2011-10-21 19:56 ` [PATCH 2/4] intel-iommu: Implement iommu_device_group Alex Williamson
@ 2011-11-08 17:23   ` Roedel, Joerg
  2011-11-10 15:22     ` David Woodhouse
  0 siblings, 1 reply; 37+ messages in thread
From: Roedel, Joerg @ 2011-11-08 17:23 UTC (permalink / raw)
  To: Alex Williamson
  Cc: dwmw2@infradead.org, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org

On Fri, Oct 21, 2011 at 03:56:11PM -0400, Alex Williamson wrote:
> We generally have BDF granularity for devices, so we just need
> to make sure devices aren't hidden behind PCIe-to-PCI bridges.
> We can then make up a group number that's simply the concatenated
> seg|bus|dev|fn so we don't have to track them (not that users
> should depend on that).
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

David, does this look good to you? Can you send an Acked-by if it's okay?

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/4] intel-iommu: Implement iommu_device_group
  2011-11-08 17:23   ` Roedel, Joerg
@ 2011-11-10 15:22     ` David Woodhouse
  0 siblings, 0 replies; 37+ messages in thread
From: David Woodhouse @ 2011-11-10 15:22 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Alex Williamson, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org

On Tue, 2011-11-08 at 18:23 +0100, Roedel, Joerg wrote:
> On Fri, Oct 21, 2011 at 03:56:11PM -0400, Alex Williamson wrote:
> > We generally have BDF granularity for devices, so we just need
> > to make sure devices aren't hidden behind PCIe-to-PCI bridges.
> > We can then make up a group number that's simply the concatenated
> > seg|bus|dev|fn so we don't have to track them (not that users
> > should depend on that).
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> 
> David, does this look good to you? Can you send an Acked-by if its okay?

Looks good to me in principle, although I'd like to see it tested on one
of those HP laptops with the dodgy Ricoh multi-function devices.

Acked-By: David Woodhouse <David.Woodhouse@intel.com>

-- 
dwmw2


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/4] iommu: iommu_ops group interface
  2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
                   ` (5 preceding siblings ...)
  2011-10-27 16:31 ` Alex Williamson
@ 2011-11-15 15:51 ` Roedel, Joerg
  6 siblings, 0 replies; 37+ messages in thread
From: Roedel, Joerg @ 2011-11-15 15:51 UTC (permalink / raw)
  To: Alex Williamson
  Cc: dwmw2@infradead.org, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, chrisw@redhat.com, agraf@suse.de,
	dwg@au1.ibm.com, scottwood@freescale.com, B08248@freescale.com,
	benh@kernel.crashing.org

On Fri, Oct 21, 2011 at 03:55:58PM -0400, Alex Williamson wrote:
> Alex Williamson (4):
>       iommu: Add option to group multi-function devices
>       amd-iommu: Implement iommu_device_group
>       intel-iommu: Implement iommu_device_group
>       iommu: Add iommu_device_group callback and iommu_group sysfs entry
> 
> 
>  Documentation/kernel-parameters.txt |    4 ++
>  arch/ia64/include/asm/iommu.h       |    2 +
>  arch/ia64/kernel/pci-dma.c          |    1 +
>  arch/x86/include/asm/iommu.h        |    1 +
>  arch/x86/kernel/pci-dma.c           |   11 ++++++
>  drivers/iommu/amd_iommu.c           |   21 ++++++++++++
>  drivers/iommu/intel-iommu.c         |   47 +++++++++++++++++++++++++++
>  drivers/iommu/iommu.c               |   61 +++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h               |    7 ++++
>  9 files changed, 154 insertions(+), 1 deletions(-)

Applied with some changes to patch 1. I basically split up the
add_remove function into an add and a remove function. I think the code
is cleaner that way.


	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-10-21 19:56 ` [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry Alex Williamson
@ 2011-11-30  2:42   ` David Gibson
  2011-11-30  4:51     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 37+ messages in thread
From: David Gibson @ 2011-11-30  2:42 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joerg.roedel, dwmw2, iommu, linux-kernel, chrisw, agraf,
	scottwood, B08248, benh

On Fri, Oct 21, 2011 at 01:56:05PM -0600, Alex Williamson wrote:
> An IOMMU group is a set of devices for which the IOMMU cannot
> distinguish transactions.  For PCI devices, a group often occurs
> when a PCI bridge is involved.  Transactions from any device
> behind the bridge appear to be sourced from the bridge itself.
> We leave it to the IOMMU driver to define the grouping restraints
> for their platform.
> 
> Using this new interface, the group for a device can be retrieved
> using the iommu_device_group() callback.  Users will compare the
> value returned against the value returned for other devices to
> determine whether they are part of the same group.  Devices with
> no group are not translated by the IOMMU.  There should be no
> expectations about the group numbers as they may be arbitrarily
> assigned by the IOMMU driver and may not be persistent across boots.
> 
> We also provide a sysfs interface to the group numbers here so
> that userspace can understand IOMMU dependencies between devices
> for managing safe, userspace drivers.

Finally giving these patches a close read.  Sorry it's been so long.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
> 
>  drivers/iommu/iommu.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h |    7 ++++++
>  2 files changed, 68 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 2fb2963..10615ad 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -25,8 +25,60 @@
>  #include <linux/errno.h>
>  #include <linux/iommu.h>
>  
> +static ssize_t show_iommu_group(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	unsigned int groupid;
> +
> +	if (iommu_device_group(dev, &groupid))
> +		return 0;
> +
> +	return sprintf(buf, "%u", groupid);
> +}
> +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);

Hrm.  Assuming the group id is an unsigned int seems dangerous to me.
More seriously, we really want these to be unique across the whole
system, but they're allocated by the iommu driver which can't
guarantee that if it's not the only one present.  Seems to me it would
be safer to have an actual iommu_group structure allocated for each
group, and use the pointer to it as the ID to hand around (with NULL
meaning "no iommu" / untranslated).  The structure could contain a
more human readable - or more relevant to platform documentation - ID
where appropriate.

> +static int add_remove_iommu_group(struct device *dev, void *data)
> +{
> +	unsigned int groupid;
> +	int add = *(int *)data;
> +
> +	if (iommu_device_group(dev, &groupid) == 0) {
> +		if (add)
> +			return device_create_file(dev, &dev_attr_iommu_group);
> +		else
> +			device_remove_file(dev, &dev_attr_iommu_group);
> +	}
> +
> +	return 0;
> +}

Multiplexing add and remove together seems pointlessly obfuscated.

> +static int iommu_device_notifier(struct notifier_block *nb,
> +				 unsigned long action, void *data)
> +{
> +	struct device *dev = data;
> +	int add;
> +
> +	if (action == BUS_NOTIFY_ADD_DEVICE) {
> +		add = 1;
> +		return add_remove_iommu_group(dev, &add);
> +	} else if (action == BUS_NOTIFY_DEL_DEVICE) {
> +		add = 0;
> +		return add_remove_iommu_group(dev, &add);
> +	}
> +
> +	return 0;
> +}
> +
> +static struct notifier_block iommu_device_nb = {
> +	.notifier_call = iommu_device_notifier,
> +};
> +
>  static void iommu_bus_init(struct bus_type *bus, struct iommu_ops *ops)
>  {

I don't know of any current examples, but I do worry that making this
per bus-type rather than bus-instance might bite us in the arse later
on.

> +	int add = 1;
> +
> +	bus_register_notifier(bus, &iommu_device_nb);
> +	bus_for_each_dev(bus, NULL, &add, add_remove_iommu_group);
>  }
>  
>  /**
> @@ -186,3 +238,12 @@ int iommu_unmap(struct iommu_domain *domain, unsigned long iova, int gfp_order)
>  	return domain->ops->unmap(domain, iova, gfp_order);
>  }
>  EXPORT_SYMBOL_GPL(iommu_unmap);
> +
> +int iommu_device_group(struct device *dev, unsigned int *groupid)
> +{
> +	if (iommu_present(dev->bus) && dev->bus->iommu_ops->device_group)
> +		return dev->bus->iommu_ops->device_group(dev, groupid);
> +
> +	return -ENODEV;
> +}
> +EXPORT_SYMBOL_GPL(iommu_device_group);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 432acc4..93617e7 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -61,6 +61,7 @@ struct iommu_ops {
>  				    unsigned long iova);
>  	int (*domain_has_cap)(struct iommu_domain *domain,
>  			      unsigned long cap);
> +	int (*device_group)(struct device *dev, unsigned int *groupid);
>  };
>  
>  extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops);
> @@ -81,6 +82,7 @@ extern int iommu_domain_has_cap(struct iommu_domain *domain,
>  				unsigned long cap);
>  extern void iommu_set_fault_handler(struct iommu_domain *domain,
>  					iommu_fault_handler_t handler);
> +extern int iommu_device_group(struct device *dev, unsigned int *groupid);
>  
>  /**
>   * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
> @@ -179,6 +181,11 @@ static inline void iommu_set_fault_handler(struct iommu_domain *domain,
>  {
>  }
>  
> +static inline int iommu_device_group(struct device *dev, unsigned int *groupid)
> +{
> +	return -ENODEV;
> +}
> +
>  #endif /* CONFIG_IOMMU_API */
>  
>  #endif /* __LINUX_IOMMU_H */
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-11-30  2:42   ` David Gibson
@ 2011-11-30  4:51     ` Benjamin Herrenschmidt
  2011-11-30  5:25       ` Alex Williamson
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2011-11-30  4:51 UTC (permalink / raw)
  To: David Gibson
  Cc: Alex Williamson, joerg.roedel, dwmw2, iommu, linux-kernel, chrisw,
	agraf, scottwood, B08248

On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:

> > +static ssize_t show_iommu_group(struct device *dev,
> > +				struct device_attribute *attr, char *buf)
> > +{
> > +	unsigned int groupid;
> > +
> > +	if (iommu_device_group(dev, &groupid))
> > +		return 0;
> > +
> > +	return sprintf(buf, "%u", groupid);
> > +}
> > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> 
> Hrm.  Assuming the group id is an unsigned int seems dangerous to me.
> More seriously, we really want these to be unique across the whole
> system, but they're allocated by the iommu driver which can't
> guarantee that if it's not the only one present.  Seems to me it would
> be safer to have an actual iommu_group structure allocated for each
> group, and use the pointer to it as the ID to hand around (with NULL
> meaning "no iommu" / untranslated).  The structure could contain a
> more human readable - or more relevant to platform documentation - ID
> where appropriate.

Don't forget that to keep sanity, we really want to expose the groups
via sysfs (per-group dir with symlinks to the devices).

I'm working with Alexey on providing an in-kernel powerpc specific API
to expose the PE stuff to whatever's going to interface to VFIO to
create the groups, though we can eventually collapse that. The idea is
that on non-PE capable bridges (old style), I would make a single group
per host bridge.

In addition, Alex, I noticed that you still have the domain stuff there,
which is fine I suppose, we could make it a requirement on power that
you only put a single group in a domain... but the API is still to put
individual devices in a domain, not groups, and that somewhat sucks.

You could "fix" that by having some kind of ->domain_enable() or
whatever that's used to "activate" the domain and verifies that it
contains entire groups but that looks like a pointless way to complicate
both the API and the implementation.

Cheers,
Ben.




* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-11-30  4:51     ` Benjamin Herrenschmidt
@ 2011-11-30  5:25       ` Alex Williamson
  2011-11-30  9:23         ` Benjamin Herrenschmidt
  2011-12-01  0:03         ` David Gibson
  0 siblings, 2 replies; 37+ messages in thread
From: Alex Williamson @ 2011-11-30  5:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: David Gibson, joerg.roedel, dwmw2, iommu, linux-kernel, chrisw,
	agraf, scottwood, B08248

On Wed, 2011-11-30 at 15:51 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:
> 
> > > +static ssize_t show_iommu_group(struct device *dev,
> > > +				struct device_attribute *attr, char *buf)
> > > +{
> > > +	unsigned int groupid;
> > > +
> > > +	if (iommu_device_group(dev, &groupid))
> > > +		return 0;
> > > +
> > > +	return sprintf(buf, "%u", groupid);
> > > +}
> > > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> > 
> > Hrm.  Assuming the group id is an unsigned int seems dangerous to me.
> > More seriously, we really want these to be unique across the whole
> > system, but they're allocated by the iommu driver which can't
> > guarantee that if it's not the only one present.  Seems to me it would
> > be safer to have an actual iommu_group structure allocated for each
> > group, and use the pointer to it as the ID to hand around (with NULL
> > meaning "no iommu" / untranslated).  The structure could contain a
> > more human readable - or more relevant to platform documentation - ID
> > where appropriate.

Note that iommu drivers are registered per bus_type, so the unique pair
is {bus_type, groupid}, which seems sufficient for vfio.

> Don't forget that to keep sanity, we really want to expose the groups
> via sysfs (per-group dir with symlinks to the devices).
> 
> I'm working with Alexey on providing an in-kernel powerpc specific API
> to expose the PE stuff to whatever's going to interface to VFIO to
> create the groups, though we can eventually collapse that. The idea is
> that on non-PE capable bridges (old style), I would make a single group
> per host bridge.

If your non-PE capable bridges aren't actually providing isolation, they
should return -ENODEV for the device_group() callback, then vfio will
ignore them.

> In addition, Alex, I noticed that you still have the domain stuff there,
> which is fine I suppose, we could make it a requirement on power that
> you only put a single group in a domain... but the API is still to put
> individual devices in a domain, not groups, and that somewhat sucks.
> 
> You could "fix" that by having some kind of ->domain_enable() or
> whatever that's used to "activate" the domain and verifies that it
> contains entire groups but that looks like a pointless way to complicate
> both the API and the implementation.

Right, groups are currently just a way to identify dependent sets, not a
unit of work.  We can also have group membership change dynamically
(hotplug slot behind a PCIe-to-PCI bridge), so there are cases where we
might need to formally attach/detach a group element to a domain at some
later point.  This really hasn't felt like a stumbling point for vfio,
at least on x86.  Thanks,

Alex



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-11-30  5:25       ` Alex Williamson
@ 2011-11-30  9:23         ` Benjamin Herrenschmidt
  2011-12-01  0:06           ` David Gibson
  2011-12-01  6:20           ` Alex Williamson
  2011-12-01  0:03         ` David Gibson
  1 sibling, 2 replies; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2011-11-30  9:23 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Gibson, joerg.roedel, dwmw2, iommu, linux-kernel, chrisw,
	agraf, scottwood, B08248

On Tue, 2011-11-29 at 22:25 -0700, Alex Williamson wrote:

> Note that iommu drivers are registered per bus_type, so the unique pair
> is {bus_type, groupid}, which seems sufficient for vfio.
> 
> > Don't forget that to keep sanity, we really want to expose the groups
> > via sysfs (per-group dir with symlinks to the devices).
> > 
> > I'm working with Alexey on providing an in-kernel powerpc specific API
> > to expose the PE stuff to whatever's going to interface to VFIO to
> > create the groups, though we can eventually collapse that. The idea is
> > that on non-PE capable bridges (old style), I would make a single group
> > per host bridge.
> 
> If your non-PE capable bridges aren't actually providing isolation, they
> should return -ENODEV for the device_group() callback, then vfio will
> ignore them.

Why ignore them? It's perfectly fine to consider everything below the
host bridge as one group. There is isolation ... at the host bridge
level.

Really groups should be a structure, not a magic number. We want to
iterate them and their content, represent them via an API, etc... and so
magic numbers means that anything under the hood will have to constantly
convert between that and some kind of internal data structure.

I also somewhat dislike the bus_type as the anchor to the grouping
system, but that's not necessarily as bad an issue for us to deal with.

Eventually what will happen on my side is that I will have a powerpc
"generic" API (ie. across platforms) that allows enumerating groups and
retrieving the dma windows associated with them etc...

That API will use underlying function pointers provided by the PCI host
bridge (for which we do have a data structure, struct pci_controller,
like many other archs except I think x86 :-)

Any host platform that doesn't provide those pointers (ie. all of them
initially) will get a default behaviour which is to group everything
below a host bridge (since host bridges still have independent iommu
windows, at least for us they all do). 

On top of that we can implement a "backend" that provides those pointers
for the p7ioc bridge used on the powernv platform, which will expose
more fine grained groups based on our "partitionable endpoint"
mechanism.

The grouping will have been decided early at boot time based on a mix of
HW resources and bus topology, plus things like whether there is a PCI-X
bridge etc... and will be initially immutable.

Ideally, we need to expose a subset of this API as a "generic" interface
to allow generic code to iterate the groups and their content, and to
construct the appropriate sysfs representation.

> > In addition, Alex, I noticed that you still have the domain stuff there,
> > which is fine I suppose, we could make it a requirement on power that
> > you only put a single group in a domain... but the API is still to put
> > individual devices in a domain, not groups, and that somewhat sucks.
> > 
> > You could "fix" that by having some kind of ->domain_enable() or
> > whatever that's used to "activate" the domain and verifies that it
> > contains entire groups but that looks like a pointless way to complicate
> > both the API and the implementation.
> 
> Right, groups are currently just a way to identify dependent sets, not a
> unit of work.  We can also have group membership change dynamically
> (hotplug slot behind a PCIe-to-PCI bridge), so there are cases where we
> might need to formally attach/detach a group element to a domain at some
> later point.  This really hasn't felt like a stumbling point for vfio,
> at least on x86.  Thanks,

It doesn't matter much as long as we have a way to know that a group is
"complete", ie that all devices of a group have been taken over by vfio
and put into a domain, and block them from being lost. Only then can we
actually "use" the group and start reconfiguring the iommu etc... for
use by the guest.

Note that groups -will- contain bridges eventually. We need to take that
into account since bridges -usually- don't have an ordinary driver
attached to them so there may be issues there with tracking whether they
are taken over by vfio...

Cheers,
Ben.




* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-11-30  5:25       ` Alex Williamson
  2011-11-30  9:23         ` Benjamin Herrenschmidt
@ 2011-12-01  0:03         ` David Gibson
  2011-12-01  0:52           ` Chris Wright
  2011-12-01  6:48           ` Alex Williamson
  1 sibling, 2 replies; 37+ messages in thread
From: David Gibson @ 2011-12-01  0:03 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Benjamin Herrenschmidt, joerg.roedel, dwmw2, iommu, linux-kernel,
	chrisw, agraf, scottwood, B08248

On Tue, Nov 29, 2011 at 10:25:51PM -0700, Alex Williamson wrote:
> On Wed, 2011-11-30 at 15:51 +1100, Benjamin Herrenschmidt wrote:
> > On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:
> > 
> > > > +static ssize_t show_iommu_group(struct device *dev,
> > > > +				struct device_attribute *attr, char *buf)
> > > > +{
> > > > +	unsigned int groupid;
> > > > +
> > > > +	if (iommu_device_group(dev, &groupid))
> > > > +		return 0;
> > > > +
> > > > +	return sprintf(buf, "%u", groupid);
> > > > +}
> > > > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> > > 
> > > Hrm.  Assuming the group id is an unsigned int seems dangerous to me.
> > > More seriously, we really want these to be unique across the whole
> > > system, but they're allocated by the iommu driver which can't
> > > guarantee that if it's not the only one present.  Seems to me it would
> > > be safer to have an actual iommu_group structure allocated for each
> > > group, and use the pointer to it as the ID to hand around (with NULL
> > > meaning "no iommu" / untranslated).  The structure could contain a
> > > more human readable - or more relevant to platform documentation - ID
> > > where appropriate.
> 
> Note that iommu drivers are registered per bus_type, so the unique pair
> is {bus_type, groupid}, which seems sufficient for vfio.

Hrm.  That's.. far from obvious.  And still breaks down if we have two
separate iommus on the same bus type (e.g. two independent PCI host
bridges with inbuilt IOMMUs).

> > Don't forget that to keep sanity, we really want to expose the groups
> > via sysfs (per-group dir with symlinks to the devices).
> > 
> > I'm working with Alexey on providing an in-kernel powerpc specific API
> > to expose the PE stuff to whatever's going to interface to VFIO to
> > create the groups, though we can eventually collapse that. The idea is
> > that on non-PE capable bridges (old style), I would make a single group
> > per host bridge.
> 
> If your non-PE capable bridges aren't actually providing isolation, they
> should return -ENODEV for the device_group() callback, then vfio will
> ignore them.
> 
> > In addition, Alex, I noticed that you still have the domain stuff there,
> > which is fine I suppose, we could make it a requirement on power that
> > you only put a single group in a domain... but the API is still to put
> > individual devices in a domain, not groups, and that somewhat sucks.
> > 
> > You could "fix" that by having some kind of ->domain_enable() or
> > whatever that's used to "activate" the domain and verifies that it
> > contains entire groups but that looks like a pointless way to complicate
> > both the API and the implementation.
> 
> Right, groups are currently just a way to identify dependent sets, not a
> unit of work.

I'm not quite sure what you mean by "unit of work".  But assigning the
groups as a unit generally makes more sense to me than assigning
devices individually, but only being able to use them when the group
is together.  Particularly when things are hotplugged into groups.

>  We can also have group membership change dynamically
> (hotplug slot behind a PCIe-to-PCI bridge), so there are cases where we
> might need to formally attach/detach a group element to a domain at some
> later point.  This really hasn't felt like a stumbling point for vfio,
> at least on x86.  Thanks,
> 
> Alex
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-11-30  9:23         ` Benjamin Herrenschmidt
@ 2011-12-01  0:06           ` David Gibson
  2011-12-01  6:20           ` Alex Williamson
  1 sibling, 0 replies; 37+ messages in thread
From: David Gibson @ 2011-12-01  0:06 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alex Williamson, joerg.roedel, dwmw2, iommu, linux-kernel, chrisw,
	agraf, scottwood, B08248

On Wed, Nov 30, 2011 at 08:23:48PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2011-11-29 at 22:25 -0700, Alex Williamson wrote:
> 
> > Note that iommu drivers are registered per bus_type, so the unique pair
> > is {bus_type, groupid}, which seems sufficient for vfio.
> > 
> > > Don't forget that to keep sanity, we really want to expose the groups
> > > via sysfs (per-group dir with symlinks to the devices).
> > > 
> > > I'm working with Alexey on providing an in-kernel powerpc specific API
> > > to expose the PE stuff to whatever's going to interface to VFIO to
> > > create the groups, though we can eventually collapse that. The idea is
> > that on non-PE capable bridges (old style), I would make a single group
> > > per host bridge.
> > 
> > If your non-PE capable bridges aren't actually providing isolation, they
> > should return -ENODEV for the device_group() callback, then vfio will
> > ignore them.
> 
> Why ignore them ? It's perfectly fine to consider everything below the
> host bridge as one group. There is isolation ... at the host bridge
> level.
> 
> Really groups should be a structure, not a magic number. We want to
> iterate them and their content, represent them via an API, etc... and so
> magic numbers means that anything under the hood will have to constantly
> convert between that and some kind of internal data structure.

Right.  These have to be discoverable, so we need some kind of
in-kernel object to represent them.  Might as well use that
everywhere, rather than just at higher levels.

> I also somewhat dislike the bus_type as the anchor to the grouping
> system, but that's not necessarily as bad an issue for us to deal with.
> 
> Eventually what will happen on my side is that I will have a powerpc
> "generic" API (ie. across platforms) that allows enumerating groups and
> retrieving the dma windows associated with them etc...
> 
> That API will use underlying function pointers provided by the PCI host
> bridge (for which we do have a data structure, struct pci_controller,
> like many other archs except I think x86 :-)
> 
> Any host platform that doesn't provide those pointers (ie. all of them
> initially) will get a default behaviour which is to group everything
> below a host bridge (since host bridges still have independent iommu
> windows, at least for us they all do). 
> 
> On top of that we can implement a "backend" that provides those pointers
> for the p7ioc bridge used on the powernv platform, which will expose
> more fine grained groups based on our "partitionable endpoint"
> mechanism.
> 
> The grouping will have been decided early at boot time based on a mix of
> HW resources and bus topology, plus things like whether there is a PCI-X
> bridge etc... and will be initially immutable.
> 
> Ideally, we need to expose a subset of this API as a "generic" interface
> to allow generic code to iterate the groups and their content, and to
> construct the appropriate sysfs representation.
> 
> > > In addition, Alex, I noticed that you still have the domain stuff there,
> > > which is fine I suppose, we could make it a requirement on power that
> > > you only put a single group in a domain... but the API is still to put
> > > individual devices in a domain, not groups, and that somewhat sucks.
> > > 
> > > You could "fix" that by having some kind of ->domain_enable() or
> > > whatever that's used to "activate" the domain and verifies that it
> > > contains entire groups but that looks like a pointless way to complicate
> > > both the API and the implementation.
> > 
> > Right, groups are currently just a way to identify dependent sets, not a
> > unit of work.  We can also have group membership change dynamically
> > (hotplug slot behind a PCIe-to-PCI bridge), so there are cases where we
> > might need to formally attach/detach a group element to a domain at some
> > later point.  This really hasn't felt like a stumbling point for vfio,
> > at least on x86.  Thanks,
> 
> It doesn't matter much as long as we have a way to know that a group is
> "complete", ie that all devices of a group have been taken over by vfio
> and put into a domain, and block them from being lost. Only then can we
> actually "use" the group and start reconfiguring the iommu etc... for
> use by the guest.

I think this is handled by later patches in the series.

> Note that groups -will- contain bridges eventually. We need to take that
> into account since bridges -usually- don't have an ordinary driver
> attached to them so there may be issues there with tracking whether they
> are taken over by vfio...
> 
> Cheers,
> Ben.
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [PATCH 4/4] iommu: Add option to group multi-function devices
  2011-10-21 19:56 ` [PATCH 4/4] iommu: Add option to group multi-function devices Alex Williamson
@ 2011-12-01  0:11   ` David Gibson
  0 siblings, 0 replies; 37+ messages in thread
From: David Gibson @ 2011-12-01  0:11 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joerg.roedel, dwmw2, iommu, linux-kernel, chrisw, agraf,
	scottwood, B08248, benh

On Fri, Oct 21, 2011 at 01:56:24PM -0600, Alex Williamson wrote:
> The option iommu=group_mf indicates that the iommu driver should
> expose all functions of a multi-function PCI device as the same
> iommu_device_group.  This is useful for disallowing individual functions
> from being exposed as independent devices to userspace as there are often
> hidden dependencies.  Virtual functions are not affected by this option.

This seems ok as a testing/debugging hack.  But in general, surely,
whether to group the functions should be a device quirk.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  0:03         ` David Gibson
@ 2011-12-01  0:52           ` Chris Wright
  2011-12-01  0:57             ` David Gibson
  2011-12-01  6:48           ` Alex Williamson
  1 sibling, 1 reply; 37+ messages in thread
From: Chris Wright @ 2011-12-01  0:52 UTC (permalink / raw)
  To: David Gibson, Alex Williamson, Benjamin Herrenschmidt,
	joerg.roedel, dwmw2, iommu, linux-kernel, chrisw, agraf,
	scottwood, B08248

* David Gibson (dwg@au1.ibm.com) wrote:
> On Tue, Nov 29, 2011 at 10:25:51PM -0700, Alex Williamson wrote:
> > On Wed, 2011-11-30 at 15:51 +1100, Benjamin Herrenschmidt wrote:
> > > On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:
> > > 
> > > > > +static ssize_t show_iommu_group(struct device *dev,
> > > > > +				struct device_attribute *attr, char *buf)
> > > > > +{
> > > > > +	unsigned int groupid;
> > > > > +
> > > > > +	if (iommu_device_group(dev, &groupid))
> > > > > +		return 0;
> > > > > +
> > > > > +	return sprintf(buf, "%u", groupid);
> > > > > +}
> > > > > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> > > > 
> > > > Hrm.  Assuming the group id is an unsigned int seems dangerous to me.
> > > > More seriously, we really want these to be unique across the whole
> > > > system, but they're allocated by the iommu driver which can't
> > > > guarantee that if it's not the only one present.  Seems to me it would
> > > > be safer to have an actual iommu_group structure allocated for each
> > > > group, and use the pointer to it as the ID to hand around (with NULL
> > > > meaning "no iommu" / untranslated).  The structure could contain a
> > > > more human readable - or more relevant to platform documentation - ID
> > > > where appropriate.
> > 
> > Note that iommu drivers are registered per bus_type, so the unique pair
> > is {bus_type, groupid}, which seems sufficient for vfio.
> 
> Hrm.  That's.. far from obvious.  And still breaks down if we have two
> separate iommus on the same bus type (e.g. two independent PCI host
> bridges with inbuilt IOMMUs).

Happens to still work for Intel IOMMU on x86 the way Alex wrote the
Intel VT-d patch in this series, as well as AMD IOMMU.  The caveat for
AMD IOMMU is that the groupid generation would break (as-is) once
there's support for multiple PCI segments.  This is not an inherent
shortcoming of the groupid mechanism though, just a current limitation
of AMD IOMMU's implementation.  Alex overloaded B:D.F for those, which is
a convenient id since that maps to the device (or in the case of devices
behind a PCIe-to-PCI bridge, the requestor ID of all devices behind the
bridge, or "the group").

thanks,
-chris


* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  0:52           ` Chris Wright
@ 2011-12-01  0:57             ` David Gibson
  2011-12-01  1:04               ` Chris Wright
  0 siblings, 1 reply; 37+ messages in thread
From: David Gibson @ 2011-12-01  0:57 UTC (permalink / raw)
  To: Chris Wright
  Cc: Alex Williamson, Benjamin Herrenschmidt, joerg.roedel, dwmw2,
	iommu, linux-kernel, agraf, scottwood, B08248

On Wed, Nov 30, 2011 at 04:52:20PM -0800, Chris Wright wrote:
> * David Gibson (dwg@au1.ibm.com) wrote:
> > On Tue, Nov 29, 2011 at 10:25:51PM -0700, Alex Williamson wrote:
> > > On Wed, 2011-11-30 at 15:51 +1100, Benjamin Herrenschmidt wrote:
> > > > On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:
> > > > 
> > > > > > +static ssize_t show_iommu_group(struct device *dev,
> > > > > > +				struct device_attribute *attr, char *buf)
> > > > > > +{
> > > > > > +	unsigned int groupid;
> > > > > > +
> > > > > > +	if (iommu_device_group(dev, &groupid))
> > > > > > +		return 0;
> > > > > > +
> > > > > > +	return sprintf(buf, "%u", groupid);
> > > > > > +}
> > > > > > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> > > > > 
> > > > > Hrm.  Assuming the group is is an unsigned int seems dangerous to me.
> > > > > More seriously, we really want these to be unique across the whole
> > > > > system, but they're allocated by the iommu driver which can't
> > > > > guarantee that if it's not the only one present.  Seems to me it would
> > > > > be safer to have an actual iommu_group structure allocated for each
> > > > > group, and use the pointer to it as the ID to hand around (with NULL
> > > > > meaning "no iommu" / untranslated).  The structure could contain a
> > > > > more human readable - or more relevant to platform documentation - ID
> > > > > where appropriate.
> > > 
> > > Note that iommu drivers are registered per bus_type, so the unique pair
> > > is {bus_type, groupid}, which seems sufficient for vfio.
> > 
> > Hrm.  That's.. far from obvious.  And still breaks down if we have two
> > separate iommus on the same bus type (e.g. two independent PCI host
> > bridges with inbuilt IOMMUs).
> 
> Happens to still work for Intel IOMMU on x86 the way Alex wrote the
> Intel VT-d patch in this series, as well as AMD IOMMU.  The caveat for
> AMD IOMMU is that the groupid generation would break (as-is) once
> there's support for multiple PCI segments.  This is not an inherent
> shortcoming of the groupid mechanism though, just a current limitation
> of AMD IOMMU's implementation.  Alex overloaded B:D.F for those which is
> a convenient id since that maps to the device (or in the case of devices
> behind a PCIe-to-PCI bridge, the requestor ID of all devices behind the
> bridge, or "the group").

"Happens to still work" is not exactly a ringing endorsement.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  0:57             ` David Gibson
@ 2011-12-01  1:04               ` Chris Wright
  2011-12-01  1:50                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 37+ messages in thread
From: Chris Wright @ 2011-12-01  1:04 UTC (permalink / raw)
  To: David Gibson, Chris Wright, Alex Williamson,
	Benjamin Herrenschmidt, joerg.roedel, dwmw2, iommu, linux-kernel,
	agraf, scottwood, B08248

* David Gibson (dwg@au1.ibm.com) wrote:
> On Wed, Nov 30, 2011 at 04:52:20PM -0800, Chris Wright wrote:
> > * David Gibson (dwg@au1.ibm.com) wrote:
> > > On Tue, Nov 29, 2011 at 10:25:51PM -0700, Alex Williamson wrote:
> > > > Note that iommu drivers are registered per bus_type, so the unique pair
> > > > is {bus_type, groupid}, which seems sufficient for vfio.
> > > 
> > > Hrm.  That's.. far from obvious.  And still breaks down if we have two
> > > separate iommus on the same bus type (e.g. two independent PCI host
> > > bridges with inbuilt IOMMUs).
> > 
> > Happens to still work for Intel IOMMU on x86 the way Alex wrote the
> > Intel VT-d patch in this series, as well as AMD IOMMU.  The caveat for
> > AMD IOMMU is that the groupid generation would break (as-is) once
> > there's support for multiple PCI segments.  This is not an inherent
> > shortcoming of the groupid mechanism though, just a current limitation
> > of AMD IOMMU's implementation.  Alex overloaded B:D.F for those which is
> > a convenient id since that maps to the device (or in the case of devices
> > behind a PCIe-to-PCI bridge, the requestor ID of all devices behind the
> > bridge, or "the group").
> 
> "Happens to still work" is not exactly a ringing endorsement.

Heh.  Put it another way.  Generating the group ID is left up to the
IOMMU.  This will break down when there's a system with multiple IOMMUs
on the same bus_type that don't have any awareness of one another.  This
is not the case for the existing series and x86 hw.

I'm not opposed to doing the allocation and ptr as id (taking care for
possibility that PCI hotplug/unplug/replug could reuse the same memory
for group id, however).  Just pointing out that the current system works
as is, and there's some value in its simplicity (overloading ID ==
group structure + pretty printing ID in sysfs, for example).

thanks,
-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  1:04               ` Chris Wright
@ 2011-12-01  1:50                 ` Benjamin Herrenschmidt
  2011-12-01  2:00                   ` David Gibson
  2011-12-01  2:05                   ` Chris Wright
  0 siblings, 2 replies; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-01  1:50 UTC (permalink / raw)
  To: Chris Wright
  Cc: David Gibson, Alex Williamson, joerg.roedel, dwmw2, iommu,
	linux-kernel, agraf, scottwood, B08248

On Wed, 2011-11-30 at 17:04 -0800, Chris Wright wrote:
> Heh.  Put it another way.  Generating the group ID is left up to the
> IOMMU.  This will break down when there's a system with multiple IOMMUs
> on the same bus_type that don't have any awareness of one another.  This
> is not the case for the existing series and x86 hw.
> 
> I'm not opposed to doing the allocation and ptr as id (taking care for
> possibility that PCI hotplug/unplug/replug could reuse the same memory
> for group id, however).  Just pointing out that the current system works
> as is, and there's some value in its simplicity (overloading ID ==
> group structure + pretty printing ID in sysfs, for example). 

Well, ID can work even with multiple domains since we have domain
numbers. bdfn is 16-bit, which leaves 16 bits for the domain number,
which is sufficient.

So by encoding (domain << 16) | bdfn, we can get away with a 32-bit
number... it just sucks.

Note that on pseries, I wouldn't use bdfn anyway, I would use my
internal "PE#" which is also a number that I can constrain to 16 bits.

So I can work with a number as long as it's at least an unsigned int
(32-bit), but I think it somewhat sucks, and will impose gratuitous
number <-> structure conversions all over, but if we keep the whole
group thing an iommu specific data structure, then let's stick to the
number and move on with life.

We might get better results if we kept the number as

struct iommu_group_id {
	u16	domain;
	u16	group;
};

(Or a union of that with an unsigned int)

That way the domain information is available generically (can be matched
with pci_domain_nr() for example), and sysfs can then be laid out as

/sys/bus/pci/groups/<domain>/<id>

Which is nicer than having enormous IDs

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  1:50                 ` Benjamin Herrenschmidt
@ 2011-12-01  2:00                   ` David Gibson
  2011-12-01  2:05                   ` Chris Wright
  1 sibling, 0 replies; 37+ messages in thread
From: David Gibson @ 2011-12-01  2:00 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Chris Wright, Alex Williamson, joerg.roedel, dwmw2, iommu,
	linux-kernel, agraf, scottwood, B08248

On Thu, Dec 01, 2011 at 12:50:07PM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2011-11-30 at 17:04 -0800, Chris Wright wrote:
> > Heh.  Put it another way.  Generating the group ID is left up to the
> > IOMMU.  This will break down when there's a system with multiple IOMMUs
> > on the same bus_type that don't have any awareness of one another.  This
> > is not the case for the existing series and x86 hw.
> > 
> > I'm not opposed to doing the allocation and ptr as id (taking care for
> > possibility that PCI hotplug/unplug/replug could reuse the same memory
> > for group id, however).  Just pointing out that the current system works
> > as is, and there's some value in its simplicity (overloading ID ==
> > group structure + pretty printing ID in sysfs, for example). 
> 
> Well, ID can work even with multiple domains since we have domain
> numbers. bdfn is 16-bit, which leaves 16 bits for the domain number,
> which is sufficient.
> 
> So by encoding (domain << 16) | bdfn, we can get away with a 32-bit
> number... it just sucks.

Except that the iommu group structure is not supposed to be specific
to PCI.

> Note that on pseries, I wouldn't use bdfn anyway, I would use my
> internal "PE#" which is also a number that I can constrain to 16 bits.
> 
> So I can work with a number as long as it's at least an unsigned int
> (32-bit), but I think it somewhat sucks, and will impose gratuitous
> number <-> structure conversions all over, but if we keep the whole
> group thing an iommu specific data structure, then let's stick to the
> number and move on with life.
> 
> We might get better results if we kept the number as
> 
> struct iommu_group_id {
> 	u16	domain;
> 	u16	group;
> };
> 
> (Or a union of that with an unsigned int)
> 
> That way the domain information is available generically (can be matched
> with pci_domain_nr() for example), and sysfs can then be laid out as
> 
> /sys/bus/pci/groups/<domain>/<id>
> 
> Which is nicer than having enormous IDs
> 
> Cheers,
> Ben.
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  1:50                 ` Benjamin Herrenschmidt
  2011-12-01  2:00                   ` David Gibson
@ 2011-12-01  2:05                   ` Chris Wright
  2011-12-01  7:28                     ` Alex Williamson
  2011-12-01 14:02                     ` Yoder Stuart-B08248
  1 sibling, 2 replies; 37+ messages in thread
From: Chris Wright @ 2011-12-01  2:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Chris Wright, David Gibson, Alex Williamson, joerg.roedel, dwmw2,
	iommu, linux-kernel, agraf, scottwood, B08248

* Benjamin Herrenschmidt (benh@kernel.crashing.org) wrote:
> On Wed, 2011-11-30 at 17:04 -0800, Chris Wright wrote:
> > Heh.  Put it another way.  Generating the group ID is left up to the
> > IOMMU.  This will break down when there's a system with multiple IOMMUs
> > on the same bus_type that don't have any awareness of one another.  This
> > is not the case for the existing series and x86 hw.
> > 
> > I'm not opposed to doing the allocation and ptr as id (taking care for
> > possibility that PCI hotplug/unplug/replug could reuse the same memory
> > for group id, however).  Just pointing out that the current system works
> > as is, and there's some value in its simplicity (overloading ID ==
> > group structure + pretty printing ID in sysfs, for example). 
> 
> Well, ID can work even with multiple domains since we have domain
> numbers. bdfn is 16-bit, which leaves 16 bits for the domain number,
> which is sufficient.
> 
> So by encoding (domain << 16) | bdfn, we can get away with a 32-bit
> number... it just sucks.

Yup, that's just what Alex did for VT-d ;)

+	union {
+               struct {
+                       u8 devfn;
+                       u8 bus;
+                       u16 segment;
+               } pci;
+               u32 group;
+	} id;

Just that the alias table used for AMD IOMMU to map bdf -> requestor ID
is not multi-segment aware, so the id is only bdf of bridge.

> Note that on pseries, I wouldn't use bdfn anyway, I would use my
> internal "PE#" which is also a number that I can constrain to 16 bits.
> 
> So I can work with a number as long as it's at least an unsigned int
> (32-bit), but I think it somewhat sucks, and will impose gratuitous
> number <-> structure conversions all over, but if we keep the whole
> group thing an iommu specific data structure, then let's stick to the
> number and move on with life.
> 
> We might get better results if we kept the number as
> 
> struct iommu_group_id {
> 	u16	domain;
> 	u16	group;
> };
> 
> (Or a union of that with an unsigned int)
> 
> That way the domain information is available generically (can be matched
> with pci_domain_nr() for example), and sysfs can then be laid out as
> 
> /sys/bus/pci/groups/<domain>/<id>
> 
> Which is nicer than having enormous IDs

Seems fine to me (although I missed /sys/bus/pci/groups/ introduction),
except that I think the freescale folks aren't interested in PCI which
is one reason why the thing is just an opaque id.

thanks,
-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-11-30  9:23         ` Benjamin Herrenschmidt
  2011-12-01  0:06           ` David Gibson
@ 2011-12-01  6:20           ` Alex Williamson
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-12-01  6:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: David Gibson, joerg.roedel, dwmw2, iommu, linux-kernel, chrisw,
	agraf, scottwood, B08248

On Wed, 2011-11-30 at 20:23 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2011-11-29 at 22:25 -0700, Alex Williamson wrote:
> 
> > Note that iommu drivers are registered per bus_type, so the unique pair
> > is {bus_type, groupid}, which seems sufficient for vfio.
> > 
> > > Don't forget that to keep sanity, we really want to expose the groups
> > > via sysfs (per-group dir with symlinks to the devices).
> > > 
> > > I'm working with Alexey on providing an in-kernel powerpc specific API
> > > to expose the PE stuff to whatever's going to interface to VFIO to
> > > create the groups, though we can eventually collapse that. The idea is
> > > that on non-PE capable bridges (old style), I would make a single group
> > > per host bridge.
> > 
> > If your non-PE capable bridges aren't actually providing isolation, they
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > should return -ENODEV for the device_group() callback, then vfio will
> > ignore them.
> 
> Why ignore them ? It's perfectly fine to consider everything below the
> host bridge as one group. There is isolation ... at the host bridge
> level.

If there is isolation, yes, report a group.  A host bridge by itself
does not imply DMA isolation on most platforms.  We could say devices
without isolation are all in the same group, but then imagine a system
with two PCI host bridges.  Host bridge A has an iommu providing B:D.F
isolation, host bridge B has no iommu and no isolation.  Clearly devices
behind A have groupids.  However if we assign a single groupid for all
the devices behind B, we subvert the isolation of devices behind A.  To
avoid that, devices that cannot be isolated should not have a groupid.

> Really groups should be a structure, not a magic number. We want to
> iterate them and their content, represent them via an API, etc... and so
> magic numbers means that anything under the hood will have to constantly
> convert between that and some kind of internal data structure.

What would a group structure in core driver code contain?  Who manages
it?  Don't forget, most users don't have hardware capable of making use
of this and most users with the right hardware still aren't going to use
it.  Those group structures live in vfio and get allocated when vfio is
loaded.  VT-d and AMD-Vi can support this new groupid interface with
zero additional data structures floating around.

As Chris noted, the choice of an unsigned int to efficiently represent
groupids in PCI isn't just coincidence.  If you can't make use of the
same trick, surely it's not too difficult to enumerate your PEs and use
that as the groupid.

> I also somewhat dislike the bus_type as the anchor to the grouping
> system, but that's not necessarily as bad an issue for us to deal with.
> 
> Eventually what will happen on my side is that I will have a powerpc
> "generic" (ie. accross platforms) that allow to enumerate groups and
> retrieve the dma windows associated with them etc...
> 
> That API will use underlying function pointers provided by the PCI host
> bridge (for which we do have a data structure, struct pci_controller,
> like many other archs except I think x86 :-)
> 
> Any host platform that doesn't provide those pointers (ie. all of them
> initially) will get a default behaviour which is to group everything
> below a host bridge (since host bridges still have independent iommu
> windows, at least for us they all do). 
> 
> On top of that we can implement a "backend" that provides those pointers
> for the p7ioc bridge used on the powernv platform, which will expose
> more fine grained groups based on our "partitionable endpoint"
> mechanism.
> 
> The grouping will have been decided early at boot time based on a mix of
> HW resources and bus topology, plus things like whether there is a PCI-X
> bridge etc... and will be initially immutable.
> 
> Ideally, we need to expose a subset of this API as a "generic" interface
> to allow generic code to iterate the groups and their content, and to
> construct the appropriate sysfs representation.
> 
> > > In addition, Alex, I noticed that you still have the domain stuff there,
> > > which is fine I suppose, we could make it a requirement on power that
> > > you only put a single group in a domain... but the API is still to put
> > > individual devices in a domain, not groups, and that somewhat sucks.
> > > 
> > > You could "fix" that by having some kind of ->domain_enable() or
> > > whatever that's used to "activate" the domain and verifies that it
> > > contains entire groups but that looks like a pointless way to complicate
> > > both the API and the implementation.
> > 
> > Right, groups are currently just a way to identify dependent sets, not a
> > unit of work.  We can also have group membership change dynamically
> > (hotplug slot behind a PCIe-to-PCI bridge), so there are cases where we
> > might need to formally attach/detach a group element to a domain at some
> > later point.  This really hasn't felt like a stumbling point for vfio,
> > at least on x86.  Thanks,
> 
> It doesn't matter much as long as we have a way to know that a group is
> "complete", ie that all devices of a group have been taken over by vfio
> and put into a domain, and block them from being lost. Only then can we
> actually "use" the group and start reconfiguring the iommu etc... for
> use by the guest.

This is done.  I call a group "viable" when all of the devices are bound
to their vfio bus driver and ready for use.  The API will not allow a
user access to the iommu or device file descriptors unless the group is
viable.  Unbinding a device from the vfio bus driver will block as long
as the group is in use (need to decide if we enable a system policy to
let vfio kill the process to release a device).  I'm also working on
code that proactively attaches the vfio bus driver to a hot-added device
if it's being added to an in-use group.

> Note that groups -will- contain bridges eventually. We need to take that
> into account since bridges -usually- don't have an ordinary driver
> attached to them so there may be issues there with tracking whether they
> are taken over by vfio...

I look forward to patches for this ;)  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  0:03         ` David Gibson
  2011-12-01  0:52           ` Chris Wright
@ 2011-12-01  6:48           ` Alex Williamson
  2011-12-01 10:33             ` David Woodhouse
  2011-12-01 21:32             ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 37+ messages in thread
From: Alex Williamson @ 2011-12-01  6:48 UTC (permalink / raw)
  To: David Gibson
  Cc: Benjamin Herrenschmidt, joerg.roedel, dwmw2, iommu, linux-kernel,
	chrisw, agraf, scottwood, B08248

On Thu, 2011-12-01 at 11:03 +1100, David Gibson wrote:
> On Tue, Nov 29, 2011 at 10:25:51PM -0700, Alex Williamson wrote:
> > On Wed, 2011-11-30 at 15:51 +1100, Benjamin Herrenschmidt wrote:
> > > On Wed, 2011-11-30 at 13:42 +1100, David Gibson wrote:
> > > 
> > > > > +static ssize_t show_iommu_group(struct device *dev,
> > > > > +				struct device_attribute *attr, char *buf)
> > > > > +{
> > > > > +	unsigned int groupid;
> > > > > +
> > > > > +	if (iommu_device_group(dev, &groupid))
> > > > > +		return 0;
> > > > > +
> > > > > +	return sprintf(buf, "%u", groupid);
> > > > > +}
> > > > > +static DEVICE_ATTR(iommu_group, S_IRUGO, show_iommu_group, NULL);
> > > > 
> > > > Hrm.  Assuming the group id is an unsigned int seems dangerous to me.
> > > > More seriously, we really want these to be unique across the whole
> > > > system, but they're allocated by the iommu driver which can't
> > > > guarantee that if it's not the only one present.  Seems to me it would
> > > > be safer to have an actual iommu_group structure allocated for each
> > > > group, and use the pointer to it as the ID to hand around (with NULL
> > > > meaning "no iommu" / untranslated).  The structure could contain a
> > > > more human readable - or more relevant to platform documentation - ID
> > > > where appropriate.
> > 
> > Note that iommu drivers are registered per bus_type, so the unique pair
> > is {bus_type, groupid}, which seems sufficient for vfio.
> 
> Hrm.  That's.. far from obvious.  And still breaks down if we have two
> separate iommus on the same bus type (e.g. two independent PCI host
> bridges with inbuilt IOMMUs).

See bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops).  It
wasn't long ago that we had a global iommu_ops.  Are heterogeneous
iommus using different drivers on the same bus_type really a possibility
on your system, or are we dealing with hypothetical situations?  You'd
currently need to multiplex iommu_ops callbacks between the drivers
yourself.

> > > Don't forget that to keep sanity, we really want to expose the groups
> > > via sysfs (per-group dir with symlinks to the devices).
> > > 
> > > I'm working with Alexey on providing an in-kernel powerpc specific API
> > > to expose the PE stuff to whatever's going to interface to VFIO to
> > > create the groups, though we can eventually collapse that. The idea is
> > > that on non-PE capable bridges (old style), I would make a single group
> > > per host bridge.
> > 
> > If your non-PE capable bridges aren't actually providing isolation, they
> > should return -ENODEV for the device_group() callback, then vfio will
> > ignore them.
> > 
> > > In addition, Alex, I noticed that you still have the domain stuff there,
> > > which is fine I suppose, we could make it a requirement on power that
> > > you only put a single group in a domain... but the API is still to put
> > > individual devices in a domain, not groups, and that somewhat sucks.
> > > 
> > > You could "fix" that by having some kind of ->domain_enable() or
> > > whatever that's used to "activate" the domain and verifies that it
> > > contains entire groups but that looks like a pointless way to complicate
> > > both the API and the implementation.
> > 
> > Right, groups are currently just a way to identify dependent sets, not a
> > unit of work.
> 
> I'm not quite sure what you mean by "unit of work".  But assigning the
> groups as a unit generally makes more sense to me than assigning
> devices individually, but only being able to use them when the group
> is together.  Particularly when things are hotplugged into groups.

iommu_attach_device() takes a device as a "unit of work", not a group.
If we have a group in use by a userspace driver and a new physical
device is hotplugged into the host and belongs to the group, we need the
granularity of the device to be able to add it to the domain.

I'm not sure if we're getting into VM usage with "assigning" terminology
above.  You're free to architect qemu however you want on POWER to make
groups be the assignable unit to a guest.  On x86, an individual device
is the assignable unit for a guest.  Unassigned group devices will still
be required to be held by vfio, they'll just be unused.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  2:05                   ` Chris Wright
@ 2011-12-01  7:28                     ` Alex Williamson
  2011-12-01 14:02                     ` Yoder Stuart-B08248
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-12-01  7:28 UTC (permalink / raw)
  To: Chris Wright
  Cc: Benjamin Herrenschmidt, David Gibson, joerg.roedel, dwmw2, iommu,
	linux-kernel, agraf, scottwood, B08248

On Wed, 2011-11-30 at 18:05 -0800, Chris Wright wrote:
> * Benjamin Herrenschmidt (benh@kernel.crashing.org) wrote:
> > On Wed, 2011-11-30 at 17:04 -0800, Chris Wright wrote:
> > > Heh.  Put it another way.  Generating the group ID is left up to the
> > > IOMMU.  This will break down when there's a system with multiple IOMMUs
> > > on the same bus_type that don't have any awareness of one another.  This
> > > is not the case for the existing series and x86 hw.
> > > 
> > > I'm not opposed to doing the allocation and ptr as id (taking care for
> > > possibility that PCI hotplug/unplug/replug could reuse the same memory
> > > for group id, however).  Just pointing out that the current system works
> > > as is, and there's some value in its simplicity (overloading ID ==
> > > group structure + pretty printing ID in sysfs, for example). 
> > 
> > Well, ID can work even with multiple domains since we have domain
> > numbers. bdfn is 16-bit, which leaves 16 bits for the domain number,
> > which is sufficient.
> > 
> > So by encoding (domain << 16) | bdfn, we can get away with a 32-bit
> > number... it just sucks.
> 
> Yup, that's just what Alex did for VT-d ;)
> 
> +	union {
> +               struct {
> +                       u8 devfn;
> +                       u8 bus;
> +                       u16 segment;
> +               } pci;
> +               u32 group;
> +	} id;
> 
> Just that the alias table used for AMD IOMMU to map bdf -> requestor ID
> is not multi-segment aware, so the id is only bdf of bridge.
> 
> > Note that on pseries, I wouldn't use bdfn anyway, I would use my
> > internal "PE#" which is also a number that I can constrain to 16 bits.
> > 
> > So I can work with a number as long as it's at least an unsigned int
> > (32-bit), but I think it somewhat sucks, and will impose gratuitous
> > number <-> structure conversions all over, but if we keep the whole
> > group thing an iommu specific data structure, then let's stick to the
> > number and move on with life.

I think you're over emphasizing these number <-> struct conversions.
Nowhere in the iommu/driver core code is there a need for this.  It's a
simple struct device -> groupid translation.  Even in the vfio code we
only need to do lookups based on groupid rarely and never in anything
resembling a performance path.
 
> > We might get better results if we kept the number as
> > 
> > struct iommu_group_id {
> > 	u16	domain;
> > 	u16	group;
> > };
> > 
> > (Or a union of that with an unsigned int)
> > 
> > That way the domain information is available generically (can be matched
> > with pci_domain_nr() for example), and sysfs can then be laid out as
> > 
> > /sys/bus/pci/groups/<domain>/<id>
> > 
> > Which is nicer than having enormous IDs
> 
> Seems fine to me (although I missed /sys/bus/pci/groups/ introduction),
> except that I think the freescale folks aren't interested in PCI which
> is one reason why the thing is just an opaque id.

I missed the /sys/bus/pci/groups/ introduction as well.  Groupids are
not PCI specific.  FWIW, the current sysfs representation looks
something like this:

$ cat /sys/bus/pci/devices/0000:01:10.6/iommu_group 
384

$ ls -l /sys/devices/virtual/vfio/pci:384
total 0
-r--r--r-- 1 root root 4096 Nov 30 16:25 dev
drwxr-xr-x 2 root root    0 Nov 30 16:25 devices
drwxr-xr-x 2 root root    0 Nov 30 16:25 power
lrwxrwxrwx 1 root root    0 Nov 30 16:24 subsystem -> ../../../../class/vfio
-rw-r--r-- 1 root root 4096 Nov 30 16:24 uevent

$ cat /sys/devices/virtual/vfio/pci:384/dev
252:28

$ ls -l /dev/vfio/pci:384
crw-rw---- 1 root root 252, 28 Nov 30 16:24 /dev/vfio/pci:384

$ ls -l /sys/devices/virtual/vfio/pci:384/devices
total 0
lrwxrwxrwx 1 root root 0 Nov 30 16:36 0000:01:10.0 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.0
lrwxrwxrwx 1 root root 0 Nov 30 16:25 0000:01:10.1 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.1
lrwxrwxrwx 1 root root 0 Nov 30 16:36 0000:01:10.2 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.2
lrwxrwxrwx 1 root root 0 Nov 30 16:25 0000:01:10.3 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.3
lrwxrwxrwx 1 root root 0 Nov 30 16:36 0000:01:10.4 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.4
lrwxrwxrwx 1 root root 0 Nov 30 16:25 0000:01:10.5 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.5
lrwxrwxrwx 1 root root 0 Nov 30 16:36 0000:01:10.6 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.6
lrwxrwxrwx 1 root root 0 Nov 30 16:25 0000:01:10.7 -> ../../../../pci0000:00/0000:00:01.0/0000:01:10.7

$ for i in $(find /sys/devices/virtual/vfio/pci:384/devices/ -type l); do cat $i/iommu_group; echo; done
384
384
384
384
384
384
384
384

I'm a little annoyed with the pci:384 notation, but that seems like how
it works out best for using {bus_type, groupid}.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  6:48           ` Alex Williamson
@ 2011-12-01 10:33             ` David Woodhouse
  2011-12-01 14:34               ` Alex Williamson
  2011-12-01 21:32             ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 37+ messages in thread
From: David Woodhouse @ 2011-12-01 10:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Gibson, Benjamin Herrenschmidt, joerg.roedel, iommu,
	linux-kernel, chrisw, agraf, scottwood, B08248

On Wed, 2011-11-30 at 23:48 -0700, Alex Williamson wrote:
> I'm not sure if we're getting into VM usage with "assigning" terminology
> above.  You're free to architect qemu however you want on POWER to make
> groups be the assignable unit to a guest.  On x86, an individual device
> is the assignable unit for a guest.  Unassigned group devices will still
> be required to be held by vfio, they'll just be unused.  Thanks,

I'm not sure I understand this.

BY DEFINITION, the smallest assignable unit is the group, surely? Isn't
that the *point* of the groups? That the IOMMU cannot tell the
difference between the devices in the group?

So in *practice*, surely you cannot assign just *one* device from a
group? You can assign the whole group, or nothing.

You might *pretend* to assign single devices, and we might try to cope
with the weirdness that happens when you want *one* device of a group to
be owned by one VM, another device in the same group to be owned by a
second VM, and a third device from the same group to be driven by a
native driver in the host. But why not just assign groups as a whole?
Surely that makes most sense?

Btw, did we get a quirk for the Ricoh multi-function devices which all
need to be in the same group because they do all their DMA from function
zero? I think we need another similar quirk for a Marvell SATA
controller which seems to do its AHCI DMA from its IDE function; see
https://bugzilla.redhat.com/757166

-- 
dwmw2



* RE: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  2:05                   ` Chris Wright
  2011-12-01  7:28                     ` Alex Williamson
@ 2011-12-01 14:02                     ` Yoder Stuart-B08248
  1 sibling, 0 replies; 37+ messages in thread
From: Yoder Stuart-B08248 @ 2011-12-01 14:02 UTC (permalink / raw)
  To: Chris Wright, Benjamin Herrenschmidt
  Cc: David Gibson, Alex Williamson, joerg.roedel@amd.com,
	dwmw2@infradead.org, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, agraf@suse.de, Wood Scott-B07421


> > That way the domain information is available generically (can be match
> > with pci_domain_nr() for example), and sysfs can then be laid out as
> >
> > /sys/bus/pci/groups/<domain>/<id>
> >
> > Which is nicer than having enormous id's
> 
> Seems fine to me (although I missed /sys/bus/pci/groups/ introduction), except that I think
> the freescale folks aren't interested in PCI which is one reason why the thing is just an
> opaque id.

We are interested in PCI, but non-PCI as well.  All DMA devices
have a globally unique "logical I/O device number" (LIODN) which
will map to a group # for us.

Stuart



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01 10:33             ` David Woodhouse
@ 2011-12-01 14:34               ` Alex Williamson
  2011-12-01 21:46                 ` Benjamin Herrenschmidt
  2011-12-01 23:14                 ` David Woodhouse
  0 siblings, 2 replies; 37+ messages in thread
From: Alex Williamson @ 2011-12-01 14:34 UTC (permalink / raw)
  To: David Woodhouse
  Cc: David Gibson, Benjamin Herrenschmidt, joerg.roedel, iommu,
	linux-kernel, chrisw, agraf, scottwood, B08248

On Thu, 2011-12-01 at 10:33 +0000, David Woodhouse wrote:
> On Wed, 2011-11-30 at 23:48 -0700, Alex Williamson wrote:
> > I'm not sure if we're getting into VM usage with "assigning" terminology
> > above.  You're free to architect qemu however you want on POWER to make
> > groups be the assignable unit to a guest.  On x86, an individual device
> > is the assignable unit for a guest.  Unassigned group devices will still
> > be required to be held by vfio, they'll just be unused.  Thanks,
> 
> I'm not sure I understand this.
> 
> BY DEFINITION, the smallest assignable unit is the group, surely? Isn't
> that the *point* of the groups? That the IOMMU cannot tell the
> difference between the devices in the group?
> 
> So in *practice*, surely you cannot assign just *one* device from a
> group? You can assign the whole group, or nothing.
> 
> You might *pretend* to assign single devices, and we might try to cope
> with the weirdness that happens when you want *one* device of a group to
> be owned by one VM, another device in the same group to be owned by a
> second VM, and a third device from the same group to be driven by a
> native driver in the host. But why not just assign groups as a whole?
> Surely that makes most sense?

We've got multiple levels when we add qemu and guests into the mix.  A
group is the smallest assignable unit for kernel->userspace.  In fact,
vfio is constructed so that the user cannot do anything with a group
until all the devices of the group are bound to the vfio bus drivers.
Qemu, as a userspace driver, must therefore take ownership of the entire
group.  However, there's no requirement that a userspace driver must
make use of all the devices in the group, so qemu is free to expose
individual devices from the group to the guest.  IMHO, it doesn't make
sense to have a default model saying "I know you just wanted the nic,
but it's in the same group as this graphics card, so surprise, you get
both!".  Obviously if a user does want to expose multiple devices from a
group to a guest, we support that too.
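
The gating Alex describes here — userspace can do nothing with a group until every member is bound to the vfio bus drivers — is simple to model. The following is an illustrative userspace sketch of that policy only; the `model_dev`/`model_group`/`group_viable` names are made up and are not vfio code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a group: a set of devices, each either bound
 * to the vfio bus driver or not. */
struct model_dev {
	bool bound_to_vfio;
};

struct model_group {
	struct model_dev *devs;
	size_t ndevs;
};

/* The policy: the group is usable by userspace only when *every*
 * member is under vfio's control; a single unbound device would leave
 * a DMA path outside the user's isolation boundary. */
static bool group_viable(const struct model_group *grp)
{
	size_t i;

	for (i = 0; i < grp->ndevs; i++)
		if (!grp->devs[i].bound_to_vfio)
			return false;
	return true;
}
```

Note that viability gates access to the group, not usage of each device: once viable, qemu may expose any subset of the group's devices to the guest.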

Splitting groups among multiple VMs or between a VM and native host drivers
defeats the purpose of the group.  Neither of these is allowed.

> Btw, did we get a quirk for the Ricoh multi-function devices which all
> need to be in the same group because they do all their DMA from function
> zero? I think we need another similar quirk for a Marvell SATA
> controller which seems to do its AHCI DMA from its IDE function; see
> https://bugzilla.redhat.com/757166

No, as I mentioned, groups are currently for iommu_ops, not dma_ops,
though it makes sense that iommu drivers could use the group info or
create common quirk infrastructure for handling broken devices like
these.  Thanks,

Alex



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01  6:48           ` Alex Williamson
  2011-12-01 10:33             ` David Woodhouse
@ 2011-12-01 21:32             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-01 21:32 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Gibson, joerg.roedel, dwmw2, iommu, linux-kernel, chrisw,
	agraf, scottwood, B08248

On Wed, 2011-11-30 at 23:48 -0700, Alex Williamson wrote:

> See bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops).  It
> wasn't long ago that we had a global iommu_ops.

Right now, the dma_ops are per-device but the iommu populate/depopulate
hooks are system global, installed by the platform code at boot time, on
powerpc. We could try to "fit in" an iommu "driver" at that point.

In theory, you can connect separate IO chips on POWER servers, and in
fact it happens on high end boxes, though they tend to have a similar
"PE#" mechanism, but with different constraints and resource capacities.
However, we aren't concerned with that initially for KVM as we are
targeting only lower end machines with only one p7ioc for now.

But I can see more heterogeneous setups showing up in embedded space...
 
> Are heterogeneous
> iommus using different drivers on the same bus_type really a possibility
> on your system, or are we dealing with hypothetical situations?  You'd
> currently need to multiplex iommu_ops callbacks between the drivers
> yourself.

We could.

Ben.




* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01 14:34               ` Alex Williamson
@ 2011-12-01 21:46                 ` Benjamin Herrenschmidt
  2011-12-01 22:37                   ` Alex Williamson
  2011-12-01 23:14                 ` David Woodhouse
  1 sibling, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-01 21:46 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Woodhouse, David Gibson, joerg.roedel, iommu, linux-kernel,
	chrisw, agraf, scottwood, B08248

On Thu, 2011-12-01 at 07:34 -0700, Alex Williamson wrote:

> We've got multiple levels when we add qemu and guests into the mix.  A
> group is the smallest assignable unit for kernel->userspace.  In fact,
> vfio is constructed so that the user cannot do anything with a group
> until all the devices of the group are bound to the vfio bus drivers.
> Qemu, as a userspace driver, must therefore take ownership of the entire
> group.  However, there's no requirement that a userspace driver must
> make use of all the devices in the group, so qemu is free to expose
> individual devices from the group to the guest.

Sure but that has nothing to do with your kernel->user API. As you said,
you still need to take "ownership" of the entire group. If qemu chose to
only present to the guest part of that group, it's qemu's problem.

To put it differently: one day, maybe, we'll finally get some clues and
implement a proper way for admin tools to know what can or cannot be put
into a guest.  In any case, how would you present the groups to the user
(administrator)?  By individual devices with odd behaviours / errors
etc. when said user/admin tries to put individual devices into different
partitions while they belong in the same group?  Or by just
exposing ... groups?

So while those various tools will want to see what's in the group (to
represent it to the user, to expose it to the guest, for other finer
grained operations such as MSI setup, etc...) the basic unit of
ownership is and remains the group, and I don't see how it makes sense
to have your underlying iommu interface operate on anything else.

>   IMHO, it doesn't make
> sense to have a default model saying "I know you just wanted the nic,
> but it's in the same group as this graphics card, so surprise, you get
> both!".  Obviously if a user does want to expose multiple devices from a
> group to a guest, we support that too.

No, you don't get it.  That's the wrong way around.  The user will be
presented from the start with a group of nic+graphics card, so that the
user knows from the beginning what's going on.

Because your other option is to put the nic in ... and suddenly have the
graphics card go away from the host without showing up in a guest. That
won't be any better than having it "just show up" in the guest without
asking for it.

IE. In both cases, it's wrong. The only thing that makes sense is from
the beginning, expose those two as a group so that the user has no other
choice but put them both into the guest at once and have it represented
to the user as such to begin with.

> Splitting groups among multiple VMs or between a VM and native host drivers
> defeats the purpose of the group.  Neither of these is allowed.
> 
> > Btw, did we get a quirk for the Ricoh multi-function devices which all
> > need to be in the same group because they do all their DMA from function
> > zero? I think we need another similar quirk for a Marvell SATA
> > controller which seems to do its AHCI DMA from its IDE function; see
> > https://bugzilla.redhat.com/757166
> 
> No, as I mentioned, groups are currently for iommu_ops, not dma_ops,
> though it makes sense that iommu drivers could use the group info or
> create common quirk infrastructure for handling broken devices like
> these.  Thanks,

Which is why the group info in iommu_ops should just be a representation
of something else under the hood that needs to be sorted out more
clearly than what we are doing at the moment.

Cheers,
Ben.




* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01 21:46                 ` Benjamin Herrenschmidt
@ 2011-12-01 22:37                   ` Alex Williamson
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2011-12-01 22:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: David Woodhouse, David Gibson, joerg.roedel, iommu, linux-kernel,
	chrisw, agraf, scottwood, B08248

On Fri, 2011-12-02 at 08:46 +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2011-12-01 at 07:34 -0700, Alex Williamson wrote:
> 
> > We've got multiple levels when we add qemu and guests into the mix.  A
> > group is the smallest assignable unit for kernel->userspace.  In fact,
> > vfio is constructed so that the user cannot do anything with a group
> > until all the devices of the group are bound to the vfio bus drivers.
> > Qemu, as a userspace driver, must therefore take ownership of the entire
> > group.  However, there's no requirement that a userspace driver must
> > make use of all the devices in the group, so qemu is free to expose
> > individual devices from the group to the guest.
> 
> Sure but that has nothing to do with your kernel->user API. As you said,
> you still need to take "ownership" of the entire group. If qemu chose to
> only present to the guest part of that group, it's qemu's problem.
> 
> Put it differently. One day, maybe, we'll finally get some clues and
> implement a proper way for admin tools to know what can or cannot be put
> into a guest. In any case, how would you present to the user
> (administrator) the groups ? By individual devices with odd behaviours /
> errors etc... when the said user/admin tries to put individual devices
> into different partitions while  they belong in the same group ? Or by
> just exposing ... groups ?

Why are we even talking about this here?  We both agree on the
kernel->user ownership model and it's qemu's decision what to do from
there.  x86 will typically have groups with a single device.  When
that's not the case, it really doesn't seem that hard for an admin tool
to notify the user and ask them if they want to also assign the
dependent device or leave it unused.  The complexity vs "hey, why did
this other device just appear on my guest and disappear from my host" or
even "why won't this stupid tool let me separate these two devices"
feels like a wash.

> So while those various tools will want to see what's in the group (to
> represent it to the user, to expose it to the guest, for other finer
> grained operations such as MSI setup, etc...) the basic unit of
> ownership is and remains the group, and I don't see how it makes sense
> to have your underlying iommu interface operate on anything else.

Are we going to tear apart every aspect of the IOMMU API, or can we use
some pieces of it as it exists and incrementally improve it?  Yes, if
the IOMMU really can't distinguish between devices in a group, it
probably doesn't matter which of them you attach to a domain.  However,
groups of devices aren't a basic unit of work for the rest of the kernel
and even if the iommu device decoders can't tell the difference between
transactions from hardware, the iommu driver might care for accounting
purposes.

> >   IMHO, it doesn't make
> > sense to have a default model saying "I know you just wanted the nic,
> > but it's in the same group as this graphics card, so surprise, you get
> > both!".  Obviously if a user does want to expose multiple devices from a
> > group to a guest, we support that too.
> 
> No you don't get it. That's the wrong way around. The user will be
> presented to start with with a group of nic+graphics card so that user
> knows from the beginning what's going on.
> 
> Because your other option is to put the nic in ... and suddenly have the
> graphic card go away from the host without showing up in a guest. That
> won't be any better than having it "just show up" in the guest without
> asking for it.
> 
> IE. In both cases, it's wrong. The only thing that makes sense is from
> the beginning, expose those two as a group so that the user has no other
> choice but put them both into the guest at once and have it represented
> to the user as such to begin with.

People have very different ideas about what's intuitive and how
interfaces should work, just look at gnome3/unity versus the useful
desktops we used to have.  Neither of us are likely to write the piece
of management code that does this and the design decisions we're making
here enable either usage model.

> > Splitting groups among multiple VMs or between a VM and native host drivers
> > defeats the purpose of the group.  Neither of these is allowed.
> > 
> > > Btw, did we get a quirk for the Ricoh multi-function devices which all
> > > need to be in the same group because they do all their DMA from function
> > > zero? I think we need another similar quirk for a Marvell SATA
> > > controller which seems to do its AHCI DMA from its IDE function; see
> > > https://bugzilla.redhat.com/757166
> > 
> > No, as I mentioned, groups are currently for iommu_ops, not dma_ops,
> > though it makes sense that iommu drivers could use the group info or
> > create common quirk infrastructure for handling broken devices like
> > these.  Thanks,
> 
> Which is why the group info in iommu_ops should just be a representation
> of something else under the hood that needs to be sorted out more
> clearly than what we are doing at the moment.

I will be happy to review your patches to transform groups from an
identifier into an object and make them the pervasive unit of work in
the iommu layer.  The above problem can be solved by:

struct pci_dev *pci_quirk_dma_dev(struct pci_dev *pdev)
{
	/* placeholder helpers: match a quirked device and look up the
	 * device that actually issues its DMA */
	if (pci_dev_needs_dma_quirk(pdev))
		return pci_dma_quirk_lookup(pdev);

	return pdev;
}

Which would get called by both the dma_ops code and iommu_device_group.
Thanks,

Alex
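
[The lookup Alex sketches can be modeled in userspace. The toy below maps a (vendor, device, function) triple to the function whose source-id actually appears on the bus, in the spirit of the Ricoh and Marvell cases discussed in this thread; the table contents, IDs, and `quirk_dma_fn` name are illustrative assumptions, not a real quirk list:]

```c
#include <stdint.h>
#include <stddef.h>

/* Toy model of a DMA-source quirk table: devices whose transactions
 * are attributed to a different function than the one addressed.
 * Vendor/device IDs here are illustrative, not an actual quirk list. */
struct dma_quirk {
	uint16_t vendor, device;
	uint8_t dma_fn;		/* function whose source-id hits the bus */
};

static const struct dma_quirk quirks[] = {
	/* e.g. a Ricoh-style multi-function reader: all DMA as fn 0 */
	{ 0x1180, 0xe822, 0 },
};

static uint8_t quirk_dma_fn(uint16_t vendor, uint16_t device, uint8_t fn)
{
	size_t i;

	for (i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++)
		if (quirks[i].vendor == vendor && quirks[i].device == device)
			return quirks[i].dma_fn;
	return fn;		/* no quirk: the device DMAs as itself */
}
```

[Both the dma_ops path and iommu_device_group could consult one table like this, which is the consolidation Alex proposes above.]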



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01 14:34               ` Alex Williamson
  2011-12-01 21:46                 ` Benjamin Herrenschmidt
@ 2011-12-01 23:14                 ` David Woodhouse
  2011-12-07  6:20                   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 37+ messages in thread
From: David Woodhouse @ 2011-12-01 23:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Gibson, Benjamin Herrenschmidt, joerg.roedel, iommu,
	linux-kernel, chrisw, agraf, scottwood, B08248

On Thu, 2011-12-01 at 07:34 -0700, Alex Williamson wrote:
> > Btw, did we get a quirk for the Ricoh multi-function devices which all
> > need to be in the same group because they do all their DMA from function
> > zero? I think we need another similar quirk for a Marvell SATA
> > controller which seems to do its AHCI DMA from its IDE function; see
> > https://bugzilla.redhat.com/757166
> 
> No, as I mentioned, groups are currently for iommu_ops, not dma_ops,

That doesn't matter. There are multiple devices which do all their DMA
from the same source-id, and which have to be handled together. It
doesn't *matter* that it's a hardware bug in this case, and it's just
because they're multiple PCI devices behind a PCIe bridge in the other
case. The grouping code needs to handle it, whether it's for the benefit
of iommu_ops, dma_ops, or both.

> though it makes sense that iommu drivers could use the group info or
> create common quirk infrastructure for handling broken devices like
> these. 

Well, I think we mostly have a consensus that dma_ops should be
implemented *using* iommu_ops; I'd rather be trying to make things
*consistent* between the two rather than building up the differences.

So I'd definitely go for your "use the group info" option. It is the
*same* info, and needs the *same* quirks.

-- 
dwmw2



* Re: [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry
  2011-12-01 23:14                 ` David Woodhouse
@ 2011-12-07  6:20                   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-07  6:20 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Alex Williamson, David Gibson, joerg.roedel, iommu, linux-kernel,
	chrisw, agraf, scottwood, B08248

On Thu, 2011-12-01 at 23:14 +0000, David Woodhouse wrote:
> On Thu, 2011-12-01 at 07:34 -0700, Alex Williamson wrote:
> > > Btw, did we get a quirk for the Ricoh multi-function devices which all
> > > need to be in the same group because they do all their DMA from function
> > > zero? I think we need another similar quirk for a Marvell SATA
> > > controller which seems to do its AHCI DMA from its IDE function; see
> > > https://bugzilla.redhat.com/757166
> > 
> > No, as I mentioned, groups are currently for iommu_ops, not dma_ops,
> 
> That doesn't matter. There are multiple devices which do all their DMA
> from the same source-id, and which have to be handled together. It
> doesn't *matter* that it's a hardware bug in this case, and it's just
> because they're multiple PCI devices behind a PCIe bridge in the other
> case. The grouping code needs to handle it, whether it's for the benefit
> of iommu_ops, dma_ops, or both.
> 
> > though it makes sense that iommu drivers could use the group info or
> > create common quirk infrastructure for handling broken devices like
> > these. 
> 
> Well, I think we mostly have a consensus that dma_ops should be
> implemented *using* iommu_ops; I'd rather be trying to make things
> *consistent* between the two rather than building up the differences.
> 
> So I'd definitely go for your "use the group info" option. It is the
> *same* info, and needs the *same* quirks.

The problem at this stage is that all the implementations proposed for
the group info (both Alex's and David's) are somewhat higher-level stuff
that is itself constructed from the underlying iommu implementation.

And this is probably the right thing to do to avoid breaking too much
stuff at once.

I think the best approach for devices like that is to have a header
quirk setting a new bit flag in pci_dev along the lines of
"dma_group_functions", which causes the platform iommu implementation to
"do the right thing" and in turn create the isolation groups accordingly
as well.

Later on, we might want to revisit this, but I think it's nasty enough
to leave it as-is.  The creation of isolation groups will have to answer
to all sorts of other platform-specific constraints, so I'd rather not
try to put too much of that logic in generic code.

Cheers,
Ben.





end of thread, other threads:[~2011-12-07  6:20 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-21 19:55 [PATCH 0/4] iommu: iommu_ops group interface Alex Williamson
2011-10-21 19:56 ` [PATCH 1/4] iommu: Add iommu_device_group callback and iommu_group sysfs entry Alex Williamson
2011-11-30  2:42   ` David Gibson
2011-11-30  4:51     ` Benjamin Herrenschmidt
2011-11-30  5:25       ` Alex Williamson
2011-11-30  9:23         ` Benjamin Herrenschmidt
2011-12-01  0:06           ` David Gibson
2011-12-01  6:20           ` Alex Williamson
2011-12-01  0:03         ` David Gibson
2011-12-01  0:52           ` Chris Wright
2011-12-01  0:57             ` David Gibson
2011-12-01  1:04               ` Chris Wright
2011-12-01  1:50                 ` Benjamin Herrenschmidt
2011-12-01  2:00                   ` David Gibson
2011-12-01  2:05                   ` Chris Wright
2011-12-01  7:28                     ` Alex Williamson
2011-12-01 14:02                     ` Yoder Stuart-B08248
2011-12-01  6:48           ` Alex Williamson
2011-12-01 10:33             ` David Woodhouse
2011-12-01 14:34               ` Alex Williamson
2011-12-01 21:46                 ` Benjamin Herrenschmidt
2011-12-01 22:37                   ` Alex Williamson
2011-12-01 23:14                 ` David Woodhouse
2011-12-07  6:20                   ` Benjamin Herrenschmidt
2011-12-01 21:32             ` Benjamin Herrenschmidt
2011-10-21 19:56 ` [PATCH 2/4] intel-iommu: Implement iommu_device_group Alex Williamson
2011-11-08 17:23   ` Roedel, Joerg
2011-11-10 15:22     ` David Woodhouse
2011-10-21 19:56 ` [PATCH 3/4] amd-iommu: " Alex Williamson
2011-10-21 19:56 ` [PATCH 4/4] iommu: Add option to group multi-function devices Alex Williamson
2011-12-01  0:11   ` David Gibson
2011-10-21 20:34 ` [PATCH 0/4] iommu: iommu_ops group interface Woodhouse, David
2011-10-21 21:16   ` Alex Williamson
2011-10-21 22:39     ` Woodhouse, David
2011-10-21 22:34   ` Alex Williamson
2011-10-27 16:31 ` Alex Williamson
2011-11-15 15:51 ` Roedel, Joerg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox