public inbox for kvm@vger.kernel.org
* [PATCH V3 0/3] VFIO SRIOV support
@ 2016-08-18  7:29 Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl,
	ilyal

Changes from V2:
        1. Enabling and disabling SR-IOV is now done
        through the sysfs interface, requiring
        admin privileges.
        2. Since admin privileges are now required
        to enable SR-IOV, most of the security
        measures introduced in RFC V2 were removed.
        Unfortunately we still need a mutex to prevent
        the VFIO user from changing the number of
        VFs while enable_sriov is in progress.

Changes from V1:
        1. The VFs are no longer assigned to the PF's IOMMU group.
        2. Add a pci_enable_sriov_with_override API to allow
        enabling SR-IOV without probing the VFs with the
        default driver.

Changes from RFC V2:
        1. pci_disable_sriov() is now called from a workqueue
        to avoid the situation where a process is blocked
        in pci_disable_sriov() waiting for itself to release the VFs.
        2. A mutex was added to synchronize calls to
        pci_enable_sriov() and pci_disable_sriov().

Changes from RFC V1:
        Due to the security concern raised in RFC V1, we add two patches
        to make sure the VFs belong to the same IOMMU group as
        the PF and are probed by VFIO.

Today the QEMU hypervisor allows assigning a physical device to a VM,
facilitating driver development. However, it does not support enabling
SR-IOV by the VM kernel driver. Our goal is to implement such support,
allowing developers working on SR-IOV physical function drivers to work
inside VMs as well.

This patch series implements the kernel side of our solution.  It extends
the VFIO driver to support the PCIe SR-IOV extended capability with
the following features:
1. The ability to probe SR-IOV BAR sizes.
2. The ability to enable and disable SR-IOV.
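
To illustrate feature 1, here is a minimal userspace sketch of BAR sizing,
modelled on the mask arithmetic in vfio_sriov_bar_fixup() from patch 3/3.
The helper names (bar_read_back, bar_size_from_probe) are hypothetical and
are not part of this series; only 32-bit memory BARs with a power-of-two
size are modelled.

```c
#include <stdint.h>

/* Low four bits of a memory BAR hold type/prefetch flags, not address bits. */
#define BAR_MEM_FLAG_MASK 0xfu

/*
 * Value read back from a virtual BAR after the user wrote `written` to it:
 * address bits below the BAR size are forced to zero and the flag bits
 * are filled in, mirroring "*bar &= mask; *bar |= flags".
 */
static uint32_t bar_read_back(uint32_t written, uint32_t size, uint32_t flags)
{
	uint32_t mask = ~(size - 1);

	return (written & mask) | (flags & BAR_MEM_FLAG_MASK);
}

/* Recover the BAR size from an all-ones probe, as a guest driver would. */
static uint32_t bar_size_from_probe(uint32_t probed)
{
	return ~(probed & ~BAR_MEM_FLAG_MASK) + 1;
}
```

A guest that writes 0xffffffff to a 1 MB prefetchable VF BAR would, under
these assumptions, read back 0xfff00008 and derive a size of 0x100000.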

This patch series is going to be used by QEMU to expose SR-IOV capabilities
to the VM. We already have an early prototype based on Knut Omang's patches for
SR-IOV[1].

Limitations:
1. Per SR-IOV spec section 3.3.12, PFs are required to support
4-KB, 8-KB, 64-KB, 256-KB, 1-MB, and 4-MB page sizes.
Unfortunately, the kernel currently initializes the System Page Size register once
and assumes it doesn't change; therefore, we cannot allow guests to change this
register at will. We currently map both the Supported Page Sizes and
System Page Size as virtualized and read-only, in violation of the spec.
In practice this is not an issue since both the hypervisor and the
guest typically select the same System Page Size.
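
For readers unfamiliar with these two registers, the sketch below decodes
them in plain C. It assumes the encoding used by the SR-IOV specification,
in which bit n stands for a page size of 2^(n+12) bytes; the macro and
function names are hypothetical and do not appear in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 0, 1, 4, 6, 8, 10: the 4 KB, 8 KB, 64 KB, 256 KB, 1 MB, 4 MB sizes. */
#define SRIOV_REQUIRED_PGSIZES 0x553u

/* Page size in bytes encoded by bit n of either register: 2^(n + 12). */
static uint64_t sriov_pgsize_bytes(unsigned int bit)
{
	return 1ULL << (bit + 12);
}

/*
 * A System Page Size value is acceptable only if exactly one bit is set
 * and that bit is also set in Supported Page Sizes.
 */
static bool sriov_sys_pgsize_valid(uint32_t supported, uint32_t sys)
{
	bool one_bit = sys != 0 && (sys & (sys - 1)) == 0;

	return one_bit && (sys & supported) == sys;
}
```

With the register read-only, a guest that wants a different page size than
the host selected has no way to request it, which is the limitation
described above.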

[1] https://github.com/knuto/qemu/tree/sriov_patches_v6

Ilya Lesokhin (3):
  pci: Extend PCI IOV API
  vfio/pci: Allow control SR-IOV through sysfs interface
  vfio/pci: Add support for SR-IOV extended capablity

 drivers/pci/iov.c                   |  41 ++++++++--
 drivers/vfio/pci/vfio_pci.c         |  43 ++++++++--
 drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
 drivers/vfio/pci/vfio_pci_private.h |   2 +
 include/linux/pci.h                 |  13 +++-
 5 files changed, 219 insertions(+), 31 deletions(-)

-- 
1.8.3.1


* [PATCH V3 1/3] pci: Extend PCI IOV API
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
@ 2016-08-18  7:29 ` Ilya Lesokhin
  2016-08-18 22:09   ` Christoph Hellwig
  2016-08-22 18:51   ` kbuild test robot
  2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl,
	ilyal

1. Add pci_enable_sriov_with_override to allow
enabling sriov with a driver override
on the VFs.

2. Expose pci_iov_set_numvfs and pci_iov_resource_size
to make them available for other modules

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
---
 drivers/pci/iov.c   | 41 +++++++++++++++++++++++++++++++++--------
 include/linux/pci.h | 13 ++++++++++++-
 2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 2194b44..98f6f10 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -41,7 +41,7 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
  *
  * Update iov->offset and iov->stride when NumVFs is written.
  */
-static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
+void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
 {
 	struct pci_sriov *iov = dev->sriov;
 
@@ -49,6 +49,7 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
 	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &iov->offset);
 	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &iov->stride);
 }
+EXPORT_SYMBOL(pci_iov_set_numvfs);
 
 /*
  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
@@ -112,8 +113,10 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 
 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
 }
+EXPORT_SYMBOL(pci_iov_resource_size);
 
-int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
+int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset,
+		       char *driver_override)
 {
 	int i;
 	int rc = -ENOMEM;
@@ -154,14 +157,20 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
 		rc = request_resource(res, &virtfn->resource[i]);
 		BUG_ON(rc);
 	}
-
 	if (reset)
 		__pci_reset_function(virtfn);
 
 	pci_device_add(virtfn, virtfn->bus);
 	mutex_unlock(&iov->dev->sriov->lock);
 
+	if (driver_override) {
+		virtfn->driver_override = kstrdup(driver_override, GFP_KERNEL);
+		if (!virtfn->driver_override)
+			goto failed1;
+	}
+
 	pci_bus_add_device(virtfn);
+
 	sprintf(buf, "virtfn%u", id);
 	rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf);
 	if (rc)
@@ -235,7 +244,8 @@ int __weak pcibios_sriov_disable(struct pci_dev *pdev)
 	return 0;
 }
 
-static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
+static int sriov_enable(struct pci_dev *dev, int nr_virtfn,
+			char *driver_override)
 {
 	int rc;
 	int i;
@@ -321,7 +331,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	}
 
 	for (i = 0; i < initial; i++) {
-		rc = pci_iov_add_virtfn(dev, i, 0);
+		rc = pci_iov_add_virtfn(dev, i, 0, driver_override);
 		if (rc)
 			goto failed;
 	}
@@ -622,20 +632,35 @@ int pci_iov_bus_range(struct pci_bus *bus)
 }
 
 /**
- * pci_enable_sriov - enable the SR-IOV capability
+ * pci_enable_sriov_with_override - enable the SR-IOV capability
  * @dev: the PCI device
  * @nr_virtfn: number of virtual functions to enable
+ * @driver_override: driver override for VFs
  *
  * Returns 0 on success, or negative on failure.
  */
-int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+int pci_enable_sriov_with_override(struct pci_dev *dev, int nr_virtfn,
+				   char *driver_override)
 {
 	might_sleep();
 
 	if (!dev->is_physfn)
 		return -ENOSYS;
 
-	return sriov_enable(dev, nr_virtfn);
+	return sriov_enable(dev, nr_virtfn, driver_override);
+}
+EXPORT_SYMBOL_GPL(pci_enable_sriov_with_override);
+
+/**
+ * pci_enable_sriov - enable the SR-IOV capability
+ * @dev: the PCI device
+ * @nr_virtfn: number of virtual functions to enable
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+{
+	return pci_enable_sriov_with_override(dev, nr_virtfn, NULL);
 }
 EXPORT_SYMBOL_GPL(pci_enable_sriov);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b67e4df..54b3059 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1739,15 +1739,20 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
 int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
 int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 
+int pci_enable_sriov_with_override(struct pci_dev *dev, int nr_virtfn,
+				   char *driver_override);
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
-int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset);
+int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset,
+		       char *driver_override);
 void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset);
 int pci_num_vf(struct pci_dev *dev);
 int pci_vfs_assigned(struct pci_dev *dev);
 int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
 int pci_sriov_get_totalvfs(struct pci_dev *dev);
 resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno);
+
+void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn);
 #else
 static inline int pci_iov_virtfn_bus(struct pci_dev *dev, int id)
 {
@@ -1757,6 +1762,11 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev *dev, int id)
 {
 	return -ENOSYS;
 }
+
+static inline int pci_enable_sriov_with_override(struct pci_dev *dev,
+						 int nr_virtfn,
+						 char *driver_override)
+{ return -ENODEV; }
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 { return -ENODEV; }
 static inline int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
@@ -1775,6 +1785,7 @@ static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
 { return 0; }
 static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 { return 0; }
+static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn) { }
 #endif
 
 #if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
-- 
1.8.3.1


* [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
@ 2016-08-18  7:29 ` Ilya Lesokhin
  2016-08-18 22:11   ` Christoph Hellwig
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
  2017-03-08  7:29 ` [PATCH V3 0/3] VFIO SRIOV support Jike Song
  3 siblings, 1 reply; 12+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl,
	ilyal

This patch allows enabling and disabling SR-IOV for
devices probed by vfio-pci.
Since the devices might be assigned to untrusted entities,
we use driver_override to make sure the VFs are also
probed by the vfio-pci driver.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
---
 drivers/vfio/pci/vfio_pci.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index d624a52..6a203a7 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -297,6 +297,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 	struct vfio_pci_dummy_resource *dummy_res, *tmp;
 	int i, bar;
 
+	pci_disable_sriov(pdev);
 	/* Stop the device from further DMA */
 	pci_clear_master(pdev);
 
@@ -1314,12 +1315,25 @@ static const struct pci_error_handlers vfio_err_handlers = {
 	.error_detected = vfio_pci_aer_err_detected,
 };
 
+static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+	if (!num_vfs) {
+		pci_disable_sriov(pdev);
+		return 0;
+	}
+
+	return pci_enable_sriov_with_override(pdev,
+					      num_vfs,
+					     "vfio-pci");
+}
+
 static struct pci_driver vfio_pci_driver = {
-	.name		= "vfio-pci",
-	.id_table	= NULL, /* only dynamic ids */
-	.probe		= vfio_pci_probe,
-	.remove		= vfio_pci_remove,
-	.err_handler	= &vfio_err_handlers,
+	.name		 = "vfio-pci",
+	.id_table	 = NULL, /* only dynamic ids */
+	.probe		 = vfio_pci_probe,
+	.remove		 = vfio_pci_remove,
+	.err_handler	 = &vfio_err_handlers,
+	.sriov_configure = vfio_pci_sriov_configure,
 };
 
 struct vfio_devices {
-- 
1.8.3.1


* [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
@ 2016-08-18  7:29 ` Ilya Lesokhin
  2016-08-18 20:32   ` Alex Williamson
  2016-08-22  6:48   ` kbuild test robot
  2017-03-08  7:29 ` [PATCH V3 0/3] VFIO SRIOV support Jike Song
  3 siblings, 2 replies; 12+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl,
	ilyal

Add support for PCIE SR-IOV extended capability.
The capability gives the VFIO user the following abilities:
1. Detect that the device has an SR-IOV capability
2. Change sriov_numvfs and read the corresponding changes in
sriov_vf_offset and sriov_vf_stride
3. Probe vf bar sizes

Enabling and disabling SR-IOV is still done through the sysfs interface.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
---
 drivers/vfio/pci/vfio_pci.c         |  23 +++++-
 drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
 drivers/vfio/pci/vfio_pci_private.h |   2 +
 3 files changed, 157 insertions(+), 19 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 6a203a7..807caf2c 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1229,6 +1229,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	mutex_init(&vdev->igate);
 	spin_lock_init(&vdev->irqlock);
+	mutex_init(&vdev->sriov_mutex);
 
 	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
 	if (ret) {
@@ -1317,14 +1318,32 @@ static const struct pci_error_handlers vfio_err_handlers = {
 
 static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
 {
+	struct vfio_pci_device *vdev;
+	struct vfio_device *device;
+	int ret = 0;
+
+	device = vfio_device_get_from_dev(&pdev->dev);
+	if (!device)
+		return -EINVAL;
+
+	vdev = vfio_device_data(device);
+	if (!vdev) {
+		vfio_device_put(device);
+		return -EINVAL;
+	}
+
+	mutex_lock(&vdev->sriov_mutex);
 	if (!num_vfs) {
 		pci_disable_sriov(pdev);
-		return 0;
+		goto out;
 	}
 
-	return pci_enable_sriov_with_override(pdev,
+	ret =  pci_enable_sriov_with_override(pdev,
 					      num_vfs,
 					     "vfio-pci");
+out:
+	mutex_unlock(&vdev->sriov_mutex);
+	return ret;
 }
 
 static struct pci_driver vfio_pci_driver = {
diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 688691d..6c813d3 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -448,6 +448,35 @@ static __le32 vfio_generate_bar_flags(struct pci_dev *pdev, int bar)
 	return cpu_to_le32(val);
 }
 
+static void vfio_sriov_bar_fixup(struct vfio_pci_device *vdev,
+				 int sriov_cap_start)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	int i;
+	__le32 *bar;
+	u64 mask;
+
+	bar = (__le32 *)&vdev->vconfig[sriov_cap_start + PCI_SRIOV_BAR];
+
+	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, bar++) {
+		if (!pci_resource_start(pdev, i)) {
+			*bar = 0; /* Unmapped by host = unimplemented to user */
+			continue;
+		}
+
+		mask = ~(pci_iov_resource_size(pdev, i) - 1);
+
+		*bar &= cpu_to_le32((u32)mask);
+		*bar |= vfio_generate_bar_flags(pdev, i);
+
+		if (*bar & cpu_to_le32(PCI_BASE_ADDRESS_MEM_TYPE_64)) {
+			bar++;
+			*bar &= cpu_to_le32((u32)(mask >> 32));
+			i++;
+		}
+	}
+}
+
 /*
  * Pretend we're hardware and tweak the values of the *virtual* PCI BARs
  * to reflect the hardware capabilities.  This implements BAR sizing.
@@ -901,6 +930,106 @@ static int __init init_pci_ext_cap_pwr_perm(struct perm_bits *perm)
 	return 0;
 }
 
+static int __init init_pci_ext_cap_sriov_perm(struct perm_bits *perm)
+{
+	int i;
+
+	if (alloc_perm_bits(perm, pci_ext_cap_length[PCI_EXT_CAP_ID_SRIOV]))
+		return -ENOMEM;
+
+	/*
+	 * Virtualize the first dword of all express capabilities
+	 * because it includes the next pointer.  This lets us later
+	 * remove capabilities from the chain if we need to.
+	 */
+	p_setd(perm, 0, ALL_VIRT, NO_WRITE);
+
+	/* VF Enable - Virtualized and writable
+	 * Memory Space Enable - Non-virtualized and writable
+	 */
+	p_setw(perm, PCI_SRIOV_CTRL, NO_VIRT,
+	       PCI_SRIOV_CTRL_MSE);
+
+	p_setw(perm, PCI_SRIOV_NUM_VF, (u16)NO_VIRT, (u16)ALL_WRITE);
+	p_setw(perm, PCI_SRIOV_SUP_PGSIZE, (u16)ALL_VIRT, NO_WRITE);
+
+	/* We cannot let user space application change the page size
+	 * so we mark it as read only and trust the user application
+	 * (e.g. qemu) to virtualize this correctly for the guest
+	 */
+	p_setw(perm, PCI_SRIOV_SYS_PGSIZE, (u16)ALL_VIRT, NO_WRITE);
+
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
+		p_setd(perm, PCI_SRIOV_BAR + 4 * i, ALL_VIRT, ALL_WRITE);
+
+	return 0;
+}
+
+static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
+{
+	u8 cap;
+	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
+						 PCI_STD_HEADER_SIZEOF;
+	cap = vdev->pci_config_map[pos];
+
+	if (cap == PCI_CAP_ID_BASIC)
+		return 0;
+
+	/* XXX Can we have to abutting capabilities of the same type? */
+	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
+		pos--;
+
+	return pos;
+}
+
+static int vfio_sriov_cap_config_read(struct vfio_pci_device *vdev, int pos,
+				      int count, struct perm_bits *perm,
+				      int offset, __le32 *val)
+{
+	int cap_start = vfio_find_cap_start(vdev, pos);
+
+	vfio_sriov_bar_fixup(vdev, cap_start);
+	return vfio_default_config_read(vdev, pos, count, perm, offset, val);
+}
+
+static int vfio_sriov_cap_config_write(struct vfio_pci_device *vdev, int pos,
+				       int count, struct perm_bits *perm,
+				       int offset, __le32 val)
+{
+	switch (offset) {
+	case  PCI_SRIOV_NUM_VF:
+	/* Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset
+	 * and VF Stride may change when NumVFs changes.
+	 *
+	 * Therefore we should pass valid writes to the hardware.
+	 *
+	 * Per SR-IOV spec sec 3.3.7
+	 * The results are undefined if NumVFs is set to a value greater
+	 * than TotalVFs.
+	 * NumVFs may only be written while VF Enable is Clear.
+	 * If NumVFs is written when VF Enable is Set, the results
+	 * are undefined.
+
+	 * Avoid passing such writes to the Hardware just in case.
+	 */
+		mutex_lock(&vdev->sriov_mutex);
+		if (pci_num_vf(vdev->pdev) ||
+		    val > pci_sriov_get_totalvfs(vdev->pdev)) {
+			mutex_unlock(&vdev->sriov_mutex);
+			return count;
+		}
+
+		pci_iov_set_numvfs(vdev->pdev, val);
+		mutex_unlock(&vdev->sriov_mutex);
+		break;
+	default:
+		break;
+	}
+
+	return vfio_default_config_write(vdev, pos, count, perm,
+					 offset, val);
+}
+
 /*
  * Initialize the shared permission tables
  */
@@ -916,6 +1045,7 @@ void vfio_pci_uninit_perm_bits(void)
 
 	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_ERR]);
 	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
+	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
 }
 
 int __init vfio_pci_init_perm_bits(void)
@@ -938,29 +1068,16 @@ int __init vfio_pci_init_perm_bits(void)
 	ret |= init_pci_ext_cap_pwr_perm(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
 	ecap_perms[PCI_EXT_CAP_ID_VNDR].writefn = vfio_raw_config_write;
 
+	ret |= init_pci_ext_cap_sriov_perm(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
+	ecap_perms[PCI_EXT_CAP_ID_SRIOV].readfn = vfio_sriov_cap_config_read;
+	ecap_perms[PCI_EXT_CAP_ID_SRIOV].writefn = vfio_sriov_cap_config_write;
+
 	if (ret)
 		vfio_pci_uninit_perm_bits();
 
 	return ret;
 }
 
-static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
-{
-	u8 cap;
-	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
-						 PCI_STD_HEADER_SIZEOF;
-	cap = vdev->pci_config_map[pos];
-
-	if (cap == PCI_CAP_ID_BASIC)
-		return 0;
-
-	/* XXX Can we have to abutting capabilities of the same type? */
-	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
-		pos--;
-
-	return pos;
-}
-
 static int vfio_msi_config_read(struct vfio_pci_device *vdev, int pos,
 				int count, struct perm_bits *perm,
 				int offset, __le32 *val)
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 2128de8..02732eb 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -96,6 +96,8 @@ struct vfio_pci_device {
 	struct eventfd_ctx	*err_trigger;
 	struct eventfd_ctx	*req_trigger;
 	struct list_head	dummy_resources_list;
+	struct mutex		sriov_mutex;
+
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
-- 
1.8.3.1



* Re: [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
@ 2016-08-18 20:32   ` Alex Williamson
  2016-08-22  6:48   ` kbuild test robot
  1 sibling, 0 replies; 12+ messages in thread
From: Alex Williamson @ 2016-08-18 20:32 UTC (permalink / raw)
  To: Ilya Lesokhin; +Cc: kvm, linux-pci, bhelgaas, noaos, haggaie, ogerlitz, liranl

On Thu, 18 Aug 2016 10:29:17 +0300
Ilya Lesokhin <ilyal@mellanox.com> wrote:

> Add support for PCIE SR-IOV extended capability.
> The capability gives the VFIO user the following abilities:
> 1. Detect that the device has an SR-IOV capability
> 2. Change sriov_numvfs and read the corresponding changes in
> sriov_vf_offset and sriov_vf_stride
> 3. Probe vf bar sizes
> 
> Enabling and disable sriov is still done through the sysfs interface
> 
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Noa Osherovich <noaos@mellanox.com>
> Signed-off-by: Haggai Eran <haggaie@mellanox.com>
> ---
>  drivers/vfio/pci/vfio_pci.c         |  23 +++++-
>  drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
>  drivers/vfio/pci/vfio_pci_private.h |   2 +
>  3 files changed, 157 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 6a203a7..807caf2c 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1229,6 +1229,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
>  	spin_lock_init(&vdev->irqlock);
> +	mutex_init(&vdev->sriov_mutex);
>  
>  	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
>  	if (ret) {
> @@ -1317,14 +1318,32 @@ static const struct pci_error_handlers vfio_err_handlers = {
>  
>  static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
>  {
> +	struct vfio_pci_device *vdev;
> +	struct vfio_device *device;
> +	int ret = 0;
> +
> +	device = vfio_device_get_from_dev(&pdev->dev);
> +	if (!device)
> +		return -EINVAL;
> +
> +	vdev = vfio_device_data(device);
> +	if (!vdev) {
> +		vfio_device_put(device);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&vdev->sriov_mutex);
>  	if (!num_vfs) {
>  		pci_disable_sriov(pdev);
> -		return 0;
> +		goto out;
>  	}
>  
> -	return pci_enable_sriov_with_override(pdev,
> +	ret =  pci_enable_sriov_with_override(pdev,
>  					      num_vfs,
>  					     "vfio-pci");
> +out:
> +	mutex_unlock(&vdev->sriov_mutex);

vfio_device_put(device);

> +	return ret;
>  }
>  
>  static struct pci_driver vfio_pci_driver = {
> diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> index 688691d..6c813d3 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -448,6 +448,35 @@ static __le32 vfio_generate_bar_flags(struct pci_dev *pdev, int bar)
>  	return cpu_to_le32(val);
>  }
>  
> +static void vfio_sriov_bar_fixup(struct vfio_pci_device *vdev,
> +				 int sriov_cap_start)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int i;
> +	__le32 *bar;
> +	u64 mask;
> +
> +	bar = (__le32 *)&vdev->vconfig[sriov_cap_start + PCI_SRIOV_BAR];
> +
> +	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, bar++) {

These are only defined when CONFIG_PCI_IOV

> +		if (!pci_resource_start(pdev, i)) {
> +			*bar = 0; /* Unmapped by host = unimplemented to user */
> +			continue;
> +		}
> +
> +		mask = ~(pci_iov_resource_size(pdev, i) - 1);
> +
> +		*bar &= cpu_to_le32((u32)mask);
> +		*bar |= vfio_generate_bar_flags(pdev, i);
> +
> +		if (*bar & cpu_to_le32(PCI_BASE_ADDRESS_MEM_TYPE_64)) {
> +			bar++;
> +			*bar &= cpu_to_le32((u32)(mask >> 32));
> +			i++;
> +		}
> +	}
> +}
> +
>  /*
>   * Pretend we're hardware and tweak the values of the *virtual* PCI BARs
>   * to reflect the hardware capabilities.  This implements BAR sizing.
> @@ -901,6 +930,106 @@ static int __init init_pci_ext_cap_pwr_perm(struct perm_bits *perm)
>  	return 0;
>  }
>  
> +static int __init init_pci_ext_cap_sriov_perm(struct perm_bits *perm)
> +{
> +	int i;
> +
> +	if (alloc_perm_bits(perm, pci_ext_cap_length[PCI_EXT_CAP_ID_SRIOV]))
> +		return -ENOMEM;
> +
> +	/*
> +	 * Virtualize the first dword of all express capabilities
> +	 * because it includes the next pointer.  This lets us later
> +	 * remove capabilities from the chain if we need to.
> +	 */
> +	p_setd(perm, 0, ALL_VIRT, NO_WRITE);
> +
> +	/* VF Enable - Virtualized and writable

nit, comment style - multi-line comments as above.

Comment doesn't seem to match the code, VFE is neither virtualized nor
writable.

> +	 * Memory Space Enable - Non-virtualized and writable
> +	 */
> +	p_setw(perm, PCI_SRIOV_CTRL, NO_VIRT,
> +	       PCI_SRIOV_CTRL_MSE);
> +
> +	p_setw(perm, PCI_SRIOV_NUM_VF, (u16)NO_VIRT, (u16)ALL_WRITE);
> +	p_setw(perm, PCI_SRIOV_SUP_PGSIZE, (u16)ALL_VIRT, NO_WRITE);

Is this necessary?  What's the purpose in virtualizing it?  Per the
spec, it's read-only in hardware.

> +
> +	/* We cannot let user space application change the page size
> +	 * so we mark it as read only and trust the user application
> +	 * (e.g. qemu) to virtualize this correctly for the guest
> +	 */
> +	p_setw(perm, PCI_SRIOV_SYS_PGSIZE, (u16)ALL_VIRT, NO_WRITE);

But why do we virtualize it?

> +
> +	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
> +		p_setd(perm, PCI_SRIOV_BAR + 4 * i, ALL_VIRT, ALL_WRITE);
> +
> +	return 0;
> +}
> +
> +static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
> +{
> +	u8 cap;
> +	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
> +						 PCI_STD_HEADER_SIZEOF;
> +	cap = vdev->pci_config_map[pos];
> +
> +	if (cap == PCI_CAP_ID_BASIC)
> +		return 0;
> +
> +	/* XXX Can we have to abutting capabilities of the same type? */
> +	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
> +		pos--;
> +
> +	return pos;
> +}
> +
> +static int vfio_sriov_cap_config_read(struct vfio_pci_device *vdev, int pos,
> +				      int count, struct perm_bits *perm,
> +				      int offset, __le32 *val)
> +{
> +	int cap_start = vfio_find_cap_start(vdev, pos);
> +
> +	vfio_sriov_bar_fixup(vdev, cap_start);

Should we make an is_iov_bar() function for at least a little bit of
filtering?

> +	return vfio_default_config_read(vdev, pos, count, perm, offset, val);
> +}
> +
> +static int vfio_sriov_cap_config_write(struct vfio_pci_device *vdev, int pos,
> +				       int count, struct perm_bits *perm,
> +				       int offset, __le32 val)
> +{
> +	switch (offset) {
> +	case  PCI_SRIOV_NUM_VF:
> +	/* Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset
> +	 * and VF Stride may change when NumVFs changes.

This really seems more complicated than set forth here to virtualize.
For instance offset and stride are also affected by ARI, so if the ARI
settings between host and VM don't match, a user like QEMU is going to
need to virtualize offset, stride, and maybe even TotalVFs to
something appropriate for the VM.  There's also the question of why
the physical offset/stride matter at all to a VM when these devices
aren't being initialized in the VM address space and the user/hypervisor
is free to manage them however they see fit.  So I think the only
purpose of virtualizing any of this is so that a VM can potentially
match the bare hardware in the case where the VM and physical system are
sufficiently similar.  Is that correct?
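
To make the offset/stride dependency concrete, here is a small userspace
sketch of how a VF's routing ID follows from the PF's routing ID, First VF
Offset and VF Stride. It mirrors the arithmetic of pci_iov_virtfn_bus() and
pci_iov_virtfn_devfn() as I understand them; the function names are
hypothetical and vf_index is zero-based.

```c
#include <stdint.h>

/* Routing ID layout: bus number in bits 15:8, device/function in bits 7:0. */
static uint16_t vf_routing_id(uint16_t pf_rid, uint16_t first_vf_offset,
			      uint16_t vf_stride, unsigned int vf_index)
{
	return (uint16_t)(pf_rid + first_vf_offset + vf_stride * vf_index);
}

/* Bus number part of a routing ID. */
static uint8_t rid_bus(uint16_t rid)
{
	return (uint8_t)(rid >> 8);
}

/* Device/function part of a routing ID. */
static uint8_t rid_devfn(uint16_t rid)
{
	return (uint8_t)(rid & 0xff);
}
```

Since offset and stride can change with NumVFs (and with the ARI setting),
the same VF index can land on a different bus/devfn on the host than the
guest would compute from its own view.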

nit, comment style: blank line within the comment block below.

> +	 *
> +	 * Therefore we should pass valid writes to the hardware.
> +	 *
> +	 * Per SR-IOV spec sec 3.3.7
> +	 * The results are undefined if NumVFs is set to a value greater
> +	 * than TotalVFs.
> +	 * NumVFs may only be written while VF Enable is Clear.
> +	 * If NumVFs is written when VF Enable is Set, the results
> +	 * are undefined.
> +
> +	 * Avoid passing such writes to the Hardware just in case.
> +	 */
> +		mutex_lock(&vdev->sriov_mutex);
> +		if (pci_num_vf(vdev->pdev) ||
> +		    val > pci_sriov_get_totalvfs(vdev->pdev)) {
> +			mutex_unlock(&vdev->sriov_mutex);
> +			return count;
> +		}
> +
> +		pci_iov_set_numvfs(vdev->pdev, val);
> +		mutex_unlock(&vdev->sriov_mutex);
> +		break;
> +	default:
> +		break;
> +	}

Seems unnecessary to have a switch statement for a single case, can't
we just wrap this in a "if (offset == PCI_SRIOV_NUM_VF)" block?
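
A userspace model of that shape, for illustration only: the struct and
function below are hypothetical stand-ins for the vfio state and the write
handler, the locking is left out, and the only value taken from the kernel
headers is the NumVFs register offset (0x10 within the capability).

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of NumVFs within the SR-IOV capability (PCI_SRIOV_NUM_VF). */
#define SRIOV_NUM_VF_OFFSET 0x10

/* Hypothetical stand-in for the device state the handler consults. */
struct sriov_model {
	uint16_t total_vfs;   /* TotalVFs */
	uint16_t enabled_vfs; /* nonzero while VFs are enabled */
	uint16_t num_vfs;     /* last NumVFs value forwarded to "hardware" */
};

/* Returns true if the write is forwarded, false if it is silently dropped. */
static bool model_numvfs_write(struct sriov_model *m, int offset, uint16_t val)
{
	if (offset == SRIOV_NUM_VF_OFFSET) {
		/* Undefined per spec: NumVFs > TotalVFs, or VFs already enabled. */
		if (m->enabled_vfs || val > m->total_vfs)
			return false;
		m->num_vfs = val;
	}

	return true;
}
```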

> +
> +	return vfio_default_config_write(vdev, pos, count, perm,
> +					 offset, val);
> +}
> +
>  /*
>   * Initialize the shared permission tables
>   */
> @@ -916,6 +1045,7 @@ void vfio_pci_uninit_perm_bits(void)
>  
>  	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_ERR]);
>  	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
> +	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
>  }
>  
>  int __init vfio_pci_init_perm_bits(void)
> @@ -938,29 +1068,16 @@ int __init vfio_pci_init_perm_bits(void)
>  	ret |= init_pci_ext_cap_pwr_perm(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
>  	ecap_perms[PCI_EXT_CAP_ID_VNDR].writefn = vfio_raw_config_write;
>  
> +	ret |= init_pci_ext_cap_sriov_perm(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
> +	ecap_perms[PCI_EXT_CAP_ID_SRIOV].readfn = vfio_sriov_cap_config_read;
> +	ecap_perms[PCI_EXT_CAP_ID_SRIOV].writefn = vfio_sriov_cap_config_write;
> +
>  	if (ret)
>  		vfio_pci_uninit_perm_bits();
>  
>  	return ret;
>  }
>  
> -static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
> -{
> -	u8 cap;
> -	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
> -						 PCI_STD_HEADER_SIZEOF;
> -	cap = vdev->pci_config_map[pos];
> -
> -	if (cap == PCI_CAP_ID_BASIC)
> -		return 0;
> -
> -	/* XXX Can we have to abutting capabilities of the same type? */
> -	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
> -		pos--;
> -
> -	return pos;
> -}
> -
>  static int vfio_msi_config_read(struct vfio_pci_device *vdev, int pos,
>  				int count, struct perm_bits *perm,
>  				int offset, __le32 *val)
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 2128de8..02732eb 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -96,6 +96,8 @@ struct vfio_pci_device {
>  	struct eventfd_ctx	*err_trigger;
>  	struct eventfd_ctx	*req_trigger;
>  	struct list_head	dummy_resources_list;
> +	struct mutex		sriov_mutex;
> +

whitespace

>  };
>  
>  #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)


* Re: [PATCH V3 1/3] pci: Extend PCI IOV API
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
@ 2016-08-18 22:09   ` Christoph Hellwig
  2016-08-22 18:51   ` kbuild test robot
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2016-08-18 22:09 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, noaos, haggaie,
	ogerlitz, liranl

On Thu, Aug 18, 2016 at 10:29:15AM +0300, Ilya Lesokhin wrote:
> 1. Add pci_enable_sriov_with_override to allow
> enabling sriov with a driver override
> on the VFs.
> 
> 2. Expose pci_iov_set_numvfs and pci_iov_resource_size
> to make them available for other modules

Please use EXPORT_SYMBOL_GPL for such low-level exports.

* Re: [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface
  2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
@ 2016-08-18 22:11   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2016-08-18 22:11 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, noaos, haggaie,
	ogerlitz, liranl

On Thu, Aug 18, 2016 at 10:29:16AM +0300, Ilya Lesokhin wrote:
> +static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
> +{
> +	if (!num_vfs) {
> +		pci_disable_sriov(pdev);
> +		return 0;
> +	}
> +
> +	return pci_enable_sriov_with_override(pdev,
> +					      num_vfs,
> +					     "vfio-pci");
> +}

I have to admit that I don't really like this API.  Would it be a
major burden for your use case to just have a flag instead that
disables automatic driver attachment for VFs and requires manual
binding?
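
For context, the manual binding mentioned above could look roughly like the following from user space. This is only a sketch, not part of the series: it assumes the standard `driver_override` and `drivers_probe` sysfs attributes, and the `demo_*` helpers and the `root` parameter (normally "/sys") are hypothetical names introduced so the logic can be exercised against a scratch directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `value` to the file at `path`; returns 0 on success, -1 on error. */
static int demo_write_attr(const char *path, const char *value)
{
	FILE *f;
	int ok;

	assert(path && value);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Build "<root>/bus/pci/devices/<bdf>/<attr>"; -1 if it does not fit in buf. */
static int demo_vf_attr_path(char *buf, size_t len, const char *root,
			     const char *bdf, const char *attr)
{
	int n = snprintf(buf, len, "%s/bus/pci/devices/%s/%s", root, bdf, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Bind one VF by hand: set its driver_override, then ask the PCI core to
 * probe it. `bdf` is the VF address, e.g. "0000:03:00.1" (made-up example).
 */
static int demo_bind_vf(const char *root, const char *bdf, const char *driver)
{
	char path[512];
	int n;

	if (strlen(driver) == 0)
		return -1;
	if (demo_vf_attr_path(path, sizeof(path), root, bdf, "driver_override"))
		return -1;
	if (demo_write_attr(path, driver))
		return -1;

	n = snprintf(path, sizeof(path), "%s/bus/pci/drivers_probe", root);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return demo_write_attr(path, bdf);
}
```

With a flag that suppresses automatic VF probing, an administrator would call something like `demo_bind_vf("/sys", vf, "vfio-pci")` once per VF after enabling SR-IOV, instead of having the PF driver pass the override string into the PCI core.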

* Re: [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
  2016-08-18 20:32   ` Alex Williamson
@ 2016-08-22  6:48   ` kbuild test robot
  1 sibling, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2016-08-22  6:48 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, kvm, linux-pci, bhelgaas, alex.williamson, noaos,
	haggaie, ogerlitz, liranl, ilyal

Hi Ilya,

[auto build test ERROR on vfio/next]
[also build test ERROR on v4.8-rc3 next-20160819]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/VFIO-SRIOV-support/20160818-153802
base:   https://github.com/awilliam/linux-vfio.git next
config: x86_64-randconfig-s1-08191332 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/vfio/pci/vfio_pci_config.c: In function 'vfio_sriov_bar_fixup':
>> drivers/vfio/pci/vfio_pci_config.c:461: error: 'PCI_IOV_RESOURCES' undeclared (first use in this function)
   drivers/vfio/pci/vfio_pci_config.c:461: error: (Each undeclared identifier is reported only once
   drivers/vfio/pci/vfio_pci_config.c:461: error: for each function it appears in.)
>> drivers/vfio/pci/vfio_pci_config.c:461: error: 'PCI_IOV_RESOURCE_END' undeclared (first use in this function)

vim +/PCI_IOV_RESOURCES +461 drivers/vfio/pci/vfio_pci_config.c

   455		int i;
   456		__le32 *bar;
   457		u64 mask;
   458	
   459		bar = (__le32 *)&vdev->vconfig[sriov_cap_start + PCI_SRIOV_BAR];
   460	
 > 461		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, bar++) {
   462			if (!pci_resource_start(pdev, i)) {
   463				*bar = 0; /* Unmapped by host = unimplemented to user */
   464				continue;
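
A plausible cause, assuming this randconfig has CONFIG_PCI_IOV disabled: the PCI_IOV_RESOURCES entries of the resource enum in include/linux/pci.h sit behind that option, so code that names them unconditionally fails to build. The sketch below reproduces the pattern in user space; every `DEMO_*` name is a hypothetical stand-in, not a kernel symbol.

```c
#include <assert.h>

/* Stand-in for CONFIG_PCI_IOV; remove this define to model the failing config. */
#define DEMO_PCI_IOV 1

#define DEMO_SRIOV_NUM_BARS 6

enum {
	DEMO_STD_RESOURCES,
	DEMO_STD_RESOURCE_END = 5,
	DEMO_ROM_RESOURCE,
#ifdef DEMO_PCI_IOV
	/* These two names simply do not exist when the option is off. */
	DEMO_IOV_RESOURCES,
	DEMO_IOV_RESOURCE_END = DEMO_IOV_RESOURCES + DEMO_SRIOV_NUM_BARS - 1,
#endif
	DEMO_NUM_RESOURCES
};

/* Count the SR-IOV BAR slots a fixup loop would visit; 0 without IOV support. */
static int demo_count_iov_bars(void)
{
	int n = 0;
#ifdef DEMO_PCI_IOV
	int i;

	for (i = DEMO_IOV_RESOURCES; i <= DEMO_IOV_RESOURCE_END; i++)
		n++;
	assert(n == DEMO_SRIOV_NUM_BARS);
#endif
	return n;
}
```

Guarding the user of the names the same way as their definition (or compiling the SR-IOV fixup only when the option is set) is one way to keep such a config building.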

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

* Re: [PATCH V3 1/3] pci: Extend PCI IOV API
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
  2016-08-18 22:09   ` Christoph Hellwig
@ 2016-08-22 18:51   ` kbuild test robot
  1 sibling, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2016-08-22 18:51 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, kvm, linux-pci, bhelgaas, alex.williamson, noaos,
	haggaie, ogerlitz, liranl, ilyal

Hi Ilya,

[auto build test ERROR on vfio/next]
[also build test ERROR on v4.8-rc3 next-20160822]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/VFIO-SRIOV-support/20160818-153802
base:   https://github.com/awilliam/linux-vfio.git next
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/eeh_driver.c: In function 'eeh_add_virt_device':
>> arch/powerpc/kernel/eeh_driver.c:444:2: error: too few arguments to function 'pci_iov_add_virtfn'
     pci_iov_add_virtfn(edev->physfn, pdn->vf_index, 0);
     ^
   In file included from arch/powerpc/kernel/eeh_driver.c:29:0:
   include/linux/pci.h:1746:5: note: declared here
    int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset,
        ^

vim +/pci_iov_add_virtfn +444 arch/powerpc/kernel/eeh_driver.c

67086e32 Wei Yang 2016-03-04  438  		eeh_pcid_put(dev);
67086e32 Wei Yang 2016-03-04  439  		if (driver->err_handler)
67086e32 Wei Yang 2016-03-04  440  			return NULL;
67086e32 Wei Yang 2016-03-04  441  	}
67086e32 Wei Yang 2016-03-04  442  
67086e32 Wei Yang 2016-03-04  443  #ifdef CONFIG_PPC_POWERNV
67086e32 Wei Yang 2016-03-04 @444  	pci_iov_add_virtfn(edev->physfn, pdn->vf_index, 0);
67086e32 Wei Yang 2016-03-04  445  #endif
67086e32 Wei Yang 2016-03-04  446  	return NULL;
67086e32 Wei Yang 2016-03-04  447  }
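
The error above is what happens when a function gains a parameter and an out-of-tree-of-view caller (here arch/powerpc) is not updated. One common way to avoid it is to keep the old signature as a thin wrapper around the extended function. The sketch below shows the idea in user space only; the `demo_*` names and the shape of the extra argument are assumptions, since the new prototype is truncated in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the state a VF would carry in this illustration. */
struct demo_vf {
	int id;
	int reset;
	const char *driver_override;	/* NULL means "use the default driver" */
};

/* Extended entry point: hypothetical stand-in for the longer signature. */
static int demo_add_virtfn_override(struct demo_vf *vf, int id, int reset,
				    const char *override)
{
	assert(vf != NULL);
	if (id < 0)
		return -1;
	vf->id = id;
	vf->reset = reset;
	vf->driver_override = override;
	return 0;
}

/* Original three-argument form, kept so existing callers still compile. */
static int demo_add_virtfn(struct demo_vf *vf, int id, int reset)
{
	return demo_add_virtfn_override(vf, id, reset, NULL);
}
```

The alternative is to update every caller in the same patch, which requires grepping all architectures for users of the changed function.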

:::::: The code at line 444 was first introduced by commit
:::::: 67086e32b56481531ab1292b284e074b1a8d764c powerpc/eeh: powerpc/eeh: Support error recovery for VF PE

:::::: TO: Wei Yang <weiyang@linux.vnet.ibm.com>
:::::: CC: Michael Ellerman <mpe@ellerman.id.au>

* Re: [PATCH V3 0/3] VFIO SRIOV support
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
                   ` (2 preceding siblings ...)
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
@ 2017-03-08  7:29 ` Jike Song
  2017-03-09  6:24   ` Ilya Lesokhin
  3 siblings, 1 reply; 12+ messages in thread
From: Jike Song @ 2017-03-08  7:29 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, noaos, haggaie,
	ogerlitz, liranl, You, Lizhen

On 08/18/2016 03:29 PM, Ilya Lesokhin wrote:
> Changes from V2:
>         1. Enabling and disabling SR-IOV is now done
>         through the sysfs interface, requiring
>         admin privileges.
>         2. Since admin privileges are now required
>         to enable SR-IOV, most of the security
>         measures introduced in RFC V2 were removed.
>         Unfortunately we still need a mutex to prevent
>         the VFIO user from changing the number of
>         VFs while enable_sriov is in progress.
> 
> Changes from V1:
>         1. The VFs are no longer assigned to the PF's IOMMU group.
>         2. Add a pci_enable_sriov_with_override API to allow
>         enabling SR-IOV without probing the VFs with the
>         default driver.
> 
> Changes from RFC V2:
>         1. pci_disable_sriov() is now called from a workqueue
>         to avoid the situation where a process is blocked
>         in pci_disable_sriov() waiting for itself to release the VFs.
>         2. A mutex was added to synchronize calls to
>         pci_enable_sriov() and pci_disable_sriov().
> 
> Changes from RFC V1:
>         Due to the security concern raised in RFC V1, we add two patches
>         to make sure the VFs belong to the same IOMMU group as
>         the PF and are probed by VFIO.
> 
> Today the QEMU hypervisor allows assigning a physical device to a VM,
> facilitating driver development. However, it does not support enabling
> SR-IOV by the VM kernel driver. Our goal is to implement such support,
> allowing developers working on SR-IOV physical function drivers to work
> inside VMs as well.
> 
> This patch series implements the kernel side of our solution.  It extends
> the VFIO driver to support the PCIe SR-IOV extended capability with the
> following features:
> 1. The ability to probe SR-IOV BAR sizes.
> 2. The ability to enable and disable SR-IOV.
> 
> This patch series is going to be used by QEMU to expose SR-IOV capabilities
> to VMs. We already have an early prototype based on Knut Omang's patches for
> SR-IOV[1].
> 
> Limitations:
> 1. Per SR-IOV spec section 3.3.12, PFs are required to support
> 4-KB, 8-KB, 64-KB, 256-KB, 1-MB, and 4-MB page sizes.
> Unfortunately, the kernel currently initializes the System Page Size register
> once and assumes it does not change, so we cannot allow guests to change this
> register at will. We currently map both the Supported Page Sizes and
> System Page Size registers as virtualized and read-only, in violation of the spec.
> In practice this is not an issue, since both the hypervisor and the
> guest typically select the same System Page Size.
> 
> [1] https://github.com/knuto/qemu/tree/sriov_patches_v6
> 
> Ilya Lesokhin (3):
>   pci: Extend PCI IOV API
>   vfio/pci: Allow control SR-IOV through sysfs interface
>   vfio/pci: Add support for SR-IOV extended capablity
> 
>  drivers/pci/iov.c                   |  41 ++++++++--
>  drivers/vfio/pci/vfio_pci.c         |  43 ++++++++--
>  drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
>  drivers/vfio/pci/vfio_pci_private.h |   2 +
>  include/linux/pci.h                 |  13 +++-
>  5 files changed, 219 insertions(+), 31 deletions(-)
> 
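
On the System Page Size limitation quoted above, a small sketch of the register encoding may help: in the SR-IOV capability, bit n of Supported Page Sizes or System Page Size stands for a page size of 2^(n + 12) bytes, and System Page Size is expected to have exactly one bit set. The `demo_*` helpers and `DEMO_REQUIRED_PGSIZE_MASK` are hypothetical names for illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n in the page-size registers means a page size of 2^(n + 12) bytes. */
static uint64_t demo_sriov_page_bytes(unsigned int bit)
{
	assert(bit < 32);
	return (uint64_t)1 << (bit + 12);
}

/* Mask of the mandatory sizes listed above: 4K, 8K, 64K, 256K, 1M, 4M. */
#define DEMO_REQUIRED_PGSIZE_MASK 0x553u

/* A System Page Size value is acceptable if it is a single supported bit. */
static int demo_sys_pgsize_valid(uint32_t supported, uint32_t system)
{
	if (system == 0 || (system & (system - 1)) != 0)
		return 0;	/* zero or more than one bit set */
	return (system & supported) != 0;
}
```

A host and guest that both run with 4-KB pages would both pick bit 0, which is why exposing the register as read-only works in the common case described in the cover letter.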

+Lizhen


Hi Ilya,

Sorry for jumping in abruptly. We are also looking forward to having a PF
used within a VM. Would you please share your plans with us? Is a v4
likely to be posted soon?


--
Thanks,
Jike

* RE: [PATCH V3 0/3] VFIO SRIOV support
  2017-03-08  7:29 ` [PATCH V3 0/3] VFIO SRIOV support Jike Song
@ 2017-03-09  6:24   ` Ilya Lesokhin
  2017-03-09  6:29     ` You, Lizhen
  0 siblings, 1 reply; 12+ messages in thread
From: Ilya Lesokhin @ 2017-03-09  6:24 UTC (permalink / raw)
  To: Jike Song
  Cc: kvm@vger.kernel.org, linux-pci@vger.kernel.org,
	bhelgaas@google.com, alex.williamson@redhat.com, Noa Osherovich,
	Haggai Eran, Or Gerlitz, Liran Liss, You, Lizhen

Hi Jike,
I don't have a plan to work on it in the near future, but we can share the code if you are interested.

Thanks,
Ilya

* RE: [PATCH V3 0/3] VFIO SRIOV support
  2017-03-09  6:24   ` Ilya Lesokhin
@ 2017-03-09  6:29     ` You, Lizhen
  0 siblings, 0 replies; 12+ messages in thread
From: You, Lizhen @ 2017-03-09  6:29 UTC (permalink / raw)
  To: Ilya Lesokhin, Song, Jike
  Cc: kvm@vger.kernel.org, linux-pci@vger.kernel.org,
	bhelgaas@google.com, alex.williamson@redhat.com, Noa Osherovich,
	Haggai Eran, Or Gerlitz, Liran Liss

Hi Ilya,

We'd like to give it a try. If you can share the code, that would be really appreciated! Do you also have a copy of the related QEMU code?

Thanks,
Lizhen
