public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v11 0/8] PCI: Linux kernel SR-IOV support
@ 2009-03-11  7:25 Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 1/8] PCI: initialize and release SR-IOV capability Yu Zhao
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Greetings,

Following patches are intended to support SR-IOV capability in the
Linux kernel. With these patches, people can turn a PCI device with
the capability into multiple ones from software perspective, which
will benefit KVM and achieve other purposes such as QoS, security,
and etc.

SR-IOV specification can be found at:
  http://www.pcisig.com/members/downloads/specifications/iov/sr-iov1.0_11Sep07.pdf
(it requires membership.)

Devices that support SR-IOV are available from following vendors:
  http://download.intel.com/design/network/ProdBrf/320025.pdf
  http://www.myri.com/vlsi/Lanai_Z8ES_Datasheet.pdf
  http://www.neterion.com/products/pdfs/X3100ProductBrief.pdf

The patches to enable the SR-IOV capability of Intel 82576 NIC are
available at (a.k.a Physical Function driver):
  http://patchwork.kernel.org/patch/8063/
  http://patchwork.kernel.org/patch/8064/
  http://patchwork.kernel.org/patch/8065/
  http://patchwork.kernel.org/patch/8066/
And the driver for Intel 82576 Virtual Function are available at:
  http://patchwork.kernel.org/patch/11029/
  http://patchwork.kernel.org/patch/11028/


Major changes from v10 to v11:
  1, use pci_setup_device() to setup Virtual Function (Matthew Wilcox)
  2, various coding style fixes (Matthew Wilcox)
  3, wording and grammar fixes (Randy Dunlap)

  v9 -> v10:
  1, minor fix in pci_restore_iov_state().
  2, respin against the latest tree.

  v8 -> v9:
  1, put a might_sleep() into SR-IOV API which sleeps (Andi Kleen)
  2, block user config accesses before clearing VF Enable bit (Matthew Wilcox)


Yu Zhao (8):
  PCI: initialize and release SR-IOV capability
  PCI: restore saved SR-IOV state
  PCI: reserve bus range for SR-IOV device
  PCI: centralize device setup code into pci_setup_device()
  PCI: add SR-IOV API for Physical Function driver
  PCI: handle SR-IOV Virtual Function Migration
  PCI: document SR-IOV sysfs entries
  PCI: manual for SR-IOV user and driver developer

 Documentation/ABI/testing/sysfs-bus-pci |   27 ++
 Documentation/DocBook/kernel-api.tmpl   |    1 +
 Documentation/PCI/pci-iov-howto.txt     |   99 +++++
 drivers/pci/Kconfig                     |   10 +
 drivers/pci/Makefile                    |    2 +
 drivers/pci/iov.c                       |  677 +++++++++++++++++++++++++++++++
 drivers/pci/pci.c                       |    8 +
 drivers/pci/pci.h                       |   53 +++
 drivers/pci/probe.c                     |   86 +++--
 include/linux/pci.h                     |   32 ++
 include/linux/pci_regs.h                |   33 ++
 11 files changed, 989 insertions(+), 39 deletions(-)
 create mode 100644 Documentation/PCI/pci-iov-howto.txt
 create mode 100644 drivers/pci/iov.c


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v11 1/8] PCI: initialize and release SR-IOV capability
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-19 19:53   ` Matthew Wilcox
  2009-03-11  7:25 ` [PATCH v11 2/8] PCI: restore saved SR-IOV state Yu Zhao
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

If a device has the SR-IOV capability, initialize it (set the ARI
Capable Hierarchy in the lowest numbered PF if necessary; calculate
the System Page Size for the VF MMIO, probe the VF Offset, Stride
and BARs). A lock for the VF bus allocation is also initialized if
a PF is the lowest numbered PF.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 drivers/pci/Kconfig      |   10 +++
 drivers/pci/Makefile     |    2 +
 drivers/pci/iov.c        |  182 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.c        |    7 ++
 drivers/pci/pci.h        |   37 +++++++++
 drivers/pci/probe.c      |    4 +
 include/linux/pci.h      |    9 ++
 include/linux/pci_regs.h |   33 ++++++++
 8 files changed, 284 insertions(+), 0 deletions(-)
 create mode 100644 drivers/pci/iov.c

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 2a4501d..25cf360 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -59,3 +59,13 @@ config HT_IRQ
 	   This allows native hypertransport devices to use interrupts.
 
 	   If unsure say Y.
+
+config PCI_IOV
+	bool "PCI IOV support"
+	depends on PCI
+	help
+	  PCI-SIG I/O Virtualization (IOV) Specifications support.
+	  Single Root IOV: allows the creation of virtual PCI devices
+	  that share the physical resources from a real device.
+
+	  When in doubt, say N.
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 3d07ce2..ba6af16 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -29,6 +29,8 @@ obj-$(CONFIG_DMAR) += dmar.o iova.o intel-iommu.o
 
 obj-$(CONFIG_INTR_REMAP) += dmar.o intr_remapping.o
 
+obj-$(CONFIG_PCI_IOV) += iov.o
+
 #
 # Some architectures use the generic PCI setup functions
 #
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
new file mode 100644
index 0000000..656216c
--- /dev/null
+++ b/drivers/pci/iov.c
@@ -0,0 +1,182 @@
+/*
+ * drivers/pci/iov.c
+ *
+ * Copyright (C) 2009 Intel Corporation, Yu Zhao <yu.zhao@intel.com>
+ *
+ * PCI Express I/O Virtualization (IOV) support.
+ *   Single Root IOV 1.0
+ */
+
+#include <linux/pci.h>
+#include <linux/mutex.h>
+#include <linux/string.h>
+#include <linux/delay.h>
+#include "pci.h"
+
+
+static int sriov_init(struct pci_dev *dev, int pos)
+{
+	int i;
+	int rc;
+	int nres;
+	u32 pgsz;
+	u16 ctrl, total, offset, stride;
+	struct pci_sriov *iov;
+	struct resource *res;
+	struct pci_dev *pdev;
+
+	if (dev->pcie_type != PCI_EXP_TYPE_RC_END &&
+	    dev->pcie_type != PCI_EXP_TYPE_ENDPOINT)
+		return -ENODEV;
+
+	pci_read_config_word(dev, pos + PCI_SRIOV_CTRL, &ctrl);
+	if (ctrl & PCI_SRIOV_CTRL_VFE) {
+		pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, 0);
+		ssleep(1);
+	}
+
+	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
+	if (!total)
+		return 0;
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
+		if (pdev->is_physfn)
+			break;
+	if (list_empty(&dev->bus->devices) || !pdev->is_physfn)
+		pdev = NULL;
+
+	ctrl = 0;
+	if (!pdev && pci_ari_enabled(dev->bus))
+		ctrl |= PCI_SRIOV_CTRL_ARI;
+
+	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
+	pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, total);
+	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
+	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
+	if (!offset || (total > 1 && !stride))
+		return -EIO;
+
+	pci_read_config_dword(dev, pos + PCI_SRIOV_SUP_PGSIZE, &pgsz);
+	i = PAGE_SHIFT > 12 ? PAGE_SHIFT - 12 : 0;
+	pgsz &= ~((1 << i) - 1);
+	if (!pgsz)
+		return -EIO;
+
+	pgsz &= ~(pgsz - 1);
+	pci_write_config_dword(dev, pos + PCI_SRIOV_SYS_PGSIZE, pgsz);
+
+	nres = 0;
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		res = dev->resource + PCI_IOV_RESOURCES + i;
+		i += __pci_read_base(dev, pci_bar_unknown, res,
+				     pos + PCI_SRIOV_BAR + i * 4);
+		if (!res->flags)
+			continue;
+		if (resource_size(res) & (PAGE_SIZE - 1)) {
+			rc = -EIO;
+			goto failed;
+		}
+		res->end = res->start + resource_size(res) * total - 1;
+		nres++;
+	}
+
+	iov = kzalloc(sizeof(*iov), GFP_KERNEL);
+	if (!iov) {
+		rc = -ENOMEM;
+		goto failed;
+	}
+
+	iov->pos = pos;
+	iov->nres = nres;
+	iov->ctrl = ctrl;
+	iov->total = total;
+	iov->offset = offset;
+	iov->stride = stride;
+	iov->pgsz = pgsz;
+	iov->self = dev;
+	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
+	pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, &iov->link);
+
+	if (pdev)
+		iov->dev = pci_dev_get(pdev);
+	else {
+		iov->dev = dev;
+		mutex_init(&iov->lock);
+	}
+
+	dev->sriov = iov;
+	dev->is_physfn = 1;
+
+	return 0;
+
+failed:
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		res = dev->resource + PCI_IOV_RESOURCES + i;
+		res->flags = 0;
+	}
+
+	return rc;
+}
+
+static void sriov_release(struct pci_dev *dev)
+{
+	if (dev == dev->sriov->dev)
+		mutex_destroy(&dev->sriov->lock);
+	else
+		pci_dev_put(dev->sriov->dev);
+
+	kfree(dev->sriov);
+	dev->sriov = NULL;
+}
+
+/**
+ * pci_iov_init - initialize the IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_init(struct pci_dev *dev)
+{
+	int pos;
+
+	if (!dev->is_pcie)
+		return -ENODEV;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV);
+	if (pos)
+		return sriov_init(dev, pos);
+
+	return -ENODEV;
+}
+
+/**
+ * pci_iov_release - release resources used by the IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_release(struct pci_dev *dev)
+{
+	if (dev->is_physfn)
+		sriov_release(dev);
+}
+
+/**
+ * pci_iov_resource_bar - get position of the SR-IOV BAR
+ * @dev: the PCI device
+ * @resno: the resource number
+ * @type: the BAR type to be filled in
+ *
+ * Returns position of the BAR encapsulated in the SR-IOV capability.
+ */
+int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+			 enum pci_bar_type *type)
+{
+	if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCE_END)
+		return 0;
+
+	BUG_ON(!dev->is_physfn);
+
+	*type = pci_bar_unknown;
+
+	return dev->sriov->pos + PCI_SRIOV_BAR +
+		4 * (resno - PCI_IOV_RESOURCES);
+}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 6d61200..2eba2a5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2346,12 +2346,19 @@ int pci_select_bars(struct pci_dev *dev, unsigned long flags)
  */
 int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
 {
+	int reg;
+
 	if (resno < PCI_ROM_RESOURCE) {
 		*type = pci_bar_unknown;
 		return PCI_BASE_ADDRESS_0 + 4 * resno;
 	} else if (resno == PCI_ROM_RESOURCE) {
 		*type = pci_bar_mem32;
 		return dev->rom_base_reg;
+	} else if (resno < PCI_BRIDGE_RESOURCES) {
+		/* device specific resource */
+		reg = pci_iov_resource_bar(dev, resno, type);
+		if (reg)
+			return reg;
 	}
 
 	dev_err(&dev->dev, "BAR: invalid resource #%d\n", resno);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 07c0aa5..196be5e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -195,4 +195,41 @@ static inline int pci_ari_enabled(struct pci_bus *bus)
 	return bus->self && bus->self->ari_enabled;
 }
 
+/* Single Root I/O Virtualization */
+struct pci_sriov {
+	int pos;		/* capability position */
+	int nres;		/* number of resources */
+	u32 cap;		/* SR-IOV Capabilities */
+	u16 ctrl;		/* SR-IOV Control */
+	u16 total;		/* total VFs associated with the PF */
+	u16 offset;		/* first VF Routing ID offset */
+	u16 stride;		/* following VF stride */
+	u32 pgsz;		/* page size for BAR alignment */
+	u8 link;		/* Function Dependency Link */
+	struct pci_dev *dev;	/* lowest numbered PF */
+	struct pci_dev *self;	/* this PF */
+	struct mutex lock;	/* lock for VF bus */
+};
+
+#ifdef CONFIG_PCI_IOV
+extern int pci_iov_init(struct pci_dev *dev);
+extern void pci_iov_release(struct pci_dev *dev);
+extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+				enum pci_bar_type *type);
+#else
+static inline int pci_iov_init(struct pci_dev *dev)
+{
+	return -ENODEV;
+}
+static inline void pci_iov_release(struct pci_dev *dev)
+
+{
+}
+static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+				       enum pci_bar_type *type)
+{
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 #endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 55ec44a..03b6f29 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -785,6 +785,7 @@ static int pci_setup_device(struct pci_dev * dev)
 static void pci_release_capabilities(struct pci_dev *dev)
 {
 	pci_vpd_release(dev);
+	pci_iov_release(dev);
 }
 
 /**
@@ -972,6 +973,9 @@ static void pci_init_capabilities(struct pci_dev *dev)
 
 	/* Alternative Routing-ID Forwarding */
 	pci_enable_ari(dev);
+
+	/* Single Root I/O Virtualization */
+	pci_iov_init(dev);
 }
 
 void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7bd624b..8eba820 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -93,6 +93,12 @@ enum {
 	/* #6: expansion ROM resource */
 	PCI_ROM_RESOURCE,
 
+	/* device specific resources */
+#ifdef CONFIG_PCI_IOV
+	PCI_IOV_RESOURCES,
+	PCI_IOV_RESOURCE_END = PCI_IOV_RESOURCES + PCI_SRIOV_NUM_BARS - 1,
+#endif
+
 	/* resources assigned to buses behind the bridge */
 #define PCI_BRIDGE_RESOURCE_NUM 4
 
@@ -180,6 +186,7 @@ struct pci_cap_saved_state {
 
 struct pcie_link_state;
 struct pci_vpd;
+struct pci_sriov;
 
 /*
  * The pci_dev structure is used to describe PCI devices.
@@ -257,6 +264,7 @@ struct pci_dev {
 	unsigned int	is_managed:1;
 	unsigned int	is_pcie:1;
 	unsigned int	state_saved:1;
+	unsigned int	is_physfn:1;
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
 
@@ -270,6 +278,7 @@ struct pci_dev {
 	struct list_head msi_list;
 #endif
 	struct pci_vpd *vpd;
+	struct pci_sriov *sriov;	/* SR-IOV capability related */
 };
 
 extern struct pci_dev *alloc_pci_dev(void);
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index 027815b..4ce5eb0 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -375,6 +375,7 @@
 #define  PCI_EXP_TYPE_UPSTREAM	0x5	/* Upstream Port */
 #define  PCI_EXP_TYPE_DOWNSTREAM 0x6	/* Downstream Port */
 #define  PCI_EXP_TYPE_PCI_BRIDGE 0x7	/* PCI/PCI-X Bridge */
+#define  PCI_EXP_TYPE_RC_END	0x9	/* Root Complex Integrated Endpoint */
 #define PCI_EXP_FLAGS_SLOT	0x0100	/* Slot implemented */
 #define PCI_EXP_FLAGS_IRQ	0x3e00	/* Interrupt message number */
 #define PCI_EXP_DEVCAP		4	/* Device capabilities */
@@ -498,6 +499,7 @@
 #define PCI_EXT_CAP_ID_DSN	3
 #define PCI_EXT_CAP_ID_PWR	4
 #define PCI_EXT_CAP_ID_ARI	14
+#define PCI_EXT_CAP_ID_SRIOV	16
 
 /* Advanced Error Reporting */
 #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
@@ -615,4 +617,35 @@
 #define  PCI_ARI_CTRL_ACS	0x0002	/* ACS Function Groups Enable */
 #define  PCI_ARI_CTRL_FG(x)	(((x) >> 4) & 7) /* Function Group */
 
+/* Single Root I/O Virtualization */
+#define PCI_SRIOV_CAP		0x04	/* SR-IOV Capabilities */
+#define  PCI_SRIOV_CAP_VFM	0x01	/* VF Migration Capable */
+#define  PCI_SRIOV_CAP_INTR(x)	((x) >> 21) /* Interrupt Message Number */
+#define PCI_SRIOV_CTRL		0x08	/* SR-IOV Control */
+#define  PCI_SRIOV_CTRL_VFE	0x01	/* VF Enable */
+#define  PCI_SRIOV_CTRL_VFM	0x02	/* VF Migration Enable */
+#define  PCI_SRIOV_CTRL_INTR	0x04	/* VF Migration Interrupt Enable */
+#define  PCI_SRIOV_CTRL_MSE	0x08	/* VF Memory Space Enable */
+#define  PCI_SRIOV_CTRL_ARI	0x10	/* ARI Capable Hierarchy */
+#define PCI_SRIOV_STATUS	0x0a	/* SR-IOV Status */
+#define  PCI_SRIOV_STATUS_VFM	0x01	/* VF Migration Status */
+#define PCI_SRIOV_INITIAL_VF	0x0c	/* Initial VFs */
+#define PCI_SRIOV_TOTAL_VF	0x0e	/* Total VFs */
+#define PCI_SRIOV_NUM_VF	0x10	/* Number of VFs */
+#define PCI_SRIOV_FUNC_LINK	0x12	/* Function Dependency Link */
+#define PCI_SRIOV_VF_OFFSET	0x14	/* First VF Offset */
+#define PCI_SRIOV_VF_STRIDE	0x16	/* Following VF Stride */
+#define PCI_SRIOV_VF_DID	0x1a	/* VF Device ID */
+#define PCI_SRIOV_SUP_PGSIZE	0x1c	/* Supported Page Sizes */
+#define PCI_SRIOV_SYS_PGSIZE	0x20	/* System Page Size */
+#define PCI_SRIOV_BAR		0x24	/* VF BAR0 */
+#define  PCI_SRIOV_NUM_BARS	6	/* Number of VF BARs */
+#define PCI_SRIOV_VFM		0x3c	/* VF Migration State Array Offset*/
+#define  PCI_SRIOV_VFM_BIR(x)	((x) & 7)	/* State BIR */
+#define  PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7)	/* State Offset */
+#define  PCI_SRIOV_VFM_UA	0x0	/* Inactive.Unavailable */
+#define  PCI_SRIOV_VFM_MI	0x1	/* Dormant.MigrateIn */
+#define  PCI_SRIOV_VFM_MO	0x2	/* Active.MigrateOut */
+#define  PCI_SRIOV_VFM_AV	0x3	/* Active.Available */
+
 #endif /* LINUX_PCI_REGS_H */
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 2/8] PCI: restore saved SR-IOV state
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 1/8] PCI: initialize and release SR-IOV capability Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 3/8] PCI: reserve bus range for SR-IOV device Yu Zhao
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Restore the volatile registers in the SR-IOV capability after the
D3->D0 transition.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 drivers/pci/iov.c |   29 +++++++++++++++++++++++++++++
 drivers/pci/pci.c |    1 +
 drivers/pci/pci.h |    4 ++++
 3 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 656216c..8df2246 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -129,6 +129,25 @@ static void sriov_release(struct pci_dev *dev)
 	dev->sriov = NULL;
 }
 
+static void sriov_restore_state(struct pci_dev *dev)
+{
+	int i;
+	u16 ctrl;
+	struct pci_sriov *iov = dev->sriov;
+
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &ctrl);
+	if (ctrl & PCI_SRIOV_CTRL_VFE)
+		return;
+
+	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++)
+		pci_update_resource(dev, i);
+
+	pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	if (iov->ctrl & PCI_SRIOV_CTRL_VFE)
+		msleep(100);
+}
+
 /**
  * pci_iov_init - initialize the IOV capability
  * @dev: the PCI device
@@ -180,3 +199,13 @@ int pci_iov_resource_bar(struct pci_dev *dev, int resno,
 	return dev->sriov->pos + PCI_SRIOV_BAR +
 		4 * (resno - PCI_IOV_RESOURCES);
 }
+
+/**
+ * pci_restore_iov_state - restore the state of the IOV capability
+ * @dev: the PCI device
+ */
+void pci_restore_iov_state(struct pci_dev *dev)
+{
+	if (dev->is_physfn)
+		sriov_restore_state(dev);
+}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2eba2a5..8e21912 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -773,6 +773,7 @@ pci_restore_state(struct pci_dev *dev)
 	}
 	pci_restore_pcix_state(dev);
 	pci_restore_msi_state(dev);
+	pci_restore_iov_state(dev);
 
 	return 0;
 }
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 196be5e..efd79a2 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -216,6 +216,7 @@ extern int pci_iov_init(struct pci_dev *dev);
 extern void pci_iov_release(struct pci_dev *dev);
 extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
 				enum pci_bar_type *type);
+extern void pci_restore_iov_state(struct pci_dev *dev);
 #else
 static inline int pci_iov_init(struct pci_dev *dev)
 {
@@ -230,6 +231,9 @@ static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
 {
 	return 0;
 }
+static inline void pci_restore_iov_state(struct pci_dev *dev)
+{
+}
 #endif /* CONFIG_PCI_IOV */
 
 #endif /* DRIVERS_PCI_H */
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 3/8] PCI: reserve bus range for SR-IOV device
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 1/8] PCI: initialize and release SR-IOV capability Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 2/8] PCI: restore saved SR-IOV state Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 4/8] PCI: centralize device setup code Yu Zhao
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Reserve the bus number range used by the Virtual Function when
pcibios_assign_all_busses() returns true.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 drivers/pci/iov.c   |   36 ++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h   |    5 +++++
 drivers/pci/probe.c |    3 +++
 3 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 8df2246..fb8fab1 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -14,6 +14,18 @@
 #include "pci.h"
 
 
+static inline u8 virtfn_bus(struct pci_dev *dev, int id)
+{
+	return dev->bus->number + ((dev->devfn + dev->sriov->offset +
+				    dev->sriov->stride * id) >> 8);
+}
+
+static inline u8 virtfn_devfn(struct pci_dev *dev, int id)
+{
+	return (dev->devfn + dev->sriov->offset +
+		dev->sriov->stride * id) & 0xff;
+}
+
 static int sriov_init(struct pci_dev *dev, int pos)
 {
 	int i;
@@ -209,3 +221,27 @@ void pci_restore_iov_state(struct pci_dev *dev)
 	if (dev->is_physfn)
 		sriov_restore_state(dev);
 }
+
+/**
+ * pci_iov_bus_range - find bus range used by Virtual Function
+ * @bus: the PCI bus
+ *
+ * Returns max number of buses (exclude current one) used by Virtual
+ * Functions.
+ */
+int pci_iov_bus_range(struct pci_bus *bus)
+{
+	int max = 0;
+	u8 busnr;
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		if (!dev->is_physfn)
+			continue;
+		busnr = virtfn_bus(dev, dev->sriov->total - 1);
+		if (busnr > max)
+			max = busnr;
+	}
+
+	return max ? max - bus->number : 0;
+}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index efd79a2..7abdef6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -217,6 +217,7 @@ extern void pci_iov_release(struct pci_dev *dev);
 extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
 				enum pci_bar_type *type);
 extern void pci_restore_iov_state(struct pci_dev *dev);
+extern int pci_iov_bus_range(struct pci_bus *bus);
 #else
 static inline int pci_iov_init(struct pci_dev *dev)
 {
@@ -234,6 +235,10 @@ static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
 static inline void pci_restore_iov_state(struct pci_dev *dev)
 {
 }
+static inline int pci_iov_bus_range(struct pci_bus *bus)
+{
+	return 0;
+}
 #endif /* CONFIG_PCI_IOV */
 
 #endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 03b6f29..4c8abd0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1078,6 +1078,9 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
 	for (devfn = 0; devfn < 0x100; devfn += 8)
 		pci_scan_slot(bus, devfn);
 
+	/* Reserve buses for SR-IOV capability. */
+	max += pci_iov_bus_range(bus);
+
 	/*
 	 * After performing arch-dependent fixup of the bus, look behind
 	 * all PCI-to-PCI bridges on this bus.
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 4/8] PCI: centralize device setup code
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
                   ` (2 preceding siblings ...)
  2009-03-11  7:25 ` [PATCH v11 3/8] PCI: reserve bus range for SR-IOV device Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 5/8] PCI: add SR-IOV API for Physical Function driver Yu Zhao
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Move the device setup stuff into pci_setup_device() which will be used
to setup the Virtual Function later.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   79 ++++++++++++++++++++++++++-------------------------
 2 files changed, 41 insertions(+), 39 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 7abdef6..80ad848 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -178,6 +178,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
+extern int pci_setup_device(struct pci_dev *dev);
 extern int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 				struct resource *res, unsigned int reg);
 extern int pci_resource_bar(struct pci_dev *dev, int resno,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4c8abd0..f4ca550 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -674,6 +674,19 @@ static void pci_read_irq(struct pci_dev *dev)
 	dev->irq = irq;
 }
 
+static void set_pcie_port_type(struct pci_dev *pdev)
+{
+	int pos;
+	u16 reg16;
+
+	pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+	if (!pos)
+		return;
+	pdev->is_pcie = 1;
+	pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
+	pdev->pcie_type = (reg16 & PCI_EXP_FLAGS_TYPE) >> 4;
+}
+
 #define LEGACY_IO_RESOURCE	(IORESOURCE_IO | IORESOURCE_PCI_FIXED)
 
 /**
@@ -683,12 +696,34 @@ static void pci_read_irq(struct pci_dev *dev)
  * Initialize the device structure with information about the device's 
  * vendor,class,memory and IO-space addresses,IRQ lines etc.
  * Called at initialisation of the PCI subsystem and by CardBus services.
- * Returns 0 on success and -1 if unknown type of device (not normal, bridge
- * or CardBus).
+ * Returns 0 on success and negative if unknown type of device (not normal,
+ * bridge or CardBus).
  */
-static int pci_setup_device(struct pci_dev * dev)
+int pci_setup_device(struct pci_dev *dev)
 {
 	u32 class;
+	u8 hdr_type;
+	struct pci_slot *slot;
+
+	if (pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type))
+		return -EIO;
+
+	dev->sysdata = dev->bus->sysdata;
+	dev->dev.parent = dev->bus->bridge;
+	dev->dev.bus = &pci_bus_type;
+	dev->hdr_type = hdr_type & 0x7f;
+	dev->multifunction = !!(hdr_type & 0x80);
+	dev->cfg_size = pci_cfg_space_size(dev);
+	dev->error_state = pci_channel_io_normal;
+	set_pcie_port_type(dev);
+
+	list_for_each_entry(slot, &dev->bus->slots, list)
+		if (PCI_SLOT(dev->devfn) == slot->number)
+			dev->slot = slot;
+
+	/* Assume 32-bit PCI; let 64-bit PCI cards (which are far rarer)
+	   set this higher, assuming the system even supports it.  */
+	dev->dma_mask = 0xffffffff;
 
 	dev_set_name(&dev->dev, "%04x:%02x:%02x.%d", pci_domain_nr(dev->bus),
 		     dev->bus->number, PCI_SLOT(dev->devfn),
@@ -708,7 +743,6 @@ static int pci_setup_device(struct pci_dev * dev)
 
 	/* Early fixups, before probing the BARs */
 	pci_fixup_device(pci_fixup_early, dev);
-	class = dev->class >> 8;
 
 	switch (dev->hdr_type) {		    /* header type */
 	case PCI_HEADER_TYPE_NORMAL:		    /* standard header */
@@ -770,7 +804,7 @@ static int pci_setup_device(struct pci_dev * dev)
 	default:				    /* unknown header */
 		dev_err(&dev->dev, "unknown header type %02x, "
 			"ignoring device\n", dev->hdr_type);
-		return -1;
+		return -EIO;
 
 	bad:
 		dev_err(&dev->dev, "ignoring class %02x (doesn't match header "
@@ -804,19 +838,6 @@ static void pci_release_dev(struct device *dev)
 	kfree(pci_dev);
 }
 
-static void set_pcie_port_type(struct pci_dev *pdev)
-{
-	int pos;
-	u16 reg16;
-
-	pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
-	if (!pos)
-		return;
-	pdev->is_pcie = 1;
-	pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
-	pdev->pcie_type = (reg16 & PCI_EXP_FLAGS_TYPE) >> 4;
-}
-
 /**
  * pci_cfg_space_size - get the configuration space size of the PCI device.
  * @dev: PCI device
@@ -892,9 +913,7 @@ EXPORT_SYMBOL(alloc_pci_dev);
 static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn)
 {
 	struct pci_dev *dev;
-	struct pci_slot *slot;
 	u32 l;
-	u8 hdr_type;
 	int delay = 1;
 
 	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, &l))
@@ -921,34 +940,16 @@ static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn)
 		}
 	}
 
-	if (pci_bus_read_config_byte(bus, devfn, PCI_HEADER_TYPE, &hdr_type))
-		return NULL;
-
 	dev = alloc_pci_dev();
 	if (!dev)
 		return NULL;
 
 	dev->bus = bus;
-	dev->sysdata = bus->sysdata;
-	dev->dev.parent = bus->bridge;
-	dev->dev.bus = &pci_bus_type;
 	dev->devfn = devfn;
-	dev->hdr_type = hdr_type & 0x7f;
-	dev->multifunction = !!(hdr_type & 0x80);
 	dev->vendor = l & 0xffff;
 	dev->device = (l >> 16) & 0xffff;
-	dev->cfg_size = pci_cfg_space_size(dev);
-	dev->error_state = pci_channel_io_normal;
-	set_pcie_port_type(dev);
-
-	list_for_each_entry(slot, &bus->slots, list)
-		if (PCI_SLOT(devfn) == slot->number)
-			dev->slot = slot;
 
-	/* Assume 32-bit PCI; let 64-bit PCI cards (which are far rarer)
-	   set this higher, assuming the system even supports it.  */
-	dev->dma_mask = 0xffffffff;
-	if (pci_setup_device(dev) < 0) {
+	if (pci_setup_device(dev)) {
 		kfree(dev);
 		return NULL;
 	}
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 5/8] PCI: add SR-IOV API for Physical Function driver
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
                   ` (3 preceding siblings ...)
  2009-03-11  7:25 ` [PATCH v11 4/8] PCI: centralize device setup code Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 6/8] PCI: handle SR-IOV Virtual Function Migration Yu Zhao
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Add or remove the Virtual Function when the SR-IOV is enabled or
disabled by the device driver. This can happen anytime rather than
only at the device probe stage.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 drivers/pci/iov.c   |  314 +++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h   |    2 +
 include/linux/pci.h |   19 +++-
 3 files changed, 334 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index fb8fab1..0a3af12 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -13,6 +13,7 @@
 #include <linux/delay.h>
 #include "pci.h"
 
+#define VIRTFN_ID_LEN	16
 
 static inline u8 virtfn_bus(struct pci_dev *dev, int id)
 {
@@ -26,6 +27,284 @@ static inline u8 virtfn_devfn(struct pci_dev *dev, int id)
 		dev->sriov->stride * id) & 0xff;
 }
 
+static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
+{
+	int rc;
+	struct pci_bus *child;
+
+	if (bus->number == busnr)
+		return bus;
+
+	child = pci_find_bus(pci_domain_nr(bus), busnr);
+	if (child)
+		return child;
+
+	child = pci_add_new_bus(bus, NULL, busnr);
+	if (!child)
+		return NULL;
+
+	child->subordinate = busnr;
+	child->dev.parent = bus->bridge;
+	rc = pci_bus_add_child(child);
+	if (rc) {
+		pci_remove_bus(child);
+		return NULL;
+	}
+
+	return child;
+}
+
+static void virtfn_remove_bus(struct pci_bus *bus, int busnr)
+{
+	struct pci_bus *child;
+
+	if (bus->number == busnr)
+		return;
+
+	child = pci_find_bus(pci_domain_nr(bus), busnr);
+	BUG_ON(!child);
+
+	if (list_empty(&child->devices))
+		pci_remove_bus(child);
+}
+
+static int virtfn_add(struct pci_dev *dev, int id, int reset)
+{
+	int i;
+	int rc;
+	u64 size;
+	char buf[VIRTFN_ID_LEN];
+	struct pci_dev *virtfn;
+	struct resource *res;
+	struct pci_sriov *iov = dev->sriov;
+
+	virtfn = alloc_pci_dev();
+	if (!virtfn)
+		return -ENOMEM;
+
+	mutex_lock(&iov->dev->sriov->lock);
+	virtfn->bus = virtfn_add_bus(dev->bus, virtfn_bus(dev, id));
+	if (!virtfn->bus) {
+		kfree(virtfn);
+		mutex_unlock(&iov->dev->sriov->lock);
+		return -ENOMEM;
+	}
+	virtfn->devfn = virtfn_devfn(dev, id);
+	virtfn->vendor = dev->vendor;
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_DID, &virtfn->device);
+	pci_setup_device(virtfn);
+	virtfn->dev.parent = dev->dev.parent;
+
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		res = dev->resource + PCI_IOV_RESOURCES + i;
+		if (!res->parent)
+			continue;
+		virtfn->resource[i].name = pci_name(virtfn);
+		virtfn->resource[i].flags = res->flags;
+		size = resource_size(res);
+		do_div(size, iov->total);
+		virtfn->resource[i].start = res->start + size * id;
+		virtfn->resource[i].end = virtfn->resource[i].start + size - 1;
+		rc = request_resource(res, &virtfn->resource[i]);
+		BUG_ON(rc);
+	}
+
+	if (reset)
+		pci_execute_reset_function(virtfn);
+
+	pci_device_add(virtfn, virtfn->bus);
+	mutex_unlock(&iov->dev->sriov->lock);
+
+	virtfn->physfn = pci_dev_get(dev);
+	virtfn->is_virtfn = 1;
+
+	rc = pci_bus_add_device(virtfn);
+	if (rc)
+		goto failed1;
+	sprintf(buf, "virtfn%u", id);
+	rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf);
+	if (rc)
+		goto failed1;
+	rc = sysfs_create_link(&virtfn->dev.kobj, &dev->dev.kobj, "physfn");
+	if (rc)
+		goto failed2;
+
+	kobject_uevent(&virtfn->dev.kobj, KOBJ_CHANGE);
+
+	return 0;
+
+failed2:
+	sysfs_remove_link(&dev->dev.kobj, buf);
+failed1:
+	pci_dev_put(dev);
+	mutex_lock(&iov->dev->sriov->lock);
+	pci_remove_bus_device(virtfn);
+	virtfn_remove_bus(dev->bus, virtfn_bus(dev, id));
+	mutex_unlock(&iov->dev->sriov->lock);
+
+	return rc;
+}
+
+static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+{
+	char buf[VIRTFN_ID_LEN];
+	struct pci_bus *bus;
+	struct pci_dev *virtfn;
+	struct pci_sriov *iov = dev->sriov;
+
+	bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id));
+	if (!bus)
+		return;
+
+	virtfn = pci_get_slot(bus, virtfn_devfn(dev, id));
+	if (!virtfn)
+		return;
+
+	pci_dev_put(virtfn);
+
+	if (reset) {
+		device_release_driver(&virtfn->dev);
+		pci_execute_reset_function(virtfn);
+	}
+
+	sprintf(buf, "virtfn%u", id);
+	sysfs_remove_link(&dev->dev.kobj, buf);
+	sysfs_remove_link(&virtfn->dev.kobj, "physfn");
+
+	mutex_lock(&iov->dev->sriov->lock);
+	pci_remove_bus_device(virtfn);
+	virtfn_remove_bus(dev->bus, virtfn_bus(dev, id));
+	mutex_unlock(&iov->dev->sriov->lock);
+
+	pci_dev_put(dev);
+}
+
+static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
+{
+	int rc;
+	int i, j;
+	int nres;
+	u16 offset, stride, initial;
+	struct resource *res;
+	struct pci_dev *pdev;
+	struct pci_sriov *iov = dev->sriov;
+
+	if (!nr_virtfn)
+		return 0;
+
+	if (iov->nr_virtfn)
+		return -EINVAL;
+
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_INITIAL_VF, &initial);
+	if (initial > iov->total ||
+	    (!(iov->cap & PCI_SRIOV_CAP_VFM) && (initial != iov->total)))
+		return -EIO;
+
+	if (nr_virtfn < 0 || nr_virtfn > iov->total ||
+	    (!(iov->cap & PCI_SRIOV_CAP_VFM) && (nr_virtfn > initial)))
+		return -EINVAL;
+
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn);
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &offset);
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &stride);
+	if (!offset || (nr_virtfn > 1 && !stride))
+		return -EIO;
+
+	nres = 0;
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		res = dev->resource + PCI_IOV_RESOURCES + i;
+		if (res->parent)
+			nres++;
+	}
+	if (nres != iov->nres) {
+		dev_err(&dev->dev, "not enough MMIO resources for SR-IOV\n");
+		return -ENOMEM;
+	}
+
+	iov->offset = offset;
+	iov->stride = stride;
+
+	if (virtfn_bus(dev, nr_virtfn - 1) > dev->bus->subordinate) {
+		dev_err(&dev->dev, "SR-IOV: bus number out of range\n");
+		return -ENOMEM;
+	}
+
+	if (iov->link != dev->devfn) {
+		pdev = pci_get_slot(dev->bus, iov->link);
+		if (!pdev)
+			return -ENODEV;
+
+		pci_dev_put(pdev);
+
+		if (!pdev->is_physfn)
+			return -ENODEV;
+
+		rc = sysfs_create_link(&dev->dev.kobj,
+					&pdev->dev.kobj, "dep_link");
+		if (rc)
+			return rc;
+	}
+
+	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	pci_block_user_cfg_access(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	msleep(100);
+	pci_unblock_user_cfg_access(dev);
+
+	iov->initial = initial;
+	if (nr_virtfn < initial)
+		initial = nr_virtfn;
+
+	for (i = 0; i < initial; i++) {
+		rc = virtfn_add(dev, i, 0);
+		if (rc)
+			goto failed;
+	}
+
+	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
+	iov->nr_virtfn = nr_virtfn;
+
+	return 0;
+
+failed:
+	for (j = 0; j < i; j++)
+		virtfn_remove(dev, j, 0);
+
+	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	pci_block_user_cfg_access(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	ssleep(1);
+	pci_unblock_user_cfg_access(dev);
+
+	if (iov->link != dev->devfn)
+		sysfs_remove_link(&dev->dev.kobj, "dep_link");
+
+	return rc;
+}
+
+static void sriov_disable(struct pci_dev *dev)
+{
+	int i;
+	struct pci_sriov *iov = dev->sriov;
+
+	if (!iov->nr_virtfn)
+		return;
+
+	for (i = 0; i < iov->nr_virtfn; i++)
+		virtfn_remove(dev, i, 0);
+
+	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	pci_block_user_cfg_access(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	ssleep(1);
+	pci_unblock_user_cfg_access(dev);
+
+	if (iov->link != dev->devfn)
+		sysfs_remove_link(&dev->dev.kobj, "dep_link");
+
+	iov->nr_virtfn = 0;
+}
+
 static int sriov_init(struct pci_dev *dev, int pos)
 {
 	int i;
@@ -132,6 +411,8 @@ failed:
 
 static void sriov_release(struct pci_dev *dev)
 {
+	BUG_ON(dev->sriov->nr_virtfn);
+
 	if (dev == dev->sriov->dev)
 		mutex_destroy(&dev->sriov->lock);
 	else
@@ -155,6 +436,7 @@ static void sriov_restore_state(struct pci_dev *dev)
 		pci_update_resource(dev, i);
 
 	pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, iov->nr_virtfn);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	if (iov->ctrl & PCI_SRIOV_CTRL_VFE)
 		msleep(100);
@@ -245,3 +527,35 @@ int pci_iov_bus_range(struct pci_bus *bus)
 
 	return max ? max - bus->number : 0;
 }
+
+/**
+ * pci_enable_sriov - enable the SR-IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+{
+	might_sleep();
+
+	if (!dev->is_physfn)
+		return -ENODEV;
+
+	return sriov_enable(dev, nr_virtfn);
+}
+EXPORT_SYMBOL_GPL(pci_enable_sriov);
+
+/**
+ * pci_disable_sriov - disable the SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_disable_sriov(struct pci_dev *dev)
+{
+	might_sleep();
+
+	if (!dev->is_physfn)
+		return;
+
+	sriov_disable(dev);
+}
+EXPORT_SYMBOL_GPL(pci_disable_sriov);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 80ad848..1bdace3 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -203,6 +203,8 @@ struct pci_sriov {
 	u32 cap;		/* SR-IOV Capabilities */
 	u16 ctrl;		/* SR-IOV Control */
 	u16 total;		/* total VFs associated with the PF */
+	u16 initial;		/* initial VFs associated with the PF */
+	u16 nr_virtfn;		/* number of VFs available */
 	u16 offset;		/* first VF Routing ID offset */
 	u16 stride;		/* following VF stride */
 	u32 pgsz;		/* page size for BAR alignment */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8eba820..a40d19d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -265,6 +265,7 @@ struct pci_dev {
 	unsigned int	is_pcie:1;
 	unsigned int	state_saved:1;
 	unsigned int	is_physfn:1;
+	unsigned int	is_virtfn:1;
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
 
@@ -278,7 +279,10 @@ struct pci_dev {
 	struct list_head msi_list;
 #endif
 	struct pci_vpd *vpd;
-	struct pci_sriov *sriov;	/* SR-IOV capability related */
+	union {
+		struct pci_sriov *sriov;	/* SR-IOV capability related */
+		struct pci_dev *physfn;	/* the PF this VF is associated with */
+	};
 };
 
 extern struct pci_dev *alloc_pci_dev(void);
@@ -1203,5 +1207,18 @@ int pci_ext_cfg_avail(struct pci_dev *dev);
 
 void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
 
+#ifdef CONFIG_PCI_IOV
+extern int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
+extern void pci_disable_sriov(struct pci_dev *dev);
+#else
+static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+{
+	return -ENODEV;
+}
+static inline void pci_disable_sriov(struct pci_dev *dev)
+{
+}
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 6/8] PCI: handle SR-IOV Virtual Function Migration
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
                   ` (4 preceding siblings ...)
  2009-03-11  7:25 ` [PATCH v11 5/8] PCI: add SR-IOV API for Physical Function driver Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 7/8] PCI: document SR-IOV sysfs entries Yu Zhao
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Add or remove a Virtual Function after receiving a Migrate In or Out
Request.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 drivers/pci/iov.c   |  119 +++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h   |    4 ++
 include/linux/pci.h |    6 +++
 3 files changed, 129 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 0a3af12..213fb61 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -179,6 +179,97 @@ static void virtfn_remove(struct pci_dev *dev, int id, int reset)
 	pci_dev_put(dev);
 }
 
+static int sriov_migration(struct pci_dev *dev)
+{
+	u16 status;
+	struct pci_sriov *iov = dev->sriov;
+
+	if (!iov->nr_virtfn)
+		return 0;
+
+	if (!(iov->cap & PCI_SRIOV_CAP_VFM))
+		return 0;
+
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_STATUS, &status);
+	if (!(status & PCI_SRIOV_STATUS_VFM))
+		return 0;
+
+	schedule_work(&iov->mtask);
+
+	return 1;
+}
+
+static void sriov_migration_task(struct work_struct *work)
+{
+	int i;
+	u8 state;
+	u16 status;
+	struct pci_sriov *iov = container_of(work, struct pci_sriov, mtask);
+
+	for (i = iov->initial; i < iov->nr_virtfn; i++) {
+		state = readb(iov->mstate + i);
+		if (state == PCI_SRIOV_VFM_MI) {
+			writeb(PCI_SRIOV_VFM_AV, iov->mstate + i);
+			state = readb(iov->mstate + i);
+			if (state == PCI_SRIOV_VFM_AV)
+				virtfn_add(iov->self, i, 1);
+		} else if (state == PCI_SRIOV_VFM_MO) {
+			virtfn_remove(iov->self, i, 1);
+			writeb(PCI_SRIOV_VFM_UA, iov->mstate + i);
+			state = readb(iov->mstate + i);
+			if (state == PCI_SRIOV_VFM_AV)
+				virtfn_add(iov->self, i, 0);
+		}
+	}
+
+	pci_read_config_word(iov->self, iov->pos + PCI_SRIOV_STATUS, &status);
+	status &= ~PCI_SRIOV_STATUS_VFM;
+	pci_write_config_word(iov->self, iov->pos + PCI_SRIOV_STATUS, status);
+}
+
+static int sriov_enable_migration(struct pci_dev *dev, int nr_virtfn)
+{
+	int bir;
+	u32 table;
+	resource_size_t pa;
+	struct pci_sriov *iov = dev->sriov;
+
+	if (nr_virtfn <= iov->initial)
+		return 0;
+
+	pci_read_config_dword(dev, iov->pos + PCI_SRIOV_VFM, &table);
+	bir = PCI_SRIOV_VFM_BIR(table);
+	if (bir > PCI_STD_RESOURCE_END)
+		return -EIO;
+
+	table = PCI_SRIOV_VFM_OFFSET(table);
+	if (table + nr_virtfn > pci_resource_len(dev, bir))
+		return -EIO;
+
+	pa = pci_resource_start(dev, bir) + table;
+	iov->mstate = ioremap(pa, nr_virtfn);
+	if (!iov->mstate)
+		return -ENOMEM;
+
+	INIT_WORK(&iov->mtask, sriov_migration_task);
+
+	iov->ctrl |= PCI_SRIOV_CTRL_VFM | PCI_SRIOV_CTRL_INTR;
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+
+	return 0;
+}
+
+static void sriov_disable_migration(struct pci_dev *dev)
+{
+	struct pci_sriov *iov = dev->sriov;
+
+	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFM | PCI_SRIOV_CTRL_INTR);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+
+	cancel_work_sync(&iov->mtask);
+	iounmap(iov->mstate);
+}
+
 static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 {
 	int rc;
@@ -261,6 +352,12 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 			goto failed;
 	}
 
+	if (iov->cap & PCI_SRIOV_CAP_VFM) {
+		rc = sriov_enable_migration(dev, nr_virtfn);
+		if (rc)
+			goto failed;
+	}
+
 	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
 	iov->nr_virtfn = nr_virtfn;
 
@@ -290,6 +387,9 @@ static void sriov_disable(struct pci_dev *dev)
 	if (!iov->nr_virtfn)
 		return;
 
+	if (iov->cap & PCI_SRIOV_CAP_VFM)
+		sriov_disable_migration(dev);
+
 	for (i = 0; i < iov->nr_virtfn; i++)
 		virtfn_remove(dev, i, 0);
 
@@ -559,3 +659,22 @@ void pci_disable_sriov(struct pci_dev *dev)
 	sriov_disable(dev);
 }
 EXPORT_SYMBOL_GPL(pci_disable_sriov);
+
+/**
+ * pci_sriov_migration - notify SR-IOV core of Virtual Function Migration
+ * @dev: the PCI device
+ *
+ * Returns IRQ_HANDLED if the IRQ is handled, or IRQ_NONE if not.
+ *
+ * Physical Function driver is responsible to register IRQ handler using
+ * VF Migration Interrupt Message Number, and call this function when the
+ * interrupt is generated by the hardware.
+ */
+irqreturn_t pci_sriov_migration(struct pci_dev *dev)
+{
+	if (!dev->is_physfn)
+		return IRQ_NONE;
+
+	return sriov_migration(dev) ? IRQ_HANDLED : IRQ_NONE;
+}
+EXPORT_SYMBOL_GPL(pci_sriov_migration);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1bdace3..dd7c63f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -1,6 +1,8 @@
 #ifndef DRIVERS_PCI_H
 #define DRIVERS_PCI_H
 
+#include <linux/workqueue.h>
+
 #define PCI_CFG_SPACE_SIZE	256
 #define PCI_CFG_SPACE_EXP_SIZE	4096
 
@@ -212,6 +214,8 @@ struct pci_sriov {
 	struct pci_dev *dev;	/* lowest numbered PF */
 	struct pci_dev *self;	/* this PF */
 	struct mutex lock;	/* lock for VF bus */
+	struct work_struct mtask; /* VF Migration task */
+	u8 __iomem *mstate;	/* VF Migration State Array */
 };
 
 #ifdef CONFIG_PCI_IOV
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a40d19d..baf833f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -52,6 +52,7 @@
 #include <asm/atomic.h>
 #include <linux/device.h>
 #include <linux/io.h>
+#include <linux/irqreturn.h>
 
 /* Include the ID list */
 #include <linux/pci_ids.h>
@@ -1210,6 +1211,7 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
 #ifdef CONFIG_PCI_IOV
 extern int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 extern void pci_disable_sriov(struct pci_dev *dev);
+extern irqreturn_t pci_sriov_migration(struct pci_dev *dev);
 #else
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 {
@@ -1218,6 +1220,10 @@ static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 static inline void pci_disable_sriov(struct pci_dev *dev)
 {
 }
+static inline irqreturn_t pci_sriov_migration(struct pci_dev *dev)
+{
+	return IRQ_NONE;
+}
 #endif
 
 #endif /* __KERNEL__ */
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 7/8] PCI: document SR-IOV sysfs entries
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
                   ` (5 preceding siblings ...)
  2009-03-11  7:25 ` [PATCH v11 6/8] PCI: handle SR-IOV Virtual Function Migration Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-11  7:25 ` [PATCH v11 8/8] PCI: manual for SR-IOV user and driver developer Yu Zhao
  2009-03-17  1:55 ` [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-pci |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index e638e15..36edf03 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -52,3 +52,30 @@ Description:
 		that some devices may have malformatted data.  If the
 		underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
+
+What:		/sys/bus/pci/devices/.../virtfnN
+Date:		March 2009
+Contact:	Yu Zhao <yu.zhao@intel.com>
+Description:
+		This symbolic link appears when hardware supports the SR-IOV
+		capability and the Physical Function driver has enabled it.
+		The symbolic link points to the PCI device sysfs entry of the
+		Virtual Function whose index is N (0...MaxVFs-1).
+
+What:		/sys/bus/pci/devices/.../dep_link
+Date:		March 2009
+Contact:	Yu Zhao <yu.zhao@intel.com>
+Description:
+		This symbolic link appears when hardware supports the SR-IOV
+		capability and the Physical Function driver has enabled it,
+		and this device has vendor specific dependencies with others.
+		The symbolic link points to the PCI device sysfs entry of
+		Physical Function this device depends on.
+
+What:		/sys/bus/pci/devices/.../physfn
+Date:		March 2009
+Contact:	Yu Zhao <yu.zhao@intel.com>
+Description:
+		This symbolic link appears when a device is a Virtual Function.
+		The symbolic link points to the PCI device sysfs entry of the
+		Physical Function this device associates with.
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 8/8] PCI: manual for SR-IOV user and driver developer
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
                   ` (6 preceding siblings ...)
  2009-03-11  7:25 ` [PATCH v11 7/8] PCI: document SR-IOV sysfs entries Yu Zhao
@ 2009-03-11  7:25 ` Yu Zhao
  2009-03-17  1:55 ` [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-11  7:25 UTC (permalink / raw)
  To: jbarnes; +Cc: linux-pci, kvm, linux-kernel, Yu Zhao

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
 Documentation/DocBook/kernel-api.tmpl |    1 +
 Documentation/PCI/pci-iov-howto.txt   |   99 +++++++++++++++++++++++++++++++++
 2 files changed, 100 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/PCI/pci-iov-howto.txt

diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index bc962cd..58c1945 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -199,6 +199,7 @@ X!Edrivers/pci/hotplug.c
 -->
 !Edrivers/pci/probe.c
 !Edrivers/pci/rom.c
+!Edrivers/pci/iov.c
      </sect1>
      <sect1><title>PCI Hotplug Support Library</title>
 !Edrivers/pci/hotplug/pci_hotplug_core.c
diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.txt
new file mode 100644
index 0000000..fc73ef5
--- /dev/null
+++ b/Documentation/PCI/pci-iov-howto.txt
@@ -0,0 +1,99 @@
+		PCI Express I/O Virtualization Howto
+		Copyright (C) 2009 Intel Corporation
+		    Yu Zhao <yu.zhao@intel.com>
+
+
+1. Overview
+
+1.1 What is SR-IOV
+
+Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
+capability which makes one physical device appear as multiple virtual
+devices. The physical device is referred to as Physical Function (PF)
+while the virtual devices are referred to as Virtual Functions (VF).
+Allocation of the VF can be dynamically controlled by the PF via
+registers encapsulated in the capability. By default, this feature is
+not enabled and the PF behaves as traditional PCIe device. Once it's
+turned on, each VF's PCI configuration space can be accessed by its own
+Bus, Device and Function Number (Routing ID). And each VF also has PCI
+Memory Space, which is used to map its register set. VF device driver
+operates on the register set so it can be functional and appear as a
+real existing PCI device.
+
+2. User Guide
+
+2.1 How can I enable SR-IOV capability
+
+The device driver (PF driver) will control the enabling and disabling
+of the capability via API provided by SR-IOV core. If the hardware
+has SR-IOV capability, loading its PF driver would enable it and all
+VFs associated with the PF.
+
+2.2 How can I use the Virtual Functions
+
+The VF is treated as hot-plugged PCI devices in the kernel, so they
+should be able to work in the same way as real PCI devices. The VF
+requires device driver that is same as a normal PCI device's.
+
+3. Developer Guide
+
+3.1 SR-IOV API
+
+To enable SR-IOV capability:
+	int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
+	'nr_virtfn' is number of VFs to be enabled.
+
+To disable SR-IOV capability:
+	void pci_disable_sriov(struct pci_dev *dev);
+
+To notify SR-IOV core of Virtual Function Migration:
+	irqreturn_t pci_sriov_migration(struct pci_dev *dev);
+
+3.2 Usage example
+
+Following piece of code illustrates the usage of the SR-IOV API.
+
+static int __devinit dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
+{
+	pci_enable_sriov(dev, NR_VIRTFN);
+
+	...
+
+	return 0;
+}
+
+static void __devexit dev_remove(struct pci_dev *dev)
+{
+	pci_disable_sriov(dev);
+
+	...
+}
+
+static int dev_suspend(struct pci_dev *dev, pm_message_t state)
+{
+	...
+
+	return 0;
+}
+
+static int dev_resume(struct pci_dev *dev)
+{
+	...
+
+	return 0;
+}
+
+static void dev_shutdown(struct pci_dev *dev)
+{
+	...
+}
+
+static struct pci_driver dev_driver = {
+	.name =		"SR-IOV Physical Function driver",
+	.id_table =	dev_id_table,
+	.probe =	dev_probe,
+	.remove =	__devexit_p(dev_remove),
+	.suspend =	dev_suspend,
+	.resume =	dev_resume,
+	.shutdown =	dev_shutdown,
+};
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 0/8] PCI: Linux kernel SR-IOV support
  2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
                   ` (7 preceding siblings ...)
  2009-03-11  7:25 ` [PATCH v11 8/8] PCI: manual for SR-IOV user and driver developer Yu Zhao
@ 2009-03-17  1:55 ` Yu Zhao
  8 siblings, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-17  1:55 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

Hi Matthew,

Can you please take a look at this new version? I'd like to make sure
that all concerns are addressed and I didn't miss something :-)

Thanks,
Yu

On Wed, Mar 11, 2009 at 03:25:41PM +0800, Yu Zhao wrote:
> Greetings,
> 
> Following patches are intended to support SR-IOV capability in the
> Linux kernel. With these patches, people can turn a PCI device with
> the capability into multiple ones from software perspective, which
> will benefit KVM and achieve other purposes such as QoS, security,
> and etc.
> 
> SR-IOV specification can be found at:
>   http://www.pcisig.com/members/downloads/specifications/iov/sr-iov1.0_11Sep07.pdf
> (it requires membership.)
> 
> Devices that support SR-IOV are available from following vendors:
>   http://download.intel.com/design/network/ProdBrf/320025.pdf
>   http://www.myri.com/vlsi/Lanai_Z8ES_Datasheet.pdf
>   http://www.neterion.com/products/pdfs/X3100ProductBrief.pdf
> 
> The patches to enable the SR-IOV capability of Intel 82576 NIC are
> available at (a.k.a Physical Function driver):
>   http://patchwork.kernel.org/patch/8063/
>   http://patchwork.kernel.org/patch/8064/
>   http://patchwork.kernel.org/patch/8065/
>   http://patchwork.kernel.org/patch/8066/
> And the driver for Intel 82576 Virtual Function are available at:
>   http://patchwork.kernel.org/patch/11029/
>   http://patchwork.kernel.org/patch/11028/
> 
> 
> Major changes from v10 to v11:
>   1, use pci_setup_device() to setup Virtual Function (Matthew Wilcox)
>   2, various coding style fixes (Matthew Wilcox)
>   3, wording and grammar fixes (Randy Dunlap)
> 
>   v9 -> v10:
>   1, minor fix in pci_restore_iov_state().
>   2, respin against the latest tree.
> 
>   v8 -> v9:
>   1, put a might_sleep() into SR-IOV API which sleeps (Andi Kleen)
>   2, block user config accesses before clearing VF Enable bit (Matthew Wilcox)
> 
> 
> Yu Zhao (8):
>   PCI: initialize and release SR-IOV capability
>   PCI: restore saved SR-IOV state
>   PCI: reserve bus range for SR-IOV device
>   PCI: centralize device setup code into pci_setup_device()
>   PCI: add SR-IOV API for Physical Function driver
>   PCI: handle SR-IOV Virtual Function Migration
>   PCI: document SR-IOV sysfs entries
>   PCI: manual for SR-IOV user and driver developer
> 
>  Documentation/ABI/testing/sysfs-bus-pci |   27 ++
>  Documentation/DocBook/kernel-api.tmpl   |    1 +
>  Documentation/PCI/pci-iov-howto.txt     |   99 +++++
>  drivers/pci/Kconfig                     |   10 +
>  drivers/pci/Makefile                    |    2 +
>  drivers/pci/iov.c                       |  677 +++++++++++++++++++++++++++++++
>  drivers/pci/pci.c                       |    8 +
>  drivers/pci/pci.h                       |   53 +++
>  drivers/pci/probe.c                     |   86 +++--
>  include/linux/pci.h                     |   32 ++
>  include/linux/pci_regs.h                |   33 ++
>  11 files changed, 989 insertions(+), 39 deletions(-)
>  create mode 100644 Documentation/PCI/pci-iov-howto.txt
>  create mode 100644 drivers/pci/iov.c

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 1/8] PCI: initialize and release SR-IOV capability
  2009-03-11  7:25 ` [PATCH v11 1/8] PCI: initialize and release SR-IOV capability Yu Zhao
@ 2009-03-19 19:53   ` Matthew Wilcox
  2009-03-20  1:20     ` Jesse Barnes
  2009-03-20  2:06     ` Yu Zhao
  0 siblings, 2 replies; 15+ messages in thread
From: Matthew Wilcox @ 2009-03-19 19:53 UTC (permalink / raw)
  To: Yu Zhao; +Cc: jbarnes, linux-pci, kvm, linux-kernel

On Wed, Mar 11, 2009 at 03:25:42PM +0800, Yu Zhao wrote:
> +config PCI_IOV
> +	bool "PCI IOV support"
> +	depends on PCI
> +	help
> +	  PCI-SIG I/O Virtualization (IOV) Specifications support.
> +	  Single Root IOV: allows the creation of virtual PCI devices
> +	  that share the physical resources from a real device.
> +
> +	  When in doubt, say N.

It's certainly shorter than my text, which is nice.  But I think it
still has too much spec-ese and not enough explanation.  How about:

	help
	  I/O Virtualization is a PCI feature supported by some devices
	  which allows them to create virtual devices which share their
	  physical resources.

	  If unsure, say N.

> +	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> +		if (pdev->is_physfn)
> +			break;
> +	if (list_empty(&dev->bus->devices) || !pdev->is_physfn)
> +		pdev = NULL;

This is still wrong.  If the 'break' condition is not hit, pdev is
pointing to garbage, not to the last pci_dev in the list.

> @@ -270,6 +278,7 @@ struct pci_dev {
>  	struct list_head msi_list;
>  #endif
>  	struct pci_vpd *vpd;
> +	struct pci_sriov *sriov;	/* SR-IOV capability related */

Should be ifdeffed?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 1/8] PCI: initialize and release SR-IOV capability
  2009-03-19 19:53   ` Matthew Wilcox
@ 2009-03-20  1:20     ` Jesse Barnes
  2009-03-20  1:42       ` Matthew Wilcox
  2009-03-20  3:28       ` Zhao, Yu
  2009-03-20  2:06     ` Yu Zhao
  1 sibling, 2 replies; 15+ messages in thread
From: Jesse Barnes @ 2009-03-20  1:20 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Yu Zhao, linux-pci, kvm, linux-kernel

On Thu, 19 Mar 2009 13:53:12 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> On Wed, Mar 11, 2009 at 03:25:42PM +0800, Yu Zhao wrote:
> > +config PCI_IOV
> > +	bool "PCI IOV support"
> > +	depends on PCI
> > +	help
> > +	  PCI-SIG I/O Virtualization (IOV) Specifications support.
> > +	  Single Root IOV: allows the creation of virtual PCI
> > devices
> > +	  that share the physical resources from a real device.
> > +
> > +	  When in doubt, say N.
> 
> It's certainly shorter than my text, which is nice.  But I think it
> still has too much spec-ese and not enough explanation.  How about:
> 
> 	help
> 	  I/O Virtualization is a PCI feature supported by some
> devices which allows them to create virtual devices which share their
> 	  physical resources.
> 
> 	  If unsure, say N.
> 
> > +	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> > +		if (pdev->is_physfn)
> > +			break;
> > +	if (list_empty(&dev->bus->devices) || !pdev->is_physfn)
> > +		pdev = NULL;
> 
> This is still wrong.  If the 'break' condition is not hit, pdev is
> pointing to garbage, not to the last pci_dev in the list.
> 
> > @@ -270,6 +278,7 @@ struct pci_dev {
> >  	struct list_head msi_list;
> >  #endif
> >  	struct pci_vpd *vpd;
> > +	struct pci_sriov *sriov;	/* SR-IOV capability
> > related */
> 
> Should be ifdeffed?

Ok Yu, I'm ready to apply this set, can you send an updated one with
the fixes Matthew mentioned?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 1/8] PCI: initialize and release SR-IOV capability
  2009-03-20  1:20     ` Jesse Barnes
@ 2009-03-20  1:42       ` Matthew Wilcox
  2009-03-20  3:28       ` Zhao, Yu
  1 sibling, 0 replies; 15+ messages in thread
From: Matthew Wilcox @ 2009-03-20  1:42 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Yu Zhao, linux-pci, kvm, linux-kernel

On Thu, Mar 19, 2009 at 06:20:16PM -0700, Jesse Barnes wrote:
> Ok Yu, I'm ready to apply this set, can you send an updated one with
> the fixes Matthew mentioned?

And please add Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
to each of the patches.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 1/8] PCI: initialize and release SR-IOV capability
  2009-03-19 19:53   ` Matthew Wilcox
  2009-03-20  1:20     ` Jesse Barnes
@ 2009-03-20  2:06     ` Yu Zhao
  1 sibling, 0 replies; 15+ messages in thread
From: Yu Zhao @ 2009-03-20  2:06 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: jbarnes@virtuousgeek.org, linux-pci@vger.kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org

On Fri, Mar 20, 2009 at 03:53:12AM +0800, Matthew Wilcox wrote:
> On Wed, Mar 11, 2009 at 03:25:42PM +0800, Yu Zhao wrote:
> > +config PCI_IOV
> > +	bool "PCI IOV support"
> > +	depends on PCI
> > +	help
> > +	  PCI-SIG I/O Virtualization (IOV) Specifications support.
> > +	  Single Root IOV: allows the creation of virtual PCI devices
> > +	  that share the physical resources from a real device.
> > +
> > +	  When in doubt, say N.
> 
> It's certainly shorter than my text, which is nice.  But I think it
> still has too much spec-ese and not enough explanation.  How about:
> 
> 	help
> 	  I/O Virtualization is a PCI feature supported by some devices
> 	  which allows them to create virtual devices which share their
> 	  physical resources.
> 
> 	  If unsure, say N.

Yes, it's more user-friendly.

> > +	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> > +		if (pdev->is_physfn)
> > +			break;
> > +	if (list_empty(&dev->bus->devices) || !pdev->is_physfn)
> > +		pdev = NULL;
> 
> This is still wrong.  If the 'break' condition is not hit, pdev is
> pointing to garbage, not to the last pci_dev in the list.

Yes, you are right. I should think it over after you commented on it
last time.

So it looks like we need to make it as:

	ctrl = 0;
	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
		if (pdev->is_physfn)
			goto found;

	pdev = NULL;
	if (pci_ari_enabled(dev->bus))
		ctrl |= PCI_SRIOV_CTRL_ARI;

found:
	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
	...

> > @@ -270,6 +278,7 @@ struct pci_dev {
> >  	struct list_head msi_list;
> >  #endif
> >  	struct pci_vpd *vpd;
> > +	struct pci_sriov *sriov;	/* SR-IOV capability related */
> 
> Should be ifdeffed?

Yes, will do.


Thank you for reviewing it. The patch series was applied on Xen Domain0
tree 2 days ago, and I'll carry your comments back to Xen tree too.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 1/8] PCI: initialize and release SR-IOV capability
  2009-03-20  1:20     ` Jesse Barnes
  2009-03-20  1:42       ` Matthew Wilcox
@ 2009-03-20  3:28       ` Zhao, Yu
  1 sibling, 0 replies; 15+ messages in thread
From: Zhao, Yu @ 2009-03-20  3:28 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Matthew Wilcox, linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

Jesse Barnes wrote:
> On Thu, 19 Mar 2009 13:53:12 -0600
> Matthew Wilcox <matthew@wil.cx> wrote:
> 
>> On Wed, Mar 11, 2009 at 03:25:42PM +0800, Yu Zhao wrote:
>>> +config PCI_IOV
>>> +	bool "PCI IOV support"
>>> +	depends on PCI
>>> +	help
>>> +	  PCI-SIG I/O Virtualization (IOV) Specifications support.
>>> +	  Single Root IOV: allows the creation of virtual PCI
>>> devices
>>> +	  that share the physical resources from a real device.
>>> +
>>> +	  When in doubt, say N.
>> It's certainly shorter than my text, which is nice.  But I think it
>> still has too much spec-ese and not enough explanation.  How about:
>>
>> 	help
>> 	  I/O Virtualization is a PCI feature supported by some
>> devices which allows them to create virtual devices which share their
>> 	  physical resources.
>>
>> 	  If unsure, say N.
>>
>>> +	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>>> +		if (pdev->is_physfn)
>>> +			break;
>>> +	if (list_empty(&dev->bus->devices) || !pdev->is_physfn)
>>> +		pdev = NULL;
>> This is still wrong.  If the 'break' condition is not hit, pdev is
>> pointing to garbage, not to the last pci_dev in the list.
>>
>>> @@ -270,6 +278,7 @@ struct pci_dev {
>>>  	struct list_head msi_list;
>>>  #endif
>>>  	struct pci_vpd *vpd;
>>> +	struct pci_sriov *sriov;	/* SR-IOV capability
>>> related */
>> Should be ifdeffed?
> 
> Ok Yu, I'm ready to apply this set, can you send an updated one with
> the fixes Matthew mentioned?

Thanks, Jesse. I updated this one according to Matthew's comments and 
respan others in patchset v12 so they can be cleanly applied.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2009-03-20  3:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-11  7:25 [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao
2009-03-11  7:25 ` [PATCH v11 1/8] PCI: initialize and release SR-IOV capability Yu Zhao
2009-03-19 19:53   ` Matthew Wilcox
2009-03-20  1:20     ` Jesse Barnes
2009-03-20  1:42       ` Matthew Wilcox
2009-03-20  3:28       ` Zhao, Yu
2009-03-20  2:06     ` Yu Zhao
2009-03-11  7:25 ` [PATCH v11 2/8] PCI: restore saved SR-IOV state Yu Zhao
2009-03-11  7:25 ` [PATCH v11 3/8] PCI: reserve bus range for SR-IOV device Yu Zhao
2009-03-11  7:25 ` [PATCH v11 4/8] PCI: centralize device setup code Yu Zhao
2009-03-11  7:25 ` [PATCH v11 5/8] PCI: add SR-IOV API for Physical Function driver Yu Zhao
2009-03-11  7:25 ` [PATCH v11 6/8] PCI: handle SR-IOV Virtual Function Migration Yu Zhao
2009-03-11  7:25 ` [PATCH v11 7/8] PCI: document SR-IOV sysfs entries Yu Zhao
2009-03-11  7:25 ` [PATCH v11 8/8] PCI: manual for SR-IOV user and driver developer Yu Zhao
2009-03-17  1:55 ` [PATCH v11 0/8] PCI: Linux kernel SR-IOV support Yu Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox