* Re: igbvf: add new driver to support 82576 virtual functions
From: David Miller @ 2009-03-25 6:38 UTC (permalink / raw)
To: alexander.h.duyck; +Cc: jeffrey.t.kirsher, netdev
This breaks the build:
drivers/net/igbvf/ethtool.c: In function 'igbvf_set_ringparam':
drivers/net/igbvf/ethtool.c:299: error: implicit declaration of function 'vmalloc'
drivers/net/igbvf/ethtool.c:299: warning: assignment makes pointer from integer without a cast
drivers/net/igbvf/ethtool.c:346: error: implicit declaration of function 'vfree'
* Re: igbvf: add new driver to support 82576 virtual functions
From: Jeff Kirsher @ 2009-03-25 8:45 UTC (permalink / raw)
To: David Miller, Yu Zhao; +Cc: alexander.h.duyck, netdev
On Tue, Mar 24, 2009 at 11:38 PM, David Miller <davem@davemloft.net> wrote:
>
> This breaks the build:
>
> drivers/net/igbvf/ethtool.c: In function 'igbvf_set_ringparam':
> drivers/net/igbvf/ethtool.c:299: error: implicit declaration of function 'vmalloc'
> drivers/net/igbvf/ethtool.c:299: warning: assignment makes pointer from integer without a cast
> drivers/net/igbvf/ethtool.c:346: error: implicit declaration of function 'vfree'
> --
Sorry Dave, I thought this was called out earlier, but I see it was
not. The igbvf driver requires the following patches to be applied to
your tree in order to compile. Last I heard, these SR-IOV patches had
been accepted for 2.6.30 in the PCI tree.
Yu, can you confirm that these patches have been accepted for 2.6.30?
Summary:
http://marc.info/?l=linux-kernel&m=123751955907397&w=2
Patches:
http://marc.info/?l=linux-kernel&m=123751956107417&w=2
http://marc.info/?l=linux-kernel&m=123751956207423&w=2
http://marc.info/?l=linux-kernel&m=123751981407629&w=2
http://marc.info/?l=linux-kernel&m=123751981507632&w=2
http://marc.info/?l=linux-kernel&m=123751981707644&w=2
http://marc.info/?l=linux-kernel&m=123751981607635&w=2
http://marc.info/?l=linux-kernel&m=123751981607638&w=2
http://marc.info/?l=linux-kernel&m=123751981707641&w=2
--
Cheers,
Jeff
[-- Attachment #2: 0008-PCI-save-and-restore-PCIe-2.0-registers.patch --]
[-- Type: application/octet-stream, Size: 2552 bytes --]
From f477163648953229f477ed14c5bfe8b90602ec1a Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH] PCI: save and restore PCIe 2.0 registers
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/pci.c | 11 ++++++++++-
include/linux/pci_regs.h | 2 ++
2 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8e21912..d849668 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -645,6 +645,8 @@ pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state)
EXPORT_SYMBOL(pci_choose_state);
+#define PCI_EXP_SAVE_REGS 7
+
static int pci_save_pcie_state(struct pci_dev *dev)
{
int pos, i = 0;
@@ -666,6 +668,9 @@ static int pci_save_pcie_state(struct pci_dev *dev)
pci_read_config_word(dev, pos + PCI_EXP_LNKCTL, &cap[i++]);
pci_read_config_word(dev, pos + PCI_EXP_SLTCTL, &cap[i++]);
pci_read_config_word(dev, pos + PCI_EXP_RTCTL, &cap[i++]);
+ pci_read_config_word(dev, pos + PCI_EXP_DEVCTL2, &cap[i++]);
+ pci_read_config_word(dev, pos + PCI_EXP_LNKCTL2, &cap[i++]);
+ pci_read_config_word(dev, pos + PCI_EXP_SLTCTL2, &cap[i++]);
return 0;
}
@@ -686,6 +691,9 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
pci_write_config_word(dev, pos + PCI_EXP_LNKCTL, cap[i++]);
pci_write_config_word(dev, pos + PCI_EXP_SLTCTL, cap[i++]);
pci_write_config_word(dev, pos + PCI_EXP_RTCTL, cap[i++]);
+ pci_write_config_word(dev, pos + PCI_EXP_DEVCTL2, cap[i++]);
+ pci_write_config_word(dev, pos + PCI_EXP_LNKCTL2, cap[i++]);
+ pci_write_config_word(dev, pos + PCI_EXP_SLTCTL2, cap[i++]);
}
@@ -1370,7 +1378,8 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
{
int error;
- error = pci_add_cap_save_buffer(dev, PCI_CAP_ID_EXP, 4 * sizeof(u16));
+ error = pci_add_cap_save_buffer(dev, PCI_CAP_ID_EXP,
+ PCI_EXP_SAVE_REGS * sizeof(u16));
if (error)
dev_err(&dev->dev,
"unable to preallocate PCI Express save buffer\n");
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index 4ce5eb0..196e202 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -488,6 +488,8 @@
#define PCI_EXP_DEVCAP2_ARI 0x20 /* Alternative Routing-ID */
#define PCI_EXP_DEVCTL2 40 /* Device Control 2 */
#define PCI_EXP_DEVCTL2_ARI 0x20 /* Alternative Routing-ID */
+#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
+#define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
/* Extended Capabilities (PCI-X 2.0 and Express) */
#define PCI_EXT_CAP_ID(header) (header & 0x0000ffff)
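The save/restore pattern in the patch above can be tried out in user space. What follows is only a minimal sketch, not kernel code: struct fake_dev, cfg_read16(), cfg_write16() and roundtrip_ok() are hypothetical stand-ins invented for illustration, and the capability position 0x40 is arbitrary. The register offsets are the ones from include/linux/pci_regs.h, including the three added by this patch. The point it illustrates is that save and restore must walk the same register list in the same order, and that the buffer must hold all seven words (hence PCI_EXP_SAVE_REGS).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets inside the PCIe capability: DEVCTL, LNKCTL, SLTCTL, RTCTL,
 * then the three PCIe 2.0 registers added by the patch. */
static const int saved_regs[] = { 8, 16, 24, 28, 40, 48, 56 };
#define NUM_SAVED (sizeof(saved_regs) / sizeof(saved_regs[0]))

/* Hypothetical stand-in for a device's config space (not a kernel type). */
struct fake_dev {
	uint8_t cfg[256];
};

static uint16_t cfg_read16(const struct fake_dev *d, int off)
{
	assert(off >= 0 && off + 1 < (int)sizeof(d->cfg));
	return (uint16_t)(d->cfg[off] | (d->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_dev *d, int off, uint16_t v)
{
	assert(off >= 0 && off + 1 < (int)sizeof(d->cfg));
	d->cfg[off] = (uint8_t)(v & 0xff);
	d->cfg[off + 1] = (uint8_t)(v >> 8);
}

/* Save every listed register into cap[], in list order. */
static void save_pcie_state(const struct fake_dev *d, int pos, uint16_t *cap)
{
	for (size_t i = 0; i < NUM_SAVED; i++)
		cap[i] = cfg_read16(d, pos + saved_regs[i]);
}

/* Write the saved words back, walking the list in the same order. */
static void restore_pcie_state(struct fake_dev *d, int pos, const uint16_t *cap)
{
	for (size_t i = 0; i < NUM_SAVED; i++)
		cfg_write16(d, pos + saved_regs[i], cap[i]);
}

/* Round trip: save, simulate a reset that clears config space, restore. */
static int roundtrip_ok(void)
{
	struct fake_dev d;
	uint16_t cap[NUM_SAVED];
	int pos = 0x40;	/* arbitrary capability position for the demo */

	memset(&d, 0, sizeof(d));
	cfg_write16(&d, pos + 48, 0x0042);	/* Link Control 2 */
	save_pcie_state(&d, pos, cap);
	memset(&d, 0, sizeof(d));
	restore_pcie_state(&d, pos, cap);
	return cfg_read16(&d, pos + 48) == 0x0042;
}
```

Driving both directions from one table is a design choice of this sketch; the patch itself keeps the two open-coded lists, which works as long as they stay in step.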
[-- Attachment #3: 0001-PCI-initialize-and-release-SR-IOV-capability.patch --]
[-- Type: application/octet-stream, Size: 12364 bytes --]
From aca164666d7d1acdc93438d2289e59df7b49f41d Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 1/7] PCI: initialize and release SR-IOV capability
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/Kconfig | 13 +++
drivers/pci/Makefile | 3 +
drivers/pci/iov.c | 181 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.c | 7 ++
drivers/pci/pci.h | 37 +++++++++
drivers/pci/probe.c | 4 +
include/linux/pci.h | 8 ++
include/linux/pci_regs.h | 33 ++++++++
8 files changed, 286 insertions(+), 0 deletions(-)
create mode 100644 drivers/pci/iov.c
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 2a4501d..e8ea3e8 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -59,3 +59,16 @@ config HT_IRQ
This allows native hypertransport devices to use interrupts.
If unsure say Y.
+
+config PCI_IOV
+ bool "PCI IOV support"
+ depends on PCI
+ select PCI_MSI
+ default n
+ help
+ PCI-SIG I/O Virtualization (IOV) Specifications support.
+ Single Root IOV: allows the Physical Function driver to enable
+ the hardware capability, so the Virtual Function is accessible
+ via the PCI Configuration Space using its own Bus, Device and
+ Function Numbers. Each Virtual Function also has the PCI Memory
+ Space to map the device specific register set.
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 3d07ce2..ba99282 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -29,6 +29,9 @@ obj-$(CONFIG_DMAR) += dmar.o iova.o intel-iommu.o
obj-$(CONFIG_INTR_REMAP) += dmar.o intr_remapping.o
+# PCI IOV support
+obj-$(CONFIG_PCI_IOV) += iov.o
+
#
# Some architectures use the generic PCI setup functions
#
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
new file mode 100644
index 0000000..e6736d4
--- /dev/null
+++ b/drivers/pci/iov.c
@@ -0,0 +1,181 @@
+/*
+ * drivers/pci/iov.c
+ *
+ * Copyright (C) 2009 Intel Corporation, Yu Zhao <yu.zhao@intel.com>
+ *
+ * PCI Express I/O Virtualization (IOV) support.
+ * Single Root IOV 1.0
+ */
+
+#include <linux/pci.h>
+#include <linux/mutex.h>
+#include <linux/string.h>
+#include <linux/delay.h>
+#include "pci.h"
+
+
+static int sriov_init(struct pci_dev *dev, int pos)
+{
+ int i;
+ int rc;
+ int nres;
+ u32 pgsz;
+ u16 ctrl, total, offset, stride;
+ struct pci_sriov *iov;
+ struct resource *res;
+ struct pci_dev *pdev;
+
+ if (dev->pcie_type != PCI_EXP_TYPE_RC_END &&
+ dev->pcie_type != PCI_EXP_TYPE_ENDPOINT)
+ return -ENODEV;
+
+ pci_read_config_word(dev, pos + PCI_SRIOV_CTRL, &ctrl);
+ if (ctrl & PCI_SRIOV_CTRL_VFE) {
+ pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, 0);
+ ssleep(1);
+ }
+
+ pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
+ if (!total)
+ return 0;
+
+ list_for_each_entry(pdev, &dev->bus->devices, bus_list)
+ if (pdev->sriov)
+ break;
+ if (list_empty(&dev->bus->devices) || !pdev->sriov)
+ pdev = NULL;
+
+ ctrl = 0;
+ if (!pdev && pci_ari_enabled(dev->bus))
+ ctrl |= PCI_SRIOV_CTRL_ARI;
+
+ pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
+ pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, total);
+ pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
+ pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
+ if (!offset || (total > 1 && !stride))
+ return -EIO;
+
+ pci_read_config_dword(dev, pos + PCI_SRIOV_SUP_PGSIZE, &pgsz);
+ i = PAGE_SHIFT > 12 ? PAGE_SHIFT - 12 : 0;
+ pgsz &= ~((1 << i) - 1);
+ if (!pgsz)
+ return -EIO;
+
+ pgsz &= ~(pgsz - 1);
+ pci_write_config_dword(dev, pos + PCI_SRIOV_SYS_PGSIZE, pgsz);
+
+ nres = 0;
+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+ res = dev->resource + PCI_SRIOV_RESOURCES + i;
+ i += __pci_read_base(dev, pci_bar_unknown, res,
+ pos + PCI_SRIOV_BAR + i * 4);
+ if (!res->flags)
+ continue;
+ if (resource_size(res) & (PAGE_SIZE - 1)) {
+ rc = -EIO;
+ goto failed;
+ }
+ res->end = res->start + resource_size(res) * total - 1;
+ nres++;
+ }
+
+ iov = kzalloc(sizeof(*iov), GFP_KERNEL);
+ if (!iov) {
+ rc = -ENOMEM;
+ goto failed;
+ }
+
+ iov->pos = pos;
+ iov->nres = nres;
+ iov->ctrl = ctrl;
+ iov->total = total;
+ iov->offset = offset;
+ iov->stride = stride;
+ iov->pgsz = pgsz;
+ iov->self = dev;
+ pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
+ pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, &iov->link);
+
+ if (pdev)
+ iov->pdev = pci_dev_get(pdev);
+ else {
+ iov->pdev = dev;
+ mutex_init(&iov->lock);
+ }
+
+ dev->sriov = iov;
+
+ return 0;
+
+failed:
+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+ res = dev->resource + PCI_SRIOV_RESOURCES + i;
+ res->flags = 0;
+ }
+
+ return rc;
+}
+
+static void sriov_release(struct pci_dev *dev)
+{
+ if (dev == dev->sriov->pdev)
+ mutex_destroy(&dev->sriov->lock);
+ else
+ pci_dev_put(dev->sriov->pdev);
+
+ kfree(dev->sriov);
+ dev->sriov = NULL;
+}
+
+/**
+ * pci_iov_init - initialize the IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_init(struct pci_dev *dev)
+{
+ int pos;
+
+ if (!dev->is_pcie)
+ return -ENODEV;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV);
+ if (pos)
+ return sriov_init(dev, pos);
+
+ return -ENODEV;
+}
+
+/**
+ * pci_iov_release - release resources used by the IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_release(struct pci_dev *dev)
+{
+ if (dev->sriov)
+ sriov_release(dev);
+}
+
+/**
+ * pci_iov_resource_bar - get position of the SR-IOV BAR
+ * @dev: the PCI device
+ * @resno: the resource number
+ * @type: the BAR type to be filled in
+ *
+ * Returns position of the BAR encapsulated in the SR-IOV capability.
+ */
+int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type)
+{
+ if (resno < PCI_SRIOV_RESOURCES || resno > PCI_SRIOV_RESOURCE_END)
+ return 0;
+
+ BUG_ON(!dev->sriov);
+
+ *type = pci_bar_unknown;
+
+ return dev->sriov->pos + PCI_SRIOV_BAR +
+ 4 * (resno - PCI_SRIOV_RESOURCES);
+}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 6d61200..2eba2a5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2346,12 +2346,19 @@ int pci_select_bars(struct pci_dev *dev, unsigned long flags)
*/
int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
{
+ int reg;
+
if (resno < PCI_ROM_RESOURCE) {
*type = pci_bar_unknown;
return PCI_BASE_ADDRESS_0 + 4 * resno;
} else if (resno == PCI_ROM_RESOURCE) {
*type = pci_bar_mem32;
return dev->rom_base_reg;
+ } else if (resno < PCI_BRIDGE_RESOURCES) {
+ /* device specific resource */
+ reg = pci_iov_resource_bar(dev, resno, type);
+ if (reg)
+ return reg;
}
dev_err(&dev->dev, "BAR: invalid resource #%d\n", resno);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 07c0aa5..451db74 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -195,4 +195,41 @@ static inline int pci_ari_enabled(struct pci_bus *bus)
return bus->self && bus->self->ari_enabled;
}
+/* Single Root I/O Virtualization */
+struct pci_sriov {
+ int pos; /* capability position */
+ int nres; /* number of resources */
+ u32 cap; /* SR-IOV Capabilities */
+ u16 ctrl; /* SR-IOV Control */
+ u16 total; /* total VFs associated with the PF */
+ u16 offset; /* first VF Routing ID offset */
+ u16 stride; /* following VF stride */
+ u32 pgsz; /* page size for BAR alignment */
+ u8 link; /* Function Dependency Link */
+ struct pci_dev *pdev; /* lowest numbered PF */
+ struct pci_dev *self; /* this PF */
+ struct mutex lock; /* lock for VF bus */
+};
+
+#ifdef CONFIG_PCI_IOV
+extern int pci_iov_init(struct pci_dev *dev);
+extern void pci_iov_release(struct pci_dev *dev);
+extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type);
+#else
+static inline int pci_iov_init(struct pci_dev *dev)
+{
+ return -ENODEV;
+}
+static inline void pci_iov_release(struct pci_dev *dev)
+
+{
+}
+static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type)
+{
+ return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
#endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 55ec44a..03b6f29 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -785,6 +785,7 @@ static int pci_setup_device(struct pci_dev * dev)
static void pci_release_capabilities(struct pci_dev *dev)
{
pci_vpd_release(dev);
+ pci_iov_release(dev);
}
/**
@@ -972,6 +973,9 @@ static void pci_init_capabilities(struct pci_dev *dev)
/* Alternative Routing-ID Forwarding */
pci_enable_ari(dev);
+
+ /* Single Root I/O Virtualization */
+ pci_iov_init(dev);
}
void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7bd624b..f4d740e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -93,6 +93,12 @@ enum {
/* #6: expansion ROM resource */
PCI_ROM_RESOURCE,
+ /* device specific resources */
+#ifdef CONFIG_PCI_IOV
+ PCI_SRIOV_RESOURCES,
+ PCI_SRIOV_RESOURCE_END = PCI_SRIOV_RESOURCES + PCI_SRIOV_NUM_BARS - 1,
+#endif
+
/* resources assigned to buses behind the bridge */
#define PCI_BRIDGE_RESOURCE_NUM 4
@@ -180,6 +186,7 @@ struct pci_cap_saved_state {
struct pcie_link_state;
struct pci_vpd;
+struct pci_sriov;
/*
* The pci_dev structure is used to describe PCI devices.
@@ -270,6 +277,7 @@ struct pci_dev {
struct list_head msi_list;
#endif
struct pci_vpd *vpd;
+ struct pci_sriov *sriov; /* SR-IOV capability related */
};
extern struct pci_dev *alloc_pci_dev(void);
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index 027815b..4ce5eb0 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -375,6 +375,7 @@
#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
+#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
#define PCI_EXP_DEVCAP 4 /* Device capabilities */
@@ -498,6 +499,7 @@
#define PCI_EXT_CAP_ID_DSN 3
#define PCI_EXT_CAP_ID_PWR 4
#define PCI_EXT_CAP_ID_ARI 14
+#define PCI_EXT_CAP_ID_SRIOV 16
/* Advanced Error Reporting */
#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
@@ -615,4 +617,35 @@
#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
+/* Single Root I/O Virtualization */
+#define PCI_SRIOV_CAP 0x04 /* SR-IOV Capabilities */
+#define PCI_SRIOV_CAP_VFM 0x01 /* VF Migration Capable */
+#define PCI_SRIOV_CAP_INTR(x) ((x) >> 21) /* Interrupt Message Number */
+#define PCI_SRIOV_CTRL 0x08 /* SR-IOV Control */
+#define PCI_SRIOV_CTRL_VFE 0x01 /* VF Enable */
+#define PCI_SRIOV_CTRL_VFM 0x02 /* VF Migration Enable */
+#define PCI_SRIOV_CTRL_INTR 0x04 /* VF Migration Interrupt Enable */
+#define PCI_SRIOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
+#define PCI_SRIOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
+#define PCI_SRIOV_STATUS 0x0a /* SR-IOV Status */
+#define PCI_SRIOV_STATUS_VFM 0x01 /* VF Migration Status */
+#define PCI_SRIOV_INITIAL_VF 0x0c /* Initial VFs */
+#define PCI_SRIOV_TOTAL_VF 0x0e /* Total VFs */
+#define PCI_SRIOV_NUM_VF 0x10 /* Number of VFs */
+#define PCI_SRIOV_FUNC_LINK 0x12 /* Function Dependency Link */
+#define PCI_SRIOV_VF_OFFSET 0x14 /* First VF Offset */
+#define PCI_SRIOV_VF_STRIDE 0x16 /* Following VF Stride */
+#define PCI_SRIOV_VF_DID 0x1a /* VF Device ID */
+#define PCI_SRIOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
+#define PCI_SRIOV_SYS_PGSIZE 0x20 /* System Page Size */
+#define PCI_SRIOV_BAR 0x24 /* VF BAR0 */
+#define PCI_SRIOV_NUM_BARS 6 /* Number of VF BARs */
+#define PCI_SRIOV_VFM 0x3c /* VF Migration State Array Offset*/
+#define PCI_SRIOV_VFM_BIR(x) ((x) & 7) /* State BIR */
+#define PCI_SRIOV_VFM_OFFSET(x) ((x) & ~7) /* State Offset */
+#define PCI_SRIOV_VFM_UA 0x0 /* Inactive.Unavailable */
+#define PCI_SRIOV_VFM_MI 0x1 /* Dormant.MigrateIn */
+#define PCI_SRIOV_VFM_MO 0x2 /* Active.MigrateOut */
+#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
+
#endif /* LINUX_PCI_REGS_H */
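The page-size selection in sriov_init() above is compact enough that a stand-alone sketch may help. This is a user-space illustration only; pick_sys_pgsize() is a hypothetical helper name, not something the patch defines. It assumes, as I read the SR-IOV specification, that bit n of Supported Page Sizes means a page size of 2^(n+12) bytes is supported. The function drops every size smaller than the system page, then keeps the lowest remaining bit, mirroring the two masking steps in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the value to program into System Page Size, or 0 if the
 * device supports no page size at or above the system's.
 */
static uint32_t pick_sys_pgsize(uint32_t supported, unsigned int page_shift)
{
	unsigned int i = page_shift > 12 ? page_shift - 12 : 0;

	assert(i < 32);	/* shifting a 32-bit value by 32 or more is undefined */

	supported &= ~((UINT32_C(1) << i) - 1);	/* drop sizes below the system page */
	if (!supported)
		return 0;

	return supported & ~(supported - 1);	/* isolate the lowest set bit */
}
```

For instance, with a bitmap of 0x553 and 4 KiB pages (page_shift 12) the result is 0x1, while with 64 KiB pages (page_shift 16) bits 0 to 3 are masked off and the result is 0x10.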
[-- Attachment #4: 0002-PCI-restore-saved-SR-IOV-state.patch --]
[-- Type: application/octet-stream, Size: 2580 bytes --]
From eb89221b31c2812626f505bdde4f19116c5a8773 Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 2/7] PCI: restore saved SR-IOV state
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/iov.c | 29 +++++++++++++++++++++++++++++
drivers/pci/pci.c | 1 +
drivers/pci/pci.h | 4 ++++
3 files changed, 34 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index e6736d4..3bca8f8 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -128,6 +128,25 @@ static void sriov_release(struct pci_dev *dev)
dev->sriov = NULL;
}
+static void sriov_restore_state(struct pci_dev *dev)
+{
+ int i;
+ u16 ctrl;
+ struct pci_sriov *iov = dev->sriov;
+
+ pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &ctrl);
+ if (ctrl & PCI_SRIOV_CTRL_VFE)
+ return;
+
+ for (i = PCI_SRIOV_RESOURCES; i <= PCI_SRIOV_RESOURCE_END; i++)
+ pci_update_resource(dev, i);
+
+ pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+ if (iov->ctrl & PCI_SRIOV_CTRL_VFE)
+ msleep(100);
+}
+
/**
* pci_iov_init - initialize the IOV capability
* @dev: the PCI device
@@ -179,3 +198,13 @@ int pci_iov_resource_bar(struct pci_dev *dev, int resno,
return dev->sriov->pos + PCI_SRIOV_BAR +
4 * (resno - PCI_SRIOV_RESOURCES);
}
+
+/**
+ * pci_restore_iov_state - restore the state of the IOV capability
+ * @dev: the PCI device
+ */
+void pci_restore_iov_state(struct pci_dev *dev)
+{
+ if (dev->sriov)
+ sriov_restore_state(dev);
+}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2eba2a5..8e21912 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -773,6 +773,7 @@ pci_restore_state(struct pci_dev *dev)
}
pci_restore_pcix_state(dev);
pci_restore_msi_state(dev);
+ pci_restore_iov_state(dev);
return 0;
}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 451db74..b24c9e2 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -216,6 +216,7 @@ extern int pci_iov_init(struct pci_dev *dev);
extern void pci_iov_release(struct pci_dev *dev);
extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
enum pci_bar_type *type);
+extern void pci_restore_iov_state(struct pci_dev *dev);
#else
static inline int pci_iov_init(struct pci_dev *dev)
{
@@ -230,6 +231,9 @@ static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
{
return 0;
}
+static inline void pci_restore_iov_state(struct pci_dev *dev)
+{
+}
#endif /* CONFIG_PCI_IOV */
#endif /* DRIVERS_PCI_H */
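The control flow of sriov_restore_state() above can be modelled roughly as follows. This is a sketch under stated assumptions rather than the kernel code: struct fake_sriov_regs, struct cached_state and restore_iov() are hypothetical names, the BAR updates and the 100 ms settle delay are left out, and the comment on why the early return exists is my reading of the patch, not something it states.

```c
#include <assert.h>
#include <stdint.h>

#define CTRL_VFE 0x01	/* VF Enable, same value as PCI_SRIOV_CTRL_VFE */

/* Hypothetical model of the two registers the restore path rewrites. */
struct fake_sriov_regs {
	uint16_t ctrl;
	uint32_t sys_pgsize;
};

/* Cached values, mirroring iov->ctrl and iov->pgsz in the patch. */
struct cached_state {
	uint16_t ctrl;
	uint32_t pgsz;
};

/* Returns 1 if the registers were rewritten, 0 if the restore was skipped. */
static int restore_iov(struct fake_sriov_regs *hw, const struct cached_state *c)
{
	assert(hw && c);

	/* VF Enable still set: presumably the state was never lost. */
	if (hw->ctrl & CTRL_VFE)
		return 0;

	hw->sys_pgsize = c->pgsz;	/* page size first ... */
	hw->ctrl = c->ctrl;		/* ... then control, which may set VF Enable */
	return 1;
}
```

Writing the page size before the control word keeps the order used in the patch, where VF Enable is the last thing to be set.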
[-- Attachment #5: 0003-PCI-reserve-bus-range-for-SR-IOV-device.patch --]
[-- Type: application/octet-stream, Size: 2785 bytes --]
From 2bb6c4c8e0f4b6603b3415c95c1ed37d58bdcfc5 Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 3/7] PCI: reserve bus range for SR-IOV device
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/iov.c | 34 ++++++++++++++++++++++++++++++++++
drivers/pci/pci.h | 5 +++++
drivers/pci/probe.c | 3 +++
3 files changed, 42 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 3bca8f8..0b80437 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -14,6 +14,16 @@
#include "pci.h"
+static inline void virtfn_bdf(struct pci_dev *dev, int id, u8 *busnr, u8 *devfn)
+{
+ u16 bdf;
+
+ bdf = (dev->bus->number << 8) + dev->devfn +
+ dev->sriov->offset + dev->sriov->stride * id;
+ *busnr = bdf >> 8;
+ *devfn = bdf & 0xff;
+}
+
static int sriov_init(struct pci_dev *dev, int pos)
{
int i;
@@ -208,3 +218,27 @@ void pci_restore_iov_state(struct pci_dev *dev)
if (dev->sriov)
sriov_restore_state(dev);
}
+
+/**
+ * pci_iov_bus_range - find bus range used by Virtual Function
+ * @bus: the PCI bus
+ *
+ * Returns max number of buses (exclude current one) used by Virtual
+ * Functions.
+ */
+int pci_iov_bus_range(struct pci_bus *bus)
+{
+ int max = 0;
+ u8 busnr, devfn;
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ if (!dev->sriov)
+ continue;
+ virtfn_bdf(dev, dev->sriov->total - 1, &busnr, &devfn);
+ if (busnr > max)
+ max = busnr;
+ }
+
+ return max ? max - bus->number : 0;
+}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b24c9e2..2cf32f5 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -217,6 +217,7 @@ extern void pci_iov_release(struct pci_dev *dev);
extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
enum pci_bar_type *type);
extern void pci_restore_iov_state(struct pci_dev *dev);
+extern int pci_iov_bus_range(struct pci_bus *bus);
#else
static inline int pci_iov_init(struct pci_dev *dev)
{
@@ -234,6 +235,10 @@ static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
static inline void pci_restore_iov_state(struct pci_dev *dev)
{
}
+static inline int pci_iov_bus_range(struct pci_bus *bus)
+{
+ return 0;
+}
#endif /* CONFIG_PCI_IOV */
#endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 03b6f29..4c8abd0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1078,6 +1078,9 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
for (devfn = 0; devfn < 0x100; devfn += 8)
pci_scan_slot(bus, devfn);
+ /* Reserve buses for SR-IOV capability. */
+ max += pci_iov_bus_range(bus);
+
/*
* After performing arch-dependent fixup of the bus, look behind
* all PCI-to-PCI bridges on this bus.
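The routing-ID arithmetic behind virtfn_bdf() and pci_iov_bus_range() above can be checked with a small user-space sketch. struct pf, vf_bdf() and extra_buses() are hypothetical names used only here; the sketch handles a single PF, whereas the patch takes the maximum over every SR-IOV device on the bus. The example offset and stride values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of a PF; field names mirror the patch. */
struct pf {
	uint8_t bus;		/* bus number the PF sits on */
	uint8_t devfn;		/* device/function number of the PF */
	uint16_t offset;	/* First VF Offset */
	uint16_t stride;	/* VF Stride */
	uint16_t total;		/* TotalVFs */
};

/* Routing ID of VF 'id' = PF routing ID + offset + stride * id. */
static void vf_bdf(const struct pf *pf, int id, uint8_t *busnr, uint8_t *devfn)
{
	uint16_t rid = (uint16_t)((pf->bus << 8) + pf->devfn +
				  pf->offset + pf->stride * id);

	*busnr = (uint8_t)(rid >> 8);	/* upper byte: bus number */
	*devfn = (uint8_t)(rid & 0xff);	/* lower byte: device/function */
}

/* Extra buses needed beyond the PF's own bus to reach the last VF. */
static int extra_buses(const struct pf *pf)
{
	uint8_t busnr, devfn;

	assert(pf->total > 0);
	vf_bdf(pf, pf->total - 1, &busnr, &devfn);
	return busnr > pf->bus ? busnr - pf->bus : 0;
}
```

With a PF at bus 1, devfn 0, offset 0x80 and stride 2, eight VFs all stay on bus 1 (the last one at devfn 0x8e), but 128 VFs carry over into bus 2, so one extra bus has to be reserved.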
[-- Attachment #6: 0004-PCI-add-SR-IOV-API-for-Physical-Function-driver.patch --]
[-- Type: application/octet-stream, Size: 11567 bytes --]
From 44373ec4e7aa847e1a605b68fda3267a953691df Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 4/7] PCI: add SR-IOV API for Physical Function driver
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/iov.c | 348 +++++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.h | 3 +
include/linux/pci.h | 14 ++
3 files changed, 365 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 0b80437..8096fc9 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -13,6 +13,8 @@
#include <linux/delay.h>
#include "pci.h"
+#define VIRTFN_ID_LEN 8
+
static inline void virtfn_bdf(struct pci_dev *dev, int id, u8 *busnr, u8 *devfn)
{
@@ -24,6 +26,319 @@ static inline void virtfn_bdf(struct pci_dev *dev, int id, u8 *busnr, u8 *devfn)
*devfn = bdf & 0xff;
}
+static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
+{
+ int rc;
+ struct pci_bus *child;
+
+ if (bus->number == busnr)
+ return bus;
+
+ child = pci_find_bus(pci_domain_nr(bus), busnr);
+ if (child)
+ return child;
+
+ child = pci_add_new_bus(bus, NULL, busnr);
+ if (!child)
+ return NULL;
+
+ child->subordinate = busnr;
+ child->dev.parent = bus->bridge;
+ rc = pci_bus_add_child(child);
+ if (rc) {
+ pci_remove_bus(child);
+ return NULL;
+ }
+
+ return child;
+}
+
+static void virtfn_remove_bus(struct pci_bus *bus, int busnr)
+{
+ struct pci_bus *child;
+
+ if (bus->number == busnr)
+ return;
+
+ child = pci_find_bus(pci_domain_nr(bus), busnr);
+ BUG_ON(!child);
+
+ if (list_empty(&child->devices))
+ pci_remove_bus(child);
+}
+
+static int virtfn_add(struct pci_dev *dev, int id, int reset)
+{
+ int i;
+ int rc;
+ u64 size;
+ u8 busnr, devfn;
+ char buf[VIRTFN_ID_LEN];
+ struct pci_dev *virtfn;
+ struct resource *res;
+ struct pci_sriov *iov = dev->sriov;
+
+ virtfn = alloc_pci_dev();
+ if (!virtfn)
+ return -ENOMEM;
+
+ virtfn_bdf(dev, id, &busnr, &devfn);
+ mutex_lock(&iov->pdev->sriov->lock);
+ virtfn->bus = virtfn_add_bus(dev->bus, busnr);
+ if (!virtfn->bus) {
+ kfree(virtfn);
+ mutex_unlock(&iov->pdev->sriov->lock);
+ return -ENOMEM;
+ }
+
+ virtfn->sysdata = dev->bus->sysdata;
+ virtfn->dev.parent = dev->dev.parent;
+ virtfn->dev.bus = dev->dev.bus;
+ virtfn->devfn = devfn;
+ virtfn->hdr_type = PCI_HEADER_TYPE_NORMAL;
+ virtfn->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
+ virtfn->error_state = pci_channel_io_normal;
+ virtfn->current_state = PCI_UNKNOWN;
+ virtfn->is_pcie = 1;
+ virtfn->pcie_type = PCI_EXP_TYPE_ENDPOINT;
+ virtfn->dma_mask = 0xffffffff;
+ virtfn->vendor = dev->vendor;
+ virtfn->subsystem_vendor = dev->subsystem_vendor;
+ virtfn->class = dev->class;
+ pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_DID, &virtfn->device);
+ pci_read_config_byte(virtfn, PCI_REVISION_ID, &virtfn->revision);
+ pci_read_config_word(virtfn, PCI_SUBSYSTEM_ID,
+ &virtfn->subsystem_device);
+
+ dev_set_name(&virtfn->dev, "%04x:%02x:%02x.%d",
+ pci_domain_nr(virtfn->bus), busnr,
+ PCI_SLOT(devfn), PCI_FUNC(devfn));
+
+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+ res = dev->resource + PCI_SRIOV_RESOURCES + i;
+ if (!res->parent)
+ continue;
+ virtfn->resource[i].name = pci_name(virtfn);
+ virtfn->resource[i].flags = res->flags;
+ size = resource_size(res);
+ do_div(size, iov->total);
+ virtfn->resource[i].start = res->start + size * id;
+ virtfn->resource[i].end = virtfn->resource[i].start + size - 1;
+ rc = request_resource(res, &virtfn->resource[i]);
+ BUG_ON(rc);
+ }
+
+ if (reset)
+ pci_execute_reset_function(virtfn);
+
+ pci_device_add(virtfn, virtfn->bus);
+ mutex_unlock(&iov->pdev->sriov->lock);
+
+ virtfn->physfn = pci_dev_get(dev);
+
+ rc = pci_bus_add_device(virtfn);
+ if (rc)
+ goto failed1;
+ sprintf(buf, "%d", id);
+ rc = sysfs_create_link(&iov->dev.kobj, &virtfn->dev.kobj, buf);
+ if (rc)
+ goto failed1;
+ rc = sysfs_create_link(&virtfn->dev.kobj, &dev->dev.kobj, "physfn");
+ if (rc)
+ goto failed2;
+
+ kobject_uevent(&virtfn->dev.kobj, KOBJ_CHANGE);
+
+ return 0;
+
+failed2:
+ sysfs_remove_link(&iov->dev.kobj, buf);
+failed1:
+ pci_dev_put(dev);
+ mutex_lock(&iov->pdev->sriov->lock);
+ pci_remove_bus_device(virtfn);
+ virtfn_remove_bus(dev->bus, busnr);
+ mutex_unlock(&iov->pdev->sriov->lock);
+
+ return rc;
+}
+
+static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+{
+ u8 busnr, devfn;
+ char buf[VIRTFN_ID_LEN];
+ struct pci_bus *bus;
+ struct pci_dev *virtfn;
+ struct pci_sriov *iov = dev->sriov;
+
+ virtfn_bdf(dev, id, &busnr, &devfn);
+ bus = pci_find_bus(pci_domain_nr(dev->bus), busnr);
+ if (!bus)
+ return;
+
+ virtfn = pci_get_slot(bus, devfn);
+ if (!virtfn)
+ return;
+
+ pci_dev_put(virtfn);
+
+ if (reset) {
+ device_release_driver(&virtfn->dev);
+ pci_execute_reset_function(virtfn);
+ }
+
+ sprintf(buf, "%d", id);
+ sysfs_remove_link(&iov->dev.kobj, buf);
+ sysfs_remove_link(&virtfn->dev.kobj, "physfn");
+
+ mutex_lock(&iov->pdev->sriov->lock);
+ pci_remove_bus_device(virtfn);
+ virtfn_remove_bus(dev->bus, busnr);
+ mutex_unlock(&iov->pdev->sriov->lock);
+
+ pci_dev_put(dev);
+}
+
+static void sriov_release_dev(struct device *dev)
+{
+ struct pci_sriov *iov = container_of(dev, struct pci_sriov, dev);
+
+ iov->nr_virtfn = 0;
+}
+
+static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
+{
+ int rc;
+ int i, j;
+ int nres;
+ u8 busnr, devfn;
+ u16 offset, stride, initial;
+ struct resource *res;
+ struct pci_dev *link;
+ struct pci_sriov *iov = dev->sriov;
+
+ if (!nr_virtfn)
+ return 0;
+
+ if (iov->nr_virtfn)
+ return -EINVAL;
+
+ pci_read_config_word(dev, iov->pos + PCI_SRIOV_INITIAL_VF, &initial);
+ if (initial > iov->total ||
+ (!(iov->cap & PCI_SRIOV_CAP_VFM) && (initial != iov->total)))
+ return -EIO;
+
+ if (nr_virtfn < 0 || nr_virtfn > iov->total ||
+ (!(iov->cap & PCI_SRIOV_CAP_VFM) && (nr_virtfn > initial)))
+ return -EINVAL;
+
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn);
+ pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &offset);
+ pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &stride);
+ if (!offset || (nr_virtfn > 1 && !stride))
+ return -EIO;
+
+ nres = 0;
+ for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+ res = dev->resource + PCI_SRIOV_RESOURCES + i;
+ if (!res->parent)
+ continue;
+ nres++;
+ }
+ if (nres != iov->nres) {
+ dev_err(&dev->dev, "not enough MMIO for SR-IOV\n");
+ return -ENOMEM;
+ }
+
+ iov->offset = offset;
+ iov->stride = stride;
+
+ virtfn_bdf(dev, nr_virtfn - 1, &busnr, &devfn);
+ if (busnr > dev->bus->subordinate) {
+ dev_err(&dev->dev, "not enough bus range for SR-IOV\n");
+ return -ENOMEM;
+ }
+
+ memset(&iov->dev, 0, sizeof(iov->dev));
+ strcpy(iov->dev.bus_id, "virtfn");
+ iov->dev.parent = &dev->dev;
+ iov->dev.release = sriov_release_dev;
+ rc = device_register(&iov->dev);
+ if (rc)
+ return rc;
+
+ if (iov->link != dev->devfn) {
+ rc = -ENODEV;
+ list_for_each_entry(link, &dev->bus->devices, bus_list) {
+ if (link->sriov && link->devfn == iov->link)
+ rc = sysfs_create_link(&iov->dev.kobj,
+ &link->dev.kobj, "dep_link");
+ }
+ if (rc)
+ goto failed1;
+ }
+
+ iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+ pci_block_user_cfg_access(dev);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+ msleep(100);
+ pci_unblock_user_cfg_access(dev);
+
+ iov->initial = initial;
+ if (nr_virtfn < initial)
+ initial = nr_virtfn;
+
+ for (i = 0; i < initial; i++) {
+ rc = virtfn_add(dev, i, 0);
+ if (rc)
+ goto failed2;
+ }
+
+ kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
+ iov->nr_virtfn = nr_virtfn;
+
+ return 0;
+
+failed2:
+ for (j = 0; j < i; j++)
+ virtfn_remove(dev, j, 0);
+
+ iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+ pci_block_user_cfg_access(dev);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+ ssleep(1);
+ pci_unblock_user_cfg_access(dev);
+
+ if (iov->link != dev->devfn)
+ sysfs_remove_link(&iov->dev.kobj, "dep_link");
+failed1:
+ device_unregister(&iov->dev);
+
+ return rc;
+}
+
+static void sriov_disable(struct pci_dev *dev)
+{
+ int i;
+ struct pci_sriov *iov = dev->sriov;
+
+ if (!iov->nr_virtfn)
+ return;
+
+ for (i = 0; i < iov->nr_virtfn; i++)
+ virtfn_remove(dev, i, 0);
+
+ iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+ pci_block_user_cfg_access(dev);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+ ssleep(1);
+ pci_unblock_user_cfg_access(dev);
+
+ if (iov->link != dev->devfn)
+ sysfs_remove_link(&iov->dev.kobj, "dep_link");
+ device_unregister(&iov->dev);
+}
+
static int sriov_init(struct pci_dev *dev, int pos)
{
int i;
@@ -129,6 +444,8 @@ failed:
static void sriov_release(struct pci_dev *dev)
{
+ BUG_ON(dev->sriov->nr_virtfn);
+
if (dev == dev->sriov->pdev)
mutex_destroy(&dev->sriov->lock);
else
@@ -152,6 +469,7 @@ static void sriov_restore_state(struct pci_dev *dev)
pci_update_resource(dev, i);
pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, iov->nr_virtfn);
pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
if (iov->ctrl & PCI_SRIOV_CTRL_VFE)
msleep(100);
@@ -242,3 +560,33 @@ int pci_iov_bus_range(struct pci_bus *bus)
return max ? max - bus->number : 0;
}
+
+/**
+ * pci_enable_sriov - enable the SR-IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+{
+ might_sleep();
+
+ if (!dev->sriov)
+ return -ENODEV;
+
+ return sriov_enable(dev, nr_virtfn);
+}
+EXPORT_SYMBOL_GPL(pci_enable_sriov);
+
+/**
+ * pci_disable_sriov - disable the SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_disable_sriov(struct pci_dev *dev)
+{
+ might_sleep();
+
+ if (dev->sriov)
+ sriov_disable(dev);
+}
+EXPORT_SYMBOL_GPL(pci_disable_sriov);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2cf32f5..9bbf868 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -202,6 +202,8 @@ struct pci_sriov {
u32 cap; /* SR-IOV Capabilities */
u16 ctrl; /* SR-IOV Control */
u16 total; /* total VFs associated with the PF */
+ u16 initial; /* initial VFs associated with the PF */
+ u16 nr_virtfn; /* number of VFs available */
u16 offset; /* first VF Routing ID offset */
u16 stride; /* following VF stride */
u32 pgsz; /* page size for BAR alignment */
@@ -209,6 +211,7 @@ struct pci_sriov {
struct pci_dev *pdev; /* lowest numbered PF */
struct pci_dev *self; /* this PF */
struct mutex lock; /* lock for VF bus */
+ struct device dev;
};
#ifdef CONFIG_PCI_IOV
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f4d740e..3a24ff5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -278,6 +278,7 @@ struct pci_dev {
#endif
struct pci_vpd *vpd;
struct pci_sriov *sriov; /* SR-IOV capability related */
+ struct pci_dev *physfn; /* Physical Function the device belongs to */
};
extern struct pci_dev *alloc_pci_dev(void);
@@ -1202,5 +1203,18 @@ int pci_ext_cfg_avail(struct pci_dev *dev);
void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
+#ifdef CONFIG_PCI_IOV
+extern int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
+extern void pci_disable_sriov(struct pci_dev *dev);
+#else
+static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+{
+ return -ENODEV;
+}
+static inline void pci_disable_sriov(struct pci_dev *dev)
+{
+}
+#endif
+
#endif /* __KERNEL__ */
#endif /* LINUX_PCI_H */
[-- Attachment #7: 0005-PCI-handle-SR-IOV-Virtual-Function-Migration.patch --]
[-- Type: application/octet-stream, Size: 5884 bytes --]
From f5a91ff00d6fe96733f07d40d139ab2e581056c6 Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 5/7] PCI: handle SR-IOV Virtual Function Migration
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/iov.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.h | 4 ++
include/linux/pci.h | 6 +++
3 files changed, 129 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 8096fc9..063fe74 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -206,6 +206,97 @@ static void sriov_release_dev(struct device *dev)
iov->nr_virtfn = 0;
}
+static int sriov_migration(struct pci_dev *dev)
+{
+ u16 status;
+ struct pci_sriov *iov = dev->sriov;
+
+ if (!iov->nr_virtfn)
+ return 0;
+
+ if (!(iov->cap & PCI_SRIOV_CAP_VFM))
+ return 0;
+
+ pci_read_config_word(iov->self, iov->pos + PCI_SRIOV_STATUS, &status);
+ if (!(status & PCI_SRIOV_STATUS_VFM))
+ return 0;
+
+ schedule_work(&iov->mtask);
+
+ return 1;
+}
+
+static void sriov_migration_task(struct work_struct *work)
+{
+ int i;
+ u8 state;
+ u16 status;
+ struct pci_sriov *iov = container_of(work, struct pci_sriov, mtask);
+
+ for (i = iov->initial; i < iov->nr_virtfn; i++) {
+ state = readb(iov->mstate + i);
+ if (state == PCI_SRIOV_VFM_MI) {
+ writeb(PCI_SRIOV_VFM_AV, iov->mstate + i);
+ state = readb(iov->mstate + i);
+ if (state == PCI_SRIOV_VFM_AV)
+ virtfn_add(iov->self, i, 1);
+ } else if (state == PCI_SRIOV_VFM_MO) {
+ virtfn_remove(iov->self, i, 1);
+ writeb(PCI_SRIOV_VFM_UA, iov->mstate + i);
+ state = readb(iov->mstate + i);
+ if (state == PCI_SRIOV_VFM_AV)
+ virtfn_add(iov->self, i, 0);
+ }
+ }
+
+ pci_read_config_word(iov->self, iov->pos + PCI_SRIOV_STATUS, &status);
+ status &= ~PCI_SRIOV_STATUS_VFM;
+ pci_write_config_word(iov->self, iov->pos + PCI_SRIOV_STATUS, status);
+}
+
+static int sriov_enable_migration(struct pci_dev *dev, int nr_virtfn)
+{
+ int bir;
+ u32 table;
+ resource_size_t pa;
+ struct pci_sriov *iov = dev->sriov;
+
+ if (nr_virtfn <= iov->initial)
+ return 0;
+
+ pci_read_config_dword(dev, iov->pos + PCI_SRIOV_VFM, &table);
+ bir = PCI_SRIOV_VFM_BIR(table);
+ if (bir > PCI_STD_RESOURCE_END)
+ return -EIO;
+
+ table = PCI_SRIOV_VFM_OFFSET(table);
+ if (table + nr_virtfn > pci_resource_len(dev, bir))
+ return -EIO;
+
+ pa = pci_resource_start(dev, bir) + table;
+ iov->mstate = ioremap(pa, nr_virtfn);
+ if (!iov->mstate)
+ return -ENOMEM;
+
+ INIT_WORK(&iov->mtask, sriov_migration_task);
+
+ iov->ctrl |= PCI_SRIOV_CTRL_VFM | PCI_SRIOV_CTRL_INTR;
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+
+ return 0;
+}
+
+static void sriov_disable_migration(struct pci_dev *dev)
+{
+ struct pci_sriov *iov = dev->sriov;
+
+ iov->ctrl &= ~(PCI_SRIOV_CTRL_VFM | PCI_SRIOV_CTRL_INTR);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+
+ cancel_work_sync(&iov->mtask);
+ iounmap(iov->mstate);
+}
+
static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
{
int rc;
@@ -294,6 +385,12 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
goto failed2;
}
+ if (iov->cap & PCI_SRIOV_CAP_VFM) {
+ rc = sriov_enable_migration(dev, nr_virtfn);
+ if (rc)
+ goto failed2;
+ }
+
kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
iov->nr_virtfn = nr_virtfn;
@@ -325,6 +422,9 @@ static void sriov_disable(struct pci_dev *dev)
if (!iov->nr_virtfn)
return;
+ if (iov->cap & PCI_SRIOV_CAP_VFM)
+ sriov_disable_migration(dev);
+
for (i = 0; i < iov->nr_virtfn; i++)
virtfn_remove(dev, i, 0);
@@ -590,3 +690,22 @@ void pci_disable_sriov(struct pci_dev *dev)
sriov_disable(dev);
}
EXPORT_SYMBOL_GPL(pci_disable_sriov);
+
+/**
+ * pci_sriov_migration - notify SR-IOV core of Virtual Function Migration
+ * @dev: the PCI device
+ *
+ * Returns IRQ_HANDLED if the IRQ is handled, or IRQ_NONE if not.
+ *
+ * The Physical Function driver is responsible for registering an IRQ
+ * handler using the VF Migration Interrupt Message Number, and for
+ * calling this function when the hardware generates the interrupt.
+ */
+irqreturn_t pci_sriov_migration(struct pci_dev *dev)
+{
+ if (!dev->sriov)
+ return IRQ_NONE;
+
+ return sriov_migration(dev) ? IRQ_HANDLED : IRQ_NONE;
+}
+EXPORT_SYMBOL_GPL(pci_sriov_migration);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 9bbf868..6764f02 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -1,6 +1,8 @@
#ifndef DRIVERS_PCI_H
#define DRIVERS_PCI_H
+#include <linux/workqueue.h>
+
#define PCI_CFG_SPACE_SIZE 256
#define PCI_CFG_SPACE_EXP_SIZE 4096
@@ -211,6 +213,8 @@ struct pci_sriov {
struct pci_dev *pdev; /* lowest numbered PF */
struct pci_dev *self; /* this PF */
struct mutex lock; /* lock for VF bus */
+ struct work_struct mtask; /* VF Migration task */
+ u8 __iomem *mstate; /* VF Migration State Array */
struct device dev;
};
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3a24ff5..d16b913 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -52,6 +52,7 @@
#include <asm/atomic.h>
#include <linux/device.h>
#include <linux/io.h>
+#include <linux/irqreturn.h>
/* Include the ID list */
#include <linux/pci_ids.h>
@@ -1206,6 +1207,7 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
#ifdef CONFIG_PCI_IOV
extern int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
extern void pci_disable_sriov(struct pci_dev *dev);
+extern irqreturn_t pci_sriov_migration(struct pci_dev *dev);
#else
static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{
@@ -1214,6 +1216,10 @@ static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
static inline void pci_disable_sriov(struct pci_dev *dev)
{
}
+static inline irqreturn_t pci_sriov_migration(struct pci_dev *dev)
+{
+ return IRQ_NONE;
+}
#endif
#endif /* __KERNEL__ */
[-- Attachment #8: 0006-PCI-document-SR-IOV-sysfs-entries.patch --]
[-- Type: application/octet-stream, Size: 1803 bytes --]
From 0357beb9d669f4a7d88e95d66714cd48fb5b0693 Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 6/7] PCI: document SR-IOV sysfs entries
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
Documentation/ABI/testing/sysfs-bus-pci | 27 +++++++++++++++++++++++++++
1 files changed, 27 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index e638e15..a0a052c 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -52,3 +52,30 @@ Description:
that some devices may have malformatted data. If the
underlying VPD has a writable section then the
corresponding section of this file will be writable.
+
+What: /sys/bus/pci/devices/.../virtfn/N
+Date: February 2009
+Contact: Yu Zhao <yu.zhao@intel.com>
+Description:
+ This symbolic link appears when the hardware supports the
+ SR-IOV capability and the Physical Function driver has
+ enabled it. The symbolic link points to the PCI device sysfs
+ entry of the Virtual Function whose index is N (0...MaxVFs-1).
+
+What: /sys/bus/pci/devices/.../virtfn/dep_link
+Date: February 2009
+Contact: Yu Zhao <yu.zhao@intel.com>
+Description:
+ This symbolic link appears when the hardware supports the
+ SR-IOV capability, the Physical Function driver has enabled
+ it, and this device has vendor-specific dependencies on
+ other devices. The symbolic link points to the PCI device
+ sysfs entry of the Physical Function this device depends on.
+
+What: /sys/bus/pci/devices/.../physfn
+Date: February 2009
+Contact: Yu Zhao <yu.zhao@intel.com>
+Description:
+ This symbolic link appears when a device is a Virtual Function.
+ The symbolic link points to the PCI device sysfs entry of the
+ Physical Function this device is associated with.
[-- Attachment #9: 0007-PCI-manual-for-SR-IOV-user-and-driver-developer.patch --]
[-- Type: application/octet-stream, Size: 3798 bytes --]
From 4f465dcb76e2c125357f2524cf62bf50a7546ce3 Mon Sep 17 00:00:00 2001
From: Yu Zhao <yu.zhao@intel.com>
Subject: [PATCH v10 7/7] PCI: manual for SR-IOV user and driver developer
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
Documentation/DocBook/kernel-api.tmpl | 1
Documentation/PCI/pci-iov-howto.txt | 99 +++++++++++++++++++++++++++++++++
2 files changed, 100 insertions(+), 0 deletions(-)
create mode 100644 Documentation/PCI/pci-iov-howto.txt
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index bc962cd..58c1945 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -199,6 +199,7 @@ X!Edrivers/pci/hotplug.c
-->
!Edrivers/pci/probe.c
!Edrivers/pci/rom.c
+!Edrivers/pci/iov.c
</sect1>
<sect1><title>PCI Hotplug Support Library</title>
!Edrivers/pci/hotplug/pci_hotplug_core.c
diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.txt
new file mode 100644
index 0000000..fc73ef5
--- /dev/null
+++ b/Documentation/PCI/pci-iov-howto.txt
@@ -0,0 +1,99 @@
+ PCI Express I/O Virtualization Howto
+ Copyright (C) 2009 Intel Corporation
+ Yu Zhao <yu.zhao@intel.com>
+
+
+1. Overview
+
+1.1 What is SR-IOV
+
+Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
+capability that makes one physical device appear as multiple virtual
+devices. The physical device is referred to as the Physical Function
+(PF), and the virtual devices are referred to as Virtual Functions
+(VFs). The PF can control VF allocation dynamically through registers
+encapsulated in the capability. By default, this feature is not enabled
+and the PF behaves as a traditional PCIe device. Once it is turned on,
+each VF's PCI configuration space can be accessed through its own Bus,
+Device and Function Number (Routing ID). Each VF also has a PCI Memory
+Space, which is used to map its register set. The VF device driver
+operates on this register set, so the VF is functional and appears as
+a real PCI device.
+
+2. User Guide
+
+2.1 How can I enable SR-IOV capability
+
+The device driver (PF driver) controls the enabling and disabling of
+the capability through the API provided by the SR-IOV core. If the
+hardware has the SR-IOV capability, loading its PF driver enables it
+and all VFs associated with the PF.
+
+2.2 How can I use the Virtual Functions
+
+VFs are treated as hot-plugged PCI devices in the kernel, so they
+should work in the same way as real PCI devices. A VF requires a
+device driver, just as a normal PCI device does.
+
+3. Developer Guide
+
+3.1 SR-IOV API
+
+To enable SR-IOV capability:
+ int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
+ 'nr_virtfn' is the number of VFs to be enabled.
+
+To disable SR-IOV capability:
+ void pci_disable_sriov(struct pci_dev *dev);
+
+To notify SR-IOV core of Virtual Function Migration:
+ irqreturn_t pci_sriov_migration(struct pci_dev *dev);
+
+3.2 Usage example
+
+The following piece of code illustrates the usage of the SR-IOV API.
+
+static int __devinit dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
+{
+ pci_enable_sriov(dev, NR_VIRTFN);
+
+ ...
+
+ return 0;
+}
+
+static void __devexit dev_remove(struct pci_dev *dev)
+{
+ pci_disable_sriov(dev);
+
+ ...
+}
+
+static int dev_suspend(struct pci_dev *dev, pm_message_t state)
+{
+ ...
+
+ return 0;
+}
+
+static int dev_resume(struct pci_dev *dev)
+{
+ ...
+
+ return 0;
+}
+
+static void dev_shutdown(struct pci_dev *dev)
+{
+ ...
+}
+
+static struct pci_driver dev_driver = {
+ .name = "SR-IOV Physical Function driver",
+ .id_table = dev_id_table,
+ .probe = dev_probe,
+ .remove = __devexit_p(dev_remove),
+ .suspend = dev_suspend,
+ .resume = dev_resume,
+ .shutdown = dev_shutdown,
+};
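Editorial sketch, not part of the patch above: the howto's usage example calls pci_enable_sriov() without looking at its return value, although the API returns 0 on success or a negative error. The block below is a minimal userspace model of checking that value. Everything in it is an assumption made for illustration: the two structs and the two API functions are simplified stand-ins for what <linux/pci.h> provides, NR_VIRTFN is a hypothetical constant, and continuing without VFs on failure is one possible policy, not one the patch prescribes.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Userspace stand-ins (assumptions, not kernel code) so the sketch can be
 * compiled and run outside the kernel. In a real PF driver these types and
 * functions come from <linux/pci.h>.
 */
struct pci_sriov {
	int total;	/* total VFs the PF supports */
	int nr_virtfn;	/* VFs currently enabled */
};

struct pci_dev {
	struct pci_sriov *sriov;	/* NULL if the device lacks SR-IOV */
};

/* Simplified model: succeeds only if the capability exists and the request fits. */
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{
	if (!dev->sriov)
		return -ENODEV;
	if (nr_virtfn < 0 || nr_virtfn > dev->sriov->total)
		return -EINVAL;
	dev->sriov->nr_virtfn = nr_virtfn;
	return 0;
}

/* Simplified model: a no-op when the device has no SR-IOV capability. */
void pci_disable_sriov(struct pci_dev *dev)
{
	if (dev->sriov)
		dev->sriov->nr_virtfn = 0;
}

#define NR_VIRTFN 4	/* hypothetical number of VFs the driver asks for */

/* Probe: try to enable VFs, but keep the PF usable if that fails. */
int dev_probe(struct pci_dev *dev)
{
	int rc = pci_enable_sriov(dev, NR_VIRTFN);

	if (rc)
		fprintf(stderr, "SR-IOV not enabled (error %d), continuing without VFs\n", rc);

	/* ... rest of the PF initialization would go here ... */
	return 0;
}

/* Remove: safe to call even when SR-IOV was never enabled. */
void dev_remove(struct pci_dev *dev)
{
	pci_disable_sriov(dev);
	/* ... rest of the PF teardown would go here ... */
}
```

A driver that cannot operate without its VFs could instead return rc from the probe function; the check itself is the point of the sketch.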
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: igbvf: add new driver to support 82576 virtual functions
2009-03-25 8:45 ` Jeff Kirsher
@ 2009-03-25 9:03 ` Jeff Kirsher
2009-03-25 9:47 ` Yu Zhao
0 siblings, 1 reply; 6+ messages in thread
From: Jeff Kirsher @ 2009-03-25 9:03 UTC (permalink / raw)
To: David Miller, Yu Zhao, Matthew Wilcox; +Cc: alexander.h.duyck, netdev
On Wed, Mar 25, 2009 at 1:45 AM, Jeff Kirsher
<jeffrey.t.kirsher@intel.com> wrote:
> On Tue, Mar 24, 2009 at 11:38 PM, David Miller <davem@davemloft.net> wrote:
>>
>> This breaks the build:
>>
>> drivers/net/igbvf/ethtool.c: In function 'igbvf_set_ringparam':
>> drivers/net/igbvf/ethtool.c:299: error: implicit declaration of function 'vmalloc'
>> drivers/net/igbvf/ethtool.c:299: warning: assignment makes pointer from integer without a cast
>> drivers/net/igbvf/ethtool.c:346: error: implicit declaration of function 'vfree'
>> --
>
> Sorry Dave, I thought this was called out earlier, but I see it was
> not. The igbvf driver requires the following patches applied to your
> tree to have them compile. Last I heard, these SR-IOV patches were
> accepted for 2.6.30, in the PCI tree.
>
> Yu, can you confirm that these patches have been accepted for 2.6.30?
>
> Summary:
> http://marc.info/?l=linux-kernel&m=123751955907397&w=2
> Patches:
> http://marc.info/?l=linux-kernel&m=123751956107417&w=2
> http://marc.info/?l=linux-kernel&m=123751956207423&w=2
> http://marc.info/?l=linux-kernel&m=123751981407629&w=2
> http://marc.info/?l=linux-kernel&m=123751981507632&w=2
> http://marc.info/?l=linux-kernel&m=123751981707644&w=2
> http://marc.info/?l=linux-kernel&m=123751981607635&w=2
> http://marc.info/?l=linux-kernel&m=123751981607638&w=2
> http://marc.info/?l=linux-kernel&m=123751981707641&w=2
I confirmed that Jesse Barnes has these SR-IOV patches queued up for 2.6.30.
http://marc.info/?l=linux-kernel&m=123757169806111&w=2
--
Cheers,
Jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: igbvf: add new driver to support 82576 virtual functions
2009-03-25 9:03 ` Jeff Kirsher
@ 2009-03-25 9:47 ` Yu Zhao
2009-03-25 16:10 ` Alexander Duyck
0 siblings, 1 reply; 6+ messages in thread
From: Yu Zhao @ 2009-03-25 9:47 UTC (permalink / raw)
To: Kirsher, Jeffrey T
Cc: David Miller, Matthew Wilcox, Duyck, Alexander H,
netdev@vger.kernel.org
On Wed, Mar 25, 2009 at 05:03:29PM +0800, Kirsher, Jeffrey T wrote:
> On Wed, Mar 25, 2009 at 1:45 AM, Jeff Kirsher
> <jeffrey.t.kirsher@intel.com> wrote:
> > On Tue, Mar 24, 2009 at 11:38 PM, David Miller <davem@davemloft.net> wrote:
> >>
> >> This breaks the build:
> >>
> >> drivers/net/igbvf/ethtool.c: In function 'igbvf_set_ringparam':
> >> drivers/net/igbvf/ethtool.c:299: error: implicit declaration of function 'vmalloc'
> >> drivers/net/igbvf/ethtool.c:299: warning: assignment makes pointer from integer without a cast
> >> drivers/net/igbvf/ethtool.c:346: error: implicit declaration of function 'vfree'
> >> --
> >
> > Sorry Dave, I thought this was called out earlier, but I see it was
> > not. The igbvf driver requires the following patches applied to your
> > tree to have them compile. Last I heard, these SR-IOV patches were
> > accepted for 2.6.30, in the PCI tree.
> >
> > Yu, can you confirm that these patches have been accepted for 2.6.30?
> >
> > Summary:
> > http://marc.info/?l=linux-kernel&m=123751955907397&w=2
> > Patches:
> > http://marc.info/?l=linux-kernel&m=123751956107417&w=2
> > http://marc.info/?l=linux-kernel&m=123751956207423&w=2
> > http://marc.info/?l=linux-kernel&m=123751981407629&w=2
> > http://marc.info/?l=linux-kernel&m=123751981507632&w=2
> > http://marc.info/?l=linux-kernel&m=123751981707644&w=2
> > http://marc.info/?l=linux-kernel&m=123751981607635&w=2
> > http://marc.info/?l=linux-kernel&m=123751981607638&w=2
> > http://marc.info/?l=linux-kernel&m=123751981707641&w=2
>
> I confirmed that Jesse Barnes has these SR-IOV patches queued up for 2.6.30.
> http://marc.info/?l=linux-kernel&m=123757169806111&w=2
Yes, it's in Jesse's linux-next branch:
http://git.kernel.org/?p=linux/kernel/git/jbarnes/pci-2.6.git;a=shortlog;h=linux-next
Thanks,
Yu
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: igbvf: add new driver to support 82576 virtual functions
2009-03-25 9:47 ` Yu Zhao
@ 2009-03-25 16:10 ` Alexander Duyck
2009-03-25 21:34 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2009-03-25 16:10 UTC (permalink / raw)
To: Zhao, Yu
Cc: Kirsher, Jeffrey T, David Miller, Matthew Wilcox,
netdev@vger.kernel.org
Zhao, Yu wrote:
> On Wed, Mar 25, 2009 at 05:03:29PM +0800, Kirsher, Jeffrey T wrote:
>> On Wed, Mar 25, 2009 at 1:45 AM, Jeff Kirsher
>> <jeffrey.t.kirsher@intel.com> wrote:
>>> On Tue, Mar 24, 2009 at 11:38 PM, David Miller <davem@davemloft.net> wrote:
>>>> This breaks the build:
>>>>
>>>> drivers/net/igbvf/ethtool.c: In function 'igbvf_set_ringparam':
>>>> drivers/net/igbvf/ethtool.c:299: error: implicit declaration of function 'vmalloc'
>>>> drivers/net/igbvf/ethtool.c:299: warning: assignment makes pointer from integer without a cast
>>>> drivers/net/igbvf/ethtool.c:346: error: implicit declaration of function 'vfree'
>>>> --
>>> Sorry Dave, I thought this was called out earlier, but I see it was
>>> not. The igbvf driver requires the following patches applied to your
>>> tree to have them compile. Last I heard, these SR-IOV patches were
>>> accepted for 2.6.30, in the PCI tree.
>>>
>>> Yu, can you confirm that these patches have been accepted for 2.6.30?
>>>
>>> Summary:
>>> http://marc.info/?l=linux-kernel&m=123751955907397&w=2
>>> Patches:
>>> http://marc.info/?l=linux-kernel&m=123751956107417&w=2
>>> http://marc.info/?l=linux-kernel&m=123751956207423&w=2
>>> http://marc.info/?l=linux-kernel&m=123751981407629&w=2
>>> http://marc.info/?l=linux-kernel&m=123751981507632&w=2
>>> http://marc.info/?l=linux-kernel&m=123751981707644&w=2
>>> http://marc.info/?l=linux-kernel&m=123751981607635&w=2
>>> http://marc.info/?l=linux-kernel&m=123751981607638&w=2
>>> http://marc.info/?l=linux-kernel&m=123751981707641&w=2
>> I confirmed that Jesse Barnes has these SR-IOV patches queued up for 2.6.30.
>> http://marc.info/?l=linux-kernel&m=123757169806111&w=2
>
> Yes, it's in Jesse's linux-next branch:
> http://git.kernel.org/?p=linux/kernel/git/jbarnes/pci-2.6.git;a=shortlog;h=linux-next
>
> Thanks,
> Yu
The problem isn't the SR-IOV patches; it is a difference in
architectures. The x86/x86_64 architecture lets you be a bit more
sloppy when it comes to including vmalloc. I've seen it in the past
with igb, and I suspect that is why we didn't catch this in testing. We
just need to add a #include of vmalloc.h in ethtool.c and the issue
should be fixed.
Thanks,
Alex
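Editorial sketch of the fix described above: on the kernel side it amounts to one include line, #include <linux/vmalloc.h>, in drivers/net/igbvf/ethtool.c, so that vmalloc() and vfree() are declared explicitly instead of arriving through other headers on some architectures. The block below shows that line under __KERNEL__ and, purely so the fragment can be built and run in userspace, substitutes malloc-based stand-ins. ring_buffer_alloc() and ring_buffer_free() are hypothetical helpers for illustration, not functions from the driver.

```c
#ifdef __KERNEL__
#include <linux/vmalloc.h>	/* explicit declarations of vmalloc() and vfree() */
#else
/* Userspace stand-ins (assumptions) so the fragment builds outside the kernel. */
#include <stdlib.h>
static void *vmalloc(unsigned long size) { return malloc(size); }
static void vfree(const void *addr) { free((void *)addr); }
#endif

/* Hypothetical helper: allocate storage for `count` ring entries. */
void *ring_buffer_alloc(unsigned long count, unsigned long entry_size)
{
	return vmalloc(count * entry_size);
}

/* Hypothetical helper: release a buffer obtained from ring_buffer_alloc(). */
void ring_buffer_free(void *buf)
{
	vfree(buf);
}
```

Without a visible declaration, the compiler assumes the function returns int, which is what produces the "assignment makes pointer from integer without a cast" warning next to the implicit-declaration error in the build log.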
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: igbvf: add new driver to support 82576 virtual functions
2009-03-25 16:10 ` Alexander Duyck
@ 2009-03-25 21:34 ` David Miller
0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2009-03-25 21:34 UTC (permalink / raw)
To: alexander.h.duyck; +Cc: yu.zhao, jeffrey.t.kirsher, willy, netdev
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Wed, 25 Mar 2009 09:10:48 -0700
> The problem isn't the SR-IOV patches; it is a difference in
> architectures. The x86/x86_64 architecture lets you be a bit more
> sloppy when it comes to including vmalloc. I've seen it in the past
> with igb, and I suspect that is why we didn't catch this in testing.
> We just need to add a #include of vmalloc.h in ethtool.c and the
> issue should be fixed.
Yes, that was the problem.
I thought it was dead obvious from the build failure message.
What else could a lack of visible vmalloc() declaration mean? :-/
^ permalink raw reply [flat|nested] 6+ messages in thread