* [PATCH 1/8 v4] PCI: define PCI resource names in an 'enum'
2008-10-14 10:34 [PATCH 0/8 v4] PCI: Linux kernel SR-IOV support Yu Zhao
From: Yu Zhao @ 2008-10-14 10:46 UTC
To: linux-pci@vger.kernel.org
Cc: jbarnes@virtuousgeek.org, randy.dunlap@oracle.com,
grundler@parisc-linux.org, achiang@hp.com, matthew@wil.cx,
rdreier@cisco.com, greg@kroah.com, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
This patch moves all definitions of PCI resource names into an 'enum',
and replaces some hard-coded resource numbers with symbolic names.
This change eases the introduction of device-specific resources.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
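The enum is meant to keep the numbering that the removed #defines
carried. A minimal user-space sketch of that check follows; it copies
the enum from the include/linux/pci.h hunk of this patch, moves the
PCI_BRIDGE_RES_NUM default outside the enum for readability, and adds
compile-time assertions that are not part of the patch.

```c
/* User-space copy of the enum this patch adds to include/linux/pci.h. */
#ifndef PCI_BRIDGE_RES_NUM
#define PCI_BRIDGE_RES_NUM 4
#endif

enum {
	/* #0-5: standard PCI regions */
	PCI_STD_RESOURCES,
	PCI_STD_RESOURCES_END = 5,

	/* #6: expansion ROM */
	PCI_ROM_RESOURCE,

	/* address space assigned to buses behind the bridge */
	PCI_BRIDGE_RESOURCES,
	PCI_BRIDGE_RES_END = PCI_BRIDGE_RESOURCES + PCI_BRIDGE_RES_NUM - 1,

	/* total resources associated with a PCI device */
	PCI_NUM_RESOURCES,

	/* preserved for compatibility */
	DEVICE_COUNT_RESOURCE
};

/* Illustrative checks: the values of the #defines removed by the patch. */
_Static_assert(PCI_ROM_RESOURCE == 6, "expansion ROM stays at index 6");
_Static_assert(PCI_BRIDGE_RESOURCES == 7, "bridge windows start at index 7");
_Static_assert(PCI_NUM_RESOURCES == 11, "11 resources per device");
_Static_assert(DEVICE_COUNT_RESOURCE == 12, "array size stays at 12");
```

With the default of four bridge windows, every symbol evaluates to its
previous value, so existing users of the old macros should see no change.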
drivers/pci/pci-sysfs.c | 4 +++-
drivers/pci/pci.c | 19 ++-----------------
drivers/pci/probe.c | 2 +-
drivers/pci/proc.c | 7 ++++---
include/linux/pci.h | 37 ++++++++++++++++++++++++-------------
5 files changed, 34 insertions(+), 35 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 2cad6da..c41b783 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -101,11 +101,13 @@ resource_show(struct device * dev, struct device_attribute *attr, char * buf)
struct pci_dev * pci_dev = to_pci_dev(dev);
char * str = buf;
int i;
- int max = 7;
+ int max;
resource_size_t start, end;
if (pci_dev->subordinate)
max = DEVICE_COUNT_RESOURCE;
+ else
+ max = PCI_BRIDGE_RESOURCES;
for (i = 0; i < max; i++) {
struct resource *res = &pci_dev->resource[i];
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5ecd2d7..a9c64b0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -359,24 +359,9 @@ pci_find_parent_resource(const struct pci_dev *dev, struct resource *res)
static void
pci_restore_bars(struct pci_dev *dev)
{
- int i, numres;
-
- switch (dev->hdr_type) {
- case PCI_HEADER_TYPE_NORMAL:
- numres = 6;
- break;
- case PCI_HEADER_TYPE_BRIDGE:
- numres = 2;
- break;
- case PCI_HEADER_TYPE_CARDBUS:
- numres = 1;
- break;
- default:
- /* Should never get here, but just in case... */
- return;
- }
+ int i;
- for (i = 0; i < numres; i++)
+ for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
pci_update_resource(dev, i);
}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index dcd6bf1..03ddfee 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -492,7 +492,7 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
child->subordinate = 0xff;
/* Set up default resource pointers and names.. */
- for (i = 0; i < 4; i++) {
+ for (i = 0; i < PCI_BRIDGE_RES_NUM; i++) {
child->resource[i] = &bridge->resource[PCI_BRIDGE_RESOURCES+i];
child->resource[i]->name = child->name;
}
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index e1098c3..f6f2a59 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -352,15 +352,16 @@ static int show_device(struct seq_file *m, void *v)
dev->vendor,
dev->device,
dev->irq);
- /* Here should be 7 and not PCI_NUM_RESOURCES as we need to preserve compatibility */
- for (i=0; i<7; i++) {
+
+ /* only print standard and ROM resources to preserve compatibility */
+ for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
resource_size_t start, end;
pci_resource_to_user(dev, i, &dev->resource[i], &start, &end);
seq_printf(m, "\t%16llx",
(unsigned long long)(start |
(dev->resource[i].flags & PCI_REGION_FLAG_MASK)));
}
- for (i=0; i<7; i++) {
+ for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
resource_size_t start, end;
pci_resource_to_user(dev, i, &dev->resource[i], &start, &end);
seq_printf(m, "\t%16llx",
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f280783..497d639 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -76,7 +76,30 @@ enum pci_mmap_state {
#define PCI_DMA_FROMDEVICE 2
#define PCI_DMA_NONE 3
-#define DEVICE_COUNT_RESOURCE 12
+/*
+ * For PCI devices, the region numbers are assigned this way:
+ */
+enum {
+ /* #0-5: standard PCI regions */
+ PCI_STD_RESOURCES,
+ PCI_STD_RESOURCES_END = 5,
+
+ /* #6: expansion ROM */
+ PCI_ROM_RESOURCE,
+
+ /* address space assigned to buses behind the bridge */
+#ifndef PCI_BRIDGE_RES_NUM
+#define PCI_BRIDGE_RES_NUM 4
+#endif
+ PCI_BRIDGE_RESOURCES,
+ PCI_BRIDGE_RES_END = PCI_BRIDGE_RESOURCES + PCI_BRIDGE_RES_NUM - 1,
+
+ /* total resources associated with a PCI device */
+ PCI_NUM_RESOURCES,
+
+ /* preserve this for compatibility */
+ DEVICE_COUNT_RESOURCE
+};
typedef int __bitwise pci_power_t;
@@ -262,18 +285,6 @@ static inline void pci_add_saved_cap(struct pci_dev *pci_dev,
hlist_add_head(&new_cap->next, &pci_dev->saved_cap_space);
}
-/*
- * For PCI devices, the region numbers are assigned this way:
- *
- * 0-5 standard PCI regions
- * 6 expansion ROM
- * 7-10 bridges: address space assigned to buses behind the bridge
- */
-
-#define PCI_ROM_RESOURCE 6
-#define PCI_BRIDGE_RESOURCES 7
-#define PCI_NUM_RESOURCES 11
-
#ifndef PCI_BUS_NUM_RESOURCES
#define PCI_BUS_NUM_RESOURCES 16
#endif
--
1.5.6.4
* [PATCH 2/8 v4] PCI: export __pci_read_base
From: Yu Zhao @ 2008-10-14 10:48 UTC
To: linux-pci@vger.kernel.org
Cc: jbarnes@virtuousgeek.org, randy.dunlap@oracle.com,
grundler@parisc-linux.org, achiang@hp.com, matthew@wil.cx,
rdreier@cisco.com, greg@kroah.com, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Export __pci_read_base() so that it can be used by the whole PCI subsystem.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
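The return value (1 for a 64-bit BAR, 0 otherwise) lets a caller step
over the upper half of a 64-bit BAR. The sketch below is a user-space
model of that calling pattern, not the kernel code: classify_bar(),
model_read_base() and count_bars() are hypothetical helpers, and the
BAR is read from a plain array instead of config space. The bit masks
are the ones from include/linux/pci_regs.h.

```c
#include <stdint.h>

/* Low-bit encodings of a BAR, as in include/linux/pci_regs.h. */
#define PCI_BASE_ADDRESS_SPACE		0x01
#define PCI_BASE_ADDRESS_SPACE_IO	0x01
#define PCI_BASE_ADDRESS_MEM_TYPE_MASK	0x06
#define PCI_BASE_ADDRESS_MEM_TYPE_64	0x04

enum pci_bar_type {
	pci_bar_unknown,	/* Standard PCI BAR probe */
	pci_bar_io,		/* An io port BAR */
	pci_bar_mem32,		/* A 32-bit memory BAR */
	pci_bar_mem64,		/* A 64-bit memory BAR */
};

/* Hypothetical helper: classify a raw BAR value by its low bits. */
static enum pci_bar_type classify_bar(uint32_t bar)
{
	if ((bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO)
		return pci_bar_io;
	if ((bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64)
		return pci_bar_mem64;
	return pci_bar_mem32;
}

/*
 * Hypothetical stand-in for __pci_read_base(): reports the BAR type and
 * returns 1 if the BAR is 64-bit, or 0 if it is not.
 */
static int model_read_base(const uint32_t *bars, unsigned int idx,
			   enum pci_bar_type *type)
{
	*type = classify_bar(bars[idx]);
	return *type == pci_bar_mem64;
}

/* Count logical BARs in n 32-bit slots, skipping 64-bit upper halves. */
static unsigned int count_bars(const uint32_t *bars, unsigned int n)
{
	unsigned int idx, count = 0;
	enum pci_bar_type type;

	for (idx = 0; idx < n; idx++) {
		idx += model_read_base(bars, idx, &type);
		count++;
	}
	return count;
}
```

Adding the return value to the loop index keeps the caller free of any
64-bit special case; the model treats an all-zero slot as a 32-bit
memory BAR, which is a simplification.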
drivers/pci/pci.h | 9 +++++++++
drivers/pci/probe.c | 20 +++++++++-----------
2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 69b6365..922b742 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -150,6 +150,15 @@ struct pci_slot_attribute {
};
#define to_pci_slot_attr(s) container_of(s, struct pci_slot_attribute, attr)
+enum pci_bar_type {
+ pci_bar_unknown, /* Standard PCI BAR probe */
+ pci_bar_io, /* An io port BAR */
+ pci_bar_mem32, /* A 32-bit memory BAR */
+ pci_bar_mem64, /* A 64-bit memory BAR */
+};
+
+extern int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
+ struct resource *res, unsigned int reg);
extern void pci_enable_ari(struct pci_dev *dev);
/**
* pci_ari_enabled - query ARI forwarding status
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 03ddfee..2326609 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -201,13 +201,6 @@ static u64 pci_size(u64 base, u64 maxbase, u64 mask)
return size;
}
-enum pci_bar_type {
- pci_bar_unknown, /* Standard PCI BAR probe */
- pci_bar_io, /* An io port BAR */
- pci_bar_mem32, /* A 32-bit memory BAR */
- pci_bar_mem64, /* A 64-bit memory BAR */
-};
-
static inline enum pci_bar_type decode_bar(struct resource *res, u32 bar)
{
if ((bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO) {
@@ -222,11 +215,16 @@ static inline enum pci_bar_type decode_bar(struct resource *res, u32 bar)
return pci_bar_mem32;
}
-/*
- * If the type is not unknown, we assume that the lowest bit is 'enable'.
- * Returns 1 if the BAR was 64-bit and 0 if it was 32-bit.
+/**
+ * __pci_read_base - read a PCI BAR
+ * @dev: the PCI device
+ * @type: type of the BAR
+ * @res: resource buffer to be filled in
+ * @pos: BAR position in the config space
+ *
+ * Returns 1 if the BAR is 64-bit, or 0 if 32-bit.
*/
-static int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
+int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
struct resource *res, unsigned int pos)
{
u32 l, sz, mask;
--
1.5.6.4
* [PATCH 3/8 v4] PCI: export pci_alloc_child_bus
From: Yu Zhao @ 2008-10-14 10:53 UTC
To: linux-pci@vger.kernel.org
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, greg@kroah.com, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
Export pci_alloc_child_bus() and make it able to handle buses without
bridge devices. Some devices, such as SR-IOV devices, use more than one
bus number even though there is no explicit bridge device, because they
have an internal routing mechanism.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
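The change amounts to moving every dereference of 'bridge' behind a
NULL check. Below is a simplified user-space model of that control
flow; struct model_dev, struct model_bus and model_alloc_child_bus()
are hypothetical stand-ins, and a plain counter replaces get_device().

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced versions of pci_dev and pci_bus. */
struct model_dev {
	int refcount;			/* stands in for the device refcount */
};

struct model_bus {
	struct model_bus *parent;
	struct model_dev *self;		/* bridge device, NULL if there is none */
	int secondary;			/* number of this bus */
	int primary;			/* number of the parent bus */
	int subordinate;		/* highest bus number behind this bus */
};

static struct model_bus *model_alloc_child_bus(struct model_bus *parent,
					       struct model_dev *bridge,
					       int busnr)
{
	struct model_bus *child = calloc(1, sizeof(*child));

	if (!child)
		return NULL;

	/* Setup that does not depend on a bridge device. */
	child->parent = parent;
	child->secondary = busnr;
	child->primary = parent->secondary;
	child->subordinate = 0xff;

	/* Bus without a bridge device: nothing else to set up. */
	if (!bridge)
		return child;

	/* Bridge-specific setup happens only past this point. */
	child->self = bridge;
	bridge->refcount++;		/* models get_device(&bridge->dev) */
	return child;
}
```

A caller that needs an extra bus number for a device with internal
routing would pass NULL as the bridge and get a bus whose 'self'
pointer stays NULL.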
drivers/pci/pci.h | 2 ++
drivers/pci/probe.c | 9 ++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 922b742..c6fa8ab 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -159,6 +159,8 @@ enum pci_bar_type {
extern int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
struct resource *res, unsigned int reg);
+extern struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
+ struct pci_dev *bridge, int busnr);
extern void pci_enable_ari(struct pci_dev *dev);
/**
* pci_ari_enabled - query ARI forwarding status
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2326609..9c680b8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -454,7 +454,7 @@ static struct pci_bus * pci_alloc_bus(void)
return b;
}
-static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
+struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
struct pci_dev *bridge, int busnr)
{
struct pci_bus *child;
@@ -467,12 +467,10 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
if (!child)
return NULL;
- child->self = bridge;
child->parent = parent;
child->ops = parent->ops;
child->sysdata = parent->sysdata;
child->bus_flags = parent->bus_flags;
- child->bridge = get_device(&bridge->dev);
/* initialize some portions of the bus device, but don't register it
* now as the parent is not properly set up yet. This device will get
@@ -489,6 +487,11 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
child->primary = parent->secondary;
child->subordinate = 0xff;
+ if (!bridge)
+ return child;
+
+ child->self = bridge;
+ child->bridge = get_device(&bridge->dev);
/* Set up default resource pointers and names.. */
for (i = 0; i < PCI_BRIDGE_RES_NUM; i++) {
child->resource[i] = &bridge->resource[PCI_BRIDGE_RESOURCES+i];
--
1.5.6.4
* [PATCH 4/8 v4] PCI: add a wrapper for resource_alignment
From: Yu Zhao @ 2008-10-14 10:55 UTC
To: linux-pci@vger.kernel.org
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, greg@kroah.com, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
Add a wrapper around resource_alignment() so that it can handle
device-specific resource alignment.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
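The wrapper first asks the generic helper and only falls back to the
resource number when the helper returns 0. The following user-space
model sketches that decision order. It assumes the generic helper picks
size or start according to two alignment flags; the MODEL_* flag values
are illustrative, and all model_* names are hypothetical. Unlike the
patch, which returns int, the model returns a 64-bit type.

```c
#include <stdint.h>

typedef uint64_t resource_size_t;

/* Resource indexes as defined by patch 1/8 of this series. */
#define PCI_ROM_RESOURCE	6
#define PCI_BRIDGE_RES_END	10

/* Illustrative flag bits; the kernel's values are not assumed here. */
#define MODEL_SIZEALIGN		0x1UL	/* size indicates alignment */
#define MODEL_STARTALIGN	0x2UL	/* start indicates alignment */

struct model_res {
	resource_size_t start, end;
	unsigned long flags;
};

static resource_size_t model_resource_size(const struct model_res *r)
{
	return r->end - r->start + 1;
}

/* Assumed behaviour of the generic helper: flag-driven, 0 if unset. */
static resource_size_t model_generic_alignment(const struct model_res *r)
{
	switch (r->flags & (MODEL_SIZEALIGN | MODEL_STARTALIGN)) {
	case MODEL_SIZEALIGN:
		return model_resource_size(r);
	case MODEL_STARTALIGN:
		return r->start;
	default:
		return 0;
	}
}

/* Model of the wrapper: generic answer first, then per-index fallback. */
static resource_size_t model_pci_alignment(const struct model_res *table,
					   int resno)
{
	const struct model_res *r = &table[resno];
	resource_size_t align = model_generic_alignment(r);

	if (align)
		return align;
	if (resno <= PCI_ROM_RESOURCE)
		return model_resource_size(r);	/* BARs and ROM: size */
	if (resno <= PCI_BRIDGE_RES_END)
		return r->start;		/* bridge windows: start */
	return 0;				/* unknown resource number */
}
```

The fallback reproduces the usual rule that a BAR is aligned to its
size while a bridge window carries its alignment in the start field;
later resource numbers are where device-specific handling could be
added.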
drivers/pci/pci.c | 25 +++++++++++++++++++++++++
drivers/pci/pci.h | 1 +
drivers/pci/setup-bus.c | 4 ++--
drivers/pci/setup-res.c | 7 ++++---
4 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a9c64b0..381e958 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1884,6 +1884,31 @@ int pci_select_bars(struct pci_dev *dev, unsigned long flags)
return bars;
}
+/**
+ * pci_resource_alignment - get a PCI BAR resource alignment
+ * @dev: the PCI device
+ * @resno: the resource number
+ *
+ * Returns alignment size on success, or 0 on error.
+ */
+int pci_resource_alignment(struct pci_dev *dev, int resno)
+{
+ resource_size_t align;
+ struct resource *res = dev->resource + resno;
+
+ align = resource_alignment(res);
+ if (align)
+ return align;
+
+ if (resno <= PCI_ROM_RESOURCE)
+ return resource_size(res);
+ else if (resno <= PCI_BRIDGE_RES_END)
+ return res->start;
+
+ dev_err(&dev->dev, "alignment: invalid resource #%d\n", resno);
+ return 0;
+}
+
static void __devinit pci_no_domains(void)
{
#ifdef CONFIG_PCI_DOMAINS
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c6fa8ab..720b7d6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -161,6 +161,7 @@ extern int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
struct resource *res, unsigned int reg);
extern struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
struct pci_dev *bridge, int busnr);
+extern int pci_resource_alignment(struct pci_dev *dev, int resno);
extern void pci_enable_ari(struct pci_dev *dev);
/**
* pci_ari_enabled - query ARI forwarding status
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 6c78cf8..d454ec3 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -25,6 +25,7 @@
#include <linux/ioport.h>
#include <linux/cache.h>
#include <linux/slab.h>
+#include "pci.h"
static void pbus_assign_resources_sorted(struct pci_bus *bus)
@@ -351,8 +352,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask, unsigned long
if (r->parent || (r->flags & mask) != type)
continue;
r_size = resource_size(r);
- /* For bridges size != alignment */
- align = resource_alignment(r);
+ align = pci_resource_alignment(dev, i);
order = __ffs(align) - 20;
if (order > 11) {
dev_warn(&dev->dev, "BAR %d bad alignment %llx: "
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index a81caac..ecff483 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -137,7 +137,7 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
size = resource_size(res);
min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM;
- align = resource_alignment(res);
+ align = pci_resource_alignment(dev, resno);
if (!align) {
dev_err(&dev->dev, "BAR %d: can't allocate resource (bogus "
"alignment) [%#llx-%#llx] flags %#lx\n",
@@ -235,7 +235,7 @@ void pdev_sort_resources(struct pci_dev *dev, struct resource_list *head)
if (!(r->flags) || r->parent)
continue;
- r_align = resource_alignment(r);
+ r_align = pci_resource_alignment(dev, i);
if (!r_align) {
dev_warn(&dev->dev, "BAR %d: bogus alignment "
"[%#llx-%#llx] flags %#lx\n",
@@ -248,7 +248,8 @@ void pdev_sort_resources(struct pci_dev *dev, struct resource_list *head)
struct resource_list *ln = list->next;
if (ln)
- align = resource_alignment(ln->res);
+ align = pci_resource_alignment(ln->dev,
+ ln->res - ln->dev->resource);
if (r_align > align) {
tmp = kmalloc(sizeof(*tmp), GFP_KERNEL);
--
1.5.6.4
* [PATCH 5/8 v4] PCI: add a new function to map BAR offset
From: Yu Zhao @ 2008-10-14 10:57 UTC
To: linux-pci@vger.kernel.org
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, greg@kroah.com, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
Add a new function that maps a resource number to its base address
register (offset and type).
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
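The mapping itself is small enough to model in user space. The sketch
below mirrors the logic of pci_resource_bar() from this patch, with the
device's rom_base_reg passed in as a plain integer; model_resource_bar()
is a hypothetical name, and the register offsets are the ones from
include/linux/pci_regs.h.

```c
/* Config-space offsets from include/linux/pci_regs.h. */
#define PCI_BASE_ADDRESS_0	0x10	/* first BAR */
#define PCI_ROM_ADDRESS		0x30	/* ROM BAR of a normal header */

/* Index of the expansion ROM resource, as defined by patch 1/8. */
#define PCI_ROM_RESOURCE	6

enum pci_bar_type {
	pci_bar_unknown,	/* Standard PCI BAR probe */
	pci_bar_io,		/* An io port BAR */
	pci_bar_mem32,		/* A 32-bit memory BAR */
	pci_bar_mem64,		/* A 64-bit memory BAR */
};

/*
 * Model of pci_resource_bar(): returns the config-space offset of the
 * BAR behind resource 'resno', or 0 if the resource has no BAR.
 */
static int model_resource_bar(int rom_base_reg, int resno,
			      enum pci_bar_type *type)
{
	if (resno < PCI_ROM_RESOURCE) {
		/* Standard BARs are 4 bytes apart, starting at 0x10. */
		*type = pci_bar_unknown;
		return PCI_BASE_ADDRESS_0 + 4 * resno;
	} else if (resno == PCI_ROM_RESOURCE) {
		/* The ROM BAR location depends on the header type. */
		*type = pci_bar_mem32;
		return rom_base_reg;
	}
	return 0;
}
```

For a normal header, resource 0 maps to offset 0x10, resource 5 to 0x24
and the ROM to 0x30; bridge windows (resource 7 and up) return 0, which
callers such as pci_update_resource() treat as "nothing to write".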
drivers/pci/pci.c | 22 ++++++++++++++++++++++
drivers/pci/pci.h | 2 ++
drivers/pci/setup-res.c | 13 +++++--------
3 files changed, 29 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 381e958..3575124 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1909,6 +1909,28 @@ int pci_resource_alignment(struct pci_dev *dev, int resno)
return 0;
}
+/**
+ * pci_resource_bar - get position of the BAR associated with a resource
+ * @dev: the PCI device
+ * @resno: the resource number
+ * @type: the BAR type to be filled in
+ *
+ * Returns BAR position in config space, or 0 if the BAR is invalid.
+ */
+int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
+{
+ if (resno < PCI_ROM_RESOURCE) {
+ *type = pci_bar_unknown;
+ return PCI_BASE_ADDRESS_0 + 4 * resno;
+ } else if (resno == PCI_ROM_RESOURCE) {
+ *type = pci_bar_mem32;
+ return dev->rom_base_reg;
+ }
+
+ dev_err(&dev->dev, "BAR: invalid resource #%d\n", resno);
+ return 0;
+}
+
static void __devinit pci_no_domains(void)
{
#ifdef CONFIG_PCI_DOMAINS
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 720b7d6..e2237ad 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -162,6 +162,8 @@ extern int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
extern struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
struct pci_dev *bridge, int busnr);
extern int pci_resource_alignment(struct pci_dev *dev, int resno);
+extern int pci_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type);
extern void pci_enable_ari(struct pci_dev *dev);
/**
* pci_ari_enabled - query ARI forwarding status
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index ecff483..c3585a0 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -31,6 +31,7 @@ void pci_update_resource(struct pci_dev *dev, int resno)
struct pci_bus_region region;
u32 new, check, mask;
int reg;
+ enum pci_bar_type type;
struct resource *res = dev->resource + resno;
/*
@@ -64,17 +65,13 @@ void pci_update_resource(struct pci_dev *dev, int resno)
else
mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
- if (resno < 6) {
- reg = PCI_BASE_ADDRESS_0 + 4 * resno;
- } else if (resno == PCI_ROM_RESOURCE) {
+ reg = pci_resource_bar(dev, resno, &type);
+ if (!reg)
+ return;
+ if (type != pci_bar_unknown) {
if (!(res->flags & IORESOURCE_ROM_ENABLE))
return;
new |= PCI_ROM_ADDRESS_ENABLE;
- reg = dev->rom_base_reg;
- } else {
- /* Hmm, non-standard resource. */
-
- return; /* kill uninitialised var warning */
}
pci_write_config_dword(dev, reg, new);
--
1.5.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 6/8 v4] PCI: support the SR-IOV capability
2008-10-14 10:34 [PATCH 0/8 v4] PCI: Linux kernel SR-IOV support Yu Zhao
` (9 preceding siblings ...)
2008-10-14 10:57 ` Yu Zhao
@ 2008-10-14 10:59 ` Yu Zhao
2008-10-14 10:59 ` Yu Zhao
` (4 subsequent siblings)
15 siblings, 0 replies; 25+ messages in thread
From: Yu Zhao @ 2008-10-14 10:59 UTC (permalink / raw)
To: linux-pci@vger.kernel.org
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, greg@kroah.com, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
Support Single Root I/O Virtualization (SR-IOV) capability.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
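Illustration only, not part of the patch: a minimal userspace sketch of the
Routing ID arithmetic used by vf_rid() in drivers/pci/iov.c. struct sketch_pf
and the sketch_ prefix are hypothetical; the fields stand in for the PF's bus
number and devfn and for the First VF Offset and VF Stride values read from
the SR-IOV capability.

```c
#include <stdint.h>

/* Hypothetical container for the values vf_rid() takes from the PF. */
struct sketch_pf {
	uint8_t  bus;		/* PF bus number */
	uint8_t  devfn;		/* PF device/function number */
	uint16_t offset;	/* First VF Offset */
	uint16_t stride;	/* VF Stride */
};

/* Compute the bus number and devfn of Virtual Function number vfn. */
static void sketch_vf_rid(const struct sketch_pf *pf, int vfn,
			  uint8_t *busnr, uint8_t *devfn)
{
	/* 16-bit Routing ID: PF RID + offset + stride * vfn */
	uint16_t rid = (uint16_t)((pf->bus << 8) + pf->devfn +
				  pf->offset + pf->stride * vfn);

	*busnr = rid >> 8;	/* upper byte selects the bus */
	*devfn = rid & 0xff;	/* lower byte is device/function */
}
```

For a PF at 01:00.0 with offset 0x80 and stride 2, VF 3 gets Routing ID
0x186, i.e. bus 0x01, devfn 0x86. With a stride of 0x40 the third VF lands
on bus 0x02, which is why pci_iov_register() compares the last VF's bus
number against the subordinate bus number before calling iov_alloc_bus().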
drivers/pci/Kconfig | 12 +
drivers/pci/Makefile | 2 +
drivers/pci/iov.c | 853 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci-sysfs.c | 4 +
drivers/pci/pci.c | 14 +-
drivers/pci/pci.h | 55 +++
drivers/pci/probe.c | 4 +
include/linux/pci.h | 57 +++
include/linux/pci_regs.h | 21 ++
9 files changed, 1021 insertions(+), 1 deletions(-)
create mode 100644 drivers/pci/iov.c
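Illustration only, not part of the patch: a minimal userspace sketch of how
pci_iov_init() picks the System Page Size from the Supported Page Sizes
bitmap. The sketch_ function names and the page_shift parameter (standing in
for PAGE_SHIFT) are assumptions made for the example. Bit n of the bitmap
denotes a page size of 2^(n + 12) bytes, which matches the "pgsz << 12"
used for iov->align.

```c
#include <stdint.h>

/*
 * Returns the single bit to program as System Page Size, or 0 if the
 * device supports no page size at least as large as the host page size.
 */
static uint32_t sketch_pick_pgsz(uint32_t supported, unsigned int page_shift)
{
	unsigned int i = page_shift > 12 ? page_shift - 12 : 0;

	supported &= ~((1u << i) - 1);	/* drop sizes below the host page size */
	if (!supported)
		return 0;

	return supported & ~(supported - 1);	/* keep only the lowest set bit */
}

/* Alignment in bytes implied by the chosen bit. */
static uint64_t sketch_pgsz_to_align(uint32_t pgsz)
{
	return (uint64_t)pgsz << 12;
}
```

With a bitmap of 0x553 and 4 KiB host pages the result is bit 0 (4 KiB
alignment); with 64 KiB host pages it is bit 4 (64 KiB alignment). A device
that only advertises 4 KiB and 8 KiB (0x3) yields 0 on a 64 KiB host, the
case where pci_iov_init() returns -EIO.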
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index e1ca425..e7c0836 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -50,3 +50,15 @@ config HT_IRQ
This allows native hypertransport devices to use interrupts.
If unsure say Y.
+
+config PCI_IOV
+ bool "PCI SR-IOV support"
+ depends on PCI
+ select PCI_MSI
+ default n
+ help
+ This option allows device drivers to enable Single Root I/O
+ Virtualization. Each Virtual Function's PCI configuration
+ space can be accessed using its own Bus, Device and Function
+ Number (Routing ID). Each Virtual Function also has PCI Memory
+ Space, which is used to map its own register set.
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 7d63f8c..47bb456 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -53,3 +53,5 @@ obj-$(CONFIG_PCI_SYSCALL) += syscall.o
ifeq ($(CONFIG_PCI_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG
endif
+
+obj-$(CONFIG_PCI_IOV) += iov.o
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
new file mode 100644
index 0000000..3cf9709
--- /dev/null
+++ b/drivers/pci/iov.c
@@ -0,0 +1,853 @@
+/*
+ * drivers/pci/iov.c
+ *
+ * Copyright (C) 2008 Intel Corporation
+ *
+ * PCI Express Single Root I/O Virtualization capability support.
+ */
+
+#include <linux/ctype.h>
+#include <linux/string.h>
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include <asm/page.h>
+#include "pci.h"
+
+#define VF_NAME_LEN 8
+
+
+struct iov_attr {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject *,
+ struct iov_attr *, char *);
+ ssize_t (*store)(struct kobject *,
+ struct iov_attr *, const char *, size_t);
+};
+
+#define iov_config_attr(field) \
+static ssize_t field##_show(struct kobject *kobj, \
+ struct iov_attr *attr, char *buf) \
+{ \
+ struct pci_iov *iov = container_of(kobj, struct pci_iov, kobj); \
+ \
+ return sprintf(buf, "%d\n", iov->field); \
+}
+
+iov_config_attr(is_enabled);
+iov_config_attr(totalvfs);
+iov_config_attr(initialvfs);
+iov_config_attr(numvfs);
+
+struct vf_entry {
+ int vfn;
+ struct kobject kobj;
+ struct pci_iov *iov;
+ struct iov_attr *attr;
+ char name[VF_NAME_LEN];
+ char (*param)[PCI_IOV_PARAM_LEN];
+};
+
+static ssize_t iov_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct iov_attr *ia = container_of(attr, struct iov_attr, attr);
+
+ return ia->show ? ia->show(kobj, ia, buf) : -EIO;
+}
+
+static ssize_t iov_attr_store(struct kobject *kobj,
+ struct attribute *attr, const char *buf, size_t len)
+{
+ struct iov_attr *ia = container_of(attr, struct iov_attr, attr);
+
+ return ia->store ? ia->store(kobj, ia, buf, len) : -EIO;
+}
+
+static struct sysfs_ops iov_attr_ops = {
+ .show = iov_attr_show,
+ .store = iov_attr_store,
+};
+
+static struct kobj_type iov_ktype = {
+ .sysfs_ops = &iov_attr_ops,
+};
+
+static inline void vf_rid(struct pci_dev *dev, int vfn, u8 *busnr, u8 *devfn)
+{
+ u16 rid;
+
+ rid = (dev->bus->number << 8) + dev->devfn +
+ dev->iov->offset + dev->iov->stride * vfn;
+ *busnr = rid >> 8;
+ *devfn = rid & 0xff;
+}
+
+static int vf_add(struct pci_dev *dev, int vfn)
+{
+ int i;
+ int rc;
+ u8 busnr, devfn;
+ unsigned long size;
+ struct pci_dev *new;
+ struct pci_bus *bus;
+ struct resource *res;
+
+ vf_rid(dev, vfn, &busnr, &devfn);
+
+ new = alloc_pci_dev();
+ if (!new)
+ return -ENOMEM;
+
+ if (dev->bus->number == busnr)
+ new->bus = bus = dev->bus;
+ else {
+ list_for_each_entry(bus, &dev->bus->children, node)
+ if (bus->number == busnr) {
+ new->bus = bus;
+ break;
+ }
+ BUG_ON(!new->bus);
+ }
+
+ new->sysdata = bus->sysdata;
+ new->dev.parent = dev->dev.parent;
+ new->dev.bus = dev->dev.bus;
+ new->devfn = devfn;
+ new->hdr_type = PCI_HEADER_TYPE_NORMAL;
+ new->multifunction = 0;
+ new->vendor = dev->vendor;
+ pci_read_config_word(dev, dev->iov->cap + PCI_IOV_VF_DID, &new->device);
+ new->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
+ new->error_state = pci_channel_io_normal;
+ new->is_pcie = 1;
+ new->pcie_type = PCI_EXP_TYPE_ENDPOINT;
+ new->dma_mask = 0xffffffff;
+
+ dev_set_name(&new->dev, "%04x:%02x:%02x.%d", pci_domain_nr(bus),
+ busnr, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
+ pci_read_config_byte(new, PCI_REVISION_ID, &new->revision);
+ new->class = dev->class;
+ new->current_state = PCI_UNKNOWN;
+ new->irq = 0;
+
+ for (i = 0; i < PCI_IOV_NUM_BAR; i++) {
+ res = dev->resource + PCI_IOV_RESOURCES + i;
+ if (!res->parent)
+ continue;
+ new->resource[i].name = pci_name(new);
+ new->resource[i].flags = res->flags;
+ size = resource_size(res) / dev->iov->totalvfs;
+ new->resource[i].start = res->start + size * vfn;
+ new->resource[i].end = new->resource[i].start + size - 1;
+ rc = request_resource(res, &new->resource[i]);
+ BUG_ON(rc);
+ }
+
+ new->subsystem_vendor = dev->subsystem_vendor;
+ pci_read_config_word(new, PCI_SUBSYSTEM_ID, &new->subsystem_device);
+
+ pci_device_add(new, bus);
+ return pci_bus_add_device(new);
+}
+
+static void vf_remove(struct pci_dev *dev, int vfn)
+{
+ u8 busnr, devfn;
+ struct pci_dev *tmp;
+
+ vf_rid(dev, vfn, &busnr, &devfn);
+
+ tmp = pci_get_bus_and_slot(busnr, devfn);
+ if (!tmp)
+ return;
+
+ pci_dev_put(tmp);
+ pci_remove_bus_device(tmp);
+}
+
+static int iov_enable(struct pci_iov *iov)
+{
+ int rc;
+ int i, j;
+ u16 ctrl;
+
+ if (!iov->notify)
+ return -ENODEV;
+
+ if (iov->is_enabled)
+ return 0;
+
+ iov->notify(iov->dev, iov->numvfs | PCI_IOV_ENABLE);
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, &ctrl);
+ ctrl |= (PCI_IOV_CTRL_VFE | PCI_IOV_CTRL_MSE);
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ for (i = 0; i < iov->numvfs; i++) {
+ rc = vf_add(iov->dev, i);
+ if (rc)
+ goto failed;
+ }
+
+ iov->notify(iov->dev, iov->numvfs |
+ PCI_IOV_ENABLE | PCI_IOV_POST_EVENT);
+ iov->is_enabled = 1;
+ return 0;
+
+failed:
+ for (j = 0; j < i; j++)
+ vf_remove(iov->dev, j);
+
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, &ctrl);
+ ctrl &= ~(PCI_IOV_CTRL_VFE | PCI_IOV_CTRL_MSE);
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ return rc;
+}
+
+static int iov_disable(struct pci_iov *iov)
+{
+ int i;
+ u16 ctrl;
+
+ if (!iov->notify)
+ return -ENODEV;
+
+ if (!iov->is_enabled)
+ return 0;
+
+ iov->notify(iov->dev, PCI_IOV_DISABLE);
+ for (i = 0; i < iov->numvfs; i++)
+ vf_remove(iov->dev, i);
+
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, &ctrl);
+ ctrl &= ~(PCI_IOV_CTRL_VFE | PCI_IOV_CTRL_MSE);
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ iov->notify(iov->dev, PCI_IOV_DISABLE | PCI_IOV_POST_EVENT);
+ iov->is_enabled = 0;
+ return 0;
+}
+
+static int iov_set_numvfs(struct pci_iov *iov, int numvfs)
+{
+ u16 offset, stride;
+
+ if (!iov->notify)
+ return -ENODEV;
+
+ if (numvfs == iov->numvfs)
+ return 0;
+
+ if (numvfs < 0 || numvfs > iov->initialvfs || iov->is_enabled)
+ return -EINVAL;
+
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_NUM_VF, numvfs);
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_VF_OFFSET, &offset);
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_VF_STRIDE, &stride);
+ if ((numvfs && !offset) || (numvfs > 1 && !stride))
+ return -EIO;
+
+ iov->offset = offset;
+ iov->stride = stride;
+ iov->numvfs = numvfs;
+ return 0;
+}
+
+static ssize_t is_enabled_store(struct kobject *kobj, struct iov_attr *attr,
+ const char *buf, size_t count)
+{
+ int rc;
+ long enable;
+ struct pci_iov *iov = container_of(kobj, struct pci_iov, kobj);
+
+ rc = strict_strtol(buf, 0, &enable);
+ if (rc)
+ return rc;
+
+ mutex_lock(&iov->mutex);
+ switch (enable) {
+ case 0:
+ rc = iov_disable(iov);
+ break;
+ case 1:
+ rc = iov_enable(iov);
+ break;
+ default:
+ rc = -EINVAL;
+ }
+ mutex_unlock(&iov->mutex);
+
+ return rc ? rc : count;
+}
+
+static ssize_t numvfs_store(struct kobject *kobj, struct iov_attr *attr,
+ const char *buf, size_t count)
+{
+ int rc;
+ long numvfs;
+ struct pci_iov *iov = container_of(kobj, struct pci_iov, kobj);
+
+ rc = strict_strtol(buf, 0, &numvfs);
+ if (rc)
+ return rc;
+
+ mutex_lock(&iov->mutex);
+ rc = iov_set_numvfs(iov, numvfs);
+ mutex_unlock(&iov->mutex);
+
+ return rc ? rc : count;
+}
+
+
+static struct iov_attr iov_attr[] = {
+ __ATTR_RO(totalvfs),
+ __ATTR_RO(initialvfs),
+ __ATTR(numvfs, S_IWUSR | S_IRUGO, numvfs_show, numvfs_store),
+ __ATTR(enable, S_IWUSR | S_IRUGO, is_enabled_show, is_enabled_store),
+};
+
+static ssize_t vf_show(struct kobject *kobj, struct iov_attr *attr,
+ char *buf)
+{
+ int vfn;
+ struct vf_entry *ve = container_of(kobj, struct vf_entry, kobj);
+
+ vfn = attr - ve->attr;
+ ve->iov->notify(ve->iov->dev, vfn | PCI_IOV_RD_CONF);
+
+ return sprintf(buf, "%s\n", ve->param[vfn]);
+}
+
+static ssize_t vf_store(struct kobject *kobj, struct iov_attr *attr,
+ const char *buf, size_t count)
+{
+ int vfn;
+ struct vf_entry *ve = container_of(kobj, struct vf_entry, kobj);
+
+ vfn = attr - ve->attr;
+ sscanf(buf, "%63s", ve->param[vfn]);
+ ve->iov->notify(ve->iov->dev, vfn | PCI_IOV_WR_CONF);
+
+ return count;
+}
+
+static ssize_t rid_show(struct kobject *kobj, struct iov_attr *attr,
+ char *buf)
+{
+ u8 busnr, devfn;
+ struct vf_entry *ve = container_of(kobj, struct vf_entry, kobj);
+
+ vf_rid(ve->iov->dev, ve->vfn, &busnr, &devfn);
+
+ return sprintf(buf, "%04x:%02x:%02x.%d\n",
+ pci_domain_nr(ve->iov->dev->bus),
+ busnr, PCI_SLOT(devfn), PCI_FUNC(devfn));
+}
+
+static struct iov_attr vf_attr = __ATTR_RO(rid);
+
+int iov_alloc_bus(struct pci_bus *bus, int busnr)
+{
+ int i;
+ int rc = 0;
+ struct pci_bus *child, *next;
+ struct list_head head;
+
+ INIT_LIST_HEAD(&head);
+
+ down_write(&pci_bus_sem);
+
+ for (i = bus->number + 1; i <= busnr; i++) {
+ list_for_each_entry(child, &bus->children, node)
+ if (child->number == i)
+ break;
+ if (child->number == i)
+ continue;
+ child = pci_alloc_child_bus(bus, NULL, i);
+ if (!child) {
+ rc = -ENOMEM;
+ break;
+ }
+ child->subordinate = i;
+ child->dev.parent = bus->bridge;
+ rc = device_register(&child->dev);
+ if (rc) {
+ kfree(child);
+ break;
+ }
+ child->is_added = 1;
+ list_add_tail(&child->node, &head);
+ }
+
+ if (rc)
+ list_for_each_entry_safe(child, next, &head, node) {
+ device_unregister(&child->dev);
+ kfree(child);
+ }
+ else
+ list_for_each_entry_safe(child, next, &head, node)
+ list_move_tail(&child->node, &bus->children);
+
+ up_write(&pci_bus_sem);
+
+ return rc;
+}
+
+void iov_release_bus(struct pci_bus *bus)
+{
+ struct pci_dev *dev;
+ struct pci_bus *child, *next;
+ struct list_head head;
+
+ INIT_LIST_HEAD(&head);
+
+ down_write(&pci_bus_sem);
+
+ list_for_each_entry(dev, &bus->devices, bus_list)
+ if (dev->iov && dev->iov->notify)
+ goto done;
+
+ list_for_each_entry_safe(child, next, &bus->children, node)
+ if (!child->bridge)
+ list_move(&child->node, &head);
+done:
+ up_write(&pci_bus_sem);
+
+ list_for_each_entry_safe(child, next, &head, node)
+ pci_remove_bus(child);
+}
+
+/**
+ * pci_iov_init - initialize device's SR-IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ *
+ * The major differences between Virtual Function and PCI device are:
+ * 1) the device with multiple bus numbers uses internal routing, so
+ * there is no explicit bridge device in this case.
+ * 2) Virtual Function memory spaces are designated by BARs encapsulated
+ * in the capability structure, and the BARs in Virtual Function PCI
+ * configuration space are read-only zero.
+ */
+int pci_iov_init(struct pci_dev *dev)
+{
+ int i;
+ int pos;
+ u32 pgsz;
+ u16 ctrl, total, initial, offset, stride;
+ struct pci_iov *iov;
+ struct resource *res;
+
+ if (!dev->is_pcie || (dev->pcie_type != PCI_EXP_TYPE_RC_END &&
+ dev->pcie_type != PCI_EXP_TYPE_ENDPOINT))
+ return -ENODEV;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_IOV);
+ if (!pos)
+ return -ENODEV;
+
+ ctrl = pci_ari_enabled(dev) ? PCI_IOV_CTRL_ARI : 0;
+ pci_write_config_word(dev, pos + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ pci_read_config_word(dev, pos + PCI_IOV_TOTAL_VF, &total);
+ pci_read_config_word(dev, pos + PCI_IOV_INITIAL_VF, &initial);
+ pci_write_config_word(dev, pos + PCI_IOV_NUM_VF, initial);
+ pci_read_config_word(dev, pos + PCI_IOV_VF_OFFSET, &offset);
+ pci_read_config_word(dev, pos + PCI_IOV_VF_STRIDE, &stride);
+ if (!total || initial > total || (initial && !offset) ||
+ (initial > 1 && !stride))
+ return -EIO;
+
+ pci_read_config_dword(dev, pos + PCI_IOV_SUP_PGSIZE, &pgsz);
+ i = PAGE_SHIFT > 12 ? PAGE_SHIFT - 12 : 0;
+ pgsz &= ~((1 << i) - 1);
+ if (!pgsz)
+ return -EIO;
+
+ pgsz &= ~(pgsz - 1);
+ pci_write_config_dword(dev, pos + PCI_IOV_SYS_PGSIZE, pgsz);
+
+ iov = kzalloc(sizeof(*iov), GFP_KERNEL);
+ if (!iov)
+ return -ENOMEM;
+
+ iov->dev = dev;
+ iov->cap = pos;
+ iov->totalvfs = total;
+ iov->initialvfs = initial;
+ iov->offset = offset;
+ iov->stride = stride;
+ iov->align = pgsz << 12;
+ mutex_init(&iov->mutex);
+
+ for (i = 0; i < PCI_IOV_NUM_BAR; i++) {
+ res = dev->resource + PCI_IOV_RESOURCES + i;
+ pos = iov->cap + PCI_IOV_BAR_0 + i * 4;
+ i += __pci_read_base(dev, pci_bar_unknown, res, pos);
+ if (!res->flags)
+ continue;
+ res->flags &= ~IORESOURCE_SIZEALIGN;
+ res->end = res->start + resource_size(res) * total - 1;
+ }
+
+ dev->iov = iov;
+
+ return 0;
+}
+
+/**
+ * pci_iov_release - release resources used by SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_release(struct pci_dev *dev)
+{
+ if (!dev->iov)
+ return;
+
+ mutex_destroy(&dev->iov->mutex);
+ kfree(dev->iov);
+ dev->iov = NULL;
+}
+
+/**
+ * pci_iov_create_sysfs - create sysfs for SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_create_sysfs(struct pci_dev *dev)
+{
+ int rc;
+ int i, j;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return;
+
+ iov->ve = kzalloc(sizeof(*iov->ve) * iov->totalvfs, GFP_KERNEL);
+ if (!iov->ve)
+ return;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ iov->ve[i].vfn = i;
+ iov->ve[i].iov = iov;
+ }
+
+ rc = kobject_init_and_add(&iov->kobj, &iov_ktype,
+ &dev->dev.kobj, "iov");
+ if (rc)
+ goto failed1;
+
+ for (i = 0; i < ARRAY_SIZE(iov_attr); i++) {
+ rc = sysfs_create_file(&iov->kobj, &iov_attr[i].attr);
+ if (rc)
+ goto failed2;
+ }
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ sprintf(iov->ve[i].name, "%d", i);
+ rc = kobject_init_and_add(&iov->ve[i].kobj, &iov_ktype,
+ &iov->kobj, iov->ve[i].name);
+ if (rc)
+ goto failed3;
+ rc = sysfs_create_file(&iov->ve[i].kobj, &vf_attr.attr);
+ if (rc) {
+ kobject_put(&iov->ve[i].kobj);
+ goto failed3;
+ }
+ }
+
+ return;
+
+failed3:
+ for (j = 0; j < i; j++) {
+ sysfs_remove_file(&iov->ve[j].kobj, &vf_attr.attr);
+ kobject_put(&iov->ve[j].kobj);
+ }
+failed2:
+ for (j = 0; j < i; j++)
+ sysfs_remove_file(&dev->iov->kobj, &iov_attr[j].attr);
+ kobject_put(&iov->kobj);
+failed1:
+ kfree(iov->ve);
+ iov->ve = NULL;
+
+ dev_err(&dev->dev, "can't create sysfs for SR-IOV.\n");
+}
+
+/**
+ * pci_iov_remove_sysfs - remove sysfs of SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_remove_sysfs(struct pci_dev *dev)
+{
+ int i;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov || !iov->ve)
+ return;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ sysfs_remove_file(&iov->ve[i].kobj, &vf_attr.attr);
+ kobject_put(&iov->ve[i].kobj);
+ }
+
+ for (i = 0; i < ARRAY_SIZE(iov_attr); i++)
+ sysfs_remove_file(&dev->iov->kobj, &iov_attr[i].attr);
+
+ kobject_put(&iov->kobj);
+ kfree(iov->ve);
+}
+
+int pci_iov_resource_align(struct pci_dev *dev, int resno)
+{
+ if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCES_END)
+ return 0;
+
+ BUG_ON(!dev->iov);
+
+ return dev->iov->align;
+}
+
+int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type)
+{
+ if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCES_END)
+ return 0;
+
+ BUG_ON(!dev->iov);
+
+ *type = pci_bar_unknown;
+ return dev->iov->cap + PCI_IOV_BAR_0 +
+ 4 * (resno - PCI_IOV_RESOURCES);
+}
+
+/**
+ * pci_iov_register - register SR-IOV service
+ * @dev: the PCI device
+ * @notify: callback function for SR-IOV events
+ * @entries: sysfs entries used by Physical Function driver
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_register(struct pci_dev *dev, int (*notify)(struct pci_dev *, u32),
+ char **entries)
+{
+ int rc;
+ int n, i, j, k;
+ u8 busnr, devfn;
+ struct iov_attr *attr;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov || !iov->ve)
+ return -ENODEV;
+
+ if (!notify)
+ return -EINVAL;
+
+ vf_rid(dev, iov->totalvfs - 1, &busnr, &devfn);
+ if (busnr > dev->bus->subordinate)
+ return -EIO;
+
+ iov->notify = notify;
+ rc = iov_alloc_bus(dev->bus, busnr);
+ if (rc)
+ return rc;
+
+ for (n = 0; entries && entries[n] && *entries[n]; n++)
+ ;
+ if (!n)
+ return 0;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ rc = -ENOMEM;
+ iov->ve[i].param = kzalloc(PCI_IOV_PARAM_LEN * n, GFP_KERNEL);
+ if (!iov->ve[i].param)
+ goto failed;
+ attr = kzalloc(sizeof(*attr) * n, GFP_KERNEL);
+ if (!attr) {
+ kfree(iov->ve[i].param);
+ goto failed;
+ }
+ iov->ve[i].attr = attr;
+ for (j = 0; j < n; j++) {
+ attr[j].attr.name = entries[j];
+ attr[j].attr.mode = S_IWUSR | S_IRUGO;
+ attr[j].show = vf_show;
+ attr[j].store = vf_store;
+ rc = sysfs_create_file(&iov->ve[i].kobj, &attr[j].attr);
+ if (rc) {
+ while (j--)
+ sysfs_remove_file(&iov->ve[i].kobj,
+ &attr[j].attr);
+ kfree(iov->ve[i].attr);
+ kfree(iov->ve[i].param);
+ goto failed;
+ }
+ }
+ }
+
+ iov->nentries = n;
+ return 0;
+
+failed:
+ for (k = 0; k < i; k++) {
+ for (j = 0; j < n; j++)
+ sysfs_remove_file(&iov->ve[k].kobj,
+ &iov->ve[k].attr[j].attr);
+ kfree(iov->ve[k].attr);
+ kfree(iov->ve[k].param);
+ }
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(pci_iov_register);
+
+/**
+ * pci_iov_unregister - unregister SR-IOV service
+ * @dev: the PCI device
+ */
+void pci_iov_unregister(struct pci_dev *dev)
+{
+ int i, j;
+ struct pci_iov *iov = dev->iov;
+
+ BUG_ON(!iov || !iov->notify);
+
+ if (!iov->nentries)
+ return;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ for (j = 0; j < iov->nentries; j++)
+ sysfs_remove_file(&iov->ve[i].kobj,
+ &iov->ve[i].attr[j].attr);
+ kfree(iov->ve[i].attr);
+ kfree(iov->ve[i].param);
+ }
+ iov->notify = NULL;
+ iov_release_bus(dev->bus);
+}
+EXPORT_SYMBOL_GPL(pci_iov_unregister);
+
+/**
+ * pci_iov_enable - enable SR-IOV capability
+ * @dev: the PCI device
+ * @numvfs: number of VFs to be available
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_enable(struct pci_dev *dev, int numvfs)
+{
+ int rc;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return -ENODEV;
+
+ if (!iov->notify)
+ return -EINVAL;
+
+ mutex_lock(&iov->mutex);
+ rc = iov_set_numvfs(iov, numvfs);
+ if (rc)
+ goto done;
+ rc = iov_enable(iov);
+done:
+ mutex_unlock(&iov->mutex);
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(pci_iov_enable);
+
+/**
+ * pci_iov_disable - disable SR-IOV capability
+ * @dev: the PCI device
+ *
+ * Should be called upon Physical Function driver removal, and power
+ * state change. All previously allocated Virtual Functions are reclaimed.
+ */
+void pci_iov_disable(struct pci_dev *dev)
+{
+ struct pci_iov *iov = dev->iov;
+
+ BUG_ON(!iov || !iov->notify);
+ mutex_lock(&iov->mutex);
+ iov_disable(iov);
+ mutex_unlock(&iov->mutex);
+}
+EXPORT_SYMBOL_GPL(pci_iov_disable);
+
+/**
+ * pci_iov_read_config - read SR-IOV configurations
+ * @dev: the PCI device
+ * @vfn: Virtual Function Number
+ * @entry: the entry to be read
+ * @buf: the buffer to be filled
+ * @size: size of the buffer
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_read_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf, int size)
+{
+ int i;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return -ENODEV;
+
+ if (!iov->notify || !iov->ve || !iov->nentries)
+ return -EINVAL;
+
+ if (vfn < 0 || vfn >= iov->totalvfs)
+ return -EINVAL;
+
+ for (i = 0; i < iov->nentries; i++)
+ if (!strcmp(iov->ve[vfn].attr[i].attr.name, entry)) {
+ strncpy(buf, iov->ve[vfn].param[i], size);
+ buf[size - 1] = '\0';
+ return 0;
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(pci_iov_read_config);
+
+/**
+ * pci_iov_write_config - write SR-IOV configurations
+ * @dev: the PCI device
+ * @vfn: Virtual Function Number
+ * @entry: the entry to be written
+ * @buf: the buffer containing the configuration
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_write_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf)
+{
+ int i;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return -ENODEV;
+
+ if (!iov->notify || !iov->ve || !iov->nentries)
+ return -EINVAL;
+
+ if (vfn < 0 || vfn >= iov->totalvfs)
+ return -EINVAL;
+
+ for (i = 0; i < iov->nentries; i++)
+ if (!strcmp(iov->ve[vfn].attr[i].attr.name, entry)) {
+ strncpy(iov->ve[vfn].param[i], buf, PCI_IOV_PARAM_LEN);
+ iov->ve[vfn].param[i][PCI_IOV_PARAM_LEN - 1] = '\0';
+ return 0;
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(pci_iov_write_config);
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c41b783..9494659 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -764,6 +764,9 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
/* Active State Power Management */
pcie_aspm_create_sysfs_dev_files(dev);
+ /* Single Root I/O Virtualization */
+ pci_iov_create_sysfs(dev);
+
return 0;
}
@@ -849,6 +852,7 @@ static void pci_remove_capabilities_sysfs(struct pci_dev *dev)
}
pcie_aspm_remove_sysfs_dev_files(dev);
+ pci_iov_remove_sysfs(dev);
}
/**
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3575124..4cfdbdb 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1902,7 +1902,12 @@ int pci_resource_alignment(struct pci_dev *dev, int resno)
if (resno <= PCI_ROM_RESOURCE)
return resource_size(res);
- else if (resno <= PCI_BRIDGE_RES_END)
+ else if (resno < PCI_BRIDGE_RESOURCES) {
+ /* may be device specific resource */
+ align = pci_iov_resource_align(dev, resno);
+ if (align)
+ return align;
+ } else if (resno <= PCI_BRIDGE_RES_END)
return res->start;
dev_err(&dev->dev, "alignment: invalid resource #%d\n", resno);
@@ -1919,12 +1924,19 @@ int pci_resource_alignment(struct pci_dev *dev, int resno)
*/
int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
{
+ int reg;
+
if (resno < PCI_ROM_RESOURCE) {
*type = pci_bar_unknown;
return PCI_BASE_ADDRESS_0 + 4 * resno;
} else if (resno == PCI_ROM_RESOURCE) {
*type = pci_bar_mem32;
return dev->rom_base_reg;
+ } else if (resno < PCI_BRIDGE_RESOURCES) {
+ /* may be device specific resource */
+ reg = pci_iov_resource_bar(dev, resno, type);
+ if (reg)
+ return reg;
}
dev_err(&dev->dev, "BAR: invalid resource #%d\n", resno);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e2237ad..c66a4bd 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -176,4 +176,59 @@ static inline int pci_ari_enabled(struct pci_dev *dev)
return dev->ari_enabled;
}
+/* Single Root I/O Virtualization */
+#define PCI_IOV_PARAM_LEN 64
+
+struct vf_entry;
+
+struct pci_iov {
+ int cap; /* capability position */
+ int align; /* page size used to map memory space */
+ int is_enabled; /* status of SR-IOV */
+ int nentries; /* number of sysfs entries used by PF driver */
+ u16 totalvfs; /* total VFs associated with the PF */
+ u16 initialvfs; /* initial VFs associated with the PF */
+ u16 numvfs; /* number of VFs available */
+ u16 offset; /* first VF Routing ID offset */
+ u16 stride; /* following VF stride */
+ struct mutex mutex; /* lock for SR-IOV */
+ struct kobject kobj; /* kobject for IOV */
+ struct pci_dev *dev; /* Physical Function */
+ struct vf_entry *ve; /* Virtual Function related */
+ int (*notify)(struct pci_dev *, u32); /* event callback function */
+};
+
+#ifdef CONFIG_PCI_IOV
+extern int pci_iov_init(struct pci_dev *dev);
+extern void pci_iov_release(struct pci_dev *dev);
+void pci_iov_create_sysfs(struct pci_dev *dev);
+void pci_iov_remove_sysfs(struct pci_dev *dev);
+extern int pci_iov_resource_align(struct pci_dev *dev, int resno);
+extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type);
+#else
+static inline int pci_iov_init(struct pci_dev *dev)
+{
+ return -EIO;
+}
+static inline void pci_iov_release(struct pci_dev *dev)
+{
+}
+static inline void pci_iov_create_sysfs(struct pci_dev *dev)
+{
+}
+static inline void pci_iov_remove_sysfs(struct pci_dev *dev)
+{
+}
+static inline int pci_iov_resource_align(struct pci_dev *dev, int resno)
+{
+ return 0;
+}
+static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type)
+{
+ return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
#endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 9c680b8..831d8d0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -845,6 +845,7 @@ static int pci_setup_device(struct pci_dev * dev)
static void pci_release_capabilities(struct pci_dev *dev)
{
pci_vpd_release(dev);
+ pci_iov_release(dev);
}
/**
@@ -1023,6 +1024,9 @@ static void pci_init_capabilities(struct pci_dev *dev)
/* Alternative Routing-ID Forwarding */
pci_enable_ari(dev);
+
+ /* Single Root I/O Virtualization */
+ pci_iov_init(dev);
}
void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 497d639..a7d2fd4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -87,6 +87,12 @@ enum {
/* #6: expansion ROM */
PCI_ROM_RESOURCE,
+ /* device specific resources */
+#ifdef CONFIG_PCI_IOV
+ PCI_IOV_RESOURCES,
+ PCI_IOV_RESOURCES_END = PCI_IOV_RESOURCES + PCI_IOV_NUM_BAR - 1,
+#endif
+
/* address space assigned to buses behind the bridge */
#ifndef PCI_BRIDGE_RES_NUM
#define PCI_BRIDGE_RES_NUM 4
@@ -165,6 +171,7 @@ struct pci_cap_saved_state {
struct pcie_link_state;
struct pci_vpd;
+struct pci_iov;
/*
* The pci_dev structure is used to describe PCI devices.
@@ -253,6 +260,7 @@ struct pci_dev {
struct list_head msi_list;
#endif
struct pci_vpd *vpd;
+ struct pci_iov *iov;
};
extern struct pci_dev *alloc_pci_dev(void);
@@ -1128,5 +1136,54 @@ static inline void pci_mmcfg_early_init(void) { }
static inline void pci_mmcfg_late_init(void) { }
#endif
+/* SR-IOV event masks */
+#define PCI_IOV_VIRTFN_ID 0x0000FFFFU /* Virtual Function Number */
+#define PCI_IOV_NUM_VIRTFN 0x0000FFFFU /* num of Virtual Functions */
+#define PCI_IOV_EVENT_TYPE 0x80000000U /* event type (pre/post) */
+/* SR-IOV event values */
+#define PCI_IOV_ENABLE 0x00010000U /* SR-IOV enable request */
+#define PCI_IOV_DISABLE 0x00020000U /* SR-IOV disable request */
+#define PCI_IOV_RD_CONF 0x00040000U /* read configuration */
+#define PCI_IOV_WR_CONF 0x00080000U /* write configuration */
+#define PCI_IOV_POST_EVENT 0x80000000U /* post event */
+
+#ifdef CONFIG_PCI_IOV
+extern int pci_iov_enable(struct pci_dev *dev, int numvfs);
+extern void pci_iov_disable(struct pci_dev *dev);
+extern int pci_iov_register(struct pci_dev *dev,
+ int (*notify)(struct pci_dev *dev, u32 event), char **entries);
+extern void pci_iov_unregister(struct pci_dev *dev);
+extern int pci_iov_read_config(struct pci_dev *dev, int id,
+ char *entry, char *buf, int size);
+extern int pci_iov_write_config(struct pci_dev *dev, int id,
+ char *entry, char *buf);
+#else
+static inline int pci_iov_enable(struct pci_dev *dev, int numvfs)
+{
+ return -EIO;
+}
+static inline void pci_iov_disable(struct pci_dev *dev)
+{
+}
+static inline int pci_iov_register(struct pci_dev *dev,
+ int (*notify)(struct pci_dev *dev, u32 event), char **entries)
+{
+ return -EIO;
+}
+static inline void pci_iov_unregister(struct pci_dev *dev)
+{
+}
+static inline int pci_iov_read_config(struct pci_dev *dev, int id,
+ char *entry, char *buf, int size)
+{
+ return -EIO;
+}
+static inline int pci_iov_write_config(struct pci_dev *dev, int id,
+ char *entry, char *buf)
+{
+ return -EIO;
+}
+#endif /* CONFIG_PCI_IOV */
+
#endif /* __KERNEL__ */
#endif /* LINUX_PCI_H */
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index eb6686b..1b28b3f 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -363,6 +363,7 @@
#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
+#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
#define PCI_EXP_DEVCAP 4 /* Device capabilities */
@@ -434,6 +435,7 @@
#define PCI_EXT_CAP_ID_DSN 3
#define PCI_EXT_CAP_ID_PWR 4
#define PCI_EXT_CAP_ID_ARI 14
+#define PCI_EXT_CAP_ID_IOV 16
/* Advanced Error Reporting */
#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
@@ -551,4 +553,23 @@
#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
+/* Single Root I/O Virtualization */
+#define PCI_IOV_CAP 0x04 /* SR-IOV Capabilities */
+#define PCI_IOV_CTRL 0x08 /* SR-IOV Control */
+#define PCI_IOV_CTRL_VFE 0x01 /* VF Enable */
+#define PCI_IOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
+#define PCI_IOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
+#define PCI_IOV_STATUS 0x0a /* SR-IOV Status */
+#define PCI_IOV_INITIAL_VF 0x0c /* Initial VFs */
+#define PCI_IOV_TOTAL_VF 0x0e /* Total VFs */
+#define PCI_IOV_NUM_VF 0x10 /* Number of VFs */
+#define PCI_IOV_FUNC_LINK 0x12 /* Function Dependency Link */
+#define PCI_IOV_VF_OFFSET 0x14 /* First VF Offset */
+#define PCI_IOV_VF_STRIDE 0x16 /* Following VF Stride */
+#define PCI_IOV_VF_DID 0x1a /* VF Device ID */
+#define PCI_IOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
+#define PCI_IOV_SYS_PGSIZE 0x20 /* System Page Size */
+#define PCI_IOV_BAR_0 0x24 /* VF BAR0 */
+#define PCI_IOV_NUM_BAR 6 /* Number of VF BARs */
+
#endif /* LINUX_PCI_REGS_H */
--
1.5.6.4
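
Reader's note (not part of the patch): the include/linux/pci.h hunk packs each notification into one u32 that is handed to the Physical Function driver's notify() callback. Going by the header comments, the low 16 bits carry a VF count or Virtual Function number, bits 16-19 select the request, and bit 31 distinguishes the pre-event from the post-event call. A minimal userspace sketch of how a callback might take such a word apart follows; the macro values are copied from the patch, while the iov_event_*() helper names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the include/linux/pci.h hunk of this patch. */
#define PCI_IOV_VIRTFN_ID	0x0000FFFFU	/* Virtual Function Number */
#define PCI_IOV_NUM_VIRTFN	0x0000FFFFU	/* num of Virtual Functions */
#define PCI_IOV_EVENT_TYPE	0x80000000U	/* event type (pre/post) */
#define PCI_IOV_ENABLE		0x00010000U
#define PCI_IOV_DISABLE		0x00020000U
#define PCI_IOV_RD_CONF		0x00040000U
#define PCI_IOV_WR_CONF		0x00080000U
#define PCI_IOV_POST_EVENT	0x80000000U

/* Hypothetical helper: request bits with the payload and pre/post flag removed. */
static inline uint32_t iov_event_kind(uint32_t event)
{
	return event & ~(PCI_IOV_EVENT_TYPE | PCI_IOV_NUM_VIRTFN);
}

/* Hypothetical helper: the 16-bit payload (VF count or VF number). */
static inline uint32_t iov_event_arg(uint32_t event)
{
	return event & PCI_IOV_NUM_VIRTFN;
}

/* Hypothetical helper: non-zero when this is the post-event notification. */
static inline int iov_event_is_post(uint32_t event)
{
	return (event & PCI_IOV_EVENT_TYPE) == PCI_IOV_POST_EVENT;
}
```

For instance, iov_enable() in this patch sends `numvfs | PCI_IOV_ENABLE` before setting VF Enable and `numvfs | PCI_IOV_ENABLE | PCI_IOV_POST_EVENT` once the VFs have been added, so a callback could switch on iov_event_kind() and check iov_event_is_post() inside each case.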
* [PATCH 6/8 v4] PCI: support the SR-IOV capability
2008-10-14 10:34 [PATCH 0/8 v4] PCI: Linux kernel SR-IOV support Yu Zhao
` (10 preceding siblings ...)
2008-10-14 10:59 ` [PATCH 6/8 v4] PCI: support the SR-IOV capability Yu Zhao
@ 2008-10-14 10:59 ` Yu Zhao
2008-10-14 12:30 ` Matthew Wilcox
` (3 more replies)
2008-10-14 11:00 ` [PATCH 7/8 v4] PCI: reserve bus range for the SR-IOV device Yu Zhao
` (3 subsequent siblings)
15 siblings, 4 replies; 25+ messages in thread
From: Yu Zhao @ 2008-10-14 10:59 UTC (permalink / raw)
To: linux-pci@vger.kernel.org
Cc: jbarnes@virtuousgeek.org, randy.dunlap@oracle.com,
grundler@parisc-linux.org, achiang@hp.com, matthew@wil.cx,
rdreier@cisco.com, greg@kroah.com, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Support Single Root I/O Virtualization (SR-IOV) capability.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/Kconfig | 12 +
drivers/pci/Makefile | 2 +
drivers/pci/iov.c | 853 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci-sysfs.c | 4 +
drivers/pci/pci.c | 14 +-
drivers/pci/pci.h | 55 +++
drivers/pci/probe.c | 4 +
include/linux/pci.h | 57 +++
include/linux/pci_regs.h | 21 ++
9 files changed, 1021 insertions(+), 1 deletions(-)
create mode 100644 drivers/pci/iov.c
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index e1ca425..e7c0836 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -50,3 +50,15 @@ config HT_IRQ
This allows native hypertransport devices to use interrupts.
If unsure say Y.
+
+config PCI_IOV
+ bool "PCI SR-IOV support"
+ depends on PCI
+ select PCI_MSI
+ default n
+ help
+ This option allows device drivers to enable Single Root I/O
+ Virtualization. Each Virtual Function's PCI configuration
+ space can be accessed using its own Bus, Device and Function
+ Number (Routing ID). Each Virtual Function also has PCI Memory
+ Space, which is used to map its own register set.
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 7d63f8c..47bb456 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -53,3 +53,5 @@ obj-$(CONFIG_PCI_SYSCALL) += syscall.o
ifeq ($(CONFIG_PCI_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG
endif
+
+obj-$(CONFIG_PCI_IOV) += iov.o
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
new file mode 100644
index 0000000..3cf9709
--- /dev/null
+++ b/drivers/pci/iov.c
@@ -0,0 +1,853 @@
+/*
+ * drivers/pci/iov.c
+ *
+ * Copyright (C) 2008 Intel Corporation
+ *
+ * PCI Express Single Root I/O Virtualization capability support.
+ */
+
+#include <linux/ctype.h>
+#include <linux/string.h>
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include <asm/page.h>
+#include "pci.h"
+
+#define VF_NAME_LEN 8
+
+
+struct iov_attr {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject *,
+ struct iov_attr *, char *);
+ ssize_t (*store)(struct kobject *,
+ struct iov_attr *, const char *, size_t);
+};
+
+#define iov_config_attr(field) \
+static ssize_t field##_show(struct kobject *kobj, \
+ struct iov_attr *attr, char *buf) \
+{ \
+ struct pci_iov *iov = container_of(kobj, struct pci_iov, kobj); \
+ \
+ return sprintf(buf, "%d\n", iov->field); \
+}
+
+iov_config_attr(is_enabled);
+iov_config_attr(totalvfs);
+iov_config_attr(initialvfs);
+iov_config_attr(numvfs);
+
+struct vf_entry {
+ int vfn;
+ struct kobject kobj;
+ struct pci_iov *iov;
+ struct iov_attr *attr;
+ char name[VF_NAME_LEN];
+ char (*param)[PCI_IOV_PARAM_LEN];
+};
+
+static ssize_t iov_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct iov_attr *ia = container_of(attr, struct iov_attr, attr);
+
+ return ia->show ? ia->show(kobj, ia, buf) : -EIO;
+}
+
+static ssize_t iov_attr_store(struct kobject *kobj,
+ struct attribute *attr, const char *buf, size_t len)
+{
+ struct iov_attr *ia = container_of(attr, struct iov_attr, attr);
+
+ return ia->store ? ia->store(kobj, ia, buf, len) : -EIO;
+}
+
+static struct sysfs_ops iov_attr_ops = {
+ .show = iov_attr_show,
+ .store = iov_attr_store,
+};
+
+static struct kobj_type iov_ktype = {
+ .sysfs_ops = &iov_attr_ops,
+};
+
+static inline void vf_rid(struct pci_dev *dev, int vfn, u8 *busnr, u8 *devfn)
+{
+ u16 rid;
+
+ rid = (dev->bus->number << 8) + dev->devfn +
+ dev->iov->offset + dev->iov->stride * vfn;
+ *busnr = rid >> 8;
+ *devfn = rid & 0xff;
+}
+
+static int vf_add(struct pci_dev *dev, int vfn)
+{
+ int i;
+ int rc;
+ u8 busnr, devfn;
+ unsigned long size;
+ struct pci_dev *new;
+ struct pci_bus *bus;
+ struct resource *res;
+
+ vf_rid(dev, vfn, &busnr, &devfn);
+
+ new = alloc_pci_dev();
+ if (!new)
+ return -ENOMEM;
+
+ if (dev->bus->number == busnr)
+ new->bus = bus = dev->bus;
+ else {
+ list_for_each_entry(bus, &dev->bus->children, node)
+ if (bus->number == busnr) {
+ new->bus = bus;
+ break;
+ }
+ BUG_ON(!new->bus);
+ }
+
+ new->sysdata = bus->sysdata;
+ new->dev.parent = dev->dev.parent;
+ new->dev.bus = dev->dev.bus;
+ new->devfn = devfn;
+ new->hdr_type = PCI_HEADER_TYPE_NORMAL;
+ new->multifunction = 0;
+ new->vendor = dev->vendor;
+ pci_read_config_word(dev, dev->iov->cap + PCI_IOV_VF_DID, &new->device);
+ new->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
+ new->error_state = pci_channel_io_normal;
+ new->is_pcie = 1;
+ new->pcie_type = PCI_EXP_TYPE_ENDPOINT;
+ new->dma_mask = 0xffffffff;
+
+ dev_set_name(&new->dev, "%04x:%02x:%02x.%d", pci_domain_nr(bus),
+ busnr, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
+ pci_read_config_byte(new, PCI_REVISION_ID, &new->revision);
+ new->class = dev->class;
+ new->current_state = PCI_UNKNOWN;
+ new->irq = 0;
+
+ for (i = 0; i < PCI_IOV_NUM_BAR; i++) {
+ res = dev->resource + PCI_IOV_RESOURCES + i;
+ if (!res->parent)
+ continue;
+ new->resource[i].name = pci_name(new);
+ new->resource[i].flags = res->flags;
+ size = resource_size(res) / dev->iov->totalvfs;
+ new->resource[i].start = res->start + size * vfn;
+ new->resource[i].end = new->resource[i].start + size - 1;
+ rc = request_resource(res, &new->resource[i]);
+ BUG_ON(rc);
+ }
+
+ new->subsystem_vendor = dev->subsystem_vendor;
+ pci_read_config_word(new, PCI_SUBSYSTEM_ID, &new->subsystem_device);
+
+ pci_device_add(new, bus);
+ return pci_bus_add_device(new);
+}
+
+static void vf_remove(struct pci_dev *dev, int vfn)
+{
+ u8 busnr, devfn;
+ struct pci_dev *tmp;
+
+ vf_rid(dev, vfn, &busnr, &devfn);
+
+ tmp = pci_get_bus_and_slot(busnr, devfn);
+ if (!tmp)
+ return;
+
+ pci_dev_put(tmp);
+ pci_remove_bus_device(tmp);
+}
+
+static int iov_enable(struct pci_iov *iov)
+{
+ int rc;
+ int i, j;
+ u16 ctrl;
+
+ if (!iov->notify)
+ return -ENODEV;
+
+ if (iov->is_enabled)
+ return 0;
+
+ iov->notify(iov->dev, iov->numvfs | PCI_IOV_ENABLE);
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, &ctrl);
+ ctrl |= (PCI_IOV_CTRL_VFE | PCI_IOV_CTRL_MSE);
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ for (i = 0; i < iov->numvfs; i++) {
+ rc = vf_add(iov->dev, i);
+ if (rc)
+ goto failed;
+ }
+
+ iov->notify(iov->dev, iov->numvfs |
+ PCI_IOV_ENABLE | PCI_IOV_POST_EVENT);
+ iov->is_enabled = 1;
+ return 0;
+
+failed:
+ for (j = 0; j < i; j++)
+ vf_remove(iov->dev, j);
+
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, &ctrl);
+ ctrl &= ~(PCI_IOV_CTRL_VFE | PCI_IOV_CTRL_MSE);
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ return rc;
+}
+
+static int iov_disable(struct pci_iov *iov)
+{
+ int i;
+ u16 ctrl;
+
+ if (!iov->notify)
+ return -ENODEV;
+
+ if (!iov->is_enabled)
+ return 0;
+
+ iov->notify(iov->dev, PCI_IOV_DISABLE);
+ for (i = 0; i < iov->numvfs; i++)
+ vf_remove(iov->dev, i);
+
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, &ctrl);
+ ctrl &= ~(PCI_IOV_CTRL_VFE | PCI_IOV_CTRL_MSE);
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ iov->notify(iov->dev, PCI_IOV_DISABLE | PCI_IOV_POST_EVENT);
+ iov->is_enabled = 0;
+ return 0;
+}
+
+static int iov_set_numvfs(struct pci_iov *iov, int numvfs)
+{
+ u16 offset, stride;
+
+ if (!iov->notify)
+ return -ENODEV;
+
+ if (numvfs == iov->numvfs)
+ return 0;
+
+ if (numvfs < 0 || numvfs > iov->initialvfs || iov->is_enabled)
+ return -EINVAL;
+
+ pci_write_config_word(iov->dev, iov->cap + PCI_IOV_NUM_VF, numvfs);
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_VF_OFFSET, &offset);
+ pci_read_config_word(iov->dev, iov->cap + PCI_IOV_VF_STRIDE, &stride);
+ if ((numvfs && !offset) || (numvfs > 1 && !stride))
+ return -EIO;
+
+ iov->offset = offset;
+ iov->stride = stride;
+ iov->numvfs = numvfs;
+ return 0;
+}
+
+static ssize_t is_enabled_store(struct kobject *kobj, struct iov_attr *attr,
+ const char *buf, size_t count)
+{
+ int rc;
+ long enable;
+ struct pci_iov *iov = container_of(kobj, struct pci_iov, kobj);
+
+ rc = strict_strtol(buf, 0, &enable);
+ if (rc)
+ return rc;
+
+ mutex_lock(&iov->mutex);
+ switch (enable) {
+ case 0:
+ rc = iov_disable(iov);
+ break;
+ case 1:
+ rc = iov_enable(iov);
+ break;
+ default:
+ rc = -EINVAL;
+ }
+ mutex_unlock(&iov->mutex);
+
+ return rc ? rc : count;
+}
+
+static ssize_t numvfs_store(struct kobject *kobj, struct iov_attr *attr,
+ const char *buf, size_t count)
+{
+ int rc;
+ long numvfs;
+ struct pci_iov *iov = container_of(kobj, struct pci_iov, kobj);
+
+ rc = strict_strtol(buf, 0, &numvfs);
+ if (rc)
+ return rc;
+
+ mutex_lock(&iov->mutex);
+ rc = iov_set_numvfs(iov, numvfs);
+ mutex_unlock(&iov->mutex);
+
+ return rc ? rc : count;
+}
+
+
+static struct iov_attr iov_attr[] = {
+ __ATTR_RO(totalvfs),
+ __ATTR_RO(initialvfs),
+ __ATTR(numvfs, S_IWUSR | S_IRUGO, numvfs_show, numvfs_store),
+ __ATTR(enable, S_IWUSR | S_IRUGO, is_enabled_show, is_enabled_store),
+};
+
+static ssize_t vf_show(struct kobject *kobj, struct iov_attr *attr,
+ char *buf)
+{
+ int vfn;
+ struct vf_entry *ve = container_of(kobj, struct vf_entry, kobj);
+
+ vfn = attr - ve->attr;
+ ve->iov->notify(ve->iov->dev, vfn | PCI_IOV_RD_CONF);
+
+ return sprintf(buf, "%s\n", ve->param[vfn]);
+}
+
+static ssize_t vf_store(struct kobject *kobj, struct iov_attr *attr,
+ const char *buf, size_t count)
+{
+ int vfn;
+ struct vf_entry *ve = container_of(kobj, struct vf_entry, kobj);
+
+ vfn = attr - ve->attr;
+ sscanf(buf, "%63s", ve->param[vfn]);
+ ve->iov->notify(ve->iov->dev, vfn | PCI_IOV_WR_CONF);
+
+ return count;
+}
+
+static ssize_t rid_show(struct kobject *kobj, struct iov_attr *attr,
+ char *buf)
+{
+ u8 busnr, devfn;
+ struct vf_entry *ve = container_of(kobj, struct vf_entry, kobj);
+
+ vf_rid(ve->iov->dev, ve->vfn, &busnr, &devfn);
+
+ return sprintf(buf, "%04x:%02x:%02x.%d\n",
+ pci_domain_nr(ve->iov->dev->bus),
+ busnr, PCI_SLOT(devfn), PCI_FUNC(devfn));
+}
+
+static struct iov_attr vf_attr = __ATTR_RO(rid);
+
+int iov_alloc_bus(struct pci_bus *bus, int busnr)
+{
+ int i;
+ int rc = 0;
+ struct pci_bus *child, *next;
+ struct list_head head;
+
+ INIT_LIST_HEAD(&head);
+
+ down_write(&pci_bus_sem);
+
+ for (i = bus->number + 1; i <= busnr; i++) {
+ list_for_each_entry(child, &bus->children, node)
+ if (child->number == i)
+ break;
+ if (child->number == i)
+ continue;
+ child = pci_alloc_child_bus(bus, NULL, i);
+ if (!child) {
+ rc = -ENOMEM;
+ break;
+ }
+ child->subordinate = i;
+ child->dev.parent = bus->bridge;
+ rc = device_register(&child->dev);
+ if (rc) {
+ kfree(child);
+ break;
+ }
+ child->is_added = 1;
+ list_add_tail(&child->node, &head);
+ }
+
+ if (rc)
+ list_for_each_entry_safe(child, next, &head, node) {
+ device_unregister(&child->dev);
+ kfree(child);
+ }
+ else
+ list_for_each_entry_safe(child, next, &head, node)
+ list_move_tail(&child->node, &bus->children);
+
+ up_write(&pci_bus_sem);
+
+ return rc;
+}
+
+void iov_release_bus(struct pci_bus *bus)
+{
+ struct pci_dev *dev;
+ struct pci_bus *child, *next;
+ struct list_head head;
+
+ INIT_LIST_HEAD(&head);
+
+ down_write(&pci_bus_sem);
+
+ list_for_each_entry(dev, &bus->devices, bus_list)
+ if (dev->iov && dev->iov->notify)
+ goto done;
+
+ list_for_each_entry_safe(child, next, &bus->children, node)
+ if (!child->bridge)
+ list_move(&child->node, &head);
+done:
+ up_write(&pci_bus_sem);
+
+ list_for_each_entry_safe(child, next, &head, node)
+ pci_remove_bus(child);
+}
+
+/**
+ * pci_iov_init - initialize device's SR-IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ *
+ * The major differences between Virtual Function and PCI device are:
+ * 1) the device with multiple bus numbers uses internal routing, so
+ * there is no explicit bridge device in this case.
+ * 2) Virtual Function memory spaces are designated by BARs encapsulated
+ * in the capability structure, and the BARs in Virtual Function PCI
+ * configuration space are read-only zero.
+ */
+int pci_iov_init(struct pci_dev *dev)
+{
+ int i;
+ int pos;
+ u32 pgsz;
+ u16 ctrl, total, initial, offset, stride;
+ struct pci_iov *iov;
+ struct resource *res;
+
+ if (!dev->is_pcie || (dev->pcie_type != PCI_EXP_TYPE_RC_END &&
+ dev->pcie_type != PCI_EXP_TYPE_ENDPOINT))
+ return -ENODEV;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_IOV);
+ if (!pos)
+ return -ENODEV;
+
+ ctrl = pci_ari_enabled(dev) ? PCI_IOV_CTRL_ARI : 0;
+ pci_write_config_word(dev, pos + PCI_IOV_CTRL, ctrl);
+ ssleep(1);
+
+ pci_read_config_word(dev, pos + PCI_IOV_TOTAL_VF, &total);
+ pci_read_config_word(dev, pos + PCI_IOV_INITIAL_VF, &initial);
+ pci_write_config_word(dev, pos + PCI_IOV_NUM_VF, initial);
+ pci_read_config_word(dev, pos + PCI_IOV_VF_OFFSET, &offset);
+ pci_read_config_word(dev, pos + PCI_IOV_VF_STRIDE, &stride);
+ if (!total || initial > total || (initial && !offset) ||
+ (initial > 1 && !stride))
+ return -EIO;
+
+ pci_read_config_dword(dev, pos + PCI_IOV_SUP_PGSIZE, &pgsz);
+ i = PAGE_SHIFT > 12 ? PAGE_SHIFT - 12 : 0;
+ pgsz &= ~((1 << i) - 1);
+ if (!pgsz)
+ return -EIO;
+
+ pgsz &= ~(pgsz - 1);
+ pci_write_config_dword(dev, pos + PCI_IOV_SYS_PGSIZE, pgsz);
+
+ iov = kzalloc(sizeof(*iov), GFP_KERNEL);
+ if (!iov)
+ return -ENOMEM;
+
+ iov->dev = dev;
+ iov->cap = pos;
+ iov->totalvfs = total;
+ iov->initialvfs = initial;
+ iov->offset = offset;
+ iov->stride = stride;
+ iov->align = pgsz << 12;
+ mutex_init(&iov->mutex);
+
+ for (i = 0; i < PCI_IOV_NUM_BAR; i++) {
+ res = dev->resource + PCI_IOV_RESOURCES + i;
+ pos = iov->cap + PCI_IOV_BAR_0 + i * 4;
+ i += __pci_read_base(dev, pci_bar_unknown, res, pos);
+ if (!res->flags)
+ continue;
+ res->flags &= ~IORESOURCE_SIZEALIGN;
+ res->end = res->start + resource_size(res) * total - 1;
+ }
+
+ dev->iov = iov;
+
+ return 0;
+}
+
+/**
+ * pci_iov_release - release resources used by SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_release(struct pci_dev *dev)
+{
+ if (!dev->iov)
+ return;
+
+ mutex_destroy(&dev->iov->mutex);
+ kfree(dev->iov);
+ dev->iov = NULL;
+}
+
+/**
+ * pci_iov_create_sysfs - create sysfs for SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_create_sysfs(struct pci_dev *dev)
+{
+ int rc;
+ int i, j;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return;
+
+ iov->ve = kzalloc(sizeof(*iov->ve) * iov->totalvfs, GFP_KERNEL);
+ if (!iov->ve)
+ return;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ iov->ve[i].vfn = i;
+ iov->ve[i].iov = iov;
+ }
+
+ rc = kobject_init_and_add(&iov->kobj, &iov_ktype,
+ &dev->dev.kobj, "iov");
+ if (rc)
+ goto failed1;
+
+ for (i = 0; i < ARRAY_SIZE(iov_attr); i++) {
+ rc = sysfs_create_file(&iov->kobj, &iov_attr[i].attr);
+ if (rc)
+ goto failed2;
+ }
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ sprintf(iov->ve[i].name, "%d", i);
+ rc = kobject_init_and_add(&iov->ve[i].kobj, &iov_ktype,
+ &iov->kobj, iov->ve[i].name);
+ if (rc)
+ goto failed3;
+ rc = sysfs_create_file(&iov->ve[i].kobj, &vf_attr.attr);
+ if (rc) {
+ kobject_put(&iov->ve[i].kobj);
+ goto failed3;
+ }
+ }
+
+ return;
+
+failed3:
+ for (j = 0; j < i; j++) {
+ sysfs_remove_file(&iov->ve[j].kobj, &vf_attr.attr);
+ kobject_put(&iov->ve[j].kobj);
+ }
+failed2:
+ for (j = 0; j < i; j++)
+ sysfs_remove_file(&dev->iov->kobj, &iov_attr[j].attr);
+ kobject_put(&iov->kobj);
+failed1:
+ kfree(iov->ve);
+ iov->ve = NULL;
+
+ dev_err(&dev->dev, "can't create sysfs for SR-IOV.\n");
+}
+
+/**
+ * pci_iov_remove_sysfs - remove sysfs of SR-IOV capability
+ * @dev: the PCI device
+ */
+void pci_iov_remove_sysfs(struct pci_dev *dev)
+{
+ int i;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov || !iov->ve)
+ return;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ sysfs_remove_file(&iov->ve[i].kobj, &vf_attr.attr);
+ kobject_put(&iov->ve[i].kobj);
+ }
+
+ for (i = 0; i < ARRAY_SIZE(iov_attr); i++)
+ sysfs_remove_file(&dev->iov->kobj, &iov_attr[i].attr);
+
+ kobject_put(&iov->kobj);
+ kfree(iov->ve);
+}
+
+int pci_iov_resource_align(struct pci_dev *dev, int resno)
+{
+ if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCES_END)
+ return 0;
+
+ BUG_ON(!dev->iov);
+
+ return dev->iov->align;
+}
+
+int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type)
+{
+ if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCES_END)
+ return 0;
+
+ BUG_ON(!dev->iov);
+
+ *type = pci_bar_unknown;
+ return dev->iov->cap + PCI_IOV_BAR_0 +
+ 4 * (resno - PCI_IOV_RESOURCES);
+}
+
+/**
+ * pci_iov_register - register SR-IOV service
+ * @dev: the PCI device
+ * @notify: callback function for SR-IOV events
+ * @entries: sysfs entries used by Physical Function driver
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_register(struct pci_dev *dev, int (*notify)(struct pci_dev *, u32),
+ char **entries)
+{
+ int rc;
+ int n, i, j, k;
+ u8 busnr, devfn;
+ struct iov_attr *attr;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov || !iov->ve)
+ return -ENODEV;
+
+ if (!notify)
+ return -EINVAL;
+
+ vf_rid(dev, iov->totalvfs - 1, &busnr, &devfn);
+ if (busnr > dev->bus->subordinate)
+ return -EIO;
+
+ iov->notify = notify;
+ rc = iov_alloc_bus(dev->bus, busnr);
+ if (rc)
+ return rc;
+
+ for (n = 0; entries && entries[n] && *entries[n]; n++)
+ ;
+ if (!n)
+ return 0;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ rc = -ENOMEM;
+ iov->ve[i].param = kzalloc(PCI_IOV_PARAM_LEN * n, GFP_KERNEL);
+ if (!iov->ve[i].param)
+ goto failed;
+ attr = kzalloc(sizeof(*attr) * n, GFP_KERNEL);
+ if (!attr) {
+ kfree(iov->ve[i].param);
+ goto failed;
+ }
+ iov->ve[i].attr = attr;
+ for (j = 0; j < n; j++) {
+ attr[j].attr.name = entries[j];
+ attr[j].attr.mode = S_IWUSR | S_IRUGO;
+ attr[j].show = vf_show;
+ attr[j].store = vf_store;
+ rc = sysfs_create_file(&iov->ve[i].kobj, &attr[j].attr);
+ if (rc) {
+ while (j--)
+ sysfs_remove_file(&iov->ve[i].kobj,
+ &attr[j].attr);
+ kfree(iov->ve[i].attr);
+ kfree(iov->ve[i].param);
+ goto failed;
+ }
+ }
+ }
+
+ iov->nentries = n;
+ return 0;
+
+failed:
+ for (k = 0; k < i; k++) {
+ for (j = 0; j < n; j++)
+ sysfs_remove_file(&iov->ve[k].kobj,
+ &iov->ve[k].attr[j].attr);
+ kfree(iov->ve[k].attr);
+ kfree(iov->ve[k].param);
+ }
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(pci_iov_register);
+
+/**
+ * pci_iov_unregister - unregister SR-IOV service
+ * @dev: the PCI device
+ */
+void pci_iov_unregister(struct pci_dev *dev)
+{
+ int i, j;
+ struct pci_iov *iov = dev->iov;
+
+ BUG_ON(!iov || !iov->notify);
+
+ if (!iov->nentries)
+ return;
+
+ for (i = 0; i < iov->totalvfs; i++) {
+ for (j = 0; j < iov->nentries; j++)
+ sysfs_remove_file(&iov->ve[i].kobj,
+ &iov->ve[i].attr[j].attr);
+ kfree(iov->ve[i].attr);
+ kfree(iov->ve[i].param);
+ }
+ iov->notify = NULL;
+ iov_release_bus(dev->bus);
+}
+EXPORT_SYMBOL_GPL(pci_iov_unregister);
+
+/**
+ * pci_iov_enable - enable SR-IOV capability
+ * @dev: the PCI device
+ * @numvfs: number of VFs to be available
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_enable(struct pci_dev *dev, int numvfs)
+{
+ int rc;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return -ENODEV;
+
+ if (!iov->notify)
+ return -EINVAL;
+
+ mutex_lock(&iov->mutex);
+ rc = iov_set_numvfs(iov, numvfs);
+ if (rc)
+ goto done;
+ rc = iov_enable(iov);
+done:
+ mutex_unlock(&iov->mutex);
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(pci_iov_enable);
+
+/**
+ * pci_iov_disable - disable SR-IOV capability
+ * @dev: the PCI device
+ *
+ * Should be called upon Physical Function driver removal and on power
+ * state change. All previously allocated Virtual Functions are reclaimed.
+ */
+void pci_iov_disable(struct pci_dev *dev)
+{
+ struct pci_iov *iov = dev->iov;
+
+ BUG_ON(!iov || !iov->notify);
+ mutex_lock(&iov->mutex);
+ iov_disable(iov);
+ mutex_unlock(&iov->mutex);
+}
+EXPORT_SYMBOL_GPL(pci_iov_disable);
+
+/**
+ * pci_iov_read_config - read SR-IOV configurations
+ * @dev: the PCI device
+ * @vfn: Virtual Function Number
+ * @entry: the entry to be read
+ * @buf: the buffer to be filled
+ * @size: size of the buffer
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_read_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf, int size)
+{
+ int i;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return -ENODEV;
+
+ if (!iov->notify || !iov->ve || !iov->nentries)
+ return -EINVAL;
+
+ if (vfn < 0 || vfn >= iov->totalvfs)
+ return -EINVAL;
+
+ for (i = 0; i < iov->nentries; i++)
+ if (!strcmp(iov->ve[vfn].attr[i].attr.name, entry)) {
+ strncpy(buf, iov->ve[vfn].param[i], size);
+ buf[size - 1] = '\0';
+ return 0;
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(pci_iov_read_config);
+
+/**
+ * pci_iov_write_config - write SR-IOV configurations
+ * @dev: the PCI device
+ * @vfn: Virtual Function Number
+ * @entry: the entry to be written
+ * @buf: the buffer containing the configuration to write
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_iov_write_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf)
+{
+ int i;
+ struct pci_iov *iov = dev->iov;
+
+ if (!iov)
+ return -ENODEV;
+
+ if (!iov->notify || !iov->ve || !iov->nentries)
+ return -EINVAL;
+
+ if (vfn < 0 || vfn >= iov->totalvfs)
+ return -EINVAL;
+
+ for (i = 0; i < iov->nentries; i++)
+ if (!strcmp(iov->ve[vfn].attr[i].attr.name, entry)) {
+ strncpy(iov->ve[vfn].param[i], buf, PCI_IOV_PARAM_LEN);
+ iov->ve[vfn].param[i][PCI_IOV_PARAM_LEN - 1] = '\0';
+ return 0;
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(pci_iov_write_config);
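
Reader's note (not part of the patch): vf_rid() and vf_add() above do two pieces of arithmetic that are easy to check in isolation. The Routing ID of VF n is the PF's Routing ID plus First VF Offset plus n times VF Stride, and each VF's memory window is an equal slice of the PF's SR-IOV resource, which pci_iov_init() sized as the per-VF BAR size times TotalVFs. The userspace sketch below restates both under those assumptions; the struct and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct vf_rid_pair { uint8_t busnr; uint8_t devfn; };
struct vf_window { uint64_t start; uint64_t end; };

/* Mirrors vf_rid(): Routing ID of the 0-based VF number vfn. */
static inline struct vf_rid_pair vf_routing_id(uint8_t pf_bus, uint8_t pf_devfn,
					       uint16_t offset, uint16_t stride,
					       unsigned int vfn)
{
	uint16_t rid = (uint16_t)((pf_bus << 8) + pf_devfn + offset + stride * vfn);
	struct vf_rid_pair r = { (uint8_t)(rid >> 8), (uint8_t)(rid & 0xff) };

	return r;
}

/* Mirrors the loop in vf_add(): the slice of one PF SR-IOV resource owned by VF vfn. */
static inline struct vf_window vf_bar_window(uint64_t res_start, uint64_t res_end,
					     uint16_t totalvfs, unsigned int vfn)
{
	uint64_t size;
	struct vf_window w;

	assert(totalvfs > 0 && vfn < totalvfs);
	size = (res_end - res_start + 1) / totalvfs;	/* per-VF size */
	w.start = res_start + size * vfn;
	w.end = w.start + size - 1;
	return w;
}
```

With a PF at bus 1, devfn 0, an offset of 0x80 and a stride of 2, VF 3 lands on bus 1, devfn 0x86, and VF 64 carries over onto bus 2, which is the case iov_alloc_bus() exists to handle.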
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c41b783..9494659 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -764,6 +764,9 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
/* Active State Power Management */
pcie_aspm_create_sysfs_dev_files(dev);
+ /* Single Root I/O Virtualization */
+ pci_iov_create_sysfs(dev);
+
return 0;
}
@@ -849,6 +852,7 @@ static void pci_remove_capabilities_sysfs(struct pci_dev *dev)
}
pcie_aspm_remove_sysfs_dev_files(dev);
+ pci_iov_remove_sysfs(dev);
}
/**
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3575124..4cfdbdb 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1902,7 +1902,12 @@ int pci_resource_alignment(struct pci_dev *dev, int resno)
if (resno <= PCI_ROM_RESOURCE)
return resource_size(res);
- else if (resno <= PCI_BRIDGE_RES_END)
+ else if (resno < PCI_BRIDGE_RESOURCES) {
+ /* may be device specific resource */
+ align = pci_iov_resource_align(dev, resno);
+ if (align)
+ return align;
+ } else if (resno <= PCI_BRIDGE_RES_END)
return res->start;
dev_err(&dev->dev, "alignment: invalid resource #%d\n", resno);
@@ -1919,12 +1924,19 @@ int pci_resource_alignment(struct pci_dev *dev, int resno)
*/
int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
{
+ int reg;
+
if (resno < PCI_ROM_RESOURCE) {
*type = pci_bar_unknown;
return PCI_BASE_ADDRESS_0 + 4 * resno;
} else if (resno == PCI_ROM_RESOURCE) {
*type = pci_bar_mem32;
return dev->rom_base_reg;
+ } else if (resno < PCI_BRIDGE_RESOURCES) {
+ /* may be device specific resource */
+ reg = pci_iov_resource_bar(dev, resno, type);
+ if (reg)
+ return reg;
}
dev_err(&dev->dev, "BAR: invalid resource #%d\n", resno);
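
Reader's note (not part of the patch): after this hunk, pci_resource_bar() maps a resource index to a config-space register in three ranges: the standard BARs, the expansion ROM, and the VF BARs inside the SR-IOV capability. The sketch below is a userspace restatement of that mapping, not the kernel function. It assumes PCI_ROM_RESOURCE is 6 (as the "#6: expansion ROM" comment suggests) so that PCI_IOV_RESOURCES is 7 when CONFIG_PCI_IOV is set; resource_bar_reg() and its parameters are hypothetical names.

```c
#include <assert.h>

#define PCI_BASE_ADDRESS_0	0x10	/* first standard BAR */
#define PCI_ROM_RESOURCE	6	/* assumed from the enum comment */
#define PCI_IOV_RESOURCES	7	/* assumed: follows PCI_ROM_RESOURCE */
#define PCI_IOV_NUM_BAR		6
#define PCI_IOV_RESOURCES_END	(PCI_IOV_RESOURCES + PCI_IOV_NUM_BAR - 1)
#define PCI_IOV_BAR_0		0x24	/* VF BAR0 offset within the capability */

/*
 * Returns the config-space offset backing resource resno, or 0 when the
 * index belongs to none of the three ranges (the real function logs an
 * error in that case).
 */
static inline int resource_bar_reg(int resno, int rom_base_reg, int iov_cap)
{
	assert(resno >= 0);

	if (resno < PCI_ROM_RESOURCE)
		return PCI_BASE_ADDRESS_0 + 4 * resno;
	if (resno == PCI_ROM_RESOURCE)
		return rom_base_reg;
	if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCES_END)
		return iov_cap + PCI_IOV_BAR_0 + 4 * (resno - PCI_IOV_RESOURCES);
	return 0;
}
```

For a capability found at offset 0x160, resource 7 resolves to 0x184 and resource 9 to 0x18c, while bridge window indexes fall through to 0.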
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e2237ad..c66a4bd 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -176,4 +176,59 @@ static inline int pci_ari_enabled(struct pci_dev *dev)
return dev->ari_enabled;
}
+/* Single Root I/O Virtualization */
+#define PCI_IOV_PARAM_LEN 64
+
+struct vf_entry;
+
+struct pci_iov {
+ int cap; /* capability position */
+ int align; /* page size used to map memory space */
+ int is_enabled; /* status of SR-IOV */
+ int nentries; /* number of sysfs entries used by PF driver */
+ u16 totalvfs; /* total VFs associated with the PF */
+ u16 initialvfs; /* initial VFs associated with the PF */
+ u16 numvfs; /* number of VFs available */
+ u16 offset; /* first VF Routing ID offset */
+ u16 stride; /* following VF stride */
+ struct mutex mutex; /* lock for SR-IOV */
+ struct kobject kobj; /* kobject for IOV */
+ struct pci_dev *dev; /* Physical Function */
+ struct vf_entry *ve; /* Virtual Function related */
+ int (*notify)(struct pci_dev *, u32); /* event callback function */
+};
+
+#ifdef CONFIG_PCI_IOV
+extern int pci_iov_init(struct pci_dev *dev);
+extern void pci_iov_release(struct pci_dev *dev);
+void pci_iov_create_sysfs(struct pci_dev *dev);
+void pci_iov_remove_sysfs(struct pci_dev *dev);
+extern int pci_iov_resource_align(struct pci_dev *dev, int resno);
+extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type);
+#else
+static inline int pci_iov_init(struct pci_dev *dev)
+{
+ return -EIO;
+}
+static inline void pci_iov_release(struct pci_dev *dev)
+{
+}
+static inline void pci_iov_create_sysfs(struct pci_dev *dev)
+{
+}
+static inline void pci_iov_remove_sysfs(struct pci_dev *dev)
+{
+}
+static inline int pci_iov_resource_align(struct pci_dev *dev, int resno)
+{
+ return 0;
+}
+static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
+ enum pci_bar_type *type)
+{
+ return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
#endif /* DRIVERS_PCI_H */
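
Reader's note (not part of the patch): the `align` field of struct pci_iov is derived in pci_iov_init() from the Supported Page Sizes register, where bit n set means a page size of 2^(n+12) bytes is supported. The code discards sizes smaller than the host page, keeps the lowest remaining bit, writes it to System Page Size, and stores that value shifted left by 12 as the alignment. A minimal userspace sketch of that selection follows; the helper names are hypothetical and the host page shift is passed as a parameter instead of using PAGE_SHIFT.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the single-bit value to write to System Page Size, or 0 when the
 * device supports no page size at least as large as the host page.
 */
static inline uint32_t iov_pick_pgsz(uint32_t supported, unsigned int page_shift)
{
	unsigned int skip = page_shift > 12 ? page_shift - 12 : 0;

	assert(skip < 32);
	supported &= ~((UINT32_C(1) << skip) - 1);	/* drop sizes below the host page */
	return supported & ~(supported - 1);		/* isolate the lowest set bit */
}

/* Alignment in bytes that corresponds to the chosen System Page Size value. */
static inline uint64_t iov_align_bytes(uint32_t pgsz)
{
	return (uint64_t)pgsz << 12;
}
```

With a Supported Page Sizes value of 0x553, a host using 4 KiB pages ends up with a 4 KiB alignment and a host using 64 KiB pages with a 64 KiB alignment; a device that only advertises 4 KiB on the latter host yields 0, which is where pci_iov_init() returns -EIO.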
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 9c680b8..831d8d0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -845,6 +845,7 @@ static int pci_setup_device(struct pci_dev * dev)
static void pci_release_capabilities(struct pci_dev *dev)
{
pci_vpd_release(dev);
+ pci_iov_release(dev);
}
/**
@@ -1023,6 +1024,9 @@ static void pci_init_capabilities(struct pci_dev *dev)
/* Alternative Routing-ID Forwarding */
pci_enable_ari(dev);
+
+ /* Single Root I/O Virtualization */
+ pci_iov_init(dev);
}
void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 497d639..a7d2fd4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -87,6 +87,12 @@ enum {
/* #6: expansion ROM */
PCI_ROM_RESOURCE,
+ /* device specific resources */
+#ifdef CONFIG_PCI_IOV
+ PCI_IOV_RESOURCES,
+ PCI_IOV_RESOURCES_END = PCI_IOV_RESOURCES + PCI_IOV_NUM_BAR - 1,
+#endif
+
/* address space assigned to buses behind the bridge */
#ifndef PCI_BRIDGE_RES_NUM
#define PCI_BRIDGE_RES_NUM 4
@@ -165,6 +171,7 @@ struct pci_cap_saved_state {
struct pcie_link_state;
struct pci_vpd;
+struct pci_iov;
/*
* The pci_dev structure is used to describe PCI devices.
@@ -253,6 +260,7 @@ struct pci_dev {
struct list_head msi_list;
#endif
struct pci_vpd *vpd;
+ struct pci_iov *iov;
};
extern struct pci_dev *alloc_pci_dev(void);
@@ -1128,5 +1136,54 @@ static inline void pci_mmcfg_early_init(void) { }
static inline void pci_mmcfg_late_init(void) { }
#endif
+/* SR-IOV event masks */
+#define PCI_IOV_VIRTFN_ID 0x0000FFFFU /* Virtual Function Number */
+#define PCI_IOV_NUM_VIRTFN 0x0000FFFFU /* num of Virtual Functions */
+#define PCI_IOV_EVENT_TYPE 0x80000000U /* event type (pre/post) */
+/* SR-IOV events values */
+#define PCI_IOV_ENABLE 0x00010000U /* SR-IOV enable request */
+#define PCI_IOV_DISABLE 0x00020000U /* SR-IOV disable request */
+#define PCI_IOV_RD_CONF 0x00040000U /* read configuration */
+#define PCI_IOV_WR_CONF 0x00080000U /* write configuration */
+#define PCI_IOV_POST_EVENT 0x80000000U /* post event */
+
+#ifdef CONFIG_PCI_IOV
+extern int pci_iov_enable(struct pci_dev *dev, int numvfs);
+extern void pci_iov_disable(struct pci_dev *dev);
+extern int pci_iov_register(struct pci_dev *dev,
+ int (*notify)(struct pci_dev *dev, u32 event), char **entries);
+extern void pci_iov_unregister(struct pci_dev *dev);
+extern int pci_iov_read_config(struct pci_dev *dev, int id,
+ char *entry, char *buf, int size);
+extern int pci_iov_write_config(struct pci_dev *dev, int id,
+ char *entry, char *buf);
+#else
+static inline int pci_iov_enable(struct pci_dev *dev, int numvfs)
+{
+ return -EIO;
+}
+static inline void pci_iov_disable(struct pci_dev *dev)
+{
+}
+static inline int pci_iov_register(struct pci_dev *dev,
+ int (*notify)(struct pci_dev *dev, u32 event), char **entries)
+{
+ return -EIO;
+}
+static inline void pci_iov_unregister(struct pci_dev *dev)
+{
+}
+static inline int pci_iov_read_config(struct pci_dev *dev, int id,
+ char *entry, char *buf, int size)
+{
+ return -EIO;
+}
+static inline int pci_iov_write_config(struct pci_dev *dev, int id,
+ char *entry, char *buf)
+{
+ return -EIO;
+}
+#endif /* CONFIG_PCI_IOV */
+
#endif /* __KERNEL__ */
#endif /* LINUX_PCI_H */
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index eb6686b..1b28b3f 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -363,6 +363,7 @@
#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
+#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
#define PCI_EXP_DEVCAP 4 /* Device capabilities */
@@ -434,6 +435,7 @@
#define PCI_EXT_CAP_ID_DSN 3
#define PCI_EXT_CAP_ID_PWR 4
#define PCI_EXT_CAP_ID_ARI 14
+#define PCI_EXT_CAP_ID_IOV 16
/* Advanced Error Reporting */
#define PCI_ERR_UNCOR_STATUS 4 /* Uncorrectable Error Status */
@@ -551,4 +553,23 @@
#define PCI_ARI_CTRL_ACS 0x0002 /* ACS Function Groups Enable */
#define PCI_ARI_CTRL_FG(x) (((x) >> 4) & 7) /* Function Group */
+/* Single Root I/O Virtualization */
+#define PCI_IOV_CAP 0x04 /* SR-IOV Capabilities */
+#define PCI_IOV_CTRL 0x08 /* SR-IOV Control */
+#define PCI_IOV_CTRL_VFE 0x01 /* VF Enable */
+#define PCI_IOV_CTRL_MSE 0x08 /* VF Memory Space Enable */
+#define PCI_IOV_CTRL_ARI 0x10 /* ARI Capable Hierarchy */
+#define PCI_IOV_STATUS 0x0a /* SR-IOV Status */
+#define PCI_IOV_INITIAL_VF 0x0c /* Initial VFs */
+#define PCI_IOV_TOTAL_VF 0x0e /* Total VFs */
+#define PCI_IOV_NUM_VF 0x10 /* Number of VFs */
+#define PCI_IOV_FUNC_LINK 0x12 /* Function Dependency Link */
+#define PCI_IOV_VF_OFFSET 0x14 /* First VF Offset */
+#define PCI_IOV_VF_STRIDE 0x16 /* Following VF Stride */
+#define PCI_IOV_VF_DID 0x1a /* VF Device ID */
+#define PCI_IOV_SUP_PGSIZE 0x1c /* Supported Page Sizes */
+#define PCI_IOV_SYS_PGSIZE 0x20 /* System Page Size */
+#define PCI_IOV_BAR_0 0x24 /* VF BAR0 */
+#define PCI_IOV_NUM_BAR 6 /* Number of VF BARs */
+
#endif /* LINUX_PCI_REGS_H */
--
1.5.6.4
^ permalink raw reply related [flat|nested] 25+ messages in thread* Re: [PATCH 6/8 v4] PCI: support the SR-IOV capability
2008-10-14 10:59 ` Yu Zhao
@ 2008-10-14 12:30 ` Matthew Wilcox
2008-10-14 12:30 ` Matthew Wilcox
` (2 subsequent siblings)
3 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2008-10-14 12:30 UTC (permalink / raw)
To: Yu Zhao
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, linux-pci@vger.kernel.org, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
greg@kroah.com
On Tue, Oct 14, 2008 at 06:59:28PM +0800, Yu Zhao wrote:
> +++ b/drivers/pci/pci.h
> @@ -176,4 +176,59 @@ static inline int pci_ari_enabled(struct pci_dev *dev)
> +struct pci_iov {
> + int cap; /* capability position */
> + int align; /* page size used to map memory space */
> + int is_enabled; /* status of SR-IOV */
> + int nentries; /* number of sysfs entries used by PF driver */
> + u16 totalvfs; /* total VFs associated with the PF */
> + u16 initialvfs; /* initial VFs associated with the PF */
> + u16 numvfs; /* number of VFs available */
> + u16 offset; /* first VF Routing ID offset */
> + u16 stride; /* following VF stride */
> + struct mutex mutex; /* lock for SR-IOV */
> + struct kobject kobj; /* koject for IOV */
> + struct pci_dev *dev; /* Physical Function */
> + struct vf_entry *ve; /* Virtual Function related */
> + int (*notify)(struct pci_dev *, u32); /* event callback function */
> +};
> +++ b/include/linux/pci.h
> @@ -87,6 +87,12 @@ enum {
> /* #6: expansion ROM */
> PCI_ROM_RESOURCE,
>
> + /* device specific resources */
> +#ifdef CONFIG_PCI_IOV
> + PCI_IOV_RESOURCES,
> + PCI_IOV_RESOURCES_END = PCI_IOV_RESOURCES + PCI_IOV_NUM_BAR - 1,
> +#endif
> +
> /* address space assigned to buses behind the bridge */
> #ifndef PCI_BRIDGE_RES_NUM
> #define PCI_BRIDGE_RES_NUM 4
Why expand the number of resources in struct pci_dev instead of putting
the new resources in struct pci_iov?
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 25+ messages in thread* Re: [PATCH 6/8 v4] PCI: support the SR-IOV capability
2008-10-14 12:30 ` Matthew Wilcox
@ 2008-10-15 2:04 ` Zhao, Yu
2008-10-15 2:04 ` Zhao, Yu
1 sibling, 0 replies; 25+ messages in thread
From: Zhao, Yu @ 2008-10-15 2:04 UTC (permalink / raw)
To: Matthew Wilcox
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, linux-pci@vger.kernel.org, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
greg@kroah.com
Matthew Wilcox wrote:
> On Tue, Oct 14, 2008 at 06:59:28PM +0800, Yu Zhao wrote:
>> +++ b/include/linux/pci.h
>> @@ -87,6 +87,12 @@ enum {
>> /* #6: expansion ROM */
>> PCI_ROM_RESOURCE,
>>
>> + /* device specific resources */
>> +#ifdef CONFIG_PCI_IOV
>> + PCI_IOV_RESOURCES,
>> + PCI_IOV_RESOURCES_END = PCI_IOV_RESOURCES + PCI_IOV_NUM_BAR - 1,
>> +#endif
>> +
>> /* address space assigned to buses behind the bridge */
>> #ifndef PCI_BRIDGE_RES_NUM
>> #define PCI_BRIDGE_RES_NUM 4
>
> Why expand the number of resources in struct pci_dev instead of putting
> the new resources in struct pci_iov?
Yes, they are supposed to be in 'struct pci_iov', and the resources used
to be there in an early version. But later I found that all resource-related
functions, such as pci_assign_resource, pdev_sort_resources,
pbus_size_mem, etc., assume the resources are bundled with 'struct
pci_dev' and address them by index. Encapsulating the resources
in 'pci_iov' would impact all of these functions. I think we can
postpone changing these functions until the PCIM comes out, as long as
IOV is the only user of non-standard resources.
>
> --
> Matthew Wilcox Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours. We can't possibly take such
> a retrograde step."
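
The index-based addressing discussed in this reply can be sketched in plain C. This is an illustrative userspace sketch only, not code from the series: the SKETCH_* names, struct sketch_dev and both helper functions are hypothetical, while PCI_ROM_RESOURCE, PCI_IOV_RESOURCES, PCI_IOV_RESOURCES_END, PCI_IOV_NUM_BAR and PCI_BRIDGE_RES_NUM are taken from the quoted hunks. It assumes CONFIG_PCI_IOV is set and that the standard BARs occupy indexes 0-5, as the "#6: expansion ROM" comment suggests.

```c
#include <assert.h>

#define PCI_IOV_NUM_BAR    6	/* from the patch: number of VF BARs */
#define PCI_BRIDGE_RES_NUM 4	/* from the patch: bridge windows */

/* Resource indexes; SKETCH_* names are assumptions made for this example. */
enum {
	SKETCH_STD_RESOURCES,			/* #0-5: standard BARs */
	SKETCH_STD_RESOURCES_END = 5,
	PCI_ROM_RESOURCE,			/* #6: expansion ROM */
	PCI_IOV_RESOURCES,			/* first VF BAR */
	PCI_IOV_RESOURCES_END = PCI_IOV_RESOURCES + PCI_IOV_NUM_BAR - 1,
	SKETCH_BRIDGE_RESOURCES,		/* first bridge window */
	SKETCH_BRIDGE_RESOURCES_END =
		SKETCH_BRIDGE_RESOURCES + PCI_BRIDGE_RES_NUM - 1,
	SKETCH_NUM_RESOURCES			/* size of the resource array */
};

struct sketch_resource {
	unsigned long start, end;
};

/* One flat array per device, so generic code can walk it by index. */
struct sketch_dev {
	struct sketch_resource resource[SKETCH_NUM_RESOURCES];
};

/* Same range test as the one at the top of pci_iov_resource_align(). */
static int sketch_is_iov_resource(int resno)
{
	return resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCES_END;
}

/* Map a resource index in the IOV range to a VF BAR number (0-5). */
static int sketch_iov_resource_to_vf_bar(int resno)
{
	assert(sketch_is_iov_resource(resno));
	return resno - PCI_IOV_RESOURCES;
}
```

With this layout the VF BARs land at indexes 7-12 and the bridge windows move up to 13-16, which is why code that walks resource[] by index keeps working without knowing about struct pci_iov.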
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 6/8 v4] PCI: support the SR-IOV capability
2008-10-14 10:59 ` Yu Zhao
2008-10-14 12:30 ` Matthew Wilcox
2008-10-14 12:30 ` Matthew Wilcox
@ 2008-10-14 14:37 ` Greg KH
2008-10-14 14:37 ` Greg KH
3 siblings, 0 replies; 25+ messages in thread
From: Greg KH @ 2008-10-14 14:37 UTC (permalink / raw)
To: Yu Zhao
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, linux-pci@vger.kernel.org,
rdreier@cisco.com, linux-kernel@vger.kernel.org,
jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
On Tue, Oct 14, 2008 at 06:59:28PM +0800, Yu Zhao wrote:
> +struct pci_iov {
> + int cap; /* capability position */
> + int align; /* page size used to map memory space */
> + int is_enabled; /* status of SR-IOV */
> + int nentries; /* number of sysfs entries used by PF driver */
> + u16 totalvfs; /* total VFs associated with the PF */
> + u16 initialvfs; /* initial VFs associated with the PF */
> + u16 numvfs; /* number of VFs available */
> + u16 offset; /* first VF Routing ID offset */
> + u16 stride; /* following VF stride */
> + struct mutex mutex; /* lock for SR-IOV */
> + struct kobject kobj; /* koject for IOV */
Why isn't this a real struct device?
That way you get all of the proper userspace notification and the like,
with kobjects, you do not.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 7/8 v4] PCI: reserve bus range for the SR-IOV device
2008-10-14 10:34 [PATCH 0/8 v4] PCI: Linux kernel SR-IOV support Yu Zhao
` (11 preceding siblings ...)
2008-10-14 10:59 ` Yu Zhao
@ 2008-10-14 11:00 ` Yu Zhao
2008-10-14 11:00 ` Yu Zhao
` (2 subsequent siblings)
15 siblings, 0 replies; 25+ messages in thread
From: Yu Zhao @ 2008-10-14 11:00 UTC (permalink / raw)
To: linux-pci@vger.kernel.org
Cc: jbarnes@virtuousgeek.org, randy.dunlap@oracle.com,
grundler@parisc-linux.org, achiang@hp.com, matthew@wil.cx,
rdreier@cisco.com, greg@kroah.com, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Reserve bus range for SR-IOV at device scanning stage.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
drivers/pci/iov.c | 24 ++++++++++++++++++++++++
drivers/pci/pci.h | 5 +++++
drivers/pci/probe.c | 3 +++
3 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 3cf9709..7685c6b 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -603,6 +603,30 @@ void pci_iov_remove_sysfs(struct pci_dev *dev)
kfree(iov->ve);
}
+/**
+ * pci_iov_bus_range - find bus range used by SR-IOV capability
+ * @bus: the PCI bus
+ *
+ * Returns the maximum number of buses (excluding the current one) used
+ * by Virtual Functions.
+ */
+int pci_iov_bus_range(struct pci_bus *bus)
+{
+ int max = 0;
+ u8 busnr, devfn;
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ if (!dev->iov)
+ continue;
+ vf_rid(dev, dev->iov->totalvfs - 1, &busnr, &devfn);
+ if (busnr > max)
+ max = busnr;
+ }
+
+ return max ? max - bus->number : 0;
+}
+
int pci_iov_resource_align(struct pci_dev *dev, int resno)
{
if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCES_END)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c66a4bd..71149b5 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -206,6 +206,7 @@ void pci_iov_remove_sysfs(struct pci_dev *dev);
extern int pci_iov_resource_align(struct pci_dev *dev, int resno);
extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
enum pci_bar_type *type);
+extern int pci_iov_bus_range(struct pci_bus *bus);
#else
static inline int pci_iov_init(struct pci_dev *dev)
{
@@ -229,6 +230,10 @@ static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno,
{
return 0;
}
+static inline int pci_iov_bus_range(struct pci_bus *bus)
+{
+ return 0;
+}
#endif /* CONFIG_PCI_IOV */
#endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 831d8d0..b11f4b8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1129,6 +1129,9 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
for (devfn = 0; devfn < 0x100; devfn += 8)
pci_scan_slot(bus, devfn);
+ /* Reserve buses for SR-IOV capability. */
+ max += pci_iov_bus_range(bus);
+
/*
* After performing arch-dependent fixup of the bus, look behind
* all PCI-to-PCI bridges on this bus.
--
1.5.6.4
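
The bus reservation arithmetic in this patch can be sketched in userspace C. The body of vf_rid() is not shown in this part of the thread, so the sketch assumes it follows the SR-IOV specification's rule that the Routing ID of VF n is the PF Routing ID plus First VF Offset plus n times VF Stride. Both sketch_* functions and their parameters are hypothetical names for illustration, and wrap-around past bus 255 is not handled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Routing ID layout: bus number in bits 15:8, devfn in bits 7:0.
 * Assumed formula (SR-IOV spec): VF RID = PF RID + offset + stride * vfn.
 */
static uint16_t sketch_vf_rid(uint8_t pf_bus, uint8_t pf_devfn,
			      uint16_t offset, uint16_t stride, uint16_t vfn)
{
	uint16_t pf_rid = (uint16_t)((pf_bus << 8) | pf_devfn);

	return (uint16_t)(pf_rid + offset + stride * vfn);
}

/*
 * Number of buses beyond the PF's own bus that the last VF reaches.
 * This mirrors what pci_iov_bus_range() computes for a single PF.
 */
static int sketch_extra_buses(uint8_t pf_bus, uint8_t pf_devfn,
			      uint16_t offset, uint16_t stride,
			      uint16_t total_vfs)
{
	uint16_t last_rid;

	assert(total_vfs > 0);
	last_rid = sketch_vf_rid(pf_bus, pf_devfn, offset, stride,
				 (uint16_t)(total_vfs - 1));

	return (last_rid >> 8) - pf_bus;
}
```

For example, a PF at 01:00.0 with offset 0x80 and stride 2 keeps 64 VFs on bus 1 (the last Routing ID is 0x01fe), while 128 VFs spill onto bus 2 (the last Routing ID is 0x027e), so one extra bus would have to be reserved.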
^ permalink raw reply related [flat|nested] 25+ messages in thread* [PATCH 8/8 v4] PCI: document the changes
2008-10-14 10:34 [PATCH 0/8 v4] PCI: Linux kernel SR-IOV support Yu Zhao
` (13 preceding siblings ...)
2008-10-14 11:00 ` Yu Zhao
@ 2008-10-14 11:01 ` Yu Zhao
2008-10-14 11:01 ` Yu Zhao
15 siblings, 0 replies; 25+ messages in thread
From: Yu Zhao @ 2008-10-14 11:01 UTC (permalink / raw)
To: linux-pci@vger.kernel.org
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, greg@kroah.com, rdreier@cisco.com,
linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
Create a how-to for SR-IOV users and device driver developers.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
Documentation/DocBook/kernel-api.tmpl | 1 +
Documentation/PCI/pci-iov-howto.txt | 222 +++++++++++++++++++++++++++++++++
2 files changed, 223 insertions(+), 0 deletions(-)
create mode 100644 Documentation/PCI/pci-iov-howto.txt
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index b7b1482..5cb6491 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -251,6 +251,7 @@ X!Edrivers/pci/hotplug.c
-->
!Edrivers/pci/probe.c
!Edrivers/pci/rom.c
+!Edrivers/pci/iov.c
</sect1>
<sect1><title>PCI Hotplug Support Library</title>
!Edrivers/pci/hotplug/pci_hotplug_core.c
diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.txt
new file mode 100644
index 0000000..15d846d
--- /dev/null
+++ b/Documentation/PCI/pci-iov-howto.txt
@@ -0,0 +1,222 @@
+ PCI Express Single Root I/O Virtualization HOWTO
+ Copyright (C) 2008 Intel Corporation
+
+
+1. Overview
+
+1.1 What is SR-IOV
+
+Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
+capability which makes one physical device appear as multiple virtual
+devices. The physical device is referred to as the Physical Function,
+while the virtual devices are referred to as Virtual Functions. Allocation
+of Virtual Functions can be dynamically controlled by the Physical Function
+via registers encapsulated in the capability. By default, this feature
+is not enabled and the Physical Function behaves as a traditional PCIe
+device. Once it is turned on, each Virtual Function's PCI configuration
+space can be accessed using its own Bus, Device and Function Number
+(Routing ID). Each Virtual Function also has a PCI Memory Space, which is
+used to map its register set. The Virtual Function device driver operates
+on this register set, so the Virtual Function is functional and appears
+as a real PCI device.
+
+2. User Guide
+
+2.1 How can I manage SR-IOV
+
+If a device supports SR-IOV, there should be some entries under the
+Physical Function's PCI device directory. These entries are in the directories:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/
+ (XXXX:BB:DD.F is domain:bus:dev.fun)
+and
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/N
+ (N is the VF number, from 0 to initialvfs-1)
+
+To enable or disable SR-IOV:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/enable
+ (writing 1/0 enables/disables the VFs; a state change
+ notifies the PF driver)
+
+To change the number of Virtual Functions:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/numvfs
+ (writing a positive integer to this file changes NumVFs)
+
+The total and initial numbers of VFs can be read from:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/totalvfs
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/initialvfs
+
+The identifier of a VF that belongs to this PF can be read from:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/N/rid
+
+2.2 How can I use Virtual Functions
+
+Virtual Functions are treated as hot-plugged PCI devices in the kernel,
+so they should be able to work in the same way as real PCI devices.
+NOTE: a Virtual Function device driver must be loaded for the VF to work.
+
+
+3. Developer Guide
+
+3.1 SR-IOV APIs
+
+To register SR-IOV service, the Physical Function device driver needs to call:
+ int pci_iov_register(struct pci_dev *dev,
+ int (*notify)(struct pci_dev *, u32), char **entries)
+ 'notify' is a callback function that the SR-IOV code invokes when
+ events related to VFs happen (e.g. the user reads/writes the sysfs
+ entries). The first argument is the PF itself; the second argument is
+ the event type and value. For now, the following event types are supported:
+ - PCI_IOV_ENABLE: SR-IOV enable request
+ - PCI_IOV_DISABLE: SR-IOV disable request
+ - PCI_IOV_RD_CONF: read configuration
+ - PCI_IOV_WR_CONF: write configuration
+ - PCI_IOV_POST_EVENT: post event
+ The event values can be extracted using the following masks:
+ - PCI_IOV_VIRTFN_ID: Virtual Function Number
+ - PCI_IOV_NUM_VIRTFN: num of Virtual Functions
+ - PCI_IOV_EVENT_TYPE: event type (pre/post)
+ 'entries' is a NULL-terminated list of sysfs entry names that will be
+ created by the SR-IOV code.
+
+Note: 'entries' may be NULL if the PF driver does not want to create new
+entries under /sys/bus/pci/devices/XXXX:BB:DD.F/iov/N/.
+
+To unregister SR-IOV service, the Physical Function device driver needs to call:
+ void pci_iov_unregister(struct pci_dev *dev)
+
+To enable SR-IOV, the Physical Function device driver needs to call:
+ int pci_iov_enable(struct pci_dev *dev, int numvfs)
+ 'numvfs' is the number of VFs that the PF wants to enable.
+
+To disable SR-IOV, the Physical Function device driver needs to call:
+ void pci_iov_disable(struct pci_dev *dev)
+
+Note: the two functions above sleep for 1 second, waiting for hardware
+transactions to complete, as required by the SR-IOV specification.
+
+To read or write VF configuration:
+ - int pci_iov_read_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf, int size);
+ - int pci_iov_write_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf);
+3.2 Usage example
+
+The following piece of code illustrates the usage of the APIs above.
+
+static char *entries[] = { "foo", "bar", NULL };
+
+static int callback(struct pci_dev *dev, u32 event)
+{
+ int err = 0;
+ int vfn;
+ int numvfs;
+
+ if (event & PCI_IOV_ENABLE) {
+ /*
+ * request to enable SR-IOV, NumVFs is available.
+ * Note: if the PF wants to support PM, it has to
+ * check the device power state here to see if
+ * the request is allowed or not.
+ */
+
+ numvfs = event & PCI_IOV_NUM_VIRTFN;
+
+ } else if (event & PCI_IOV_DISABLE) {
+ /*
+ * request to disable SR-IOV.
+ */
+ ...
+
+ } else if (event & PCI_IOV_RD_CONF) {
+ /*
+ * request to read VF configuration, Virtual
+ * Function Number is available.
+ */
+
+ vfn = event & PCI_IOV_VIRTFN_ID;
+
+ /* pass the config to SR-IOV code so user can read it */
+ err = pci_iov_write_config(dev, vfn, entry, buf);
+
+ } else if (event & PCI_IOV_WR_CONF) {
+ /*
+ * request to write VF configuration, Virtual
+ * Function Number is available.
+ */
+
+ vfn = event & PCI_IOV_VIRTFN_ID;
+
+ /* read the config that has been written by user */
+ err = pci_iov_read_config(dev, vfn, entry, buf, size);
+
+ } else
+ return -EINVAL;
+
+ return err;
+}
+
+static int __devinit dev_probe(struct pci_dev *dev,
+ const struct pci_device_id *id)
+{
+ int err;
+
+ err = pci_iov_register(dev, callback, entries);
+ ...
+
+ err = pci_iov_enable(dev, nr_virtfn);
+
+ ...
+
+ return err;
+}
+
+static void __devexit dev_remove(struct pci_dev *dev)
+{
+ ...
+
+ pci_iov_disable(dev);
+
+ ...
+
+ pci_iov_unregister(dev);
+
+ ...
+}
+
+#ifdef CONFIG_PM
+/*
+ * If Physical Function supports the power management, then the
+ * SR-IOV needs to be disabled before the adapter goes to sleep,
+ * because Virtual Functions will not work when the adapter is in
+ * the power-saving mode.
+ * The SR-IOV can be enabled again after the adapter wakes up.
+ */
+static int dev_suspend(struct pci_dev *dev, pm_message_t state)
+{
+ ...
+
+ pci_iov_disable(dev);
+
+ ...
+}
+
+static int dev_resume(struct pci_dev *dev)
+{
+ ...
+
+ pci_iov_enable(dev, numvfs);
+
+ ...
+}
+#endif
+
+static struct pci_driver dev_driver = {
+ .name = "SR-IOV Physical Function driver",
+ .id_table = dev_id_table,
+ .probe = dev_probe,
+ .remove = __devexit_p(dev_remove),
+#ifdef CONFIG_PM
+ .suspend = dev_suspend,
+ .resume = dev_resume,
+#endif
+};
--
1.5.6.4
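
The event word passed to the 'notify' callback in the how-to above can be unpacked as in the following userspace sketch. The PCI_IOV_* constants are copied from the include/linux/pci.h hunk in patch 6/8. struct sketch_event and sketch_decode() are hypothetical names for illustration, and the sketch assumes, based on the how-to's callback example, that the event type bits are ORed together with a 16-bit value in the low half of the word.

```c
#include <assert.h>
#include <stdint.h>

/* SR-IOV event masks and values, as defined in patch 6/8 */
#define PCI_IOV_VIRTFN_ID	0x0000FFFFU	/* Virtual Function Number */
#define PCI_IOV_NUM_VIRTFN	0x0000FFFFU	/* num of Virtual Functions */
#define PCI_IOV_EVENT_TYPE	0x80000000U	/* event type (pre/post) */
#define PCI_IOV_ENABLE		0x00010000U	/* SR-IOV enable request */
#define PCI_IOV_DISABLE		0x00020000U	/* SR-IOV disable request */
#define PCI_IOV_RD_CONF		0x00040000U	/* read configuration */
#define PCI_IOV_WR_CONF		0x00080000U	/* write configuration */
#define PCI_IOV_POST_EVENT	0x80000000U	/* post event */

/* Hypothetical decoded form of an event word. */
struct sketch_event {
	uint32_t request;	/* one of ENABLE/DISABLE/RD_CONF/WR_CONF */
	int is_post;		/* non-zero for a post event */
	uint16_t value;		/* NumVFs or VF number, depending on request */
};

static struct sketch_event sketch_decode(uint32_t event)
{
	struct sketch_event ev;

	ev.request = event & (PCI_IOV_ENABLE | PCI_IOV_DISABLE |
			      PCI_IOV_RD_CONF | PCI_IOV_WR_CONF);
	ev.is_post = (event & PCI_IOV_EVENT_TYPE) == PCI_IOV_POST_EVENT;
	/* PCI_IOV_VIRTFN_ID and PCI_IOV_NUM_VIRTFN are the same mask. */
	ev.value = (uint16_t)(event & PCI_IOV_NUM_VIRTFN);

	return ev;
}
```

Under these assumptions, an enable request for four VFs would arrive as PCI_IOV_ENABLE | 4, and a post-event notification for a read of VF 2's configuration as PCI_IOV_POST_EVENT | PCI_IOV_RD_CONF | 2.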
^ permalink raw reply related [flat|nested] 25+ messages in thread* [PATCH 8/8 v4] PCI: document the changes
2008-10-14 10:34 [PATCH 0/8 v4] PCI: Linux kernel SR-IOV support Yu Zhao
` (14 preceding siblings ...)
2008-10-14 11:01 ` [PATCH 8/8 v4] PCI: document the changes Yu Zhao
@ 2008-10-14 11:01 ` Yu Zhao
2008-10-17 22:54 ` Pavel Machek
2008-10-17 22:54 ` Pavel Machek
15 siblings, 2 replies; 25+ messages in thread
From: Yu Zhao @ 2008-10-14 11:01 UTC (permalink / raw)
To: linux-pci@vger.kernel.org
Cc: jbarnes@virtuousgeek.org, randy.dunlap@oracle.com,
grundler@parisc-linux.org, achiang@hp.com, matthew@wil.cx,
rdreier@cisco.com, greg@kroah.com, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Create a how-to for SR-IOV users and device driver developers.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
---
Documentation/DocBook/kernel-api.tmpl | 1 +
Documentation/PCI/pci-iov-howto.txt | 222 +++++++++++++++++++++++++++++++++
2 files changed, 223 insertions(+), 0 deletions(-)
create mode 100644 Documentation/PCI/pci-iov-howto.txt
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index b7b1482..5cb6491 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -251,6 +251,7 @@ X!Edrivers/pci/hotplug.c
-->
!Edrivers/pci/probe.c
!Edrivers/pci/rom.c
+!Edrivers/pci/iov.c
</sect1>
<sect1><title>PCI Hotplug Support Library</title>
!Edrivers/pci/hotplug/pci_hotplug_core.c
diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.txt
new file mode 100644
index 0000000..15d846d
--- /dev/null
+++ b/Documentation/PCI/pci-iov-howto.txt
@@ -0,0 +1,222 @@
+ PCI Express Single Root I/O Virtualization HOWTO
+ Copyright (C) 2008 Intel Corporation
+
+
+1. Overview
+
+1.1 What is SR-IOV
+
+Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
+capability which makes one physical device appear as multiple virtual
+devices. The physical device is referred to as the Physical Function,
+while the virtual devices are referred to as Virtual Functions. Allocation
+of Virtual Functions can be dynamically controlled by the Physical Function
+via registers encapsulated in the capability. By default, this feature
+is not enabled and the Physical Function behaves as a traditional PCIe
+device. Once it is turned on, each Virtual Function's PCI configuration
+space can be accessed using its own Bus, Device and Function Number
+(Routing ID). Each Virtual Function also has a PCI Memory Space, which is
+used to map its register set. The Virtual Function device driver operates
+on this register set, so the Virtual Function is functional and appears
+as a real PCI device.
+
+2. User Guide
+
+2.1 How can I manage SR-IOV
+
+If a device supports SR-IOV, entries for it appear under the Physical
+Function's PCI device directory. These entries are in the directories:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/
+ (XXXX:BB:DD.F is domain:bus:dev.fun)
+and
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/N
+ (N is the VF number, from 0 to initialvfs-1)
+
+To enable or disable SR-IOV:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/enable
+ (writing 1 or 0 enables or disables the VFs; the PF driver is
+ notified of the state change)
+
+To change the number of Virtual Functions:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/numvfs
+ (writing a positive integer to this file changes NumVFs)
+
+The total and initial numbers of VFs can be read from:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/totalvfs
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/initialvfs
+
+The identifier of a VF that belongs to this PF can be read from:
+ - /sys/bus/pci/devices/XXXX:BB:DD.F/iov/N/rid
+
+2.2 How can I use Virtual Functions
+
+Virtual Functions are treated as hot-plugged PCI devices in the kernel,
+so they should be able to work in the same way as real PCI devices.
+NOTE: A Virtual Function device driver must be loaded for a VF to work.
+
+
+3. Developer Guide
+
+3.1 SR-IOV APIs
+
+To register the SR-IOV service, the PF device driver needs to call:
+ int pci_iov_register(struct pci_dev *dev,
+ int (*notify)(struct pci_dev *, u32), char **entries)
+ 'notify' is a callback function that the SR-IOV code invokes when
+ events related to VFs happen (e.g. the user reads or writes the sysfs
+ entries). The first argument is the PF itself; the second encodes the
+ event type and value. For now, the following event types are supported:
+ - PCI_IOV_ENABLE: SR-IOV enable request
+ - PCI_IOV_DISABLE: SR-IOV disable request
+ - PCI_IOV_RD_CONF: read configuration
+ - PCI_IOV_WR_CONF: write configuration
+ - PCI_IOV_POST_EVENT: post event
+ Event values can be extracted using the following masks:
+ - PCI_IOV_VIRTFN_ID: Virtual Function Number
+ - PCI_IOV_NUM_VIRTFN: num of Virtual Functions
+ - PCI_IOV_EVENT_TYPE: event type (pre/post)
+ 'entries' is a list of sysfs entry names that will be
+ created by the SR-IOV code.
+
+Note: 'entries' may be NULL if the PF driver does not need to create new
+entries under /sys/bus/pci/devices/XXXX:BB:DD.F/iov/N/.
+
+To unregister the SR-IOV service, the PF device driver needs to call:
+ void pci_iov_unregister(struct pci_dev *dev)
+
+To enable SR-IOV, the PF device driver needs to call:
+ int pci_iov_enable(struct pci_dev *dev, int numvfs)
+ 'numvfs' is the number of VFs that the PF wants to enable.
+
+To disable SR-IOV, the PF device driver needs to call:
+ void pci_iov_disable(struct pci_dev *dev)
+
+Note: the two functions above sleep for 1 second, waiting for hardware
+transactions to complete as required by the SR-IOV specification.
+
+To read or write a VF's configuration:
+ - int pci_iov_read_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf, int size);
+ - int pci_iov_write_config(struct pci_dev *dev, int vfn,
+ char *entry, char *buf);
+3.2 Usage example
+
+The following code illustrates the usage of the APIs above.
+
+static char *entries[] = { "foo", "bar", NULL };
+
+static int callback(struct pci_dev *dev, u32 event)
+{
+ int err = 0;
+ int vfn;
+ int numvfs;
+
+ if (event & PCI_IOV_ENABLE) {
+ /*
+ * request to enable SR-IOV, NumVFs is available.
+ * Note: if the PF wants to support PM, it has to
+ * check the device power state here to see if
+ * the request is allowed or not.
+ */
+
+ numvfs = event & PCI_IOV_NUM_VIRTFN;
+
+ } else if (event & PCI_IOV_DISABLE) {
+ /*
+ * request to disable SR-IOV.
+ */
+ ...
+
+ } else if (event & PCI_IOV_RD_CONF) {
+ /*
+ * request to read VF configuration, Virtual
+ * Function Number is available.
+ */
+
+ vfn = event & PCI_IOV_VIRTFN_ID;
+
+ /* pass the config to SR-IOV code so user can read it */
+ err = pci_iov_write_config(dev, vfn, entry, buf);
+
+ } else if (event & PCI_IOV_WR_CONF) {
+ /*
+ * request to write VF configuration, Virtual
+ * Function Number is available.
+ */
+
+ vfn = event & PCI_IOV_VIRTFN_ID;
+
+ /* read the config that has been written by user */
+ err = pci_iov_read_config(dev, vfn, entry, buf, size);
+
+ } else
+ return -EINVAL;
+
+ return err;
+}
+
+static int __devinit dev_probe(struct pci_dev *dev,
+ const struct pci_device_id *id)
+{
+ int err;
+
+ err = pci_iov_register(dev, callback, entries);
+ ...
+
+ err = pci_iov_enable(dev, nr_virtfn);
+
+ ...
+
+ return err;
+}
+
+static void __devexit dev_remove(struct pci_dev *dev)
+{
+ ...
+
+ pci_iov_disable(dev);
+
+ ...
+
+ pci_iov_unregister(dev);
+
+ ...
+}
+
+#ifdef CONFIG_PM
+/*
+ * If Physical Function supports the power management, then the
+ * SR-IOV needs to be disabled before the adapter goes to sleep,
+ * because Virtual Functions will not work when the adapter is in
+ * the power-saving mode.
+ * The SR-IOV can be enabled again after the adapter wakes up.
+ */
+static int dev_suspend(struct pci_dev *dev, pm_message_t state)
+{
+ ...
+
+ pci_iov_disable(dev);
+
+ ...
+}
+
+static int dev_resume(struct pci_dev *dev)
+{
+ ...
+
+ pci_iov_enable(dev, numvfs);
+
+ ...
+}
+#endif
+
+static struct pci_driver dev_driver = {
+ .name = "SR-IOV Physical Function driver",
+ .id_table = dev_id_table,
+ .probe = dev_probe,
+ .remove = __devexit_p(dev_remove),
+#ifdef CONFIG_PM
+ .suspend = dev_suspend,
+ .resume = dev_resume,
+#endif
+};
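Not part of the patch: the HOWTO says that a single u32 passed to the callback carries both an event type and a value. A minimal user-space sketch of that idea follows. The DEMO_* names and their bit layout are assumptions made up for illustration; the real PCI_IOV_* definitions live elsewhere in this series and are not shown in this message.

```c
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only: event type flags in the
 * upper 16 bits, value (NumVFs or VF number) in the lower 16 bits.
 * These are NOT the kernel's PCI_IOV_* definitions.
 */
#define DEMO_IOV_ENABLE      0x00010000u
#define DEMO_IOV_DISABLE     0x00020000u
#define DEMO_IOV_RD_CONF     0x00040000u
#define DEMO_IOV_WR_CONF     0x00080000u
#define DEMO_IOV_VALUE_MASK  0x0000ffffu

enum demo_type { DEMO_ENABLE, DEMO_DISABLE, DEMO_RD_CONF, DEMO_WR_CONF };

struct demo_event {
	enum demo_type type;
	unsigned int value;	/* NumVFs for enable, VF number for rd/wr */
};

/* Split an event word into type and value; returns 0, or -1 if unknown. */
static int demo_decode(uint32_t event, struct demo_event *out)
{
	out->value = event & DEMO_IOV_VALUE_MASK;

	if (event & DEMO_IOV_ENABLE)
		out->type = DEMO_ENABLE;
	else if (event & DEMO_IOV_DISABLE)
		out->type = DEMO_DISABLE;
	else if (event & DEMO_IOV_RD_CONF)
		out->type = DEMO_RD_CONF;
	else if (event & DEMO_IOV_WR_CONF)
		out->type = DEMO_WR_CONF;
	else
		return -1;

	return 0;
}
```

Under this assumed layout, demo_decode(DEMO_IOV_ENABLE | 4, &ev) yields an enable request for 4 VFs, and an event word with no known type flag is rejected, which mirrors the -EINVAL branch of the callback in section 3.2.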
--
1.5.6.4
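Not part of the patch: the sysfs entries listed in section 2.1 of the HOWTO can also be driven from a small user-space program. The sketch below only formats an entry's path and writes a number to it. demo_iov_path() and demo_write_uint() are hypothetical helpers, and the iov/ entries exist only on a kernel with this series applied.

```c
#include <stdio.h>

/*
 * Format the path of an SR-IOV sysfs entry for a PF, e.g.
 * /sys/bus/pci/devices/0000:03:00.1/iov/numvfs.
 * Returns 0 on success, -1 if the buffer is too small.
 * Hypothetical helper, not a kernel or libc API.
 */
static int demo_iov_path(char *buf, size_t len, unsigned int domain,
			 unsigned int bus, unsigned int dev,
			 unsigned int fn, const char *entry)
{
	int n = snprintf(buf, len,
			 "/sys/bus/pci/devices/%04x:%02x:%02x.%x/iov/%s",
			 domain, bus, dev, fn, entry);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a decimal value to a file; returns 0 on success, -1 on error. */
static int demo_write_uint(const char *path, unsigned int value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%u\n", value) < 0) {
		fclose(f);
		return -1;
	}
	/* Buffered write errors are reported when the stream is closed. */
	return fclose(f) == 0 ? 0 : -1;
}
```

A tool would format the "numvfs" or "enable" path for the PF and then pass it to demo_write_uint(); on a kernel without these entries the write simply fails with -1.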
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 8/8 v4] PCI: document the changes
2008-10-14 11:01 ` Yu Zhao
@ 2008-10-17 22:54 ` Pavel Machek
2008-10-17 22:54 ` Pavel Machek
1 sibling, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2008-10-17 22:54 UTC (permalink / raw)
To: Yu Zhao
Cc: randy.dunlap@oracle.com, grundler@parisc-linux.org,
achiang@hp.com, matthew@wil.cx, linux-pci@vger.kernel.org,
rdreier@cisco.com, linux-kernel@vger.kernel.org,
jbarnes@virtuousgeek.org,
virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
greg@kroah.com
Hi!
> Create how-to for SR-IOV user and device driver developer.
>
> Signed-off-by: Yu Zhao <yu.zhao@intel.com>
> +1.1 What is SR-IOV
> +
> +Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
> +capability which makes one physical device appear as multiple virtual
> +devices. The physical device is referred to as Physical Function while
> +the virtual devices are referred to as Virtual Functions. Allocation
> +of Virtual Functions can be dynamically controlled by Physical Function
> +via registers encapsulated in the capability. By default, this feature
> +is not enabled and the Physical Function behaves as traditional PCIe
> +device. Once it's turned on, each Virtual Function's PCI configuration
> +space can be accessed by its own Bus, Device and Function Number (Routing
> +ID). And each Virtual Function also has PCI Memory Space, which is
> used
OK, why is this optional? If Intel cares about virtualization, it
should enable this by default. I don't see why this should be
configurable.
> +#ifdef CONFIG_PM
> +/*
> + * If Physical Function supports the power management, then the
> + * SR-IOV needs to be disabled before the adapter goes to sleep,
> + * because Virtual Functions will not work when the adapter is in
> + * the power-saving mode.
> + * The SR-IOV can be enabled again after the adapter wakes up.
> + */
How beautiful :-(.
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 25+ messages in thread