* [PATCH 0/6] PCI: double remove fixing and allocate 64bit mmio pref
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
The first 3 are for Gu Zheng <guz.fnst@cn.fujitsu.com>, to help with double PCI
device removal via sysfs.
The second 3 are about 64-bit MMIO allocation and could help Guo Chao <yan@linux.vnet.ibm.com> with MMIO allocation on powerpc.
They try to assign 64-bit resources above 4G first.
I extracted them from big piles of PCI-related...
They can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-3.14
The series is based on current pci/for-linus, and should apply without
problems on 3.13-rc1 after Linus pulls pci/for-linus.
Thanks
Yinghai
Yinghai Lu (6):
PCI: move back pci_proc_attach_devices calling
PCI: move resources and bus_list releasing to pci_release_dev
PCI: Destroy pci dev only once
PCI: pcibus address to resource converting take bus directly
PCI: Add pcibios_bus_addr_to_res()
PCI: Try to allocate mem64 above 4G at first
arch/x86/include/asm/pci.h | 1 -
drivers/pci/bus.c | 43 +++++++++++++++++++++++++++++++++--------
drivers/pci/host-bridge.c | 48 +++++++++++++++++++++++++++++++++-------------
drivers/pci/pci.h | 2 ++
drivers/pci/probe.c | 23 ++++++++++++++++++----
drivers/pci/remove.c | 24 ++++-------------------
include/linux/pci.h | 10 ++++++----
7 files changed, 101 insertions(+), 50 deletions(-)
--
1.8.1.4
* [PATCH 1/6] PCI: move back pci_proc_attach_devices calling
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
We detach the proc entry when the device is stopped (pci_stop_dev),
so we should attach it during pci_bus_add_device.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/bus.c | 1 +
drivers/pci/probe.c | 2 --
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index fc1b740..1ffd95b 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -175,6 +175,7 @@ int pci_bus_add_device(struct pci_dev *dev)
* are not assigned yet for some devices.
*/
pci_fixup_device(pci_fixup_final, dev);
+ pci_proc_attach_device(dev);
pci_create_sysfs_dev_files(dev);
dev->match_driver = true;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 38e403d..173a9cf 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1381,8 +1381,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
dev->match_driver = false;
ret = device_add(&dev->dev);
WARN_ON(ret < 0);
-
- pci_proc_attach_device(dev);
}
struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
--
1.8.1.4
* [PATCH 2/6] PCI: move resources and bus_list releasing to pci_release_dev
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
We should not release resources in pci_destroy_dev; that is too early,
as other users could still hold references.
Release them, and remove the device from the bus device list, last,
in pci_release_dev, instead.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/probe.c | 21 +++++++++++++++++++--
drivers/pci/remove.c | 19 -------------------
2 files changed, 19 insertions(+), 21 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 173a9cf..12ec56c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1154,6 +1154,18 @@ static void pci_release_capabilities(struct pci_dev *dev)
pci_free_cap_save_buffers(dev);
}
+static void pci_free_resources(struct pci_dev *dev)
+{
+ int i;
+
+ pci_cleanup_rom(dev);
+ for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+ struct resource *res = dev->resource + i;
+ if (res->parent)
+ release_resource(res);
+ }
+}
+
/**
* pci_release_dev - free a pci device structure when all users of it are finished.
* @dev: device that's been disconnected
@@ -1163,9 +1175,14 @@ static void pci_release_capabilities(struct pci_dev *dev)
*/
static void pci_release_dev(struct device *dev)
{
- struct pci_dev *pci_dev;
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+
+ down_write(&pci_bus_sem);
+ list_del(&pci_dev->bus_list);
+ up_write(&pci_bus_sem);
+
+ pci_free_resources(pci_dev);
- pci_dev = to_pci_dev(dev);
pci_release_capabilities(pci_dev);
pci_release_of_node(pci_dev);
pcibios_release_device(pci_dev);
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 1576851..0d2c36f 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -3,20 +3,6 @@
#include <linux/pci-aspm.h>
#include "pci.h"
-static void pci_free_resources(struct pci_dev *dev)
-{
- int i;
-
- msi_remove_pci_irq_vectors(dev);
-
- pci_cleanup_rom(dev);
- for (i = 0; i < PCI_NUM_RESOURCES; i++) {
- struct resource *res = dev->resource + i;
- if (res->parent)
- release_resource(res);
- }
-}
-
static void pci_stop_dev(struct pci_dev *dev)
{
pci_pme_active(dev, false);
@@ -34,11 +20,6 @@ static void pci_stop_dev(struct pci_dev *dev)
static void pci_destroy_dev(struct pci_dev *dev)
{
- down_write(&pci_bus_sem);
- list_del(&dev->bus_list);
- up_write(&pci_bus_sem);
-
- pci_free_resources(dev);
put_device(&dev->dev);
}
--
1.8.1.4
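The ordering argument in this patch can be illustrated with a small userspace sketch. It is not kernel code: `toy_dev`, `toy_get`, `toy_put`, `toy_release` and `toy_destroy` are hypothetical stand-ins for the device refcount, `pci_release_dev` and `pci_destroy_dev`. The point it shows is that teardown placed in the release callback runs only when the last reference is dropped, so another holder never sees a device with its resources already freed.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pci_dev and its embedded struct device. */
struct toy_dev {
	int refcount;       /* models the device reference count */
	int on_bus_list;    /* 1 while linked on the bus device list */
	int has_resources;  /* 1 while BAR resources are still claimed */
};

/* Models pci_release_dev(): runs only when the last reference goes away. */
static void toy_release(struct toy_dev *d)
{
	d->on_bus_list = 0;
	d->has_resources = 0;
}

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

static void toy_put(struct toy_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		toy_release(d);
}

/* Models pci_destroy_dev() after this patch: it only drops a reference. */
static void toy_destroy(struct toy_dev *d)
{
	toy_put(d);
}
```

With one extra holder, `toy_destroy()` leaves the list linkage and resources intact until that holder calls `toy_put()`.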
* [PATCH 3/6] PCI: Destroy pci dev only once
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
Multiple removals via /sys will call pci_destroy_dev twice.
Add is_removed to record whether pci_destroy_dev has already been called.
During the second call, an extra dev reference is still held via
device_schedule_call, so it is safe to check dev->is_removed.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/remove.c | 5 ++++-
include/linux/pci.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 0d2c36f..cffe269 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -20,7 +20,10 @@ static void pci_stop_dev(struct pci_dev *dev)
static void pci_destroy_dev(struct pci_dev *dev)
{
- put_device(&dev->dev);
+ if (!dev->is_removed) {
+ dev->is_removed = 1;
+ put_device(&dev->dev);
+ }
}
void pci_remove_bus(struct pci_bus *bus)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1084a15..ccb316d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -321,6 +321,7 @@ struct pci_dev {
unsigned int multifunction:1;/* Part of multi-function device */
/* keep track of device state */
unsigned int is_added:1;
+ unsigned int is_removed:1; /* pci_destroy_dev is called */
unsigned int is_busmaster:1; /* device is busmaster */
unsigned int no_msi:1; /* device may not use msi */
unsigned int block_cfg_access:1; /* config space access is blocked */
--
1.8.1.4
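As a rough illustration of the guard this patch adds, here is a userspace sketch under simplifying assumptions: `toy_dev`, `toy_put` and `toy_destroy_once` are hypothetical names, and locking is ignored, so it says nothing about concurrent callers. It only shows that a second destroy call becomes a no-op instead of dropping a reference it does not own.

```c
#include <assert.h>

/* Hypothetical stand-in for the fields of struct pci_dev that matter here. */
struct toy_dev {
	int refcount;
	unsigned int is_removed:1;  /* set once the destroy path has run */
};

static void toy_put(struct toy_dev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Mirrors the shape of the patched pci_destroy_dev(): drop the ref once. */
static void toy_destroy_once(struct toy_dev *d)
{
	if (!d->is_removed) {
		d->is_removed = 1;
		toy_put(d);
	}
}
```

The caller must still hold its own reference while checking the flag, which is the condition the commit message relies on.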
* [PATCH 4/6] PCI: pcibus address to resource converting take bus directly
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
On the bus resource allocation path we do not have a dev to pass along,
so we can just use the bus instead.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/host-bridge.c | 34 +++++++++++++++++++++-------------
include/linux/pci.h | 3 +++
2 files changed, 24 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index a68dc61..2e7288b 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -9,22 +9,19 @@
#include "pci.h"
-static struct pci_bus *find_pci_root_bus(struct pci_dev *dev)
+static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
{
- struct pci_bus *bus;
-
- bus = dev->bus;
while (bus->parent)
bus = bus->parent;
return bus;
}
-static struct pci_host_bridge *find_pci_host_bridge(struct pci_dev *dev)
+static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
{
- struct pci_bus *bus = find_pci_root_bus(dev);
+ struct pci_bus *root_bus = find_pci_root_bus(bus);
- return to_pci_host_bridge(bus->bridge);
+ return to_pci_host_bridge(root_bus->bridge);
}
void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
@@ -40,10 +37,11 @@ static bool resource_contains(struct resource *res1, struct resource *res2)
return res1->start <= res2->start && res1->end >= res2->end;
}
-void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
- struct resource *res)
+void __pcibios_resource_to_bus(struct pci_bus *bus,
+ struct pci_bus_region *region,
+ struct resource *res)
{
- struct pci_host_bridge *bridge = find_pci_host_bridge(dev);
+ struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
struct pci_host_bridge_window *window;
resource_size_t offset = 0;
@@ -60,6 +58,11 @@ void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
region->start = res->start - offset;
region->end = res->end - offset;
}
+void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
+ struct resource *res)
+{
+ __pcibios_resource_to_bus(dev->bus, region, res);
+}
EXPORT_SYMBOL(pcibios_resource_to_bus);
static bool region_contains(struct pci_bus_region *region1,
@@ -68,10 +71,10 @@ static bool region_contains(struct pci_bus_region *region1,
return region1->start <= region2->start && region1->end >= region2->end;
}
-void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
- struct pci_bus_region *region)
+static void __pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
+ struct pci_bus_region *region)
{
- struct pci_host_bridge *bridge = find_pci_host_bridge(dev);
+ struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
struct pci_host_bridge_window *window;
resource_size_t offset = 0;
@@ -93,4 +96,9 @@ void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
res->start = region->start + offset;
res->end = region->end + offset;
}
+void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
+ struct pci_bus_region *region)
+{
+ __pcibios_bus_to_resource(dev->bus, res, region);
+}
EXPORT_SYMBOL(pcibios_bus_to_resource);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ccb316d..55ee90f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -738,6 +738,9 @@ void pci_fixup_cardbus(struct pci_bus *);
/* Generic PCI functions used internally */
+void __pcibios_resource_to_bus(struct pci_bus *bus,
+ struct pci_bus_region *region,
+ struct resource *res);
void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
struct resource *res);
void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
--
1.8.1.4
* [PATCH 5/6] PCI: Add pcibios_bus_addr_to_res()
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
It takes a bus address and returns only the converted (resource) address.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/host-bridge.c | 14 ++++++++++++++
include/linux/pci.h | 2 ++
2 files changed, 16 insertions(+)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index 2e7288b..c911adb 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -102,3 +102,17 @@ void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
__pcibios_bus_to_resource(dev->bus, res, region);
}
EXPORT_SYMBOL(pcibios_bus_to_resource);
+
+resource_size_t pcibios_bus_addr_to_res(struct pci_bus *bus, int flags,
+ resource_size_t addr)
+{
+ struct pci_bus_region region;
+ struct resource r;
+
+ r.flags = flags;
+ region.start = addr;
+ region.end = addr;
+ __pcibios_bus_to_resource(bus, &r, &region);
+
+ return r.end;
+}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 55ee90f..3c6e399 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -745,6 +745,8 @@ void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
struct resource *res);
void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
struct pci_bus_region *region);
+resource_size_t pcibios_bus_addr_to_res(struct pci_bus *bus, int flags,
+ resource_size_t addr);
void pcibios_scan_specific_bus(int busn);
struct pci_bus *pci_find_bus(int domain, int busnr);
void pci_bus_add_devices(const struct pci_bus *bus);
--
1.8.1.4
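The conversion this helper performs can be sketched outside the kernel as follows. This is an assumption-laden model, not the kernel implementation: `toy_window` and `toy_bus_addr_to_res` are hypothetical, each window stores its offset as CPU address minus bus address, and the resource-type check done on the real host bridge windows is left out. An address that falls in no window is returned unchanged, matching an offset of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host bridge window: CPU-side range plus translation offset. */
struct toy_window {
	uint64_t cpu_start;  /* inclusive */
	uint64_t cpu_end;    /* inclusive */
	uint64_t offset;     /* CPU address minus bus address */
};

/* Convert one bus address to a CPU (resource) address. */
static uint64_t toy_bus_addr_to_res(const struct toy_window *w, size_t n,
				    uint64_t bus_addr)
{
	uint64_t offset = 0;

	assert(w != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* Bus-side view of this window. */
		uint64_t bus_start = w[i].cpu_start - w[i].offset;
		uint64_t bus_end = w[i].cpu_end - w[i].offset;

		if (bus_addr >= bus_start && bus_addr <= bus_end) {
			offset = w[i].offset;
			break;
		}
	}
	return bus_addr + offset;
}
```

On a platform where the window at CPU address 4G maps bus address 0, the 32-bit bus limit 0xffffffff converts to a CPU address just below 8G, which is why the next patch converts the limit instead of comparing resource addresses directly.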
* [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
From: Yinghai Lu @ 2013-11-20 1:51 UTC
To: Bjorn Helgaas; +Cc: Gu Zheng, Guo Chao, linux-pci, Yinghai Lu
Will fall back to below 4G if it cannot find any space above 4G.
x86 32-bit without X86_PAE support will have bottom set to 0, because
resource_size_t is 32-bit.
Also, for a 32-bit kernel with 64-bit resource_size_t on a machine with PAE support,
we are safe because iomem_resource is limited to 32 bits according to
x86_phys_bits.
-v2: update bottom assignment to make it clear for machines without PAE support.
-v3: Bjorn's change:
use MAX_RESOURCE instead of -1
use start/end instead of bottom/max
for all arches instead of just x86_64
-v4: updated after PCI_MAX_RESOURCE_32 change.
-v5: restore io handling to use PCI_MAX_RESOURCE_32 as limit.
-v6: check the pcibios_resource_to_bus result for every bus resource, to decide
whether we need to try high first.
It supports all arches instead of just x86_64.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/include/asm/pci.h | 1 -
drivers/pci/bus.c | 42 ++++++++++++++++++++++++++++++++++--------
drivers/pci/pci.h | 2 ++
include/linux/pci.h | 4 ----
4 files changed, 36 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 7d74432..73ff4bc 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -125,7 +125,6 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
/* generic pci stuff */
#include <asm-generic/pci.h>
-#define PCIBIOS_MAX_MEM_32 0xffffffff
#ifdef CONFIG_NUMA
/* Returns the node based on pci bus */
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 1ffd95b..f801f6a 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -125,15 +125,13 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
{
int i, ret = -ENOMEM;
struct resource *r;
- resource_size_t max = -1;
type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
- /* don't allocate too high if the pref mem doesn't support 64bit*/
- if (!(res->flags & IORESOURCE_MEM_64))
- max = PCIBIOS_MAX_MEM_32;
-
pci_bus_for_each_resource(bus, r, i) {
+ resource_size_t start, end, middle;
+ struct pci_bus_region region;
+
if (!r)
continue;
@@ -147,14 +145,42 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
!(res->flags & IORESOURCE_PREFETCH))
continue;
+ start = 0;
+ end = MAX_RESOURCE;
+ /*
+ * don't allocate too high if the pref mem doesn't
+ * support 64bit, also if this is a 64-bit mem
+ * resource, try above 4GB first
+ */
+ __pcibios_resource_to_bus(bus, &region, r);
+ if (region.start <= PCI_MAX_ADDR_32 &&
+ region.end > PCI_MAX_ADDR_32) {
+ middle = pcibios_bus_addr_to_res(bus, res->flags,
+ PCI_MAX_ADDR_32);
+ if (res->flags & IORESOURCE_MEM_64)
+ start = middle + 1;
+ else
+ end = middle;
+ } else if (region.start > PCI_MAX_ADDR_32 &&
+ !(res->flags & IORESOURCE_MEM_64))
+ continue;
+
+again:
/* Ok, try it out.. */
ret = allocate_resource(r, res, size,
- r->start ? : min,
- max, align,
+ max(start, r->start ? : min),
+ end, align,
alignf, alignf_data);
if (ret == 0)
- break;
+ return 0;
+
+ if (start != 0) {
+ start = 0;
+ goto again;
+ }
}
+
+
return ret;
}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 9c91ecc..aea4efb 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -198,6 +198,8 @@ enum pci_bar_type {
pci_bar_mem64, /* A 64-bit memory BAR */
};
+#define PCI_MAX_ADDR_32 ((resource_size_t)0xffffffff)
+
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int crs_timeout);
int pci_setup_device(struct pci_dev *dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3c6e399..1c69789 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1491,10 +1491,6 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev)
#include <asm/pci.h>
-#ifndef PCIBIOS_MAX_MEM_32
-#define PCIBIOS_MAX_MEM_32 (-1)
-#endif
-
/* these helpers provide future and backwards compatibility
* for accessing popular PCI BAR info */
#define pci_resource_start(dev, bar) ((dev)->resource[(bar)].start)
--
1.8.1.4
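The range selection in the hunk above can be modelled in isolation. The sketch below is a simplification under stated assumptions: `toy_range`, `toy_pick_range` and `TOY_MAX_ADDR_32` are hypothetical names, and bus and CPU addresses are taken to be identical (offset 0), so the `pcibios_bus_addr_to_res()` conversion disappears. It returns the first range to try for one bus window, or reports that the window must be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ADDR_32 0xffffffffULL

struct toy_range {
	uint64_t start;  /* inclusive */
	uint64_t end;    /* inclusive */
};

/*
 * Pick the first allocation range to try inside one bus window.
 * Returns false when the window lies entirely above 4G and the
 * resource cannot be placed there.
 */
static bool toy_pick_range(struct toy_range win, bool res_is_mem64,
			   struct toy_range *out)
{
	assert(out);
	out->start = 0;
	out->end = UINT64_MAX;

	if (win.start <= TOY_MAX_ADDR_32 && win.end > TOY_MAX_ADDR_32) {
		/* Window straddles 4G: split it at the boundary. */
		if (res_is_mem64)
			out->start = TOY_MAX_ADDR_32 + 1;  /* try high first */
		else
			out->end = TOY_MAX_ADDR_32;        /* stay below 4G */
	} else if (win.start > TOY_MAX_ADDR_32 && !res_is_mem64) {
		return false;
	}
	return true;
}
```

In the patch itself, a failed first attempt with a nonzero start is retried once with start reset to 0, which is the fallback to below 4G described in the commit message.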
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
From: Guo Chao @ 2013-11-21 10:30 UTC
To: Yinghai Lu, Bjorn Helgaas; +Cc: linux-pci
Hi:
On Tue, Nov 19, 2013 at 05:51:57PM -0800, Yinghai Lu wrote:
> Will fall back to below 4g if it can not find any above 4g.
>
Works fine on our systems if '[RFC PATCH 3/3] PCI: do not reset bridge's
IORESOURCE_MEM_64 flag for ROM BAR' is applied.
Otherwise, on one system, the 32-bit window is too small to provide
fallback space for the prefetchable windows of the root bridge, causing all
prefetchable resources to fail to get addresses.
Any comments about that patch?
Thanks,
Guo Chao
> x86 32bit without X86_PAE support will have bottom set to 0, because
> resource_size_t is 32bit.
>
> Also for 32bit with resource_size_t 64bit kernel on machine with pae support
> we are safe because iomem_resource is limited to 32bit according to
> x86_phys_bits.
>
> -v2: update bottom assigning to make it clear for non-pae support machine.
> -v3: Bjorn's change:
> use MAX_RESOURCE instead of -1
> use start/end instead of bottom/max
> for all arch instead of just x86_64
> -v4: updated after PCI_MAX_RESOURCE_32 change.
> -v5: restore io handling to use PCI_MAX_RESOURCE_32 as limit.
> -v6: checking pcibios_resource_to_bus return for every bus res, to decide it
> if we need to try high at first.
> It supports all arches instead of just x86_64.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> ---
> arch/x86/include/asm/pci.h | 1 -
> drivers/pci/bus.c | 42 ++++++++++++++++++++++++++++++++++--------
> drivers/pci/pci.h | 2 ++
> include/linux/pci.h | 4 ----
> 4 files changed, 36 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index 7d74432..73ff4bc 100644
> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -125,7 +125,6 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>
> /* generic pci stuff */
> #include <asm-generic/pci.h>
> -#define PCIBIOS_MAX_MEM_32 0xffffffff
>
> #ifdef CONFIG_NUMA
> /* Returns the node based on pci bus */
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 1ffd95b..f801f6a 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -125,15 +125,13 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
> {
> int i, ret = -ENOMEM;
> struct resource *r;
> - resource_size_t max = -1;
>
> type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
>
> - /* don't allocate too high if the pref mem doesn't support 64bit*/
> - if (!(res->flags & IORESOURCE_MEM_64))
> - max = PCIBIOS_MAX_MEM_32;
> -
> pci_bus_for_each_resource(bus, r, i) {
> + resource_size_t start, end, middle;
> + struct pci_bus_region region;
> +
> if (!r)
> continue;
>
> @@ -147,14 +145,42 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
> !(res->flags & IORESOURCE_PREFETCH))
> continue;
>
> + start = 0;
> + end = MAX_RESOURCE;
> + /*
> + * don't allocate too high if the pref mem doesn't
> + * support 64bit, also if this is a 64-bit mem
> + * resource, try above 4GB first
> + */
> + __pcibios_resource_to_bus(bus, &region, r);
> + if (region.start <= PCI_MAX_ADDR_32 &&
> + region.end > PCI_MAX_ADDR_32) {
> + middle = pcibios_bus_addr_to_res(bus, res->flags,
> + PCI_MAX_ADDR_32);
> + if (res->flags & IORESOURCE_MEM_64)
> + start = middle + 1;
> + else
> + end = middle;
> + } else if (region.start > PCI_MAX_ADDR_32 &&
> + !(res->flags & IORESOURCE_MEM_64))
> + continue;
> +
> +again:
> /* Ok, try it out.. */
> ret = allocate_resource(r, res, size,
> - r->start ? : min,
> - max, align,
> + max(start, r->start ? : min),
> + end, align,
> alignf, alignf_data);
> if (ret == 0)
> - break;
> + return 0;
> +
> + if (start != 0) {
> + start = 0;
> + goto again;
> + }
> }
> +
> +
> return ret;
> }
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 9c91ecc..aea4efb 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -198,6 +198,8 @@ enum pci_bar_type {
> pci_bar_mem64, /* A 64-bit memory BAR */
> };
>
> +#define PCI_MAX_ADDR_32 ((resource_size_t)0xffffffff)
> +
> bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
> int crs_timeout);
> int pci_setup_device(struct pci_dev *dev);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 3c6e399..1c69789 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1491,10 +1491,6 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev)
>
> #include <asm/pci.h>
>
> -#ifndef PCIBIOS_MAX_MEM_32
> -#define PCIBIOS_MAX_MEM_32 (-1)
> -#endif
> -
> /* these helpers provide future and backwards compatibility
> * for accessing popular PCI BAR info */
> #define pci_resource_start(dev, bar) ((dev)->resource[(bar)].start)
> --
> 1.8.1.4
>
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
From: Yinghai Lu @ 2013-11-21 20:18 UTC
To: Guo Chao, Bjorn Helgaas, Linus Torvalds; +Cc: linux-pci@vger.kernel.org
On Thu, Nov 21, 2013 at 2:30 AM, Guo Chao <yan@linux.vnet.ibm.com> wrote:
> Hi:
>
> On Tue, Nov 19, 2013 at 05:51:57PM -0800, Yinghai Lu wrote:
>> Will fall back to below 4g if it can not find any above 4g.
>>
>
>
>
>
>> x86 32bit without X86_PAE support will have bottom set to 0, because
>> resource_size_t is 32bit.
>>
>> Also for 32bit with resource_size_t 64bit kernel on machine with pae support
>> we are safe because iomem_resource is limited to 32bit according to
>> x86_phys_bits.
>>
>> -v2: update bottom assigning to make it clear for non-pae support machine.
>> -v3: Bjorn's change:
>> use MAX_RESOURCE instead of -1
>> use start/end instead of bottom/max
>> for all arch instead of just x86_64
>> -v4: updated after PCI_MAX_RESOURCE_32 change.
>> -v5: restore io handling to use PCI_MAX_RESOURCE_32 as limit.
>> -v6: checking pcibios_resource_to_bus return for every bus res, to decide it
>> if we need to try high at first.
>> It supports all arches instead of just x86_64.
>>
> Works fine on our systems if '[RFC PATCH 3/3] PCI: do not reset bridge's
> IORESOURCE_MEM_64 flag for ROM BAR' is applied.
>
> Otherwise, on one system, the 32-bit window is too small to provide
> fallback space for the prefetchable windows of the root bridge, causing all
> prefetchable resources to fail to get addresses.
>
> Any comments about that patch?
No, that patch is not right.
It could prevent a ROM BAR from being allocated under 4G.
Possible solutions:
1. Just remove pref from ROM BAR allocation, as in the attached rom_no_pref.patch.
2. Or treat a pref ROM as an optional resource, as in the attached rom_option_1_xxx.patch.
3. Or, more generically, treat all 32-bit prefetchable PCI BARs as normal
MMIO32 during allocation. That is, the prefetchable MMIO window will be used
only for PCI bridges that support 64-bit prefetchable MMIO, and a bridge's
32-bit-only pref BAR register is left blank.
Bjorn, Linus,
Are you happy with No 3?
Thanks
Yinghai
[-- Attachment #2: rom_no_pref.patch --]
[-- Type: text/x-patch, Size: 840 bytes --]
Subject: [PATCH] PCI: Don't allocate rom bar in bridge pref resource
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/probe.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -336,9 +336,8 @@ static void pci_read_bases(struct pci_de
if (rom) {
struct resource *res = &dev->resource[PCI_ROM_RESOURCE];
dev->rom_base_reg = rom;
- res->flags = IORESOURCE_MEM | IORESOURCE_PREFETCH |
- IORESOURCE_READONLY | IORESOURCE_CACHEABLE |
- IORESOURCE_SIZEALIGN;
+ res->flags = IORESOURCE_MEM | IORESOURCE_READONLY |
+ IORESOURCE_CACHEABLE | IORESOURCE_SIZEALIGN;
__pci_read_base(dev, pci_bar_mem32, res, rom);
}
}
[-- Attachment #3: rom_option_1_xxx.patch --]
[-- Type: text/x-patch, Size: 1792 bytes --]
Subject: [PATCH] PCI: Treat ROM resource as optional during assigning.
So we will try to allocate them together with the requested ones; if we cannot
assign them, we can go with the requested ones only and just skip the ROM resource.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/setup-bus.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -303,18 +303,10 @@ static void assign_requested_resources_s
idx = pci_dev_resource_idx(dev_res->dev, res);
if (resource_size(res) &&
pci_assign_resource_fit(dev_res->dev, idx, fit)) {
- if (fail_head) {
- /*
- * if the failed res is for ROM BAR, and it will
- * be enabled later, don't add it to the list
- */
- if (!((idx == PCI_ROM_RESOURCE) &&
- (!(res->flags & IORESOURCE_ROM_ENABLE))))
- add_to_list(fail_head,
- dev_res->dev, res,
- 0 /* don't care */,
- 0 /* don't care */);
- }
+ if (fail_head)
+ add_to_list(fail_head, dev_res->dev, res,
+ 0 /* don't care */,
+ 0 /* don't care */);
reset_resource(res);
}
}
@@ -903,8 +895,9 @@ static int pbus_size_mem(struct pci_bus
continue;
r_size = resource_size(r);
- /* put SRIOV requested res to the optional list */
- if (realloc_head && is_pci_iov_resource_idx(i)) {
+ /* put SRIOV/ROM requested res to the optional list */
+ if (realloc_head && (is_pci_iov_resource_idx(i) ||
+ is_pci_rom_resource_idx(i))) {
r->end = r->start - 1;
add_to_list(realloc_head, dev, r, r_size, 0/* dont' care */);
children_add_size += r_size;
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
From: Linus Torvalds @ 2013-11-21 21:05 UTC
To: Yinghai Lu; +Cc: Guo Chao, Bjorn Helgaas, linux-pci@vger.kernel.org
On Thu, Nov 21, 2013 at 12:18 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>
> 3. Or, more generically, treat all 32-bit prefetchable PCI BARs as normal
> MMIO32 during allocation. That is, the prefetchable MMIO window will be used
> only for PCI bridges that support 64-bit prefetchable MMIO, and a bridge's
> 32-bit-only pref BAR register is left blank.
>
> Bjorn, Linus,
> Are you happy with No 3?
I don't think I understand your #3. If there's a 32-bit device behind
the bridge, it will need to be in the low 4G, so you can't use the
64-bit window because it will be out of range. So that wouldn't work.
So you'd need to have a bridge window in the low 32-bit area too if
there are any 32-bit devices behind the bridge.
Also, haven't we tried the "try to allocate 64-bit devices in high
memory" and it has always failed because there are broken devices?
Linus
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
From: Yinghai Lu @ 2013-11-21 21:34 UTC
To: Linus Torvalds; +Cc: Guo Chao, Bjorn Helgaas, linux-pci@vger.kernel.org
On Thu, Nov 21, 2013 at 1:05 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Nov 21, 2013 at 12:18 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>
>> 3. Or, more generically, treat all 32-bit prefetchable PCI BARs as normal
>> MMIO32 during allocation. That is, the prefetchable MMIO window will be used
>> only for PCI bridges that support 64-bit prefetchable MMIO, and a bridge's
>> 32-bit-only pref BAR register is left blank.
>>
>> Bjorn, Linus,
>> Are you happy with No 3?
>
> I don't think I understand your #3. If there's a 32-bit device behind
> the bridge, it will need to be in the low 4G, so you can't use the
> 64-bit window because it will be out of range. So that wouldn't work.
> So you'd need to have a bridge window in the low 32-bit area too if
> there are any 32-bit devices behind the bridge.
A bridge should have two MMIO windows:
mmio nonpref: it can only be 32-bit
mmio pref: it can be 32-bit only or 64-bit.
And it is OK to put a child pref range under the bridge's nonpref range.
The suggestion here is:
assume mmio pref will be used for 64-bit pref only, and if the mmio pref
window does not support 64-bit, we just ignore it.
If devices under that bridge need pref, we will just use a range from the
bridge's nonpref mmio.
>
> Also, haven't we tried the "try to allocate 64-bit devices in high
> memory" and it has always failed because there are broken devices?
In my test setups, it always works.
--- though only with Intel network devices, Mellanox InfiniBand cards, and
storage cards from QLogic and Emulex.
Thanks
Yinghai
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
2013-11-21 21:34 ` Yinghai Lu
@ 2013-11-21 21:42 ` Linus Torvalds
2013-11-22 6:11 ` Yinghai Lu
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2013-11-21 21:42 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Guo Chao, Bjorn Helgaas, linux-pci@vger.kernel.org
On Thu, Nov 21, 2013 at 1:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>
> The suggestion here is:
> assume the mmio pref window will be used for 64bit pref only, and if the mmio
> pref window does not support 64bit, we just ignore it.
> If devices under that bridge need pref, we will just use a range from the
> bridge's nonpref mmio.
Since we always scan devices behind the bus, why don't we take that
into account? If we find devices with 32-bit mmio, we try to make the
prefetchable one be in the 32-bit range.
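One way to read that, as a minimal sketch rather than actual kernel code: the
RES_* flags and pref_window_can_be_high() are hypothetical, and it assumes
that only prefetchable BARs matter for where the prefetchable window goes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's IORESOURCE_* flag bits. */
#define RES_MEM      0x1u
#define RES_PREFETCH 0x2u
#define RES_MEM_64   0x4u

/*
 * Return true if the bridge's prefetchable window may be placed above 4G,
 * i.e. no scanned child has a prefetchable memory BAR that is 32-bit only.
 */
static bool pref_window_can_be_high(const unsigned int *bar_flags, size_t n)
{
	size_t i;

	assert(bar_flags || n == 0);

	for (i = 0; i < n; i++) {
		unsigned int f = bar_flags[i];

		if ((f & RES_MEM) && (f & RES_PREFETCH) && !(f & RES_MEM_64))
			return false;	/* 32-bit pref BAR: keep window low */
	}
	return true;
}
```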
> In my test setups it has always worked, though I have only tested with Intel
> network devices, Mellanox InfiniBand cards, and storage cards from QLogic and
> Emulex.
Right. And remember how many times your testing did *not* show
problems that others saw?
"Works for me" doesn't work for PCI resource management.
Linus
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
2013-11-21 21:42 ` Linus Torvalds
@ 2013-11-22 6:11 ` Yinghai Lu
2013-11-22 7:08 ` Guo Chao
0 siblings, 1 reply; 14+ messages in thread
From: Yinghai Lu @ 2013-11-22 6:11 UTC (permalink / raw)
To: Linus Torvalds, Guo Chao; +Cc: Bjorn Helgaas, linux-pci@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1362 bytes --]
On Thu, Nov 21, 2013 at 1:42 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Nov 21, 2013 at 1:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>
>> The suggestion here is:
>> assume the mmio pref window will be used for 64bit pref only, and if the mmio
>> pref window does not support 64bit, we just ignore it.
>> If devices under that bridge need pref, we will just use a range from the
>> bridge's nonpref mmio.
>
> Since we always scan devices behind the bus, why don't we take that
> into account? If we find devices with 32-bit mmio, we try to make the
> prefetchable one be in the 32-bit range.
I refreshed an old patch; it separates 32bit pref and 64bit pref depending
on the bridge capability.
| Subject: [PATCH] PCI: Try best to allocate pref mem 64 above 4g
|
| When one of the child resources does not support MEM_64, MEM_64 for the
| bridge gets reset, which pulls the whole resource down under 4G.
|
| We could instead move those 32bit pref mmio under the bridge's 32bit non-pref mmio.
|
| If the bridge supports pref mem 64, only allocate from that window to
| children that support pref mem64.
| Children resources that only support pref mem 32 are allocated
| from non-pref mem instead.
|
| If the bridge only supports 32bit pref mmio, all children pref mmio still
| goes under that.
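The split described in the commit message above comes down to a
mask/type/type2 match on each child resource's flags. Below is a small
standalone model of the two sizing passes. It is a sketch, not the patch
itself: the RES_* flags and helper names are made up, and it assumes the pref
sizing pass succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's IORESOURCE_* flag bits. */
#define RES_MEM      0x1u
#define RES_PREFETCH 0x2u
#define RES_MEM_64   0x4u

/* A child resource is sized in a pass if its masked flags equal type or type2. */
static bool res_matches(unsigned int flags, unsigned int mask,
			unsigned int type, unsigned int type2)
{
	return (flags & mask) == type || (flags & mask) == type2;
}

/* Pass 1: which children are sized into the bridge's pref window. */
static bool sized_into_pref(unsigned int flags, bool bridge_pref_is_64bit)
{
	unsigned int prefmask = RES_MEM | RES_PREFETCH;

	assert(flags & RES_MEM);	/* only memory resources are modelled */
	if (bridge_pref_is_64bit)
		prefmask |= RES_MEM_64;
	return res_matches(flags, prefmask, prefmask, prefmask);
}

/*
 * Pass 2: which children are sized into the bridge's non-pref window.
 * Non-prefetchable resources that carry RES_MEM_64 are outside this sketch.
 */
static bool sized_into_nonpref(unsigned int flags, bool bridge_pref_is_64bit)
{
	unsigned int mask = RES_MEM | RES_PREFETCH;
	unsigned int type2 = RES_MEM;

	assert(flags & RES_MEM);
	if (bridge_pref_is_64bit) {
		mask |= RES_MEM_64;
		type2 = RES_MEM | RES_PREFETCH;	/* pref32 children fall here */
	}
	return res_matches(flags, mask, RES_MEM, type2);
}
```

So under a 64bit-capable pref window, a 32bit pref child matches only the
second pass; under a 32bit-only pref window it matches only the first.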
Guo, can you test it on top of those 6 patches?
Thanks
Yinghai
[-- Attachment #2: pref_mem_64_only.patch --]
[-- Type: text/x-patch, Size: 9041 bytes --]
Subject: [PATCH] PCI: Try best to allocate pref mem 64 above 4g
When one of the child resources does not support MEM_64, MEM_64 for the
bridge gets reset, which pulls the whole resource down under 4G.
We could instead move those 32bit pref mmio under the bridge's 32bit non-pref mmio.
If the bridge supports pref mem 64, only allocate from that window to
children that support pref mem64.
Children resources that only support pref mem 32 are allocated
from non-pref mem instead.
If the bridge only supports 32bit pref mmio, all children pref mmio still
goes under that.
-v2: Add bridge res release support for the case where the bridge mem res holds pref_mem children res.
-v3: Refresh so that it can be applied early, before for_each_dev_res.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
drivers/pci/setup-bus.c | 133 +++++++++++++++++++++++++++++++-----------------
drivers/pci/setup-res.c | 14 ++++-
2 files changed, 101 insertions(+), 46 deletions(-)
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -711,12 +711,11 @@ static void pci_bridge_check_ranges(stru
bus resource of a given type. Note: we intentionally skip
the bus resources which have already been assigned (that is,
have non-NULL parent resource). */
-static struct resource *find_free_bus_resource(struct pci_bus *bus, unsigned long type)
+static struct resource *find_free_bus_resource(struct pci_bus *bus,
+ unsigned long type_mask, unsigned long type)
{
int i;
struct resource *r;
- unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH;
pci_bus_for_each_resource(bus, r, i) {
if (r == &ioport_resource || r == &iomem_resource)
@@ -813,7 +812,8 @@ static void pbus_size_io(struct pci_bus
resource_size_t add_size, struct list_head *realloc_head)
{
struct pci_dev *dev;
- struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO);
+ struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO,
+ IORESOURCE_IO);
resource_size_t size = 0, size0 = 0, size1 = 0;
resource_size_t children_add_size = 0;
resource_size_t min_align, align;
@@ -913,15 +913,16 @@ static inline resource_size_t calculate_
* guarantees that all child resources fit in this size.
*/
static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
- unsigned long type, resource_size_t min_size,
- resource_size_t add_size,
- struct list_head *realloc_head)
+ unsigned long type, unsigned long type2,
+ resource_size_t min_size, resource_size_t add_size,
+ struct list_head *realloc_head)
{
struct pci_dev *dev;
resource_size_t min_align, align, size, size0, size1;
resource_size_t aligns[12]; /* Alignments from 1Mb to 2Gb */
int order, max_order;
- struct resource *b_res = find_free_bus_resource(bus, type);
+ struct resource *b_res = find_free_bus_resource(bus,
+ mask | IORESOURCE_PREFETCH, type);
unsigned int mem64_mask = 0;
resource_size_t children_add_size = 0;
@@ -942,7 +943,8 @@ static int pbus_size_mem(struct pci_bus
struct resource *r = &dev->resource[i];
resource_size_t r_size;
- if (r->parent || (r->flags & mask) != type)
+ if (r->parent || ((r->flags & mask) != type &&
+ (r->flags & mask) != type2))
continue;
r_size = resource_size(r);
#ifdef CONFIG_PCI_IOV
@@ -1115,8 +1117,9 @@ void __ref __pci_bus_size_bridges(struct
struct list_head *realloc_head)
{
struct pci_dev *dev;
- unsigned long mask, prefmask;
+ unsigned long mask, prefmask, type2 = 0;
resource_size_t additional_mem_size = 0, additional_io_size = 0;
+ struct resource *b_res;
list_for_each_entry(dev, &bus->devices, bus_list) {
struct pci_bus *b = dev->subordinate;
@@ -1161,15 +1164,31 @@ void __ref __pci_bus_size_bridges(struct
has already been allocated by arch code, try
non-prefetchable range for both types of PCI memory
resources. */
+ b_res = &bus->self->resource[PCI_BRIDGE_RESOURCES];
mask = IORESOURCE_MEM;
prefmask = IORESOURCE_MEM | IORESOURCE_PREFETCH;
- if (pbus_size_mem(bus, prefmask, prefmask,
+ if (b_res[2].flags & IORESOURCE_MEM_64) {
+ prefmask |= IORESOURCE_MEM_64;
+ if (pbus_size_mem(bus, prefmask, prefmask, prefmask,
realloc_head ? 0 : additional_mem_size,
- additional_mem_size, realloc_head))
- mask = prefmask; /* Success, size non-prefetch only. */
- else
- additional_mem_size += additional_mem_size;
- pbus_size_mem(bus, mask, IORESOURCE_MEM,
+ additional_mem_size, realloc_head)) {
+ /* Success, size non-pref64 only. */
+ mask = prefmask;
+ type2 = prefmask & ~IORESOURCE_MEM_64;
+ }
+ }
+ if (!type2) {
+ prefmask &= ~IORESOURCE_MEM_64;
+ if (pbus_size_mem(bus, prefmask, prefmask, prefmask,
+ realloc_head ? 0 : additional_mem_size,
+ additional_mem_size, realloc_head)) {
+ /* Success, size non-prefetch only. */
+ mask = prefmask;
+ } else
+ additional_mem_size += additional_mem_size;
+ type2 = IORESOURCE_MEM;
+ }
+ pbus_size_mem(bus, mask, IORESOURCE_MEM, type2,
realloc_head ? 0 : additional_mem_size,
additional_mem_size, realloc_head);
break;
@@ -1255,42 +1274,66 @@ static void __ref __pci_bridge_assign_re
static void pci_bridge_release_resources(struct pci_bus *bus,
unsigned long type)
{
- int idx;
- bool changed = false;
- struct pci_dev *dev;
+ struct pci_dev *dev = bus->self;
struct resource *r;
unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH;
+ IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
+ unsigned old_flags = 0;
+ struct resource *b_res;
+ int idx = 1;
- dev = bus->self;
- for (idx = PCI_BRIDGE_RESOURCES; idx <= PCI_BRIDGE_RESOURCE_END;
- idx++) {
- r = &dev->resource[idx];
- if ((r->flags & type_mask) != type)
- continue;
- if (!r->parent)
- continue;
- /*
- * if there are children under that, we should release them
- * all
- */
- release_child_resources(r);
- if (!release_resource(r)) {
- dev_printk(KERN_DEBUG, &dev->dev,
- "resource %d %pR released\n", idx, r);
- /* keep the old size */
- r->end = resource_size(r) - 1;
- r->start = 0;
- r->flags = 0;
- changed = true;
- }
- }
+ b_res = &dev->resource[PCI_BRIDGE_RESOURCES];
+
+ /*
+ * 1. if there is io port assign fail, will release bridge
+ * io port.
+ * 2. if there is non pref mmio assign fail, release bridge
+ * nonpref mmio.
+ * 3. if there is 64bit pref mmio assign fail, and bridge pref
+ * is 64bit, release bridge pref mmio.
+ * 4. if there is pref mmio assign fail, and bridge pref is
+ * 32bit mmio, release bridge pref mmio
+ * 5. if there is pref mmio assign fail, and bridge pref is not
+ * assigned, release bridge nonpref mmio.
+ */
+ if (type & IORESOURCE_IO)
+ idx = 0;
+ else if (!(type & IORESOURCE_PREFETCH))
+ idx = 1;
+ else if ((type & IORESOURCE_MEM_64) &&
+ (b_res[2].flags & IORESOURCE_MEM_64))
+ idx = 2;
+ else if (!(b_res[2].flags & IORESOURCE_MEM_64) &&
+ (b_res[2].flags & IORESOURCE_PREFETCH))
+ idx = 2;
+ else
+ idx = 1;
+
+ r = &b_res[idx];
+
+ if (!r->parent)
+ return;
+
+ /*
+ * if there are children under that, we should release them
+ * all
+ */
+ release_child_resources(r);
+ if (!release_resource(r)) {
+ type = old_flags = r->flags & type_mask;
+ dev_printk(KERN_DEBUG, &dev->dev, "resource %d %pR released\n",
+ PCI_BRIDGE_RESOURCES + idx, r);
+ /* keep the old size */
+ r->end = resource_size(r) - 1;
+ r->start = 0;
+ r->flags = 0;
- if (changed) {
/* avoiding touch the one without PREF */
if (type & IORESOURCE_PREFETCH)
type = IORESOURCE_PREFETCH;
__pci_setup_bridge(bus, type);
+ /* for next child res under same bridge */
+ r->flags = old_flags;
}
}
@@ -1469,7 +1512,7 @@ void pci_assign_unassigned_root_bus_reso
LIST_HEAD(fail_head);
struct pci_dev_resource *fail_res;
unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH;
+ IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
int pci_try_num = 1;
enum enable_type enable_local;
Index: linux-2.6/drivers/pci/setup-res.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-res.c
+++ linux-2.6/drivers/pci/setup-res.c
@@ -208,9 +208,21 @@ static int __pci_assign_resource(struct
/* First, try exact prefetching match.. */
ret = pci_bus_alloc_resource(bus, res, size, align, min,
- IORESOURCE_PREFETCH,
+ IORESOURCE_PREFETCH | IORESOURCE_MEM_64,
pcibios_align_resource, dev);
+ if (ret < 0 &&
+ (res->flags & (IORESOURCE_PREFETCH | IORESOURCE_MEM_64))) {
+ /*
+ * That failed.
+ *
+ * Try below 4g pref
+ */
+ ret = pci_bus_alloc_resource(bus, res, size, align, min,
+ IORESOURCE_PREFETCH,
+ pcibios_align_resource, dev);
+ }
+
if (ret < 0 && (res->flags & IORESOURCE_PREFETCH)) {
/*
* That failed.
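
The five cases listed in the pci_bridge_release_resources() comment above can
be checked in isolation. The sketch below mirrors the index selection from
that hunk as a standalone function; the RES_* flags and the function name are
hypothetical stand-ins, and the return value is the offset from
PCI_BRIDGE_RESOURCES (0 = io, 1 = nonpref mmio, 2 = pref mmio).

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's IORESOURCE_* flag bits. */
#define RES_IO       0x1u
#define RES_MEM      0x2u
#define RES_PREFETCH 0x4u
#define RES_MEM_64   0x8u

/*
 * Given the type of the child resource that failed to assign and the flags
 * of the bridge's pref window, return which bridge window to release.
 */
static int bridge_release_idx(unsigned int type, unsigned int bridge_pref_flags)
{
	assert(type & (RES_IO | RES_MEM));

	if (type & RES_IO)
		return 0;	/* case 1: io port */
	if (!(type & RES_PREFETCH))
		return 1;	/* case 2: nonpref mmio */
	if ((type & RES_MEM_64) && (bridge_pref_flags & RES_MEM_64))
		return 2;	/* case 3: pref64 child, pref64 bridge window */
	if (!(bridge_pref_flags & RES_MEM_64) &&
	    (bridge_pref_flags & RES_PREFETCH))
		return 2;	/* case 4: bridge pref window is 32bit */
	return 1;		/* case 5: pref child lives in nonpref mmio */
}
```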
* Re: [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first
2013-11-22 6:11 ` Yinghai Lu
@ 2013-11-22 7:08 ` Guo Chao
0 siblings, 0 replies; 14+ messages in thread
From: Guo Chao @ 2013-11-22 7:08 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Linus Torvalds, Bjorn Helgaas, linux-pci@vger.kernel.org
On Thu, Nov 21, 2013 at 10:11:27PM -0800, Yinghai Lu wrote:
> On Thu, Nov 21, 2013 at 1:42 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Nov 21, 2013 at 1:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> >>
> >> The suggestion here is:
> >> assume the mmio pref window will be used for 64bit pref only, and if the mmio
> >> pref window does not support 64bit, we just ignore it.
> >> If devices under that bridge need pref, we will just use a range from the
> >> bridge's nonpref mmio.
> >
> > Since we always scan devices behind the bus, why don't we take that
> > into account? If we find devices with 32-bit mmio, we try to make the
> > prefetchable one be in the 32-bit range.
>
> I refreshed an old patch; it separates 32bit pref and 64bit pref depending
> on the bridge capability.
>
> | Subject: [PATCH] PCI: Try best to allocate pref mem 64 above 4g
> |
> | When one of the child resources does not support MEM_64, MEM_64 for the
> | bridge gets reset, which pulls the whole resource down under 4G.
> |
> | We could instead move those 32bit pref mmio under the bridge's 32bit non-pref mmio.
> |
> | If the bridge supports pref mem 64, only allocate from that window to
> | children that support pref mem64.
> | Children resources that only support pref mem 32 are allocated
> | from non-pref mem instead.
> |
> | If the bridge only supports 32bit pref mmio, all children pref mmio still
> | goes under that.
>
> Guo, can you test it on top of those 6 patches?
>
Tested-by: Guo Chao <yan@linux.vnet.ibm.com>
Thanks,
Guo Chao
> Thanks
>
> Yinghai
end of thread, other threads:[~2013-11-22 7:08 UTC | newest]
Thread overview: 14+ messages
2013-11-20 1:51 [PATCH 0/6] PCI: double remove fixing and allocate 64bit mmio pref Yinghai Lu
2013-11-20 1:51 ` [PATCH 1/6] PCI: move back pci_proc_attach_devices calling Yinghai Lu
2013-11-20 1:51 ` [PATCH 2/6] PCI: move resources and bus_list releasing to pci_release_dev Yinghai Lu
2013-11-20 1:51 ` [PATCH 3/6] PCI: Destroy pci dev only once Yinghai Lu
2013-11-20 1:51 ` [PATCH 4/6] PCI: pcibus address to resource converting take bus directly Yinghai Lu
2013-11-20 1:51 ` [PATCH 5/6] PCI: Add pcibios_bus_addr_to_res() Yinghai Lu
2013-11-20 1:51 ` [PATCH 6/6] PCI: Try to allocate mem64 above 4G at first Yinghai Lu
2013-11-21 10:30 ` Guo Chao
2013-11-21 20:18 ` Yinghai Lu
2013-11-21 21:05 ` Linus Torvalds
2013-11-21 21:34 ` Yinghai Lu
2013-11-21 21:42 ` Linus Torvalds
2013-11-22 6:11 ` Yinghai Lu
2013-11-22 7:08 ` Guo Chao