* [PATCH 0/4] powerpc/powernv: Fix PE number for PF
@ 2015-06-19  2:26 Gavin Shan
  2015-06-19  2:26 ` [PATCH 1/4] powerpc/powernv: Allow to reserve one PE for multiple times Gavin Shan
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Gavin Shan @ 2015-06-19  2:26 UTC
  To: linuxppc-dev; +Cc: Gavin Shan

When CONFIG_PCI_IOV is enabled in the kernel configuration, the logic that
reserves PEs according to the M64 segments consumed in a bridge's M64 window
doesn't work properly. The bridge's M64 window contains VF BARs, which are
M64 BARs, so the current code can reserve and pick PE numbers according to
the M64 segments accommodating VF BARs. The patches fix the issue by
reserving and picking PE numbers based on the BARs (VF BARs excluded) of
PCI devices, instead of the bridge's M64 window.

The code is taken from the patchset "powerpc/powernv: PCI hotplug support",
which I'm working on. With the patches applied, the PE number assigned to
the PF is correct:

[root@powerio-le11 ~]# lspci -vvs 0005:01:00.0
0005:01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
	Subsystem: IBM Device 04e7
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 510
	Region 0: Memory at 3ff200000000 (64-bit, non-prefetchable) [size=1M]
	Region 2: Memory at 3d4400000000 (64-bit, prefetchable) [size=32M]
	:

[root@powerio-le11 /]# cat /sys/bus/pci/devices/0005:01:00.0/eeh_pe_config_addr
0x40


Gavin Shan (4):
  powerpc/powernv: Allow to reserve one PE for multiple times
  powerpc/powernv: Reserve M64 PEs based on BARs
  powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE()
  powerpc/powernv: Pick M64 PEs based on BARs

 arch/powerpc/platforms/powernv/pci-ioda.c | 127 +++++++++++-------------------
 arch/powerpc/platforms/powernv/pci.h      |   5 +-
 2 files changed, 51 insertions(+), 81 deletions(-)

-- 
2.1.0


* [PATCH 1/4] powerpc/powernv: Allow to reserve one PE for multiple times
  2015-06-19  2:26 [PATCH 0/4] powerpc/powernv: Fix PE number for PF Gavin Shan
@ 2015-06-19  2:26 ` Gavin Shan
  2015-07-16  9:54   ` [1/4] " Michael Ellerman
  2015-06-19  2:26 ` [PATCH 2/4] powerpc/powernv: Reserve M64 PEs based on BARs Gavin Shan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Gavin Shan @ 2015-06-19  2:26 UTC
  To: linuxppc-dev; +Cc: Gavin Shan

The PE numbers are reserved according to the root port's M64 window,
which is aligned exactly to the M64 segment size, so no PE should be
reserved more than once. In subsequent patches we will reserve PE
numbers according to the M64 BARs of PCI devices, which aren't
necessarily aligned to the M64 segment size. That means one particular
PE could be reserved multiple times.

The patch allows one PE to be reserved multiple times and lowers the
message printed in that case from warning to debug level.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8424f5c..d5bfa76 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -140,11 +140,9 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 		return;
 	}
 
-	if (test_and_set_bit(pe_no, phb->ioda.pe_alloc)) {
-		pr_warn("%s: PE %d was assigned on PHB#%x\n",
-			__func__, pe_no, phb->hose->global_number);
-		return;
-	}
+	if (test_and_set_bit(pe_no, phb->ioda.pe_alloc))
+		pr_debug("%s: PE %d was reserved on PHB#%x\n",
+			 __func__, pe_no, phb->hose->global_number);
 
 	phb->ioda.pe_array[pe_no].phb = phb;
 	phb->ioda.pe_array[pe_no].pe_number = pe_no;
-- 
2.1.0
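
For illustration only (not part of the posted patch): a minimal userspace
sketch of why reservation has to tolerate repeats once it is driven by BARs.
Two BARs that share one M64 segment both ask for the same PE number, so the
second request must be harmless. TOTAL_PE, test_and_set() and reserve_pe()
are hypothetical stand-ins; the kernel code uses test_and_set_bit() on
phb->ioda.pe_alloc, and this stand-in is not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define TOTAL_PE 256 /* hypothetical PE count, for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per PE number; a set bit means "reserved". */
static unsigned long pe_alloc[(TOTAL_PE + BITS_PER_WORD - 1) / BITS_PER_WORD];

/*
 * Userspace stand-in for the kernel's test_and_set_bit(): sets the bit
 * and returns its previous value. Unlike the kernel helper, not atomic.
 */
static bool test_and_set(unsigned long *map, unsigned int nr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_WORD);
	unsigned long *word = &map[nr / BITS_PER_WORD];
	bool old = (*word & mask) != 0;

	*word |= mask;
	return old;
}

/* Reserve a PE; returns true if it had already been reserved. */
static bool reserve_pe(unsigned int pe_no)
{
	bool was_reserved;

	assert(pe_no < TOTAL_PE);

	was_reserved = test_and_set(pe_alloc, pe_no);
	if (was_reserved)
		fprintf(stderr, "PE %u was reserved\n", pe_no);

	/* Either way, (re)initialisation of the PE would continue here. */
	return was_reserved;
}
```

Calling reserve_pe() twice with the same number reports the repeat but does
not bail out, which mirrors the behaviour the patch introduces.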


* [PATCH 2/4] powerpc/powernv: Reserve M64 PEs based on BARs
  2015-06-19  2:26 [PATCH 0/4] powerpc/powernv: Fix PE number for PF Gavin Shan
  2015-06-19  2:26 ` [PATCH 1/4] powerpc/powernv: Allow to reserve one PE for multiple times Gavin Shan
@ 2015-06-19  2:26 ` Gavin Shan
  2015-06-19  2:26 ` [PATCH 3/4] powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE() Gavin Shan
  2015-06-19  2:26 ` [PATCH 4/4] powerpc/powernv: Pick M64 PEs based on BARs Gavin Shan
  3 siblings, 0 replies; 6+ messages in thread
From: Gavin Shan @ 2015-06-19  2:26 UTC
  To: linuxppc-dev; +Cc: Gavin Shan

On PHB3, some PEs might be reserved in advance to reflect the M64
segments consumed by those PEs. Currently we reserve PEs based on the
M64 window of the root port, which might contain VF BARs. The PEs for
VFs are allocated dynamically, not reserved based on the consumed
M64 segments, so the M64 window of the root port isn't reliable for
the task. Instead, the patch goes through the M64 BARs (VF BARs
excluded) of the PCI devices under the specified root bus and reserves
PEs accordingly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 56 ++++++++++++++++++++-----------
 arch/powerpc/platforms/powernv/pci.h      |  3 +-
 2 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d5bfa76..b1d9fec 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -229,32 +229,48 @@ fail:
 	return -EIO;
 }
 
-static void pnv_ioda2_reserve_m64_pe(struct pnv_phb *phb)
+static void pnv_ioda2_reserve_dev_m64_pe(struct pci_dev *pdev,
+					 unsigned long *pe_bitmap)
 {
-	resource_size_t sgsz = phb->ioda.m64_segsize;
-	struct pci_dev *pdev;
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
 	struct resource *r;
-	int base, step, i;
-
-	/*
-	 * Root bus always has full M64 range and root port has
-	 * M64 range used in reality. So we're checking root port
-	 * instead of root bus.
-	 */
-	list_for_each_entry(pdev, &phb->hose->bus->devices, bus_list) {
-		for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
-			r = &pdev->resource[PCI_BRIDGE_RESOURCES + i];
-			if (!r->parent ||
-			    !pnv_pci_is_mem_pref_64(r->flags))
-				continue;
+	resource_size_t base, sgsz, start, end;
+	int segno, i;
+
+	base = phb->ioda.m64_base;
+	sgsz = phb->ioda.m64_segsize;
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
+		r = &pdev->resource[i];
+		if (!r->parent || !pnv_pci_is_mem_pref_64(r->flags))
+			continue;
 
-			base = (r->start - phb->ioda.m64_base) / sgsz;
-			for (step = 0; step < resource_size(r) / sgsz; step++)
-				pnv_ioda_reserve_pe(phb, base + step);
+		start = _ALIGN_DOWN(r->start - base, sgsz);
+		end = _ALIGN_UP(r->end - base, sgsz);
+		for (segno = start / sgsz; segno < end / sgsz; segno++) {
+			if (pe_bitmap)
+				set_bit(segno, pe_bitmap);
+			else
+				pnv_ioda_reserve_pe(phb, segno);
 		}
 	}
 }
 
+static void pnv_ioda2_reserve_m64_pe(struct pci_bus *bus,
+				     unsigned long *pe_bitmap,
+				     bool all)
+{
+	struct pci_dev *pdev;
+
+	list_for_each_entry(pdev, &bus->devices, bus_list) {
+		pnv_ioda2_reserve_dev_m64_pe(pdev, pe_bitmap);
+
+		if (all && pdev->subordinate)
+			pnv_ioda2_reserve_m64_pe(pdev->subordinate,
+						 pe_bitmap, all);
+	}
+}
+
 static int pnv_ioda2_pick_m64_pe(struct pnv_phb *phb,
 				 struct pci_bus *bus, int all)
 {
@@ -1145,7 +1161,7 @@ static void pnv_pci_ioda_setup_PEs(void)
 
 		/* M64 layout might affect PE allocation */
 		if (phb->reserve_m64_pe)
-			phb->reserve_m64_pe(phb);
+			phb->reserve_m64_pe(hose->bus, NULL, true);
 
 		pnv_ioda_setup_PEs(hose->bus);
 	}
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 8ef2d28..c6ddd18 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -110,7 +110,8 @@ struct pnv_phb {
 	void (*fixup_phb)(struct pci_controller *hose);
 	u32 (*bdfn_to_pe)(struct pnv_phb *phb, struct pci_bus *bus, u32 devfn);
 	int (*init_m64)(struct pnv_phb *phb);
-	void (*reserve_m64_pe)(struct pnv_phb *phb);
+	void (*reserve_m64_pe)(struct pci_bus *bus,
+			       unsigned long *pe_bitmap, bool all);
 	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, int all);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
-- 
2.1.0
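
For illustration only (not part of the posted patch): a userspace sketch of
the segment arithmetic used in pnv_ioda2_reserve_dev_m64_pe() above. It maps
one BAR to a half-open range of M64 segment numbers by aligning the start
down and the inclusive end up to the segment size. bar_to_segments(),
struct seg_range and the align helpers are hypothetical names; the
addresses in the usage note are made up and the segment size is assumed to
be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down/up to a multiple of a; a must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Half-open range [first, end) of M64 segment numbers. */
struct seg_range {
	unsigned int first;
	unsigned int end;
};

/*
 * bar_end is inclusive, like r->end in the kernel's struct resource.
 * m64_base and sgsz play the role of phb->ioda.m64_base and
 * phb->ioda.m64_segsize.
 */
static struct seg_range bar_to_segments(uint64_t bar_start, uint64_t bar_end,
					uint64_t m64_base, uint64_t sgsz)
{
	struct seg_range r;

	assert(sgsz != 0 && (sgsz & (sgsz - 1)) == 0);
	assert(bar_start >= m64_base && bar_end >= bar_start);

	r.first = (unsigned int)(align_down(bar_start - m64_base, sgsz) / sgsz);
	r.end = (unsigned int)(align_up(bar_end - m64_base, sgsz) / sgsz);
	return r;
}
```

With a made-up base of 0x1000 and segment size of 0x100, the BARs
[0x1080, 0x10bf] and [0x10c0, 0x10ff] both map to segment 0 only, which is
the case that makes one PE get reserved more than once, while
[0x1100, 0x12ff] maps to segments 1 and 2.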


* [PATCH 3/4] powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE()
  2015-06-19  2:26 [PATCH 0/4] powerpc/powernv: Fix PE number for PF Gavin Shan
  2015-06-19  2:26 ` [PATCH 1/4] powerpc/powernv: Allow to reserve one PE for multiple times Gavin Shan
  2015-06-19  2:26 ` [PATCH 2/4] powerpc/powernv: Reserve M64 PEs based on BARs Gavin Shan
@ 2015-06-19  2:26 ` Gavin Shan
  2015-06-19  2:26 ` [PATCH 4/4] powerpc/powernv: Pick M64 PEs based on BARs Gavin Shan
  3 siblings, 0 replies; 6+ messages in thread
From: Gavin Shan @ 2015-06-19  2:26 UTC
  To: linuxppc-dev; +Cc: Gavin Shan

The patch changes the type of the last argument of pnv_ioda_setup_bus_PE()
and phb::pick_m64_pe() from int to bool. No functional change.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 8 ++++----
 arch/powerpc/platforms/powernv/pci.h      | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index b1d9fec..909ed58 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -272,7 +272,7 @@ static void pnv_ioda2_reserve_m64_pe(struct pci_bus *bus,
 }
 
 static int pnv_ioda2_pick_m64_pe(struct pnv_phb *phb,
-				 struct pci_bus *bus, int all)
+				 struct pci_bus *bus, bool all)
 {
 	resource_size_t segsz = phb->ioda.m64_segsize;
 	struct pci_dev *pdev;
@@ -1064,7 +1064,7 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
  * subordinate PCI devices and buses. The second type of PE is normally
  * orgiriated by PCIe-to-PCI bridge or PLX switch downstream ports.
  */
-static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
+static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
 	struct pnv_phb *phb = hose->private_data;
@@ -1131,12 +1131,12 @@ static void pnv_ioda_setup_PEs(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
 
-	pnv_ioda_setup_bus_PE(bus, 0);
+	pnv_ioda_setup_bus_PE(bus, false);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		if (dev->subordinate) {
 			if (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)
-				pnv_ioda_setup_bus_PE(dev->subordinate, 1);
+				pnv_ioda_setup_bus_PE(dev->subordinate, true);
 			else
 				pnv_ioda_setup_PEs(dev->subordinate);
 		}
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index c6ddd18..5915cd2 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -112,7 +112,7 @@ struct pnv_phb {
 	int (*init_m64)(struct pnv_phb *phb);
 	void (*reserve_m64_pe)(struct pci_bus *bus,
 			       unsigned long *pe_bitmap, bool all);
-	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, int all);
+	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, bool all);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
 	int (*unfreeze_pe)(struct pnv_phb *phb, int pe_no, int opt);
-- 
2.1.0


* [PATCH 4/4] powerpc/powernv: Pick M64 PEs based on BARs
  2015-06-19  2:26 [PATCH 0/4] powerpc/powernv: Fix PE number for PF Gavin Shan
                   ` (2 preceding siblings ...)
  2015-06-19  2:26 ` [PATCH 3/4] powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE() Gavin Shan
@ 2015-06-19  2:26 ` Gavin Shan
  3 siblings, 0 replies; 6+ messages in thread
From: Gavin Shan @ 2015-06-19  2:26 UTC
  To: linuxppc-dev; +Cc: Gavin Shan

On PHB3, a PE might be reserved in advance to reflect the M64 segments
consumed by the PE, according to the M64 BARs (VF BARs excluded) of the
PCI devices included in the PE. The PE should be picked based on those
M64 BARs instead of the bridge's M64 windows, which might include VF
BARs. Otherwise, the wrong PE could be picked.

The patch calculates the used M64 segments and PE numbers according to
the M64 BARs, excluding VF BARs, of the PCI devices in one particular PE,
instead of the bridge's M64 windows, so that the right PE number is picked.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 59 ++++---------------------------
 arch/powerpc/platforms/powernv/pci.h      |  2 +-
 2 files changed, 8 insertions(+), 53 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 909ed58..5aa5b82 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -271,35 +271,18 @@ static void pnv_ioda2_reserve_m64_pe(struct pci_bus *bus,
 	}
 }
 
-static int pnv_ioda2_pick_m64_pe(struct pnv_phb *phb,
-				 struct pci_bus *bus, bool all)
+static int pnv_ioda2_pick_m64_pe(struct pci_bus *bus, bool all)
 {
-	resource_size_t segsz = phb->ioda.m64_segsize;
-	struct pci_dev *pdev;
-	struct resource *r;
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
 	struct pnv_ioda_pe *master_pe, *pe;
 	unsigned long size, *pe_alloc;
-	bool found;
-	int start, i, j;
+	int i;
 
 	/* Root bus shouldn't use M64 */
 	if (pci_is_root_bus(bus))
 		return IODA_INVALID_PE;
 
-	/* We support only one M64 window on each bus */
-	found = false;
-	pci_bus_for_each_resource(bus, r, i) {
-		if (r && r->parent &&
-		    pnv_pci_is_mem_pref_64(r->flags)) {
-			found = true;
-			break;
-		}
-	}
-
-	/* No M64 window found ? */
-	if (!found)
-		return IODA_INVALID_PE;
-
 	/* Allocate bitmap */
 	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
 	pe_alloc = kzalloc(size, GFP_KERNEL);
@@ -309,35 +292,8 @@ static int pnv_ioda2_pick_m64_pe(struct pnv_phb *phb,
 		return IODA_INVALID_PE;
 	}
 
-	/*
-	 * Figure out reserved PE numbers by the PE
-	 * the its child PEs.
-	 */
-	start = (r->start - phb->ioda.m64_base) / segsz;
-	for (i = 0; i < resource_size(r) / segsz; i++)
-		set_bit(start + i, pe_alloc);
-
-	if (all)
-		goto done;
-
-	/*
-	 * If the PE doesn't cover all subordinate buses,
-	 * we need subtract from reserved PEs for children.
-	 */
-	list_for_each_entry(pdev, &bus->devices, bus_list) {
-		if (!pdev->subordinate)
-			continue;
-
-		pci_bus_for_each_resource(pdev->subordinate, r, i) {
-			if (!r || !r->parent ||
-			    !pnv_pci_is_mem_pref_64(r->flags))
-				continue;
-
-			start = (r->start - phb->ioda.m64_base) / segsz;
-			for (j = 0; j < resource_size(r) / segsz ; j++)
-				clear_bit(start + j, pe_alloc);
-                }
-        }
+	/* Figure out reserved PE numbers by the PE */
+	pnv_ioda2_reserve_m64_pe(bus, pe_alloc, all);
 
 	/*
 	 * the current bus might not own M64 window and that's all
@@ -353,7 +309,6 @@ static int pnv_ioda2_pick_m64_pe(struct pnv_phb *phb,
 	 * Figure out the master PE and put all slave PEs to master
 	 * PE's list to form compound PE.
 	 */
-done:
 	master_pe = NULL;
 	i = -1;
 	while ((i = find_next_bit(pe_alloc, phb->ioda.total_pe, i + 1)) <
@@ -1073,7 +1028,7 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
 
 	/* Check if PE is determined by M64 */
 	if (phb->pick_m64_pe)
-		pe_num = phb->pick_m64_pe(phb, bus, all);
+		pe_num = phb->pick_m64_pe(bus, all);
 
 	/* The PE number isn't pinned by M64 */
 	if (pe_num == IODA_INVALID_PE)
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 5915cd2..e891ff4 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -112,7 +112,7 @@ struct pnv_phb {
 	int (*init_m64)(struct pnv_phb *phb);
 	void (*reserve_m64_pe)(struct pci_bus *bus,
 			       unsigned long *pe_bitmap, bool all);
-	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, bool all);
+	int (*pick_m64_pe)(struct pci_bus *bus, bool all);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
 	int (*unfreeze_pe)(struct pnv_phb *phb, int pe_no, int opt);
-- 
2.1.0
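
For illustration only (not part of the posted patch): a userspace sketch of
the step that follows once pnv_ioda2_reserve_m64_pe() has filled the
temporary bitmap. The bitmap is walked with a find_next_bit()-style loop and
the set bits are grouped into one master PE plus slave PEs. The diff context
does not show how the master is chosen, so taking the lowest set bit is an
assumption here. next_set_bit() and pick_master() are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Userspace stand-in for the kernel's find_next_bit(): index of the
 * first set bit at or after 'from', or nbits if there is none.
 */
static unsigned int next_set_bit(const unsigned long *map,
				 unsigned int nbits, unsigned int from)
{
	unsigned int i;

	for (i = from; i < nbits; i++)
		if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	return nbits;
}

/*
 * Walk the bitmap of PE numbers. The first set bit is treated as the
 * master PE (an assumption) and every later set bit as a slave.
 * Returns the master PE number, or -1 if no bit is set, and stores
 * the number of slaves in *nr_slaves.
 */
static int pick_master(const unsigned long *map, unsigned int nbits,
		       unsigned int *nr_slaves)
{
	int master = -1;
	unsigned int i;

	assert(map && nr_slaves);

	*nr_slaves = 0;
	for (i = next_set_bit(map, nbits, 0); i < nbits;
	     i = next_set_bit(map, nbits, i + 1)) {
		if (master < 0)
			master = (int)i;
		else
			(*nr_slaves)++;
	}
	return master;
}
```

An empty bitmap yields -1, which corresponds to the case where the bus does
not own any M64 segment and the PE number is not pinned by M64.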


* Re: [1/4] powerpc/powernv: Allow to reserve one PE for multiple times
  2015-06-19  2:26 ` [PATCH 1/4] powerpc/powernv: Allow to reserve one PE for multiple times Gavin Shan
@ 2015-07-16  9:54   ` Michael Ellerman
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2015-07-16  9:54 UTC
  To: Gavin Shan, linuxppc-dev; +Cc: Gavin Shan

On Fri, 2015-06-19 at 02:26:16 UTC, Gavin Shan wrote:
> The PE numbers are reserved according to the root port's M64 window,
> which is aligned exactly to the M64 segment size, so no PE should be
> reserved more than once. In subsequent patches we will reserve PE
> numbers according to the M64 BARs of PCI devices, which aren't
> necessarily aligned to the M64 segment size. That means one particular
> PE could be reserved multiple times.
> 
> The patch allows one PE to be reserved multiple times and lowers the
> message printed in that case from warning to debug level.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e9dc4d7f72a375020ecb

cheers

