linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug
@ 2015-02-17  7:13 Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 1/7] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

The patchset was built based on patchset "powerpc/powernv: Simplify EEH
implementation", which can be found from:

https://patchwork.ozlabs.org/patch/439956/

The patchset corresponds to skiboot changes, which manages PCI slots
in a unified way: OPAL APIs used to do slot reset, power management,
presence status retrival. The patchset shouldn't be merged before
the OPAL firmware counterpart is merged:

https://patchwork.ozlabs.org/patch/440463/

The kernel changes have been split into 2 parts:
(A) Use the unified PCI slot reset OPAL API, PATCH[1] and PATCH[2]
(B) powernv-php driver to support PCI hotplug for PowerNV platform, starting
    from PATCH[3]. The device tree is scanned when the driver is loaded. If
    any PCI device node is equipped with property "ibm,slot-pluggable" and
    "ibm,reset-by-firmware", it's regarded as hotpluggable slot and the driver
    creates/registers slot for it. After that, the sysfs entries can be used
    to operate the slot.

    PATCH[3/4/5/6]: Necessary code changes to PPC PCI subsystem in order to
                    support PCI slots for PPC PowerNV platform.
    PATCH[7]      : powernv-php driver to support PCI hotplug for PowerNV
                    platform.


Hotplug testing 
===============
# cat /proc/cpuinfo | grep -i powernv
platform        : PowerNV
machine         : PowerNV 8286-41A

# pwd
/sys/bus/pci/slots
# ls
C10  C11  C12  C14  C15  C6  C7  C8  C9

# lspci -s 0003::.
0003:00:00.0 PCI bridge: IBM Device 03dc
0003:01:00.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:01.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:08.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:09.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:10.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:11.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:03:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0003:09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:09:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:09:00.2 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:09:00.3 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:0f:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
# pwd
/sys/bus/pci/slots/C10
# cat address
0003:09:00
# cat cur_bus_speed
5.0 GT/s PCIe
# cat max_bus_speed
8.0 GT/s PCIe
# cat power
1
# echo 0 > power
# lspci -s 0003::.
0003:00:00.0 PCI bridge: IBM Device 03dc
0003:01:00.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:01.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:08.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:09.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:10.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:11.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:03:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0003:0f:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
# echo 1 > power
# lspci -s 0003::.
0003:00:00.0 PCI bridge: IBM Device 03dc
0003:01:00.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:01.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:08.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:09.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:10.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:02:11.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca)
0003:03:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0003:09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:09:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:09:00.2 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:09:00.3 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0003:0f:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

Changelog
=========
v1 -> v2
   * Keep opal_pci_reinit(). In case the slot is resetted by kernel,
     instead of skiboot, this API should be called to restore states
     for those affected devices.
   * Reworked slot ID scheme so that old/new kernel can work with
     skiboot with or without unified PCI slot management support.
   * Code cleanup here and there.
   * Separate powernv-php driver to support PCI hotplug for
     PowerNV platform.
   * Check if the OPAL API supported by firmware before calling
     into it, which is necessary for back-compability.
   * Separate patch for factoring pnv_pci_poll().


Gavin Shan (7):
  powerpc/powernv: Use PCI slot reset infrastructure
  powerpc/powernv: Issue fundamental reset if required
  powerpc/pci: Move pcibios_find_pci_bus() around
  powerpc/pci: Don't scan empty slot
  powerpc/powernv: Introduce pnv_pci_poll()
  powerpc/powernv: Functions to retrieve PCI slot status
  PCI/hotplug: PowerPC PowerNV PCI hotplug driver

 arch/powerpc/include/asm/eeh.h                 |   1 +
 arch/powerpc/include/asm/opal.h                |  14 +-
 arch/powerpc/include/asm/pnv-pci.h             |   3 +
 arch/powerpc/kernel/pci-hotplug.c              |  39 ++-
 arch/powerpc/platforms/powernv/eeh-powernv.c   | 258 +++++++++--------
 arch/powerpc/platforms/powernv/opal-wrappers.S |   2 +
 arch/powerpc/platforms/powernv/pci.c           |  43 +++
 arch/powerpc/platforms/powernv/pci.h           |   1 +
 arch/powerpc/platforms/pseries/pci_dlpar.c     |  32 ---
 drivers/pci/hotplug/Kconfig                    |  12 +
 drivers/pci/hotplug/Makefile                   |   4 +
 drivers/pci/hotplug/powernv_php.c              | 126 ++++++++
 drivers/pci/hotplug/powernv_php.h              |  70 +++++
 drivers/pci/hotplug/powernv_php_slot.c         | 382 +++++++++++++++++++++++++
 14 files changed, 817 insertions(+), 170 deletions(-)
 create mode 100644 drivers/pci/hotplug/powernv_php.c
 create mode 100644 drivers/pci/hotplug/powernv_php.h
 create mode 100644 drivers/pci/hotplug/powernv_php_slot.c

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 1/7] powerpc/powernv: Use PCI slot reset infrastructure
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 2/7] powerpc/powernv: Issue fundamental reset if required Gavin Shan
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

For PowerNV platform, running on top of skiboot, all PE level reset
should be routed to firmware if the bridge of the PE primary bus has
device-node property "ibm,reset-by-firmware". Otherwise, the kernel
has to issue hot reset on PE's primary bus despite the requested
reset types, which is the behaviour before the firmware supports PCI
slot reset. So the code doesn't depend on the PCI slot reset capability
exposed from the firmware.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |   1 +
 arch/powerpc/include/asm/opal.h              |   9 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c | 218 ++++++++++++++-------------
 3 files changed, 114 insertions(+), 114 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 55abfd0..9de87ce 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -193,6 +193,7 @@ enum {
 #define EEH_RESET_DEACTIVATE	0	/* Deactivate the PE reset	*/
 #define EEH_RESET_HOT		1	/* Hot reset			*/
 #define EEH_RESET_FUNDAMENTAL	3	/* Fundamental reset		*/
+#define EEH_RESET_COMPLETE	4	/* PHB complete reset		*/
 #define EEH_LOG_TEMP		1	/* EEH temporary error log	*/
 #define EEH_LOG_PERM		2	/* EEH permanent error log	*/
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9ee0a30..ceec756 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -374,11 +374,6 @@ enum OpalPciResetState {
 	OPAL_ASSERT_RESET = 1
 };
 
-enum OpalPciMaskAction {
-	OPAL_UNMASK_ERROR_TYPE = 0,
-	OPAL_MASK_ERROR_TYPE = 1
-};
-
 enum OpalSlotLedType {
 	OPAL_SLOT_LED_ID_TYPE = 0,
 	OPAL_SLOT_LED_FAULT_TYPE = 1
@@ -867,7 +862,7 @@ int64_t opal_pci_map_pe_dma_window(uint64_t phb_id, uint16_t pe_number, uint16_t
 int64_t opal_pci_map_pe_dma_window_real(uint64_t phb_id, uint16_t pe_number,
 					uint16_t dma_window_number, uint64_t pci_start_addr,
 					uint64_t pci_mem_size);
-int64_t opal_pci_reset(uint64_t phb_id, uint8_t reset_scope, uint8_t assert_state);
+int64_t opal_pci_reset(uint64_t id, uint8_t reset_scope, uint8_t assert_state);
 
 int64_t opal_pci_get_hub_diag_data(uint64_t hub_id, void *diag_buffer,
 				   uint64_t diag_buffer_len);
@@ -883,7 +878,7 @@ int64_t opal_get_epow_status(__be64 *status);
 int64_t opal_set_system_attention_led(uint8_t led_action);
 int64_t opal_pci_next_error(uint64_t phb_id, __be64 *first_frozen_pe,
 			    __be16 *pci_error_type, __be16 *severity);
-int64_t opal_pci_poll(uint64_t phb_id);
+int64_t opal_pci_poll(uint64_t id, uint8_t *val);
 int64_t opal_return_cpu(void);
 int64_t opal_check_token(uint64_t token);
 int64_t opal_reinit_cpus(uint64_t flags);
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index ede6906..8cec57d 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -665,12 +665,12 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int *delay)
 	return ret;
 }
 
-static s64 pnv_eeh_phb_poll(struct pnv_phb *phb)
+static s64 pnv_eeh_poll(uint64_t id)
 {
 	s64 rc = OPAL_HARDWARE;
 
 	while (1) {
-		rc = opal_pci_poll(phb->opal_id);
+		rc = opal_pci_poll(id, NULL);
 		if (rc <= 0)
 			break;
 
@@ -686,84 +686,38 @@ static s64 pnv_eeh_phb_poll(struct pnv_phb *phb)
 int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 {
 	struct pnv_phb *phb = hose->private_data;
+	uint8_t scope;
 	s64 rc = OPAL_HARDWARE;
 
 	pr_debug("%s: Reset PHB#%x, option=%d\n",
 		 __func__, hose->global_number, option);
+	switch (option) {
+	case EEH_RESET_HOT:
+		scope = OPAL_RESET_PCI_HOT;
+		break;
+	case EEH_RESET_FUNDAMENTAL:
+		scope = OPAL_RESET_PCI_FUNDAMENTAL;
+		break;
+	case EEH_RESET_COMPLETE:
+		scope = OPAL_RESET_PHB_COMPLETE;
+		break;
+	case EEH_RESET_DEACTIVATE:
+		return 0;
+	default:
+		pr_warn("%s: Unsupported option %d\n",
+			__func__, option);
+		return -EINVAL;
+        }
 
-	/* Issue PHB complete reset request */
-	if (option == EEH_RESET_FUNDAMENTAL ||
-	    option == EEH_RESET_HOT)
-		rc = opal_pci_reset(phb->opal_id,
-				    OPAL_RESET_PHB_COMPLETE,
-				    OPAL_ASSERT_RESET);
-	else if (option == EEH_RESET_DEACTIVATE)
-		rc = opal_pci_reset(phb->opal_id,
-				    OPAL_RESET_PHB_COMPLETE,
-				    OPAL_DEASSERT_RESET);
-	if (rc < 0)
-		goto out;
-
-	/*
-	 * Poll state of the PHB until the request is done
-	 * successfully. The PHB reset is usually PHB complete
-	 * reset followed by hot reset on root bus. So we also
-	 * need the PCI bus settlement delay.
-	 */
-	rc = pnv_eeh_phb_poll(phb);
-	if (option == EEH_RESET_DEACTIVATE) {
-		if (system_state < SYSTEM_RUNNING)
-			udelay(1000 * EEH_PE_RST_SETTLE_TIME);
-		else
-			msleep(EEH_PE_RST_SETTLE_TIME);
-	}
-out:
-	if (rc != OPAL_SUCCESS)
-		return -EIO;
-
-	return 0;
-}
-
-static int pnv_eeh_root_reset(struct pci_controller *hose, int option)
-{
-	struct pnv_phb *phb = hose->private_data;
-	s64 rc = OPAL_HARDWARE;
-
-	pr_debug("%s: Reset PHB#%x, option=%d\n",
-		 __func__, hose->global_number, option);
-
-	/*
-	 * During the reset deassert time, we needn't care
-	 * the reset scope because the firmware does nothing
-	 * for fundamental or hot reset during deassert phase.
-	 */
-	if (option == EEH_RESET_FUNDAMENTAL)
-		rc = opal_pci_reset(phb->opal_id,
-				    OPAL_RESET_PCI_FUNDAMENTAL,
-				    OPAL_ASSERT_RESET);
-	else if (option == EEH_RESET_HOT)
-		rc = opal_pci_reset(phb->opal_id,
-				    OPAL_RESET_PCI_HOT,
-				    OPAL_ASSERT_RESET);
-	else if (option == EEH_RESET_DEACTIVATE)
-		rc = opal_pci_reset(phb->opal_id,
-				    OPAL_RESET_PCI_HOT,
-				    OPAL_DEASSERT_RESET);
-	if (rc < 0)
-		goto out;
-
-	/* Poll state of the PHB until the request is done */
-	rc = pnv_eeh_phb_poll(phb);
-	if (option == EEH_RESET_DEACTIVATE)
-		msleep(EEH_PE_RST_SETTLE_TIME);
-out:
-	if (rc != OPAL_SUCCESS)
-		return -EIO;
+	/* Issue reset and poll until it's completed */
+	rc = opal_pci_reset(phb->opal_id, scope, OPAL_ASSERT_RESET);
+	if (rc > 0)
+		rc = pnv_eeh_poll(phb->opal_id);
 
-	return 0;
+	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
 }
 
-static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
+static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 {
 	struct device_node *dn = pci_device_to_OF_node(dev);
 	struct eeh_dev *edev = of_node_to_eeh_dev(dn);
@@ -814,14 +768,57 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return 0;
 }
 
+static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
+{
+	struct pci_controller *hose;
+	struct pnv_phb *phb;
+	struct device_node *dn = dev ? pci_device_to_OF_node(dev) : NULL;
+	uint64_t id = (0x1ul << 60);
+	uint8_t scope;
+	s64 rc;
+
+	/*
+	 * If the firmware can't handle it, we will issue hot reset
+	 * on the secondary bus despite the requested reset type
+	 */
+	if (!dn || !of_get_property(dn, "ibm,reset-by-firmware", NULL))
+		return __pnv_eeh_bridge_reset(dev, option);
+
+	/* The firmware can handle the request */
+	switch (option) {
+	case EEH_RESET_HOT:
+		scope = OPAL_RESET_PCI_HOT;
+		break;
+	case EEH_RESET_FUNDAMENTAL:
+		scope = OPAL_RESET_PCI_FUNDAMENTAL;
+		break;
+	case EEH_RESET_DEACTIVATE:
+		return 0;
+	case EEH_RESET_COMPLETE:
+	default:
+		pr_warn("%s: Unsupported option %d on device %s\n",
+			__func__, option, pci_name(dev));
+		return -EINVAL;
+	}
+
+	hose = pci_bus_to_host(dev->bus);
+	phb = hose->private_data;
+	id |= (dev->bus->number << 24) | (dev->devfn << 16) | phb->opal_id;
+	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
+	if (rc > 0)
+		pnv_eeh_poll(id);
+
+	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
+}
+
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	struct pci_controller *hose;
 
 	if (pci_is_root_bus(dev->bus)) {
 		hose = pci_bus_to_host(dev->bus);
-		pnv_eeh_root_reset(hose, EEH_RESET_HOT);
-		pnv_eeh_root_reset(hose, EEH_RESET_DEACTIVATE);
+		pnv_eeh_phb_reset(hose, EEH_RESET_HOT);
+		pnv_eeh_phb_reset(hose, EEH_RESET_DEACTIVATE);
 	} else {
 		pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
 		pnv_eeh_bridge_reset(dev, EEH_RESET_DEACTIVATE);
@@ -843,8 +840,9 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 {
 	struct pci_controller *hose = pe->phb;
+	struct pnv_phb *phb;
 	struct pci_bus *bus;
-	int ret;
+	s64 rc;
 
 	/*
 	 * For PHB reset, we always have complete reset. For those PEs whose
@@ -861,42 +859,48 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 	 * state explicitly after BAR restore.
 	 */
 	if (pe->type & EEH_PE_PHB) {
-		ret = pnv_eeh_phb_reset(hose, option);
-	} else {
-		struct pnv_phb *phb;
-		s64 rc;
-
-		/*
-		 * The frozen PE might be caused by PAPR error injection
-		 * registers, which are expected to be cleared after hitting
-		 * frozen PE as stated in the hardware spec. Unfortunately,
-		 * that's not true on P7IOC. So we have to clear it manually
-		 * to avoid recursive EEH errors during recovery.
-		 */
-		phb = hose->private_data;
-		if (phb->model == PNV_PHB_MODEL_P7IOC &&
-		    (option == EEH_RESET_HOT ||
-		    option == EEH_RESET_FUNDAMENTAL)) {
-			rc = opal_pci_reset(phb->opal_id,
-					    OPAL_RESET_PHB_ERROR,
-					    OPAL_ASSERT_RESET);
-			if (rc != OPAL_SUCCESS) {
-				pr_warn("%s: Failure %lld clearing "
-					"error injection registers\n",
-					__func__, rc);
-				return -EIO;
-			}
+		switch (option) {
+		case EEH_RESET_HOT:
+		case EEH_RESET_FUNDAMENTAL:
+		case EEH_RESET_COMPLETE:
+			break;
+		case EEH_RESET_DEACTIVATE:
+			return 0;
+		default:
+			return -EINVAL;
 		}
 
-		bus = eeh_pe_bus_get(pe);
-		if (pci_is_root_bus(bus) ||
-			pci_is_root_bus(bus->parent))
-			ret = pnv_eeh_root_reset(hose, option);
-		else
-			ret = pnv_eeh_bridge_reset(bus->self, option);
+		return pnv_eeh_phb_reset(hose, EEH_RESET_COMPLETE);
 	}
 
-	return ret;
+	/*
+	 * The frozen PE might be caused by PAPR error injection
+	 * registers, which are expected to be cleared after hitting
+	 * frozen PE as stated in the hardware spec. Unfortunately,
+	 * that's not true on P7IOC. So we have to clear it manually
+	 * to avoid recursive EEH errors during recovery.
+	 */
+	phb = hose->private_data;
+	if (phb->model == PNV_PHB_MODEL_P7IOC &&
+	    (option == EEH_RESET_HOT ||
+	     option == EEH_RESET_FUNDAMENTAL)) {
+		rc = opal_pci_reset(phb->opal_id,
+				    OPAL_RESET_PHB_ERROR,
+				    OPAL_ASSERT_RESET);
+		if (rc != OPAL_SUCCESS) {
+			pr_warn("%s: Failure %lld clearing error "
+				"injection registers\n",
+				__func__, rc);
+			return -EIO;
+		}
+	}
+
+	/* Route the reset request to PHB or upstream bridge */
+	bus = eeh_pe_bus_get(pe);
+	if (pci_is_root_bus(bus))
+		return pnv_eeh_phb_reset(hose, option);
+
+	return pnv_eeh_bridge_reset(bus->self, option);
 }
 
 /**
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 2/7] powerpc/powernv: Issue fundamental reset if required
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 1/7] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 3/7] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

Function pnv_pci_reset_secondary_bus() is used to reset specified
PCI bus, which is leaded by root complex or PCI bridge. That means
the function shouldn't be called on PCI root bus and the patch
removes the logic for that case.

Also, some adapters may require fundamental reset to reload their
firmwares. The patch translates hot reset to fundamental reset
if required.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 36 +++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 8cec57d..eeda6e1 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -811,18 +811,36 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
 }
 
-void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
+static int pnv_pci_dev_reset_type(struct pci_dev *pdev, void *data)
 {
-	struct pci_controller *hose;
+	int *freset = data;
 
-	if (pci_is_root_bus(dev->bus)) {
-		hose = pci_bus_to_host(dev->bus);
-		pnv_eeh_phb_reset(hose, EEH_RESET_HOT);
-		pnv_eeh_phb_reset(hose, EEH_RESET_DEACTIVATE);
-	} else {
-		pnv_eeh_bridge_reset(dev, EEH_RESET_HOT);
-		pnv_eeh_bridge_reset(dev, EEH_RESET_DEACTIVATE);
+	/*
+	 * Stop the iteration immediately if there is any
+	 * one PCI device requesting fundamental reset
+	 */
+	*freset |= pdev->needs_freset;
+	return *freset;
+}
+
+void pnv_pci_reset_secondary_bus(struct pci_dev *pdev)
+{
+	int option = EEH_RESET_HOT;
+	int freset = 0;
+
+	/* In case we need fundamental reset */
+	if (pdev->subordinate) {
+		pci_walk_bus(pdev->subordinate,
+			     pnv_pci_dev_reset_type,
+			     &freset);
+
+		if (freset)
+			option = EEH_RESET_FUNDAMENTAL;
 	}
+
+	/* Issue the requested type of reset */
+	pnv_eeh_bridge_reset(pdev, option);
+	pnv_eeh_bridge_reset(pdev, EEH_RESET_DEACTIVATE);
 }
 
 /**
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 3/7] powerpc/pci: Move pcibios_find_pci_bus() around
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 1/7] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 2/7] powerpc/powernv: Issue fundamental reset if required Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 4/7] powerpc/pci: Don't scan empty slot Gavin Shan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

The patch moves pcibios_find_pci_bus() to PPC kerenl directory so
that it can be reused by hotplug code for pSeries and PowerNV
platform at the same time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/pci-hotplug.c          | 36 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/pci_dlpar.c | 32 --------------------------
 2 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 5b78917..6e2b4e3 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -21,6 +21,42 @@
 #include <asm/firmware.h>
 #include <asm/eeh.h>
 
+static struct pci_bus *find_pci_bus(struct pci_bus *bus,
+				    struct device_node *dn)
+{
+	struct pci_bus *tmp, *child = NULL;
+	struct device_node *busdn;
+
+	busdn = pci_bus_to_OF_node(bus);
+	if (busdn == dn)
+		return bus;
+
+	list_for_each_entry(tmp, &bus->children, node) {
+		child = find_pci_bus(tmp, dn);
+		if (child)
+			break;
+	}
+
+	return child;
+}
+
+/**
+ * pcibios_find_pci_bus - find PCI bus according to the given device node
+ * @dn: Device node
+ *
+ * Find the corresponding PCI bus according to the given device node.
+ */
+struct pci_bus *pcibios_find_pci_bus(struct device_node *dn)
+{
+	struct pci_dn *pdn = PCI_DN(dn);
+
+	if (!pdn  || !pdn->phb || !pdn->phb->bus)
+		return NULL;
+
+	return find_pci_bus(pdn->phb->bus, dn);
+}
+EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
+
 /**
  * pcibios_release_device - release PCI device
  * @dev: PCI device
diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
index 89e2381..98c50bc 100644
--- a/arch/powerpc/platforms/pseries/pci_dlpar.c
+++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
@@ -32,38 +32,6 @@
 #include <asm/firmware.h>
 #include <asm/eeh.h>
 
-static struct pci_bus *
-find_bus_among_children(struct pci_bus *bus,
-                        struct device_node *dn)
-{
-	struct pci_bus *child = NULL;
-	struct pci_bus *tmp;
-	struct device_node *busdn;
-
-	busdn = pci_bus_to_OF_node(bus);
-	if (busdn == dn)
-		return bus;
-
-	list_for_each_entry(tmp, &bus->children, node) {
-		child = find_bus_among_children(tmp, dn);
-		if (child)
-			break;
-	};
-	return child;
-}
-
-struct pci_bus *
-pcibios_find_pci_bus(struct device_node *dn)
-{
-	struct pci_dn *pdn = dn->data;
-
-	if (!pdn  || !pdn->phb || !pdn->phb->bus)
-		return NULL;
-
-	return find_bus_among_children(pdn->phb->bus, dn);
-}
-EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
-
 struct pci_controller *init_phb_dynamic(struct device_node *dn)
 {
 	struct pci_controller *phb;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 4/7] powerpc/pci: Don't scan empty slot
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
                   ` (2 preceding siblings ...)
  2015-02-17  7:13 ` [PATCH RESEND v2 3/7] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 5/7] powerpc/powernv: Introduce pnv_pci_poll() Gavin Shan
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

In hotplug case, function pcibios_add_pci_devices() is called to
rescan the specified PCI bus, which might not have any child devices.
Access to the PCI bus's child device node will cause kernel crash
without exception. The patch adds condition of skipping scanning
PCI bus without child devices, in order to avoid kernel crash.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/pci-hotplug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 6e2b4e3..270a26d 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -120,7 +120,8 @@ void pcibios_add_pci_devices(struct pci_bus * bus)
 	if (mode == PCI_PROBE_DEVTREE) {
 		/* use ofdt-based probe */
 		of_rescan_bus(dn, bus);
-	} else if (mode == PCI_PROBE_NORMAL) {
+	} else if (mode == PCI_PROBE_NORMAL &&
+		   dn->child && PCI_DN(dn->child)) {
 		/*
 		 * Use legacy probe. In the partial hotplug case, we
 		 * probably have grandchildren devices unplugged. So
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 5/7] powerpc/powernv: Introduce pnv_pci_poll()
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
                   ` (3 preceding siblings ...)
  2015-02-17  7:13 ` [PATCH RESEND v2 4/7] powerpc/pci: Don't scan empty slot Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 6/7] powerpc/powernv: Functions to retrieve PCI slot status Gavin Shan
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

We might not get some PCI slot information (e.g. power status)
immediately by OPAL API. Instead, opal_pci_poll() need to be called
for the required information.

The patch introduces pnv_pci_poll(), which bases on original
pnv_eeh_poll(), to cover the above case

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 28 ++--------------------------
 arch/powerpc/platforms/powernv/pci.c         | 19 +++++++++++++++++++
 arch/powerpc/platforms/powernv/pci.h         |  1 +
 3 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index eeda6e1..81a9cb2 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -665,24 +665,6 @@ static int pnv_eeh_get_state(struct eeh_pe *pe, int *delay)
 	return ret;
 }
 
-static s64 pnv_eeh_poll(uint64_t id)
-{
-	s64 rc = OPAL_HARDWARE;
-
-	while (1) {
-		rc = opal_pci_poll(id, NULL);
-		if (rc <= 0)
-			break;
-
-		if (system_state < SYSTEM_RUNNING)
-			udelay(1000 * rc);
-		else
-			msleep(rc);
-	}
-
-	return rc;
-}
-
 int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 {
 	struct pnv_phb *phb = hose->private_data;
@@ -711,10 +693,7 @@ int pnv_eeh_phb_reset(struct pci_controller *hose, int option)
 
 	/* Issue reset and poll until it's completed */
 	rc = opal_pci_reset(phb->opal_id, scope, OPAL_ASSERT_RESET);
-	if (rc > 0)
-		rc = pnv_eeh_poll(phb->opal_id);
-
-	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
+	return pnv_pci_poll(phb->opal_id, rc, NULL);
 }
 
 static int __pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
@@ -805,10 +784,7 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	phb = hose->private_data;
 	id |= (dev->bus->number << 24) | (dev->devfn << 16) | phb->opal_id;
 	rc = opal_pci_reset(id, scope, OPAL_ASSERT_RESET);
-	if (rc > 0)
-		pnv_eeh_poll(id);
-
-	return (rc == OPAL_SUCCESS) ? 0 : -EIO;
+	return pnv_pci_poll(id, rc, NULL);
 }
 
 static int pnv_pci_dev_reset_type(struct pci_dev *pdev, void *data)
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index c68d508..a0ffae2 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -44,6 +44,25 @@
 #define cfg_dbg(fmt...)	do { } while(0)
 //#define cfg_dbg(fmt...)	printk(fmt)
 
+int pnv_pci_poll(uint64_t id, int64_t rval, uint8_t *pval)
+{
+	while (rval > 0) {
+		rval = opal_pci_poll(id, pval);
+		if (rval == OPAL_SUCCESS && pval)
+			rval = opal_pci_poll(id, pval);
+
+		if (rval <= 0)
+			break;
+
+		if (system_state < SYSTEM_RUNNING)
+			udelay(1000 * rval);
+		else
+			msleep(rval);
+	}
+
+	return rval ? -EIO : 0;
+}
+
 #ifdef CONFIG_PCI_MSI
 static int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 18ae927..19f532b 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -194,6 +194,7 @@ struct pnv_phb {
 
 extern struct pci_ops pnv_pci_ops;
 
+int pnv_pci_poll(uint64_t id, int64_t rval, uint8_t *pval);
 void pnv_pci_dump_phb_diag_data(struct pci_controller *hose,
 				unsigned char *log_buff);
 int pnv_pci_cfg_read(struct device_node *dn,
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 6/7] powerpc/powernv: Functions to retrieve PCI slot status
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
                   ` (4 preceding siblings ...)
  2015-02-17  7:13 ` [PATCH RESEND v2 5/7] powerpc/powernv: Introduce pnv_pci_poll() Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17  7:13 ` [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
  2015-02-17 21:44 ` [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Stewart Smith
  7 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

The patch exports two functions, which base on corresponding OPAL
APIs to retrieve PCI slot status. Those functions are going to be
used by PCI hotplug module in subsequent patches:

   pnv_pci_get_power_status()     opal_pci_get_power_status()
   pnv_pci_get_presence_status()  opal_pci_get_presence_status()

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal.h                |  5 +++++
 arch/powerpc/include/asm/pnv-pci.h             |  3 +++
 arch/powerpc/platforms/powernv/opal-wrappers.S |  2 ++
 arch/powerpc/platforms/powernv/pci.c           | 24 ++++++++++++++++++++++++
 4 files changed, 34 insertions(+)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index ceec756..b3fcf07 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -169,6 +169,8 @@ struct opal_sg_list {
 #define OPAL_IPMI_SEND				107
 #define OPAL_IPMI_RECV				108
 #define OPAL_I2C_REQUEST			109
+#define OPAL_PCI_GET_POWER_STATUS		110
+#define OPAL_PCI_GET_PRESENCE_STATUS		111
 
 /* Device tree flags */
 
@@ -926,6 +928,9 @@ int64_t opal_ipmi_recv(uint64_t interface, struct opal_ipmi_msg *msg,
 		uint64_t *msg_len);
 int64_t opal_i2c_request(uint64_t async_token, uint32_t bus_id,
 			 struct opal_i2c_request *oreq);
+int64_t opal_pci_get_power_status(uint64_t id, uint8_t *status);
+int64_t opal_pci_get_presence_status(uint64_t id, uint8_t *status);
+
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index f9b4982..6a19b50 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -13,6 +13,9 @@
 #include <linux/pci.h>
 #include <misc/cxl.h>
 
+extern int pnv_pci_get_power_status(uint64_t id, uint8_t *status);
+extern int pnv_pci_get_presence_status(uint64_t id, uint8_t *status);
+
 int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
 int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 			   unsigned int virq);
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 0509bca..35e513e 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -292,3 +292,5 @@ OPAL_CALL(opal_tpo_read,			OPAL_READ_TPO);
 OPAL_CALL(opal_ipmi_send,			OPAL_IPMI_SEND);
 OPAL_CALL(opal_ipmi_recv,			OPAL_IPMI_RECV);
 OPAL_CALL(opal_i2c_request,			OPAL_I2C_REQUEST);
+OPAL_CALL(opal_pci_get_power_status,		OPAL_PCI_GET_POWER_STATUS);
+OPAL_CALL(opal_pci_get_presence_status,		OPAL_PCI_GET_PRESENCE_STATUS);
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index a0ffae2..bee6798 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -63,6 +63,30 @@ int pnv_pci_poll(uint64_t id, int64_t rval, uint8_t *pval)
 	return rval ? -EIO : 0;
 }
 
+int pnv_pci_get_power_status(uint64_t id, uint8_t *status)
+{
+	long rc;
+
+	if (!opal_check_token(OPAL_PCI_GET_POWER_STATUS))
+		return -ENXIO;
+
+	rc = opal_pci_get_power_status(id, status);
+	return pnv_pci_poll(id, rc, status);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_power_status);
+
+int pnv_pci_get_presence_status(uint64_t id, uint8_t *status)
+{
+	long rc;
+
+	if (!opal_check_token(OPAL_PCI_GET_PRESENCE_STATUS))
+		return -ENXIO;
+
+	rc = opal_pci_get_presence_status(id, status);
+	return pnv_pci_poll(id, rc, status);
+}
+EXPORT_SYMBOL_GPL(pnv_pci_get_presence_status);
+
 #ifdef CONFIG_PCI_MSI
 static int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 {
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
                   ` (5 preceding siblings ...)
  2015-02-17  7:13 ` [PATCH RESEND v2 6/7] powerpc/powernv: Functions to retrieve PCI slot status Gavin Shan
@ 2015-02-17  7:13 ` Gavin Shan
  2015-02-17 22:09   ` Bjorn Helgaas
  2015-02-17 21:44 ` [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Stewart Smith
  7 siblings, 1 reply; 15+ messages in thread
From: Gavin Shan @ 2015-02-17  7:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

The patch intends to add standalone driver to support PCI hotplug
for PowerPC PowerNV platform, which runs on top of skiboot firmware.
The firmware identified hotpluggable slots and marked their device
tree node with proper "ibm,slot-pluggable" and "ibm,reset-by-firmware".
The driver simply scans device-tree to create/register PCI hotplug slot
accordingly.

If the skiboot firmware doesn't support slot status retrieval, the PCI
slot device node shouldn't have property "ibm,reset-by-firmware". In
that case, none of valid PCI slots will be detected from device tree.
The skiboot firmware doesn't export the capability to access attention
LEDs yet and it's something for TBD.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 drivers/pci/hotplug/Kconfig            |  12 ++
 drivers/pci/hotplug/Makefile           |   4 +
 drivers/pci/hotplug/powernv_php.c      | 126 +++++++++++
 drivers/pci/hotplug/powernv_php.h      |  70 ++++++
 drivers/pci/hotplug/powernv_php_slot.c | 382 +++++++++++++++++++++++++++++++++
 5 files changed, 594 insertions(+)
 create mode 100644 drivers/pci/hotplug/powernv_php.c
 create mode 100644 drivers/pci/hotplug/powernv_php.h
 create mode 100644 drivers/pci/hotplug/powernv_php_slot.c

diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
index df8caec..ef55dae 100644
--- a/drivers/pci/hotplug/Kconfig
+++ b/drivers/pci/hotplug/Kconfig
@@ -113,6 +113,18 @@ config HOTPLUG_PCI_SHPC
 
 	  When in doubt, say N.
 
+config HOTPLUG_PCI_POWERNV
+	tristate "PowerPC PowerNV PCI Hotplug driver"
+	depends on PPC_POWERNV && EEH
+	help
+	  Say Y here if you run PowerPC PowerNV platform that supports
+          PCI Hotplug
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called powernv-php.
+
+	  When in doubt, say N.
+
 config HOTPLUG_PCI_RPA
 	tristate "RPA PCI Hotplug driver"
 	depends on PPC_PSERIES && EEH
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 4a9aa08..a69665e 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_HOTPLUG_PCI_PCIE)		+= pciehp.o
 obj-$(CONFIG_HOTPLUG_PCI_CPCI_ZT5550)	+= cpcihp_zt5550.o
 obj-$(CONFIG_HOTPLUG_PCI_CPCI_GENERIC)	+= cpcihp_generic.o
 obj-$(CONFIG_HOTPLUG_PCI_SHPC)		+= shpchp.o
+obj-$(CONFIG_HOTPLUG_PCI_POWERNV)	+= powernv-php.o
 obj-$(CONFIG_HOTPLUG_PCI_RPA)		+= rpaphp.o
 obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR)	+= rpadlpar_io.o
 obj-$(CONFIG_HOTPLUG_PCI_SGI)		+= sgi_hotplug.o
@@ -50,6 +51,9 @@ ibmphp-objs		:=	ibmphp_core.o	\
 acpiphp-objs		:=	acpiphp_core.o	\
 				acpiphp_glue.o
 
+powernv-php-objs	:=	powernv_php.o	\
+				powernv_php_slot.o
+
 rpaphp-objs		:=	rpaphp_core.o	\
 				rpaphp_pci.o	\
 				rpaphp_slot.o
diff --git a/drivers/pci/hotplug/powernv_php.c b/drivers/pci/hotplug/powernv_php.c
new file mode 100644
index 0000000..e36eaf1
--- /dev/null
+++ b/drivers/pci/hotplug/powernv_php.c
@@ -0,0 +1,126 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2015.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sysfs.h>
+#include <linux/pci.h>
+#include <linux/pci_hotplug.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <asm/opal.h>
+#include <asm/pnv-pci.h>
+
+#include "powernv_php.h"
+
+#define DRIVER_VERSION	"0.1"
+#define DRIVER_AUTHOR	"Gavin Shan, IBM Corporation"
+#define DRIVER_DESC	"PowerPC PowerNV PCI Hotplug Driver"
+
+static int powernv_php_register_one(struct device_node *dn)
+{
+	struct powernv_php_slot *slot;
+	const __be32 *prop32;
+	int ret;
+
+	/* Check if it's hotpluggable slot */
+	prop32 = of_get_property(dn, "ibm,slot-pluggable", NULL);
+	if (!prop32 || !of_read_number(prop32, 1))
+		return 0;
+
+	prop32 = of_get_property(dn, "ibm,reset-by-firmware", NULL);
+	if (!prop32 || !of_read_number(prop32, 1))
+		return 0;
+
+	/* Allocate slot */
+	slot = powernv_php_slot_alloc(dn);
+	if (!slot)
+		return -ENODEV;
+
+	/* Register it */
+	ret = powernv_php_slot_register(slot);
+	if (ret) {
+		powernv_php_slot_put(slot);
+		return ret;
+	}
+
+	return powernv_php_slot_enable(slot->php_slot, false);
+}
+
+int powernv_php_register(struct device_node *dn)
+{
+	struct device_node *child;
+	int ret = 0;
+
+	for_each_child_of_node(dn, child) {
+		ret = powernv_php_register_one(child);
+		if (ret)
+			break;
+
+		powernv_php_register(child);
+	}
+
+	return ret;
+}
+
+static void powernv_php_unregister_one(struct device_node *dn)
+{
+	struct powernv_php_slot *slot;
+
+	slot = powernv_php_slot_find(dn);
+	if (!slot)
+		return;
+
+	pci_hp_deregister(slot->php_slot);
+}
+
+void powernv_php_unregister(struct device_node *dn)
+{
+	struct device_node *child;
+
+	for_each_child_of_node(dn, child) {
+		powernv_php_unregister_one(child);
+		powernv_php_unregister(child);
+	}
+}
+
+static int __init powernv_php_init(void)
+{
+	struct device_node *dn;
+
+	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
+
+	/* Scan PHB nodes and their children */
+	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
+		powernv_php_register(dn);
+	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
+		powernv_php_register(dn);
+
+	return 0;
+}
+
+static void __exit powernv_php_exit(void)
+{
+	struct device_node *dn;
+
+	for_each_compatible_node(dn, NULL, "ibm,ioda-phb")
+		powernv_php_unregister(dn);
+	for_each_compatible_node(dn, NULL, "ibm,ioda2-phb")
+		powernv_php_unregister(dn);
+}
+
+module_init(powernv_php_init);
+module_exit(powernv_php_exit);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/drivers/pci/hotplug/powernv_php.h b/drivers/pci/hotplug/powernv_php.h
new file mode 100644
index 0000000..1c2b6f6
--- /dev/null
+++ b/drivers/pci/hotplug/powernv_php.h
@@ -0,0 +1,70 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2015.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _POWERNV_PHP_H
+#define _POWERNV_PHP_H
+
+/* Slot power status */
+#define POWERNV_PHP_SLOT_POWER_OFF	0
+#define POWERNV_PHP_SLOT_POWER_ON	1
+
+/* Slot presence status */
+#define POWERNV_PHP_SLOT_EMPTY		0
+#define POWERNV_PHP_SLOT_PRESENT	1
+
+/* Slot attention status */
+#define POWERNV_PHP_SLOT_ATTEN_OFF	0
+#define POWERNV_PHP_SLOT_ATTEN_ON	1
+#define POWERNV_PHP_SLOT_ATTEN_IND	2
+#define POWERNV_PHP_SLOT_ATTEN_ACT	3
+
+struct powernv_php_slot {
+	struct kref		kref;
+	int			state;
+#define POWERNV_PHP_SLOT_STATE_INIT		0x0
+#define POWERNV_PHP_SLOT_STATE_REGISTER		0x1
+#define POWERNV_PHP_SLOT_STATE_POPULATED	0x2
+	char			*name;
+	struct device_node	*dn;
+	struct pci_bus		*bus;
+	uint64_t		id;
+	int			slot_no;
+	struct hotplug_slot	*php_slot;
+	struct powernv_php_slot	*parent;
+	void (*release)(struct kref *kref);
+	struct list_head	children;
+	struct list_head	link;
+};
+
+#define to_powernv_php_slot(kref) container_of(kref, struct powernv_php_slot, kref)
+
+static inline void powernv_php_slot_get(struct powernv_php_slot *slot)
+{
+	if (slot)
+		kref_get(&slot->kref);
+}
+
+static inline int powernv_php_slot_put(struct powernv_php_slot *slot)
+{
+	if (slot)
+		return kref_put(&slot->kref, slot->release);
+
+	return 0;
+}
+
+struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn);
+struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn);
+int powernv_php_slot_register(struct powernv_php_slot *slot);
+int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan);
+int powernv_php_register(struct device_node *dn);
+void powernv_php_unregister(struct device_node *dn);
+
+#endif /* !_POWERNV_PHP_H */
diff --git a/drivers/pci/hotplug/powernv_php_slot.c b/drivers/pci/hotplug/powernv_php_slot.c
new file mode 100644
index 0000000..84c5c6f
--- /dev/null
+++ b/drivers/pci/hotplug/powernv_php_slot.c
@@ -0,0 +1,382 @@
+/*
+ * PCI Hotplug Driver for PowerPC PowerNV platform.
+ *
+ * Copyright Gavin Shan, IBM Corporation 2015.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sysfs.h>
+#include <linux/pci.h>
+#include <linux/pci_hotplug.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+
+#include <asm/opal.h>
+#include <asm/pnv-pci.h>
+
+#include "powernv_php.h"
+
+static LIST_HEAD(php_slot_list);
+static DEFINE_MUTEX(php_slot_mutex);
+
+static int get_power_status(struct hotplug_slot *php_slot, u8 *val)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t state;
+	int ret;
+
+	/* By default, the power is on */
+	*val = POWERNV_PHP_SLOT_POWER_ON;
+
+	/* Retrieve power status from firmware */
+	ret = pnv_pci_get_power_status(slot->id, &state);
+	if (!ret) {
+		*val = state ? POWERNV_PHP_SLOT_POWER_ON :
+			       POWERNV_PHP_SLOT_POWER_OFF;
+		php_slot->info->power_status = *val;
+	}
+
+	return 0;
+}
+
+static int get_adapter_status(struct hotplug_slot *php_slot, u8 *val)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t state;
+	int ret;
+
+	/* By default, the slot is empty */
+	*val = 0;
+
+	/* Retrieve presence status from firmware */
+	ret = pnv_pci_get_presence_status(slot->id, &state);
+	if (ret >= 0) {
+		*val = state ? POWERNV_PHP_SLOT_PRESENT :
+			       POWERNV_PHP_SLOT_EMPTY;
+		php_slot->info->adapter_status = *val;
+	}
+
+	return 0;
+}
+
+static int set_attention_status(struct hotplug_slot *php_slot, u8 val)
+{
+	/*
+	 * The default operation would to turn on
+	 * the attention
+	*/
+	switch (val) {
+	case POWERNV_PHP_SLOT_ATTEN_OFF:
+	case POWERNV_PHP_SLOT_ATTEN_ON:
+	case POWERNV_PHP_SLOT_ATTEN_IND:
+	case POWERNV_PHP_SLOT_ATTEN_ACT:
+		break;
+	default:
+		val = POWERNV_PHP_SLOT_ATTEN_ON;
+	}
+
+	/* FIXME: Make it real once firmware supports it */
+	php_slot->info->attention_status = val;
+
+	return 0;
+}
+
+int powernv_php_slot_enable(struct hotplug_slot *php_slot, bool rescan)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+	uint8_t presence;
+	int ret;
+
+	/* Check if the slot has been configured */
+	if (slot->state != POWERNV_PHP_SLOT_STATE_REGISTER)
+		return 0;
+
+	/* Retrieve slot presence status */
+	ret = php_slot->ops->get_adapter_status(php_slot, &presence);
+	if (ret)
+		return ret;
+
+	switch (presence) {
+	case POWERNV_PHP_SLOT_PRESENT:
+		pci_lock_rescan_remove();
+		pcibios_add_pci_devices(slot->bus);
+		pci_unlock_rescan_remove();
+		slot->state = POWERNV_PHP_SLOT_STATE_POPULATED;
+
+		/* Rescan for child hotpluggable slots */
+		if (rescan)
+			powernv_php_register(slot->dn);
+		break;
+	case POWERNV_PHP_SLOT_EMPTY:
+		break;
+	default:
+		pr_warn("%s: Invalid presence status %d on slot %s\n",
+			__func__, presence, slot->name);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int enable_slot(struct hotplug_slot *php_slot)
+{
+	return powernv_php_slot_enable(php_slot, true);
+}
+
+static int disable_slot(struct hotplug_slot *php_slot)
+{
+	struct powernv_php_slot *slot = php_slot->private;
+
+	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
+		return 0;
+
+	pci_lock_rescan_remove();
+	pcibios_remove_pci_devices(slot->bus);
+	pci_unlock_rescan_remove();
+	vm_unmap_aliases();
+
+	/* Detach the child hotpluggable slots */
+	powernv_php_unregister(slot->dn);
+
+	/* Update slot state */
+	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
+	return 0;
+}
+
+static struct hotplug_slot_ops php_slot_ops = {
+	.get_power_status	= get_power_status,
+	.get_adapter_status	= get_adapter_status,
+	.set_attention_status	= set_attention_status,
+	.enable_slot		= enable_slot,
+	.disable_slot		= disable_slot,
+};
+
+static struct powernv_php_slot *php_slot_match(struct device_node *dn,
+					       struct powernv_php_slot *slot)
+{
+	struct powernv_php_slot *target, *tmp;
+
+	if (slot->dn == dn)
+		return slot;
+
+	list_for_each_entry(tmp, &slot->children, link) {
+		target = php_slot_match(dn, tmp);
+		if (target)
+			return target;
+	}
+
+	return NULL;
+}
+
+struct powernv_php_slot *powernv_php_slot_find(struct device_node *dn)
+{
+	struct powernv_php_slot *slot, *tmp;
+
+	mutex_lock(&php_slot_mutex);
+	list_for_each_entry(tmp, &php_slot_list, link) {
+		slot = php_slot_match(dn, tmp);
+		if (slot) {
+			mutex_unlock(&php_slot_mutex);
+			return slot;
+		}
+	}
+	mutex_unlock(&php_slot_mutex);
+
+	return NULL;
+}
+
+static void php_slot_free(struct kref *kref)
+{
+	struct powernv_php_slot *slot = to_powernv_php_slot(kref);
+
+	WARN_ON(!list_empty(&slot->children));
+	kfree(slot->name);
+	kfree(slot);
+}
+
+static void php_slot_release(struct hotplug_slot *hp_slot)
+{
+	struct powernv_php_slot *slot = hp_slot->private;
+
+	/* Remove from global or child list */
+	mutex_lock(&php_slot_mutex);
+	list_del(&slot->link);
+	mutex_unlock(&php_slot_mutex);
+
+	/* Detach from parent */
+	powernv_php_slot_put(slot);
+	powernv_php_slot_put(slot->parent);
+}
+
+static bool php_slot_get_id(struct device_node *dn,
+			    uint64_t *id)
+{
+	struct device_node *parent = dn;
+	const __be64 *prop64;
+	const __be32 *prop32;
+
+	/*
+	 * The hotpluggable slot always has a compound Id, which
+	 * consists of 16-bits PHB Id, 16 bits bus/slot/function
+	 * number, and compound indicator
+	 */
+	*id = (0x1ul << 63);
+
+	/* Bus/Slot/Function number */
+	prop32 = of_get_property(dn, "reg", NULL);
+	if (!prop32)
+		return false;
+	*id |= ((of_read_number(prop32, 1) & 0x00ffff00) << 16);
+
+	/* PHB Id */
+	while ((parent = of_get_parent(parent))) {
+		if (!PCI_DN(parent)) {
+			of_node_put(parent);
+			break;
+		}
+
+		if (!of_device_is_compatible(parent, "ibm,ioda2-phb") &&
+		    !of_device_is_compatible(parent, "ibm,ioda-phb")) {
+			of_node_put(parent);
+			continue;
+		}
+
+		prop64 = of_get_property(parent, "ibm,opal-phbid", NULL);
+		if (!prop64) {
+			of_node_put(parent);
+			return false;
+		}
+
+		*id |= be64_to_cpup(prop64);
+		of_node_put(parent);
+		return true;
+	}
+
+        return false;
+}
+
+struct powernv_php_slot *powernv_php_slot_alloc(struct device_node *dn)
+{
+	struct pci_bus *bus;
+	struct powernv_php_slot *slot;
+	const char *label;
+	uint64_t id;
+	int slot_no;
+	size_t size;
+	void *pmem;
+
+	/* Slot name */
+	label = of_get_property(dn, "ibm,slot-label", NULL);
+	if (!label)
+		return NULL;
+
+	/* Slot indentifier */
+	if (!php_slot_get_id(dn, &id))
+		return NULL;
+
+	/* PCI bus */
+	bus = pcibios_find_pci_bus(dn);
+	if (!bus)
+		return NULL;
+
+	/* Slot number */
+	if (dn->child && PCI_DN(dn->child))
+		slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
+	else
+		slot_no = -1;
+
+	/* Allocate slot */
+	size = sizeof(struct powernv_php_slot) +
+	       sizeof(struct hotplug_slot) +
+	       sizeof(struct hotplug_slot_info);
+	pmem = kzalloc(size, GFP_KERNEL);
+	if (!pmem) {
+		pr_warn("%s: Cannot allocate slot for node %s\n",
+			__func__, dn->full_name);
+		return NULL;
+	}
+
+	/* Assign memory blocks */
+	slot = pmem;
+	slot->php_slot = pmem + sizeof(struct powernv_php_slot);
+	slot->php_slot->info = pmem + sizeof(struct powernv_php_slot) +
+			      sizeof(struct hotplug_slot);
+	slot->name = kstrdup(label, GFP_KERNEL);
+	if (!slot->name) {
+		pr_warn("%s: Cannot populate name for node %s\n",
+			__func__, dn->full_name);
+		kfree(pmem);
+		return NULL;
+	}
+
+	/* Initialize slot */
+	kref_init(&slot->kref);
+	slot->state = POWERNV_PHP_SLOT_STATE_INIT;
+	slot->dn = dn;
+	slot->bus = bus;
+	slot->id = id;
+	slot->slot_no = slot_no;
+	slot->release = php_slot_free;
+	slot->php_slot->ops = &php_slot_ops;
+	slot->php_slot->release = php_slot_release;
+	slot->php_slot->private = slot;
+	INIT_LIST_HEAD(&slot->children);
+	INIT_LIST_HEAD(&slot->link);
+
+	return slot;
+}
+
+int powernv_php_slot_register(struct powernv_php_slot *slot)
+{
+	struct powernv_php_slot *parent;
+	struct device_node *dn = slot->dn;
+	int ret;
+
+	/* Avoid register same slot for twice */
+	if (powernv_php_slot_find(slot->dn))
+		return -EEXIST;
+
+	/* Register slot */
+	ret = pci_hp_register(slot->php_slot, slot->bus,
+			      slot->slot_no, slot->name);
+	if (ret) {
+		pr_warn("%s: Cannot register slot %s (%d)\n",
+			__func__, slot->name, ret);
+		return ret;
+	}
+
+	/* Put into global or parent list */
+	while ((dn = of_get_parent(dn))) {
+		if (!PCI_DN(dn)) {
+			of_node_put(dn);
+			break;
+		}
+
+		parent = powernv_php_slot_find(dn);
+		if (parent) {
+			of_node_put(dn);
+			break;
+		}
+	}
+
+	mutex_lock(&php_slot_mutex);
+	if (parent) {
+		powernv_php_slot_get(parent);
+		slot->parent = parent;
+		list_add_tail(&slot->link, &parent->children);
+	} else {
+		list_add_tail(&slot->link, &php_slot_list);
+	}
+	mutex_unlock(&php_slot_mutex);
+
+	/* Update slot state */
+	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
+	return 0;
+}
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug
  2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
                   ` (6 preceding siblings ...)
  2015-02-17  7:13 ` [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
@ 2015-02-17 21:44 ` Stewart Smith
  2015-02-17 22:26   ` Gavin Shan
  7 siblings, 1 reply; 15+ messages in thread
From: Stewart Smith @ 2015-02-17 21:44 UTC (permalink / raw)
  To: Gavin Shan, linuxppc-dev; +Cc: bhelgaas, linux-pci, Gavin Shan

Gavin Shan <gwshan@linux.vnet.ibm.com> writes:

> The patchset was built based on patchset "powerpc/powernv: Simplify EEH
> implementation", which can be found from:
>
> https://patchwork.ozlabs.org/patch/439956/
>
> The patchset corresponds to skiboot changes, which manages PCI slots
> in a unified way: OPAL APIs used to do slot reset, power management,
> presence status retrival. The patchset shouldn't be merged before
> the OPAL firmware counterpart is merged:
>
> https://patchwork.ozlabs.org/patch/440463/

But this patchset will work with old OPAL, right? We're not breaking
compatibility with newer kernels (with this patchset) and older
firmware?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-02-17  7:13 ` [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
@ 2015-02-17 22:09   ` Bjorn Helgaas
  2015-02-18  0:16     ` Gavin Shan
  0 siblings, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2015-02-17 22:09 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev, linux-pci

On Tue, Feb 17, 2015 at 06:13:23PM +1100, Gavin Shan wrote:
> The patch intends to add standalone driver to support PCI hotplug
> for PowerPC PowerNV platform, which runs on top of skiboot firmware.
> The firmware identified hotpluggable slots and marked their device
> tree node with proper "ibm,slot-pluggable" and "ibm,reset-by-firmware".
> The driver simply scans device-tree to create/register PCI hotplug slot
> accordingly.
> 
> If the skiboot firmware doesn't support slot status retrieval, the PCI
> slot device node shouldn't have property "ibm,reset-by-firmware". In
> that case, none of valid PCI slots will be detected from device tree.
> The skiboot firmware doesn't export the capability to access attention
> LEDs yet and it's something for TBD.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ...

> +static int disable_slot(struct hotplug_slot *php_slot)
> +{
> +	struct powernv_php_slot *slot = php_slot->private;
> +
> +	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
> +		return 0;
> +
> +	pci_lock_rescan_remove();
> +	pcibios_remove_pci_devices(slot->bus);
> +	pci_unlock_rescan_remove();
> +	vm_unmap_aliases();

What is vm_unmap_aliases() for?  I see this is probably copied from
rpaphp_core.c, where it was added by b4a26be9f6f8 ("powerpc/pseries: Flush
lazy kernel mappings after unplug operations").

But I don't know whether:

  - this is something specific to powerpc,
  - the lack of vm_unmap_aliases() in other hotplug paths is a bug,
  - the fact that we only do this on powerpc is covering up a
    powerpc bug somewhere

> +
> +	/* Detach the child hotpluggable slots */
> +	powernv_php_unregister(slot->dn);
> +
> +	/* Update slot state */
> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
> +	return 0;
> +}
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug
  2015-02-17 21:44 ` [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Stewart Smith
@ 2015-02-17 22:26   ` Gavin Shan
  0 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2015-02-17 22:26 UTC (permalink / raw)
  To: Stewart Smith; +Cc: bhelgaas, linuxppc-dev, Gavin Shan, linux-pci

On Wed, Feb 18, 2015 at 08:44:19AM +1100, Stewart Smith wrote:
>Gavin Shan <gwshan@linux.vnet.ibm.com> writes:
>
>> The patchset was built based on patchset "powerpc/powernv: Simplify EEH
>> implementation", which can be found from:
>>
>> https://patchwork.ozlabs.org/patch/439956/
>>
>> The patchset corresponds to skiboot changes, which manages PCI slots
>> in a unified way: OPAL APIs used to do slot reset, power management,
>> presence status retrival. The patchset shouldn't be merged before
>> the OPAL firmware counterpart is merged:
>>
>> https://patchwork.ozlabs.org/patch/440463/
>
>But this patchset will work with old OPAL, right? We're not breaking
>compatibility with newer kernels (with this patchset) and older
>firmware?

Yes. With old OPAL, the reset stuff in Linux works as before and we
shouldn't see hotpluggable slots from /sys/bus/pci/slots.

The compatibility issue was pointed by Ben. It's something fixed by
this revision (v2).

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-02-17 22:09   ` Bjorn Helgaas
@ 2015-02-18  0:16     ` Gavin Shan
  2015-02-18  0:30       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 15+ messages in thread
From: Gavin Shan @ 2015-02-18  0:16 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linuxppc-dev, Gavin Shan, linux-pci

On Tue, Feb 17, 2015 at 04:09:16PM -0600, Bjorn Helgaas wrote:
>On Tue, Feb 17, 2015 at 06:13:23PM +1100, Gavin Shan wrote:
>> The patch intends to add standalone driver to support PCI hotplug
>> for PowerPC PowerNV platform, which runs on top of skiboot firmware.
>> The firmware identified hotpluggable slots and marked their device
>> tree node with proper "ibm,slot-pluggable" and "ibm,reset-by-firmware".
>> The driver simply scans device-tree to create/register PCI hotplug slot
>> accordingly.
>> 
>> If the skiboot firmware doesn't support slot status retrieval, the PCI
>> slot device node shouldn't have property "ibm,reset-by-firmware". In
>> that case, none of valid PCI slots will be detected from device tree.
>> The skiboot firmware doesn't export the capability to access attention
>> LEDs yet and it's something for TBD.
>> 
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> ...
>
>> +static int disable_slot(struct hotplug_slot *php_slot)
>> +{
>> +	struct powernv_php_slot *slot = php_slot->private;
>> +
>> +	if (slot->state != POWERNV_PHP_SLOT_STATE_POPULATED)
>> +		return 0;
>> +
>> +	pci_lock_rescan_remove();
>> +	pcibios_remove_pci_devices(slot->bus);
>> +	pci_unlock_rescan_remove();
>> +	vm_unmap_aliases();
>
>What is vm_unmap_aliases() for?  I see this is probably copied from
>rpaphp_core.c, where it was added by b4a26be9f6f8 ("powerpc/pseries: Flush
>lazy kernel mappings after unplug operations").
>
>But I don't know whether:
>
>  - this is something specific to powerpc,
>  - the lack of vm_unmap_aliases() in other hotplug paths is a bug,
>  - the fact that we only do this on powerpc is covering up a
>    powerpc bug somewhere
>

Yes, I copied this piece of code from rpaphp_core.c. I think Ben might
help to answer the questions as he added the patch. I had very quick
check on mm/vmalloc.c and it's reasonable to have vm_unmap_aliases()
here to flush TLB entries for ioremap() regions, which were unmapped
previously. if I'm correct. I don't think it's powerpc specific.

Thanks,
Gavin

>> +
>> +	/* Detach the child hotpluggable slots */
>> +	powernv_php_unregister(slot->dn);
>> +
>> +	/* Update slot state */
>> +	slot->state = POWERNV_PHP_SLOT_STATE_REGISTER;
>> +	return 0;
>> +}
>> 
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-02-18  0:16     ` Gavin Shan
@ 2015-02-18  0:30       ` Benjamin Herrenschmidt
  2015-02-18 14:30         ` Bjorn Helgaas
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-02-18  0:30 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Bjorn Helgaas, linuxppc-dev, linux-pci

On Wed, 2015-02-18 at 11:16 +1100, Gavin Shan wrote:
> >What is vm_unmap_aliases() for?  I see this is probably copied from
> >rpaphp_core.c, where it was added by b4a26be9f6f8 ("powerpc/pseries:
> Flush
> >lazy kernel mappings after unplug operations").
> >
> >But I don't know whether:
> >
> >  - this is something specific to powerpc,
> >  - the lack of vm_unmap_aliases() in other hotplug paths is a bug,
> >  - the fact that we only do this on powerpc is covering up a
> >    powerpc bug somewhere
> >
> 
> Yes, I copied this piece of code from rpaphp_core.c. I think Ben might
> help to answer the questions as he added the patch. I had very quick
> check on mm/vmalloc.c and it's reasonable to have vm_unmap_aliases()
> here to flush TLB entries for ioremap() regions, which were unmapped
> previously. if I'm correct. I don't think it's powerpc specific.

It's specific to running under the PowerVM hypervisor, and thus doesn't
affect PowerNV, just don't copy it over.

It comes from the fact that the generic ioremap code nowadays delays
TLB flushing on unmap. The TLB flushing code is what, on powerpc,
ensures that we remove the translations from the MMU hash table (the
hash table is essentially treated as an extended in-memory TLB), which
on pseries turns into hypervisor calls.

When running under that hypervisor, the HV ensures that no translation
still exists in the hash before allowing a device to be removed from
a partition. If translations still exist, the removal fails.

So we need to force the generic ioremap code to perform all the TLB
flushes for iounmap'ed regions before we "complete" the unplug operation
from a kernel perspective so that the device can be re-assigned to
another partition.

This is thus useless on platforms like powernv which do not run under
such a hypervisor.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-02-18  0:30       ` Benjamin Herrenschmidt
@ 2015-02-18 14:30         ` Bjorn Helgaas
  2015-02-18 21:03           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2015-02-18 14:30 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linux-mm, linuxppc-dev, Gavin Shan, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org

[+cc linux-mm, linux-kernel]

For context, the start of this discussion was here:
http://lkml.kernel.org/r/1424157203-691-8-git-send-email-gwshan@linux.vnet.ibm.com
where Gavin is adding a new PCI hotplug driver for PowerNV.  That new
driver calls vm_unmap_aliases() the same way we do in the existing RPA
hotplug driver here:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/hotplug/rpaphp_core.c#n432

I'm trying to figure out whether it's correct to use
vm_unmap_aliases() here, but I'm not an mm person so all I have is my
gut feeling that something doesn't smell right.

On Tue, Feb 17, 2015 at 6:30 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2015-02-18 at 11:16 +1100, Gavin Shan wrote:
>> >What is vm_unmap_aliases() for?  I see this is probably copied from
>> >rpaphp_core.c, where it was added by b4a26be9f6f8 ("powerpc/pseries:
>> Flush
>> >lazy kernel mappings after unplug operations").
>> >
>> >But I don't know whether:
>> >
>> >  - this is something specific to powerpc,
>> >  - the lack of vm_unmap_aliases() in other hotplug paths is a bug,
>> >  - the fact that we only do this on powerpc is covering up a
>> >    powerpc bug somewhere
>>
>> Yes, I copied this piece of code from rpaphp_core.c. I think Ben might
>> help to answer the questions as he added the patch. I had very quick
>> check on mm/vmalloc.c and it's reasonable to have vm_unmap_aliases()
>> here to flush TLB entries for ioremap() regions, which were unmapped
>> previously. if I'm correct. I don't think it's powerpc specific.
>
> It's specific to running under the PowerVM hypervisor, and thus doesn't
> affect PowerNV, just don't copy it over.
>
> It comes from the fact that the generic ioremap code nowadays delays
> TLB flushing on unmap. The TLB flushing code is what, on powerpc,
> ensures that we remove the translations from the MMU hash table (the
> hash table is essentially treated as an extended in-memory TLB), which
> on pseries turns into hypervisor calls.
>
> When running under that hypervisor, the HV ensures that no translation
> still exists in the hash before allowing a device to be removed from
> a partition. If translations still exist, the removal fails.
>
> So we need to force the generic ioremap code to perform all the TLB
> flushes for iounmap'ed regions before we "complete" the unplug operation
> from a kernel perspective so that the device can be re-assigned to
> another partition.
>
> This is thus useless on platforms like powernv which do not run under
> such a hypervisor.

So the hypervisor call that removes the device from the partition will
fail if there are any translations that reference the memory of the
device.

Let me go through this in excruciating detail to see if I understand
what's going on:

  - PCI core enumerates device D1
  - PCI core sets device D1 BAR 0 = 0x1000
  - driver claims D1
  - driver ioremaps 0x1000 at virtual address V
  - translation V -> 0x1000 is in TLB
  - driver iounmaps V (but V -> 0x1000 translation may remain in TLB)
  - driver releases D1
  - hot-remove D1 (without vm_unmap_aliases(), hypervisor would fail this)
  - it would be a bug to reference V here, but if we did, the
virt-to-phys translation would succeed and we'd have a Master Abort or
Unsupported Request on PCI/PCIe
  - hot-add D2
  - PCI core enumerates device D2
  - PCI core sets device D2 BAR 0 = 0x1000
  - it would be a bug to reference V here (before ioremapping), but if
we did, the reference would reach D2

I don't see anything hypervisor-specific here except for the fact that
the hypervisor checks for existing translations and most other
platforms don't.  But it seems like the unexpected PCI aborts could
happen on any platform.

Are we saying that those PCI aborts are OK, since it's a bug to make
those references in the first place?  Or would we rather take a TLB
miss fault instead so the references never make it to PCI?

I would think there would be similar issues when unmapping and
re-mapping plain old physical memory.  But I don't see
vm_unmap_aliases() calls there, so those issues must be handled
differently.  Should we handle this PCI hotplug issue the same way we
handle RAM?

Bjorn

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  2015-02-18 14:30         ` Bjorn Helgaas
@ 2015-02-18 21:03           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2015-02-18 21:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-mm, linuxppc-dev, Gavin Shan, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org

On Wed, 2015-02-18 at 08:30 -0600, Bjorn Helgaas wrote:
> 
> So the hypervisor call that removes the device from the partition will
> fail if there are any translations that reference the memory of the
> device.
> 
> Let me go through this in excruciating detail to see if I understand
> what's going on:
> 
>   - PCI core enumerates device D1
>   - PCI core sets device D1 BAR 0 = 0x1000
>   - driver claims D1
>   - driver ioremaps 0x1000 at virtual address V
>   - translation V -> 0x1000 is in TLB
>   - driver iounmaps V (but V -> 0x1000 translation may remain in TLB)
>   - driver releases D1
>   - hot-remove D1 (without vm_unmap_aliases(), hypervisor would fail
> this)
>   - it would be a bug to reference V here, but if we did, the
> virt-to-phys translation would succeed and we'd have a Master Abort or
> Unsupported Request on PCI/PCIe
>   - hot-add D2
>   - PCI core enumerates device D2
>   - PCI core sets device D2 BAR 0 = 0x1000
>   - it would be a bug to reference V here (before ioremapping), but if
> we did, the reference would reach D2
> 
> I don't see anything hypervisor-specific here except for the fact that
> the hypervisor checks for existing translations and most other
> platforms don't.  But it seems like the unexpected PCI aborts could
> happen on any platform.

Well, only if we incorrectly dereferenced an ioremap'ed address for
which the driver who owns it is long gone so fairly unlikely. I'm not
saying you shouldn't put the vm_unmap_aliases() in the generic unplug
code, I wouldn't mind that, but I don't think we have a nasty bug to
squash here :)

> Are we saying that those PCI aborts are OK, since it's a bug to make
> those references in the first place?  Or would we rather take a TLB
> miss fault instead so the references never make it to PCI?

I think a miss fault which is basically a page fault -> oops is
preferable for debugging (after all that MMIO might hvae been reassigned
to another device, so that abort might actually instead turn into
writing to the wrong device... bad).

However I also think the scenario is very unlikely.

> I would think there would be similar issues when unmapping and
> re-mapping plain old physical memory.  But I don't see
> vm_unmap_aliases() calls there, so those issues must be handled
> differently.  Should we handle this PCI hotplug issue the same way we
> handle RAM?

If we don't have a vm_unmap_aliases() in the memory unplug path we
probably have a bug on those HVs too :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-02-18 21:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-17  7:13 [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 1/7] powerpc/powernv: Use PCI slot reset infrastructure Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 2/7] powerpc/powernv: Issue fundamental reset if required Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 3/7] powerpc/pci: Move pcibios_find_pci_bus() around Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 4/7] powerpc/pci: Don't scan empty slot Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 5/7] powerpc/powernv: Introduce pnv_pci_poll() Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 6/7] powerpc/powernv: Functions to retrieve PCI slot status Gavin Shan
2015-02-17  7:13 ` [PATCH RESEND v2 7/7] PCI/hotplug: PowerPC PowerNV PCI hotplug driver Gavin Shan
2015-02-17 22:09   ` Bjorn Helgaas
2015-02-18  0:16     ` Gavin Shan
2015-02-18  0:30       ` Benjamin Herrenschmidt
2015-02-18 14:30         ` Bjorn Helgaas
2015-02-18 21:03           ` Benjamin Herrenschmidt
2015-02-17 21:44 ` [PATCH RESEND v2 0/7] powerpc/powernv: Unified PCI slot reset and hotplug Stewart Smith
2015-02-17 22:26   ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).