linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] powerpc: fix oops in pcibios_release_device() after pcibios_free_controller()
@ 2016-08-10  0:44 Mauricio Faria de Oliveira
  2016-08-10  0:44 ` [PATCH v2 1/2] powerpc: add refcount to struct pci_controller Mauricio Faria de Oliveira
  2016-08-10  0:44 ` [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses Mauricio Faria de Oliveira
  0 siblings, 2 replies; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10  0:44 UTC (permalink / raw)
  To: linuxppc-dev

This patchset addresses the problem/suggestion discussed previously [1],
by adding a kref reference counting to the PHB (struct pci_controller):

> It's possible to hit an oops/crash if pcibios_release_device() accesses the
> phb struct and it had been freed earlier -- by pcibios_free_controller() --
> as the memory it pointed to can be reused.

> If after reuse 'phb->controller_ops.release_device' is non-NULL it will be
> called, but it points to an invalid location (that function pointer is not
> set anywhere in the code, so if it's non-NULL, that's not correct), and so
> it hits an oops and the system crashes.

> That problem can happen with the pSeries platform's DLPAR remove operation
> if references to devices are held until after the pcibios_free_controller()
> function runs, and then released - exercising pcibios_release_device() path.

More problem details/call trace are described in the original submission [1].

With the patch applied (tested on 4.8-rc1), the test-case demonstrates that
the PHB is only released/freed after the last reference to the PCI device(s)
is dropped:

  Enable debugging messages:

  # echo 'file pci-common.c +pf; file pci-hotplug.c +pf' > /sys/kernel/debug/dynamic_debug/control
  # echo 8 > /proc/sys/kernel/printk

  Hold references to both PCI devices in the PHB:

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.0
  <...> /sys/block/sdaa -> ../devices/pci0021:01/0021:01:00.0/<...>
  
  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.1
  <...> /sys/block/sdab -> ../devices/pci0021:01/0021:01:00.1/<...>
  
  # cat > /dev/sdaa & pid1=$!
  # cat > /dev/sdab & pid2=$!
  
  Perform DLPAR remove of the PHB:

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  888.776964] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  888.776983] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  893.696431] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  908.352717] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  908.352744] pcibios_remove_bus: PCI 0021:01, pci_bus c0000001e7d59400, phb c0000001e7d57400
  [  908.352753] controller_put: PCI domain 33, phb c0000001e7d57400
  [  908.352811] pcibios_free_controller: PCI domain 33, phb c0000001e7d57400, phb->is_dynamic 1
  [  908.352820] controller_put: PCI domain 33, phb c0000001e7d57400
  [  908.352832] rpadlpar_io: slot PHB 33 removed
  
  Notice the PHB was not freed yet (controller_free() was not called)

  Drop the last references to the PCI devices:

  # kill -9 $pid1
  [  991.221998] pcibios_release_device: PCI 0021:01:00.0, pci_dev c0000001ee0b7000, phb c0000001e7d57400
  [  991.222005] controller_put: PCI domain 33, phb c0000001e7d57400
  
  # kill -9 $pid2
  [  996.076293] pcibios_release_device: PCI 0021:01:00.1, pci_dev c0000001ee0b3800, phb c0000001e7d57400
  [  996.076299] controller_put: PCI domain 33, phb c0000001e7d57400
  [  996.076303] controller_free: PCI domain: 33, phb c0000001e7d57400, phb->is_dynamic 1

  Notice that only now the PHB was freed.

Note: this patchset currently covers references from struct pci_dev/pci_bus,
which _is_ enough to resolve this particular problem; it does not yet cover
references from struct pci_dn/eeh_pe/eeh_dev (but since those are unchanged
by/unrelated to this patchset, they remain working in the very same manner).

I have gone to great lengths in time studying the relevant code for EEH in
order to implement those too, but am not yet sure of all the details (e.g.,
lifetime of eeh_dev, removal of pci_dn, etc) that need to be considered to
kfree() them - will likely ask Gavin & maintainers for RFC after some time.

Links:
  [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-July/145264.html 

Changelog:
  v2: change approach to use krefs (suggestion by benh & mpe).

Mauricio Faria de Oliveira (2):
  powerpc: add refcount to struct pci_controller
  powerpc: update pci_controller.refcount for PCI devices and buses

 arch/powerpc/include/asm/pci-bridge.h | 15 ++++++++
 arch/powerpc/kernel/pci-common.c      | 72 +++++++++++++++++++++++++++++++++--
 arch/powerpc/kernel/pci-hotplug.c     | 29 ++++++++++++++
 arch/powerpc/kernel/pci_of_scan.c     |  1 +
 4 files changed, 114 insertions(+), 3 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/2] powerpc: add refcount to struct pci_controller
  2016-08-10  0:44 [PATCH v2 0/2] powerpc: fix oops in pcibios_release_device() after pcibios_free_controller() Mauricio Faria de Oliveira
@ 2016-08-10  0:44 ` Mauricio Faria de Oliveira
  2016-08-10  1:45   ` Andrew Donnellan
  2016-08-10  0:44 ` [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses Mauricio Faria de Oliveira
  1 sibling, 1 reply; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10  0:44 UTC (permalink / raw)
  To: linuxppc-dev

This commit introduces the 'refcount' field in struct pci_controller,
along with the corresponding functions 'controller_(get|put|free)()'.

The functions 'pcibios_(alloc|free)_controller()' are modified to use
that in a backwards compatible manner. (i.e., kfree(phb) is performed
when pcibios_free_controller() is called.)

So, this patch adds the infrastructure with no functional differences
to current users of pcibios_(alloc|free)_controller().  Notably, only
the pseries platform calls pcibios_free_controller() for some purpose
other than to release the pci_controller in case of errors just after
the call to pcibios_alloc_controller() (i.e., 'goto error' scenarios).

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h | 15 +++++++++++
 arch/powerpc/kernel/pci-common.c      | 47 ++++++++++++++++++++++++++++++++---
 2 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index b5e88e4..6fde0a9 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -10,6 +10,7 @@
 #include <linux/pci.h>
 #include <linux/list.h>
 #include <linux/ioport.h>
+#include <linux/kref.h>
 
 struct device_node;
 
@@ -128,9 +129,23 @@ struct pci_controller {
 	struct pci_dn *pci_data;
 #endif	/* CONFIG_PPC64 */
 
+	/*
+	 * Reference counting for the structures:
+	 * - TODO pci_dev
+	 * - TODO pci_bus
+	 * - TODO pci_dn
+	 * - TODO eeh_pe
+	 * - TODO eeh_dev
+	 */
+	struct kref refcount;
+
 	void *private_data;
 };
 
+void controller_get(struct pci_controller *phb);
+void controller_put(struct pci_controller *phb);
+void controller_free(struct kref *kref);
+
 /* These are used for config access before all the PCI probing
    has been done. */
 extern int early_read_config_byte(struct pci_controller *hose, int bus,
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 08afddf..29b37d3 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -114,6 +114,8 @@ struct pci_controller *pcibios_alloc_controller(struct device_node *dev)
 	phb = zalloc_maybe_bootmem(sizeof(struct pci_controller), GFP_KERNEL);
 	if (phb == NULL)
 		return NULL;
+
+	kref_init(&phb->refcount); /* use first reference for hose_list entry */
 	spin_lock(&hose_spinlock);
 	phb->global_number = get_phb_number(dev);
 	list_add_tail(&phb->list_node, &hose_list);
@@ -130,12 +132,53 @@ struct pci_controller *pcibios_alloc_controller(struct device_node *dev)
 		PHB_SET_NODE(phb, nid);
 	}
 #endif
+
+	pr_debug("PCI domain %d, phb %p, phb->is_dynamic %d\n",
+		phb->global_number, phb, phb->is_dynamic);
+
 	return phb;
 }
 EXPORT_SYMBOL_GPL(pcibios_alloc_controller);
 
+void controller_get(struct pci_controller *phb)
+{
+	if (unlikely(!phb)) {
+		pr_warn("%s: null PHB; refcount bug!\n", __func__);
+		WARN_ON(1);
+	} else {
+		pr_debug("PCI domain %d, phb %p\n", phb->global_number, phb);
+		kref_get(&phb->refcount);
+	}
+}
+
+void controller_put(struct pci_controller *phb)
+{
+	if (unlikely(!phb)) {
+		pr_warn("%s: null PHB; refcount bug!\n", __func__);
+		WARN_ON(1);
+	} else {
+		pr_debug("PCI domain %d, phb %p\n", phb->global_number, phb);
+		kref_put(&phb->refcount, controller_free);
+	}
+}
+
+void controller_free(struct kref *kref)
+{
+	struct pci_controller *phb = container_of(kref, struct pci_controller,
+							refcount);
+
+	pr_info("%s: PCI domain: %d, phb %p, phb->is_dynamic %d\n",
+		__func__, phb->global_number, phb, phb->is_dynamic);
+
+	if (phb->is_dynamic)
+		kfree(phb);
+}
+
 void pcibios_free_controller(struct pci_controller *phb)
 {
+	pr_debug("PCI domain %d, phb %p, phb->is_dynamic %d\n",
+		phb->global_number, phb, phb->is_dynamic);
+
 	spin_lock(&hose_spinlock);
 
 	/* Clear bit of phb_bitmap to allow reuse of this PHB number. */
@@ -143,10 +186,8 @@ void pcibios_free_controller(struct pci_controller *phb)
 		clear_bit(phb->global_number, phb_bitmap);
 
 	list_del(&phb->list_node);
+	controller_put(phb);
 	spin_unlock(&hose_spinlock);
-
-	if (phb->is_dynamic)
-		kfree(phb);
 }
 EXPORT_SYMBOL_GPL(pcibios_free_controller);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses
  2016-08-10  0:44 [PATCH v2 0/2] powerpc: fix oops in pcibios_release_device() after pcibios_free_controller() Mauricio Faria de Oliveira
  2016-08-10  0:44 ` [PATCH v2 1/2] powerpc: add refcount to struct pci_controller Mauricio Faria de Oliveira
@ 2016-08-10  0:44 ` Mauricio Faria de Oliveira
  2016-08-10  3:35   ` Andrew Donnellan
  1 sibling, 1 reply; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10  0:44 UTC (permalink / raw)
  To: linuxppc-dev

This patch employs the refcount in struct pci_controller to track
the references from PCI devices and buses (struct pci_dev/pci_bus).

In order to do that without modifying any PCI scan/probe approach
(e.g., PCI_PROBE_DEVTREE and PCI_PROBE_NORMAL), it leverages some
of the PCI arch-specific callback: pci_(add|release)_device() and
pci_(add|remove)_bus().

(a small change is required for PCI_PROBE_DEVTREE, which makes it
consistent with PCI_PROBE_NORMAL - the pci_dev should inherit the
parent pci_bus's phb pointer - see pci_setup_device() in probe.c)

This also has the advantage that locking for kref_(get|put)() is
satisfied by the 'pci_rescan_remove_lock' mutex, which is normal
practice for usage of the PCI subsystem - thus already in place.
More details added in comment on pcibios_release_device().

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h |  4 ++--
 arch/powerpc/kernel/pci-common.c      | 25 +++++++++++++++++++++++++
 arch/powerpc/kernel/pci-hotplug.c     | 29 +++++++++++++++++++++++++++++
 arch/powerpc/kernel/pci_of_scan.c     |  1 +
 4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 6fde0a9..d10eee3 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -131,8 +131,8 @@ struct pci_controller {
 
 	/*
 	 * Reference counting for the structures:
-	 * - TODO pci_dev
-	 * - TODO pci_bus
+	 * - pci_dev
+	 * - pci_bus
 	 * - TODO pci_dn
 	 * - TODO eeh_pe
 	 * - TODO eeh_dev
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 29b37d3..c55e9c0 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1047,6 +1047,17 @@ static void pcibios_setup_device(struct pci_dev *dev)
 
 int pcibios_add_device(struct pci_dev *dev)
 {
+	struct pci_controller *phb = pci_bus_to_host(dev->bus);
+
+	pr_debug("PCI %s, pci_dev %p, phb %p\n", dev_name(&dev->dev), dev, phb);
+
+	if (!phb)
+		pr_warn("%s: PCI device %s has null PHB; refcount bug!",
+			__func__, dev_name(&dev->dev)); /* WARN_ON ahead */
+
+	/* locking: see comment on pcibios_release_device(). */
+	controller_get(phb);
+
 	/*
 	 * We can only call pcibios_setup_device() after bus setup is complete,
 	 * since some of the platform specific DMA setup code depends on it.
@@ -1062,6 +1073,20 @@ int pcibios_add_device(struct pci_dev *dev)
 	return 0;
 }
 
+void pcibios_add_bus(struct pci_bus *bus)
+{
+	struct pci_controller *phb = pci_bus_to_host(bus);
+
+	pr_debug("PCI %s, pci_bus %p, phb %p\n", dev_name(&bus->dev), bus, phb);
+
+	if (!phb)
+		pr_warn("%s: PCI bus %s has null PHB; refcount bug!",
+			__func__, dev_name(&bus->dev)); /* WARN_ON ahead */
+
+	/* locking: see comment on pcibios_release_device(). */
+	controller_get(phb);
+}
+
 void pcibios_setup_bus_devices(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 2d71269..24d1a2a 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -55,15 +55,44 @@ EXPORT_SYMBOL_GPL(pci_find_bus_by_node);
  * @dev: PCI device
  *
  * The function is called before releasing the indicated PCI device.
+ *
+ * The locking for kref_get() and kref_put() of the PHB/pci_controller
+ * in pcibios_(add|release)_device() and pcibios_(add|remove)_bus() is
+ * satisfied by the pci_rescan_remove_lock mutex (required for rescan/
+ * remove paths of PCI devices/buses; the scan path doesn't require it,
+ * as there is only addition of devices/buses - no removal at all.)
  */
 void pcibios_release_device(struct pci_dev *dev)
 {
 	struct pci_controller *phb = pci_bus_to_host(dev->bus);
 
+	pr_debug("PCI %s, pci_dev %p, phb %p\n", dev_name(&dev->dev), dev, phb);
+
 	eeh_remove_device(dev);
 
 	if (phb->controller_ops.release_device)
 		phb->controller_ops.release_device(dev);
+
+	if (unlikely(!phb))
+		pr_warn("%s: PCI device %s has null PHB; refcount bug!",
+			__func__, dev_name(&dev->dev)); /* WARN_ON ahead */
+
+	/* locking: see comment on pcibios_release_device(). */
+	controller_put(phb);
+}
+
+void pcibios_remove_bus(struct pci_bus *bus)
+{
+	struct pci_controller *phb = pci_bus_to_host(bus);
+
+	pr_debug("PCI %s, pci_bus %p, phb %p\n", dev_name(&bus->dev), bus, phb);
+
+	if (unlikely(!phb))
+		pr_warn("%s: PCI bus %s has null PHB; refcount bug!",
+			__func__, dev_name(&bus->dev)); /* WARN_ON ahead */
+
+	/* locking: see comment on pcibios_release_device(). */
+	controller_put(phb);
 }
 
 /**
diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index 526ac67..e6f0deb 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -142,6 +142,7 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
 	dev->devfn = devfn;
 	dev->multifunction = 0;		/* maybe a lie? */
 	dev->needs_freset = 0;		/* pcie fundamental reset required */
+	dev->sysdata = bus->sysdata;	/* inherit bus's phb for phb->refcount */
 	set_pcie_port_type(dev);
 
 	pci_dev_assign_slot(dev);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] powerpc: add refcount to struct pci_controller
  2016-08-10  0:44 ` [PATCH v2 1/2] powerpc: add refcount to struct pci_controller Mauricio Faria de Oliveira
@ 2016-08-10  1:45   ` Andrew Donnellan
  2016-08-10 12:03     ` Mauricio Faria de Oliveira
  2016-08-10 13:53     ` Mauricio Faria de Oliveira
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Donnellan @ 2016-08-10  1:45 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira, linuxppc-dev

On 10/08/16 10:44, Mauricio Faria de Oliveira wrote:
> This commit introduces the 'refcount' field in struct pci_controller,
> along with the corresponding functions 'controller_(get|put|free)()'.
>
> The functions 'pcibios_(alloc|free)_controller()' are modified to use
> that in a backwards compatible manner. (i.e., kfree(phb) is performed
> when pcibios_free_controller() is called.)
>
> So, this patch adds the infrastructure with no functional differences
> to current users of pcibios_(alloc|free)_controller().  Notably, only
> the pseries platform calls pcibios_free_controller() for some purpose
> other than to release the pci_controller in case of errors just after
> the call to pcibios_alloc_controller() (i.e., 'goto error' scenarios).

cxl's vPHB API also uses pcibios_free_controller() (which is why we 
export the symbol, as it's called from within the cxl module). When we 
remove/shutdown the underlying cxl device, we remove all the devices 
sitting on the vPHB and then free the vPHB.

I'm currently working on a cxl defect found by an IBM test team where we 
run into this - will review this patch more thoroughly and test it shortly.


Andrew

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses
  2016-08-10  0:44 ` [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses Mauricio Faria de Oliveira
@ 2016-08-10  3:35   ` Andrew Donnellan
  2016-08-10 12:30     ` Mauricio Faria de Oliveira
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Donnellan @ 2016-08-10  3:35 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira, linuxppc-dev

On 10/08/16 10:44, Mauricio Faria de Oliveira wrote:
> This patch employs the refcount in struct pci_controller to track
> the references from PCI devices and buses (struct pci_dev/pci_bus).
>
> In order to do that without modifying any PCI scan/probe approach
> (e.g., PCI_PROBE_DEVTREE and PCI_PROBE_NORMAL), it leverages some
> of the PCI arch-specific callback: pci_(add|release)_device() and
> pci_(add|remove)_bus().
>
> (a small change is required for PCI_PROBE_DEVTREE, which makes it
> consistent with PCI_PROBE_NORMAL - the pci_dev should inherit the
> parent pci_bus's phb pointer - see pci_setup_device() in probe.c)
>
> This also has the advantage that locking for kref_(get|put)() is
> satisfied by the 'pci_rescan_remove_lock' mutex, which is normal
> practice for usage of the PCI subsystem - thus already in place.
> More details added in comment on pcibios_release_device().
>
> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>

>  void pcibios_release_device(struct pci_dev *dev)
>  {
>  	struct pci_controller *phb = pci_bus_to_host(dev->bus);
>
> +	pr_debug("PCI %s, pci_dev %p, phb %p\n", dev_name(&dev->dev), dev, phb);
> +
>  	eeh_remove_device(dev);
>
>  	if (phb->controller_ops.release_device)
>  		phb->controller_ops.release_device(dev);
> +
> +	if (unlikely(!phb))
> +		pr_warn("%s: PCI device %s has null PHB; refcount bug!",
> +			__func__, dev_name(&dev->dev)); /* WARN_ON ahead */

This should probably go before trying to dereference phb->controller_ops 
above?


-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] powerpc: add refcount to struct pci_controller
  2016-08-10  1:45   ` Andrew Donnellan
@ 2016-08-10 12:03     ` Mauricio Faria de Oliveira
  2016-08-10 13:53     ` Mauricio Faria de Oliveira
  1 sibling, 0 replies; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10 12:03 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev

On 08/09/2016 10:45 PM, Andrew Donnellan wrote:
>> [snip] Notably, only
>> the pseries platform calls pcibios_free_controller() for some purpose
>> other than to release the pci_controller in case of errors just after
>> the call to pcibios_alloc_controller() (i.e., 'goto error' scenarios).
>
> cxl's vPHB API also uses pcibios_free_controller() [snip]

Cool. I see I missed this report line from grep; thanks. I was mostly
biased at arch/powerpc/ and driver/pci/ these days.

> I'm currently working on a cxl defect found by an IBM test team where we
> run into this - will review this patch more thoroughly and test it shortly.

That's great; thanks!

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses
  2016-08-10  3:35   ` Andrew Donnellan
@ 2016-08-10 12:30     ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10 12:30 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev

On 08/10/2016 12:35 AM, Andrew Donnellan wrote:
>>      if (phb->controller_ops.release_device)
>>          phb->controller_ops.release_device(dev);
>> +
>> +    if (unlikely(!phb))
>> +        pr_warn("%s: PCI device %s has null PHB; refcount bug!",
>> +            __func__, dev_name(&dev->dev)); /* WARN_ON ahead */
>
> This should probably go before trying to dereference phb->controller_ops
> above?

You're right; I misplaced this check; will fix that.

Just a bit of explanation/history:

While trying to understand why I didn't hit that null dereference
when I initially hit the WARN_ON (the reason for the 'small change'
in the commit description), I found that back then I checked for
'pci_dev->sysdata' directly (not 'phb' -- early stages of the patch).

The former indeed was null, as it didn't inherit 'pci_bus->sysdata'
on pseries, but the code uses 'phb = dev->bus->sysdata' (and this
was not null as pci_bus->sysdata was actually set).

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] powerpc: add refcount to struct pci_controller
  2016-08-10  1:45   ` Andrew Donnellan
  2016-08-10 12:03     ` Mauricio Faria de Oliveira
@ 2016-08-10 13:53     ` Mauricio Faria de Oliveira
  2016-08-10 21:46       ` Mauricio Faria de Oliveira
  1 sibling, 1 reply; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10 13:53 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev

On 08/09/2016 10:45 PM, Andrew Donnellan wrote:
> I'm currently working on a cxl defect found by an IBM test team where we
> run into this - will review this patch more thoroughly and test it shortly.

Gavin provided a review/suggestions via chat, pointing to rely on the
refcount that already exists in the PCI subsystem (not reinvent another)
and leverage the release of the PCI root bus -- which is much simpler!

He replied there should be no problems w/ the EEH reset path (PCI root
bus not released) nor w/ other structs w/ refs to the PHB (PCI DN, EEH
PE, EEH DEV).

I'll go down that path for a PATCH v3.

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] powerpc: add refcount to struct pci_controller
  2016-08-10 13:53     ` Mauricio Faria de Oliveira
@ 2016-08-10 21:46       ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 9+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-10 21:46 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev

On 08/10/2016 10:53 AM, Mauricio Faria de Oliveira wrote:
> I'll go down that path for a PATCH v3.

That is,
'powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)'

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-08-10 21:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-10  0:44 [PATCH v2 0/2] powerpc: fix oops in pcibios_release_device() after pcibios_free_controller() Mauricio Faria de Oliveira
2016-08-10  0:44 ` [PATCH v2 1/2] powerpc: add refcount to struct pci_controller Mauricio Faria de Oliveira
2016-08-10  1:45   ` Andrew Donnellan
2016-08-10 12:03     ` Mauricio Faria de Oliveira
2016-08-10 13:53     ` Mauricio Faria de Oliveira
2016-08-10 21:46       ` Mauricio Faria de Oliveira
2016-08-10  0:44 ` [PATCH v2 2/2] powerpc: update pci_controller.refcount for PCI devices and buses Mauricio Faria de Oliveira
2016-08-10  3:35   ` Andrew Donnellan
2016-08-10 12:30     ` Mauricio Faria de Oliveira

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).