* [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
[not found] <20170415170107.643253702@linutronix.de>
@ 2017-04-15 17:01 ` Thomas Gleixner
2017-04-17 6:46 ` Peter Zijlstra
2017-04-18 19:44 ` Bjorn Helgaas
2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner
1 sibling, 2 replies; 6+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
To: LKML
Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
Bjorn Helgaas, linux-pci
Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
unearthed a circular lock dependency which was hidden from lockdep due to
the lockdep annotation of get_online_cpus() which prevents lockdep from
creating full dependency chains. There are several variants of this. And
example is:
Chain exists of:
cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex
CPU0 CPU1
---- ----
lock(&item->mutex);
lock(drm_global_mutex);
lock(&item->mutex);
lock(cpu_hotplug_lock.rw_sem);
because there are dependencies through workqueues. The call chain is:
get_online_cpus
apply_workqueue_attrs
__alloc_workqueue_key
ttm_mem_global_init
ast_ttm_mem_global_init
drm_global_item_ref
ast_mm_init
ast_driver_load
drm_dev_register
drm_get_pci_dev
ast_pci_probe
local_pci_probe
work_for_cpu_fn
process_one_work
worker_thread
This is not a problem of get_online_cpus() recursion, it's a possible
deadlock undetected by lockdep so far.
The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
protect the PCI probing.
There is a side effect to this: cpu_hotplug_disable() makes a concurrent
cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI
probing usually happens during the boot process where no interaction is
possible. Any later invocations are infrequent enough and concurrent
hotplug attempts are so unlikely that the danger of user space visible
regressions is very close to zero. Anyway, thats preferrable over a real
deadlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
drivers/pci/pci-driver.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
return 0;
}
+static bool pci_physfn_is_probed(struct pci_dev *dev)
+{
+#ifdef CONFIG_ATS
+ return dev->physfn->is_probed;
+#else
+ return false;
+#endif
+}
+
static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
const struct pci_device_id *id)
{
- int error, node;
+ int error, node, cpu;
struct drv_dev_and_id ddi = { drv, dev, id };
/*
@@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri
if (node >= 0 && node != numa_node_id()) {
int cpu;
- get_online_cpus();
+ cpu_hotplug_disable();
cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
if (cpu < nr_cpu_ids)
error = work_on_cpu(cpu, local_pci_probe, &ddi);
else
error = local_pci_probe(&ddi);
- put_online_cpus();
+ cpu_hotplug_enable();
} else
error = local_pci_probe(&ddi);
^ permalink raw reply [flat|nested] 6+ messages in thread
* [patch 18/20] PCI: Replace the racy recursion prevention
[not found] <20170415170107.643253702@linutronix.de>
2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
1 sibling, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
To: LKML
Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
Bjorn Helgaas, linux-pci
pci_call_probe() can called recursively when a physcial function is probed
and the probing creates virtual functions, which are populated via
pci_bus_add_device() which in turn can end up calling pci_call_probe()
again.
The code has an interesting way to prevent recursing into the workqueue
code. That's accomplished by a check whether the current task runs already
on the numa node which is associated with the device.
While that works to prevent the recursion into the workqueue code, it's
racy versus normal execution as there is no guarantee that the node does
not vanish after the check.
Make the detection reliable by:
- Mark a probed device as 'is_probed' in pci_call_probe()
- Check in pci_call_probe for a virtual function. If it's a virtual
function and the associated physical function device is marked
'is_probed' then this is a recursive call, so the call can be invoked in
the calling context.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
drivers/pci/pci-driver.c | 35 ++++++++++++++---------------------
include/linux/pci.h | 1 +
2 files changed, 15 insertions(+), 21 deletions(-)
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -341,33 +341,26 @@ static int pci_call_probe(struct pci_dri
* on the right node.
*/
node = dev_to_node(&dev->dev);
+ dev->is_probed = 1;
+
+ cpu_hotplug_disable();
/*
- * On NUMA systems, we are likely to call a PF probe function using
- * work_on_cpu(). If that probe calls pci_enable_sriov() (which
- * adds the VF devices via pci_bus_add_device()), we may re-enter
- * this function to call the VF probe function. Calling
- * work_on_cpu() again will cause a lockdep warning. Since VFs are
- * always on the same node as the PF, we can work around this by
- * avoiding work_on_cpu() when we're already on the correct node.
- *
- * Preemption is enabled, so it's theoretically unsafe to use
- * numa_node_id(), but even if we run the probe function on the
- * wrong node, it should be functionally correct.
+ * Prevent nesting work_on_cpu() for the case where a Virtual Function
+ * device is probed from work_on_cpu() of the Physical device.
*/
- if (node >= 0 && node != numa_node_id()) {
- int cpu;
-
- cpu_hotplug_disable();
+ if (dev->is_virtfn && pci_physfn_is_probed(dev))
+ cpu = nr_cpu_ids;
+ else
cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
- if (cpu < nr_cpu_ids)
- error = work_on_cpu(cpu, local_pci_probe, &ddi);
- else
- error = local_pci_probe(&ddi);
- cpu_hotplug_enable();
- } else
+
+ if (cpu < nr_cpu_ids)
+ error = work_on_cpu(cpu, local_pci_probe, &ddi);
+ else
error = local_pci_probe(&ddi);
+ dev->is_probed = 0;
+ cpu_hotplug_enable();
return error;
}
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -365,6 +365,7 @@ struct pci_dev {
unsigned int irq_managed:1;
unsigned int has_secondary_link:1;
unsigned int non_compliant_bars:1; /* broken BARs; ignore them */
+ unsigned int is_probed:1; /* device probing in progress */
pci_dev_flags_t dev_flags;
atomic_t enable_cnt; /* pci_enable_device has been called */
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
@ 2017-04-17 6:46 ` Peter Zijlstra
2017-04-17 7:40 ` Thomas Gleixner
2017-04-18 19:44 ` Bjorn Helgaas
1 sibling, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2017-04-17 6:46 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
Bjorn Helgaas, linux-pci
On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote:
> +++ b/drivers/pci/pci-driver.c
> @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
> return 0;
> }
>
> +static bool pci_physfn_is_probed(struct pci_dev *dev)
> +{
> +#ifdef CONFIG_ATS
> + return dev->physfn->is_probed;
> +#else
> + return false;
> +#endif
> +}
> +
Should be in the next patch perhaps?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
2017-04-17 6:46 ` Peter Zijlstra
@ 2017-04-17 7:40 ` Thomas Gleixner
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2017-04-17 7:40 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
Bjorn Helgaas, linux-pci
On Mon, 17 Apr 2017, Peter Zijlstra wrote:
> On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote:
> > +++ b/drivers/pci/pci-driver.c
> > @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
> > return 0;
> > }
> >
> > +static bool pci_physfn_is_probed(struct pci_dev *dev)
> > +{
> > +#ifdef CONFIG_ATS
> > + return dev->physfn->is_probed;
> > +#else
> > + return false;
> > +#endif
> > +}
> > +
>
> Should be in the next patch perhaps?
Indeed.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
2017-04-17 6:46 ` Peter Zijlstra
@ 2017-04-18 19:44 ` Bjorn Helgaas
2017-04-18 19:51 ` Thomas Gleixner
1 sibling, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2017-04-18 19:44 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
Sebastian Siewior, Bjorn Helgaas, linux-pci
On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote:
> Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
> unearthed a circular lock dependency which was hidden from lockdep due to
> the lockdep annotation of get_online_cpus() which prevents lockdep from
> creating full dependency chains. There are several variants of this. And
> example is:
>
> Chain exists of:
>
> cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex
>
> CPU0 CPU1
> ---- ----
> lock(&item->mutex);
> lock(drm_global_mutex);
> lock(&item->mutex);
> lock(cpu_hotplug_lock.rw_sem);
>
> because there are dependencies through workqueues. The call chain is:
>
> get_online_cpus
> apply_workqueue_attrs
> __alloc_workqueue_key
> ttm_mem_global_init
> ast_ttm_mem_global_init
> drm_global_item_ref
> ast_mm_init
> ast_driver_load
> drm_dev_register
> drm_get_pci_dev
> ast_pci_probe
> local_pci_probe
> work_for_cpu_fn
> process_one_work
> worker_thread
>
> This is not a problem of get_online_cpus() recursion, it's a possible
> deadlock undetected by lockdep so far.
>
> The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
> protect the PCI probing.
>
> There is a side effect to this: cpu_hotplug_disable() makes a concurrent
> cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI
> probing usually happens during the boot process where no interaction is
> possible. Any later invocations are infrequent enough and concurrent
> hotplug attempts are so unlikely that the danger of user space visible
> regressions is very close to zero. Anyway, thats preferrable over a real
> deadlock.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> ---
> drivers/pci/pci-driver.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
> return 0;
> }
>
> +static bool pci_physfn_is_probed(struct pci_dev *dev)
> +{
> +#ifdef CONFIG_ATS
I think this was intended to be CONFIG_PCI_ATS, not CONFIG_ATS.
But I think CONFIG_PCI_IOV would be more appropriate. With that, and
squashing this into the next patch,
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
I expect you'll merge this along with the rest of the series. Let me
know if you need anything else from me.
> + return dev->physfn->is_probed;
> +#else
> + return false;
> +#endif
> +}
> +
> static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
> const struct pci_device_id *id)
> {
> - int error, node;
> + int error, node, cpu;
> struct drv_dev_and_id ddi = { drv, dev, id };
>
> /*
> @@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri
> if (node >= 0 && node != numa_node_id()) {
> int cpu;
>
> - get_online_cpus();
> + cpu_hotplug_disable();
> cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
> if (cpu < nr_cpu_ids)
> error = work_on_cpu(cpu, local_pci_probe, &ddi);
> else
> error = local_pci_probe(&ddi);
> - put_online_cpus();
> + cpu_hotplug_enable();
> } else
> error = local_pci_probe(&ddi);
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
2017-04-18 19:44 ` Bjorn Helgaas
@ 2017-04-18 19:51 ` Thomas Gleixner
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2017-04-18 19:51 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
Sebastian Siewior, Bjorn Helgaas, linux-pci
On Tue, 18 Apr 2017, Bjorn Helgaas wrote:
> > +static bool pci_physfn_is_probed(struct pci_dev *dev)
> > +{
> > +#ifdef CONFIG_ATS
>
> I think this was intended to be CONFIG_PCI_ATS, not CONFIG_ATS.
yes.
>
> But I think CONFIG_PCI_IOV would be more appropriate. With that, and
The physfn member is under CONFIG_PCI_ATS so I took that one, but you are
right PCI_IOV is the proper one. Will fix.
> squashing this into the next patch,
Did so
Thanks,
tglx
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-04-18 19:51 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20170415170107.643253702@linutronix.de>
2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
2017-04-17 6:46 ` Peter Zijlstra
2017-04-17 7:40 ` Thomas Gleixner
2017-04-18 19:44 ` Bjorn Helgaas
2017-04-18 19:51 ` Thomas Gleixner
2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).