* [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() [not found] <20170415170107.643253702@linutronix.de> @ 2017-04-15 17:01 ` Thomas Gleixner 2017-04-17 6:46 ` Peter Zijlstra 2017-04-18 19:44 ` Bjorn Helgaas 2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner 1 sibling, 2 replies; 6+ messages in thread From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw) To: LKML Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior, Bjorn Helgaas, linux-pci Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem unearthed a circular lock dependency which was hidden from lockdep due to the lockdep annotation of get_online_cpus() which prevents lockdep from creating full dependency chains. There are several variants of this. And example is: Chain exists of: cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex CPU0 CPU1 ---- ---- lock(&item->mutex); lock(drm_global_mutex); lock(&item->mutex); lock(cpu_hotplug_lock.rw_sem); because there are dependencies through workqueues. The call chain is: get_online_cpus apply_workqueue_attrs __alloc_workqueue_key ttm_mem_global_init ast_ttm_mem_global_init drm_global_item_ref ast_mm_init ast_driver_load drm_dev_register drm_get_pci_dev ast_pci_probe local_pci_probe work_for_cpu_fn process_one_work worker_thread This is not a problem of get_online_cpus() recursion, it's a possible deadlock undetected by lockdep so far. The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to protect the PCI probing. There is a side effect to this: cpu_hotplug_disable() makes a concurrent cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI probing usually happens during the boot process where no interaction is possible. Any later invocations are infrequent enough and concurrent hotplug attempts are so unlikely that the danger of user space visible regressions is very close to zero. Anyway, thats preferrable over a real deadlock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org --- drivers/pci/pci-driver.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi) return 0; } +static bool pci_physfn_is_probed(struct pci_dev *dev) +{ +#ifdef CONFIG_ATS + return dev->physfn->is_probed; +#else + return false; +#endif +} + static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev, const struct pci_device_id *id) { - int error, node; + int error, node, cpu; struct drv_dev_and_id ddi = { drv, dev, id }; /* @@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri if (node >= 0 && node != numa_node_id()) { int cpu; - get_online_cpus(); + cpu_hotplug_disable(); cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask); if (cpu < nr_cpu_ids) error = work_on_cpu(cpu, local_pci_probe, &ddi); else error = local_pci_probe(&ddi); - put_online_cpus(); + cpu_hotplug_enable(); } else error = local_pci_probe(&ddi); ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() 2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner @ 2017-04-17 6:46 ` Peter Zijlstra 2017-04-17 7:40 ` Thomas Gleixner 2017-04-18 19:44 ` Bjorn Helgaas 1 sibling, 1 reply; 6+ messages in thread From: Peter Zijlstra @ 2017-04-17 6:46 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior, Bjorn Helgaas, linux-pci On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote: > +++ b/drivers/pci/pci-driver.c > @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi) > return 0; > } > > +static bool pci_physfn_is_probed(struct pci_dev *dev) > +{ > +#ifdef CONFIG_ATS > + return dev->physfn->is_probed; > +#else > + return false; > +#endif > +} > + Should be in the next patch perhaps? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() 2017-04-17 6:46 ` Peter Zijlstra @ 2017-04-17 7:40 ` Thomas Gleixner 0 siblings, 0 replies; 6+ messages in thread From: Thomas Gleixner @ 2017-04-17 7:40 UTC (permalink / raw) To: Peter Zijlstra Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior, Bjorn Helgaas, linux-pci On Mon, 17 Apr 2017, Peter Zijlstra wrote: > On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote: > > +++ b/drivers/pci/pci-driver.c > > @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi) > > return 0; > > } > > > > +static bool pci_physfn_is_probed(struct pci_dev *dev) > > +{ > > +#ifdef CONFIG_ATS > > + return dev->physfn->is_probed; > > +#else > > + return false; > > +#endif > > +} > > + > > Should be in the next patch perhaps? Indeed. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() 2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner 2017-04-17 6:46 ` Peter Zijlstra @ 2017-04-18 19:44 ` Bjorn Helgaas 2017-04-18 19:51 ` Thomas Gleixner 1 sibling, 1 reply; 6+ messages in thread From: Bjorn Helgaas @ 2017-04-18 19:44 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior, Bjorn Helgaas, linux-pci On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote: > Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem > unearthed a circular lock dependency which was hidden from lockdep due to > the lockdep annotation of get_online_cpus() which prevents lockdep from > creating full dependency chains. There are several variants of this. And > example is: > > Chain exists of: > > cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex > > CPU0 CPU1 > ---- ---- > lock(&item->mutex); > lock(drm_global_mutex); > lock(&item->mutex); > lock(cpu_hotplug_lock.rw_sem); > > because there are dependencies through workqueues. The call chain is: > > get_online_cpus > apply_workqueue_attrs > __alloc_workqueue_key > ttm_mem_global_init > ast_ttm_mem_global_init > drm_global_item_ref > ast_mm_init > ast_driver_load > drm_dev_register > drm_get_pci_dev > ast_pci_probe > local_pci_probe > work_for_cpu_fn > process_one_work > worker_thread > > This is not a problem of get_online_cpus() recursion, it's a possible > deadlock undetected by lockdep so far. > > The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to > protect the PCI probing. > > There is a side effect to this: cpu_hotplug_disable() makes a concurrent > cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI > probing usually happens during the boot process where no interaction is > possible. Any later invocations are infrequent enough and concurrent > hotplug attempts are so unlikely that the danger of user space visible > regressions is very close to zero. Anyway, thats preferrable over a real > deadlock. > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > Cc: Bjorn Helgaas <bhelgaas@google.com> > Cc: linux-pci@vger.kernel.org > --- > drivers/pci/pci-driver.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > --- a/drivers/pci/pci-driver.c > +++ b/drivers/pci/pci-driver.c > @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi) > return 0; > } > > +static bool pci_physfn_is_probed(struct pci_dev *dev) > +{ > +#ifdef CONFIG_ATS I think this was intended to be CONFIG_PCI_ATS, not CONFIG_ATS. But I think CONFIG_PCI_IOV would be more appropriate. With that, and squashing this into the next patch, Acked-by: Bjorn Helgaas <bhelgaas@google.com> I expect you'll merge this along with the rest of the series. Let me know if you need anything else from me. > + return dev->physfn->is_probed; > +#else > + return false; > +#endif > +} > + > static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev, > const struct pci_device_id *id) > { > - int error, node; > + int error, node, cpu; > struct drv_dev_and_id ddi = { drv, dev, id }; > > /* > @@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri > if (node >= 0 && node != numa_node_id()) { > int cpu; > > - get_online_cpus(); > + cpu_hotplug_disable(); > cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask); > if (cpu < nr_cpu_ids) > error = work_on_cpu(cpu, local_pci_probe, &ddi); > else > error = local_pci_probe(&ddi); > - put_online_cpus(); > + cpu_hotplug_enable(); > } else > error = local_pci_probe(&ddi); > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() 2017-04-18 19:44 ` Bjorn Helgaas @ 2017-04-18 19:51 ` Thomas Gleixner 0 siblings, 0 replies; 6+ messages in thread From: Thomas Gleixner @ 2017-04-18 19:51 UTC (permalink / raw) To: Bjorn Helgaas Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior, Bjorn Helgaas, linux-pci On Tue, 18 Apr 2017, Bjorn Helgaas wrote: > > +static bool pci_physfn_is_probed(struct pci_dev *dev) > > +{ > > +#ifdef CONFIG_ATS > > I think this was intended to be CONFIG_PCI_ATS, not CONFIG_ATS. yes. > > But I think CONFIG_PCI_IOV would be more appropriate. With that, and The physfn member is under CONFIG_PCI_ATS so I took that one, but you are right PCI_IOV is the proper one. Will fix. > squashing this into the next patch, Did so Thanks, tglx ^ permalink raw reply [flat|nested] 6+ messages in thread
* [patch 18/20] PCI: Replace the racy recursion prevention [not found] <20170415170107.643253702@linutronix.de> 2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner @ 2017-04-15 17:01 ` Thomas Gleixner 1 sibling, 0 replies; 6+ messages in thread From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw) To: LKML Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior, Bjorn Helgaas, linux-pci pci_call_probe() can called recursively when a physcial function is probed and the probing creates virtual functions, which are populated via pci_bus_add_device() which in turn can end up calling pci_call_probe() again. The code has an interesting way to prevent recursing into the workqueue code. That's accomplished by a check whether the current task runs already on the numa node which is associated with the device. While that works to prevent the recursion into the workqueue code, it's racy versus normal execution as there is no guarantee that the node does not vanish after the check. Make the detection reliable by: - Mark a probed device as 'is_probed' in pci_call_probe() - Check in pci_call_probe for a virtual function. If it's a virtual function and the associated physical function device is marked 'is_probed' then this is a recursive call, so the call can be invoked in the calling context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org --- drivers/pci/pci-driver.c | 35 ++++++++++++++--------------------- include/linux/pci.h | 1 + 2 files changed, 15 insertions(+), 21 deletions(-) --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -341,33 +341,26 @@ static int pci_call_probe(struct pci_dri * on the right node. */ node = dev_to_node(&dev->dev); + dev->is_probed = 1; + + cpu_hotplug_disable(); /* - * On NUMA systems, we are likely to call a PF probe function using - * work_on_cpu(). If that probe calls pci_enable_sriov() (which - * adds the VF devices via pci_bus_add_device()), we may re-enter - * this function to call the VF probe function. Calling - * work_on_cpu() again will cause a lockdep warning. Since VFs are - * always on the same node as the PF, we can work around this by - * avoiding work_on_cpu() when we're already on the correct node. - * - * Preemption is enabled, so it's theoretically unsafe to use - * numa_node_id(), but even if we run the probe function on the - * wrong node, it should be functionally correct. + * Prevent nesting work_on_cpu() for the case where a Virtual Function + * device is probed from work_on_cpu() of the Physical device. */ - if (node >= 0 && node != numa_node_id()) { - int cpu; - - cpu_hotplug_disable(); + if (dev->is_virtfn && pci_physfn_is_probed(dev)) + cpu = nr_cpu_ids; + else cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask); - if (cpu < nr_cpu_ids) - error = work_on_cpu(cpu, local_pci_probe, &ddi); - else - error = local_pci_probe(&ddi); - cpu_hotplug_enable(); - } else + + if (cpu < nr_cpu_ids) + error = work_on_cpu(cpu, local_pci_probe, &ddi); + else error = local_pci_probe(&ddi); + dev->is_probed = 0; + cpu_hotplug_enable(); return error; } --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -365,6 +365,7 @@ struct pci_dev { unsigned int irq_managed:1; unsigned int has_secondary_link:1; unsigned int non_compliant_bars:1; /* broken BARs; ignore them */ + unsigned int is_probed:1; /* device probing in progress */ pci_dev_flags_t dev_flags; atomic_t enable_cnt; /* pci_enable_device has been called */ ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-04-18 19:51 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20170415170107.643253702@linutronix.de> 2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner 2017-04-17 6:46 ` Peter Zijlstra 2017-04-17 7:40 ` Thomas Gleixner 2017-04-18 19:44 ` Bjorn Helgaas 2017-04-18 19:51 ` Thomas Gleixner 2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).