[PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes

* [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
@ 2013-04-11 15:30 Michael S. Tsirkin
  2013-04-11 18:05 ` Tejun Heo
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-11 15:30 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Ming Lei, Greg Kroah-Hartman, David Miller, Roland Dreier, netdev,
	Yan Burman, Jack Morgenstein, Bjorn Helgaas, linux-pci, Tejun Heo

The following lockdep warning is reported to trigger since 3.9-rc1:

=============================================
[ INFO: possible recursive locking detected ]
3.9.0-rc1 #96 Not tainted
---------------------------------------------
kworker/0:1/734 is trying to acquire lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250

but task is already holding lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>] process_one_work+0x162/0x4c0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock((&wfc.work));
  lock((&wfc.work));

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/0:1/734:
 #0:  (events){.+.+.+}, at: [<ffffffff81064352>] process_one_work+0x162/0x4c0
 #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>] process_one_work+0x162/0x4c0
 #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>] device_attach+0x25/0xb0

stack backtrace:
Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
Call Trace:
 [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
 [<ffffffff81095150>] __lock_acquire+0x440/0xc70
 [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
 [<ffffffff810959da>] lock_acquire+0x5a/0x70
 [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
 [<ffffffff81066cf5>] flush_work+0x45/0x250
 [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
 [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
 [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
 [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
 [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81066f74>] work_on_cpu+0x74/0x90
 [<ffffffff81063820>] ? keventd_up+0x20/0x20
 [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
 [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
 [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
 [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
 [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
 [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
 [<ffffffff812db1bb>] __device_attach+0x4b/0x60
 [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
 [<ffffffff812db298>] device_attach+0x98/0xb0
 [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
 [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
 [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
 [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
 [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
 [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
 [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
 [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
 [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
 [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
 [<ffffffff81064cfb>] worker_thread+0x30b/0x430
 [<ffffffff810649f0>] ? manage_workers+0x340/0x340
 [<ffffffff8106cea6>] kthread+0xd6/0xe0
 [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70

Reference: http://marc.info/?l=linux-netdev&m=136249690901892&w=2

The issue is that a driver, in its probe function, calls
pci_enable_sriov so a PF device probe causes VF probe (AKA nested
probe).  Each probe in pci_device_probe is (normally) run through
work_on_cpu (this is to get the right numa node for memory allocated by
the driver).  In turn work_on_cpu does this internally:

        schedule_work_on(cpu, &wfc.work);
        flush_work(&wfc.work);

So if you are running probe on CPU1, and cause another
probe on the same CPU, this will try to flush
workqueue from inside same workqueue which of course
deadlocks.

Nested probing might be tricky to get right generally.

But for pci_enable_sriov, the situation is actually very simple: all VFs
naturally have the same affinity as the PF, and cpumask_any_and is actually
the same as cpumask_first_and, so it always gives us the same CPU.
So let's just detect that, and run the probing for VFs locally without a
workqueue.
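
As a rough illustration of the check in the diff below, here is a minimal
userspace sketch of the decision.  It is not kernel code: a 64-bit integer
stands in for a cpumask, and first_cpu_and(), probe_needs_work_on_cpu() and
NR_CPU_IDS are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: one bit per CPU in a 64-bit mask. */
#define NR_CPU_IDS 64u

/*
 * Toy stand-in for cpumask_first_and(): index of the lowest CPU set in
 * both masks, or NR_CPU_IDS when the intersection is empty.
 */
unsigned int first_cpu_and(uint64_t node_mask, uint64_t online_mask)
{
	uint64_t both = node_mask & online_mask;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Mirrors the patched condition: go through the workqueue only when a
 * valid target CPU exists and we are not already running on it.
 * Otherwise the probe would be called directly.
 */
bool probe_needs_work_on_cpu(uint64_t node_mask, uint64_t online_mask,
			     unsigned int this_cpu)
{
	unsigned int cpu;

	assert(this_cpu < NR_CPU_IDS);
	cpu = first_cpu_and(node_mask, online_mask);
	return cpu != this_cpu && cpu < NR_CPU_IDS;
}
```

In this model a PF probe that was moved to the first CPU of its node finds
itself already on the target CPU when it probes a VF on the same node, so the
nested probe runs locally.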

This is hardly elegant, but looks to me like an appropriate quick fix
for 3.9.

Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

Reposting due to missed Cc's. Sorry about the noise.

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 1fa1e48..6eeb5ec 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -286,8 +286,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		int cpu;
 
 		get_online_cpus();
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
-		if (cpu < nr_cpu_ids)
+		if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
 			error = work_on_cpu(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 15:30 [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes Michael S. Tsirkin
@ 2013-04-11 18:05 ` Tejun Heo
  2013-04-11 18:58   ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Tejun Heo @ 2013-04-11 18:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

Hello,

On Thu, Apr 11, 2013 at 06:30:30PM +0300, Michael S. Tsirkin wrote:
> The issue is that a driver, in its probe function, calls
> pci_enable_sriov so a PF device probe causes VF probe (AKA nested
> probe).  Each probe in pci_device_probe is (normally) run through
> work_on_cpu (this is to get the right numa node for memory allocated by
> the driver).  In turn work_on_cpu does this internally:
> 
>         schedule_work_on(cpu, &wfc.work);
>         flush_work(&wfc.work);
> 
> So if you are running probe on CPU1, and cause another
> probe on the same CPU, this will try to flush
> workqueue from inside same workqueue which of course
> deadlocks.
> 
> Nested probing might be tricky to get right generally.

Hmm... how about adding a work_on_cpu_nested() which takes @subclass
argument?  Wouldn't that be much cleaner?

Thanks.

-- 
tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 18:05 ` Tejun Heo
@ 2013-04-11 18:58   ` Michael S. Tsirkin
  2013-04-11 19:04     ` Tejun Heo
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-11 18:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 11, 2013 at 11:05:17AM -0700, Tejun Heo wrote:
> Hello,
> 
> On Thu, Apr 11, 2013 at 06:30:30PM +0300, Michael S. Tsirkin wrote:
> > The issue is that a driver, in its probe function, calls
> > pci_enable_sriov so a PF device probe causes VF probe (AKA nested
> > probe).  Each probe in pci_device_probe is (normally) run through
> > work_on_cpu (this is to get the right numa node for memory allocated by
> > the driver).  In turn work_on_cpu does this internally:
> > 
> >         schedule_work_on(cpu, &wfc.work);
> >         flush_work(&wfc.work);
> > 
> > So if you are running probe on CPU1, and cause another
> > probe on the same CPU, this will try to flush
> > workqueue from inside same workqueue which of course
> > deadlocks.
> > 
> > Nested probing might be tricky to get right generally.
> 
> Hmm... how about adding a work_on_cpu_nested() which takes @subclass
> argument?  Wouldn't that be much cleaner?
> 
> Thanks.

Is that 3.9 material though?

> -- 
> tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 18:58   ` Michael S. Tsirkin
@ 2013-04-11 19:04     ` Tejun Heo
  2013-04-11 19:17       ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Tejun Heo @ 2013-04-11 19:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

Hey,

On Thu, Apr 11, 2013 at 09:58:54PM +0300, Michael S. Tsirkin wrote:
> > Hmm... how about adding a work_on_cpu_nested() which takes @subclass
> > argument?  Wouldn't that be much cleaner?
> > 
> > Thanks.
> 
> Is that 3.9 material though?

Why wouldn't it be?  It's actually safer as it doesn't change any
logic.  It's just updating lockdep annotation, which is what's needed
here anyway.

Thanks.

-- 
tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 19:04     ` Tejun Heo
@ 2013-04-11 19:17       ` Michael S. Tsirkin
  2013-04-11 19:20         ` Tejun Heo
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-11 19:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 11, 2013 at 12:04:08PM -0700, Tejun Heo wrote:
> Hey,
> 
> On Thu, Apr 11, 2013 at 09:58:54PM +0300, Michael S. Tsirkin wrote:
> > > Hmm... how about adding a work_on_cpu_nested() which takes @subclass
> > > argument?  Wouldn't that be much cleaner?
> > > 
> > > Thanks.
> > 
> > Is that 3.9 material though?
> 
> Why wouldn't it be?  It's actually safer as it doesn't change any
> logic.  It's just updating lockdep annotation, which is what's needed
> here anyway.
> 
> Thanks.

Hmm no, there's a real deadlock here: you are
trying to flush work2 from within work1, which is running
on the same workqueue. work2 can't even start running.
The problem is not annotation.

> -- 
> tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 19:17       ` Michael S. Tsirkin
@ 2013-04-11 19:20         ` Tejun Heo
  2013-04-11 20:30           ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Tejun Heo @ 2013-04-11 19:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 11, 2013 at 10:17:17PM +0300, Michael S. Tsirkin wrote:
> Hmm no, there's a real deadlock here: you are
> trying to flush work2 from within work1, which is running
> on the same workqueue. work2 can't even start running.
> The problem is not annotation.

No, that changed years ago with the introduction of cmwq.  System
workqueues are now expected to have high enough maximum concurrency to
not cause deadlock as long as memory for worker creation is available,
so as long as your work item doesn't directly sit in the memory
reclaim path, it's safe to flush a different work item running on the
same workqueue with sufficiently high max_active.
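
To illustrate the difference, here is a deliberately simplified userspace
model.  It is only a sketch under stated assumptions: struct toy_wq and
toy_run_nested() are hypothetical names, and the model ignores worker
creation, memory reclaim and everything else the real workqueue code handles.
Each running item occupies one worker, and it keeps that worker while it
waits in a flush of a nested item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy workqueue: only tracks how many workers may run concurrently. */
struct toy_wq {
	unsigned int max_active;	/* concurrency limit */
	unsigned int busy;		/* workers currently occupied */
};

/*
 * Run one work item that queues and flushes `depth` further levels of
 * nested items on the same queue.  Returns true if the whole chain can
 * finish, false if some nested item can never start because every
 * worker is blocked waiting in a flush.
 */
bool toy_run_nested(struct toy_wq *wq, unsigned int depth)
{
	bool ok;

	assert(wq != NULL);
	if (wq->busy >= wq->max_active)
		return false;	/* no free worker: the flusher waits forever */

	wq->busy++;		/* worker is held while running and flushing */
	ok = depth == 0 || toy_run_nested(wq, depth - 1);
	wq->busy--;
	return ok;
}
```

With max_active set to 1 (a queue with a single worker, the situation Michael
describes) one level of nesting cannot complete in this model, while any
limit above the nesting depth lets the chain finish.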

Thanks.

-- 
tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 19:20         ` Tejun Heo
@ 2013-04-11 20:30           ` Michael S. Tsirkin
  2013-04-11 20:41             ` Tejun Heo
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-11 20:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 11, 2013 at 12:20:05PM -0700, Tejun Heo wrote:
> On Thu, Apr 11, 2013 at 10:17:17PM +0300, Michael S. Tsirkin wrote:
> > Hmm no, there's a real deadlock here: you are
> > trying to flush work2 from within work1, which is running
> > on the same workqueue. work2 can't even start running.
> > The problem is not annotation.
> 
> No, that has changed years ago with introduction of cmwq.  System
> workqueues are now expected to have high enough maximum concurrency to
> not cause deadlock as long as memory for worker creation is available,
> so as long as your work item doesn't directly sit in the memory
> reclaim path, it's safe to flush a different work item running on the
> same workqueue with sufficiently high max_active.
> 
> Thanks.

Okay, so you are saying it's a false-positive?
Want to send a patch so Or can try it out?

> -- 
> tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 20:30           ` Michael S. Tsirkin
@ 2013-04-11 20:41             ` Tejun Heo
  2013-04-11 21:52               ` Or Gerlitz
       [not found]               ` <516AA80F.7040505@mellanox.com>
  0 siblings, 2 replies; 21+ messages in thread
From: Tejun Heo @ 2013-04-11 20:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

Hello,

On Thu, Apr 11, 2013 at 11:30:53PM +0300, Michael S. Tsirkin wrote:
> Okay, so you are saying it's a false-positive?

Yeah, I think so.  It didn't actually lock up, right?  If it did,
our analysis up to this point is likely to be completely wrong.

> Want to send a patch so Or can try it out?

Hmmm... something like the following on the workqueue side (completely
untested).

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 8afab27..899d470 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -466,14 +466,21 @@ static inline bool __deprecated flush_delayed_work_sync(struct delayed_work *dwo
 }
 
 #ifndef CONFIG_SMP
-static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
+static inline long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *),
+				      void *arg, int subclass)
 {
 	return fn(arg);
 }
 #else
-long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg);
+long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *), void *arg,
+			int subclass);
 #endif /* CONFIG_SMP */
 
+static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
+{
+	return work_on_cpu_nested(cpu, fn, arg, 0);
+}
+
 #ifdef CONFIG_FREEZER
 extern void freeze_workqueues_begin(void);
 extern bool freeze_workqueues_busy(void);
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 81f2457..c2be670 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3555,25 +3555,30 @@ static void work_for_cpu_fn(struct work_struct *work)
 }
 
 /**
- * work_on_cpu - run a function in user context on a particular cpu
+ * work_on_cpu_nested - run a function in user context on a particular cpu
  * @cpu: the cpu to run on
  * @fn: the function to run
  * @arg: the function arg
+ * @subclass: lockdep subclass
  *
  * This will return the value @fn returns.
  * It is up to the caller to ensure that the cpu doesn't go offline.
  * The caller must not hold any locks which would prevent @fn from completing.
+ *
+ * XXX: explain @subclass
  */
-long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
+long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *), void *arg,
+			int subclass)
 {
 	struct work_for_cpu wfc = { .fn = fn, .arg = arg };
 
 	INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn);
+	lock_set_subclass(&wfc.work.lockdep_map, subclass, _RET_IP_);
 	schedule_work_on(cpu, &wfc.work);
 	flush_work(&wfc.work);
 	return wfc.ret;
 }
-EXPORT_SYMBOL_GPL(work_on_cpu);
+EXPORT_SYMBOL_GPL(work_on_cpu_nested);
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_FREEZER


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-11 20:41             ` Tejun Heo
@ 2013-04-11 21:52               ` Or Gerlitz
       [not found]               ` <516AA80F.7040505@mellanox.com>
  1 sibling, 0 replies; 21+ messages in thread
From: Or Gerlitz @ 2013-04-11 21:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michael S. Tsirkin, Or Gerlitz, Ming Lei, Greg Kroah-Hartman,
	David Miller, Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 11, 2013 at 11:41 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Thu, Apr 11, 2013 at 11:30:53PM +0300, Michael S. Tsirkin wrote:
>> Okay, so you are saying it's a false-positive?
>
> Yeah, I think so.  It didn't actually lock up, right?  If it did,
> our analysis up to this point is likely to be completely wrong.
>
>> Want to send a patch so Or can try it out?


I can test that Sunday
>
> Hmmm... something like the following on the workqueue side (completely
> untested).
>
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 8afab27..899d470 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -466,14 +466,21 @@ static inline bool __deprecated flush_delayed_work_sync(struct delayed_work *dwo
>  }
>
>  #ifndef CONFIG_SMP
> -static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +static inline long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *),
> +                                     void *arg, int subclass)
>  {
>         return fn(arg);
>  }
>  #else
> -long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg);
> +long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *), void *arg,
> +                       int subclass);
>  #endif /* CONFIG_SMP */
>
> +static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +{
> +       return work_on_cpu_nested(cpu, fn, arg, 0);
> +}
> +
>  #ifdef CONFIG_FREEZER
>  extern void freeze_workqueues_begin(void);
>  extern bool freeze_workqueues_busy(void);
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 81f2457..c2be670 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3555,25 +3555,30 @@ static void work_for_cpu_fn(struct work_struct *work)
>  }
>
>  /**
> - * work_on_cpu - run a function in user context on a particular cpu
> + * work_on_cpu_nested - run a function in user context on a particular cpu
>   * @cpu: the cpu to run on
>   * @fn: the function to run
>   * @arg: the function arg
> + * @subclass: lockdep subclass
>   *
>   * This will return the value @fn returns.
>   * It is up to the caller to ensure that the cpu doesn't go offline.
>   * The caller must not hold any locks which would prevent @fn from completing.
> + *
> + * XXX: explain @subclass
>   */
> -long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +long work_on_cpu_nested(unsigned int cpu, long (*fn)(void *), void *arg,
> +                       int subclass)
>  {
>         struct work_for_cpu wfc = { .fn = fn, .arg = arg };
>
>         INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn);
> +       lock_set_subclass(&wfc.work.lockdep_map, subclass, _RET_IP_);
>         schedule_work_on(cpu, &wfc.work);
>         flush_work(&wfc.work);
>         return wfc.ret;
>  }
> -EXPORT_SYMBOL_GPL(work_on_cpu);
> +EXPORT_SYMBOL_GPL(work_on_cpu_nested);
>  #endif /* CONFIG_SMP */
>
>  #ifdef CONFIG_FREEZER


[parent not found: <516AA80F.7040505@mellanox.com>]

* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
       [not found]               ` <516AA80F.7040505@mellanox.com>
@ 2013-04-14 13:43                 ` Tejun Heo
  2013-04-18  8:33                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Tejun Heo @ 2013-04-14 13:43 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Michael S. Tsirkin, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Sun, Apr 14, 2013 at 03:58:55PM +0300, Or Gerlitz wrote:
> So the patch eliminated the lockdep warning for mlx4 nested probing
> sequence, but introduced lockdep warning for
> 00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC
> Interrupt Controller (rev 22)

Oops, the patch in itself doesn't really change anything.  The caller
should use a different subclass for the nested invocation, just like
spin_lock_nested() and friends.  Sorry about not being clear.
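
A toy illustration of why the subclass matters, written as a userspace sketch
rather than real lockdep: the checker below only remembers which
(class, subclass) pairs a task holds and complains when the same pair is
acquired twice.  All names (toy_task, toy_acquire, toy_release) are
hypothetical, and real lockdep does far more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* One held lock: its class key plus the subclass it was taken with. */
struct toy_held {
	const void *class_key;
	int subclass;
};

/* Per-task stack of held locks, loosely like lockdep's held_locks. */
struct toy_task {
	struct toy_held held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Returns true if the acquisition is accepted, false if the same
 * (class, subclass) pair is already held, which this toy reports as
 * possible recursive locking.
 */
bool toy_acquire(struct toy_task *t, const void *class_key, int subclass)
{
	size_t i;

	assert(t->depth < TOY_MAX_HELD);
	for (i = 0; i < t->depth; i++)
		if (t->held[i].class_key == class_key &&
		    t->held[i].subclass == subclass)
			return false;

	t->held[t->depth].class_key = class_key;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}

/* Release the most recently acquired lock. */
void toy_release(struct toy_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

In this model the outer invocation would take the wfc.work class with
subclass 0 and the nested invocation with subclass 1, so the second
acquisition is no longer reported; passing 0 at both levels reproduces the
warning.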
Michael, can you please help?

Thanks.

-- 
tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-14 13:43                 ` Tejun Heo
@ 2013-04-18  8:33                   ` Michael S. Tsirkin
  2013-04-18  9:40                     ` Jack Morgenstein
  2013-04-18 14:49                     ` Or Gerlitz
  0 siblings, 2 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-18  8:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Sun, Apr 14, 2013 at 06:43:39AM -0700, Tejun Heo wrote:
> On Sun, Apr 14, 2013 at 03:58:55PM +0300, Or Gerlitz wrote:
> > So the patch eliminated the lockdep warning for mlx4 nested probing
> > sequence, but introduced lockdep warning for
> > 00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC
> > Interrupt Controller (rev 22)
> 
> Oops, the patch in itself doesn't really change anything.  The caller
> should use a different subclass for the nested invocation, just like
> spin_lock_nested() and friends.  Sorry about not being clear.
> Michael, can you please help?
> 
> Thanks.
> 
> -- 
> tejun

So like this on top.  Tejun, you didn't add your S.O.B. and patch
description; if this helps as we expect, they will be needed.

---->

pci: use work_on_cpu_nested for nested SRIOV

Since 3.9-rc1 the mlx4 driver has been triggering a lockdep warning.

The issue is that a driver, in its probe function, calls
pci_enable_sriov so a PF device probe causes VF probe (AKA nested
probe).  Each probe in pci_device_probe is (normally) run through
work_on_cpu (this is to get the right numa node for memory allocated by
the driver).  In turn work_on_cpu does this internally:

        schedule_work_on(cpu, &wfc.work);
        flush_work(&wfc.work);

So if you are running probe on CPU1, and cause another
probe on the same CPU, this will try to flush
workqueue from inside same workqueue which triggers
a lockdep warning.

Nested probing might be tricky to get right generally.

But for pci_enable_sriov, the situation is actually very simple:
VFs almost never use the same driver as the PF so the warning
is bogus there.

This is hardly elegant as it might shut up some real warnings if a buggy
driver actually probes itself in a nested way, but looks to me like an
appropriate quick fix for 3.9.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 1fa1e48..9c836ef 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -286,9 +286,9 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		int cpu;

 		get_online_cpus();
-		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
-		if (cpu < nr_cpu_ids)
-			error = work_on_cpu(cpu, local_pci_probe, &ddi);
+		cpu = cpumask_first_and(cpumask_of_node(node), cpu_online_mask);
+		if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
+			error = work_on_cpu_nested(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);
 		put_online_cpus();


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18  8:33                   ` Michael S. Tsirkin
@ 2013-04-18  9:40                     ` Jack Morgenstein
  2013-04-18  8:48                       ` Michael S. Tsirkin
  2013-04-18 14:49                     ` Or Gerlitz
  1 sibling, 1 reply; 21+ messages in thread
From: Jack Morgenstein @ 2013-04-18  9:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tejun Heo, Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Bjorn Helgaas, linux-pci

On Thursday 18 April 2013 11:33, Michael S. Tsirkin wrote:
> But for pci_sriov_enable, the situation is actually very simple:
> VFs almost never use the same driver as the PF so the warning
> is bogus there.
> 
What about the case where the VF driver IS the same as the PF driver?


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18  9:40                     ` Jack Morgenstein
@ 2013-04-18  8:48                       ` Michael S. Tsirkin
  2013-04-18  9:57                         ` Jack Morgenstein
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-18  8:48 UTC (permalink / raw)
  To: Jack Morgenstein
  Cc: Tejun Heo, Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Bjorn Helgaas, linux-pci

On Thu, Apr 18, 2013 at 12:40:09PM +0300, Jack Morgenstein wrote:
> On Thursday 18 April 2013 11:33, Michael S. Tsirkin wrote:
> > But for pci_sriov_enable, the situation is actually very simple:
> > VFs almost never use the same driver as the PF so the warning
> > is bogus there.
> > 
> What about the case where the VF driver IS the same as the PF driver?

Then it can deadlock, e.g. if the driver takes a global mutex.  But it's an
internal driver issue then; you can trigger a deadlock through hardware
too, e.g. if VF initialization blocks until PF is fully initialized.
I think it's not the case for Mellanox, is it?
This is what I refer to: would be nice to fix nested probing in general
but it seems disabling the warning is the best we can do for 3.9 since
it causes false positives.

-- 
MST


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18  8:48                       ` Michael S. Tsirkin
@ 2013-04-18  9:57                         ` Jack Morgenstein
  0 siblings, 0 replies; 21+ messages in thread
From: Jack Morgenstein @ 2013-04-18  9:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tejun Heo, Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Bjorn Helgaas, linux-pci

On Thursday 18 April 2013 11:48, Michael S. Tsirkin wrote:
> On Thu, Apr 18, 2013 at 12:40:09PM +0300, Jack Morgenstein wrote:
> > On Thursday 18 April 2013 11:33, Michael S. Tsirkin wrote:
> > > But for pci_sriov_enable, the situation is actually very simple:
> > > VFs almost never use the same driver as the PF so the warning
> > > is bogus there.
> > > 
> > What about the case where the VF driver IS the same as the PF driver?
> 
> Then it can deadlock, e.g. if the driver takes a global mutex.  But it's an
> internal driver issue then; you can trigger a deadlock through hardware
> too, e.g. if VF initialization blocks until PF is fully initialized.
> I think it's not the case for Mellanox, is it?

Correct, the Mellanox driver does not deadlock.

> This is what I refer to: would be nice to fix nested probing in general
> but it seems disabling the warning is the best we can do for 3.9 since
> it causes false positives.
> 
> 


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18  8:33                   ` Michael S. Tsirkin
  2013-04-18  9:40                     ` Jack Morgenstein
@ 2013-04-18 14:49                     ` Or Gerlitz
  2013-04-18 13:54                       ` Michael S. Tsirkin
  1 sibling, 1 reply; 21+ messages in thread
From: Or Gerlitz @ 2013-04-18 14:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tejun Heo, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On 18/04/2013 11:33, Michael S. Tsirkin wrote:
> On Sun, Apr 14, 2013 at 06:43:39AM -0700, Tejun Heo wrote:
>> On Sun, Apr 14, 2013 at 03:58:55PM +0300, Or Gerlitz wrote:
>>> So the patch eliminated the lockdep warning for mlx4 nested probing
>>> sequence, but introduced lockdep warning for
>>> 00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC
>>> Interrupt Controller (rev 22)
>> Oops, the patch in itself doesn't really change anything.  The caller
>> should use a different subclass for the nested invocation, just like
>> spin_lock_nested() and friends.  Sorry about not being clear.
>> Michael, can you please help?
>>
>> Thanks.
>>
>> -- 
>> tejun
> So like this on top.  Tejun, you didn't add your S.O.B. and patch
> description; if this helps as we expect, they will be needed.
>
> ---->
>
> pci: use work_on_cpu_nested for nested SRIOV
>
> Since 3.9-rc1 the mlx4 driver has been triggering a lockdep warning.
>
> The issue is that a driver, in its probe function, calls
> pci_enable_sriov so a PF device probe causes VF probe (AKA nested
> probe).  Each probe in pci_device_probe is (normally) run through
> work_on_cpu (this is to get the right numa node for memory allocated by
> the driver).  In turn work_on_cpu does this internally:
>
>          schedule_work_on(cpu, &wfc.work);
>          flush_work(&wfc.work);
>
> So if you are running probe on CPU1, and cause another
> probe on the same CPU, this will try to flush
> workqueue from inside same workqueue which triggers
> a lockdep warning.
>
> Nested probing might be tricky to get right generally.
>
> But for pci_sriov_enable, the situation is actually very simple:
> VFs almost never use the same driver as the PF so the warning
> is bogus there.
>
> This is hardly elegant as it might shut up some real warnings if a buggy
> driver actually probes itself in a nested way, but looks to me like an
> appropriate quick fix for 3.9.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> ---
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 1fa1e48..9c836ef 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -286,9 +286,9 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>   		int cpu;
>   
>   		get_online_cpus();
> -		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
> -		if (cpu < nr_cpu_ids)
> -			error = work_on_cpu(cpu, local_pci_probe, &ddi);
> +		cpu = cpumask_first_and(cpumask_of_node(node), cpu_online_mask);
> +		if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
> +			error = work_on_cpu_nested(cpu, local_pci_probe, &ddi);

as you wrote to me later, missing here is SINGLE_DEPTH_NESTING as the 
last param to work_on_cpu_nested
>   		else
>   			error = local_pci_probe(&ddi);
>   		put_online_cpus();

So now I used Tejun's patch and Michael's patch on top of net.git as
of commit 2e0cbf2cc2c9371f0aa198857d799175ffe231a6
"net: mvmdio: add select PHYLIB" from April 13 -- and I still see
this... so we're not there yet

=====================================
[ BUG: bad unlock balance detected! ]
3.9.0-rc6+ #56 Not tainted
-------------------------------------
swapper/0/1 is trying to release lock ((&wfc.work)) at:
[<ffffffff81220167>] pci_device_probe+0x117/0x120
but there are no more locks to release!

other info that might help us debug this:
2 locks held by swapper/0/1:
  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff812da443>] 
__driver_attach+0x53/0xb0
  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff812da451>] 
__driver_attach+0x61/0xb0

stack backtrace:
Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc6+ #56
Call Trace:
  [<ffffffff81220167>] ? pci_device_probe+0x117/0x120
  [<ffffffff81093529>] print_unlock_imbalance_bug+0xf9/0x100
  [<ffffffff8109616f>] lock_set_class+0x27f/0x7c0
  [<ffffffff81091d9e>] ? mark_held_locks+0x9e/0x130
  [<ffffffff81220167>] ? pci_device_probe+0x117/0x120
  [<ffffffff81066aeb>] work_on_cpu_nested+0x8b/0xc0
  [<ffffffff810633c0>] ? keventd_up+0x20/0x20
  [<ffffffff8121f420>] ? pci_pm_prepare+0x60/0x60
  [<ffffffff81220167>] pci_device_probe+0x117/0x120
  [<ffffffff812da0fa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff812da24f>] driver_probe_device+0x8f/0x230
  [<ffffffff812da493>] __driver_attach+0xa3/0xb0
  [<ffffffff812da3f0>] ? driver_probe_device+0x230/0x230
  [<ffffffff812da3f0>] ? driver_probe_device+0x230/0x230
  [<ffffffff812d86fc>] bus_for_each_dev+0x8c/0xb0
  [<ffffffff812da079>] driver_attach+0x19/0x20
  [<ffffffff812d91a0>] bus_add_driver+0x1f0/0x250
  [<ffffffff818bd596>] ? dmi_pcie_pme_disable_msi+0x21/0x21
  [<ffffffff812daadf>] driver_register+0x6f/0x150
  [<ffffffff818bd596>] ? dmi_pcie_pme_disable_msi+0x21/0x21
  [<ffffffff8122026f>] __pci_register_driver+0x5f/0x70
  [<ffffffff818bd5ff>] pcie_portdrv_init+0x69/0x7a
  [<ffffffff810001fd>] do_one_initcall+0x3d/0x170
  [<ffffffff81895943>] kernel_init_freeable+0x10d/0x19c
  [<ffffffff818959d2>] ? kernel_init_freeable+0x19c/0x19c
  [<ffffffff8145a040>] ? rest_init+0x160/0x160
  [<ffffffff8145a049>] kernel_init+0x9/0xf0
  [<ffffffff8146ca6c>] ret_from_fork+0x7c/0xb0
  [<ffffffff8145a040>] ? rest_init+0x160/0x160
ioapic: probe of 0000:00:13.0 failed with error -22
pci_hotplug: PCI Hot Plug PCI Core version: 0.5



* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 14:49                     ` Or Gerlitz
@ 2013-04-18 13:54                       ` Michael S. Tsirkin
  2013-04-18 18:19                         ` Tejun Heo
  2013-04-18 18:41                         ` Or Gerlitz
  0 siblings, 2 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-18 13:54 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Tejun Heo, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 18, 2013 at 05:49:20PM +0300, Or Gerlitz wrote:
> On 18/04/2013 11:33, Michael S. Tsirkin wrote:
> >On Sun, Apr 14, 2013 at 06:43:39AM -0700, Tejun Heo wrote:
> >>On Sun, Apr 14, 2013 at 03:58:55PM +0300, Or Gerlitz wrote:
> >>>So the patch eliminated the lockdep warning for mlx4 nested probing
> >>>sequence, but introduced lockdep warning for
> >>>00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC
> >>>Interrupt Controller (rev 22)
> >>Oops, the patch in itself doesn't really change anything.  The caller
> >>should use a different subclass for the nested invocation, just like
> >>spin_lock_nested() and friends.  Sorry about not being clear.
> >>Michael, can you please help?
> >>
> >>Thanks.
> >>
> >>-- 
> >>tejun
> >So like this on top. Tejun, you didn't add your S.O.B. and patch
> >description; if this helps as we expect, they will be needed.
> >
> >---->
> >
> >pci: use work_on_cpu_nested for nested SRIOV
> >
> >Since 3.9-rc1 the mlx driver has been triggering a lockdep warning.
> >
> >The issue is that a driver, in its probe function, calls
> >pci_sriov_enable so a PF device probe causes VF probe (AKA nested
> >probe).  Each probe happens in pci_device_probe, which is (normally) run
> >through work_on_cpu (this is to get the right NUMA node for memory
> >allocated by the driver).  In turn work_on_cpu does this internally:
> >
> >         schedule_work_on(cpu, &wfc.work);
> >         flush_work(&wfc.work);
> >
> >So if you are running probe on CPU1, and cause another
> >probe on the same CPU, this will try to flush
> >workqueue from inside same workqueue which triggers
> >a lockdep warning.
> >
> >Nested probing might be tricky to get right generally.
> >
> >But for pci_sriov_enable, the situation is actually very simple:
> >VFs almost never use the same driver as the PF so the warning
> >is bogus there.
> >
> >This is hardly elegant as it might shut up some real warnings if a buggy
> >driver actually probes itself in a nested way, but looks to me like an
> >appropriate quick fix for 3.9.
> >
> >Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> >---
> >diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> >index 1fa1e48..9c836ef 100644
> >--- a/drivers/pci/pci-driver.c
> >+++ b/drivers/pci/pci-driver.c
> >@@ -286,9 +286,9 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
> >  		int cpu;
> >  		get_online_cpus();
> >-		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
> >-		if (cpu < nr_cpu_ids)
> >-			error = work_on_cpu(cpu, local_pci_probe, &ddi);
> >+		cpu = cpumask_first_and(cpumask_of_node(node), cpu_online_mask);
> >+		if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
> >+			error = work_on_cpu_nested(cpu, local_pci_probe, &ddi);
> 
> As you wrote to me later, SINGLE_DEPTH_NESTING is missing here as
> the last parameter to work_on_cpu_nested.
> >  		else
> >  			error = local_pci_probe(&ddi);
> >  		put_online_cpus();
> 
> So now I used Tejun's patch and Michael's patch on top of net.git
> as of commit 2e0cbf2cc2c9371f0aa198857d799175ffe231a6
> "net: mvmdio: add select PHYLIB" from April 13, and I still see
> the warning below, so we're not there yet:
> 
> =====================================
> [ BUG: bad unlock balance detected! ]
> 3.9.0-rc6+ #56 Not tainted
> -------------------------------------
> swapper/0/1 is trying to release lock ((&wfc.work)) at:
> [<ffffffff81220167>] pci_device_probe+0x117/0x120
> but there are no more locks to release!
> 
> other info that might help us debug this:
> 2 locks held by swapper/0/1:
>  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff812da443>]
> __driver_attach+0x53/0xb0
>  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff812da451>]
> __driver_attach+0x61/0xb0
> 
> stack backtrace:
> Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc6+ #56
> Call Trace:
>  [<ffffffff81220167>] ? pci_device_probe+0x117/0x120
>  [<ffffffff81093529>] print_unlock_imbalance_bug+0xf9/0x100
>  [<ffffffff8109616f>] lock_set_class+0x27f/0x7c0
>  [<ffffffff81091d9e>] ? mark_held_locks+0x9e/0x130
>  [<ffffffff81220167>] ? pci_device_probe+0x117/0x120
>  [<ffffffff81066aeb>] work_on_cpu_nested+0x8b/0xc0
>  [<ffffffff810633c0>] ? keventd_up+0x20/0x20
>  [<ffffffff8121f420>] ? pci_pm_prepare+0x60/0x60
>  [<ffffffff81220167>] pci_device_probe+0x117/0x120
>  [<ffffffff812da0fa>] ? driver_sysfs_add+0x7a/0xb0
>  [<ffffffff812da24f>] driver_probe_device+0x8f/0x230
>  [<ffffffff812da493>] __driver_attach+0xa3/0xb0
>  [<ffffffff812da3f0>] ? driver_probe_device+0x230/0x230
>  [<ffffffff812da3f0>] ? driver_probe_device+0x230/0x230
>  [<ffffffff812d86fc>] bus_for_each_dev+0x8c/0xb0
>  [<ffffffff812da079>] driver_attach+0x19/0x20
>  [<ffffffff812d91a0>] bus_add_driver+0x1f0/0x250
>  [<ffffffff818bd596>] ? dmi_pcie_pme_disable_msi+0x21/0x21
>  [<ffffffff812daadf>] driver_register+0x6f/0x150
>  [<ffffffff818bd596>] ? dmi_pcie_pme_disable_msi+0x21/0x21
>  [<ffffffff8122026f>] __pci_register_driver+0x5f/0x70
>  [<ffffffff818bd5ff>] pcie_portdrv_init+0x69/0x7a
>  [<ffffffff810001fd>] do_one_initcall+0x3d/0x170
>  [<ffffffff81895943>] kernel_init_freeable+0x10d/0x19c
>  [<ffffffff818959d2>] ? kernel_init_freeable+0x19c/0x19c
>  [<ffffffff8145a040>] ? rest_init+0x160/0x160
>  [<ffffffff8145a049>] kernel_init+0x9/0xf0
>  [<ffffffff8146ca6c>] ret_from_fork+0x7c/0xb0
>  [<ffffffff8145a040>] ? rest_init+0x160/0x160
> ioapic: probe of 0000:00:13.0 failed with error -22
> pci_hotplug: PCI Hot Plug PCI Core version: 0.5

Tejun, what do you say my patch is used for 3.9,
and we can revisit for 3.10.
The release is almost here.
If yes please send your Ack.


-- 
MST


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 13:54                       ` Michael S. Tsirkin
@ 2013-04-18 18:19                         ` Tejun Heo
  2013-04-18 18:25                           ` Bjorn Helgaas
  2013-04-18 18:41                         ` Or Gerlitz
  1 sibling, 1 reply; 21+ messages in thread
From: Tejun Heo @ 2013-04-18 18:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 18, 2013 at 04:54:58PM +0300, Michael S. Tsirkin wrote:
> Tejun, what do you say my patch is used for 3.9,
> and we can revisit for 3.10.
> The release is almost here.
> If yes please send your Ack.

Yeap, let's do that.

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 18:19                         ` Tejun Heo
@ 2013-04-18 18:25                           ` Bjorn Helgaas
  2013-04-18 20:11                             ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Bjorn Helgaas @ 2013-04-18 18:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michael S. Tsirkin, Or Gerlitz, Ming Lei, Greg Kroah-Hartman,
	David Miller, Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	linux-pci@vger.kernel.org

On Thu, Apr 18, 2013 at 12:19 PM, Tejun Heo <tj@kernel.org> wrote:
> On Thu, Apr 18, 2013 at 04:54:58PM +0300, Michael S. Tsirkin wrote:
>> Tejun, what do you say my patch is used for 3.9,
>> and we can revisit for 3.10.
>> The release is almost here.
>> If yes please send your Ack.
>
> Yeap, let's do that.
>
> Acked-by: Tejun Heo <tj@kernel.org>

Michael, can you post a new version with Tejun's ack?  IIRC, this was
in drivers/pci, but I haven't been following this and am not sure
exactly what you want applied.  Thanks.

Bjorn


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 18:25                           ` Bjorn Helgaas
@ 2013-04-18 20:11                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-18 20:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Tejun Heo, Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	linux-pci@vger.kernel.org

On Thu, Apr 18, 2013 at 12:25:59PM -0600, Bjorn Helgaas wrote:
> On Thu, Apr 18, 2013 at 12:19 PM, Tejun Heo <tj@kernel.org> wrote:
> > On Thu, Apr 18, 2013 at 04:54:58PM +0300, Michael S. Tsirkin wrote:
> >> Tejun, what do you say my patch is used for 3.9,
> >> and we can revisit for 3.10.
> >> The release is almost here.
> >> If yes please send your Ack.
> >
> > Yeap, let's do that.
> >
> > Acked-by: Tejun Heo <tj@kernel.org>
> 
> Michael, can you post a new version with Tejun's ack?  IIRC, this was
> in drivers/pci, but I haven't been following this and am not sure
> exactly what you want applied.  Thanks.
> 
> Bjorn

Done. The subject is:
[PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV
It's the same patch with Tejun's ack and a minor
correction in the commit message.




* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 13:54                       ` Michael S. Tsirkin
  2013-04-18 18:19                         ` Tejun Heo
@ 2013-04-18 18:41                         ` Or Gerlitz
  2013-04-18 20:03                           ` Michael S. Tsirkin
  1 sibling, 1 reply; 21+ messages in thread
From: Or Gerlitz @ 2013-04-18 18:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Tejun Heo, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 18, 2013 at 4:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
[...]
> Tejun, what do you say my patch is used for 3.9, and we can revisit for 3.10.
> The release is almost here. If yes please send your Ack.

Michael,

I assume you mean pulling both Tejun's patch and yours into 3.9, correct? I
wasn't sure what this really buys us... we got rid of the
false-positive lockdep warning which takes place during nested probe
and got another lockdep warning during the probe of the interrupt
controller.

Or.


* Re: [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 18:41                         ` Or Gerlitz
@ 2013-04-18 20:03                           ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2013-04-18 20:03 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, Tejun Heo, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein,
	Bjorn Helgaas, linux-pci

On Thu, Apr 18, 2013 at 09:41:31PM +0300, Or Gerlitz wrote:
> On Thu, Apr 18, 2013 at 4:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> [...]
> > Tejun, what do you say my patch is used for 3.9, and we can revisit for 3.10.
> > The release is almost here. If yes please send your Ack.
> 
> Michael,
> 
> I assume you mean pulling both Tejun's patch and yours into 3.9, correct? I
> wasn't sure what this really buys us... we got rid of the
> false-positive lockdep warning which takes place during nested probe
> and got another lockdep warning during the probe of the interrupt
> controller.
> 
> Or.

No, I mean my original patch.


Thread overview: 21+ messages
2013-04-11 15:30 [PATCH repost for-3.9] pci: avoid work_on_cpu for nested SRIOV probes Michael S. Tsirkin
2013-04-11 18:05 ` Tejun Heo
2013-04-11 18:58   ` Michael S. Tsirkin
2013-04-11 19:04     ` Tejun Heo
2013-04-11 19:17       ` Michael S. Tsirkin
2013-04-11 19:20         ` Tejun Heo
2013-04-11 20:30           ` Michael S. Tsirkin
2013-04-11 20:41             ` Tejun Heo
2013-04-11 21:52               ` Or Gerlitz
     [not found]               ` <516AA80F.7040505@mellanox.com>
2013-04-14 13:43                 ` Tejun Heo
2013-04-18  8:33                   ` Michael S. Tsirkin
2013-04-18  9:40                     ` Jack Morgenstein
2013-04-18  8:48                       ` Michael S. Tsirkin
2013-04-18  9:57                         ` Jack Morgenstein
2013-04-18 14:49                     ` Or Gerlitz
2013-04-18 13:54                       ` Michael S. Tsirkin
2013-04-18 18:19                         ` Tejun Heo
2013-04-18 18:25                           ` Bjorn Helgaas
2013-04-18 20:11                             ` Michael S. Tsirkin
2013-04-18 18:41                         ` Or Gerlitz
2013-04-18 20:03                           ` Michael S. Tsirkin
