Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM

linux-tegra.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
       [not found]             ` <20151203192616.GJ27463-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2016-01-26 17:38               ` Thierry Reding
       [not found]                 ` <20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Thierry Reding @ 2016-01-26 17:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Peter Zijlstra, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 6332 bytes --]

On Thu, Dec 03, 2015 at 02:26:16PM -0500, Tejun Heo wrote:
> Task or work item involved in memory reclaim trying to flush a
> non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> ---
> Hello,
> 
> So, something like this.  Seems to work fine here.  If there's no
> objection, I'm gonna push it through wq/for-4.5.
> 
> Thanks.
> 
>  kernel/workqueue.c |   35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2330,6 +2330,37 @@ repeat:
>  	goto repeat;
>  }
>  
> +/**
> + * check_flush_dependency - check for flush dependency sanity
> + * @target_wq: workqueue being flushed
> + * @target_work: work item being flushed (NULL for workqueue flushes)
> + *
> + * %current is trying to flush the whole @target_wq or @target_work on it.
> + * If @target_wq doesn't have %WQ_MEM_RECLAIM, verify that %current is not
> + * reclaiming memory or running on a workqueue which doesn't have
> + * %WQ_MEM_RECLAIM as that can break forward-progress guarantee leading to
> + * a deadlock.
> + */
> +static void check_flush_dependency(struct workqueue_struct *target_wq,
> +				   struct work_struct *target_work)
> +{
> +	work_func_t target_func = target_work ? target_work->func : NULL;
> +	struct worker *worker;
> +
> +	if (target_wq->flags & WQ_MEM_RECLAIM)
> +		return;
> +
> +	worker = current_wq_worker();
> +
> +	WARN_ONCE(current->flags & PF_MEMALLOC,
> +		  "workqueue: PF_MEMALLOC task %d(%s) is flushing !WQ_MEM_RECLAIM %s:%pf",
> +		  current->pid, current->comm, target_wq->name, target_func);
> +	WARN_ONCE(worker && (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
> +		  "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s:%pf",
> +		  worker->current_pwq->wq->name, worker->current_func,
> +		  target_wq->name, target_func);
> +}
> +
>  struct wq_barrier {
>  	struct work_struct	work;
>  	struct completion	done;
> @@ -2539,6 +2570,8 @@ void flush_workqueue(struct workqueue_st
>  		list_add_tail(&this_flusher.list, &wq->flusher_overflow);
>  	}
>  
> +	check_flush_dependency(wq, NULL);
> +
>  	mutex_unlock(&wq->mutex);
>  
>  	wait_for_completion(&this_flusher.done);
> @@ -2711,6 +2744,8 @@ static bool start_flush_work(struct work
>  		pwq = worker->current_pwq;
>  	}
>  
> +	check_flush_dependency(pwq->wq, work);
> +
>  	insert_wq_barrier(pwq, barr, work, worker);
>  	spin_unlock_irq(&pool->lock);
>  

I've started noticing the following during boot on some of the devices I
work with:

[    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
[    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
[    4.748099] Modules linked in:
[    4.751342] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.5.0-rc1-00018-g420fc292d9c7 #1
[    4.759504] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[    4.765762] Workqueue: deferwq deferred_probe_work_func
[    4.771004] [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
[    4.778746] [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
[    4.785966] [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
[    4.794048] [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
[    4.802736] [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
[    4.811769] [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
[    4.820112] [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
[    4.828110] [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
[    4.836018] [<c00f728c>] (migrate_prep) from [<c00bfbc4>] (alloc_contig_range+0xd8/0x338)
[    4.844186] [<c00bfbc4>] (alloc_contig_range) from [<c00f8f18>] (cma_alloc+0xe0/0x1ac)
[    4.852093] [<c00f8f18>] (cma_alloc) from [<c001cac4>] (__alloc_from_contiguous+0x38/0xd8)
[    4.860346] [<c001cac4>] (__alloc_from_contiguous) from [<c001ceb4>] (__dma_alloc+0x240/0x278)
[    4.868944] [<c001ceb4>] (__dma_alloc) from [<c001cf78>] (arm_dma_alloc+0x54/0x5c)
[    4.876506] [<c001cf78>] (arm_dma_alloc) from [<c0355ea4>] (dmam_alloc_coherent+0xc0/0xec)
[    4.884764] [<c0355ea4>] (dmam_alloc_coherent) from [<c039cc4c>] (ahci_port_start+0x150/0x1dc)
[    4.893367] [<c039cc4c>] (ahci_port_start) from [<c0384734>] (ata_host_start.part.3+0xc8/0x1c8)
[    4.902055] [<c0384734>] (ata_host_start.part.3) from [<c03898dc>] (ata_host_activate+0x50/0x148)
[    4.910919] [<c03898dc>] (ata_host_activate) from [<c039d558>] (ahci_host_activate+0x44/0x114)
[    4.919523] [<c039d558>] (ahci_host_activate) from [<c039f05c>] (ahci_platform_init_host+0x1d8/0x3c8)
[    4.928733] [<c039f05c>] (ahci_platform_init_host) from [<c039e6bc>] (tegra_ahci_probe+0x448/0x4e8)
[    4.937770] [<c039e6bc>] (tegra_ahci_probe) from [<c0347058>] (platform_drv_probe+0x50/0xac)
[    4.946197] [<c0347058>] (platform_drv_probe) from [<c03458cc>] (driver_probe_device+0x214/0x2c0)
[    4.955061] [<c03458cc>] (driver_probe_device) from [<c0343cc0>] (bus_for_each_drv+0x60/0x94)
[    4.963575] [<c0343cc0>] (bus_for_each_drv) from [<c03455d8>] (__device_attach+0xb0/0x114)
[    4.971828] [<c03455d8>] (__device_attach) from [<c0344ab8>] (bus_probe_device+0x84/0x8c)
[    4.979994] [<c0344ab8>] (bus_probe_device) from [<c0344f48>] (deferred_probe_work_func+0x68/0x98)
[    4.988941] [<c0344f48>] (deferred_probe_work_func) from [<c003b738>] (process_one_work+0x120/0x3f8)
[    4.998062] [<c003b738>] (process_one_work) from [<c003ba48>] (worker_thread+0x38/0x55c)
[    5.006144] [<c003ba48>] (worker_thread) from [<c0040f14>] (kthread+0xdc/0xf4)
[    5.013362] [<c0040f14>] (kthread) from [<c000f778>] (ret_from_fork+0x14/0x3c)

This seems to be caused by the interaction of the probe deferral
workqueue with WQ_MEM_RECLAIM workqueue. Any ideas on how to solve this?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
       [not found]                 ` <20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
@ 2016-01-28 10:12                   ` Peter Zijlstra
       [not found]                     ` <20160128101210.GC6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
  2016-01-29 10:59                   ` [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues Tejun Heo
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2016-01-28 10:12 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Tejun Heo, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ

On Tue, Jan 26, 2016 at 06:38:43PM +0100, Thierry Reding wrote:
> > Task or work item involved in memory reclaim trying to flush a
> > non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> > deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> I've started noticing the following during boot on some of the devices I
> work with:
> 
> [    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
> [    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
> [    4.748099] Modules linked in:
> [    4.751342] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.5.0-rc1-00018-g420fc292d9c7 #1
> [    4.759504] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> [    4.765762] Workqueue: deferwq deferred_probe_work_func
> [    4.771004] [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
> [    4.778746] [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
> [    4.785966] [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
> [    4.794048] [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
> [    4.802736] [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
> [    4.811769] [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
> [    4.820112] [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
> [    4.828110] [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)

Right, also, I think it makes sense to do lru_add_drain_all() from a
WQ_MEM_RECLAIM workqueue, it is, after all, aiding in getting memory
freed.

Does something like the below cure things?

TJ does this make sense to you?

---
 mm/swap.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/swap.c b/mm/swap.c
index 09fe5e97714a..a3de016b2a9d 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -666,6 +666,15 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
 
 static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
 
+static struct workqueue_struct *lru_wq;
+
+static int __init lru_init(void)
+{
+	lru_wq = create_workqueue("lru");
+	return 0;
+}
+early_initcall(lru_init);
+
 void lru_add_drain_all(void)
 {
 	static DEFINE_MUTEX(lock);
@@ -685,7 +694,7 @@ void lru_add_drain_all(void)
 		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
 		    need_activate_page_drain(cpu)) {
 			INIT_WORK(work, lru_add_drain_per_cpu);
-			schedule_work_on(cpu, work);
+			queue_work_on(cpu, &lru_wq, work);
 			cpumask_set_cpu(cpu, &has_work);
 		}
 	}

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
       [not found]                     ` <20160128101210.GC6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
@ 2016-01-28 12:47                       ` Thierry Reding
  2016-01-28 12:48                         ` Thierry Reding
  2016-01-29 11:09                       ` Tejun Heo
  1 sibling, 1 reply; 11+ messages in thread
From: Thierry Reding @ 2016-01-28 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tejun Heo, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ

[-- Attachment #1: Type: text/plain, Size: 3192 bytes --]

On Thu, Jan 28, 2016 at 11:12:10AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 26, 2016 at 06:38:43PM +0100, Thierry Reding wrote:
> > > Task or work item involved in memory reclaim trying to flush a
> > > non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> > > deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> > I've started noticing the following during boot on some of the devices I
> > work with:
> > 
> > [    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
> > [    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
> > [    4.748099] Modules linked in:
> > [    4.751342] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.5.0-rc1-00018-g420fc292d9c7 #1
> > [    4.759504] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > [    4.765762] Workqueue: deferwq deferred_probe_work_func
> > [    4.771004] [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
> > [    4.778746] [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
> > [    4.785966] [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
> > [    4.794048] [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
> > [    4.802736] [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
> > [    4.811769] [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
> > [    4.820112] [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
> > [    4.828110] [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
> 
> Right, also, I think it makes sense to do lru_add_drain_all() from a
> WQ_MEM_RECLAIM workqueue, it is, after all, aiding in getting memory
> freed.
> 
> Does something like the below cure things?
> 
> TJ does this make sense to you?
> 
> ---
>  mm/swap.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 09fe5e97714a..a3de016b2a9d 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -666,6 +666,15 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> +static struct workqueue_struct *lru_wq;
> +
> +static int __init lru_init(void)
> +{
> +	lru_wq = create_workqueue("lru");
> +	return 0;
> +}
> +early_initcall(lru_init);
> +
>  void lru_add_drain_all(void)
>  {
>  	static DEFINE_MUTEX(lock);
> @@ -685,7 +694,7 @@ void lru_add_drain_all(void)
>  		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
>  		    need_activate_page_drain(cpu)) {
>  			INIT_WORK(work, lru_add_drain_per_cpu);
> -			schedule_work_on(cpu, work);
> +			queue_work_on(cpu, &lru_wq, work);
                                           ^

This ampersand is too much here and causes a compile-time warning.
Removing it and booting the resulting kernel doesn't trigger the
WQ_MEM_RECLAIM warning anymore, though.

Tested on top of next-20160128.

Thanks,
Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
  2016-01-28 12:47                       ` Thierry Reding
@ 2016-01-28 12:48                         ` Thierry Reding
  0 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2016-01-28 12:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tejun Heo, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ

[-- Attachment #1: Type: text/plain, Size: 3652 bytes --]

On Thu, Jan 28, 2016 at 01:47:00PM +0100, Thierry Reding wrote:
> On Thu, Jan 28, 2016 at 11:12:10AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 26, 2016 at 06:38:43PM +0100, Thierry Reding wrote:
> > > > Task or work item involved in memory reclaim trying to flush a
> > > > non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> > > > deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> > > I've started noticing the following during boot on some of the devices I
> > > work with:
> > > 
> > > [    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
> > > [    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
> > > [    4.748099] Modules linked in:
> > > [    4.751342] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.5.0-rc1-00018-g420fc292d9c7 #1
> > > [    4.759504] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > > [    4.765762] Workqueue: deferwq deferred_probe_work_func
> > > [    4.771004] [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
> > > [    4.778746] [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
> > > [    4.785966] [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
> > > [    4.794048] [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
> > > [    4.802736] [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
> > > [    4.811769] [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
> > > [    4.820112] [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
> > > [    4.828110] [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
> > 
> > Right, also, I think it makes sense to do lru_add_drain_all() from a
> > WQ_MEM_RECLAIM workqueue, it is, after all, aiding in getting memory
> > freed.
> > 
> > Does something like the below cure things?
> > 
> > TJ does this make sense to you?
> > 
> > ---
> >  mm/swap.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 09fe5e97714a..a3de016b2a9d 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -666,6 +666,15 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
> >  
> >  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
> >  
> > +static struct workqueue_struct *lru_wq;
> > +
> > +static int __init lru_init(void)
> > +{
> > +	lru_wq = create_workqueue("lru");
> > +	return 0;
> > +}
> > +early_initcall(lru_init);
> > +
> >  void lru_add_drain_all(void)
> >  {
> >  	static DEFINE_MUTEX(lock);
> > @@ -685,7 +694,7 @@ void lru_add_drain_all(void)
> >  		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
> >  		    need_activate_page_drain(cpu)) {
> >  			INIT_WORK(work, lru_add_drain_per_cpu);
> > -			schedule_work_on(cpu, work);
> > +			queue_work_on(cpu, &lru_wq, work);
>                                            ^
> 
> This ampersand is too much here and causes a compile-time warning.
> Removing it and booting the resulting kernel doesn't trigger the
> WQ_MEM_RECLAIM warning anymore, though.
> 
> Tested on top of next-20160128.

This implies that if you want to turn this into a proper patch:

Tested-by: Thierry Reding <treding-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

Alternatively, if you come up with a different way to fix things,
please let me know and I'll be happy to test again.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues
       [not found]                 ` <20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
  2016-01-28 10:12                   ` Peter Zijlstra
@ 2016-01-29 10:59                   ` Tejun Heo
  2016-01-29 15:07                     ` Thierry Reding
                                       ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Tejun Heo @ 2016-01-29 10:59 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Peter Zijlstra, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA

fca839c00a12 ("workqueue: warn if memory reclaim tries to flush
!WQ_MEM_RECLAIM workqueue") implemented flush dependency warning which
triggers if a PF_MEMALLOC task or WQ_MEM_RECLAIM workqueue tries to
flush a !WQ_MEM_RECLAIM workquee.

This assumes that workqueues marked with WQ_MEM_RECLAIM sit in memory
reclaim path and making it depend on something which may need more
memory to make forward progress can lead to deadlocks.  Unfortunately,
workqueues created with the legacy create*_workqueue() interface
always have WQ_MEM_RECLAIM regardless of whether they are depended
upon memory reclaim or not.  These spurious WQ_MEM_RECLAIM markings
cause spurious triggering of the flush dependency checks.

  WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
  workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
  ...
  Workqueue: deferwq deferred_probe_work_func
  [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
  [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
  [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
  [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
  [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
  [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
  [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
  [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
  [<c00f728c>] (migrate_prep) from [<c00bfbc4>] (alloc_contig_range+0xd8/0x338)
  [<c00bfbc4>] (alloc_contig_range) from [<c00f8f18>] (cma_alloc+0xe0/0x1ac)
  [<c00f8f18>] (cma_alloc) from [<c001cac4>] (__alloc_from_contiguous+0x38/0xd8)
  [<c001cac4>] (__alloc_from_contiguous) from [<c001ceb4>] (__dma_alloc+0x240/0x278)
  [<c001ceb4>] (__dma_alloc) from [<c001cf78>] (arm_dma_alloc+0x54/0x5c)
  [<c001cf78>] (arm_dma_alloc) from [<c0355ea4>] (dmam_alloc_coherent+0xc0/0xec)
  [<c0355ea4>] (dmam_alloc_coherent) from [<c039cc4c>] (ahci_port_start+0x150/0x1dc)
  [<c039cc4c>] (ahci_port_start) from [<c0384734>] (ata_host_start.part.3+0xc8/0x1c8)
  [<c0384734>] (ata_host_start.part.3) from [<c03898dc>] (ata_host_activate+0x50/0x148)
  [<c03898dc>] (ata_host_activate) from [<c039d558>] (ahci_host_activate+0x44/0x114)
  [<c039d558>] (ahci_host_activate) from [<c039f05c>] (ahci_platform_init_host+0x1d8/0x3c8)
  [<c039f05c>] (ahci_platform_init_host) from [<c039e6bc>] (tegra_ahci_probe+0x448/0x4e8)
  [<c039e6bc>] (tegra_ahci_probe) from [<c0347058>] (platform_drv_probe+0x50/0xac)
  [<c0347058>] (platform_drv_probe) from [<c03458cc>] (driver_probe_device+0x214/0x2c0)
  [<c03458cc>] (driver_probe_device) from [<c0343cc0>] (bus_for_each_drv+0x60/0x94)
  [<c0343cc0>] (bus_for_each_drv) from [<c03455d8>] (__device_attach+0xb0/0x114)
  [<c03455d8>] (__device_attach) from [<c0344ab8>] (bus_probe_device+0x84/0x8c)
  [<c0344ab8>] (bus_probe_device) from [<c0344f48>] (deferred_probe_work_func+0x68/0x98)
  [<c0344f48>] (deferred_probe_work_func) from [<c003b738>] (process_one_work+0x120/0x3f8)
  [<c003b738>] (process_one_work) from [<c003ba48>] (worker_thread+0x38/0x55c)
  [<c003ba48>] (worker_thread) from [<c0040f14>] (kthread+0xdc/0xf4)
  [<c0040f14>] (kthread) from [<c000f778>] (ret_from_fork+0x14/0x3c)

Fix it by marking workqueues created via create*_workqueue() with
__WQ_LEGACY and disabling flush dependency checks on them.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Reported-by: Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Link: http://lkml.kernel.org/g/20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org
---
Hello, Thierry.

Can youp please verify the fix?

Thanks.

 include/linux/workqueue.h |    9 +++++----
 kernel/workqueue.c        |    3 ++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 0e32bc7..ca73c50 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -311,6 +311,7 @@ enum {
 
 	__WQ_DRAINING		= 1 << 16, /* internal: workqueue is draining */
 	__WQ_ORDERED		= 1 << 17, /* internal: workqueue is ordered */
+	__WQ_LEGACY		= 1 << 18, /* internal: create*_workqueue() */
 
 	WQ_MAX_ACTIVE		= 512,	  /* I like 512, better ideas? */
 	WQ_MAX_UNBOUND_PER_CPU	= 4,	  /* 4 * #cpus for unbound wq */
@@ -411,12 +412,12 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
 	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
 
 #define create_workqueue(name)						\
-	alloc_workqueue("%s", WQ_MEM_RECLAIM, 1, (name))
+	alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
 #define create_freezable_workqueue(name)				\
-	alloc_workqueue("%s", WQ_FREEZABLE | WQ_UNBOUND | WQ_MEM_RECLAIM, \
-			1, (name))
+	alloc_workqueue("%s", __WQ_LEGACY | WQ_FREEZABLE | WQ_UNBOUND |	\
+			WQ_MEM_RECLAIM, 1, (name))
 #define create_singlethread_workqueue(name)				\
-	alloc_ordered_workqueue("%s", WQ_MEM_RECLAIM, name)
+	alloc_ordered_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, name)
 
 extern void destroy_workqueue(struct workqueue_struct *wq);
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 61a0264..dc7faad 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2355,7 +2355,8 @@ static void check_flush_dependency(struct workqueue_struct *target_wq,
 	WARN_ONCE(current->flags & PF_MEMALLOC,
 		  "workqueue: PF_MEMALLOC task %d(%s) is flushing !WQ_MEM_RECLAIM %s:%pf",
 		  current->pid, current->comm, target_wq->name, target_func);
-	WARN_ONCE(worker && (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
+	WARN_ONCE(worker && ((worker->current_pwq->wq->flags &
+			      (WQ_MEM_RECLAIM | __WQ_LEGACY)) == WQ_MEM_RECLAIM),
 		  "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s:%pf",
 		  worker->current_pwq->wq->name, worker->current_func,
 		  target_wq->name, target_func);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
       [not found]                     ` <20160128101210.GC6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
  2016-01-28 12:47                       ` Thierry Reding
@ 2016-01-29 11:09                       ` Tejun Heo
       [not found]                         ` <20160129110941.GK32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
  1 sibling, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2016-01-29 11:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thierry Reding, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, Johannes Weiner,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg

Hello, Peter.

On Thu, Jan 28, 2016 at 11:12:10AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 26, 2016 at 06:38:43PM +0100, Thierry Reding wrote:
> > > Task or work item involved in memory reclaim trying to flush a
> > > non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> > > deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> > I've started noticing the following during boot on some of the devices I
> > work with:
> > 
> > [    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
> > [    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
...
> Right, also, I think it makes sense to do lru_add_drain_all() from a
> WQ_MEM_RECLAIM workqueue, it is, after all, aiding in getting memory
> freed.
> 
> Does something like the below cure things?
> 
> TJ does this make sense to you?

The problem here is that deferwq which has nothing to do with memory
reclaim is marked with WQ_MEM_RECLAIM because it was created the old
create_singlethread_workqueue() which doesn't distinguish workqueues
which may be used on mem reclaim path and thus has to mark all as
needing forward progress guarantee.  I posted a patch to disable
disable flush dependency checks on those workqueues and there's a
outreachy project to weed out the users of the old interface, so
hopefully this won't be an issue soon.

As for whether lru drain should have WQ_MEM_RECLAIM, I'm not sure.
Cc'ing linux-mm.  The rule here is that any workquee which is depended
upon during memory reclaim should have WQ_MEM_RECLAIM set.  IOW, if a
work item failing to start execution under memory pressure can prevent
memory from being reclaimed, it should be scheduled on a
WQ_MEM_RECLAIM workqueue.  Would this be the case for
lru_add_drain_per_cpu()?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues
  2016-01-29 10:59                   ` [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues Tejun Heo
@ 2016-01-29 15:07                     ` Thierry Reding
  2016-01-29 18:32                     ` Tejun Heo
       [not found]                     ` <20160129105946.GJ32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
  2 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2016-01-29 15:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Peter Zijlstra, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel, kernel-team, Jon Hunter, linux-tegra

[-- Attachment #1: Type: text/plain, Size: 4038 bytes --]

On Fri, Jan 29, 2016 at 05:59:46AM -0500, Tejun Heo wrote:
> fca839c00a12 ("workqueue: warn if memory reclaim tries to flush
> !WQ_MEM_RECLAIM workqueue") implemented flush dependency warning which
> triggers if a PF_MEMALLOC task or WQ_MEM_RECLAIM workqueue tries to
> flush a !WQ_MEM_RECLAIM workquee.
> 
> This assumes that workqueues marked with WQ_MEM_RECLAIM sit in memory
> reclaim path and making it depend on something which may need more
> memory to make forward progress can lead to deadlocks.  Unfortunately,
> workqueues created with the legacy create*_workqueue() interface
> always have WQ_MEM_RECLAIM regardless of whether they are depended
> upon memory reclaim or not.  These spurious WQ_MEM_RECLAIM markings
> cause spurious triggering of the flush dependency checks.
> 
>   WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
>   workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
>   ...
>   Workqueue: deferwq deferred_probe_work_func
>   [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
>   [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
>   [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
>   [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
>   [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
>   [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
>   [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
>   [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
>   [<c00f728c>] (migrate_prep) from [<c00bfbc4>] (alloc_contig_range+0xd8/0x338)
>   [<c00bfbc4>] (alloc_contig_range) from [<c00f8f18>] (cma_alloc+0xe0/0x1ac)
>   [<c00f8f18>] (cma_alloc) from [<c001cac4>] (__alloc_from_contiguous+0x38/0xd8)
>   [<c001cac4>] (__alloc_from_contiguous) from [<c001ceb4>] (__dma_alloc+0x240/0x278)
>   [<c001ceb4>] (__dma_alloc) from [<c001cf78>] (arm_dma_alloc+0x54/0x5c)
>   [<c001cf78>] (arm_dma_alloc) from [<c0355ea4>] (dmam_alloc_coherent+0xc0/0xec)
>   [<c0355ea4>] (dmam_alloc_coherent) from [<c039cc4c>] (ahci_port_start+0x150/0x1dc)
>   [<c039cc4c>] (ahci_port_start) from [<c0384734>] (ata_host_start.part.3+0xc8/0x1c8)
>   [<c0384734>] (ata_host_start.part.3) from [<c03898dc>] (ata_host_activate+0x50/0x148)
>   [<c03898dc>] (ata_host_activate) from [<c039d558>] (ahci_host_activate+0x44/0x114)
>   [<c039d558>] (ahci_host_activate) from [<c039f05c>] (ahci_platform_init_host+0x1d8/0x3c8)
>   [<c039f05c>] (ahci_platform_init_host) from [<c039e6bc>] (tegra_ahci_probe+0x448/0x4e8)
>   [<c039e6bc>] (tegra_ahci_probe) from [<c0347058>] (platform_drv_probe+0x50/0xac)
>   [<c0347058>] (platform_drv_probe) from [<c03458cc>] (driver_probe_device+0x214/0x2c0)
>   [<c03458cc>] (driver_probe_device) from [<c0343cc0>] (bus_for_each_drv+0x60/0x94)
>   [<c0343cc0>] (bus_for_each_drv) from [<c03455d8>] (__device_attach+0xb0/0x114)
>   [<c03455d8>] (__device_attach) from [<c0344ab8>] (bus_probe_device+0x84/0x8c)
>   [<c0344ab8>] (bus_probe_device) from [<c0344f48>] (deferred_probe_work_func+0x68/0x98)
>   [<c0344f48>] (deferred_probe_work_func) from [<c003b738>] (process_one_work+0x120/0x3f8)
>   [<c003b738>] (process_one_work) from [<c003ba48>] (worker_thread+0x38/0x55c)
>   [<c003ba48>] (worker_thread) from [<c0040f14>] (kthread+0xdc/0xf4)
>   [<c0040f14>] (kthread) from [<c000f778>] (ret_from_fork+0x14/0x3c)
> 
> Fix it by marking workqueues created via create*_workqueue() with
> __WQ_LEGACY and disabling flush dependency checks on them.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Thierry Reding <thierry.reding@gmail.com>
> Link: http://lkml.kernel.org/g/20160126173843.GA11115@ulmo.nvidia.com

Thanks for fixing this, everything is back to normal:

Tested-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
       [not found]                         ` <20160129110941.GK32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
@ 2016-01-29 15:17                           ` Peter Zijlstra
  2016-01-29 18:28                             ` Tejun Heo
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2016-01-29 15:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Thierry Reding, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, Johannes Weiner,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg

On Fri, Jan 29, 2016 at 06:09:41AM -0500, Tejun Heo wrote:
>  I posted a patch to disable
> disable flush dependency checks on those workqueues and there's a
> outreachy project to weed out the users of the old interface, so
> hopefully this won't be an issue soon.

Will that same project review all workqueue users for the strict per-cpu
stuff, so we can finally kill that weird stuff you do on hotplug?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
  2016-01-29 15:17                           ` Peter Zijlstra
@ 2016-01-29 18:28                             ` Tejun Heo
  0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2016-01-29 18:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thierry Reding, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel, kernel-team, Jon Hunter, linux-tegra, rmk+kernel,
	Johannes Weiner, linux-mm

Hey, Peter.

On Fri, Jan 29, 2016 at 04:17:39PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 29, 2016 at 06:09:41AM -0500, Tejun Heo wrote:
> >  I posted a patch to disable
> > disable flush dependency checks on those workqueues and there's a
> > outreachy project to weed out the users of the old interface, so
> > hopefully this won't be an issue soon.
> 
> Will that same project review all workqueue users for the strict per-cpu
> stuff, so we can finally kill that weird stuff you do on hotplug?

Unfortunately not.  We do want to distinguish cpu-affine for
correctness and as an optimization; however, making that distinction
is unlikely to make the dynamic worker affinity binding go away.  We
can't forcifully shut down workers which are executing work items
which are affine as an optimization when the CPU goes down.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues
  2016-01-29 10:59                   ` [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues Tejun Heo
  2016-01-29 15:07                     ` Thierry Reding
@ 2016-01-29 18:32                     ` Tejun Heo
       [not found]                     ` <20160129105946.GJ32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
  2 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2016-01-29 18:32 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Peter Zijlstra, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel, kernel-team, Jon Hunter, linux-tegra

On Fri, Jan 29, 2016 at 05:59:46AM -0500, Tejun Heo wrote:
> fca839c00a12 ("workqueue: warn if memory reclaim tries to flush
> !WQ_MEM_RECLAIM workqueue") implemented flush dependency warning which
> triggers if a PF_MEMALLOC task or WQ_MEM_RECLAIM workqueue tries to
> flush a !WQ_MEM_RECLAIM workquee.
> 
> This assumes that workqueues marked with WQ_MEM_RECLAIM sit in memory
> reclaim path and making it depend on something which may need more
> memory to make forward progress can lead to deadlocks.  Unfortunately,
> workqueues created with the legacy create*_workqueue() interface
> always have WQ_MEM_RECLAIM regardless of whether they are depended
> upon memory reclaim or not.  These spurious WQ_MEM_RECLAIM markings
> cause spurious triggering of the flush dependency checks.
...

Applied to wq/for-4.5-fixes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues
       [not found]                     ` <20160129105946.GJ32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
@ 2016-02-02  6:54                       ` Archit Taneja
  0 siblings, 0 replies; 11+ messages in thread
From: Archit Taneja @ 2016-02-02  6:54 UTC (permalink / raw)
  To: Tejun Heo, Thierry Reding
  Cc: Peter Zijlstra, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Jon Hunter, linux-tegra-u79uwXL29TY76Z2rM5mHXA



On 01/29/2016 04:29 PM, Tejun Heo wrote:
> fca839c00a12 ("workqueue: warn if memory reclaim tries to flush
> !WQ_MEM_RECLAIM workqueue") implemented flush dependency warning which
> triggers if a PF_MEMALLOC task or WQ_MEM_RECLAIM workqueue tries to
> flush a !WQ_MEM_RECLAIM workquee.
>
> This assumes that workqueues marked with WQ_MEM_RECLAIM sit in memory
> reclaim path and making it depend on something which may need more
> memory to make forward progress can lead to deadlocks.  Unfortunately,
> workqueues created with the legacy create*_workqueue() interface
> always have WQ_MEM_RECLAIM regardless of whether they are depended
> upon memory reclaim or not.  These spurious WQ_MEM_RECLAIM markings
> cause spurious triggering of the flush dependency checks.
>
>    WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
>    workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
>    ...
>    Workqueue: deferwq deferred_probe_work_func
>    [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
>    [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
>    [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
>    [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
>    [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
>    [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
>    [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
>    [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
>    [<c00f728c>] (migrate_prep) from [<c00bfbc4>] (alloc_contig_range+0xd8/0x338)
>    [<c00bfbc4>] (alloc_contig_range) from [<c00f8f18>] (cma_alloc+0xe0/0x1ac)
>    [<c00f8f18>] (cma_alloc) from [<c001cac4>] (__alloc_from_contiguous+0x38/0xd8)
>    [<c001cac4>] (__alloc_from_contiguous) from [<c001ceb4>] (__dma_alloc+0x240/0x278)
>    [<c001ceb4>] (__dma_alloc) from [<c001cf78>] (arm_dma_alloc+0x54/0x5c)
>    [<c001cf78>] (arm_dma_alloc) from [<c0355ea4>] (dmam_alloc_coherent+0xc0/0xec)
>    [<c0355ea4>] (dmam_alloc_coherent) from [<c039cc4c>] (ahci_port_start+0x150/0x1dc)
>    [<c039cc4c>] (ahci_port_start) from [<c0384734>] (ata_host_start.part.3+0xc8/0x1c8)
>    [<c0384734>] (ata_host_start.part.3) from [<c03898dc>] (ata_host_activate+0x50/0x148)
>    [<c03898dc>] (ata_host_activate) from [<c039d558>] (ahci_host_activate+0x44/0x114)
>    [<c039d558>] (ahci_host_activate) from [<c039f05c>] (ahci_platform_init_host+0x1d8/0x3c8)
>    [<c039f05c>] (ahci_platform_init_host) from [<c039e6bc>] (tegra_ahci_probe+0x448/0x4e8)
>    [<c039e6bc>] (tegra_ahci_probe) from [<c0347058>] (platform_drv_probe+0x50/0xac)
>    [<c0347058>] (platform_drv_probe) from [<c03458cc>] (driver_probe_device+0x214/0x2c0)
>    [<c03458cc>] (driver_probe_device) from [<c0343cc0>] (bus_for_each_drv+0x60/0x94)
>    [<c0343cc0>] (bus_for_each_drv) from [<c03455d8>] (__device_attach+0xb0/0x114)
>    [<c03455d8>] (__device_attach) from [<c0344ab8>] (bus_probe_device+0x84/0x8c)
>    [<c0344ab8>] (bus_probe_device) from [<c0344f48>] (deferred_probe_work_func+0x68/0x98)
>    [<c0344f48>] (deferred_probe_work_func) from [<c003b738>] (process_one_work+0x120/0x3f8)
>    [<c003b738>] (process_one_work) from [<c003ba48>] (worker_thread+0x38/0x55c)
>    [<c003ba48>] (worker_thread) from [<c0040f14>] (kthread+0xdc/0xf4)
>    [<c0040f14>] (kthread) from [<c000f778>] (ret_from_fork+0x14/0x3c)
>
> Fix it by marking workqueues created via create*_workqueue() with
> __WQ_LEGACY and disabling flush dependency checks on them.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Reported-by: Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Link: http://lkml.kernel.org/g/20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org
> ---
> Hello, Thierry.
>
> Can youp please verify the fix?

This fixes a similar backtrace observed when the drm/msm driver
tries to allocate a vram buffer via cma.

Thanks,
Archit

>
> Thanks.
>
>   include/linux/workqueue.h |    9 +++++----
>   kernel/workqueue.c        |    3 ++-
>   2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 0e32bc7..ca73c50 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -311,6 +311,7 @@ enum {
>
>   	__WQ_DRAINING		= 1 << 16, /* internal: workqueue is draining */
>   	__WQ_ORDERED		= 1 << 17, /* internal: workqueue is ordered */
> +	__WQ_LEGACY		= 1 << 18, /* internal: create*_workqueue() */
>
>   	WQ_MAX_ACTIVE		= 512,	  /* I like 512, better ideas? */
>   	WQ_MAX_UNBOUND_PER_CPU	= 4,	  /* 4 * #cpus for unbound wq */
> @@ -411,12 +412,12 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
>   	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
>
>   #define create_workqueue(name)						\
> -	alloc_workqueue("%s", WQ_MEM_RECLAIM, 1, (name))
> +	alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
>   #define create_freezable_workqueue(name)				\
> -	alloc_workqueue("%s", WQ_FREEZABLE | WQ_UNBOUND | WQ_MEM_RECLAIM, \
> -			1, (name))
> +	alloc_workqueue("%s", __WQ_LEGACY | WQ_FREEZABLE | WQ_UNBOUND |	\
> +			WQ_MEM_RECLAIM, 1, (name))
>   #define create_singlethread_workqueue(name)				\
> -	alloc_ordered_workqueue("%s", WQ_MEM_RECLAIM, name)
> +	alloc_ordered_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, name)
>
>   extern void destroy_workqueue(struct workqueue_struct *wq);
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 61a0264..dc7faad 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2355,7 +2355,8 @@ static void check_flush_dependency(struct workqueue_struct *target_wq,
>   	WARN_ONCE(current->flags & PF_MEMALLOC,
>   		  "workqueue: PF_MEMALLOC task %d(%s) is flushing !WQ_MEM_RECLAIM %s:%pf",
>   		  current->pid, current->comm, target_wq->name, target_func);
> -	WARN_ONCE(worker && (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
> +	WARN_ONCE(worker && ((worker->current_pwq->wq->flags &
> +			      (WQ_MEM_RECLAIM | __WQ_LEGACY)) == WQ_MEM_RECLAIM),
>   		  "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s:%pf",
>   		  worker->current_pwq->wq->name, worker->current_func,
>   		  target_wq->name, target_func);
>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-02-02  6:54 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20151203002810.GJ19878@mtj.duckdns.org>
     [not found] ` <20151203093350.GP17308@twins.programming.kicks-ass.net>
     [not found]   ` <20151203100018.GO11639@twins.programming.kicks-ass.net>
     [not found]     ` <20151203144811.GA27463@mtj.duckdns.org>
     [not found]       ` <20151203150442.GR17308@twins.programming.kicks-ass.net>
     [not found]         ` <20151203150604.GC27463@mtj.duckdns.org>
     [not found]           ` <20151203192616.GJ27463@mtj.duckdns.org>
     [not found]             ` <20151203192616.GJ27463-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2016-01-26 17:38               ` [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue Thierry Reding
     [not found]                 ` <20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2016-01-28 10:12                   ` Peter Zijlstra
     [not found]                     ` <20160128101210.GC6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
2016-01-28 12:47                       ` Thierry Reding
2016-01-28 12:48                         ` Thierry Reding
2016-01-29 11:09                       ` Tejun Heo
     [not found]                         ` <20160129110941.GK32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2016-01-29 15:17                           ` Peter Zijlstra
2016-01-29 18:28                             ` Tejun Heo
2016-01-29 10:59                   ` [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues Tejun Heo
2016-01-29 15:07                     ` Thierry Reding
2016-01-29 18:32                     ` Tejun Heo
     [not found]                     ` <20160129105946.GJ32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2016-02-02  6:54                       ` Archit Taneja

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).