All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Ulrich Obergfell
	<uobergfe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	kernel-team-b10kYP2dOMg@public.gmane.org,
	Jon Hunter <jonathanh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org
Subject: Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
Date: Thu, 28 Jan 2016 13:47:00 +0100	[thread overview]
Message-ID: <20160128124700.GA26897@ulmo> (raw)
In-Reply-To: <20160128101210.GC6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 3192 bytes --]

On Thu, Jan 28, 2016 at 11:12:10AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 26, 2016 at 06:38:43PM +0100, Thierry Reding wrote:
> > > Task or work item involved in memory reclaim trying to flush a
> > > non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> > > deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> > I've started noticing the following during boot on some of the devices I
> > work with:
> > 
> > [    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
> > [    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
> > [    4.748099] Modules linked in:
> > [    4.751342] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.5.0-rc1-00018-g420fc292d9c7 #1
> > [    4.759504] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > [    4.765762] Workqueue: deferwq deferred_probe_work_func
> > [    4.771004] [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
> > [    4.778746] [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
> > [    4.785966] [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
> > [    4.794048] [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
> > [    4.802736] [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
> > [    4.811769] [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
> > [    4.820112] [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
> > [    4.828110] [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
> 
> Right, also, I think it makes sense to do lru_add_drain_all() from a
> WQ_MEM_RECLAIM workqueue, it is, after all, aiding in getting memory
> freed.
> 
> Does something like the below cure things?
> 
> TJ does this make sense to you?
> 
> ---
>  mm/swap.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 09fe5e97714a..a3de016b2a9d 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -666,6 +666,15 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> +static struct workqueue_struct *lru_wq;
> +
> +static int __init lru_init(void)
> +{
> +	lru_wq = create_workqueue("lru");
> +	return 0;
> +}
> +early_initcall(lru_init);
> +
>  void lru_add_drain_all(void)
>  {
>  	static DEFINE_MUTEX(lock);
> @@ -685,7 +694,7 @@ void lru_add_drain_all(void)
>  		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
>  		    need_activate_page_drain(cpu)) {
>  			INIT_WORK(work, lru_add_drain_per_cpu);
> -			schedule_work_on(cpu, work);
> +			queue_work_on(cpu, &lru_wq, work);
                                           ^

This ampersand is too much here and causes a compile-time warning.
Removing it and booting the resulting kernel doesn't trigger the
WQ_MEM_RECLAIM warning anymore, though.

Tested on top of next-20160128.

Thanks,
Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Thierry Reding <thierry.reding@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>, Ulrich Obergfell <uobergfe@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, kernel-team@fb.com,
	Jon Hunter <jonathanh@nvidia.com>,
	linux-tegra@vger.kernel.org, rmk+kernel@arm.linux.org.uk
Subject: Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
Date: Thu, 28 Jan 2016 13:47:00 +0100	[thread overview]
Message-ID: <20160128124700.GA26897@ulmo> (raw)
In-Reply-To: <20160128101210.GC6357@twins.programming.kicks-ass.net>

[-- Attachment #1: Type: text/plain, Size: 3192 bytes --]

On Thu, Jan 28, 2016 at 11:12:10AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 26, 2016 at 06:38:43PM +0100, Thierry Reding wrote:
> > > Task or work item involved in memory reclaim trying to flush a
> > > non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to
> > > deadlock.  Trigger WARN_ONCE() if such conditions are detected.
> > I've started noticing the following during boot on some of the devices I
> > work with:
> > 
> > [    4.723705] WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
> > [    4.736818] workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
> > [    4.748099] Modules linked in:
> > [    4.751342] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.5.0-rc1-00018-g420fc292d9c7 #1
> > [    4.759504] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > [    4.765762] Workqueue: deferwq deferred_probe_work_func
> > [    4.771004] [<c0017acc>] (unwind_backtrace) from [<c0013134>] (show_stack+0x10/0x14)
> > [    4.778746] [<c0013134>] (show_stack) from [<c0245f18>] (dump_stack+0x94/0xd4)
> > [    4.785966] [<c0245f18>] (dump_stack) from [<c0026f9c>] (warn_slowpath_common+0x80/0xb0)
> > [    4.794048] [<c0026f9c>] (warn_slowpath_common) from [<c0026ffc>] (warn_slowpath_fmt+0x30/0x40)
> > [    4.802736] [<c0026ffc>] (warn_slowpath_fmt) from [<c00390b8>] (check_flush_dependency+0x138/0x144)
> > [    4.811769] [<c00390b8>] (check_flush_dependency) from [<c0039ca0>] (flush_work+0x50/0x15c)
> > [    4.820112] [<c0039ca0>] (flush_work) from [<c00c51b0>] (lru_add_drain_all+0x130/0x180)
> > [    4.828110] [<c00c51b0>] (lru_add_drain_all) from [<c00f728c>] (migrate_prep+0x8/0x10)
> 
> Right, also, I think it makes sense to do lru_add_drain_all() from a
> WQ_MEM_RECLAIM workqueue, it is, after all, aiding in getting memory
> freed.
> 
> Does something like the below cure things?
> 
> TJ does this make sense to you?
> 
> ---
>  mm/swap.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 09fe5e97714a..a3de016b2a9d 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -666,6 +666,15 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> +static struct workqueue_struct *lru_wq;
> +
> +static int __init lru_init(void)
> +{
> +	lru_wq = create_workqueue("lru");
> +	return 0;
> +}
> +early_initcall(lru_init);
> +
>  void lru_add_drain_all(void)
>  {
>  	static DEFINE_MUTEX(lock);
> @@ -685,7 +694,7 @@ void lru_add_drain_all(void)
>  		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
>  		    need_activate_page_drain(cpu)) {
>  			INIT_WORK(work, lru_add_drain_per_cpu);
> -			schedule_work_on(cpu, work);
> +			queue_work_on(cpu, &lru_wq, work);
                                           ^

This ampersand is too much here and causes a compile-time warning.
Removing it and booting the resulting kernel doesn't trigger the
WQ_MEM_RECLAIM warning anymore, though.

Tested on top of next-20160128.

Thanks,
Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  parent reply	other threads:[~2016-01-28 12:47 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-03  0:28 [PATCH 1/2] watchdog: introduce touch_softlockup_watchdog_sched() Tejun Heo
2015-12-03  0:28 ` [PATCH 2/2] workqueue: implement lockup detector Tejun Heo
2015-12-03 14:49   ` Tejun Heo
2015-12-03 17:50   ` Don Zickus
2015-12-03 19:43     ` Tejun Heo
2015-12-03 20:12       ` Ulrich Obergfell
2015-12-03 20:54         ` Tejun Heo
2015-12-04  8:02           ` Ingo Molnar
2015-12-04 16:52             ` Don Zickus
2015-12-04 13:19           ` Ulrich Obergfell
2015-12-07 19:06   ` [PATCH v2 " Tejun Heo
2015-12-07 21:38     ` Don Zickus
2015-12-07 21:39       ` Tejun Heo
2015-12-08 16:00         ` Don Zickus
2015-12-08 16:31           ` Tejun Heo
2015-12-03  9:33 ` [PATCH 1/2] watchdog: introduce touch_softlockup_watchdog_sched() Peter Zijlstra
2015-12-03 10:00   ` Peter Zijlstra
2015-12-03 14:48     ` Tejun Heo
2015-12-03 15:04       ` Peter Zijlstra
2015-12-03 15:06         ` Tejun Heo
2015-12-03 19:26           ` [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue Tejun Heo
2015-12-03 20:43             ` Peter Zijlstra
2015-12-03 20:56               ` Tejun Heo
2015-12-03 21:09                 ` Peter Zijlstra
2015-12-03 22:04                   ` Tejun Heo
2015-12-04 12:51                     ` Peter Zijlstra
2015-12-07 15:58             ` Tejun Heo
     [not found]             ` <20151203192616.GJ27463-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2016-01-26 17:38               ` Thierry Reding
2016-01-26 17:38                 ` Thierry Reding
     [not found]                 ` <20160126173843.GA11115-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2016-01-28 10:12                   ` Peter Zijlstra
2016-01-28 10:12                     ` Peter Zijlstra
     [not found]                     ` <20160128101210.GC6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
2016-01-28 12:47                       ` Thierry Reding [this message]
2016-01-28 12:47                         ` Thierry Reding
2016-01-28 12:48                         ` Thierry Reding
2016-01-28 12:48                           ` Thierry Reding
2016-01-29 11:09                       ` Tejun Heo
2016-01-29 11:09                         ` Tejun Heo
2016-01-29 11:09                         ` Tejun Heo
     [not found]                         ` <20160129110941.GK32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2016-01-29 15:17                           ` Peter Zijlstra
2016-01-29 15:17                             ` Peter Zijlstra
2016-01-29 15:17                             ` Peter Zijlstra
2016-01-29 18:28                             ` Tejun Heo
2016-01-29 18:28                               ` Tejun Heo
2016-01-29 10:59                   ` [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency checks for legacy workqueues Tejun Heo
2016-01-29 10:59                     ` Tejun Heo
2016-01-29 15:07                     ` Thierry Reding
2016-01-29 18:32                     ` Tejun Heo
     [not found]                     ` <20160129105946.GJ32380-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2016-02-02  6:54                       ` Archit Taneja
2016-02-02  6:54                         ` Archit Taneja
2016-03-10 15:12             ` [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue Adrian Hunter
2016-03-11 17:52               ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160128124700.GA26897@ulmo \
    --to=thierry.reding-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=jonathanh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org \
    --cc=kernel-team-b10kYP2dOMg@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org \
    --cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=uobergfe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.