From: Michal Hocko <mhocko@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, Mel Gorman <mgorman@suse.de>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] mm: move pcp and lru-pcp draining into vmstat_wq
Date: Wed, 8 Feb 2017 12:56:10 +0100	[thread overview]
Message-ID: <20170208115609.GH5686@dhcp22.suse.cz> (raw)
In-Reply-To: <378bf4f0-e16d-a68f-6c91-a05cda47991e@suse.cz>

On Wed 08-02-17 11:49:11, Vlastimil Babka wrote:
> On 02/07/2017 10:09 PM, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > We currently have two dedicated WQ_RECLAIM workqueues: vmstat_wq for
> > updating the pcp vmstat counters and a separate one for draining the
> > per cpu lru caches. That is more than necessary, because both can run
> > on a single workqueue: neither blocks on locks requiring a memory
> > allocation nor performs any allocations itself. Sharing one workqueue
> > saves a rescuer thread.
> > 
> > On the other hand, drain_all_pages queues its work on the system wq,
> > which has no rescuer, so the draining depends on memory allocation
> > (when all workers are stuck allocating and new ones cannot be
> > created). This is not critical, because somebody (e.g. the forking
> > worker) should eventually invoke the OOM killer, get the situation
> > unstuck, and allow the draining to proceed, but it is annoying
> > nevertheless. This work should use WQ_RECLAIM as well, and we can
> > reuse the same workqueue as for lru draining and vmstat.
> > 
> > Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> > 
> > Hi,
> > Tetsuo has noted that drain_all_pages doesn't use WQ_RECLAIM [1]
> > and asked whether we can move the work over to vmstat_wq, which is
> > WQ_RECLAIM. I think the deadlock he described shouldn't happen, but
> > having the rescuer would still be better. I also don't think we
> > really need two separate workqueues, so this patch pulls the lru
> > draining in as well.
> > 
> > What do you think? Please note I haven't tested it yet.
> 
> Why not, I guess, though of course I may be overlooking some subtlety.
> You could have CC'd Tejun and Christoph.

Will do on the next submission.

> Watch out for the init order though, maybe? Is there no caller of the
> lru/pcp drain before module_init(setup_vmstat) happens?

Hard to tell. I expect there shouldn't be, but I will add
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c0a7c38cd91..73018f07bcc9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2370,6 +2370,13 @@ void drain_all_pages(struct zone *zone)
 	 */
 	static cpumask_t cpus_with_pcps;
 
+	/*
+	 * Make sure nobody triggers this path before vmstat_wq is fully
+	 * initialized.
+	 */
+	if (WARN_ON(!vmstat_wq))
+		return;
+
 	/* Workqueues cannot recurse */
 	if (current->flags & PF_WQ_WORKER)
 		return;
diff --git a/mm/swap.c b/mm/swap.c
index 23f09d6dd212..39c240fc9d48 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -676,6 +676,13 @@ void lru_add_drain_all(void)
 	static struct cpumask has_work;
 	int cpu;
 
+	/*
+	 * Make sure nobody triggers this path before vmstat_wq is fully
+	 * initialized.
+	 */
+	if (WARN_ON(!vmstat_wq))
+		return;
+
 	mutex_lock(&lock);
 	get_online_cpus();
 	cpumask_clear(&has_work);

to be sure.
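
For reference, here is a minimal sketch of the pattern this relies on
(example_init, do_drain and pcp_drain_work are made-up names for
illustration, not the identifiers used in the patch): WQ_MEM_RECLAIM
gives the workqueue a rescuer thread, so work queued on it can still
make progress when every kworker is stuck in an allocation and no new
ones can be forked.

#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/percpu.h>
#include <linux/workqueue.h>

static struct workqueue_struct *vmstat_wq;

/* one work item per CPU so each CPU drains its own caches */
static DEFINE_PER_CPU(struct work_struct, pcp_drain_work);

static void do_drain(struct work_struct *work)
{
	/* the per-cpu pcp/lru draining would happen here */
}

static int __init example_init(void)
{
	int cpu;

	/*
	 * WQ_MEM_RECLAIM guarantees a rescuer thread so that queued work
	 * can make forward progress even under memory pressure.
	 */
	vmstat_wq = alloc_workqueue("vmstat", WQ_FREEZABLE | WQ_MEM_RECLAIM, 0);
	if (!vmstat_wq)
		return -ENOMEM;

	get_online_cpus();
	for_each_online_cpu(cpu) {
		struct work_struct *work = &per_cpu(pcp_drain_work, cpu);

		INIT_WORK(work, do_drain);
		queue_work_on(cpu, vmstat_wq, work);
	}
	for_each_online_cpu(cpu)
		flush_work(&per_cpu(pcp_drain_work, cpu));
	put_online_cpus();

	return 0;
}
module_init(example_init);

The real callers obviously hook this into drain_all_pages and
lru_add_drain_all rather than a module init; the above is only meant to
show the queueing pattern on a shared WQ_MEM_RECLAIM workqueue.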
[...]

> > @@ -1763,9 +1762,11 @@ static int vmstat_cpu_dead(unsigned int cpu)
> > 
> >  static int __init setup_vmstat(void)
> >  {
> > -#ifdef CONFIG_SMP
> > -	int ret;
> > +	int ret = 0;
> > +
> > +	vmstat_wq = alloc_workqueue("vmstat", WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
> 
> Did you want to set ret to -ENOMEM if the alloc fails, or something?
> Otherwise I don't see the point of these changes.

No, I just didn't want to get a defined-but-not-used warning for
CONFIG_SMP=n.
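
To make that concrete, here is a made-up reduction of the pattern
(setup_example and example_wq are illustrative names, not the real
mm/vmstat.c code): with CONFIG_SMP=n the #ifdef block compiles away, so
an unconditional bare "int ret;" would be defined but never used, while
initializing it and returning it on the common path keeps both
configurations warning free.

#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *example_wq;

static int __init setup_example(void)
{
	/* initialized so the final return consumes it even when the
	 * CONFIG_SMP block below is compiled out */
	int ret = 0;

	example_wq = alloc_workqueue("example", WQ_FREEZABLE | WQ_MEM_RECLAIM, 0);
#ifdef CONFIG_SMP
	/* stands in for the cpuhp_setup_state_nocalls() call in the patch */
	if (!example_wq)
		ret = -ENOMEM;
#endif
	return ret;
}
module_init(setup_example);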

> 
> > 
> > +#ifdef CONFIG_SMP
> >  	ret = cpuhp_setup_state_nocalls(CPUHP_MM_VMSTAT_DEAD, "mm/vmstat:dead",
> >  					NULL, vmstat_cpu_dead);
> >  	if (ret < 0)
> > @@ -1789,7 +1790,7 @@ static int __init setup_vmstat(void)
> >  	proc_create("vmstat", S_IRUGO, NULL, &proc_vmstat_file_operations);
> >  	proc_create("zoneinfo", S_IRUGO, NULL, &proc_zoneinfo_file_operations);
> >  #endif
> > -	return 0;
> > +	return ret;
> >  }
> >  module_init(setup_vmstat)
> > 
> > 

-- 
Michal Hocko
SUSE Labs

Thread overview: 9+ messages
2017-02-07 21:09 [RFC PATCH] mm: move pcp and lru-pcp draining into vmstat_wq Michal Hocko
2017-02-08 10:49 ` Vlastimil Babka
2017-02-08 11:56   ` Michal Hocko [this message]
2017-02-08 10:53 ` Mel Gorman
2017-02-08 12:03   ` Michal Hocko
2017-02-08 12:31     ` Mel Gorman
2017-02-08 12:44       ` Tetsuo Handa
2017-02-08 12:56         ` Michal Hocko
2017-02-08 12:50       ` Michal Hocko
