Re: [Question] Should direct reclaim time be bounded?

From: Mel Gorman <mgorman@suse.de>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [Question] Should direct reclaim time be bounded?
Date: Mon, 1 Jul 2019 09:59:20 +0100
Message-ID: <20190701085920.GB2812@suse.de>
In-Reply-To: <dede2f84-90bf-347a-2a17-fb6b521bf573@oracle.com>

On Fri, Jun 28, 2019 at 11:20:42AM -0700, Mike Kravetz wrote:
> On 4/24/19 7:35 AM, Vlastimil Babka wrote:
> > On 4/23/19 6:39 PM, Mike Kravetz wrote:
> >>> That being said, I do not think __GFP_RETRY_MAYFAIL is wrong here. It
> >>> looks like there is something wrong in the reclaim going on.
> >>
> >> Ok, I will start digging into that.  Just wanted to make sure before I got
> >> into it too deep.
> >>
> >> BTW - This is very easy to reproduce.  Just try to allocate more huge pages
> >> than will fit into memory.  I see this 'reclaim taking forever' behavior on
> >> v5.1-rc5-mmotm-2019-04-19-14-53.  Looks like it was there in v5.0 as well.
> > 
> > I'd suspect this in should_continue_reclaim():
> > 
> >         /* Consider stopping depending on scan and reclaim activity */
> >         if (sc->gfp_mask & __GFP_RETRY_MAYFAIL) {
> >                 /*
> >                  * For __GFP_RETRY_MAYFAIL allocations, stop reclaiming if the
> >                  * full LRU list has been scanned and we are still failing
> >                  * to reclaim pages. This full LRU scan is potentially
> >                  * expensive but a __GFP_RETRY_MAYFAIL caller really wants to succeed
> >                  */
> >                 if (!nr_reclaimed && !nr_scanned)
> >                         return false;
> > 
> > And that for some reason, nr_scanned never becomes zero. But it's hard
> > to figure out through all the layers of functions :/
> 
> I got back to looking into the direct reclaim/compaction stalls when
> trying to allocate huge pages.  As previously mentioned, the code is
> looping for a long time in shrink_node().  The routine
> should_continue_reclaim() returns true perhaps more often than it should.
> 
> As Vlastimil guessed, my debug code output below shows nr_scanned is remaining
> non-zero for quite a while.  This was on v5.2-rc6.
> 

I think it would be reasonable to have should_continue_reclaim() allow an
exit when all of the following hold: reclaim is scanning at a higher
priority than DEF_PRIORITY - 2, nr_scanned is less than SWAP_CLUSTER_MAX,
and no pages are being reclaimed.
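
As a rough illustration only, the condition could be sketched in user-space
C as below. This is not a kernel patch. The struct and the helper name
may_stop_reclaim_early() are hypothetical, and it assumes that "higher
priority" means a numerically smaller sc->priority, since reclaim starts at
DEF_PRIORITY and counts down as it gets more aggressive. The two constants
use the kernel's values (12 and 32).

```c
#include <stdbool.h>

/* Same values the kernel uses for these constants. */
#define DEF_PRIORITY      12
#define SWAP_CLUSTER_MAX  32UL

/* Hypothetical, trimmed-down stand-in for the kernel's struct scan_control. */
struct scan_control_sketch {
	int priority;	/* starts at DEF_PRIORITY, decreases as reclaim escalates */
};

/*
 * Hypothetical helper: returns true when the suggested early-exit
 * condition holds, i.e. should_continue_reclaim() could return false.
 *
 * Assumption: "higher priority than DEF_PRIORITY - 2" is read as
 * sc->priority < DEF_PRIORITY - 2.
 */
static bool may_stop_reclaim_early(const struct scan_control_sketch *sc,
				   unsigned long nr_reclaimed,
				   unsigned long nr_scanned)
{
	/* Still at a relaxed priority: keep the existing behaviour. */
	if (sc->priority >= DEF_PRIORITY - 2)
		return false;

	/* Progress is being made, so continuing is justified. */
	if (nr_reclaimed)
		return false;

	/* Nothing reclaimed and only a trickle of pages scanned: give up. */
	return nr_scanned < SWAP_CLUSTER_MAX;
}
```

With sc->priority at 9, nr_reclaimed at 0 and nr_scanned at 10 the helper
allows the exit. It does not at priority 10 or above, when anything was
reclaimed, or once nr_scanned reaches SWAP_CLUSTER_MAX.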

-- 
Mel Gorman
SUSE Labs
