Re: [RFC PATCH 0/6] proactive kcompactd

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Johannes Weiner <hannes@cmpxchg.org>
To: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Rik van Riel <riel@redhat.com>
Subject: Re: [RFC PATCH 0/6] proactive kcompactd
Date: Mon, 21 Aug 2017 10:10:14 -0400	[thread overview]
Message-ID: <20170821141014.GC1371@cmpxchg.org> (raw)
In-Reply-To: <alpine.DEB.2.10.1708091353500.1218@chino.kir.corp.google.com>

On Wed, Aug 09, 2017 at 01:58:42PM -0700, David Rientjes wrote:
> On Thu, 27 Jul 2017, Vlastimil Babka wrote:
> 
> > As we discussed at last LSF/MM [1], the goal here is to shift more compaction
> > work to kcompactd, which currently just makes a single high-order page
> > available and then goes to sleep. The last patch, evolved from the initial RFC
> > [2] does this by recording for each order > 0 how many allocations would have
> > potentially be able to skip direct compaction, if the memory wasn't fragmented.
> > Kcompactd then tries to compact as long as it takes to make that many
> > allocations satisfiable. This approach avoids any hooks in allocator fast
> > paths. There are more details to this, see the last patch.
> > 
> 
> I think I would have liked to have seen "less proactive" :)
> 
> Kcompactd currently has the problem that it is MIGRATE_SYNC_LIGHT so it 
> continues until it can defragment memory.  On a host with 128GB of memory 
> and 100GB of it sitting in a hugetlb pool, we constantly get kcompactd 
> wakeups for order-2 memory allocation.  The stats are pretty bad:
> 
> compact_migrate_scanned 2931254031294 
> compact_free_scanned    102707804816705 
> compact_isolated        1309145254 
> 
> 0.0012% of memory scanned is ever actually isolated.  We constantly see 
> very high cpu for compaction_alloc() because kcompactd is almost always 
> running in the background and iterating most memory completely needlessly 
> (define needless as 0.0012% of memory scanned being isolated).

The free page scanner will inevitably wade through mostly used memory,
but 0.0012% is lower than what systems usually have free. I'm guessing
this is because of concurrent allocation & free cycles racing with the
scanner? There could also be an issue with how we do partial scans.

Anyway, we've also noticed scalability issues with the current scanner
on 128G and 256G machines. Even with a better efficiency - finding the
1% of free memory, that's still a ton of linear search space.

I've been toying around with the below patch. It adds a free page
bitmap, allowing the free scanner to quickly skip over the vast areas
of used memory. I don't have good data on skip-efficiency at higher
uptimes and the resulting fragmentation yet. The overhead added to the
page allocator is concerning, but I cannot think of a better way to
make the search more efficient. What do you guys think?

---

next prev parent reply	other threads:[~2017-08-21 14:10 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-27 16:06 [RFC PATCH 0/6] proactive kcompactd Vlastimil Babka
2017-07-27 16:06 ` [PATCH 1/6] mm, kswapd: refactor kswapd_try_to_sleep() Vlastimil Babka
2017-07-28  9:38   ` Mel Gorman
2017-07-27 16:06 ` [PATCH 2/6] mm, kswapd: don't reset kswapd_order prematurely Vlastimil Babka
2017-07-28 10:16   ` Mel Gorman
2017-07-27 16:06 ` [PATCH 3/6] mm, kswapd: reset kswapd's order to 0 when it fails to reclaim enough Vlastimil Babka
2017-07-27 16:06 ` [PATCH 4/6] mm, kswapd: wake up kcompactd when kswapd had too many failures Vlastimil Babka
2017-07-28 10:41   ` Mel Gorman
2017-07-27 16:07 ` [RFC PATCH 5/6] mm, compaction: stop when number of free pages goes below watermark Vlastimil Babka
2017-07-28 10:43   ` Mel Gorman
2017-07-27 16:07 ` [RFC PATCH 6/6] mm: make kcompactd more proactive Vlastimil Babka
2017-07-28 10:58   ` Mel Gorman
2017-08-09 20:58 ` [RFC PATCH 0/6] proactive kcompactd David Rientjes
2017-08-21 14:10   ` Johannes Weiner [this message]
2017-08-21 21:40     ` Rik van Riel
2017-08-22 20:57     ` David Rientjes
2017-08-23  5:36     ` Joonsoo Kim
2017-08-23  8:12       ` Vlastimil Babka
2017-08-24  6:24         ` Joonsoo Kim
2017-08-24 11:30           ` Vlastimil Babka
2017-08-24 23:51             ` Joonsoo Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170821141014.GC1371@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=aarcange@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).