From: Johannes Weiner <hannes@cmpxchg.org>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Brendan Jackman <jackmanb@google.com>, Zi Yan <ziy@nvidia.com>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <liam@infradead.org>,
	Mike Rapoport <rppt@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] mm: page_alloc: fix non-movable reclaim storm in defrag_mode
Date: Wed, 1 Jul 2026 17:02:02 -0400
Message-ID: <akWASvO97r-AG0Ow@cmpxchg.org>
In-Reply-To: <3f013cf5-008a-4207-85ce-d6f7c0296d99@kernel.org>

On Wed, Jul 01, 2026 at 08:06:03PM +0200, Vlastimil Babka (SUSE) wrote:
> On 6/26/26 20:21, Johannes Weiner wrote:
> > As we deployed defrag_mode into Meta production, pressure spikes and
> > excessive swapping were observed on some workloads. Tracing confirmed
> > that this is unmovable/reclaimable requests spinning in the allocator
> > and direct reclaim, causing excessive amounts of swap.
> > 
> > The initial plan for defrag_mode was to rely on kswapd/kcompactd to
> > produce blocks, and if those are overwhelmed under high pressure, let
> > the allocator fall back (__rmqueue_steal()) after its retry loops.
> > However, that retrying results in more reclaim on some of these
> > workloads than we'd hoped, sometimes excessively so, spurred on by the
> > !costly order conditions in should_reclaim_retry().
> > 
> > The storms are dependent on the request type. Reclaim will inevitably
> > make room in existing movable blocks, since that's where the LRU pages
> > live. So if movable requests retry on reclaim, they make progress.
> > 
> > When non-movable requests spin in reclaim, that isn't productive. They
> > cannot use the individually freed pages, and the process is unlikely
> > to accidentally free whole blocks to meet the ALLOC_NOFRAGMENT bar.
> > They spin and overreclaim excessively, which tanks performance and
> > triggers userspace guards like swap exhaustion or pressure based OOM.
> > 
> > To fix this, send non-movable requests, regardless of order, into
> > pageblock reclaim/compaction. This way, they help move things along to
> > meet the ALLOC_NOFRAGMENT bar. After this patch, the reclaim storms
> > and excess OOM rates are no longer observed in production.
> > 
> > The longer-term plan is still to have all requests, including the
> > movable ones, help make blocks to spread the cost of defragmenting
> > more evenly and fairly; combined with proper watermarking to reduce
> > allocation latencies in the common case. However, doing this naively
> > unearths scaling and concurrency limitations in compaction that need
> > to be addressed first. Promoting just non-movables for now is the
> > minimally viable bug fix for the above issue.
> > 
> > Fixes: f38356df6474 ("mm: page_alloc: introduce defrag_mode")
> 
> That's from 6.15. Do you intend any stable backporting, or do we just mark it
> as a heads-up for anyone who tracks fixes and might consider it?

Good point, let's Cc: stable.

I doubt there are many defrag_mode users at this point, but this is
quite the hand grenade: it already went off in production once and was
a pain to debug.
