From: Johannes Weiner <hannes@cmpxchg.org>
To: Zi Yan <ziy@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Brendan Jackman <jackmanb@google.com>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <liam@infradead.org>,
	Mike Rapoport <rppt@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] mm: page_alloc: fix non-movable reclaim storm in defrag_mode
Date: Fri, 26 Jun 2026 14:43:29 -0400
Message-ID: <aj7IUVdXgqDG6w34@cmpxchg.org>
In-Reply-To: <DJJ707PKYUZS.3VQ58AKYG0XNP@nvidia.com>

On Fri, Jun 26, 2026 at 02:29:24PM -0400, Zi Yan wrote:
> On Fri Jun 26, 2026 at 2:21 PM EDT, Johannes Weiner wrote:
> > As we deployed defrag_mode into Meta production, pressure spikes and
> > excessive swapping were observed on some workloads. Tracing confirmed
> > that this is unmovable/reclaimable requests spinning in the allocator
> > and direct reclaim, causing excessive amounts of swap.
> >
> > The initial plan for defrag_mode was to rely on kswapd/kcompactd to
> > produce blocks, and if those are overwhelmed under high pressure, let
> > the allocator fall back (__rmqueue_steal()) after its retry loops.
> > However, that retrying results in more reclaim on some of these
> > workloads than we'd hoped, sometimes excessively so, spurred on by the
> > !costly order conditions in should_reclaim_retry().
> >
> > The storms are dependent on the request type. Reclaim will inevitably
> > make room in existing movable blocks, since that's where the LRU pages
> > live. So if movable requests retry on reclaim, they make progress.
> >
> > When non-movable requests spin in reclaim, that isn't productive. They
> > cannot use the individually freed pages, and the process is unlikely
> > to accidentally free whole blocks to meet the ALLOC_NOFRAGMENT bar.
> > They spin and overreclaim excessively, which tanks performance and
> > triggers userspace guards like swap exhaustion or pressure based OOM.
> >
> > To fix this, send non-movable requests, regardless of order, into
> > pageblock reclaim/compaction. This way, they help move things along to
> > meet the ALLOC_NOFRAGMENT bar. After this patch, the reclaim storms
> > and excess OOM rates are no longer observed in production.
> >
> > The longer-term plan is still to have all requests, including the
> > movable ones, help make blocks to spread the cost of defragmenting
> > more evenly and fairly; combined with proper watermarking to reduce
> > allocation latencies in the common case. However, doing this naively
> > unearths scaling and concurrency limitations in compaction that need
> > to be addressed first. Promoting just non-movables for now is the
> > minimally viable bug fix for the above issue.
> >
> > Fixes: f38356df6474 ("mm: page_alloc: introduce defrag_mode")
> 
> Should be
> Fixes: e3aa7df331bc ("mm: page_alloc: defrag_mode").
> Since I cannot find f38356df6474 in the tree.

Oops, indeed. I managed to pull that commit from the old development
branch I still had locally.

