From: Andrew Morton <akpm@linux-foundation.org>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: linux-mm@kvack.org, Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@redhat.com>,
Michal Hocko <mhocko@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
Date: Sat, 25 Oct 2025 21:40:07 -0700
Message-ID: <20251025214007.736d659ee266a416c40aa6e5@linux-foundation.org>
In-Reply-To: <20251024022711.382238-1-jiayuan.chen@linux.dev>

On Fri, 24 Oct 2025 10:27:11 +0800 Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> We encountered a scenario where direct memory reclaim was triggered,
> leading to increased system latency:

Who is "we", if I may ask?

> 1. The memory.low values set on host pods are actually quite large, some
> pods are set to 10GB, others to 20GB, etc.
> 2. Since most pods have memory protection configured, each time kswapd is
> woken up, if a pod's memory usage hasn't exceeded its own memory.low,
> its memory won't be reclaimed.
> 3. When applications start up, rapidly consume memory, or experience
> network traffic bursts, the kernel reaches steal_suitable_fallback(),
> which sets watermark_boost and subsequently wakes kswapd.
> 4. In the core logic of the kswapd thread (balance_pgdat()), when reclaim is
> triggered by watermark_boost, sc.priority is not lowered below
> DEF_PRIORITY - 2 (10). Higher priority values mean less aggressive LRU
> scanning, which can result in no pages being reclaimed during a single
> scan cycle:
>
> 	if (nr_boost_reclaim && sc.priority == DEF_PRIORITY - 2)
> 		raise_priority = false;
>
> 5. This eventually causes pgdat->kswapd_failures to continuously
> accumulate, exceeding MAX_RECLAIM_RETRIES, and consequently kswapd stops
> working. At this point, the system's available memory is still
> significantly above the high watermark — it's inappropriate for kswapd
> to stop under these conditions.
>
> The final observable issue is that a brief period of rapid memory
> allocation causes kswapd to stop running, ultimately triggering direct
> reclaim and making the applications unresponsive.
>

This logic appears to be at least eight years old. Can you suggest why
this issue is being observed after so much time?

>
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -7128,7 +7128,12 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
>  		goto restart;
>  	}
>  
> -	if (!sc.nr_reclaimed)
> +	/*
> +	 * If the reclaim was boosted, we might still be far from the
> +	 * watermark_high at this point. We need to avoid increasing the
> +	 * failure count to prevent the kswapd thread from stopping.
> +	 */
> +	if (!sc.nr_reclaimed && !boosted)
>  		atomic_inc(&pgdat->kswapd_failures);
>

Thanks, I'll toss it in for testing and shall await reviewer input
before proceeding further.