From mboxrd@z Thu Jan 1 00:00:00 1970
X-Mailing-List: linux-kernel@vger.kernel.org
Date: Fri, 14 Nov 2025 04:17:40 +0000
From: "Jiayuan Chen"
Message-ID: <53de0b3ee0b822418e909db29bfa6513faff9d36@linux.dev>
Subject: Re: [PATCH v2] mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
To: "Shakeel Butt", "Andrew Morton"
Cc: linux-mm@kvack.org, "Andrew Morton", "Johannes Weiner",
 "David Hildenbrand", "Michal Hocko", "Qi Zheng", "Lorenzo Stoakes",
 "Axel Rasmussen", "Yuanchu Xie", "Wei Xu", linux-kernel@vger.kernel.org
In-Reply-To:
References: <20251024022711.382238-1-jiayuan.chen@linux.dev>

November 14, 2025 at 07:47, "Shakeel Butt" wrote:

[...]

> > The final observable issue is that a brief period of rapid memory
> > allocation causes kswapd to stop running, ultimately triggering direct
> > reclaim and making the applications unresponsive.
> >
> > Signed-off-by: Jiayuan Chen
> >
> Please resolve Andrew's comment and add couple of lines on boosted
> watermark increasing the chances of kswapd failures and the patch only
> targets that particular scenario, the general solution TBD in the commit
> message.
>
> With that, you can add:
>
> Reviewed-by: Shakeel Butt
>

I see this patch is already in mm-next. I'm not sure how to proceed.
Perhaps Andrew needs to do a git rebase and then reword the commit message?

But regardless, I'll reword the commit message here and please let me know
how to proceed if possible:

'''
mm/vmscan: skip increasing kswapd_failures when reclaim was boosted

We have a colocation cluster used for deploying both offline and online
services simultaneously. In this environment, we encountered a scenario
where direct memory reclamation was triggered due to kswapd not running.

1. When applications start up, rapidly consume memory, or experience
   network traffic bursts, the kernel reaches steal_suitable_fallback(),
   which sets watermark_boost and subsequently wakes kswapd.

2. In the core logic of kswapd thread (balance_pgdat()), when reclaim is
   triggered by watermark_boost, the maximum priority is 10. Higher
   priority values mean less aggressive LRU scanning, which can result in
   no pages being reclaimed during a single scan cycle:

   if (nr_boost_reclaim && sc.priority == DEF_PRIORITY - 2)
       raise_priority = false;

3. Additionally, many of our pods are configured with memory.low, which
   prevents memory reclamation in certain cgroups, further increasing the
   chance of failing to reclaim memory.

4. This eventually causes pgdat->kswapd_failures to continuously
   accumulate, exceeding MAX_RECLAIM_RETRIES, and consequently kswapd stops
   working. At this point, the system's available memory is still
   significantly above the high watermark; it's inappropriate for kswapd
   to stop under these conditions.

The final observable issue is that a brief period of rapid memory
allocation causes kswapd to stop running, ultimately triggering direct
reclaim and making the applications unresponsive.

This problem leading to direct memory reclamation has been a long-standing
issue in our production environment. We initially held the simple
assumption that it was caused by applications allocating memory too rapidly
for kswapd to keep up with reclamation.
However, after we began monitoring kswapd's runtime behavior, we
discovered a different pattern:

'''
kswapd initially exhibits very aggressive activity even when there is still
considerable free memory, but it subsequently stops running entirely, even
as memory levels approach the low watermark.
'''

In summary, both boosted watermarks and memory.low increase the probability
of kswapd operation failures.

This patch specifically addresses the scenario involving boosted watermarks
by not incrementing kswapd_failures when reclamation fails. A more general
solution, potentially addressing memory.low or other cases, requires further
discussion.

Link: https://lkml.kernel.org/r/20251024022711.382238-1-jiayuan.chen@linux.dev
Reviewed-by: Shakeel Butt
Signed-off-by: Jiayuan Chen
Cc: Axel Rasmussen
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: Lorenzo Stoakes
Cc: Michal Hocko
Cc: Qi Zheng
Cc: Shakeel Butt
Cc: Wei Xu
Cc: Yuanchu Xie
Signed-off-by: Andrew Morton
'''
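
For anyone following along, the accounting change described in the commit
message can be illustrated with a small user-space model. This is only a
sketch and not the patch itself: struct toy_pgdat and the toy_* helpers are
hypothetical names made up for illustration, and the model assumes that
MAX_RECLAIM_RETRIES is 16 and that kswapd gives up once the counter reaches
that limit. Only kswapd_failures and MAX_RECLAIM_RETRIES are taken from the
text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for this model; check mm/internal.h for the real one. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the one pgdat field this model cares about. */
struct toy_pgdat {
	int kswapd_failures;
};

/*
 * Account for one kswapd pass that reclaimed nothing.
 * With skip_boosted set, a pass that was triggered by a boosted
 * watermark is not counted as a failure (the behavior described above).
 */
static void toy_account_failed_run(struct toy_pgdat *pgdat,
				   bool boosted, bool skip_boosted)
{
	if (boosted && skip_boosted)
		return;
	pgdat->kswapd_failures++;
}

/*
 * Simulate `runs` consecutive passes that reclaim nothing and report
 * whether the toy kswapd would consider the node hopeless afterwards.
 */
static bool toy_kswapd_gives_up(int runs, bool boosted, bool skip_boosted)
{
	struct toy_pgdat pgdat = { .kswapd_failures = 0 };

	assert(runs >= 0);
	for (int i = 0; i < runs; i++)
		toy_account_failed_run(&pgdat, boosted, skip_boosted);

	/* Modeled as ">=" here; this comparison is an assumption of the toy. */
	return pgdat.kswapd_failures >= MAX_RECLAIM_RETRIES;
}
```

In this model, toy_kswapd_gives_up(16, true, false) reports that kswapd
stops after a burst of boosted passes that reclaim nothing, while
toy_kswapd_gives_up(16, true, true) does not. Non-boosted failures are
still counted either way, which matches the limited scope stated in the
commit message.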