From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Seiji Nishikawa <snishika@redhat.com>,
Mel Gorman <mgorman@techsingularity.net>,
Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH 6.1 78/81] mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim()
Date: Mon, 6 Jan 2025 16:16:50 +0100 [thread overview]
Message-ID: <20250106151132.371873989@linuxfoundation.org> (raw)
In-Reply-To: <20250106151129.433047073@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Seiji Nishikawa <snishika@redhat.com>
commit 6aaced5abd32e2a57cd94fd64f824514d0361da8 upstream.
The task sometimes continues looping in throttle_direct_reclaim() because
allow_direct_reclaim(pgdat) keeps returning false.
#0 [ffff80002cb6f8d0] __switch_to at ffff8000080095ac
#1 [ffff80002cb6f900] __schedule at ffff800008abbd1c
#2 [ffff80002cb6f990] schedule at ffff800008abc50c
#3 [ffff80002cb6f9b0] throttle_direct_reclaim at ffff800008273550
#4 [ffff80002cb6fa20] try_to_free_pages at ffff800008277b68
#5 [ffff80002cb6fae0] __alloc_pages_nodemask at ffff8000082c4660
#6 [ffff80002cb6fc50] alloc_pages_vma at ffff8000082e4a98
#7 [ffff80002cb6fca0] do_anonymous_page at ffff80000829f5a8
#8 [ffff80002cb6fce0] __handle_mm_fault at ffff8000082a5974
#9 [ffff80002cb6fd90] handle_mm_fault at ffff8000082a5bd4
At this point, the pgdat contains the following two zones:
NODE: 4 ZONE: 0 ADDR: ffff00817fffe540 NAME: "DMA32"
SIZE: 20480 MIN/LOW/HIGH: 11/28/45
VM_STAT:
NR_FREE_PAGES: 359
NR_ZONE_INACTIVE_ANON: 18813
NR_ZONE_ACTIVE_ANON: 0
NR_ZONE_INACTIVE_FILE: 50
NR_ZONE_ACTIVE_FILE: 0
NR_ZONE_UNEVICTABLE: 0
NR_ZONE_WRITE_PENDING: 0
NR_MLOCK: 0
NR_BOUNCE: 0
NR_ZSPAGES: 0
NR_FREE_CMA_PAGES: 0
NODE: 4 ZONE: 1 ADDR: ffff00817fffec00 NAME: "Normal"
SIZE: 8454144 PRESENT: 98304 MIN/LOW/HIGH: 68/166/264
VM_STAT:
NR_FREE_PAGES: 146
NR_ZONE_INACTIVE_ANON: 94668
NR_ZONE_ACTIVE_ANON: 3
NR_ZONE_INACTIVE_FILE: 735
NR_ZONE_ACTIVE_FILE: 78
NR_ZONE_UNEVICTABLE: 0
NR_ZONE_WRITE_PENDING: 0
NR_MLOCK: 0
NR_BOUNCE: 0
NR_ZSPAGES: 0
NR_FREE_CMA_PAGES: 0
In allow_direct_reclaim(), while processing ZONE_DMA32, the sum of
inactive/active file-backed pages calculated in zone_reclaimable_pages()
based on the result of zone_page_state_snapshot() is zero.
Additionally, since this system lacks swap, the calculation of inactive/
active anonymous pages is skipped.
crash> p nr_swap_pages
nr_swap_pages = $1937 = {
counter = 0
}
As a result, ZONE_DMA32 is deemed unreclaimable and skipped, moving on to
the processing of the next zone, ZONE_NORMAL, despite ZONE_DMA32 having
free pages significantly exceeding the high watermark.
The problem is that the pgdat->kswapd_failures hasn't been incremented.
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_failures
$1935 = 0x0
This is because the node deemed balanced. The node balancing logic in
balance_pgdat() evaluates all zones collectively. If one or more zones
(e.g., ZONE_DMA32) have enough free pages to meet their watermarks, the
entire node is deemed balanced. This causes balance_pgdat() to exit early
before incrementing the kswapd_failures, as it considers the overall
memory state acceptable, even though some zones (like ZONE_NORMAL) remain
under significant pressure.
The patch ensures that zone_reclaimable_pages() includes free pages
(NR_FREE_PAGES) in its calculation when no other reclaimable pages are
available (e.g., file-backed or anonymous pages). This change prevents
zones like ZONE_DMA32, which have sufficient free pages, from being
mistakenly deemed unreclaimable. By doing so, the patch ensures proper
node balancing, avoids masking pressure on other zones like ZONE_NORMAL,
and prevents infinite loops in throttle_direct_reclaim() caused by
allow_direct_reclaim(pgdat) repeatedly returning false.
The kernel hangs due to a task stuck in throttle_direct_reclaim(), caused
by a node being incorrectly deemed balanced despite pressure in certain
zones, such as ZONE_NORMAL. This issue arises from
zone_reclaimable_pages() returning 0 for zones without reclaimable file-
backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient
free pages to be skipped.
The lack of swap or reclaimable pages results in ZONE_DMA32 being ignored
during reclaim, masking pressure in other zones. Consequently,
pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback
mechanisms in allow_direct_reclaim() from being triggered, leading to an
infinite loop in throttle_direct_reclaim().
This patch modifies zone_reclaimable_pages() to account for free pages
(NR_FREE_PAGES) when no other reclaimable pages exist. This ensures zones
with sufficient free pages are not skipped, enabling proper balancing and
reclaim behavior.
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20241130164346.436469-1-snishika@redhat.com
Link: https://lkml.kernel.org/r/20241130161236.433747-2-snishika@redhat.com
Fixes: 5a1c84b404a7 ("mm: remove reclaim and compaction retry approximations")
Signed-off-by: Seiji Nishikawa <snishika@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/vmscan.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -588,7 +588,14 @@ unsigned long zone_reclaimable_pages(str
if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL))
nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) +
zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON);
-
+ /*
+ * If there are no reclaimable file-backed or anonymous pages,
+ * ensure zones with sufficient free pages are not skipped.
+ * This prevents zones like DMA32 from being ignored in reclaim
+ * scenarios where they can still help alleviate memory pressure.
+ */
+ if (nr == 0)
+ nr = zone_page_state_snapshot(zone, NR_FREE_PAGES);
return nr;
}
next prev parent reply other threads:[~2025-01-06 15:22 UTC|newest]
Thread overview: 93+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-06 15:15 [PATCH 6.1 00/81] 6.1.124-rc1 review Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 01/81] x86/hyperv: Fix hv tsc page based sched_clock for hibernation Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 02/81] selinux: ignore unknown extended permissions Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 03/81] btrfs: fix use-after-free in btrfs_encoded_read_endio() Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 04/81] tracing: Have process_string() also allow arrays Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 05/81] thunderbolt: Add support for Intel Lunar Lake Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 06/81] thunderbolt: Add support for Intel Panther Lake-M/P Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 07/81] thunderbolt: Dont display nvm_version unless upgrade supported Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 08/81] xhci: retry Stop Endpoint on buggy NEC controllers Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 09/81] usb: xhci: Limit Stop Endpoint retries Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 10/81] xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 11/81] net: mctp: handle skb cleanup on sock_queue failures Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 12/81] RDMA/mlx5: Enforce same type port association for multiport RoCE Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 13/81] RDMA/bnxt_re: Add check for path mtu in modify_qp Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 14/81] RDMA/bnxt_re: Fix reporting hw_ver in query_device Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 15/81] RDMA/bnxt_re: Fix max_qp_wrs reported Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 16/81] RDMA/bnxt_re: Fix the locking while accessing the QP table Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 17/81] drm/bridge: adv7511_audio: Update Audio InfoFrame properly Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 18/81] net: dsa: microchip: Fix KSZ9477 set_ageing_time function Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 19/81] net: dsa: microchip: add ksz_rmw8() function Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 20/81] net: dsa: microchip: Fix LAN937X set_ageing_time function Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 21/81] RDMA/hns: Refactor mtr find Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 22/81] RDMA/hns: Remove unused parameters and variables Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 23/81] RDMA/hns: Fix mapping error of zero-hop WQE buffer Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 24/81] RDMA/hns: Fix warning storm caused by invalid input in IO path Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 25/81] RDMA/hns: Fix missing flush CQE for DWQE Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 26/81] net: stmmac: platform: provide devm_stmmac_probe_config_dt() Greg Kroah-Hartman
2025-01-06 15:15 ` [PATCH 6.1 27/81] net: stmmac: dont create a MDIO bus if unnecessary Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 28/81] net: stmmac: restructure the error path of stmmac_probe_config_dt() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 29/81] net: fix memory leak in tcp_conn_request() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 30/81] ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 31/81] ip_tunnel: annotate data-races around t->parms.link Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 32/81] ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_bind_dev() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 33/81] ipv4: ip_tunnel: Unmask upper DSCP bits in ip_md_tunnel_xmit() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 34/81] ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_xmit() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 35/81] net: Fix netns for ip_tunnel_init_flow() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 36/81] netrom: check buffer length before accessing it Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 37/81] drm/i915/dg1: Fix power gate sequence Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 38/81] netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 39/81] net: llc: reset skb->transport_header Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 40/81] ALSA: usb-audio: US16x08: Initialize array before use Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 41/81] eth: bcmsysport: fix call balance of priv->clk handling routines Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 42/81] net: mv643xx_eth: fix an OF node reference leak Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 43/81] net: wwan: t7xx: Fix FSM command timeout issue Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 44/81] RDMA/rtrs: Ensure ib_sge list is accessible Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 45/81] net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 46/81] net: restrict SO_REUSEPORT to inet sockets Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 47/81] net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 48/81] af_packet: fix vlan_get_tci() vs MSG_PEEK Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 49/81] af_packet: fix vlan_get_protocol_dgram() " Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 50/81] ila: serialize calls to nf_register_net_hooks() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 51/81] btrfs: rename and export __btrfs_cow_block() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 52/81] btrfs: fix use-after-free when COWing tree bock and tracing is enabled Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 53/81] wifi: mac80211: wake the queues in case of failure in resume Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 54/81] drm/amdkfd: Correct the migration DMA map direction Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 55/81] btrfs: flush delalloc workers queue before stopping cleaner kthread during unmount Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 56/81] ALSA: hda/realtek: Add new alc2xx-fixup-headset-mic model Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 57/81] sound: usb: enable DSD output for ddHiFi TC44C Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 58/81] sound: usb: format: dont warn that raw DSD is unsupported Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 59/81] bpf: fix potential error return Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 60/81] ksmbd: retry iterate_dir in smb2_query_dir Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 61/81] net: usb: qmi_wwan: add Telit FE910C04 compositions Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 62/81] Bluetooth: hci_core: Fix sleeping function called from invalid context Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 63/81] irqchip/gic: Correct declaration of *percpu_base pointer in union gic_base Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 64/81] ARC: build: Try to guess GCC variant of cross compiler Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 65/81] usb: xhci: Avoid queuing redundant Stop Endpoint commands Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 66/81] modpost: fix input MODULE_DEVICE_TABLE() built for 64-bit on 32-bit host Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 67/81] modpost: fix the missed iteration for the max bit in do_input() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 68/81] ALSA hda/realtek: Add quirk for Framework F111:000C Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 69/81] ALSA: seq: oss: Fix races at processing SysEx messages Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 70/81] kcov: mark in_softirq_really() as __always_inline Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 71/81] RDMA/uverbs: Prevent integer overflow issue Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 72/81] pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 73/81] sky2: Add device ID 11ab:4373 for Marvell 88E8075 Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 74/81] net/sctp: Prevent autoclose integer overflow in sctp_association_init() Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 75/81] drm: adv7511: Drop dsi single lane support Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 76/81] dt-bindings: display: adi,adv7533: Drop " Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 77/81] mm/readahead: fix large folio support in async readahead Greg Kroah-Hartman
2025-01-06 15:16 ` Greg Kroah-Hartman [this message]
2025-01-06 15:16 ` [PATCH 6.1 79/81] mptcp: fix TCP options overflow Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 80/81] mptcp: fix recvbuffer adjust on sleeping rcvmsg Greg Kroah-Hartman
2025-01-06 15:16 ` [PATCH 6.1 81/81] mptcp: dont always assume copied data in mptcp_cleanup_rbuf() Greg Kroah-Hartman
2025-01-06 18:22 ` [PATCH 6.1 00/81] 6.1.124-rc1 review Pavel Machek
2025-01-06 19:29 ` Florian Fainelli
2025-01-06 22:26 ` Peter Schneider
2025-01-07 0:22 ` SeongJae Park
2025-01-07 7:10 ` Ron Economos
2025-01-07 12:33 ` Mark Brown
2025-01-07 12:36 ` Naresh Kamboju
2025-01-07 12:44 ` Jon Hunter
2025-01-07 20:59 ` [PATCH 6.1] " Hardik Garg
2025-01-07 23:16 ` [PATCH 6.1 00/81] " Shuah Khan
2025-01-08 12:54 ` Muhammad Usama Anjum
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250106151132.371873989@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=mgorman@techsingularity.net \
--cc=patches@lists.linux.dev \
--cc=snishika@redhat.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.