linux-mm.kvack.org archive mirror
* [PATCH] mm/page_alloc.c: Avoid infinite retries caused by cpuset race
@ 2025-04-16  8:24 Tianyang Zhang
  2025-04-21 10:00 ` Harry Yoo
From: Tianyang Zhang @ 2025-04-16  8:24 UTC
  To: akpm; +Cc: linux-mm, linux-kernel, Tianyang Zhang

__alloc_pages_slowpath() does not detect changes to ac->nodemask on
its retry path, while cpuset can modify the nodemask in parallel.
For processes whose mempolicy is MPOL_BIND, a cpuset update can
change ac->nodemask. should_reclaim_retry() then makes its decision
based on the latest nodemask and jumps to retry, while
get_page_from_freelist() only traverses the zonelist starting from
ac->preferred_zoneref, which was selected using the stale nodemask.
In some cases this causes infinite retries.

cpu 64:
__alloc_pages_slowpath {
        /* ..... */
retry:
        /* ac->nodemask = 0x1, ac->preferred->zone->nid = 1 */
        if (alloc_flags & ALLOC_KSWAPD)
                wake_all_kswapds(order, gfp_mask, ac);
        /* cpu 1:
        cpuset_write_resmask
            update_nodemask
                update_nodemasks_hier
                    update_tasks_nodemask
                        mpol_rebind_task
                            mpol_rebind_policy
                                mpol_rebind_nodemask
                                    // mempolicy->nodes, which ac->nodemask
                                    // points to, has been modified
        */
        /* ac->nodemask = 0x3, ac->preferred->zone->nid = 1 */
        if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
                                 did_some_progress > 0, &no_progress_loops))
                goto retry;
}

Running multiple instances of the LTP cpuset01 test simultaneously
reproduces this issue quickly on a multi-node server once memory
pressure is at its maximum and swap is enabled.

Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
---
 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fd6b865cb1ab..1e82f5214a42 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4530,6 +4530,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	}
 
 retry:
+	/*
+	 * Deal with possible cpuset update races or zonelist updates to avoid
+	 * infinite retries.
+	 */
+	if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
+	    check_retry_zonelist(zonelist_iter_cookie))
+		goto restart;
+
 	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
 	if (alloc_flags & ALLOC_KSWAPD)
 		wake_all_kswapds(order, gfp_mask, ac);
-- 
2.20.1





Thread overview: 16+ messages
2025-04-16  8:24 [PATCH] mm/page_alloc.c: Avoid infinite retries caused by cpuset race Tianyang Zhang
2025-04-21 10:00 ` Harry Yoo
2025-04-21 20:28   ` Suren Baghdasaryan
2025-04-23  2:38     ` Tianyang Zhang
2025-04-23 15:35       ` Suren Baghdasaryan
2025-05-14  7:15         ` Vlastimil Babka
2025-04-22 12:10   ` Tianyang Zhang
2025-04-23  0:11     ` Andrew Morton
2025-04-23  0:22       ` Suren Baghdasaryan
2025-05-11  3:07         ` Andrew Morton
2025-05-13 16:26           ` Suren Baghdasaryan
2025-05-13 19:16             ` Andrew Morton
2025-05-13 19:33               ` Suren Baghdasaryan
2025-05-14  7:34               ` Vlastimil Babka
2025-05-14 22:42                 ` Andrew Morton
2025-05-15  3:19                 ` Tianyang Zhang
