MIME-Version: 1.0
References: <20250416082405.20988-1-zhangtianyang@loongson.cn>
 <025e3f51-2ab5-bc58-5475-b57103169a82@loongson.cn>
 <20250422171116.f3928045a13205dc1b9a46ea@linux-foundation.org>
 <20250510200740.b7de2408e40be7ad5392fed9@linux-foundation.org>
 <20250513121609.a9741e49a0e865f25f966de1@linux-foundation.org>
In-Reply-To: <20250513121609.a9741e49a0e865f25f966de1@linux-foundation.org>
From: Suren Baghdasaryan
Date: Tue, 13 May 2025 12:33:26 -0700
Subject: Re: [PATCH] mm/page_alloc.c: Avoid infinite retries caused by cpuset race
To: Andrew Morton
Cc: Tianyang Zhang, Harry Yoo, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Vlastimil Babka, Michal Hocko,
 Brendan Jackman, Johannes Weiner, Zi Yan
Content-Type: text/plain;
 charset="UTF-8"

On Tue, May 13, 2025 at 12:16 PM Andrew Morton wrote:
>
> On Tue, 13 May 2025 09:26:53 -0700 Suren Baghdasaryan wrote:
>
> > > > > This has been in mm-hotfixes-unstable for six days. Hopefully we'll
> > > > > see some review activity soon (please).
> > > >
> > > > I reviewed and provided my feedback but saw neither a reply nor a
> > > > respin with proposed changes.
> > >
> > > OK, thanks. Do you have time to put together a modified version of this?
> >
> > I think the code is fine as is. Would be good to add Fixes: tag but it
> > will require some investigation to find the appropriate patch to
> > reference here.
>
> Below is what is in mm-hotfixes. It doesn't actually have any
> acked-by's or reviewed-by's.
>
> So... final call for review, please.

Reviewed-by: Suren Baghdasaryan

>
>
> From: Tianyang Zhang
> Subject: mm/page_alloc.c: avoid infinite retries caused by cpuset race
> Date: Wed, 16 Apr 2025 16:24:05 +0800
>
> __alloc_pages_slowpath has no change detection for ac->nodemask in the
> part of retry path, while cpuset can modify it in parallel. For some
> processes that set mempolicy as MPOL_BIND, this results ac->nodemask
> changes, and then the should_reclaim_retry will judge based on the latest
> nodemask and jump to retry, while the get_page_from_freelist only
> traverses the zonelist from ac->preferred_zoneref, which selected by a
> expired nodemask and may cause infinite retries in some cases
>
> cpu 64:
> __alloc_pages_slowpath {
>         /* ..... */
> retry:
>         /* ac->nodemask = 0x1, ac->preferred->zone->nid = 1 */
>         if (alloc_flags & ALLOC_KSWAPD)
>                 wake_all_kswapds(order, gfp_mask, ac);
>         /* cpu 1:
>         cpuset_write_resmask
>             update_nodemask
>                 update_nodemasks_hier
>                     update_tasks_nodemask
>                         mpol_rebind_task
>                             mpol_rebind_policy
>                                 mpol_rebind_nodemask
>                 // mempolicy->nodes has been modified,
>                 // which ac->nodemask point to
>
>         */
>         /* ac->nodemask = 0x3, ac->preferred->zone->nid = 1 */
>         if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
>                                  did_some_progress > 0, &no_progress_loops))
>                 goto retry;
> }
>
> Simultaneously starting multiple cpuset01 from LTP can quickly reproduce
> this issue on a multi node server when the maximum memory pressure is
> reached and the swap is enabled
>
> Link: https://lkml.kernel.org/r/20250416082405.20988-1-zhangtianyang@loongson.cn
> Fixes: 902b62810a57 ("mm, page_alloc: fix more premature OOM due to race with cpuset update").
> Signed-off-by: Tianyang Zhang
> Cc: Vlastimil Babka
> Cc: Suren Baghdasaryan
> Cc: Michal Hocko
> Cc: Brendan Jackman
> Cc: Johannes Weiner
> Cc: Zi Yan
> Cc:
> Signed-off-by: Andrew Morton
> ---
>
>  mm/page_alloc.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
>
> --- a/mm/page_alloc.c~mm-page_allocc-avoid-infinite-retries-caused-by-cpuset-race
> +++ a/mm/page_alloc.c
> @@ -4562,6 +4562,14 @@ restart:
>         }
>
>  retry:
> +       /*
> +        * Deal with possible cpuset update races or zonelist updates to avoid
> +        * infinite retries.
> +        */
> +       if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
> +           check_retry_zonelist(zonelist_iter_cookie))
> +               goto restart;
> +
>         /* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
>         if (alloc_flags & ALLOC_KSWAPD)
>                 wake_all_kswapds(order, gfp_mask, ac);
> _
>
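
For anyone who wants to see the shape of the race outside the kernel, here is a rough single-threaded userspace model of it. It is only a sketch under stated assumptions: struct sim, first_allowed_node(), cpuset_update(), try_alloc(), worth_retrying() and slowpath() are hypothetical stand-ins, not kernel APIs, and the mask values (0x2 widened to 0x3) are chosen for the model rather than taken from the trace in the changelog. The "concurrent" cpuset write is simulated by a call made after the first failed attempt.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES 2

/* Hypothetical stand-in for allocator/cpuset state; not a kernel structure. */
struct sim {
    unsigned nodemask;         /* allowed nodes, may change "concurrently" */
    unsigned seq;              /* bumped on every nodemask update */
    int free_pages[NR_NODES];  /* free pages per node */
};

/* Lowest allowed node, or -1. Loosely models choosing ac->preferred_zoneref. */
static int first_allowed_node(unsigned mask)
{
    for (int nid = 0; nid < NR_NODES; nid++)
        if (mask & (1u << nid))
            return nid;
    return -1;
}

/* Models a cpuset writer rebinding the task's mempolicy nodes. */
static void cpuset_update(struct sim *s, unsigned new_mask)
{
    s->nodemask = new_mask;
    s->seq++;
}

/* Models get_page_from_freelist(): scans only from the cached start node on. */
static bool try_alloc(struct sim *s, int start_nid)
{
    for (int nid = start_nid; nid >= 0 && nid < NR_NODES; nid++) {
        if ((s->nodemask & (1u << nid)) && s->free_pages[nid] > 0) {
            s->free_pages[nid]--;
            return true;
        }
    }
    return false;
}

/* Models should_reclaim_retry(): judges by the current nodemask. */
static bool worth_retrying(const struct sim *s)
{
    for (int nid = 0; nid < NR_NODES; nid++)
        if ((s->nodemask & (1u << nid)) && s->free_pages[nid] > 0)
            return true;
    return false;
}

/*
 * Returns the number of attempts on success, 0 if retrying is pointless,
 * or -1 if max_loops attempts were exhausted (our stand-in for "infinite").
 */
static int slowpath(struct sim *s, bool check_cookie, int max_loops)
{
    int loops = 0;
    unsigned cookie;
    int start_nid;

    assert(max_loops > 0);
restart:
    cookie = s->seq;                              /* snapshot, like a mems cookie */
    start_nid = first_allowed_node(s->nodemask);  /* cached starting point */
retry:
    if (check_cookie && cookie != s->seq)
        goto restart;                             /* the check the patch adds */
    if (++loops > max_loops)
        return -1;
    if (try_alloc(s, start_nid))
        return loops;
    if (loops == 1)
        cpuset_update(s, 0x3);                    /* simulated concurrent writer */
    if (worth_retrying(s))
        goto retry;
    return 0;
}
```

Starting from nodemask 0x2 with free pages only on node 0, the run without the cookie check keeps scanning from node 1 while worth_retrying() keeps saying yes, so it spins until the loop cap. With the check, the second pass restarts, recomputes the starting node from the widened mask and succeeds.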