From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BDB9C3DA7F for ; Thu, 15 Aug 2024 06:32:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 036306B0088; Thu, 15 Aug 2024 02:32:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F28546B0089; Thu, 15 Aug 2024 02:32:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DF20A6B008A; Thu, 15 Aug 2024 02:32:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id C092F6B0088 for ; Thu, 15 Aug 2024 02:32:51 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 615E7413F4 for ; Thu, 15 Aug 2024 06:32:51 +0000 (UTC) X-FDA: 82453511742.15.AA3FC08 Received: from mail-qv1-f45.google.com (mail-qv1-f45.google.com [209.85.219.45]) by imf02.hostedemail.com (Postfix) with ESMTP id A25DC80024 for ; Thu, 15 Aug 2024 06:32:49 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="Xt/7iruv"; spf=pass (imf02.hostedemail.com: domain of laoar.shao@gmail.com designates 209.85.219.45 as permitted sender) smtp.mailfrom=laoar.shao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723703533; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=qDCk+gru4KSJH6QS6nIVyyOpJVWF4oWZoIwgJGoUB1g=; b=ouQCZDVmGvXV6MMZDTXdE8nh6sxUW3sHQhbjQTPUebPcjPVp7fYBTK9yvF3SXTeDsfacQU /04VkEcWagGSVCaRmJfiN6KBB4enbimv2IsCGl2PJlP0nBDV3ReJhA+dr/FP4dyUydQ10I jAG8YPqvh/mFPPnOAPgSjPWvTT55oO8= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="Xt/7iruv"; spf=pass (imf02.hostedemail.com: domain of laoar.shao@gmail.com designates 209.85.219.45 as permitted sender) smtp.mailfrom=laoar.shao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723703533; a=rsa-sha256; cv=none; b=UTIAESorVEO56jqo8fTWwI4yG8j74b60t6cq3VXIfQY6XAAjUfFech8hjbDjVBG4MRhFMq mItsGJ2/c/oi3JdYZzMPql7BBaLPrAOAZTC9hxIOhE3cYZkCxkzvVVL+1I7u5DerTh/dkF ysPOpAkZPCfDgzjSPE4FfwW2KWZB0iA= Received: by mail-qv1-f45.google.com with SMTP id 6a1803df08f44-6bf705959f1so2712096d6.1 for ; Wed, 14 Aug 2024 23:32:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723703569; x=1724308369; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=qDCk+gru4KSJH6QS6nIVyyOpJVWF4oWZoIwgJGoUB1g=; b=Xt/7iruvL2tb1LziWs5wYjmyHy2KTNlXGVmDv73NXInvpxT0jFojVJmn0AysK8K4aR ZcqSnH9EgCLwsErABZvm2z/fEZW+Z4nv03Rw1lQTB46ZyX6GjyJ8JoMnK/7Xalix0wuN wQoDabd41AlDh1yJD9wkzjeeQI/qSETJa77zn637Lf8XmkdM2vfUZGVZd1APs3LQLdC6 nTvqcYcDMbFVNwMAoaV5NUzl0ozdL8qopQclOjDzSEAhj5fQILG0kxOc8naca5VqUOdg 5Bwzlbltq4oVLwkqvlOUb7YdVN1hKqWgLo0rsvWvvY8mBAI0sV9qe2uSmhMFMfauKi+b yTAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723703569; x=1724308369; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qDCk+gru4KSJH6QS6nIVyyOpJVWF4oWZoIwgJGoUB1g=; b=HE4N/iZT6pa30TY5RRn9+CklkozukUQ/xdoAU7xD1w4iKIHNwbrW5JwM4xGgJzkqhb /f329irK19BhX/qvSVYjWAE/bCBm9xpGk2GyLuyUOAfdltuXCQ1p6NdgQlqjwCwEuDSa Z2CozoTU4qHoyoOaKiswC9i4cm17GvhbjAC3Eo4Vi2nNu3c1Wo/79yT8bLAIJpOYOm60 uH3tT3imD/y1hhuF4MZM3hQ8qvwjMe0yLNIj2/W3vGrSGj0ry93za9ZGIRaLttP8w6pm wChEfXTgstPl4NvEmpmQ1E5Eht/nphQ9fFdXO2iEaASCN1flrCG6yYCv6KgX4DgkRlj9 ObEw== X-Forwarded-Encrypted: i=1; AJvYcCWq8/gv8TtlGYFXyewfrNxrLq/fPk92wnP7ee9CNL9tdI61GZnsv33JwwboawYUU422gjXYr2RC8ehv97lz91ouZlA= X-Gm-Message-State: AOJu0Yy9J+21QIuu3iaMocgw7TBiv+Jg9uOqMNOg/LiJIer4OIryazrh MVP52kT27N0ucau29eHGca0sTfMdf6/pDAepOcobTGjsNWBmYR/2OqQ2xotU7JundD0afuYwc78 LV31YV225gQqQYPXvgyxs8uIondQ= X-Google-Smtp-Source: AGHT+IEQcBFgy7LbM+8f2LOP/+hr772Z+43eFxXcya7lMo920Fe5pGHv57wT8aS953jRlIIrHPt3Nw8WhejBhNupnvk= X-Received: by 2002:ad4:5be6:0:b0:6bb:8b7b:c2df with SMTP id 6a1803df08f44-6bf6de80febmr36762516d6.25.1723703568468; Wed, 14 Aug 2024 23:32:48 -0700 (PDT) MIME-Version: 1.0 References: <20240812090525.80299-1-laoar.shao@gmail.com> <20240812090525.80299-2-laoar.shao@gmail.com> In-Reply-To: From: Yafang Shao Date: Thu, 15 Aug 2024 14:32:10 +0800 Message-ID: Subject: Re: [PATCH 1/2] mm: Add memalloc_nowait_{save,restore} To: Michal Hocko Cc: Christoph Hellwig , akpm@linux-foundation.org, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, david@fromorbit.com, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Kent Overstreet Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam03 X-Rspam-User: X-Rspamd-Queue-Id: A25DC80024 X-Stat-Signature: d546116mk9ahfj3igtpj7jpdppmms779 X-HE-Tag: 1723703569-312383 X-HE-Meta: U2FsdGVkX1+UIqZTpp0m7Sf/hjePt+77ZOeSOoPKmWYMW9ggrTLtwUvwjqvpafr9TmsxNV0m9a55aLkcKtg0ZHZQ9SJYDuPc4VAs4W8dNMZ13pMwscT/o8q0NXHZUwSoGBZx6/4TJ7yB8/A6J2Tkn/aHT4Vu+sIQyro7Fsc2DIOjukNfFNUuLR76YidrqdY5QYAlsYjLVX883blO8rR8Uypfx2/NOT76PDzU6ugYAp/dOzfH20oDaWcBAsRAGL4QwPR3mWlWTZYMyvp97Vaxe/kJo0BETu+5+GkK4z9xDS4+dmw8TTH01RHXtWAIwy60/6VU49vUtt86CJSZVNeFhAcsAGxrgAIn+TngwC4HJtr9QxA4x30A9vkFJ0IeakoKdGLQPA+6zq6f/g8VkwvjMl8jp5c2oIJxwRqwYsHOUo4/1NQVe7nkzj0iryUAnMY37RRuYcLSgYDrElDcnHBxYslXOiHlSYzWvZZ8QRUs90QVrIZad92L+Azo6MJO5GwBwe1cZ4jAWj6yT3qIk+cw5nVhKcn+5rubTPN/aDAm2k4mtpbiWeTdDuzOXZW23AObzZ/i8KCztFdoxeWZDelgAsSwAiWiab5FYzLqAY7p49ltzpkLgIPqvrfC5eWoiXv5v9bq9yWw6KHvoJjdlhykRZwd10OtGk7nkIQBHVELY57InPIGIC9sv4My1WRWTxNLyWll2P59UPW31aETS5J8w3zN31Pe1GLyWpGr/KeKXISS5vg5PDvTeTXrC8PBu4j+D4AUQofuF7XrKk0dOhyHmDyiJmQs20THN5oyy/Mzx3zKOUdwHTx9W9q9kl44uzdWQcZM1q72YZUtXOhaalcmTRUaqO4KbaPiyZttzti1zbRbIg0aLVobkSYs/2nAawO9zoggU7XDbQc7Zhq4F1znHLZt1udn3r2ivsLJNTRp7Rj0U5E10vtuo7c7dp9jskTry2EmrXaPRImG1zvjLb0 mq/9tQgv yyJCELjvHj4hEmzOJybIwGFZVATv64mvCRXbrWj45AGRUDq1bgYFR7PsSGadZFB75YNb9J2GjEj2acV5fgXqf9YB5b+AuDfiF3wMMb7qsawss0FbKi/ILw+KFGRAj5N4BXKm4nUgNFu49oxcwl6r6Ikkx9vnkQGSrOpvxwvtadumeNi84RMlsiGrmpxkRvVZuagCdt4vd8RrgqbBe2un/SNUIDkhpmrvQKhEpf3B8AKnjhdaXBTxrHl3J/dgz0n1ioGJv1p9FgzdrJMbmJbu1+l5TPBNcTESuPWlM96WtFxPVB2TfSLb2KxrzGZAa+NsPBMSqk1klm5RDZbxj2+XJtt/oXdwcZ71b9/JnASBGhuDYvpH7uymMRL7zdbQQ/fs1ioTk6rYIbFWier2GaWVYb0tI7r7M7Y6/8OHwv99MNRMPEdFTknXBtm6xVYf8vGIsgvuUiV4MU/gfcSGfui+zD/1xWvzQqRE03w3lSG1+2jUdF/Nbhigp+oAZxUn2p2lWMKJMydqFnmPvJnE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Aug 15, 2024 at 2:22=E2=80=AFPM Michal Hocko wrot= e: > > On Thu 15-08-24 11:26:09, Yafang Shao wrote: > > On Wed, Aug 14, 2024 at 8:43=E2=80=AFPM Michal Hocko = wrote: > [...] > > > > If that's the case, I believe we should at least consider adding th= e > > > > following code change to the kernel: > > > > > > We already do have that > > > /* > > > * All existing users of the __GFP_NOFAIL are blockab= le, so warn > > > * of any new users that actually require GFP_NOWAIT > > > */ > > > if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) > > > goto fail; > > > > I don't see a reason to place the `goto fail;` above the > > `__alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE, ac);= ` > > line. Since we've already woken up kswapd, it should be acceptable to > > allocate memory from ALLOC_MIN_RESERVE temporarily. Why not consider > > implementing the following changes instead? > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index 9ecf99190ea2..598d4df829cd 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -4386,13 +4386,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned > > int order, > > * we always retry > > */ > > if (gfp_mask & __GFP_NOFAIL) { > > - /* > > - * All existing users of the __GFP_NOFAIL are blockable= , so warn > > - * of any new users that actually require GFP_NOWAIT > > - */ > > - if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) > > - goto fail; > > - > > /* > > * PF_MEMALLOC request from this context is rather biza= rre > > * because we cannot reclaim anything and only can loop= waiting > > @@ -4419,6 +4412,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned > > int order, > > if (page) > > goto got_pg; > > > > + /* > > + * All existing users of the __GFP_NOFAIL are blockable= , so warn > > + * of any new users that actually require GFP_NOWAIT > > + */ > > + if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) { > > + goto fail; > > + } > > + > > cond_resched(); > > goto retry; > > } > > How does this solve anything. It will still eventually fail the NOFAIL > allocation. It might happen slightly later but that doesn't change the > fact it will _fail_. I have referenced a discussion why that is not > really desireable and why Barry wants that addressed. We have added that > WARN_ON_ONCE because we have assumed that people do understand that > NOFAIL without reclaim is just too much to ask. We were wrong there was > one user in the kernel. That one was not too hard to find out because > you can _grep_ for those flags. Scoped APIs make that impossible! > > > > But Barry has patches to turn that into BUG because failing NOFAIL > > > allocations is not cool and cause unexpected failures. Have a look at > > > https://lore.kernel.org/all/20240731000155.109583-1-21cnbao@gmail.com= / > > > > > > > > I am really > > > > > surprised that we even have PF_MEMALLOC_NORECLAIM in the first pl= ace! > > > > > > > > There's use cases for it. > > > > > > Right but there are certain constrains that we need to worry about to > > > have a maintainable code. Scope allocation contrains are really a goo= d > > > feature when that has a well defined semantic. E.g. NOFS, NOIO or > > > NOMEMALLOC (although this is more self inflicted injury exactly becau= se > > > PF_MEMALLOC had a "use case"). NOWAIT scope semantic might seem a goo= d > > > feature but it falls appart on nested NOFAIL allocations! So the flag= is > > > usable _only_ if you fully control the whole scoped context. Good luc= k > > > with that long term! This is fragile, hard to review and even harder = to > > > keep working properly. The flag would have been Nacked on that ground= . > > > But nobody asked... > > > > It's already implemented, and complaints won't resolve the issue. How > > about making the following change to provide a warning when this new > > flag is used incorrectly? > > How does this solve anything at all? It will warn you that your code is > incorrect and what next? Are you going to remove GFP_NOFAIL from the > nested allocation side? NOFAIL is a strong requirement and it is not > used nilly willy. There must have been a very good reason to use it. Are > you going to drop the scope? > > Let me repeat, nested NOFAIL allocations will BUG_ON on failure. The key question is whether it actually fails after we've already woken up kswapd. Have we encountered real issues, or is this just based on code review? Instead of allowing it to fail, why not allocate from the reserve memory to prevent this from happening? > Your > warning might catch those users slightly earlier when allocation succeed > but that doesn't make those crashes impossible. PF_MEMALLOC_NORECLAIM > might be already merged but this concept is inherently fragile and > should be reverted rather than finding impossible ways around it. And it > should be done before this spreads outside of bcachefs. > -- > Michal Hocko > SUSE Labs -- Regards Yafang