From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <owner-linux-mm@kvack.org>
MIME-Version: 1.0
References: <ZtBWxWunhXTh0bhS@tiehlicka> <wjfubyrzk4ovtuae5uht7uhhigkrym2anmo5w5vp7xgq3zss76@s2uy3qindie4>
 <ZtCFP5w6yv/aykui@dread.disaster.area> <CALOAHbCssCSb7zF6VoKugFjAQcMACmOTtSCzd7n8oGfXdsxNsg@mail.gmail.com>
 <ZtPhAdqZgq6s4zmk@dread.disaster.area> <CALOAHbBEF=i7e+Zet-L3vEyQRcwmOn7b6vmut0-ae8_DQipOAw@mail.gmail.com>
 <ZtVzP2wfQoJrBXjF@tiehlicka> <CALOAHbAbzJL31jeGfXnbXmbXMpPv-Ak3o3t0tusjs-N-NHisiQ@mail.gmail.com>
 <ZtWArlHgX8JnZjFm@tiehlicka> <CALOAHbD=mzSBoNqCVf5TTOge4oTZq7Foxdv4H2U1zfBwjNoVKA@mail.gmail.com>
 <20240903124416.GE424729@mit.edu>
In-Reply-To: <20240903124416.GE424729@mit.edu>
From: Yafang Shao <laoar.shao@gmail.com>
Date: Tue, 3 Sep 2024 21:15:59 +0800
Message-ID: <CALOAHbCAN8KwgxoSw4Rg2Uuwp0=LcGY8WRMqLbpEP5MkW4H_XQ@mail.gmail.com>
Subject: Re: [PATCH] bcachefs: Switch to memalloc_flags_do() for vmalloc allocations
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Michal Hocko <mhocko@suse.com>, Dave Chinner <david@fromorbit.com>, 
	Kent Overstreet <kent.overstreet@linux.dev>, Matthew Wilcox <willy@infradead.org>, 
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, 
	linux-kernel@vger.kernel.org, Dave Chinner <dchinner@redhat.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>
List-Subscribe: <mailto:majordomo@kvack.org>
List-Unsubscribe: <mailto:majordomo@kvack.org>

On Tue, Sep 3, 2024 at 8:44 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Sep 03, 2024 at 02:34:05PM +0800, Yafang Shao wrote:
> >
> > When setting GFP_NOFAIL, it's important to not only enable direct
> > reclaim but also the OOM killer. In scenarios where swap is off and
> > there is minimal page cache, setting GFP_NOFAIL without __GFP_FS can
> > result in an infinite loop. In other words, GFP_NOFAIL should not be
> > used with GFP_NOFS. Unfortunately, many call sites do combine them.
> > For example:
> >
> > XFS:
> >
> > fs/xfs/libxfs/xfs_exchmaps.c: GFP_NOFS | __GFP_NOFAIL
> > fs/xfs/xfs_attr_item.c: GFP_NOFS | __GFP_NOFAIL
> >
> > EXT4:
> >
> > fs/ext4/mballoc.c: GFP_NOFS | __GFP_NOFAIL
> > fs/ext4/extents.c: GFP_NOFS | __GFP_NOFAIL
> >
> > This seems problematic, but I'm not an FS expert. Perhaps Dave or Ted
> > could provide further insight.
>
> GFP_NOFS is needed because we need to signal to the mm layer to avoid
> recursing into file system layer --- for example, to clean a page by
> writing it back to the FS.  Since we may have taken various file
> system locks, recursing could lead to deadlock, which would make the
> system (and the user) sad.
>
> If the mm layer wants to OOM kill a process, that should be fine as
> far as the file system is concerned --- this could reclaim anonymous
> pages that don't need to be written back, for example.  And we don't
> need to write back dirty pages before the process is killed.  So I'm a
> bit puzzled why (as you imply; I haven't dug into the mm code in
> question) GFP_NOFS implies disabling the OOM killer?

Refer to the out_of_memory() function [0]:

    if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
        return true;

[0]. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/oom_kill.c#n1137

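With this check, a non-memcg allocation without __GFP_FS returns from
out_of_memory() before any victim is selected, so no task is killed on
behalf of a GFP_NOFS allocation. Could this check be removed?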
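
To make the effect of that check concrete, here is a minimal userspace
model of it. This is only a sketch: the MODEL_* flag bits, struct
model_oom_control, and both functions are hypothetical stand-ins written
for illustration, not the kernel's definitions, and everything in
out_of_memory() other than the quoted check is reduced to a counter.

```c
#include <stdbool.h>

/* Illustrative flag bits only; these are NOT the kernel's real values. */
#define MODEL_GFP_FS      0x1u
#define MODEL_GFP_NOFAIL  0x2u

/* Simplified, hypothetical stand-in for the kernel's struct oom_control. */
struct model_oom_control {
    unsigned int gfp_mask; /* flags of the failing allocation */
    bool memcg_oom;        /* true when a memcg limit triggered the OOM */
    int victims_killed;    /* counts how often a victim would be chosen */
};

/*
 * Mirrors only the quoted check: without the FS bit, and outside a
 * memcg OOM, report "handled" without choosing a victim.
 */
static bool model_out_of_memory(struct model_oom_control *oc)
{
    if (!(oc->gfp_mask & MODEL_GFP_FS) && !oc->memcg_oom)
        return true;

    oc->victims_killed++; /* stand-in for victim selection and kill */
    return true;
}

/* Convenience wrapper: how many victims does one OOM invocation pick? */
static int model_victims_for(unsigned int gfp_mask, bool memcg_oom)
{
    struct model_oom_control oc = { gfp_mask, memcg_oom, 0 };

    model_out_of_memory(&oc);
    return oc.victims_killed;
}
```

In this model, calling model_victims_for() with only MODEL_GFP_NOFAIL
set and memcg_oom false picks no victim, while adding MODEL_GFP_FS or
setting memcg_oom to true does pick one.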

>
> Regards,
>
>                                         - Ted
>
> P.S.  Note that this is a fairly simplistic, very conservative set of
> constraints.  If you have several dozen file systems mounted, and
> we're deep in the guts of file system A, it might be *fine* to clean
> pages associated with file system B or file system C.  Unless of
> course, file system A is a loop-back mount onto a file located in file
> system B, in which case writing into file system A might require
> taking locks related to file system B.  But that aside, in theory we
> could allow certain types of page reclaim if we were willing to track
> which file systems are busy.
>
> On the other hand, if the system is allowed to get that busy,
> performance is going to be *terrible*, and so perhaps the better thing
> to do is to teach the container manager not to schedule so many jobs
> on the server in the first place, or having the mobile OS kill off
> applications that aren't in the foreground, or giving the OOM killer
> license to kill off jobs much earlier, etc.  By the time we get to the
> point where we are trying to use these last dozen or so pages, the
> system is going to be thrashing super-badly, and the user is going to
> be *quite* unhappy.  So arguably these problems should be solved much
> higher up the software stack, by not letting the system get into such
> a condition in the first place.

I completely agree with your point. In practice, though, systems do end
up in this state, which is why it's crucial that the OOM killer remains
effective during thrashing. Unfortunately, the kernel's OOM killer
doesn't always perform as expected, particularly under heavy thrashing.
This is one reason why user-space OOM killers like oomd exist.
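
As a rough illustration of what such a user-space killer looks at, the
sketch below parses one line in the format exposed by
/proc/pressure/memory (PSI) and compares the "full" avg10 value against
a threshold. The function names and the threshold policy are my own
assumptions for the example; this is not oomd's actual code or
configuration.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse one line in the /proc/pressure/memory format, e.g.
 *   "full avg10=1.23 avg60=0.50 avg300=0.10 total=12345"
 * and store the avg10 value. Returns false if the line is not a
 * "full" line in that format.
 */
static bool psi_parse_full_avg10(const char *line, double *avg10)
{
    return sscanf(line, "full avg10=%lf", avg10) == 1;
}

/* Hypothetical policy: act once "full" pressure exceeds a threshold. */
static bool psi_should_act(const char *line, double threshold_pct)
{
    double avg10;

    if (!psi_parse_full_avg10(line, &avg10))
        return false;
    return avg10 > threshold_pct;
}
```

A real daemon would read the file periodically and pick a victim by its
own policy; the point here is only that the decision is driven by
measured stall time rather than by an allocation finally failing.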


--
Regards
Yafang