Date: Tue, 3 Sep 2024 15:30:52 +0200
From: Michal Hocko
To: Theodore Ts'o
Cc: Yafang Shao, Dave Chinner, Kent Overstreet, Matthew Wilcox,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] bcachefs: Switch to memalloc_flags_do() for vmalloc allocations
In-Reply-To: <20240903124416.GE424729@mit.edu>

On Tue 03-09-24 08:44:16, Theodore Ts'o wrote:
> On Tue, Sep 03, 2024 at 02:34:05PM +0800, Yafang Shao wrote:
> >
> > When setting GFP_NOFAIL, it's important to not only enable direct
> > reclaim but also the OOM killer. In scenarios where swap is off and
> > there is minimal page cache, setting GFP_NOFAIL without __GFP_FS can
> > result in an infinite loop. In other words, GFP_NOFAIL should not be
> > used with GFP_NOFS. Unfortunately, many call sites do combine them.
> > For example:
> >
> > XFS:
> >
> > fs/xfs/libxfs/xfs_exchmaps.c: GFP_NOFS | __GFP_NOFAIL
> > fs/xfs/xfs_attr_item.c: GFP_NOFS | __GFP_NOFAIL
> >
> > EXT4:
> >
> > fs/ext4/mballoc.c: GFP_NOFS | __GFP_NOFAIL
> > fs/ext4/extents.c: GFP_NOFS | __GFP_NOFAIL
> >
> > This seems problematic, but I'm not an FS expert. Perhaps Dave or Ted
> > could provide further insight.
>
> GFP_NOFS is needed because we need to signal to the mm layer to avoid
> recursing into the file system layer --- for example, to clean a page
> by writing it back to the FS. Since we may have taken various file
> system locks, recursing could lead to deadlock, which would make the
> system (and the user) sad.
>
> If the mm layer wants to OOM kill a process, that should be fine as
> far as the file system is concerned --- this could reclaim anonymous
> pages that don't need to be written back, for example. And we don't
> need to write back dirty pages before the process is killed. So I'm a
> bit puzzled why (as you imply; I haven't dug into the mm code in
> question) GFP_NOFS implies disabling the OOM killer?

Yes, because there might be a lot of fs pages pinned while performing a
NOFS allocation, and invoking the OOM killer from that context could
fire it way too prematurely. It has been quite some time since this was
introduced, but I do remember workloads hitting that. Also, there is
usually kswapd making sufficient progress to move forward. There are
cases where kswapd is completely stuck while other __GFP_FS allocations
trigger full direct reclaim, or background kworkers free some memory,
and the OOM killer doesn't have a good enough picture to make an
educated guess that it is the only available way forward.
A typical example of a workload that would care is one that is
thrashing but still making slow progress, which is acceptable because
the most important workload makes decent progress (its working set fits
in memory or is mlocked) and rebuilding the state is more harmful than
slow IO.

-- 
Michal Hocko
SUSE Labs
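
As an illustration of the flag combinations discussed in this thread,
here is a minimal userspace sketch. It is not kernel code and not taken
from the thread: the TOY_* flag values and the toy_* helpers are
hypothetical names invented for this example, and the two predicates
only encode the claims made above (a NOFS allocation does not invoke
the OOM killer, so NOFS combined with NOFAIL can keep retrying without
one). The real allocator logic is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical flag bits for this sketch only. The real definitions
 * live in the kernel's gfp headers and have different values.
 */
#define TOY_GFP_RECLAIM 0x1u
#define TOY_GFP_IO      0x2u
#define TOY_GFP_FS      0x4u
#define TOY_GFP_NOFAIL  0x8u

#define TOY_GFP_KERNEL (TOY_GFP_RECLAIM | TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_NOFS   (TOY_GFP_RECLAIM | TOY_GFP_IO)

/*
 * Assumption taken from the reply above: only allocations that are
 * allowed to recurse into the file system may invoke the OOM killer.
 */
bool toy_may_invoke_oom_killer(unsigned int gfp)
{
	return (gfp & TOY_GFP_FS) != 0;
}

/*
 * The combination flagged in the thread: the allocation must not fail,
 * yet it cannot fall back to the OOM killer to free memory.
 */
bool toy_can_loop_without_oom_kill(unsigned int gfp)
{
	return (gfp & TOY_GFP_NOFAIL) != 0 && !toy_may_invoke_oom_killer(gfp);
}

/* Checks the three cases mentioned in the thread. */
void toy_self_check(void)
{
	assert(toy_may_invoke_oom_killer(TOY_GFP_KERNEL));
	assert(!toy_may_invoke_oom_killer(TOY_GFP_NOFS));
	assert(toy_can_loop_without_oom_kill(TOY_GFP_NOFS | TOY_GFP_NOFAIL));
}
```

In this model the call sites listed above (GFP_NOFS | __GFP_NOFAIL)
correspond to TOY_GFP_NOFS | TOY_GFP_NOFAIL, the only combination for
which toy_can_loop_without_oom_kill() returns true.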