From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 2 Apr 2025 18:24:10 +0100
From: Matthew Wilcox
To: Michal Hocko
Cc: Dave Chinner, Yafang Shao, Harry Yoo, Kees Cook, joel.granados@kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Josef Bacik, linux-mm@kvack.org, Vlastimil Babka
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when reading proc files
References: <20250401073046.51121-1-laoar.shao@gmail.com>
	<3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 02, 2025 at 02:24:45PM +0200, Michal Hocko wrote:
> On Wed 02-04-25 22:32:14, Dave Chinner wrote:
> > > > > >+	/*
> > > > > >+	 * Use vmalloc if the count is too large to avoid costly high-order page
> > > > > >+	 * allocations.
> > > > > >+	 */
> > > > > >+	if (count < (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > > > > >+		kbuf = kvzalloc(count + 1, GFP_KERNEL);
> > > > >
> > > > > Why not move this check into kvmalloc family?
> > > >
> > > > Hmm should this check really be in kvmalloc family?
> > >
> > > Modifying the existing kvmalloc functions risks performance regressions.
> > > Could we instead introduce a new variant like vkmalloc() (favoring
> > > vmalloc over kmalloc) or kvmalloc_costless()?
> >
> > We should fix kvmalloc() instead of continuing to force
> > subsystems to work around the limitations of kvmalloc().
>
> Agreed!
>
> > Have a look at xlog_kvmalloc() in XFS. It implements a basic
> > fast-fail, no retry high order kmalloc before it falls back to
> > vmalloc by turning off direct reclaim for the kmalloc() call.
> > Hence if the there isn't a high-order page on the free lists ready
> > to allocate, it falls back to vmalloc() immediately.

... but if vmalloc fails, it goes around again!  This is exactly why
we don't want filesystems implementing workarounds for MM problems.
What a mess.

>  	if (size > PAGE_SIZE) {
>  		flags |= __GFP_NOWARN;
>  
>  		if (!(flags & __GFP_RETRY_MAYFAIL))
>  			flags |= __GFP_NORETRY;
> +		else
> +			flags &= ~__GFP_DIRECT_RECLAIM;

I think it might be better to do this:

 		flags |= __GFP_NOWARN;
 
 		if (!(flags & __GFP_RETRY_MAYFAIL))
 			flags |= __GFP_NORETRY;
+		else if (size > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+			flags &= ~__GFP_DIRECT_RECLAIM;

I think it's entirely appropriate for a call to kvmalloc() to do
direct reclaim if it's asking for, say, 16KiB and we don't have any of
those available.  Better than exacerbating the fragmentation problem
by allocating 4x4KiB pages, each from different groupings.
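
The flag handling discussed in this message can be tried out in
userspace with a small sketch.  This is only an illustration, not the
kernel code: the DEMO_* names, the flag bit values, the 4KiB page size
and the helper demo_adjust_kmalloc_flags() are all made up for the
example.  The only value taken from the kernel is
PAGE_ALLOC_COSTLY_ORDER being 3, which puts the "costly" threshold at
32KiB with 4KiB pages.  The logic mirrors the two diff hunks as written
in the thread.

```c
#include <stddef.h>

/* Stand-in definitions for illustration only; these are NOT the
 * kernel's real flag values or types. */
typedef unsigned int demo_gfp_t;

#define DEMO_PAGE_SIZE               4096UL
#define DEMO_PAGE_ALLOC_COSTLY_ORDER 3	/* same value the kernel uses */
#define DEMO_GFP_DIRECT_RECLAIM      0x01u
#define DEMO_GFP_NOWARN              0x02u
#define DEMO_GFP_NORETRY             0x04u
#define DEMO_GFP_RETRY_MAYFAIL       0x08u

/*
 * Hypothetical helper: adjust the flags used for the kmalloc attempt,
 * following the suggested diff.  Direct reclaim is dropped only when
 * the caller passed RETRY_MAYFAIL and the request is larger than a
 * costly-order allocation.
 */
demo_gfp_t demo_adjust_kmalloc_flags(demo_gfp_t flags, size_t size)
{
	if (size > DEMO_PAGE_SIZE) {
		flags |= DEMO_GFP_NOWARN;

		if (!(flags & DEMO_GFP_RETRY_MAYFAIL))
			flags |= DEMO_GFP_NORETRY;
		else if (size > (DEMO_PAGE_SIZE << DEMO_PAGE_ALLOC_COSTLY_ORDER))
			flags &= ~DEMO_GFP_DIRECT_RECLAIM;
	}

	return flags;
}
```

With these stand-ins, a 16KiB request keeps direct reclaim whatever
flags were passed, since it is below the 32KiB threshold.  A 64KiB
request loses direct reclaim only on the RETRY_MAYFAIL path.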