From: Harry Yoo <harry.yoo@oracle.com>
To: Kees Cook <kees@kernel.org>
Cc: Yafang Shao <laoar.shao@gmail.com>,
joel.granados@kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, Josef Bacik <josef@toxicpanda.com>,
linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when reading proc files
Date: Wed, 2 Apr 2025 13:15:14 +0900 [thread overview]
Message-ID: <Z-y50vEs_9MbjQhi@harry> (raw)
In-Reply-To: <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>
On Tue, Apr 01, 2025 at 07:01:04AM -0700, Kees Cook wrote:
>
>
> On April 1, 2025 12:30:46 AM PDT, Yafang Shao <laoar.shao@gmail.com> wrote:
> >While investigating a kcompactd 100% CPU utilization issue in production, I
> >observed frequent costly high-order (order-6) page allocations triggered by
> >proc file reads from monitoring tools. This can be reproduced with a simple
> >test case:
> >
> > fd = open(PROC_FILE, O_RDONLY);
> > size = read(fd, buff, 256KB);
> > close(fd);
> >
> >Although we should modify the monitoring tools to use smaller buffer sizes,
> >we should also enhance the kernel to prevent these expensive high-order
> >allocations.
> >
> >Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> >Cc: Josef Bacik <josef@toxicpanda.com>
> >---
> > fs/proc/proc_sysctl.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> >diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> >index cc9d74a06ff0..c53ba733bda5 100644
> >--- a/fs/proc/proc_sysctl.c
> >+++ b/fs/proc/proc_sysctl.c
> >@@ -581,7 +581,15 @@ static ssize_t proc_sys_call_handler(struct kiocb *iocb, struct iov_iter *iter,
> > error = -ENOMEM;
> > if (count >= KMALLOC_MAX_SIZE)
> > goto out;
> >- kbuf = kvzalloc(count + 1, GFP_KERNEL);
> >+
> >+ /*
> >+ * Use vmalloc if the count is too large to avoid costly high-order page
> >+ * allocations.
> >+ */
> >+ if (count < (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> >+ kbuf = kvzalloc(count + 1, GFP_KERNEL);
>
> Why not move this check into kvmalloc family?
Hmm should this check really be in kvmalloc family?
I don't think users would expect kvmalloc() to implictly decide on using
vmalloc() without trying kmalloc() first, just because it's a high-order
allocation.
> >+ else
> >+ kbuf = vmalloc(count + 1);
>
> You dropped the zeroing. This must be vzalloc.
>
> > if (!kbuf)
> > goto out;
> >
>
> Alternatively, why not force count to be <PAGE_SIZE? What uses >PAGE_SIZE writes in proc/sys?
>
> -Kees
>
> --
> Kees Cook
--
Cheers,
Harry (formerly known as Hyeonggon)
next prev parent reply other threads:[~2025-04-02 4:15 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-01 7:30 [PATCH] proc: Avoid costly high-order page allocations when reading proc files Yafang Shao
2025-04-01 14:01 ` Kees Cook
2025-04-01 14:50 ` Yafang Shao
2025-04-02 4:15 ` Harry Yoo [this message]
2025-04-02 8:42 ` Yafang Shao
2025-04-02 9:25 ` Vlastimil Babka
2025-04-02 12:17 ` Michal Hocko
2025-04-02 18:25 ` Shakeel Butt
2025-04-02 11:32 ` Dave Chinner
2025-04-02 12:24 ` Michal Hocko
2025-04-02 17:24 ` Matthew Wilcox
2025-04-02 18:30 ` Shakeel Butt
2025-04-02 22:38 ` Dave Chinner
2025-04-02 21:16 ` Dave Chinner
2025-04-02 23:10 ` Shakeel Butt
2025-04-03 1:22 ` Dave Chinner
2025-04-03 3:32 ` Yafang Shao
2025-04-03 5:05 ` Shakeel Butt
2025-04-03 7:20 ` Michal Hocko
2025-04-03 4:37 ` Shakeel Butt
2025-04-03 7:22 ` Michal Hocko
2025-04-03 7:43 ` [PATCH] mm: kvmalloc: make kmalloc fast path real fast path Michal Hocko
2025-04-03 8:24 ` Vlastimil Babka
2025-04-03 8:59 ` Michal Hocko
2025-04-03 16:21 ` Kees Cook
2025-04-03 19:49 ` Michal Hocko
2025-04-04 15:33 ` Darrick J. Wong
2025-04-03 18:30 ` Shakeel Butt
2025-04-03 19:51 ` Michal Hocko
2025-04-09 1:10 ` Dave Chinner
2025-06-04 18:42 ` Matthew Wilcox
2025-04-09 7:35 ` Michal Hocko
2025-04-09 9:11 ` Vlastimil Babka
2025-04-09 12:20 ` Michal Hocko
2025-04-09 12:23 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z-y50vEs_9MbjQhi@harry \
--to=harry.yoo@oracle.com \
--cc=joel.granados@kernel.org \
--cc=josef@toxicpanda.com \
--cc=kees@kernel.org \
--cc=laoar.shao@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.