From: Hao Ge <hao.ge@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
Shuah Khan <skhan@linuxfoundation.org>,
Jonathan Corbet <corbet@lwn.net>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, Sourav Panda <souravpanda@google.com>,
Abhishek Bapat <abhishekbapat@google.com>
Subject: Re: [PATCH v2 0/6] alloc_tag: introduce IOCTL-based filtering for MAP
Date: Mon, 25 May 2026 15:32:20 +0800
Message-ID: <4ae038f0-cc33-4a60-b59b-ae86bb541735@linux.dev>
In-Reply-To: <20260522131108.f972659717367c67082f3766@linux-foundation.org>

Hi Andrew and Suren,

On 2026/5/23 04:11, Andrew Morton wrote:
> On Fri, 22 May 2026 17:45:32 +0000 Abhishek Bapat <abhishekbapat@google.com> wrote:
>
>> Currently, memory allocation profiling data is primarily exposed through
>> /proc/allocinfo. While useful for manual inspection, this text-based
>> interface poses challenges for production monitoring and large-scale
>> analysis:
>>
>> 1. Userspace must parse large amounts of text to extract specific
>> fields.
>> 2. To find specific tags, userspace must read the entire dataset,
>> requiring many context switches and high data copying.
>> 3. The kernel currently aggregates per-CPU counters for every allocation
>> size, even those the user intends to filter out immediately.
>>
>> This series introduces a new IOCTL-based binary interface for allocinfo
>> that supports kernel-side filtering. By allowing the user to specify a
>> filter mask, we significantly reduce the work performed in-kernel and
>> the amount of data transferred to userspace.
>>
>> Performance measurements were conducted on an Intel Xeon Platinum 8481C
>> (224 CPUs) with caches dropped before each run.
>>
>> The IOCTL mechanism shows a ~20x performance improvement for
>> filtered queries. The kernel avoids the expensive per-CPU counter
>> aggregation (alloc_tag_read) for any tags that fail the initial string
>> or location filters.
>>
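
To make the ordering described above concrete, here is a minimal userspace sketch of "filter first, aggregate later". It is not the code from the series: `demo_tag`, `demo_filter`, `demo_tag_read()` and `demo_count_matches()` are hypothetical names, and `DEMO_NR_CPUS` is an arbitrary illustrative value. The only point it shows is that a cheap string check can skip the per-CPU summation entirely, while a size check cannot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NR_CPUS 4 /* hypothetical, for illustration only */

/* Hypothetical stand-in for a code tag with per-CPU byte counters. */
struct demo_tag {
	const char *filename;
	uint64_t bytes[DEMO_NR_CPUS];
};

/* Hypothetical filter: optional file name plus a minimum total size. */
struct demo_filter {
	const char *filename; /* NULL means "any file" */
	uint64_t min_bytes;   /* 0 means "any size" */
};

/* The expensive step: sum one tag's counter across all CPUs. */
static uint64_t demo_tag_read(const struct demo_tag *tag, unsigned int *reads)
{
	uint64_t sum = 0;

	for (size_t cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += tag->bytes[cpu];
	(*reads)++; /* count how many aggregations were performed */
	return sum;
}

/* Returns the number of matching tags; *reads reports aggregations done. */
static size_t demo_count_matches(const struct demo_tag *tags, size_t n,
				 const struct demo_filter *f,
				 unsigned int *reads)
{
	size_t matches = 0;

	*reads = 0;
	for (size_t i = 0; i < n; i++) {
		/* Cheap check first: rejected tags never touch the counters. */
		if (f->filename && strcmp(tags[i].filename, f->filename) != 0)
			continue;
		/* A size check needs the aggregated value, so it cannot be skipped. */
		if (demo_tag_read(&tags[i], reads) >= f->min_bytes)
			matches++;
	}
	return matches;
}
```

In this model a filename filter aggregates only the tags that match the name, whereas a size-only filter still aggregates every tag. That would be consistent with the smaller gain reported for Scenario 3 below, though the sketch does not prove that this is the cause.
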
>> Scenario 1: Specific File Filtering (arch/x86/events/rapl.c)
>> 1. Traditional (cat /proc/allocinfo | grep): 22ms (sys)
>> 2. IOCTL Interface: 1ms (sys)
>>
>> Scenario 2: Compound Filtering (Filename + Size)
>> 1. Traditional: (cat ... | grep | awk): 21ms (sys)
>> 2. IOCTL Interface: 1ms (sys)
>>
>> Scenario 3: Size-Based Filtering (min_size = 1MB)
>> 1. Traditional: (cat ... | awk): 21ms (sys)
>> 2. IOCTL Interface: 14ms (sys)
> Yup, textual interfaces aren't fast.
>
> And ioctl-based interfaces aren't popular. One would prefer to see an
> interface which uses read()/lseek(), pread(), etc. It would be
> appropriate for this [0/N] to have a discussion of why that approach
> was not chosen.
>
>> .../userspace-api/ioctl/ioctl-number.rst | 2 +
>> MAINTAINERS | 2 +
>> include/linux/codetag.h | 1 +
>> include/uapi/linux/alloc_tag.h | 87 +++
>> lib/alloc_tag.c | 303 ++++++++++-
>> lib/codetag.c | 11 +
>> tools/testing/selftests/alloc_tag/Makefile | 9 +
>> .../alloc_tag/allocinfo_ioctl_test.c | 505 ++++++++++++++++++
>> 8 files changed, 918 insertions(+), 2 deletions(-)
>> create mode 100644 include/uapi/linux/alloc_tag.h
>> create mode 100644 tools/testing/selftests/alloc_tag/Makefile
>> create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
> At some point this should grow user-facing documentation, please.
>
> And the right time for that is now, because such documentation is
> useful for code review - it makes that review both easier and more
> useful.
>
> Sashiko had a few things to say:
>
> https://sashiko.dev/#/patchset/cover.1779471082.git.abhishekbapat@google.com

I notice that Sashiko has reported a pre-existing issue, as described below:

> static void *allocinfo_start(struct seq_file *m, loff_t *pos)

This is a pre-existing issue, but can resuming a sequential read on
/proc/allocinfo cause a use-after-free if a kernel module is unloaded
between read() system calls?

The seq_file read operation updates priv->iter.ct during allocinfo_next(),
stops iteration, and returns to userspace. If the module containing
priv->iter.ct is unloaded while the lock is dropped, the module's codetag
memory is freed.

On the next read() system call, allocinfo_start() with pos > 0 reacquires
the lock but returns priv without validating if priv->iter.ct still belongs
to a valid module. Does allocinfo_show() then dereference this dangling
pointer?

[ ... ]

This issue is unrelated to the current patch series and can be resolved
by reverting commit 9f44df50fee4.
Therefore, I have submitted a separate patch addressing this issue,
which is available at the link below:
https://lore.kernel.org/all/20260525072117.112779-1-hao.ge@linux.dev/
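
For readers less familiar with the hazard in the report quoted above, the following is a minimal userspace model of it. This is not the kernel code and not what the linked patch does. All names (`demo_registry`, `demo_iter`, `demo_unload()`, `demo_iter_resume()`) are hypothetical, and the generation counter is just one illustrative way to detect that cached iterator state went stale while the lock was dropped between two read() calls.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_MODULES 4 /* hypothetical, for illustration only */

/* Hypothetical registry standing in for the set of loaded modules. */
struct demo_registry {
	bool loaded[DEMO_MAX_MODULES];
	unsigned long generation; /* bumped on every unload */
};

/* Iterator state kept between read() calls, like seq_file private data. */
struct demo_iter {
	size_t module;            /* cached position (a slot index here) */
	unsigned long generation; /* registry generation when it was cached */
};

/* Models a module unload that happens while no lock is held. */
static void demo_unload(struct demo_registry *r, size_t idx)
{
	r->loaded[idx] = false;
	r->generation++;
}

/* First loaded slot at or after 'from', or DEMO_MAX_MODULES if none. */
static size_t demo_next_loaded(const struct demo_registry *r, size_t from)
{
	while (from < DEMO_MAX_MODULES && !r->loaded[from])
		from++;
	return from;
}

/*
 * Called at the start of each read(): revalidate the cached position
 * instead of trusting it blindly. Returns false when nothing is left.
 */
static bool demo_iter_resume(const struct demo_registry *r,
			     struct demo_iter *it)
{
	if (it->generation != r->generation) {
		/* The registry changed: move off any slot that went away. */
		it->module = demo_next_loaded(r, it->module);
		it->generation = r->generation;
	}
	return it->module < DEMO_MAX_MODULES;
}
```

The model uses slot indices rather than raw pointers so that the stale case stays well defined. In the situation the report describes, the cached state is a pointer, which is why reusing it without any such check could dereference freed memory.
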
Thanks
Best Regards
Hao