From: Hao Ge <hao.ge@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Sourav Panda <souravpanda@google.com>,
	Abhishek Bapat <abhishekbapat@google.com>
Subject: Re: [PATCH v2 0/6] alloc_tag: introduce IOCTL-based filtering for MAP
Date: Mon, 25 May 2026 15:32:20 +0800
Message-ID: <4ae038f0-cc33-4a60-b59b-ae86bb541735@linux.dev>
In-Reply-To: <20260522131108.f972659717367c67082f3766@linux-foundation.org>

Hi Andrew and Suren,

On 2026/5/23 04:11, Andrew Morton wrote:
> On Fri, 22 May 2026 17:45:32 +0000 Abhishek Bapat <abhishekbapat@google.com> wrote:
>
>> Currently, memory allocation profiling data is primarily exposed through
>> /proc/allocinfo. While useful for manual inspection, this text-based
>> interface poses challenges for production monitoring and large-scale
>> analysis:
>>
>> 1. Userspace must parse large amounts of text to extract specific
>> fields.
>> 2. To find specific tags, userspace must read the entire dataset,
>> requiring many context switches and high data copying.
>> 3. The kernel currently aggregates per-CPU counters for every allocation
>> size, even those the user intends to filter out immediately.
>>
>> This series introduces a new IOCTL-based binary interface for allocinfo
>> that supports kernel-side filtering. By allowing the user to specify a
>> filter mask, we significantly reduce the work performed in-kernel and
>> the amount of data transferred to userspace.
>>
>> Performance measurements were conducted on an Intel Xeon Platinum 8481C
>> (224 CPUs) with caches dropped before each run.
>>
>> The IOCTL mechanism shows a ~20x performance improvement for
>> filtered queries. The kernel avoids the expensive per-CPU counter
>> aggregation (alloc_tag_read) for any tags that fail the initial string
>> or location filters.
>>
>> Scenario 1: Specific File Filtering (arch/x86/events/rapl.c)
>> 1. Traditional (cat /proc/allocinfo | grep): 22ms (sys)
>> 2. IOCTL Interface: 1ms (sys)
>>
>> Scenario 2: Compound Filtering (Filename + Size)
>> 1. Traditional: (cat ... | grep | awk): 21ms (sys)
>> 2. IOCTL Interface: 1ms (sys)
>>
>> Scenario 3: Size-Based Filtering (min_size = 1MB)
>> 1. Traditional: (cat ... | awk): 21ms (sys)
>> 2. IOCTL Interface: 14ms (sys)
> Yup, textual interfaces aren't fast.
>
> And ioctl-based interfaces aren't popular.  One would prefer to see an
> interface which uses read()/lseek(), pread(), etc.  It would be
> appropriate for this [0/N] to have a discussion of why that approach
> was not chosen.
>
>>   .../userspace-api/ioctl/ioctl-number.rst      |   2 +
>>   MAINTAINERS                                   |   2 +
>>   include/linux/codetag.h                       |   1 +
>>   include/uapi/linux/alloc_tag.h                |  87 +++
>>   lib/alloc_tag.c                               | 303 ++++++++++-
>>   lib/codetag.c                                 |  11 +
>>   tools/testing/selftests/alloc_tag/Makefile    |   9 +
>>   .../alloc_tag/allocinfo_ioctl_test.c          | 505 ++++++++++++++++++
>>   8 files changed, 918 insertions(+), 2 deletions(-)
>>   create mode 100644 include/uapi/linux/alloc_tag.h
>>   create mode 100644 tools/testing/selftests/alloc_tag/Makefile
>>   create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
> At some point this should grow user-facing documentation, please.
>
> And the right time for that is now, because such documentation is
> useful for code review - it makes that review both easier and more
> useful.
>
> Sashiko had a few things to say:
>
> 	https://sashiko.dev/#/patchset/cover.1779471082.git.abhishekbapat@google.com

I notice that Sashiko has reported a pre-existing issue, as described below:


> static void *allocinfo_start(struct seq_file *m, loff_t *pos)
>
> This is a pre-existing issue, but can resuming a sequential read on
> /proc/allocinfo cause a use-after-free if a kernel module is unloaded
> between read() system calls?
> The seq_file read operation updates priv->iter.ct during allocinfo_next(),
> stops iteration, and returns to userspace. If the module containing
> priv->iter.ct is unloaded while the lock is dropped, the module's codetag
> memory is freed.
> On the next read() system call, allocinfo_start() with pos > 0 reacquires
> the lock but returns priv without validating if priv->iter.ct still belongs
> to a valid module. Does allocinfo_show() then dereference this dangling
> pointer?
> [ ... ]

This issue is unrelated to the current patch series and can be resolved
by reverting commit 9f44df50fee4.

Therefore, I have submitted a separate patch addressing this issue,
which is available at the link below:

https://lore.kernel.org/all/20260525072117.112779-1-hao.ge@linux.dev/
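
For anyone following along, the resume hazard described in the report can
be modelled in plain userspace C. This is only a sketch under stated
assumptions: it is not kernel code, it is not the change made by the linked
patch, and struct tag, struct module, tag_at() and resume_demo() are
hypothetical names invented for illustration. It shows one generic way to
avoid trusting a pointer cached across two reads: resolve the logical
position again against whatever is still loaded.

```c
#include <assert.h>
#include <stddef.h>

#define NMODS 2
#define NTAGS 2

/* Hypothetical stand-ins for a codetag and the module that owns it. */
struct tag { const char *name; };
struct module {
	int loaded;                 /* cleared when the module is "unloaded" */
	struct tag tags[NTAGS];
};

static struct module mods[NMODS] = {
	{ 1, { { "core:a" }, { "core:b" } } },
	{ 1, { { "mod:x" },  { "mod:y" }  } },
};

/*
 * Resolve a logical position by walking only the modules that are loaded
 * right now. Returns NULL once pos is past the last live tag.
 */
static const struct tag *tag_at(size_t pos)
{
	for (size_t m = 0; m < NMODS; m++) {
		if (!mods[m].loaded)
			continue;
		if (pos < NTAGS)
			return &mods[m].tags[pos];
		pos -= NTAGS;
	}
	return NULL;
}

/* Simulates two read() calls with a module unload in between. */
static int resume_demo(void)
{
	size_t pos = 0;
	const struct tag *cached;

	/* First "read()": consume two entries and stop at the third. */
	pos += 2;
	cached = tag_at(pos);       /* points into mods[1] */
	assert(cached == &mods[1].tags[0]);

	mods[1].loaded = 0;         /* module goes away between the reads */

	/* Second "read()": resolve pos again instead of reusing cached. */
	return tag_at(pos) == NULL; /* 1 means iteration ends cleanly */
}
```

In this model the pointer saved in cached is never dereferenced after the
unload; the second lookup walks only live modules and reports end of data.
The trade-off is that every resume pays for a walk from the beginning.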

Thanks

Best Regards

Hao


