Linux Documentation
* Re: [PATCH RFC v4 0/6] iio: add Open Sensor Fusion IIO driver
From: Kim Jinseob @ 2026-06-08 23:27 UTC
  To: Conor Dooley
  Cc: Jonathan Cameron, linux-iio, David Lechner, Nuno Sá,
	Andy Shevchenko, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jonathan Corbet, Shuah Khan, devicetree, linux-kernel, linux-doc
In-Reply-To: <20260608-catnap-thinness-e25c9b8983c3@spud>

> Other than the fact that new revisions must not be sent as a diff on top
> of a prior revision, please stop sending new versions without actually
> replying to my v1 comments.

You are right, sorry for the noise.

I made a process mistake here. I prepared v4 on top of the previously sent
series instead of preparing it as a full standalone replacement series from
the proper base. I will not ask you to review v4 in that form, and I will
prepare the next revision as a full series from a clean base.

I also should have answered your protocol versioning questions directly before
sending another revision.

> What does "v0" mean here? Is the data format not complete yet?
> Are versions of the protocol likely to be backwards compatible?
> Will the device identify what version of the protocol it implements?

The current OSF wire header starts with a fixed 4-byte magic, "OSF0", at a
fixed offset. The same header also carries explicit protocol_major and
protocol_minor fields at fixed offsets.

For the currently supported firmware and driver, protocol_major is 0. The "0"
in "OSF0" is intended to denote the current major wire-format revision, not
the Linux driver identity.

The binding is intended for devices implementing this discoverable OSF header
layout. The driver currently supports protocol major version 0. Minor version
changes are intended to be backward compatible. Incompatible wire-format
changes require a new protocol_major value.

If a future major revision cannot be discovered using the same fixed header
layout, or is not compatible with this binding, it should use a new compatible
string.
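
For illustration only, the discovery rule described above could be sketched
as follows. This is not code from the series: the thread only states that the
magic and the version fields sit at fixed offsets, so the offsets, the
one-byte field widths, and every name below (osf_check_header, OSF_OFF_*,
enum osf_compat) are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: real offsets and field widths are not given in the thread. */
#define OSF_MAGIC            "OSF0"
#define OSF_MAGIC_LEN        4
#define OSF_OFF_MAGIC        0	/* assumed */
#define OSF_OFF_PROTO_MAJOR  4	/* assumed, one byte */
#define OSF_OFF_PROTO_MINOR  5	/* assumed, one byte */
#define OSF_MIN_HEADER_LEN   6	/* assumed */

#define OSF_SUPPORTED_MAJOR  0	/* the driver currently supports major 0 */

enum osf_compat {
	OSF_COMPAT_OK,		/* magic and major match; any minor is accepted */
	OSF_COMPAT_BAD_HEADER,	/* buffer too short or magic mismatch */
	OSF_COMPAT_BAD_MAJOR,	/* incompatible wire-format revision */
};

/* Check the fixed header; report the minor version when the header is usable. */
enum osf_compat osf_check_header(const uint8_t *buf, size_t len,
				 uint8_t *minor_out)
{
	if (!buf || len < OSF_MIN_HEADER_LEN)
		return OSF_COMPAT_BAD_HEADER;
	if (memcmp(buf + OSF_OFF_MAGIC, OSF_MAGIC, OSF_MAGIC_LEN) != 0)
		return OSF_COMPAT_BAD_HEADER;
	if (buf[OSF_OFF_PROTO_MAJOR] != OSF_SUPPORTED_MAJOR)
		return OSF_COMPAT_BAD_MAJOR;
	/* Minor revisions are intended to be backward compatible. */
	if (minor_out)
		*minor_out = buf[OSF_OFF_PROTO_MINOR];
	return OSF_COMPAT_OK;
}
```

Under these assumptions a newer minor is accepted and merely reported to the
caller, while a different major is rejected before any payload is parsed.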

I will spell this out in the binding commit message and documentation in the
next revision.

Jinseob

* Re: [RFC PATCH v1 00/13] exec: add spawn templates for repeated executable startup
From: Andy Lutomirski @ 2026-06-09  0:01 UTC
  To: Christian Brauner
  Cc: Li Chen, Kees Cook, Alexander Viro, linux-fsdevel, linux-api,
	linux-kernel, linux-mm, linux-arch, linux-doc, linux-kselftest,
	x86, Arnd Bergmann, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Jan Kara,
	Jonathan Corbet, Shuah Khan
In-Reply-To: <20260528-madig-fachrichtung-fehlinformation-61117ba640da@brauner>

On Thu, May 28, 2026 at 4:05 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, May 28, 2026 at 05:52:21PM +0800, Li Chen wrote:
> > Hi,
> >
> > This is an early RFC for an idea that is probably still rough in both the
> > UAPI and implementation details. Sorry for the rough edges; I am sending
> > it now to check whether this direction is worth pursuing and to get
> > feedback on the kernel/userspace boundary.
>
> The idea of having a builder api for exec isn't all that crazy. But it
> should simply be built on top of pidfds and thus pidfs itself instead.
> It has all the basic infrastructure in place already. Any implementation
> should also allow userspace to implement posix_spawn() on top of it.
>
> fd = pidfd_open(0, PIDFD_EMPTY /* or better name */)
>
> pidfd_config(fd, ...) // modeled similar to fsconfig()
>

After contemplating this for a bit... why pidfd?  Doesn't a pidfd
refer to an actual process that is, or at least was, running?  This
new thing is a process that we are contemplating spawning.  I can
imagine that basically all pidfd APIs would be a bit confused by the
nonexistence of the process in question.

* Re: [PATCH v3 0/6] alloc_tag: introduce IOCTL-based filtering for MAP
From: Abhishek Bapat @ 2026-06-09  0:02 UTC
  To: Andrew Morton
  Cc: Suren Baghdasaryan, Kent Overstreet, Hao Ge, Shuah Khan,
	Jonathan Corbet, linux-doc, linux-kernel, linux-mm, Sourav Panda
In-Reply-To: <20260605170858.9ee9ca2181a041bb9a4c3098@linux-foundation.org>

On Fri, Jun 5, 2026 at 5:09 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Fri,  5 Jun 2026 23:36:45 +0000 Abhishek Bapat <abhishekbapat@google.com> wrote:
>
> > Currently, memory allocation profiling data is primarily exposed through
> > /proc/allocinfo. While useful for manual inspection, this text-based
> > interface poses challenges for production monitoring and large-scale
> > analysis:
> >
> > 1. Userspace must parse large amounts of text to extract specific
> > fields.
> > 2. To find specific tags, userspace must read the entire dataset,
> > requiring many context switches and high data copying.
> > 3. The kernel currently aggregates per-CPU counters for every allocation
> > size, even those the user intends to filter out immediately.
> >
> > This series introduces a new IOCTL-based binary interface for allocinfo
> > that supports kernel-side filtering. By allowing the user to specify a
> > filter mask, we significantly reduce the work performed in-kernel and
> > the amount of data transferred to userspace.
>
> Thanks.  AI review found several things - you'll want to address at
> least the first few.
>
>         https://sashiko.dev/#/patchset/cover.1780701922.git.abhishekbapat@google.com

All, please note I missed attaching the reason for choosing the IOCTL
mechanism to this cover letter, but I will attach it to the v4
patchset cover letter along with other changes. Thanks!

* Re: [PATCH v3 1/6] alloc_tag: add ioctl to /proc/allocinfo
From: Abhishek Bapat @ 2026-06-09  0:19 UTC
  To: Hao Ge
  Cc: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Shuah Khan,
	Jonathan Corbet, linux-doc, linux-kernel, linux-mm, Sourav Panda
In-Reply-To: <41a7ebb9-1113-4f13-abbf-6f55d99d62f3@linux.dev>

On Sun, Jun 7, 2026 at 6:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>
> Hi Suren and Abhishek
>
>
> Thanks for the new version.
>
>
> On 2026/6/6 07:36, Abhishek Bapat wrote:
> > From: Suren Baghdasaryan <surenb@google.com>
> >
> > Add the following ioctl commands for /proc/allocinfo file:
> >
> > ALLOCINFO_IOC_CONTENT_ID - gets content identifier which can be used
> > to check whether the file content has changed specifically due to module
> > load/unload. Every time a module is loaded / unloaded, the returned
> > value will be different. By comparing the identifier value at the
> > beginning and at the end of the content retrieval operation, users can
> > validate retrieved information for consistency.
> >
> > ALLOCINFO_IOC_GET_AT - gets the record at the specified position. This
> > is the position of a record in /proc/allocinfo.
> >
> > ALLOCINFO_IOC_GET_NEXT - gets the record next to the last retrieved
> > one. If no records were previously retrieved, returns the first
> > record.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
> > ---
> >   Documentation/mm/allocation-profiling.rst     |   5 +
> >   .../userspace-api/ioctl/ioctl-number.rst      |   2 +
> >   MAINTAINERS                                   |   1 +
> >   include/linux/codetag.h                       |   2 +
> >   include/uapi/linux/alloc_tag.h                |  54 ++++
> >   lib/alloc_tag.c                               | 232 +++++++++++++++++-
> >   lib/codetag.c                                 |  18 ++
> >   7 files changed, 312 insertions(+), 2 deletions(-)
> >   create mode 100644 include/uapi/linux/alloc_tag.h
> >
> > diff --git a/Documentation/mm/allocation-profiling.rst b/Documentation/mm/allocation-profiling.rst
> > index 5389d241176a..c3a28467955f 100644
> > --- a/Documentation/mm/allocation-profiling.rst
> > +++ b/Documentation/mm/allocation-profiling.rst
> > @@ -46,6 +46,11 @@ sysctl:
> >   Runtime info:
> >     /proc/allocinfo
> >
> > +  Profiling data can be retrieved either by reading `/proc/allocinfo` directly as
> > +  text or programmatically via `ioctl()` calls defined in `<uapi/linux/alloc_tag.h>`.
> > +  The ioctl interface supports structured binary data extraction as well as filtering
> > +  by module name, function, file, line number, accuracy, or allocation size limits.
> > +
> >   Example output::
> >
> >     root@moria-kvm:~# sort -g /proc/allocinfo|tail|numfmt --to=iec
> > diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> > index 331223761fff..84f6808a8578 100644
> > --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> > +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> > @@ -349,6 +349,8 @@ Code  Seq#    Include File                                             Comments
> >                                                                          <mailto:luzmaximilian@gmail.com>
> >   0xA5  20-2F  linux/surface_aggregator/dtx.h                            Microsoft Surface DTX driver
> >                                                                          <mailto:luzmaximilian@gmail.com>
> > +0xA6  00-0F  uapi/linux/alloc_tag.h                                    Memory allocation profiling
> > +                                                                       <mailto:surenb@google.com>
> >   0xAA  00-3F  linux/uapi/linux/userfaultfd.h
> >   0xAB  00-1F  linux/nbd.h
> >   0xAC  00-1F  linux/raw.h
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index a31f6f207afd..77f3fc487691 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -16711,6 +16711,7 @@ S:    Maintained
> >   F:  Documentation/mm/allocation-profiling.rst
> >   F:  include/linux/alloc_tag.h
> >   F:  include/linux/pgalloc_tag.h
> > +F:   include/uapi/linux/alloc_tag.h
> >   F:  lib/alloc_tag.c
> >
> >   MEMORY CONTROLLER DRIVERS
> > diff --git a/include/linux/codetag.h b/include/linux/codetag.h
> > index ddae7484ca45..a25a085c2df1 100644
> > --- a/include/linux/codetag.h
> > +++ b/include/linux/codetag.h
> > @@ -77,6 +77,8 @@ struct codetag_iterator {
> >   void codetag_lock_module_list(struct codetag_type *cttype);
> >   bool codetag_trylock_module_list(struct codetag_type *cttype);
> >   void codetag_unlock_module_list(struct codetag_type *cttype);
> > +unsigned long codetag_get_content_id(struct codetag_type *cttype);
> > +unsigned int codetag_get_count(struct codetag_type *cttype);
> >   struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
> >   struct codetag *codetag_next_ct(struct codetag_iterator *iter);
> >
> > diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
> > new file mode 100644
> > index 000000000000..901199bad514
> > --- /dev/null
> > +++ b/include/uapi/linux/alloc_tag.h
> > @@ -0,0 +1,54 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/*
> > + *  include/linux/alloc_tag.h
>
> nit: it should be include/uapi/linux/alloc_tag.h
>
> (I guess you may have missed the comment I brought up before. It is not
> a critical problem though.)
>
Apologies, I missed that comment earlier. Included in the v4 patchset.
Thanks for bringing this up.

> > + */
> > +
> > +#ifndef _UAPI_ALLOC_TAG_H
> > +#define _UAPI_ALLOC_TAG_H
> > +
> > +#include <linux/types.h>
> > +
> > +#define ALLOCINFO_STR_SIZE   64
> > +
> > +struct allocinfo_content_id {
> > +     __u64 id;
> > +};
> > +
> > +struct allocinfo_tag {
> > +     /* Longer names are trimmed */
> > +     char modname[ALLOCINFO_STR_SIZE];
> > +     char function[ALLOCINFO_STR_SIZE];
> > +     char filename[ALLOCINFO_STR_SIZE];
> > +     __u64 lineno;
> > +};
> > +
> > +/* The alignment ensures 32-bit compatible interfaces are not broken */
> > +struct allocinfo_counter {
> > +     __u64 bytes;
> > +     __u64 calls;
> > +     __u8 accurate;
> > +} __attribute__((aligned(8)));
> > +
> > +struct allocinfo_tag_data {
> > +     struct allocinfo_tag tag;
> > +     struct allocinfo_counter counter;
> > +};
> > +
> > +struct allocinfo_get_at {
> > +     __u64 pos;      /* input */
> > +     struct allocinfo_tag_data data;
> > +};
> > +
> > +#define _ALLOCINFO_IOC_CONTENT_ID    0
> > +#define _ALLOCINFO_IOC_GET_AT                1
> > +#define _ALLOCINFO_IOC_GET_NEXT              2
> > +
> > +#define ALLOCINFO_IOC_BASE           0xA6
> > +#define ALLOCINFO_IOC_CONTENT_ID     _IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_CONTENT_ID,     \
> > +                                          struct allocinfo_content_id)
> > +#define ALLOCINFO_IOC_GET_AT         _IOWR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_AT,        \
> > +                                           struct allocinfo_get_at)
> > +#define ALLOCINFO_IOC_GET_NEXT               _IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_NEXT,       \
> > +                                          struct allocinfo_tag_data)
> > +
> > +#endif /* _UAPI_ALLOC_TAG_H */
> > diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> > index d9be1cf5187d..a0577215eb3d 100644
> > --- a/lib/alloc_tag.c
> > +++ b/lib/alloc_tag.c
> > @@ -5,6 +5,7 @@
> >   #include <linux/gfp.h>
> >   #include <linux/kallsyms.h>
> >   #include <linux/module.h>
> > +#include <linux/mutex.h>
> >   #include <linux/page_ext.h>
> >   #include <linux/pgalloc_tag.h>
> >   #include <linux/proc_fs.h>
> > @@ -14,6 +15,7 @@
> >   #include <linux/string_choices.h>
> >   #include <linux/vmalloc.h>
> >   #include <linux/kmemleak.h>
> > +#include <uapi/linux/alloc_tag.h>
> >
> >   #define ALLOCINFO_FILE_NAME         "allocinfo"
> >   #define MODULE_ALLOC_TAG_VMAP_SIZE  (100000UL * sizeof(struct alloc_tag))
> > @@ -47,6 +49,10 @@ struct allocinfo_private {
> >       struct codetag_iterator iter;
> >       struct codetag_iterator reported_iter;
> >       bool print_header;
> > +     /* ioctl uses a separate iterator not to interfere with reads */
> > +     struct codetag_iterator ioctl_iter;
> > +     bool positioned; /* seq_open_private() sets to 0 */
> > +     struct mutex ioctl_lock;
> >   };
> >
> >   static void *allocinfo_start(struct seq_file *m, loff_t *pos)
> > @@ -130,6 +136,229 @@ static const struct seq_operations allocinfo_seq_op = {
> >       .show   = allocinfo_show,
> >   };
> >
> > +/*
> > + * Initializes seq_file operations and allocates private state when opening
> > + * the /proc/allocinfo procfs entry.
> > + */
> > +static int allocinfo_open(struct inode *inode, struct file *file)
> > +{
> > +     int ret;
> > +
> > +     ret = seq_open_private(file, &allocinfo_seq_op,
> > +                            sizeof(struct allocinfo_private));
> > +     if (!ret) {
> > +             struct seq_file *m = file->private_data;
> > +             struct allocinfo_private *priv = m->private;
> > +
> > +             mutex_init(&priv->ioctl_lock);
> > +     }
> > +     return ret;
> > +}
> > +
> > +/*
> > + * Cleans up the seq_file state and frees up the private state allocated in
> > + * allocinfo_open() when closing the /proc/allocinfo file descriptor.
> > + */
> > +static int allocinfo_release(struct inode *inode, struct file *file)
> > +{
> > +     return seq_release_private(inode, file);
> > +}
> > +
> > +/*
> > + * Returns a pointer to the suffix of a string so that its length fits within
> > + * ALLOCINFO_STR_SIZE, preserving the trailing characters.
> > + */
> > +static const char *allocinfo_str(const char *str)
> > +{
> > +     size_t len = strlen(str);
> > +
> > +     /* Keep an extra space for the trailing NULL. */
> > +     if (len >= ALLOCINFO_STR_SIZE)
> > +             str += (len - ALLOCINFO_STR_SIZE) + 1;
> > +     return str;
> > +}
> > +
> > +/* Copy a string and trim from the beginning if it's too long */
> > +static void allocinfo_copy_str(char *dest, const char *src)
> > +{
> > +     strscpy_pad(dest, allocinfo_str(src), ALLOCINFO_STR_SIZE);
> > +}
> > +
> > +/*
> > + * Populates the UAPI allocinfo_tag_data structure with active runtime
> > + * profiling counters extracted from the given kernel codetag.
> > + */
> > +static void allocinfo_to_params(struct codetag *ct,
> > +                             struct allocinfo_tag_data *data)
> > +{
> > +     struct alloc_tag *tag = ct_to_alloc_tag(ct);
> > +     struct alloc_tag_counters counter = alloc_tag_read(tag);
> > +
> > +     if (ct->modname)
> > +             allocinfo_copy_str(data->tag.modname, ct->modname);
> > +     else
> > +             data->tag.modname[0] = '\0';
> > +     allocinfo_copy_str(data->tag.function, ct->function);
> > +     allocinfo_copy_str(data->tag.filename, ct->filename);
> > +     data->tag.lineno = ct->lineno;
> > +     data->counter.bytes = counter.bytes;
> > +     data->counter.calls = counter.calls;
> > +     data->counter.accurate = !alloc_tag_is_inaccurate(tag);
> > +}
> > +
> > +/*
> > + * Retrieves the unique content ID representing the current allocation tag module
> > + * layout, allowing userspace to detect if modules were loaded / unloaded.
> > + */
> > +static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
> > +{
> > +     struct allocinfo_content_id params;
> > +
> > +     codetag_lock_module_list(alloc_tag_cttype);
> > +     params.id = codetag_get_content_id(alloc_tag_cttype);
> > +     codetag_unlock_module_list(alloc_tag_cttype);
> > +     if (copy_to_user(arg, &params, sizeof(params)))
> > +             return -EFAULT;
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Seeks the ioctl iterator to the specified 0-indexed tag position, reads its
> > + * profiling data and returns it to userspace.
> > + */
> > +static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
> > +{
> > +     struct allocinfo_private *priv;
> > +     struct codetag *ct;
> > +     __u64 pos;
> > +     struct allocinfo_get_at params = {0};
> > +
> > +     if (copy_from_user(&params, arg, sizeof(params)))
> > +             return -EFAULT;
> > +
> > +     priv = m->private;
> > +     pos = params.pos;
> > +
> > +     mutex_lock(&priv->ioctl_lock);
> > +     codetag_lock_module_list(alloc_tag_cttype);
> > +
> > +     if (pos >= codetag_get_count(alloc_tag_cttype)) {
> > +             codetag_unlock_module_list(alloc_tag_cttype);
> > +             mutex_unlock(&priv->ioctl_lock);
> > +             return -ENOENT;
> > +     }
> > +
> > +     /* Find the codetag */
> > +     priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
> > +     ct = codetag_next_ct(&priv->ioctl_iter);
> > +     while (ct && pos--)
> > +             ct = codetag_next_ct(&priv->ioctl_iter);
> > +     if (ct) {
> > +             allocinfo_to_params(ct, &params.data);
> > +             priv->positioned = true;
> > +     }
> > +
> > +     codetag_unlock_module_list(alloc_tag_cttype);
> > +     mutex_unlock(&priv->ioctl_lock);
> > +
> > +     if (!ct)
> > +             return -ENOENT;
> > +
> > +     if (copy_to_user(arg, &params, sizeof(params)))
> > +             return -EFAULT;
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Advances the ioctl iterator to the next allocation tag in the sequence and
> > + * returns its profiling data to userspace.
> > + */
> > +static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
> > +{
> > +     struct allocinfo_private *priv;
> > +     struct codetag *ct;
> > +     struct allocinfo_tag_data params;
> > +     int ret = 0;
> > +
> > +     memset(&params, 0, sizeof(params));
> > +     priv = m->private;
> > +
> > +     mutex_lock(&priv->ioctl_lock);
> > +     codetag_lock_module_list(alloc_tag_cttype);
> > +
> > +     if (!priv->positioned) {
> > +             priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
> > +             priv->positioned = true;
> > +     }
> > +
> > +     ct = codetag_next_ct(&priv->ioctl_iter);
> > +     if (ct)
> > +             allocinfo_to_params(ct, &params);
> > +
> > +     if (!ct) {
> > +             priv->positioned = false;
> > +             ret = -ENOENT;
> > +     }
> > +     codetag_unlock_module_list(alloc_tag_cttype);
> > +     mutex_unlock(&priv->ioctl_lock);
> > +
> > +     if (ret == 0) {
> > +             if (copy_to_user(arg, &params, sizeof(params)))
> > +                     return -EFAULT;
> > +     }
> > +     return ret;
> > +}
> > +
> > +/*
> > + * Entry point ioctl function for /proc/allocinfo routing requests to fetch the
> > + * layout content ID, seek to a specific tag, or read sequential tags.
> > + */
> > +static long allocinfo_ioctl(struct file *file, unsigned int cmd,
> > +                         unsigned long __arg)
> > +{
> > +     void __user *arg = (void __user *)__arg;
> > +     int ret;
> > +
> > +     switch (cmd) {
> > +     case ALLOCINFO_IOC_CONTENT_ID:
> > +             ret = allocinfo_ioctl_get_content_id(file->private_data, arg);
> > +             break;
> > +     case ALLOCINFO_IOC_GET_AT:
> > +             ret = allocinfo_ioctl_get_at(file->private_data, arg);
> > +             break;
> > +     case ALLOCINFO_IOC_GET_NEXT:
> > +             ret = allocinfo_ioctl_get_next(file->private_data, arg);
> > +             break;
> > +     default:
> > +             ret = -ENOIOCTLCMD;
> > +             break;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> > +#ifdef CONFIG_COMPAT
> > +static long allocinfo_compat_ioctl(struct file *file, unsigned int cmd,
> > +                                unsigned long arg)
> > +{
> > +     return allocinfo_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
> > +}
> > +#endif
> > +
> > +static const struct proc_ops allocinfo_proc_ops = {
> > +     .proc_open              = allocinfo_open,
> > +     .proc_read_iter         = seq_read_iter,
> > +     .proc_lseek             = seq_lseek,
> > +     .proc_release           = allocinfo_release,
> > +     .proc_ioctl             = allocinfo_ioctl,
> > +#ifdef CONFIG_COMPAT
> > +     .proc_compat_ioctl      = allocinfo_compat_ioctl,
> > +#endif
> > +
> > +};
> > +
> >   size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sleep)
> >   {
> >       struct codetag_iterator iter;
> > @@ -993,8 +1222,7 @@ static int __init alloc_tag_init(void)
> >               return 0;
> >       }
> >
> > -     if (!proc_create_seq_private(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_seq_op,
> > -                                  sizeof(struct allocinfo_private), NULL)) {
> > +     if (!proc_create(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_proc_ops)) {
> >               pr_err("Failed to create %s file\n", ALLOCINFO_FILE_NAME);
> >               shutdown_mem_profiling(false);
> >               return -ENOMEM;
> > diff --git a/lib/codetag.c b/lib/codetag.c
> > index 4001a7ea6675..a9cda4c962a3 100644
> > --- a/lib/codetag.c
> > +++ b/lib/codetag.c
> > @@ -19,6 +19,8 @@ struct codetag_type {
> >       struct codetag_type_desc desc;
> >       /* generates unique sequence number for module load */
> >       unsigned long next_mod_seq;
> > +     /* bumped on every module load and unload */
> > +     unsigned long content_id;
> >   };
> >
> >   struct codetag_range {
> > @@ -50,6 +52,20 @@ void codetag_unlock_module_list(struct codetag_type *cttype)
> >       up_read(&cttype->mod_lock);
> >   }
> >
> > +unsigned long codetag_get_content_id(struct codetag_type *cttype)
> > +{
> > +     lockdep_assert_held(&cttype->mod_lock);
> > +
> > +     return cttype->content_id;
> > +}
> > +
> > +unsigned int codetag_get_count(struct codetag_type *cttype)
> > +{
> > +     lockdep_assert_held(&cttype->mod_lock);
> > +
> > +     return cttype->count;
> > +}
> > +
> >   struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype)
> >   {
> >       struct codetag_iterator iter = {
> > @@ -204,6 +220,7 @@ static int codetag_module_init(struct codetag_type *cttype, struct module *mod)
> >
> >       down_write(&cttype->mod_lock);
> >       cmod->mod_seq = ++cttype->next_mod_seq;
> > +     ++cttype->content_id;
>
> I have a comment on the content_id bump placement.
>
> ++cttype->content_id is placed before idr_alloc and the module_load
> callback. If idr_alloc fails or module_load returns an error (although
> the chance of this occurring is very low), the idr entry gets rolled
> back but content_id has already been bumped. The actual content didn't
> change in this case, so userspace would see a different content_id and
> assume the data is inconsistent when it isn't.
>
>
> Thanks
>
> Best Regards
>
> Hao

While I agree with your comment, I decided to place the counter
increment there because the chance of failure is low. Furthermore,
even if it falsely invalidates user data, the user will simply query
the content again. This placement also aligns with where the
previously used field (cttype->next_mod_seq) was incremented. Let me
know if you still think I should move it. Thanks!
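
As a rough userspace model of the retry described above (not code from this
series), the consistency check amounts to comparing the content id before and
after a full read. The callback struct and the demo source are hypothetical
stand-ins for ALLOCINFO_IOC_CONTENT_ID and a GET_NEXT loop, so the sketch can
run without the proposed UAPI header.

```c
#include <stdint.h>

/* Hypothetical callbacks standing in for the ioctls on /proc/allocinfo. */
struct snapshot_source {
	uint64_t (*content_id)(void *ctx);	/* models ALLOCINFO_IOC_CONTENT_ID */
	void (*read_all)(void *ctx);		/* models a GET_NEXT loop to the end */
	void *ctx;
};

/*
 * Read everything, then retry if the content id changed during the read.
 * Returns the number of attempts used, or -1 once max_attempts is exhausted.
 */
int snapshot_with_retry(const struct snapshot_source *src, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		uint64_t before = src->content_id(src->ctx);

		src->read_all(src->ctx);
		if (src->content_id(src->ctx) == before)
			return attempt;
	}
	return -1;
}

/* Demo source: the id is bumped during each of the first bumps_left reads. */
struct demo_ctx {
	uint64_t id;
	int bumps_left;
};

uint64_t demo_content_id(void *ctx)
{
	return ((struct demo_ctx *)ctx)->id;
}

void demo_read_all(void *ctx)
{
	struct demo_ctx *d = ctx;

	if (d->bumps_left > 0) {
		d->bumps_left--;
		d->id++;	/* a module load/unload, or a bump on a failed load */
	}
}
```

In this model a bump caused by a failed module load is indistinguishable from
a real change, so it costs the caller one extra pass and nothing more.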

>
> >       mod_id = idr_alloc(&cttype->mod_idr, cmod, 0, 0, GFP_KERNEL);
> >       if (mod_id >= 0) {
> >               if (cttype->desc.module_load) {
> > @@ -368,6 +385,7 @@ void codetag_unload_module(struct module *mod)
> >                       cttype->count -= range_size(cttype, &cmod->range);
> >                       idr_remove(&cttype->mod_idr, mod_id);
> >                       kfree(cmod);
> > +                     ++cttype->content_id;
> >               }
> >               up_write(&cttype->mod_lock);
> >               if (found && cttype->desc.free_section_mem)

* Re: [PATCH v3 0/6] alloc_tag: introduce IOCTL-based filtering for MAP
From: Suren Baghdasaryan @ 2026-06-09  0:29 UTC
  To: Abhishek Bapat
  Cc: Andrew Morton, Kent Overstreet, Hao Ge, Shuah Khan,
	Jonathan Corbet, linux-doc, linux-kernel, linux-mm, Sourav Panda
In-Reply-To: <CAL41Mv6pZOVacLdUGta7UnxmFryumBbN6=Po50KfzgLzMs2PQg@mail.gmail.com>

On Mon, Jun 8, 2026 at 5:02 PM Abhishek Bapat <abhishekbapat@google.com> wrote:
>
> On Fri, Jun 5, 2026 at 5:09 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Fri,  5 Jun 2026 23:36:45 +0000 Abhishek Bapat <abhishekbapat@google.com> wrote:
> >
> > > Currently, memory allocation profiling data is primarily exposed through
> > > /proc/allocinfo. While useful for manual inspection, this text-based
> > > interface poses challenges for production monitoring and large-scale
> > > analysis:
> > >
> > > 1. Userspace must parse large amounts of text to extract specific
> > > fields.
> > > 2. To find specific tags, userspace must read the entire dataset,
> > > requiring many context switches and high data copying.
> > > 3. The kernel currently aggregates per-CPU counters for every allocation
> > > size, even those the user intends to filter out immediately.
> > >
> > > This series introduces a new IOCTL-based binary interface for allocinfo
> > > that supports kernel-side filtering. By allowing the user to specify a
> > > filter mask, we significantly reduce the work performed in-kernel and
> > > the amount of data transferred to userspace.
> >
> > Thanks.  AI review found several things - you'll want to address at
> > least the first few.
> >
> >         https://sashiko.dev/#/patchset/cover.1780701922.git.abhishekbapat@google.com
>
> All, please note I missed attaching the reason for choosing the IOCTL
> mechanism to this cover letter, but I will attach it to the v4
> patchset cover letter along with other changes. Thanks!

Can you please add it here now so that we can review that?

* Re: [PATCH v3 4/6] alloc_tag: add accuracy based filtering to ioctl
From: Abhishek Bapat @ 2026-06-09  0:51 UTC
  To: Suren Baghdasaryan
  Cc: Hao Ge, Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel,
	linux-mm, Sourav Panda, Andrew Morton, Kent Overstreet
In-Reply-To: <CAJuCfpHtdd=9D68cfRp4HDHHHCZdzTNP_RH9i2D-f9tJXht56A@mail.gmail.com>

Btw, Sashiko left a comment in this patch stating that the value of
"inaccurate" is echoed back to the user. But since it's an input-only
parameter, that is expected. Changing it in the kernel would be
unexpected. Hence, I will ignore that comment.
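
To make the input-only semantics concrete, here is a simplified model of the
accuracy check in matches_filter() as quoted further down. The DEMO_* names
and bit values are placeholders rather than the posted UAPI; only the boolean
comparison follows the quoted code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values: the real mask and flag bits are defined by the series. */
#define DEMO_FILTER_MASK_INACCURATE	(1ULL << 0)
#define DEMO_TAG_FLAG_INACCURATE	(1U << 0)

struct demo_filter {
	uint64_t mask;		/* which filter fields are in use */
	uint64_t inaccurate;	/* input only: nonzero selects inaccurate tags */
};

/* Return true if a tag with tag_flags passes the accuracy part of the filter. */
bool demo_matches_accuracy(unsigned int tag_flags,
			   const struct demo_filter *filter)
{
	bool inaccurate;

	if (!filter || !(filter->mask & DEMO_FILTER_MASK_INACCURATE))
		return true;	/* accuracy is not part of the filter */

	inaccurate = !!(tag_flags & DEMO_TAG_FLAG_INACCURATE);
	return inaccurate == !!filter->inaccurate;
}
```

With a dedicated mask bit plus a value, "only accurate" is expressed as the
mask bit set with inaccurate = 0, and "only inaccurate" as the mask bit set
with a nonzero value. The filter value is never written back.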

On Mon, Jun 8, 2026 at 1:55 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Mon, Jun 8, 2026 at 1:25 AM Hao Ge <hao.ge@linux.dev> wrote:
> >
> >
> > On 2026/6/8 14:22, Hao Ge wrote:
> > > Hi Abhishek
> > >
> > >
> > > On 2026/6/6 07:36, Abhishek Bapat wrote:
> > >> Extend the allocinfo filtering mechanism to allow users to filter tags
> > >> based on their accuracy.
> > >>
> > >> Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
> > >> ---
> > >>   include/uapi/linux/alloc_tag.h | 3 +++
> > >>   lib/alloc_tag.c                | 8 ++++++++
> > >>   2 files changed, 11 insertions(+)
> > >>
> > >> diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
> > >> index 0e648192df4d..42445bdb11c5 100644
> > >> --- a/include/uapi/linux/alloc_tag.h
> > >> +++ b/include/uapi/linux/alloc_tag.h
> > >> @@ -20,6 +20,7 @@ struct allocinfo_tag {
> > >>       char function[ALLOCINFO_STR_SIZE];
> > >>       char filename[ALLOCINFO_STR_SIZE];
> > >>       __u64 lineno;
> > >> +    __u64 inaccurate;
> > >
> > >
> > > I was wondering if it would make sense to define inaccurate as a flags
> > > field (e.g. __u64 flags with ALLOCINFO_TAG_F_INACCURATE (1 << 0)),
> > > so that only bit 0 is used today and the upper bits are reserved for
> > > future use, aligning with the current kernel codebase.
> > >
> > > This design also allows for better extensibility if we need to
> > > add new flags for any reason in the future.
> > >
> > > We also need to add flag validity checks if we go this route.
> > >
> > And I've reviewed the issue reported by Sashiko, and I think it's valid.
> >
> > When we expand the allocinfo_tag_data structure
> >
> > struct allocinfo_tag_data {
> >      char modname[64];
> >      char function[64];
> >      char filename[64];
> >      __u64 lineno;
> >      __u64 inaccurate;
> >      __u64 bytes;
> >      __u64 calls;
> >      __u8 accurate;
> >      /* padding */
> > }
> >
> > I think user space may see two fields related to inaccuracy.
>
> Yes but one field (inside allocinfo_tag) is the input parameter which
> user provides to specify the filtering criteria and the other is the
> returned tag information. It's similar to any other tag attribute
> which can be included in the filters.
>
> >
> > How do you like these modifications?
> >
> >
> > diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
> > --- a/include/uapi/linux/alloc_tag.h
> > +++ b/include/uapi/linux/alloc_tag.h
> > @@ -20,7 +20,6 @@ struct allocinfo_tag {
> >       char function[ALLOCINFO_STR_SIZE];
> >       char filename[ALLOCINFO_STR_SIZE];
> >       __u64 lineno;
> > -    __u64 inaccurate;
> >   };
> >
> >   /* The alignment ensures 32-bit compatible interfaces are not broken */
> > @@ -40,7 +39,7 @@ enum {
> >       ALLOCINFO_FILTER_FUNCTION,
> >       ALLOCINFO_FILTER_FILENAME,
> >       ALLOCINFO_FILTER_LINENO,
> > -    ALLOCINFO_FILTER_INACCURATE,
> > +    ALLOCINFO_FILTER_FLAGS,
> >       ALLOCINFO_FILTER_MIN_SIZE,
> >       ALLOCINFO_FILTER_MAX_SIZE,
> >       __ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_MAX_SIZE
> > @@ -50,16 +49,20 @@ enum {
> >   #define ALLOCINFO_FILTER_MASK_FUNCTION        (1 << ALLOCINFO_FILTER_FUNCTION)
> >   #define ALLOCINFO_FILTER_MASK_FILENAME        (1 << ALLOCINFO_FILTER_FILENAME)
> >   #define ALLOCINFO_FILTER_MASK_LINENO        (1 << ALLOCINFO_FILTER_LINENO)
> > -#define ALLOCINFO_FILTER_MASK_INACCURATE    (1 << ALLOCINFO_FILTER_INACCURATE)
> > +#define ALLOCINFO_FILTER_MASK_FLAGS        (1 << ALLOCINFO_FILTER_FLAGS)
> >   #define ALLOCINFO_FILTER_MASK_MIN_SIZE        (1 << ALLOCINFO_FILTER_MIN_SIZE)
> >   #define ALLOCINFO_FILTER_MASK_MAX_SIZE        (1 << ALLOCINFO_FILTER_MAX_SIZE)
> >
> >   #define ALLOCINFO_FILTER_MASKS \
> >       ((1 << (__ALLOCINFO_FILTER_LAST + 1)) - 1)
> >
> > +#define ALLOCINFO_FILTER_F_INACCURATE    (1ULL << 0)
> > +#define ALLOCINFO_FILTER_FLAGS_ALL ALLOCINFO_FILTER_F_INACCURATE
> > +
> >   struct allocinfo_filter {
> >       __u64 mask; /* bitmask of the filter fields used */
> >       struct allocinfo_tag fields;
> > +    __u64 flags; /* bitmask of ALLOCINFO_FILTER_F_* */
> >       __u64 min_size;
> >       __u64 max_size;
> >   };
> > diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> > --- a/lib/alloc_tag.c
> > +++ b/lib/alloc_tag.c
> > @@ -249,8 +249,6 @@ static bool matches_filter(struct codetag *ct,
> > struct allocinfo_filter *filter,
> >                  struct alloc_tag_counters *counters,
> >                  bool *fetched_counters)
> >   {
> > -    bool inaccurate;
> > -
> >       if (!filter || !filter->mask)
> >           return true;
> >
> > @@ -277,10 +275,11 @@ static bool matches_filter(struct codetag *ct,
> > struct allocinfo_filter *filter,
> >           ct->lineno != filter->fields.lineno)
> >           return false;
> >
> > -    if (filter->mask & ALLOCINFO_FILTER_MASK_INACCURATE) {
> > -        inaccurate = !!(ct->flags & CODETAG_FLAG_INACCURATE);
> > -        if (inaccurate != !!(filter->fields.inaccurate))
> > -            return false;
> > +    if (filter->mask & ALLOCINFO_FILTER_MASK_FLAGS) {
> > +        if (filter->flags & ALLOCINFO_FILTER_F_INACCURATE) {
> > +            if (!(ct->flags & CODETAG_FLAG_INACCURATE))
>
> How would you filter records which have only accurate data?
>
> Overall I would prefer ALLOCINFO_FILTER_MASK_INACCURATE rather than
> ALLOCINFO_FILTER_MASK_FLAGS. The fact that this attribute is a
> single-bit flag is a technical detail. It's still a tag attribute
> like file and module names and IMO deserves its own filter.
>
>
>
> > +                return false;
> > +        }
> >       }
> >
> >       if (filter->mask & (ALLOCINFO_FILTER_MASK_MIN_SIZE |
> > ALLOCINFO_FILTER_MASK_MAX_SIZE)) {
> > @@ -318,6 +317,10 @@ static int allocinfo_ioctl_get_at(struct seq_file
> > *m, void __user *arg)
> >       if (params.filter.mask & ~ALLOCINFO_FILTER_MASKS)
> >           return -EINVAL;
> >
> > +    if ((params.filter.mask & ALLOCINFO_FILTER_MASK_FLAGS) &&
> > +        (params.filter.flags & ~ALLOCINFO_FILTER_FLAGS_ALL))
> > +        return -EINVAL;
> > +
> >       if ((params.filter.mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) &&
> >           (params.filter.mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) &&
> >           params.filter.min_size > params.filter.max_size)
> >
> >
> > Thanks
> >
> > Best Regards
> >
> > Hao
> >
> >
> > >
> > > Thanks
> > >
> > > Best Regards
> > >
> > > Hao
> > >
> > >
> > >>   };
> > >>     /* The alignment ensures 32-bit compatible interfaces are not
> > >> broken */
> > >> @@ -39,6 +40,7 @@ enum {
> > >>       ALLOCINFO_FILTER_FUNCTION,
> > >>       ALLOCINFO_FILTER_FILENAME,
> > >>       ALLOCINFO_FILTER_LINENO,
> > >> +    ALLOCINFO_FILTER_INACCURATE,
> > >>       ALLOCINFO_FILTER_MIN_SIZE,
> > >>       ALLOCINFO_FILTER_MAX_SIZE,
> > >>       __ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_MAX_SIZE
> > >> @@ -48,6 +50,7 @@ enum {
> > >>   #define ALLOCINFO_FILTER_MASK_FUNCTION        (1 << ALLOCINFO_FILTER_FUNCTION)
> > >>   #define ALLOCINFO_FILTER_MASK_FILENAME        (1 << ALLOCINFO_FILTER_FILENAME)
> > >>   #define ALLOCINFO_FILTER_MASK_LINENO        (1 << ALLOCINFO_FILTER_LINENO)
> > >> +#define ALLOCINFO_FILTER_MASK_INACCURATE    (1 << ALLOCINFO_FILTER_INACCURATE)
> > >>   #define ALLOCINFO_FILTER_MASK_MIN_SIZE        (1 << ALLOCINFO_FILTER_MIN_SIZE)
> > >>   #define ALLOCINFO_FILTER_MASK_MAX_SIZE        (1 << ALLOCINFO_FILTER_MAX_SIZE)
> > >>   diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> > >> index ddc6946f56ab..cbcd12c4ef9c 100644
> > >> --- a/lib/alloc_tag.c
> > >> +++ b/lib/alloc_tag.c
> > >> @@ -249,6 +249,8 @@ static bool matches_filter(struct codetag *ct,
> > >> struct allocinfo_filter *filter,
> > >>                  struct alloc_tag_counters *counters,
> > >>                  bool *fetched_counters)
> > >>   {
> > >> +    bool inaccurate;
> > >> +
> > >>       if (!filter || !filter->mask)
> > >>           return true;
> > >>   @@ -275,6 +277,12 @@ static bool matches_filter(struct codetag *ct,
> > >> struct allocinfo_filter *filter,
> > >>           ct->lineno != filter->fields.lineno)
> > >>           return false;
> > >>   +    if (filter->mask & ALLOCINFO_FILTER_MASK_INACCURATE) {
> > >> +        inaccurate = !!(ct->flags & CODETAG_FLAG_INACCURATE);
> > >> +        if (inaccurate != !!(filter->fields.inaccurate))
> > >> +            return false;
> > >> +    }
> > >> +
> > >>       if (filter->mask & (ALLOCINFO_FILTER_MASK_MIN_SIZE |
> > >> ALLOCINFO_FILTER_MASK_MAX_SIZE)) {
> > >>           if (!*fetched_counters) {
> > >>               *counters = allocinfo_prefetch_counters(ct);
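
To make the "only accurate data" question above concrete, here is a small
userspace sketch of the two matching rules discussed in this thread. It is not
the kernel code: the macro and function names are hypothetical stand-ins that
only model ALLOCINFO_FILTER_MASK_INACCURATE, ALLOCINFO_FILTER_MASK_FLAGS,
ALLOCINFO_FILTER_F_INACCURATE and CODETAG_FLAG_INACCURATE, and the bit values
are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the identifiers discussed in the thread. */
#define TAG_FLAG_INACCURATE     (1u << 0)   /* models CODETAG_FLAG_INACCURATE */
#define FILTER_MASK_INACCURATE  (1ull << 0) /* models ALLOCINFO_FILTER_MASK_INACCURATE */
#define FILTER_MASK_FLAGS       (1ull << 0) /* models ALLOCINFO_FILTER_MASK_FLAGS */
#define FILTER_F_INACCURATE     (1ull << 0) /* models ALLOCINFO_FILTER_F_INACCURATE */

/* Mask-plus-value rule: the mask bit enables the test, the value picks the state. */
static bool match_by_value(uint32_t tag_flags, uint64_t mask, uint64_t want_inaccurate)
{
    if (!(mask & FILTER_MASK_INACCURATE))
        return true;
    return !!(tag_flags & TAG_FLAG_INACCURATE) == !!want_inaccurate;
}

/* Flags rule: a set filter bit can only require the tag bit to be set. */
static bool match_by_flags(uint32_t tag_flags, uint64_t mask, uint64_t flags)
{
    if (!(mask & FILTER_MASK_FLAGS))
        return true;
    if ((flags & FILTER_F_INACCURATE) && !(tag_flags & TAG_FLAG_INACCURATE))
        return false;
    return true;
}

/* Self-check showing which records each rule can select. */
static void filter_semantics_selfcheck(void)
{
    /* Value rule: asking for accurate-only rejects an inaccurate tag. */
    assert(match_by_value(0, FILTER_MASK_INACCURATE, 0));
    assert(!match_by_value(TAG_FLAG_INACCURATE, FILTER_MASK_INACCURATE, 0));

    /* Flags rule: with the flag clear, an inaccurate tag still matches. */
    assert(match_by_flags(TAG_FLAG_INACCURATE, FILTER_MASK_FLAGS, 0));
    assert(!match_by_flags(0, FILTER_MASK_FLAGS, FILTER_F_INACCURATE));
}
```

In this model the mask-plus-value rule can select either state, while the
flags rule can express "inaccurate only" but has no way to express "accurate
only" without an extra bit or a separate value field.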
