Linux Documentation
* Re: [PATCH v5 1/6] alloc_tag: add ioctl to /proc/allocinfo
From: Hao Ge @ 2026-06-16  1:40 UTC
  To: Abhishek Bapat, Suren Baghdasaryan, Andrew Morton,
	Kent Overstreet
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda
In-Reply-To: <fa1fe7d869e2ff45907b271ac4066aa0339d037c.1781564384.git.abhishekbapat@google.com>

Hi Abhishek


On 2026/6/16 07:04, Abhishek Bapat wrote:
> From: Suren Baghdasaryan <surenb@google.com>
>
> Add the following ioctl commands for /proc/allocinfo file:
>
> ALLOCINFO_IOC_CONTENT_ID - gets content identifier which can be used
> to check whether the file content has changed specifically due to module
> load/unload. Every time a module is loaded / unloaded, the returned
> value will be different. By comparing the identifier value at the
> beginning and at the end of the content retrieval operation, users can
> validate retrieved information for consistency.
>
> ALLOCINFO_IOC_GET_AT - gets the record at the specified position. This
> is the position of a record in /proc/allocinfo.
>
> ALLOCINFO_IOC_GET_NEXT - gets the record next to the last retrieved
> one. If no records were previously retrieved, returns the first
> record.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>


Thanks for updating the patch, LGTM.

Acked-by: Hao Ge <hao.ge@linux.dev>
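
For readers following the thread, here is a rough userspace sketch of how the commands described in the quoted commit message might be combined: read the content identifier, walk the records with ALLOCINFO_IOC_GET_NEXT until the kernel reports ENOENT, then read the identifier again to see whether a module load/unload happened in between. The struct layouts and ioctl numbers are copied by hand from the quoted include/uapi/linux/alloc_tag.h (a real program would include the installed header instead), and allocinfo_walk() with its callback type is a hypothetical helper, not part of the patch. It assumes a freshly opened descriptor, since GET_NEXT continues from the last retrieved record.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Hand-copied from the quoted UAPI header; illustration only. */
#define ALLOCINFO_STR_SIZE 64

struct allocinfo_content_id {
	uint64_t id;
};

struct allocinfo_tag {
	char modname[ALLOCINFO_STR_SIZE];
	char function[ALLOCINFO_STR_SIZE];
	char filename[ALLOCINFO_STR_SIZE];
	uint64_t lineno;
};

struct allocinfo_counter {
	uint64_t bytes;
	uint64_t calls;
	uint8_t accurate;
} __attribute__((aligned(8)));

struct allocinfo_tag_data {
	struct allocinfo_tag tag;
	struct allocinfo_counter counter;
};

/* The aligned(8) attribute pads the counter to 24 bytes on every ABI. */
static_assert(sizeof(struct allocinfo_counter) == 24, "unexpected padding");

#define ALLOCINFO_IOC_BASE		0xA6
#define ALLOCINFO_IOC_CONTENT_ID	_IOR(ALLOCINFO_IOC_BASE, 0, struct allocinfo_content_id)
#define ALLOCINFO_IOC_GET_NEXT		_IOR(ALLOCINFO_IOC_BASE, 2, struct allocinfo_tag_data)

/* Hypothetical callback type, invoked once per retrieved record. */
typedef void (*allocinfo_record_cb)(const struct allocinfo_tag_data *rec, void *ctx);

/*
 * Walk every record reachable through fd.
 * Returns the number of records seen, -1 if an ioctl failed (errno is set),
 * or -2 if the content id changed during the walk and the caller should retry.
 */
static long allocinfo_walk(int fd, allocinfo_record_cb cb, void *ctx)
{
	struct allocinfo_content_id before, after;
	struct allocinfo_tag_data rec;
	long n = 0;

	if (ioctl(fd, ALLOCINFO_IOC_CONTENT_ID, &before))
		return -1;

	for (;;) {
		if (ioctl(fd, ALLOCINFO_IOC_GET_NEXT, &rec)) {
			if (errno == ENOENT)
				break;	/* no more records */
			return -1;
		}
		if (cb)
			cb(&rec, ctx);
		n++;
	}

	if (ioctl(fd, ALLOCINFO_IOC_CONTENT_ID, &after))
		return -1;

	return before.id == after.id ? n : -2;
}
```

A caller would open /proc/allocinfo with O_RDONLY (the file is created with mode 0400 in the patch, so this normally needs root) and call allocinfo_walk() again whenever it returns -2.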


> ---
>   Documentation/mm/allocation-profiling.rst     |   5 +
>   .../userspace-api/ioctl/ioctl-number.rst      |   2 +
>   MAINTAINERS                                   |   1 +
>   include/linux/codetag.h                       |   2 +
>   include/uapi/linux/alloc_tag.h                |  60 +++++
>   lib/alloc_tag.c                               | 235 +++++++++++++++++-
>   lib/codetag.c                                 |  18 ++
>   7 files changed, 321 insertions(+), 2 deletions(-)
>   create mode 100644 include/uapi/linux/alloc_tag.h
>
> diff --git a/Documentation/mm/allocation-profiling.rst b/Documentation/mm/allocation-profiling.rst
> index 5389d241176a..c3a28467955f 100644
> --- a/Documentation/mm/allocation-profiling.rst
> +++ b/Documentation/mm/allocation-profiling.rst
> @@ -46,6 +46,11 @@ sysctl:
>   Runtime info:
>     /proc/allocinfo
>   
> +  Profiling data can be retrieved either by reading `/proc/allocinfo` directly as
> +  text or programmatically via `ioctl()` calls defined in `<uapi/linux/alloc_tag.h>`.
> +  The ioctl interface supports structured binary data extraction as well as filtering
> +  by module name, function, file, line number, accuracy, or allocation size limits.
> +
>   Example output::
>   
>     root@moria-kvm:~# sort -g /proc/allocinfo|tail|numfmt --to=iec
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 331223761fff..84f6808a8578 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -349,6 +349,8 @@ Code  Seq#    Include File                                             Comments
>                                                                          <mailto:luzmaximilian@gmail.com>
>   0xA5  20-2F  linux/surface_aggregator/dtx.h                            Microsoft Surface DTX driver
>                                                                          <mailto:luzmaximilian@gmail.com>
> +0xA6  00-0F  uapi/linux/alloc_tag.h                                    Memory allocation profiling
> +                                                                       <mailto:surenb@google.com>
>   0xAA  00-3F  linux/uapi/linux/userfaultfd.h
>   0xAB  00-1F  linux/nbd.h
>   0xAC  00-1F  linux/raw.h
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 65bd4328fe05..019cc4c285a3 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16713,6 +16713,7 @@ S:	Maintained
>   F:	Documentation/mm/allocation-profiling.rst
>   F:	include/linux/alloc_tag.h
>   F:	include/linux/pgalloc_tag.h
> +F:	include/uapi/linux/alloc_tag.h
>   F:	lib/alloc_tag.c
>   
>   MEMORY CONTROLLER DRIVERS
> diff --git a/include/linux/codetag.h b/include/linux/codetag.h
> index ddae7484ca45..a25a085c2df1 100644
> --- a/include/linux/codetag.h
> +++ b/include/linux/codetag.h
> @@ -77,6 +77,8 @@ struct codetag_iterator {
>   void codetag_lock_module_list(struct codetag_type *cttype);
>   bool codetag_trylock_module_list(struct codetag_type *cttype);
>   void codetag_unlock_module_list(struct codetag_type *cttype);
> +unsigned long codetag_get_content_id(struct codetag_type *cttype);
> +unsigned int codetag_get_count(struct codetag_type *cttype);
>   struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
>   struct codetag *codetag_next_ct(struct codetag_iterator *iter);
>   
> diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
> new file mode 100644
> index 000000000000..0928e1a48d49
> --- /dev/null
> +++ b/include/uapi/linux/alloc_tag.h
> @@ -0,0 +1,60 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * alloc_tag IOCTL API definition
> + *
> + * Copyright (C) 2026 Google, LLC.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef _UAPI_ALLOC_TAG_H
> +#define _UAPI_ALLOC_TAG_H
> +
> +#include <linux/types.h>
> +
> +#define ALLOCINFO_STR_SIZE	64
> +
> +struct allocinfo_content_id {
> +	__u64 id;
> +};
> +
> +struct allocinfo_tag {
> +	/* Longer names are trimmed */
> +	char modname[ALLOCINFO_STR_SIZE];
> +	char function[ALLOCINFO_STR_SIZE];
> +	char filename[ALLOCINFO_STR_SIZE];
> +	__u64 lineno;
> +};
> +
> +/* The alignment ensures 32-bit compatible interfaces are not broken */
> +struct allocinfo_counter {
> +	__u64 bytes;
> +	__u64 calls;
> +	__u8 accurate;
> +} __attribute__((aligned(8)));
> +
> +struct allocinfo_tag_data {
> +	struct allocinfo_tag tag;
> +	struct allocinfo_counter counter;
> +};
> +
> +struct allocinfo_get_at {
> +	__u64 pos;	/* input */
> +	struct allocinfo_tag_data data;
> +};
> +
> +#define _ALLOCINFO_IOC_CONTENT_ID	0
> +#define _ALLOCINFO_IOC_GET_AT		1
> +#define _ALLOCINFO_IOC_GET_NEXT		2
> +
> +#define ALLOCINFO_IOC_BASE		0xA6
> +#define ALLOCINFO_IOC_CONTENT_ID	_IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_CONTENT_ID,	\
> +					     struct allocinfo_content_id)
> +#define ALLOCINFO_IOC_GET_AT		_IOWR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_AT,	\
> +					      struct allocinfo_get_at)
> +#define ALLOCINFO_IOC_GET_NEXT		_IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_NEXT,	\
> +					     struct allocinfo_tag_data)
> +
> +#endif /* _UAPI_ALLOC_TAG_H */
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index d9be1cf5187d..82e3b5f32dff 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -5,6 +5,7 @@
>   #include <linux/gfp.h>
>   #include <linux/kallsyms.h>
>   #include <linux/module.h>
> +#include <linux/mutex.h>
>   #include <linux/page_ext.h>
>   #include <linux/pgalloc_tag.h>
>   #include <linux/proc_fs.h>
> @@ -14,6 +15,7 @@
>   #include <linux/string_choices.h>
>   #include <linux/vmalloc.h>
>   #include <linux/kmemleak.h>
> +#include <uapi/linux/alloc_tag.h>
>   
>   #define ALLOCINFO_FILE_NAME		"allocinfo"
>   #define MODULE_ALLOC_TAG_VMAP_SIZE	(100000UL * sizeof(struct alloc_tag))
> @@ -47,6 +49,10 @@ struct allocinfo_private {
>   	struct codetag_iterator iter;
>   	struct codetag_iterator reported_iter;
>   	bool print_header;
> +	/* ioctl uses a separate iterator not to interfere with reads */
> +	struct codetag_iterator ioctl_iter;
> +	bool positioned; /* seq_open_private() sets to 0 */
> +	struct mutex ioctl_lock;
>   };
>   
>   static void *allocinfo_start(struct seq_file *m, loff_t *pos)
> @@ -130,6 +136,232 @@ static const struct seq_operations allocinfo_seq_op = {
>   	.show	= allocinfo_show,
>   };
>   
> +/*
> + * Initializes seq_file operations and allocates private state when opening
> + * the /proc/allocinfo procfs entry.
> + */
> +static int allocinfo_open(struct inode *inode, struct file *file)
> +{
> +	int ret;
> +
> +	ret = seq_open_private(file, &allocinfo_seq_op,
> +			       sizeof(struct allocinfo_private));
> +	if (!ret) {
> +		struct seq_file *m = file->private_data;
> +		struct allocinfo_private *priv = m->private;
> +
> +		mutex_init(&priv->ioctl_lock);
> +	}
> +	return ret;
> +}
> +
> +/*
> + * Cleans up the seq_file state and frees up the private state allocated in
> + * allocinfo_open() when closing the /proc/allocinfo file descriptor.
> + */
> +static int allocinfo_release(struct inode *inode, struct file *file)
> +{
> +	struct seq_file *m = file->private_data;
> +	struct allocinfo_private *priv = m->private;
> +
> +	mutex_destroy(&priv->ioctl_lock);
> +	return seq_release_private(inode, file);
> +}
> +
> +/*
> + * Returns a pointer to the suffix of a string so that its length fits within
> + * ALLOCINFO_STR_SIZE, preserving the trailing characters.
> + */
> +static const char *allocinfo_str(const char *str)
> +{
> +	size_t len = strlen(str);
> +
> +	/* Keep an extra space for the trailing NULL. */
> +	if (len >= ALLOCINFO_STR_SIZE)
> +		str += (len - ALLOCINFO_STR_SIZE) + 1;
> +	return str;
> +}
> +
> +/* Copy a string and trim from the beginning if it's too long */
> +static void allocinfo_copy_str(char *dest, const char *src)
> +{
> +	strscpy_pad(dest, allocinfo_str(src), ALLOCINFO_STR_SIZE);
> +}
> +
> +/*
> + * Populates the UAPI allocinfo_tag_data structure with active runtime
> + * profiling counters extracted from the given kernel codetag.
> + */
> +static void allocinfo_to_params(struct codetag *ct,
> +				struct allocinfo_tag_data *data)
> +{
> +	struct alloc_tag *tag = ct_to_alloc_tag(ct);
> +	struct alloc_tag_counters counter = alloc_tag_read(tag);
> +
> +	if (ct->modname)
> +		allocinfo_copy_str(data->tag.modname, ct->modname);
> +	else
> +		data->tag.modname[0] = '\0';
> +	allocinfo_copy_str(data->tag.function, ct->function);
> +	allocinfo_copy_str(data->tag.filename, ct->filename);
> +	data->tag.lineno = ct->lineno;
> +	data->counter.bytes = counter.bytes;
> +	data->counter.calls = counter.calls;
> +	data->counter.accurate = !alloc_tag_is_inaccurate(tag);
> +}
> +
> +/*
> + * Retrieves the unique content ID representing the current allocation tag module
> + * layout, allowing userspace to detect if modules were loaded / unloaded.
> + */
> +static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
> +{
> +	struct allocinfo_content_id params;
> +
> +	codetag_lock_module_list(alloc_tag_cttype);
> +	params.id = codetag_get_content_id(alloc_tag_cttype);
> +	codetag_unlock_module_list(alloc_tag_cttype);
> +	if (copy_to_user(arg, &params, sizeof(params)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +/*
> + * Seeks the ioctl iterator to the specified 0-indexed tag position, reads its
> + * profiling data and returns it to userspace.
> + */
> +static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
> +{
> +	struct allocinfo_private *priv;
> +	struct codetag *ct;
> +	__u64 pos;
> +	struct allocinfo_get_at params = {0};
> +
> +	if (copy_from_user(&params, arg, sizeof(params)))
> +		return -EFAULT;
> +
> +	priv = m->private;
> +	pos = params.pos;
> +
> +	mutex_lock(&priv->ioctl_lock);
> +	codetag_lock_module_list(alloc_tag_cttype);
> +
> +	if (pos >= codetag_get_count(alloc_tag_cttype)) {
> +		codetag_unlock_module_list(alloc_tag_cttype);
> +		mutex_unlock(&priv->ioctl_lock);
> +		return -ENOENT;
> +	}
> +
> +	/* Find the codetag */
> +	priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
> +	ct = codetag_next_ct(&priv->ioctl_iter);
> +	while (ct && pos--)
> +		ct = codetag_next_ct(&priv->ioctl_iter);
> +	if (ct) {
> +		allocinfo_to_params(ct, &params.data);
> +		priv->positioned = true;
> +	}
> +
> +	codetag_unlock_module_list(alloc_tag_cttype);
> +	mutex_unlock(&priv->ioctl_lock);
> +
> +	if (!ct)
> +		return -ENOENT;
> +
> +	if (copy_to_user(arg, &params, sizeof(params)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +/*
> + * Advances the ioctl iterator to the next allocation tag in the sequence and
> + * returns its profiling data to userspace.
> + */
> +static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
> +{
> +	struct allocinfo_private *priv;
> +	struct codetag *ct;
> +	struct allocinfo_tag_data params;
> +	int ret = 0;
> +
> +	memset(&params, 0, sizeof(params));
> +	priv = m->private;
> +
> +	mutex_lock(&priv->ioctl_lock);
> +	codetag_lock_module_list(alloc_tag_cttype);
> +
> +	if (!priv->positioned) {
> +		priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
> +		priv->positioned = true;
> +	}
> +
> +	ct = codetag_next_ct(&priv->ioctl_iter);
> +	if (ct)
> +		allocinfo_to_params(ct, &params);
> +
> +	if (!ct) {
> +		priv->positioned = false;
> +		ret = -ENOENT;
> +	}
> +	codetag_unlock_module_list(alloc_tag_cttype);
> +	mutex_unlock(&priv->ioctl_lock);
> +
> +	if (ret == 0) {
> +		if (copy_to_user(arg, &params, sizeof(params)))
> +			return -EFAULT;
> +	}
> +	return ret;
> +}
> +
> +/*
> + * Entry point ioctl function for /proc/allocinfo routing requests to fetch the
> + * layout content ID, seek to a specific tag, or read sequential tags.
> + */
> +static long allocinfo_ioctl(struct file *file, unsigned int cmd,
> +			    unsigned long __arg)
> +{
> +	void __user *arg = (void __user *)__arg;
> +	int ret;
> +
> +	switch (cmd) {
> +	case ALLOCINFO_IOC_CONTENT_ID:
> +		ret = allocinfo_ioctl_get_content_id(file->private_data, arg);
> +		break;
> +	case ALLOCINFO_IOC_GET_AT:
> +		ret = allocinfo_ioctl_get_at(file->private_data, arg);
> +		break;
> +	case ALLOCINFO_IOC_GET_NEXT:
> +		ret = allocinfo_ioctl_get_next(file->private_data, arg);
> +		break;
> +	default:
> +		ret = -ENOIOCTLCMD;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +#ifdef CONFIG_COMPAT
> +static long allocinfo_compat_ioctl(struct file *file, unsigned int cmd,
> +				   unsigned long arg)
> +{
> +	return allocinfo_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
> +}
> +#endif
> +
> +static const struct proc_ops allocinfo_proc_ops = {
> +	.proc_open		= allocinfo_open,
> +	.proc_read_iter		= seq_read_iter,
> +	.proc_lseek		= seq_lseek,
> +	.proc_release		= allocinfo_release,
> +	.proc_ioctl		= allocinfo_ioctl,
> +#ifdef CONFIG_COMPAT
> +	.proc_compat_ioctl	= allocinfo_compat_ioctl,
> +#endif
> +};
> +
>   size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sleep)
>   {
>   	struct codetag_iterator iter;
> @@ -993,8 +1225,7 @@ static int __init alloc_tag_init(void)
>   		return 0;
>   	}
>   
> -	if (!proc_create_seq_private(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_seq_op,
> -				     sizeof(struct allocinfo_private), NULL)) {
> +	if (!proc_create(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_proc_ops)) {
>   		pr_err("Failed to create %s file\n", ALLOCINFO_FILE_NAME);
>   		shutdown_mem_profiling(false);
>   		return -ENOMEM;
> diff --git a/lib/codetag.c b/lib/codetag.c
> index 4001a7ea6675..a9cda4c962a3 100644
> --- a/lib/codetag.c
> +++ b/lib/codetag.c
> @@ -19,6 +19,8 @@ struct codetag_type {
>   	struct codetag_type_desc desc;
>   	/* generates unique sequence number for module load */
>   	unsigned long next_mod_seq;
> +	/* bumped on every module load and unload */
> +	unsigned long content_id;
>   };
>   
>   struct codetag_range {
> @@ -50,6 +52,20 @@ void codetag_unlock_module_list(struct codetag_type *cttype)
>   	up_read(&cttype->mod_lock);
>   }
>   
> +unsigned long codetag_get_content_id(struct codetag_type *cttype)
> +{
> +	lockdep_assert_held(&cttype->mod_lock);
> +
> +	return cttype->content_id;
> +}
> +
> +unsigned int codetag_get_count(struct codetag_type *cttype)
> +{
> +	lockdep_assert_held(&cttype->mod_lock);
> +
> +	return cttype->count;
> +}
> +
>   struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype)
>   {
>   	struct codetag_iterator iter = {
> @@ -204,6 +220,7 @@ static int codetag_module_init(struct codetag_type *cttype, struct module *mod)
>   
>   	down_write(&cttype->mod_lock);
>   	cmod->mod_seq = ++cttype->next_mod_seq;
> +	++cttype->content_id;
>   	mod_id = idr_alloc(&cttype->mod_idr, cmod, 0, 0, GFP_KERNEL);
>   	if (mod_id >= 0) {
>   		if (cttype->desc.module_load) {
> @@ -368,6 +385,7 @@ void codetag_unload_module(struct module *mod)
>   			cttype->count -= range_size(cttype, &cmod->range);
>   			idr_remove(&cttype->mod_idr, mod_id);
>   			kfree(cmod);
> +			++cttype->content_id;
>   		}
>   		up_write(&cttype->mod_lock);
>   		if (found && cttype->desc.free_section_mem)
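
As a side note on the string handling in the quoted patch: allocinfo_str() keeps the tail of an over-long name rather than its head, so for file names the most specific part of the path survives. The following is a minimal userspace stand-in for that behaviour, not the kernel code itself; copy_str_keep_tail() is a hypothetical name, and memset()/memcpy() are used here in place of the kernel-only strscpy_pad().

```c
#include <assert.h>
#include <string.h>

#define ALLOCINFO_STR_SIZE 64

/*
 * Userspace stand-in for allocinfo_copy_str(): copy src into a fixed
 * ALLOCINFO_STR_SIZE buffer, dropping leading characters if it does not
 * fit, and zero-pad the remainder.
 */
static void copy_str_keep_tail(char dest[ALLOCINFO_STR_SIZE], const char *src)
{
	size_t len = strlen(src);

	/* Skip the head so that the tail plus the trailing NUL fits. */
	if (len >= ALLOCINFO_STR_SIZE) {
		src += len - ALLOCINFO_STR_SIZE + 1;
		len = ALLOCINFO_STR_SIZE - 1;
	}
	assert(len < ALLOCINFO_STR_SIZE);

	memset(dest, 0, ALLOCINFO_STR_SIZE);
	memcpy(dest, src, len);
}
```

With a 100-character path, the buffer ends up holding its last 63 characters followed by a NUL; a short name is copied unchanged.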


* Re: [PATCH] docs: infiniband: correct name of option to enable the ib_uverbs module
From: Dongliang Mu @ 2026-06-16  1:18 UTC
  To: Ethan Nelson-Moore, Shuah Khan, Jonathan Corbet, linux-rdma,
	linux-doc
  Cc: Jason Gunthorpe, Leon Romanovsky, Alex Shi, Yanteng Si
In-Reply-To: <20260616002027.67925-1-enelsonmoore@gmail.com>


On 6/16/26 8:20 AM, Ethan Nelson-Moore wrote:
> The Infiniband documentation states that CONFIG_INFINIBAND_USER_VERBS
> should be used to enable the ib_uverbs module. However, this option was
> renamed to CONFIG_INFINIBAND_USER_ACCESS in commit 17781cd6186c
> ("[PATCH] IB: clean up user access config options"). Update the
> documentation to reflect this.

Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>

For a patch that mixes English and translation updates, should it be
merged directly into Jon's kernel tree?

>
> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
> ---
>   Documentation/infiniband/user_verbs.rst                    | 2 +-
>   Documentation/translations/zh_CN/infiniband/user_verbs.rst | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/infiniband/user_verbs.rst b/Documentation/infiniband/user_verbs.rst
> index 8ddc4b1cfef2..96bcd1bd37ad 100644
> --- a/Documentation/infiniband/user_verbs.rst
> +++ b/Documentation/infiniband/user_verbs.rst
> @@ -2,7 +2,7 @@
>   Userspace verbs access
>   ======================
>   
> -  The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
> +  The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_ACCESS,
>     enables direct userspace access to IB hardware via "verbs," as
>     described in chapter 11 of the InfiniBand Architecture Specification.
>   
> diff --git a/Documentation/translations/zh_CN/infiniband/user_verbs.rst b/Documentation/translations/zh_CN/infiniband/user_verbs.rst
> index 970bc1a4e396..31534681654b 100644
> --- a/Documentation/translations/zh_CN/infiniband/user_verbs.rst
> +++ b/Documentation/translations/zh_CN/infiniband/user_verbs.rst
> @@ -17,7 +17,7 @@
>   用户空间verbs访问
>   =================
>   
> -  ib_uverbs模块,通过启用CONFIG_INFINIBAND_USER_VERBS构建,使用户空间
> +  ib_uverbs模块,通过启用CONFIG_INFINIBAND_USER_ACCESS构建,使用户空间
>     通过“verbs”直接访问IB硬件,如InfiniBand架构规范第11章所述。
>   
>     要使用verbs,需要libibverbs库,可从https://github.com/linux-rdma/rdma-core。



* Re: [PATCH v4 6/7] tracing/probes: Add this_cpu_read() and this_cpu_ptr() dereference method to fetcharg
From: Masami Hiramatsu @ 2026-06-16  1:15 UTC
  To: Masami Hiramatsu (Google)
  Cc: Steven Rostedt, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <178148609402.185520.8189233495763938815.stgit@devnote2>

On Mon, 15 Jun 2026 10:14:54 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> +		case FETCH_OP_DEREF_CPU:
> +			val = (unsigned long)this_cpu_ptr((void __percpu *)val);
> +			ret = probe_mem_read(&val, (void *)val, sizeof(val));
> +			break;
> +		case FETCH_OP_CPU_PTR:
> +			val = (unsigned long)this_cpu_ptr((void __percpu *)val);
> +			ret = 0;
> +			break;

Hmm, maybe I can just convert FETCH_OP_DEREF_CPU to
FETCH_OP_CPU_PTR + FETCH_OP_DEREF to simplify the code.

> +		default:
> +			lval = llval;
> +			goto out;
> +		}
>  		if (ret)
>  			return ret;
> +		llval = lval;
>  		code++;
>  	} while (1);
> +out:
>  
>  	s3 = code;
>  stage3:
> @@ -181,6 +195,10 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
>  	case FETCH_OP_ST_UMEM:
>  		probe_mem_read_user(dest, (void *)val + code->offset, code->size);
>  		break;
> +	case FETCH_OP_ST_CPUMEM:
> +		val = (unsigned long)this_cpu_ptr((void __percpu *)val);
> +		probe_mem_read(dest, (void *)val, code->size);
> +		break;

Then, I can just drop this change.

Thanks,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* Re: [swap tier discussion] Re: [PATCH v3 2/4] mm/zswap: Implement proactive writeback
From: YoungJun Park @ 2026-06-16  1:03 UTC
  To: Yosry Ahmed
  Cc: Shakeel Butt, Hao Jia, Johannes Weiner, mhocko, tj, mkoutny,
	roman.gushchin, Nhat Pham, akpm, chengming.zhou, muchun.song,
	cgroups, linux-mm, linux-kernel, linux-doc, Hao Jia, chrisl,
	kasong, baoquan.he, joshua.hahnjy
In-Reply-To: <CAO9r8zOVqbJEaBqTHw=r2bYw7Lm1tO0TU9QuG+eH1rfqcTAJJQ@mail.gmail.com>

On Mon, Jun 15, 2026 at 12:55:09PM -0700, Yosry Ahmed wrote:
> > In that case, the internal logic could stay roughly the same rather
> > than counting via a page counter. Something like:
> >
> > 1. Change the interface shell: tier.*.max — allow only 0 ~ max.
> 
> What about a single interface as I suggested to remain consistent with
> memory tiering?

Hello Yosry!

I agree. While implementing the interface to check its feasibility, I
reconsidered it. Since swap tiers can be added or removed at runtime,
having static memory."tier_name".max files seems unnatural.

A single interface like `swap.tiers.max` would be better. We can use a
flat-keyed format (similar to io.weight, as you suggested):

echo ["tier_name"] ["0 or max"] > swap.tiers.max

I now think this is a better direction than what I initially
suggested (memory.swap.tiers and memory.swap.tiers.effective).

Considering other reviews and Shakeel's reply, I will update my swap tier
patch accordingly.

> > 2. Keep the internal logic as is: 0 disables the mask (child memcgs
> >    off too), max enables it (child memcgs on too).
> 
> I think a child should be able to disable a swap tier enabled by the
> parent, but not vice versa.

Yes, we are on the same page. I missed a part in my explanation. I meant
that the child's selected tiers should be a subset of the parent's (which
is how the current swap tier suggestion works). 
A child cannot enable a tier that the parent has disabled.
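
To make the subset rule concrete, here is a small sketch under assumed names: each tier is one bit in a mask (set for "max", clear for "0"), and a cgroup's effective mask is the intersection of the selections on the path from the root down to it. tier_mask_t and effective_tiers() are hypothetical and only illustrate the semantics discussed here, not the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation: one bit per swap tier. */
typedef uint32_t tier_mask_t;

/*
 * selected[0] is the root cgroup's own selection and selected[depth - 1]
 * the leaf's. A tier is usable by the leaf only if every cgroup on the
 * path has it enabled, so a child can drop a tier but never add one.
 */
static tier_mask_t effective_tiers(const tier_mask_t *selected, size_t depth)
{
	tier_mask_t effective = ~(tier_mask_t)0;	/* nothing restricts the root */
	size_t i;

	assert(selected != NULL || depth == 0);
	for (i = 0; i < depth; i++)
		effective &= selected[i];
	return effective;
}
```

For example, with a root allowing tiers {0,1,2}, a parent allowing {0,1} and a child asking for {1,2}, the child ends up with tier 1 only.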

> > 3. memory.zswap.max integrates naturally (it's memory."tier_name".max).
> 
> Not really. memory.zswap.max is in terms of memory usage (compressed
> size), not swap usage (uncompressed size).

I see, memory.zswap.max needs to be maintained separately. I will look
closer into its semantics. I might have misunderstood this part!

> [..]
> > Let me clarify a part I wrote confusingly. Handling
> > memory.zswap.writeback via tiers is possible, but I don't think the
> > interface itself would be replaced even if memory.swap.tiers is adopted.
> >
> > Selecting only zswap in memory.swap.tiers would not just disable
> > writeback; it would also block regular swap entirely, which differs
> > slightly from the current semantic. (... "Per the cgroup v2 docs: a
> > zswap-only tier setting is subtly different from setting
> > memory.swap.max to 0, since it still allows pages to be written to the
> > zswap pool; this has no effect if zswap is disabled, and swapping is
> > allowed unless memory.swap.max is set to 0.")
> 
> I don't understand. How is disabling zswap writeback not equivalent to
> only enabling zswap as a tier?

Isn't there a case where zswap_store() fails and pages fall back to the
backing swap device?

- "zswap tier only": Only zswap is allowed. Fallback to other swap is
  blocked.
- "zswap writeback disabled": zswap is allowed, but if zswap_store()
  fails, pages can still fall back to other swap devices.

Because of this slight semantic difference, I thought they couldn't be
fully unified. If my understanding is correct, we could extend the zswap
tier to select the target swap device for writeback, but replacing the
writeback interface entirely might be difficult.

> Do you just mean the fact that disabling zswap writeback is a no-op if
> zswap is disabled? It's a different interface so I think a small

Yes, I think so too.

> semantic difference is okay. In practice, I doubt that zswap is being
> disabled at runtime.

I thought disabling zswap at runtime might have some use cases, but we
can discuss this further when we talk about the patch extending the
zswap tier.

Best regards,
Youngjun


* [PATCH] docs: pt_BR: update netdevsim section in maintainer-netdev.rst
From: Amanda Corrêa @ 2026-06-16  0:52 UTC
  To: danielmaraboo, corbet; +Cc: skhan, linux-doc, linux-kernel, Amanda Corrêa

Update the Brazilian Portuguese translation of maintainer-netdev.rst
to align with the latest English version.

Key changes include:
- Updated the netdevsim section to reflect upstream changes
- Added guidance on netdevsim-based API testing
- Fixed minor spacing and formatting issues

Signed-off-by: Amanda Corrêa <amandacorreasilvax@gmail.com>
---
 .../pt_BR/process/maintainer-netdev.rst       | 23 +++++++++++++------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/Documentation/translations/pt_BR/process/maintainer-netdev.rst b/Documentation/translations/pt_BR/process/maintainer-netdev.rst
index 5de282804..e9bb998c7 100644
--- a/Documentation/translations/pt_BR/process/maintainer-netdev.rst
+++ b/Documentation/translations/pt_BR/process/maintainer-netdev.rst
@@ -22,11 +22,11 @@ netdev
 ------
 A **netdev** é a lista de discussão para todos os assuntos do Linux relacionados
 a rede. Isso inclui qualquer item encontrado em ``net/`` (ex: código principal
-como IPv6) e  em ``drivers/net`` (ex: drivers específicos de hardware) na árvore
+como IPv6) e em ``drivers/net`` (ex: drivers específicos de hardware) na árvore
 de diretórios do Linux.
 
 Note que alguns subsistemas (ex: drivers de rede sem fio/wireless), que possuem
-um  alto volume de tráfego, possuem suas próprias listas de discussão e árvores
+um alto volume de tráfego, possuem suas próprias listas de discussão e árvores
 específicas.
 
 Como muitas outras listas de discussão do Linux, a lista netdev é hospedada no
@@ -34,7 +34,7 @@ Como muitas outras listas de discussão do Linux, a lista netdev é hospedada no
 https://lore.kernel.org/netdev/.
 
 À exceção dos subsistemas mencionados anteriormente, todo o desenvolvimento de
-rede  do Linux (ex: RFCs, revisões, comentários, etc.) ocorre na **netdev**.
+rede do Linux (ex: RFCs, revisões, comentários, etc.) ocorre na **netdev**.
 
 Ciclo de Desenvolvimento
 ------------------------
@@ -506,8 +506,14 @@ netdevsim
 
 O ``netdevsim`` é um driver de teste que pode ser usado para exercitar APIs de
 configuração de driver sem a necessidade de hardware compatível. Mock-ups e
-testes baseados no ``netdevsim`` são fortemente encorajados ao adicionar novas
-APIs, mas o ``netdevsim`` em si **não** é considerado um caso de uso/usuário.
+testes baseados no ``netdevsim`` são encorajados ao adicionar novas APIs com
+lógica complexa na pilha. Os testes devem ser escritos de forma que possam ser
+executados tanto contra o ``netdevsim`` quanto contra um dispositivo real
+(veja ``tools/testing/selftests/drivers/net/README.rst``). Testes exclusivos
+para o ``netdevsim`` devem se concentrar em testar casos extremos e caminhos de
+falha no núcleo que são difíceis de exercitar com um driver real.
+
+``netdevsim`` em si **não** é considerado um caso de uso/usuário.
 Você também deve implementar as novas APIs em um driver real.
 
 Não damos garantias de que o ``netdevsim`` mudará no futuro de uma forma que
@@ -577,8 +583,11 @@ independentemente do nível de experiência. Para orientações gerais e dicas
 
 É seguro assumir que os mantenedores da netdev conhecem a comunidade e o nível
 de experiência dos revisores. Os revisores não devem se preocupar com o fato de
-seus comentários impedirem ou desviarem o fluxo de patches. Revisores menos
-experientes são fortemente incentivados a fazer uma revisão mais aprofundada das
+seus comentários impedirem ou desviarem o fluxo de patches. Uma tag Reviewed-by
+é entendida como "Eu revisei este código da melhor maneira possível" em vez de
+"Posso atestar que este código está correto".
+
+Revisores são fortemente incentivados a fazer uma revisão mais aprofundada das
 submissões e não focar exclusivamente em questões triviais ou subjetivas, como
 formatação de código, tags, etc.
 
-- 
2.43.0



* [PATCH] docs: infiniband: correct name of option to enable the ib_uverbs module
From: Ethan Nelson-Moore @ 2026-06-16  0:20 UTC
  To: Shuah Khan, Dongliang Mu, linux-rdma, linux-doc
  Cc: Ethan Nelson-Moore, Jason Gunthorpe, Leon Romanovsky,
	Jonathan Corbet, Alex Shi, Yanteng Si

The Infiniband documentation states that CONFIG_INFINIBAND_USER_VERBS
should be used to enable the ib_uverbs module. However, this option was
renamed to CONFIG_INFINIBAND_USER_ACCESS in commit 17781cd6186c
("[PATCH] IB: clean up user access config options"). Update the
documentation to reflect this.

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
---
 Documentation/infiniband/user_verbs.rst                    | 2 +-
 Documentation/translations/zh_CN/infiniband/user_verbs.rst | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/infiniband/user_verbs.rst b/Documentation/infiniband/user_verbs.rst
index 8ddc4b1cfef2..96bcd1bd37ad 100644
--- a/Documentation/infiniband/user_verbs.rst
+++ b/Documentation/infiniband/user_verbs.rst
@@ -2,7 +2,7 @@
 Userspace verbs access
 ======================
 
-  The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
+  The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_ACCESS,
   enables direct userspace access to IB hardware via "verbs," as
   described in chapter 11 of the InfiniBand Architecture Specification.
 
diff --git a/Documentation/translations/zh_CN/infiniband/user_verbs.rst b/Documentation/translations/zh_CN/infiniband/user_verbs.rst
index 970bc1a4e396..31534681654b 100644
--- a/Documentation/translations/zh_CN/infiniband/user_verbs.rst
+++ b/Documentation/translations/zh_CN/infiniband/user_verbs.rst
@@ -17,7 +17,7 @@
 用户空间verbs访问
 =================
 
-  ib_uverbs模块,通过启用CONFIG_INFINIBAND_USER_VERBS构建,使用户空间
+  ib_uverbs模块,通过启用CONFIG_INFINIBAND_USER_ACCESS构建,使用户空间
   通过“verbs”直接访问IB硬件,如InfiniBand架构规范第11章所述。
 
   要使用verbs,需要libibverbs库,可从https://github.com/linux-rdma/rdma-core。
-- 
2.43.0



* [PATCH v5 6/6] kselftest: alloc_tag: extend the allocinfo ioctl kselftest
From: Abhishek Bapat @ 2026-06-15 23:04 UTC
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat
In-Reply-To: <cover.1781564384.git.abhishekbapat@google.com>

Add the following 2 scenarios to the allocinfo ioctl kselftest:
1. Validate size based filtering
2. Validate lineno based filtering

The first test uses "do_init_module" as the candidate function for the
test. This is because the associated site will only allocate memory when
a kernel module is loaded. The return value of get_content_id() changes
every time modules are loaded or unloaded. Hence, as long as
get_content_id() values at the start and the end of the test are the
same, the memory allocated by the do_init_module call site should also
remain the same. Consequently, the test can assume consistency between
the value returned by the ioctl and the procfs resulting in less
flakiness.

Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
---
 .../alloc_tag/allocinfo_ioctl_test.c          | 197 +++++++++++++++++-
 1 file changed, 196 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
index 62d5a488a04d..041fee1a3d74 100644
--- a/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
+++ b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
@@ -309,11 +309,194 @@ static int test_function_filter(void)
 	return run_filter_test(&filter);
 }
 
+static int test_size_filter(void)
+{
+	int fd;
+	struct allocinfo_tag_data_vec *tags = malloc(sizeof(*tags));
+	struct allocinfo_tag_data_vec *procfs_entries = malloc(sizeof(*procfs_entries));
+	struct allocinfo_filter filter;
+	int ret = KSFT_PASS;
+	__u64 target_size, i, pos;
+	bool found;
+	const char *target_function = "do_init_module";
+	struct allocinfo_content_id start_cont_id, end_cont_id;
+	int retry = 0;
+	const int max_retries = 10;
+
+	if (!tags || !procfs_entries) {
+		ksft_print_msg("Memory allocation failed.\n");
+		ret = KSFT_FAIL;
+		goto freemem;
+	}
+
+	fd = open(ALLOCINFO_PROC, O_RDONLY);
+	if (fd < 0) {
+		ksft_print_msg("Failed to open " ALLOCINFO_PROC ": %s\n", strerror(errno));
+		ret = KSFT_FAIL;
+		goto freemem;
+	}
+
+	do {
+		found = false;
+		pos = 0;
+
+		if (__allocinfo_get_content_id(fd, &start_cont_id)) {
+			ksft_print_msg("allocinfo_get_content_id failed\n");
+			ret = KSFT_FAIL;
+			goto exit;
+		}
+
+		memset(&filter, 0, sizeof(filter));
+		filter.mask |= ALLOCINFO_FILTER_MASK_FUNCTION;
+		strncpy(filter.fields.function, target_function, ALLOCINFO_STR_SIZE);
+
+		if (get_filtered_procfs_entries(procfs_entries, &filter)) {
+			ksft_print_msg("Error retrieving entries from " ALLOCINFO_PROC "\n");
+			ret = KSFT_FAIL;
+			goto exit;
+		}
+
+		if (procfs_entries->count == 0) {
+			ksft_print_msg("Function %s not found in procfs\n", target_function);
+			ret = KSFT_SKIP;
+			goto exit;
+		}
+
+		target_size = procfs_entries->tag[0].counter.bytes;
+
+		memset(&filter, 0, sizeof(filter));
+		filter.mask |= ALLOCINFO_FILTER_MASK_MIN_SIZE | ALLOCINFO_FILTER_MASK_MAX_SIZE;
+		filter.min_size = target_size;
+		filter.max_size = target_size;
+
+		while (1) {
+			struct allocinfo_get_at get_at_params;
+
+			memset(&get_at_params, 0, sizeof(get_at_params));
+			memcpy(&get_at_params.filter, &filter, sizeof(filter));
+			get_at_params.pos = pos;
+
+			if (__allocinfo_get_at(fd, &get_at_params))
+				break;
+
+			tags->count = 0;
+			memcpy(&tags->tag[tags->count++], &get_at_params.data,
+			       sizeof(get_at_params.data));
+
+			while (tags->count < VEC_MAX_ENTRIES &&
+			       __allocinfo_get_next(fd, &tags->tag[tags->count]) == 0)
+				tags->count++;
+
+			for (i = 0; i < tags->count; i++) {
+				if (strcmp(tags->tag[i].tag.function, target_function) == 0) {
+					found = true;
+					break;
+				}
+			}
+
+			if (found || tags->count < VEC_MAX_ENTRIES)
+				break;
+
+			pos += tags->count;
+		}
+
+		if (__allocinfo_get_content_id(fd, &end_cont_id)) {
+			ksft_print_msg("allocinfo_get_content_id failed\n");
+			ret = KSFT_FAIL;
+			goto exit;
+		}
+
+		if (start_cont_id.id == end_cont_id.id)
+			break;
+
+		ksft_print_msg("Module load detected during size verification, retrying...\n");
+	} while (retry++ < max_retries);
+
+	if (start_cont_id.id == end_cont_id.id && !found) {
+		ksft_print_msg("Entry with function %s not found in IOCTL results\n",
+			       target_function);
+		ret = KSFT_FAIL;
+	} else if (start_cont_id.id != end_cont_id.id) {
+		ksft_print_msg("Failed to match content_ids for procfs and IOCTL, skipping...\n");
+		ret = KSFT_SKIP;
+	}
+
+exit:
+	close(fd);
+freemem:
+	free(tags);
+	free(procfs_entries);
+	return ret;
+}
+
+static int test_lineno_filter(void)
+{
+	struct allocinfo_tag_data_vec *tags = malloc(sizeof(*tags));
+	struct allocinfo_tag_data_vec *procfs_entries = malloc(sizeof(*procfs_entries));
+	struct allocinfo_filter filter;
+	enum ioctl_ret ioctl_status;
+	int ret = KSFT_PASS;
+	__u64 target_lineno, i;
+
+	if (!tags || !procfs_entries) {
+		ksft_print_msg("Memory allocation failed.\n");
+		ret = KSFT_FAIL;
+		goto exit;
+	}
+
+	memset(&filter, 0, sizeof(filter));
+
+	if (get_filtered_procfs_entries(procfs_entries, &filter)) {
+		ksft_print_msg("Error retrieving entries from " ALLOCINFO_PROC "\n");
+		ret = KSFT_FAIL;
+		goto exit;
+	}
+	if (procfs_entries->count == 0) {
+		ksft_print_msg("Could not retrieve procfs entries\n");
+		ret = KSFT_SKIP;
+		goto exit;
+	}
+	/*
+	 * We depend on the result of procfs entries to create the ioctl_filter. Hence we
+	 * cannot recycle the run_filter_test function here.
+	 */
+	target_lineno = procfs_entries->tag[0].tag.lineno;
+
+	filter.mask |= ALLOCINFO_FILTER_MASK_LINENO;
+	filter.fields.lineno = target_lineno;
+
+	ioctl_status = get_filtered_ioctl_entries(tags, &filter, 0);
+	if (ioctl_status == IOCTL_INVALID_DATA) {
+		ksft_print_msg("Trouble retrieving valid IOCTL entries, skipping.\n");
+		ret = KSFT_SKIP;
+		goto exit;
+	}
+	if (ioctl_status == IOCTL_FAILURE) {
+		ksft_print_msg("Error retrieving IOCTL entries.\n");
+		ret = KSFT_FAIL;
+		goto exit;
+	}
+
+	for (i = 0; i < tags->count; i++) {
+		if (tags->tag[i].tag.lineno != target_lineno) {
+			ksft_print_msg("IOCTL entry %llu has incorrect lineno %llu.\n",
+				       i, tags->tag[i].tag.lineno);
+			ret = KSFT_FAIL;
+			goto exit;
+		}
+	}
+
+exit:
+	free(tags);
+	free(procfs_entries);
+	return ret;
+}
+
 int main(int argc, char *argv[])
 {
 	int ret;
 
-	ksft_set_plan(2);
+	ksft_set_plan(4);
 
 	ret = test_filename_filter();
 	if (ret == KSFT_SKIP)
@@ -327,5 +510,17 @@ int main(int argc, char *argv[])
 	else
 		ksft_test_result(ret == KSFT_PASS, "test_function_filter\n");
 
+	ret = test_size_filter();
+	if (ret == KSFT_SKIP)
+		ksft_test_result_skip("Skipping test_size_filter\n");
+	else
+		ksft_test_result(ret == KSFT_PASS, "test_size_filter\n");
+
+	ret = test_lineno_filter();
+	if (ret == KSFT_SKIP)
+		ksft_test_result_skip("Skipping test_lineno_filter\n");
+	else
+		ksft_test_result(ret == KSFT_PASS, "test_lineno_filter\n");
+
 	ksft_finished();
 }
-- 
2.54.0.1136.gdb2ca164c4-goog



* [PATCH v5 5/6] kselftest: alloc_tag: add kselftest for ioctl interface
From: Abhishek Bapat @ 2026-06-15 23:04 UTC (permalink / raw)
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat
In-Reply-To: <cover.1781564384.git.abhishekbapat@google.com>

Introduce a kselftest to verify the new IOCTL-based interface for
/proc/allocinfo. The test covers:

1. Validation of the filename filter.
2. Validation of the function filter.

The first test validates the functionality of the filename filter. Using
"mm/memory.c" as the candidate filename filter, it retrieves filtered
entries from both procfs and ioctl and matches the first VEC_MAX_ENTRIES
entries.

The second test validates the functionality of the function filter.
It uses "dup_mm" as the candidate function because this function name is
not expected to change frequently, so the test should rarely need
updating.

Note that both tests match only the line number, function name and file
name fields. Bytes allocated and call counts are not matched, as those
values may change between the procfs read and the ioctl read and would
lead to spurious failures.
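
As an illustration of the matching rule above, a minimal sketch that
compares two records on the stable call-site fields only. struct entry,
STR_SIZE and same_call_site() are hypothetical names for this example;
the selftest itself uses struct allocinfo_tag_data and match_entry().

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STR_SIZE 64	/* hypothetical stand-in for ALLOCINFO_STR_SIZE */

struct entry {
	char filename[STR_SIZE];
	char function[STR_SIZE];
	uint64_t lineno;
	uint64_t bytes;	/* volatile: may change between two reads */
	uint64_t calls;	/* volatile: may change between two reads */
};

/* Compare only the fields that identify the call site. */
static bool same_call_site(const struct entry *a, const struct entry *b)
{
	return a->lineno == b->lineno &&
	       strncmp(a->function, b->function, STR_SIZE) == 0 &&
	       strncmp(a->filename, b->filename, STR_SIZE) == 0;
}
```

Two records for the same site still compare equal when their byte and
call counters differ, which is the property the tests rely on.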

Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
---
 MAINTAINERS                                   |   1 +
 tools/testing/selftests/alloc_tag/Makefile    |   9 +
 .../alloc_tag/allocinfo_ioctl_test.c          | 331 ++++++++++++++++++
 3 files changed, 341 insertions(+)
 create mode 100644 tools/testing/selftests/alloc_tag/Makefile
 create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 019cc4c285a3..6610dd42e484 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16715,6 +16715,7 @@ F:	include/linux/alloc_tag.h
 F:	include/linux/pgalloc_tag.h
 F:	include/uapi/linux/alloc_tag.h
 F:	lib/alloc_tag.c
+F:	tools/testing/selftests/alloc_tag/
 
 MEMORY CONTROLLER DRIVERS
 M:	Krzysztof Kozlowski <krzk@kernel.org>
diff --git a/tools/testing/selftests/alloc_tag/Makefile b/tools/testing/selftests/alloc_tag/Makefile
new file mode 100644
index 000000000000..f2b8fc022c3b
--- /dev/null
+++ b/tools/testing/selftests/alloc_tag/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+
+TEST_GEN_PROGS := allocinfo_ioctl_test
+
+CFLAGS += -Wall
+CFLAGS += -I../../../../usr/include
+
+include ../lib.mk
+
diff --git a/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
new file mode 100644
index 000000000000..62d5a488a04d
--- /dev/null
+++ b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
@@ -0,0 +1,331 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/* kselftest for allocinfo ioctl
+ * allocinfo ioctl retrieves allocinfo data through ioctl
+ * Copyright (C) 2026 Google, Inc.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <linux/types.h>
+#include <linux/alloc_tag.h>
+#include "../kselftest.h"
+
+#define MAX_LINE_LEN		512
+#define ALLOCINFO_PROC		"/proc/allocinfo"
+
+enum ioctl_ret {
+	IOCTL_SUCCESS = 0,
+	IOCTL_FAILURE = 1,
+	IOCTL_INVALID_DATA = 2,
+};
+
+#define VEC_MAX_ENTRIES 32
+
+struct allocinfo_tag_data_vec {
+	struct allocinfo_tag_data tag[VEC_MAX_ENTRIES];
+	__u64 count;
+};
+
+static inline int __allocinfo_get_content_id(int dev_fd, struct allocinfo_content_id *params)
+{
+	return ioctl(dev_fd, ALLOCINFO_IOC_CONTENT_ID, params);
+}
+
+static inline int __allocinfo_get_at(int dev_fd, struct allocinfo_get_at *params)
+{
+	return ioctl(dev_fd, ALLOCINFO_IOC_GET_AT, params);
+}
+
+static inline int __allocinfo_get_next(int dev_fd, struct allocinfo_tag_data *params)
+{
+	return ioctl(dev_fd, ALLOCINFO_IOC_GET_NEXT, params);
+}
+
+static bool match_entry(const struct allocinfo_tag_data *procfs_entry,
+			const struct allocinfo_tag_data *tag_data,
+			bool match_bytes, bool match_calls, bool match_lineno,
+			bool match_function, bool match_filename)
+{
+	if (match_bytes && tag_data->counter.bytes != procfs_entry->counter.bytes) {
+		ksft_print_msg("size retrieved through ioctl does not match procfs\n");
+		return false;
+	}
+
+	if (match_calls && tag_data->counter.calls != procfs_entry->counter.calls) {
+		ksft_print_msg("call count retrieved through ioctl does not match procfs\n");
+		return false;
+	}
+
+	if (match_lineno && tag_data->tag.lineno != procfs_entry->tag.lineno) {
+		ksft_print_msg("lineno retrieved through ioctl does not match procfs\n");
+		return false;
+	}
+
+	if (match_function &&
+	    strncmp(tag_data->tag.function, procfs_entry->tag.function, ALLOCINFO_STR_SIZE)) {
+		ksft_print_msg("function retrieved through ioctl does not match procfs\n");
+		return false;
+	}
+
+	if (match_filename &&
+	    strncmp(tag_data->tag.filename, procfs_entry->tag.filename, ALLOCINFO_STR_SIZE)) {
+		ksft_print_msg("filename retrieved through ioctl does not match procfs\n");
+		return false;
+	}
+	return true;
+}
+
+static bool match_entries(const struct allocinfo_tag_data_vec *procfs_entries,
+			  const struct allocinfo_tag_data_vec *tags,
+			  bool match_bytes, bool match_calls, bool match_lineno,
+			  bool match_function, bool match_filename)
+{
+	__u64 i;
+
+	if (procfs_entries->count != tags->count) {
+		ksft_print_msg("Entry count mismatch. ioctl entries: %llu, proc entries: %llu\n",
+			       tags->count, procfs_entries->count);
+		return false;
+	}
+	for (i = 0; i < procfs_entries->count; i++) {
+		if (!match_entry(&procfs_entries->tag[i], &tags->tag[i],
+				 match_bytes, match_calls, match_lineno,
+				 match_function, match_filename)) {
+			ksft_print_msg("%lluth entry does not match.\n", i);
+			return false;
+		}
+	}
+	return true;
+}
+
+static const char *allocinfo_str(const char *str)
+{
+	size_t len = strlen(str);
+
+	if (len >= ALLOCINFO_STR_SIZE)
+		str += (len - ALLOCINFO_STR_SIZE) + 1;
+	return str;
+}
+
+static void allocinfo_copy_str(char *dest, const char *src)
+{
+	strncpy(dest, allocinfo_str(src), ALLOCINFO_STR_SIZE - 1);
+	dest[ALLOCINFO_STR_SIZE - 1] = '\0';
+}
+
+static int get_filtered_procfs_entries(struct allocinfo_tag_data_vec *procfs_entries,
+				       const struct allocinfo_filter *filter)
+{
+	FILE *fp = fopen(ALLOCINFO_PROC, "r");
+	char line[MAX_LINE_LEN];
+	int matches;
+	struct allocinfo_tag_data procfs_entry;
+
+	if (!fp) {
+		ksft_print_msg("Failed to open " ALLOCINFO_PROC " for reading\n");
+		return 1;
+	}
+	memset(procfs_entries, 0, sizeof(*procfs_entries));
+	while (fgets(line, sizeof(line), fp) && procfs_entries->count < VEC_MAX_ENTRIES) {
+		char filename[MAX_LINE_LEN];
+		char function[MAX_LINE_LEN];
+
+		memset(&procfs_entry, 0, sizeof(procfs_entry));
+		matches = sscanf(line, "%llu %llu %[^:]:%llu func:%s",
+				 &procfs_entry.counter.bytes,
+				 &procfs_entry.counter.calls,
+				 filename,
+				 &procfs_entry.tag.lineno,
+				 function);
+
+		if (matches != 5)
+			continue;
+
+		allocinfo_copy_str(procfs_entry.tag.filename, filename);
+		allocinfo_copy_str(procfs_entry.tag.function, function);
+
+		if (filter->mask & ALLOCINFO_FILTER_MASK_FILENAME) {
+			if (strncmp(procfs_entry.tag.filename,
+				    filter->fields.filename, ALLOCINFO_STR_SIZE))
+				continue;
+		}
+		if (filter->mask & ALLOCINFO_FILTER_MASK_FUNCTION) {
+			if (strncmp(procfs_entry.tag.function,
+				    filter->fields.function, ALLOCINFO_STR_SIZE))
+				continue;
+		}
+		if (filter->mask & ALLOCINFO_FILTER_MASK_LINENO) {
+			if (procfs_entry.tag.lineno != filter->fields.lineno)
+				continue;
+		}
+		if (filter->mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) {
+			if (procfs_entry.counter.bytes < filter->min_size)
+				continue;
+		}
+		if (filter->mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) {
+			if (procfs_entry.counter.bytes > filter->max_size)
+				continue;
+		}
+
+		memcpy(&procfs_entries->tag[procfs_entries->count++], &procfs_entry,
+		       sizeof(procfs_entry));
+	}
+	fclose(fp);
+	return 0;
+}
+
+static enum ioctl_ret get_filtered_ioctl_entries(struct allocinfo_tag_data_vec *tags,
+						 const struct allocinfo_filter *filter,
+						 __u64 start_pos)
+{
+	int fd = open(ALLOCINFO_PROC, O_RDONLY);
+
+	if (fd < 0) {
+		ksft_print_msg("Failed to open " ALLOCINFO_PROC " for IOCTL\n");
+		return IOCTL_FAILURE;
+	}
+	struct allocinfo_content_id start_cont_id, end_cont_id;
+	struct allocinfo_get_at get_at_params;
+	const int max_retries = 10;
+	int retry_count = 0;
+	int status;
+
+	/*
+	 * __allocinfo_get_content_id may return different values if a kernel module was loaded
+	 * between the two calls. If that happens, the data gathered cannot be considered consistent
+	 * and hence needs to be fetched again to avoid flakiness.
+	 */
+	do {
+		if (__allocinfo_get_content_id(fd, &start_cont_id)) {
+			ksft_print_msg("allocinfo_get_content_id failed\n");
+			return IOCTL_FAILURE;
+		}
+
+		memset(tags, 0, sizeof(*tags));
+		memset(&get_at_params, 0, sizeof(get_at_params));
+		memcpy(&get_at_params.filter, filter, sizeof(*filter));
+		get_at_params.pos = start_pos;
+		if (__allocinfo_get_at(fd, &get_at_params)) {
+			ksft_print_msg("allocinfo_get_at failed\n");
+			return IOCTL_FAILURE;
+		}
+		memcpy(&tags->tag[tags->count++], &get_at_params.data, sizeof(get_at_params.data));
+
+		while (tags->count < VEC_MAX_ENTRIES &&
+		       __allocinfo_get_next(fd, &tags->tag[tags->count]) == 0)
+			tags->count++;
+
+		if (__allocinfo_get_content_id(fd, &end_cont_id)) {
+			ksft_print_msg("allocinfo_get_content_id failed\n");
+			return IOCTL_FAILURE;
+		}
+
+		if (start_cont_id.id == end_cont_id.id) {
+			status = IOCTL_SUCCESS;
+		} else {
+			ksft_print_msg("allocinfo_get_content_id mismatch, retrying...\n");
+			status = IOCTL_INVALID_DATA;
+		}
+	} while (status == IOCTL_INVALID_DATA && retry_count++ < max_retries);
+
+	close(fd);
+	return status;
+}
+
+static int run_filter_test(const struct allocinfo_filter *filter)
+{
+	struct allocinfo_tag_data_vec *tags = malloc(sizeof(*tags));
+	struct allocinfo_tag_data_vec *procfs_entries = malloc(sizeof(*procfs_entries));
+	int ioctl_status;
+	int ret = KSFT_PASS;
+
+	if (!tags || !procfs_entries) {
+		ksft_print_msg("Memory allocation failed.\n");
+		ret = KSFT_FAIL;
+		goto exit;
+	}
+
+	if (get_filtered_procfs_entries(procfs_entries, filter)) {
+		ksft_print_msg("Error retrieving entries from " ALLOCINFO_PROC "\n");
+		ret = KSFT_SKIP;
+		goto exit;
+	}
+
+	if (procfs_entries->count == 0) {
+		ksft_print_msg("No entries found in " ALLOCINFO_PROC ", skipping test\n");
+		ret = KSFT_SKIP;
+		goto exit;
+	}
+
+	ioctl_status = get_filtered_ioctl_entries(tags, filter, 0);
+	if (ioctl_status == IOCTL_INVALID_DATA) {
+		ksft_print_msg("Trouble retrieving valid IOCTL entries, skipping.\n");
+		ret = KSFT_SKIP;
+		goto exit;
+	}
+	if (ioctl_status == IOCTL_FAILURE) {
+		ksft_print_msg("Error retrieving IOCTL entries.\n");
+		ret = KSFT_FAIL;
+		goto exit;
+	}
+
+	if (!match_entries(procfs_entries, tags, false, false, true, true, true))
+		ret = KSFT_FAIL;
+
+exit:
+	free(tags);
+	free(procfs_entries);
+	return ret;
+}
+
+static int test_filename_filter(void)
+{
+	struct allocinfo_filter filter;
+	const char *target_filename = "mm/memory.c";
+
+	memset(&filter, 0, sizeof(filter));
+	filter.mask |= ALLOCINFO_FILTER_MASK_FILENAME;
+	strncpy(filter.fields.filename, target_filename, ALLOCINFO_STR_SIZE);
+
+	return run_filter_test(&filter);
+}
+
+static int test_function_filter(void)
+{
+	struct allocinfo_filter filter;
+	const char *target_function = "dup_mm";
+
+	memset(&filter, 0, sizeof(filter));
+	filter.mask |= ALLOCINFO_FILTER_MASK_FUNCTION;
+	strncpy(filter.fields.function, target_function, ALLOCINFO_STR_SIZE);
+
+	return run_filter_test(&filter);
+}
+
+int main(int argc, char *argv[])
+{
+	int ret;
+
+	ksft_set_plan(2);
+
+	ret = test_filename_filter();
+	if (ret == KSFT_SKIP)
+		ksft_test_result_skip("Skipping test_filename_filter\n");
+	else
+		ksft_test_result(ret == KSFT_PASS, "test_filename_filter\n");
+
+	ret = test_function_filter();
+	if (ret == KSFT_SKIP)
+		ksft_test_result_skip("Skipping test_function_filter\n");
+	else
+		ksft_test_result(ret == KSFT_PASS, "test_function_filter\n");
+
+	ksft_finished();
+}
-- 
2.54.0.1136.gdb2ca164c4-goog



* [PATCH v5 4/6] alloc_tag: add accuracy based filtering to ioctl
From: Abhishek Bapat @ 2026-06-15 23:04 UTC (permalink / raw)
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat
In-Reply-To: <cover.1781564384.git.abhishekbapat@google.com>

Extend the allocinfo filtering mechanism to allow users to filter tags
based on their accuracy.
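
A minimal sketch of the comparison, assuming a hypothetical flag bit
TAG_FLAG_INACCURATE in place of CODETAG_FLAG_INACCURATE. Both sides are
normalized to 0 or 1 first, so any non-zero value in the filter field
selects inaccurate tags.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_FLAG_INACCURATE (1u << 3)	/* hypothetical flag bit */

/* True when the tag's accuracy state is the one the filter asks for. */
static bool accuracy_matches(unsigned int tag_flags, uint64_t want_inaccurate)
{
	bool inaccurate = !!(tag_flags & TAG_FLAG_INACCURATE);

	return inaccurate == !!want_inaccurate;
}
```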

Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
Acked-by: Hao Ge <hao.ge@linux.dev>
---
 include/uapi/linux/alloc_tag.h | 4 ++++
 lib/alloc_tag.c                | 8 ++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
index 7f5acbb44c14..6ea39c4869fe 100644
--- a/include/uapi/linux/alloc_tag.h
+++ b/include/uapi/linux/alloc_tag.h
@@ -26,6 +26,8 @@ struct allocinfo_tag {
 	char function[ALLOCINFO_STR_SIZE];
 	char filename[ALLOCINFO_STR_SIZE];
 	__u64 lineno;
+	/* filter criteria only; see allocinfo_counter.accurate for actual accuracy */
+	__u64 inaccurate;
 };
 
 /* The alignment ensures 32-bit compatible interfaces are not broken */
@@ -45,6 +47,7 @@ enum {
 	ALLOCINFO_FILTER_FUNCTION,
 	ALLOCINFO_FILTER_FILENAME,
 	ALLOCINFO_FILTER_LINENO,
+	ALLOCINFO_FILTER_INACCURATE,
 	ALLOCINFO_FILTER_MIN_SIZE,
 	ALLOCINFO_FILTER_MAX_SIZE,
 	__ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_MAX_SIZE
@@ -54,6 +57,7 @@ enum {
 #define ALLOCINFO_FILTER_MASK_FUNCTION		(1 << ALLOCINFO_FILTER_FUNCTION)
 #define ALLOCINFO_FILTER_MASK_FILENAME		(1 << ALLOCINFO_FILTER_FILENAME)
 #define ALLOCINFO_FILTER_MASK_LINENO		(1 << ALLOCINFO_FILTER_LINENO)
+#define ALLOCINFO_FILTER_MASK_INACCURATE	(1 << ALLOCINFO_FILTER_INACCURATE)
 #define ALLOCINFO_FILTER_MASK_MIN_SIZE		(1 << ALLOCINFO_FILTER_MIN_SIZE)
 #define ALLOCINFO_FILTER_MASK_MAX_SIZE		(1 << ALLOCINFO_FILTER_MAX_SIZE)
 
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index b3d21834b61e..4fb3653cb876 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -253,6 +253,8 @@ static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter,
 			   struct alloc_tag_counters *counters,
 			   bool *fetched_counters)
 {
+	bool inaccurate;
+
 	if (!filter || !filter->mask)
 		return true;
 
@@ -278,6 +280,12 @@ static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter,
 	    ct->lineno != filter->fields.lineno)
 		return false;
 
+	if (filter->mask & ALLOCINFO_FILTER_MASK_INACCURATE) {
+		inaccurate = !!(ct->flags & CODETAG_FLAG_INACCURATE);
+		if (inaccurate != !!(filter->fields.inaccurate))
+			return false;
+	}
+
 	if (filter->mask & (ALLOCINFO_FILTER_MASK_MIN_SIZE | ALLOCINFO_FILTER_MASK_MAX_SIZE)) {
 		if (!*fetched_counters) {
 			*counters = allocinfo_prefetch_counters(ct);
-- 
2.54.0.1136.gdb2ca164c4-goog



* [PATCH v5 3/6] alloc_tag: add size-based filtering to ioctl
From: Abhishek Bapat @ 2026-06-15 23:04 UTC (permalink / raw)
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat
In-Reply-To: <cover.1781564384.git.abhishekbapat@google.com>

Extend the allocinfo filtering mechanism to allow users to filter tags
based on the total number of bytes allocated, within the range
[min_size, max_size]. The size range is inclusive.

Filtering by size involves retrieving allocinfo per-CPU counters, which
is an expensive operation. Hence, the performance of size-based
filtering will be worse than other filters.
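
The range semantics can be sketched as follows. The mask bit values and
struct size_filter are hypothetical and only for illustration; the real
bits come from the UAPI enum. Each bound applies only when its mask bit
is set, and min_size > max_size is rejected only when both are set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this sketch. */
#define MASK_MIN_SIZE (1u << 0)
#define MASK_MAX_SIZE (1u << 1)

struct size_filter {
	uint64_t mask;
	uint64_t min_size;
	uint64_t max_size;
};

/* A filter is invalid only if both bounds are active and inverted. */
static bool size_filter_valid(const struct size_filter *f)
{
	if ((f->mask & MASK_MIN_SIZE) && (f->mask & MASK_MAX_SIZE))
		return f->min_size <= f->max_size;
	return true;
}

/* Inclusive check: bytes equal to either bound still matches. */
static bool size_matches(const struct size_filter *f, uint64_t bytes)
{
	if ((f->mask & MASK_MIN_SIZE) && bytes < f->min_size)
		return false;
	if ((f->mask & MASK_MAX_SIZE) && bytes > f->max_size)
		return false;
	return true;
}
```

Setting min_size equal to max_size gives an exact-size match, which is
how the selftest in this series uses the filter.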

Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
Acked-by: Hao Ge <hao.ge@linux.dev>
---
 include/uapi/linux/alloc_tag.h |  8 ++++-
 lib/alloc_tag.c                | 63 ++++++++++++++++++++++++++++------
 2 files changed, 59 insertions(+), 12 deletions(-)

diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
index 3b11877955b9..7f5acbb44c14 100644
--- a/include/uapi/linux/alloc_tag.h
+++ b/include/uapi/linux/alloc_tag.h
@@ -45,13 +45,17 @@ enum {
 	ALLOCINFO_FILTER_FUNCTION,
 	ALLOCINFO_FILTER_FILENAME,
 	ALLOCINFO_FILTER_LINENO,
-	__ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_LINENO
+	ALLOCINFO_FILTER_MIN_SIZE,
+	ALLOCINFO_FILTER_MAX_SIZE,
+	__ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_MAX_SIZE
 };
 
 #define ALLOCINFO_FILTER_MASK_MODNAME		(1 << ALLOCINFO_FILTER_MODNAME)
 #define ALLOCINFO_FILTER_MASK_FUNCTION		(1 << ALLOCINFO_FILTER_FUNCTION)
 #define ALLOCINFO_FILTER_MASK_FILENAME		(1 << ALLOCINFO_FILTER_FILENAME)
 #define ALLOCINFO_FILTER_MASK_LINENO		(1 << ALLOCINFO_FILTER_LINENO)
+#define ALLOCINFO_FILTER_MASK_MIN_SIZE		(1 << ALLOCINFO_FILTER_MIN_SIZE)
+#define ALLOCINFO_FILTER_MASK_MAX_SIZE		(1 << ALLOCINFO_FILTER_MAX_SIZE)
 
 #define ALLOCINFO_FILTER_MASKS \
 	((1 << (__ALLOCINFO_FILTER_LAST + 1)) - 1)
@@ -59,6 +63,8 @@ enum {
 struct allocinfo_filter {
 	__u64 mask; /* bitmask of the filter fields used */
 	struct allocinfo_tag fields;
+	__u64 min_size;
+	__u64 max_size;
 };
 
 struct allocinfo_get_at {
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index 5feb61d9fb92..b3d21834b61e 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -195,15 +195,26 @@ static int allocinfo_cmp_str(const char *str, const char *template)
 	return strncmp(allocinfo_str(str), template, ALLOCINFO_STR_SIZE);
 }
 
+/* Fetch the per-CPU counters */
+static inline struct alloc_tag_counters allocinfo_prefetch_counters(struct codetag *ct)
+{
+	return alloc_tag_read(ct_to_alloc_tag(ct));
+}
+
 /*
  * Populates the UAPI allocinfo_tag_data structure with active runtime
  * profiling counters extracted from the given kernel codetag.
  */
 static void allocinfo_to_params(struct codetag *ct,
-				struct allocinfo_tag_data *data)
+				struct allocinfo_tag_data *data,
+				struct alloc_tag_counters *counters)
 {
-	struct alloc_tag *tag = ct_to_alloc_tag(ct);
-	struct alloc_tag_counters counter = alloc_tag_read(tag);
+	struct alloc_tag_counters local_counters;
+
+	if (!counters) {
+		local_counters = allocinfo_prefetch_counters(ct);
+		counters = &local_counters;
+	}
 
 	if (ct->modname)
 		allocinfo_copy_str(data->tag.modname, ct->modname);
@@ -212,9 +223,9 @@ static void allocinfo_to_params(struct codetag *ct,
 	allocinfo_copy_str(data->tag.function, ct->function);
 	allocinfo_copy_str(data->tag.filename, ct->filename);
 	data->tag.lineno = ct->lineno;
-	data->counter.bytes = counter.bytes;
-	data->counter.calls = counter.calls;
-	data->counter.accurate = !alloc_tag_is_inaccurate(tag);
+	data->counter.bytes = counters->bytes;
+	data->counter.calls = counters->calls;
+	data->counter.accurate = !alloc_tag_is_inaccurate(ct_to_alloc_tag(ct));
 }
 
 /*
@@ -238,7 +249,9 @@ static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
  * Verifies whether a given codetag satisfies the active filtering criteria by
  * matching its characteristics against the specified filter.
  */
-static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter)
+static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter,
+			   struct alloc_tag_counters *counters,
+			   bool *fetched_counters)
 {
 	if (!filter || !filter->mask)
 		return true;
@@ -265,6 +278,19 @@ static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter)
 	    ct->lineno != filter->fields.lineno)
 		return false;
 
+	if (filter->mask & (ALLOCINFO_FILTER_MASK_MIN_SIZE | ALLOCINFO_FILTER_MASK_MAX_SIZE)) {
+		if (!*fetched_counters) {
+			*counters = allocinfo_prefetch_counters(ct);
+			*fetched_counters = true;
+		}
+		if ((filter->mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) &&
+		    counters->bytes < filter->min_size)
+			return false;
+		if ((filter->mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) &&
+		    counters->bytes > filter->max_size)
+			return false;
+	}
+
 	return true;
 }
 
@@ -278,6 +304,8 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
 	struct codetag *ct;
 	struct allocinfo_get_at params = {0};
 	__u64 skip_count;
+	struct alloc_tag_counters counters;
+	bool fetched_counters;
 
 	if (copy_from_user(&params, arg, sizeof(params)))
 		return -EFAULT;
@@ -285,6 +313,11 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
 	if (params.filter.mask & ~ALLOCINFO_FILTER_MASKS)
 		return -EINVAL;
 
+	if ((params.filter.mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) &&
+	    (params.filter.mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) &&
+	    params.filter.min_size > params.filter.max_size)
+		return -EINVAL;
+
 	priv = m->private;
 
 	mutex_lock(&priv->ioctl_lock);
@@ -308,7 +341,8 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
 	ct = codetag_next_ct(&priv->ioctl_iter);
 
 	while (ct) {
-		if (matches_filter(ct, &priv->filter)) {
+		fetched_counters = false;
+		if (matches_filter(ct, &priv->filter, &counters, &fetched_counters)) {
 			if (skip_count == 0)
 				break;
 			skip_count--;
@@ -317,7 +351,7 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
 	}
 
 	if (ct) {
-		allocinfo_to_params(ct, &params.data);
+		allocinfo_to_params(ct, &params.data, fetched_counters ? &counters : NULL);
 		priv->positioned = true;
 	}
 
@@ -343,6 +377,8 @@ static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
 	struct codetag *ct;
 	struct allocinfo_tag_data params;
 	int ret = 0;
+	struct alloc_tag_counters counters;
+	bool fetched_counters;
 
 	memset(&params, 0, sizeof(params));
 	priv = m->private;
@@ -356,10 +392,15 @@ static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
 	}
 
 	ct = codetag_next_ct(&priv->ioctl_iter);
-	while (ct && !matches_filter(ct, &priv->filter))
+	while (ct) {
+		fetched_counters = false;
+		if (matches_filter(ct, &priv->filter, &counters, &fetched_counters))
+			break;
 		ct = codetag_next_ct(&priv->ioctl_iter);
+	}
+
 	if (ct)
-		allocinfo_to_params(ct, &params);
+		allocinfo_to_params(ct, &params, fetched_counters ? &counters : NULL);
 
 	if (!ct) {
 		priv->positioned = false;
-- 
2.54.0.1136.gdb2ca164c4-goog



* [PATCH v5 2/6] alloc_tag: add ioctl filters to /proc/allocinfo
From: Abhishek Bapat @ 2026-06-15 23:04 UTC (permalink / raw)
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat
In-Reply-To: <cover.1781564384.git.abhishekbapat@google.com>

Extend the capability of the IOCTL mechanism to filter allocations based
on the tag's module name, function name, file name and line number.
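
String filters are compared against the trimmed tail of the kernel
string when that string does not fit the fixed-size UAPI field. A
minimal userspace sketch of that behaviour, with a deliberately tiny
STR_SIZE and the hypothetical names tail_str() and cmp_tail() standing
in for allocinfo_str() and allocinfo_cmp_str():

```c
#include <string.h>

#define STR_SIZE 8	/* tiny stand-in for ALLOCINFO_STR_SIZE */

/* Return the last STR_SIZE - 1 characters of str, or str itself if it fits. */
static const char *tail_str(const char *str)
{
	size_t len = strlen(str);

	if (len >= STR_SIZE)
		str += (len - STR_SIZE) + 1;
	return str;
}

/* Compare the (possibly trimmed) string against the user-supplied pattern. */
static int cmp_tail(const char *str, const char *pattern)
{
	return strncmp(tail_str(str), pattern, STR_SIZE);
}
```

With STR_SIZE of 8, "abcdefghij" is matched by the pattern "defghij" and
not by "abcdefg". By the same logic, a filter on an over-long file name
would need to carry the end of the path rather than its beginning.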

Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
Acked-by: Hao Ge <hao.ge@linux.dev>
---
 include/uapi/linux/alloc_tag.h | 26 ++++++++++++-
 lib/alloc_tag.c                | 68 ++++++++++++++++++++++++++++++++--
 2 files changed, 89 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
index 0928e1a48d49..3b11877955b9 100644
--- a/include/uapi/linux/alloc_tag.h
+++ b/include/uapi/linux/alloc_tag.h
@@ -40,8 +40,32 @@ struct allocinfo_tag_data {
 	struct allocinfo_counter counter;
 };
 
+enum {
+	ALLOCINFO_FILTER_MODNAME,
+	ALLOCINFO_FILTER_FUNCTION,
+	ALLOCINFO_FILTER_FILENAME,
+	ALLOCINFO_FILTER_LINENO,
+	__ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_LINENO
+};
+
+#define ALLOCINFO_FILTER_MASK_MODNAME		(1 << ALLOCINFO_FILTER_MODNAME)
+#define ALLOCINFO_FILTER_MASK_FUNCTION		(1 << ALLOCINFO_FILTER_FUNCTION)
+#define ALLOCINFO_FILTER_MASK_FILENAME		(1 << ALLOCINFO_FILTER_FILENAME)
+#define ALLOCINFO_FILTER_MASK_LINENO		(1 << ALLOCINFO_FILTER_LINENO)
+
+#define ALLOCINFO_FILTER_MASKS \
+	((1 << (__ALLOCINFO_FILTER_LAST + 1)) - 1)
+
+struct allocinfo_filter {
+	__u64 mask; /* bitmask of the filter fields used */
+	struct allocinfo_tag fields;
+};
+
 struct allocinfo_get_at {
-	__u64 pos;	/* input */
+	/* inputs */
+	__u64 pos;
+	struct allocinfo_filter filter;
+	/* output */
 	struct allocinfo_tag_data data;
 };
 
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index 82e3b5f32dff..5feb61d9fb92 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -49,6 +49,7 @@ struct allocinfo_private {
 	struct codetag_iterator iter;
 	struct codetag_iterator reported_iter;
 	bool print_header;
+	struct allocinfo_filter filter;
 	/* ioctl uses a separate iterator not to interfere with reads */
 	struct codetag_iterator ioctl_iter;
 	bool positioned; /* seq_open_private() sets to 0 */
@@ -188,6 +189,12 @@ static void allocinfo_copy_str(char *dest, const char *src)
 	strscpy_pad(dest, allocinfo_str(src), ALLOCINFO_STR_SIZE);
 }
 
+/* Compare two strings and only consider the trimmed suffix if str is too long */
+static int allocinfo_cmp_str(const char *str, const char *template)
+{
+	return strncmp(allocinfo_str(str), template, ALLOCINFO_STR_SIZE);
+}
+
 /*
  * Populates the UAPI allocinfo_tag_data structure with active runtime
  * profiling counters extracted from the given kernel codetag.
@@ -227,6 +234,40 @@ static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
 	return 0;
 }
 
+/*
+ * Verifies whether a given codetag satisfies the active filtering criteria by
+ * matching its characteristics against the specified filter.
+ */
+static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter)
+{
+	if (!filter || !filter->mask)
+		return true;
+
+	if (filter->mask & ALLOCINFO_FILTER_MASK_MODNAME) {
+		/* user wants to filter by modname but ct->modname is NULL */
+		if (!ct->modname) {
+			/* validate if user was attempting to filter for built-in allocations */
+			if (filter->fields.modname[0] != '\0')
+				return false;
+		} else if (allocinfo_cmp_str(ct->modname, filter->fields.modname))
+			return false;
+	}
+
+	if ((filter->mask & ALLOCINFO_FILTER_MASK_FUNCTION) &&
+	    ct->function && allocinfo_cmp_str(ct->function, filter->fields.function))
+		return false;
+
+	if ((filter->mask & ALLOCINFO_FILTER_MASK_FILENAME) &&
+	    ct->filename && allocinfo_cmp_str(ct->filename, filter->fields.filename))
+		return false;
+
+	if ((filter->mask & ALLOCINFO_FILTER_MASK_LINENO) &&
+	    ct->lineno != filter->fields.lineno)
+		return false;
+
+	return true;
+}
+
 /*
  * Seeks the ioctl iterator to the specified 0-indexed tag position, reads its
  * profiling data and returns it to userspace.
@@ -235,29 +276,46 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
 {
 	struct allocinfo_private *priv;
 	struct codetag *ct;
-	__u64 pos;
 	struct allocinfo_get_at params = {0};
+	__u64 skip_count;
 
 	if (copy_from_user(&params, arg, sizeof(params)))
 		return -EFAULT;
 
+	if (params.filter.mask & ~ALLOCINFO_FILTER_MASKS)
+		return -EINVAL;
+
 	priv = m->private;
-	pos = params.pos;
 
 	mutex_lock(&priv->ioctl_lock);
 	codetag_lock_module_list(alloc_tag_cttype);
 
-	if (pos >= codetag_get_count(alloc_tag_cttype)) {
+	if (params.pos >= codetag_get_count(alloc_tag_cttype)) {
 		codetag_unlock_module_list(alloc_tag_cttype);
 		mutex_unlock(&priv->ioctl_lock);
 		return -ENOENT;
 	}
 
+	skip_count = params.pos;
+
+	if (params.filter.mask)
+		priv->filter = params.filter;
+	else
+		priv->filter.mask = 0;
+
 	/* Find the codetag */
 	priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
 	ct = codetag_next_ct(&priv->ioctl_iter);
-	while (ct && pos--)
+
+	while (ct) {
+		if (matches_filter(ct, &priv->filter)) {
+			if (skip_count == 0)
+				break;
+			skip_count--;
+		}
 		ct = codetag_next_ct(&priv->ioctl_iter);
+	}
+
 	if (ct) {
 		allocinfo_to_params(ct, &params.data);
 		priv->positioned = true;
@@ -298,6 +356,8 @@ static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
 	}
 
 	ct = codetag_next_ct(&priv->ioctl_iter);
+	while (ct && !matches_filter(ct, &priv->filter))
+		ct = codetag_next_ct(&priv->ioctl_iter);
 	if (ct)
 		allocinfo_to_params(ct, &params);
 
-- 
2.54.0.1136.gdb2ca164c4-goog
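
The filtered lookup in allocinfo_ioctl_get_at() above counts only records
that pass the filter, so pos selects the pos-th matching record rather
than the pos-th record overall. A minimal userspace sketch of that
skip-count loop follows. The demo_tag and demo_filter types and the
demo_* functions are hypothetical stand-ins for illustration, not part of
the patch or the UAPI:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a code tag record. */
struct demo_tag {
	const char *filename;
	unsigned long lineno;
};

/* Hypothetical filter: a NULL filename means "match any file". */
struct demo_filter {
	const char *filename;
};

static int demo_matches(const struct demo_tag *t, const struct demo_filter *f)
{
	return !f->filename || strcmp(t->filename, f->filename) == 0;
}

/*
 * Return the index of the pos-th (0-based) matching record, or -1 if fewer
 * than pos + 1 records match. Non-matching records are stepped over without
 * consuming pos, mirroring the skip_count loop in the patch.
 */
static long demo_get_at(const struct demo_tag *tags, size_t n,
			const struct demo_filter *f, unsigned long long pos)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!demo_matches(&tags[i], f))
			continue;
		if (pos == 0)
			return (long)i;
		pos--;
	}
	return -1;
}
```

With five records whose filenames are a.c, b.c, a.c, c.c, a.c and a filter
on a.c, positions 0, 1 and 2 map to indexes 0, 2 and 4, and position 3
finds nothing.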



* [PATCH v5 1/6] alloc_tag: add ioctl to /proc/allocinfo
From: Abhishek Bapat @ 2026-06-15 23:04 UTC (permalink / raw)
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat
In-Reply-To: <cover.1781564384.git.abhishekbapat@google.com>

From: Suren Baghdasaryan <surenb@google.com>

Add the following ioctl commands for /proc/allocinfo file:

ALLOCINFO_IOC_CONTENT_ID - gets content identifier which can be used
to check whether the file content has changed specifically due to module
load/unload. Every time a module is loaded / unloaded, the returned
value will be different. By comparing the identifier value at the
beginning and at the end of the content retrieval operation, users can
validate retrieved information for consistency.

ALLOCINFO_IOC_GET_AT - gets the record at the specified position. This
is the position of a record in /proc/allocinfo.

ALLOCINFO_IOC_GET_NEXT - gets the record next to the last retrieved
one. If no records were previously retrieved, returns the first
record.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
---
 Documentation/mm/allocation-profiling.rst     |   5 +
 .../userspace-api/ioctl/ioctl-number.rst      |   2 +
 MAINTAINERS                                   |   1 +
 include/linux/codetag.h                       |   2 +
 include/uapi/linux/alloc_tag.h                |  60 +++++
 lib/alloc_tag.c                               | 235 +++++++++++++++++-
 lib/codetag.c                                 |  18 ++
 7 files changed, 321 insertions(+), 2 deletions(-)
 create mode 100644 include/uapi/linux/alloc_tag.h

diff --git a/Documentation/mm/allocation-profiling.rst b/Documentation/mm/allocation-profiling.rst
index 5389d241176a..c3a28467955f 100644
--- a/Documentation/mm/allocation-profiling.rst
+++ b/Documentation/mm/allocation-profiling.rst
@@ -46,6 +46,11 @@ sysctl:
 Runtime info:
   /proc/allocinfo
 
+  Profiling data can be retrieved either by reading `/proc/allocinfo` directly as
+  text or programmatically via `ioctl()` calls defined in `<uapi/linux/alloc_tag.h>`.
+  The ioctl interface supports structured binary data extraction as well as filtering
+  by module name, function, file, line number, accuracy, or allocation size limits.
+
 Example output::
 
   root@moria-kvm:~# sort -g /proc/allocinfo|tail|numfmt --to=iec
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 331223761fff..84f6808a8578 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -349,6 +349,8 @@ Code  Seq#    Include File                                             Comments
                                                                        <mailto:luzmaximilian@gmail.com>
 0xA5  20-2F  linux/surface_aggregator/dtx.h                            Microsoft Surface DTX driver
                                                                        <mailto:luzmaximilian@gmail.com>
+0xA6  00-0F  uapi/linux/alloc_tag.h                                    Memory allocation profiling
+                                                                       <mailto:surenb@google.com>
 0xAA  00-3F  linux/uapi/linux/userfaultfd.h
 0xAB  00-1F  linux/nbd.h
 0xAC  00-1F  linux/raw.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 65bd4328fe05..019cc4c285a3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16713,6 +16713,7 @@ S:	Maintained
 F:	Documentation/mm/allocation-profiling.rst
 F:	include/linux/alloc_tag.h
 F:	include/linux/pgalloc_tag.h
+F:	include/uapi/linux/alloc_tag.h
 F:	lib/alloc_tag.c
 
 MEMORY CONTROLLER DRIVERS
diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index ddae7484ca45..a25a085c2df1 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -77,6 +77,8 @@ struct codetag_iterator {
 void codetag_lock_module_list(struct codetag_type *cttype);
 bool codetag_trylock_module_list(struct codetag_type *cttype);
 void codetag_unlock_module_list(struct codetag_type *cttype);
+unsigned long codetag_get_content_id(struct codetag_type *cttype);
+unsigned int codetag_get_count(struct codetag_type *cttype);
 struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
 struct codetag *codetag_next_ct(struct codetag_iterator *iter);
 
diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
new file mode 100644
index 000000000000..0928e1a48d49
--- /dev/null
+++ b/include/uapi/linux/alloc_tag.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * alloc_tag IOCTL API definition
+ *
+ * Copyright (C) 2026 Google, LLC.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _UAPI_ALLOC_TAG_H
+#define _UAPI_ALLOC_TAG_H
+
+#include <linux/types.h>
+
+#define ALLOCINFO_STR_SIZE	64
+
+struct allocinfo_content_id {
+	__u64 id;
+};
+
+struct allocinfo_tag {
+	/* Longer names are trimmed */
+	char modname[ALLOCINFO_STR_SIZE];
+	char function[ALLOCINFO_STR_SIZE];
+	char filename[ALLOCINFO_STR_SIZE];
+	__u64 lineno;
+};
+
+/* The alignment ensures 32-bit compatible interfaces are not broken */
+struct allocinfo_counter {
+	__u64 bytes;
+	__u64 calls;
+	__u8 accurate;
+} __attribute__((aligned(8)));
+
+struct allocinfo_tag_data {
+	struct allocinfo_tag tag;
+	struct allocinfo_counter counter;
+};
+
+struct allocinfo_get_at {
+	__u64 pos;	/* input */
+	struct allocinfo_tag_data data;
+};
+
+#define _ALLOCINFO_IOC_CONTENT_ID	0
+#define _ALLOCINFO_IOC_GET_AT		1
+#define _ALLOCINFO_IOC_GET_NEXT		2
+
+#define ALLOCINFO_IOC_BASE		0xA6
+#define ALLOCINFO_IOC_CONTENT_ID	_IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_CONTENT_ID,	\
+					     struct allocinfo_content_id)
+#define ALLOCINFO_IOC_GET_AT		_IOWR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_AT,	\
+					      struct allocinfo_get_at)
+#define ALLOCINFO_IOC_GET_NEXT		_IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_NEXT,	\
+					     struct allocinfo_tag_data)
+
+#endif /* _UAPI_ALLOC_TAG_H */
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index d9be1cf5187d..82e3b5f32dff 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -5,6 +5,7 @@
 #include <linux/gfp.h>
 #include <linux/kallsyms.h>
 #include <linux/module.h>
+#include <linux/mutex.h>
 #include <linux/page_ext.h>
 #include <linux/pgalloc_tag.h>
 #include <linux/proc_fs.h>
@@ -14,6 +15,7 @@
 #include <linux/string_choices.h>
 #include <linux/vmalloc.h>
 #include <linux/kmemleak.h>
+#include <uapi/linux/alloc_tag.h>
 
 #define ALLOCINFO_FILE_NAME		"allocinfo"
 #define MODULE_ALLOC_TAG_VMAP_SIZE	(100000UL * sizeof(struct alloc_tag))
@@ -47,6 +49,10 @@ struct allocinfo_private {
 	struct codetag_iterator iter;
 	struct codetag_iterator reported_iter;
 	bool print_header;
+	/* ioctl uses a separate iterator not to interfere with reads */
+	struct codetag_iterator ioctl_iter;
+	bool positioned; /* seq_open_private() sets to 0 */
+	struct mutex ioctl_lock;
 };
 
 static void *allocinfo_start(struct seq_file *m, loff_t *pos)
@@ -130,6 +136,232 @@ static const struct seq_operations allocinfo_seq_op = {
 	.show	= allocinfo_show,
 };
 
+/*
+ * Initializes seq_file operations and allocates private state when opening
+ * the /proc/allocinfo procfs entry.
+ */
+static int allocinfo_open(struct inode *inode, struct file *file)
+{
+	int ret;
+
+	ret = seq_open_private(file, &allocinfo_seq_op,
+			       sizeof(struct allocinfo_private));
+	if (!ret) {
+		struct seq_file *m = file->private_data;
+		struct allocinfo_private *priv = m->private;
+
+		mutex_init(&priv->ioctl_lock);
+	}
+	return ret;
+}
+
+/*
+ * Cleans up the seq_file state and frees up the private state allocated in
+ * allocinfo_open() when closing the /proc/allocinfo file descriptor.
+ */
+static int allocinfo_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *m = file->private_data;
+	struct allocinfo_private *priv = m->private;
+
+	mutex_destroy(&priv->ioctl_lock);
+	return seq_release_private(inode, file);
+}
+
+/*
+ * Returns a pointer to the suffix of a string so that its length fits within
+ * ALLOCINFO_STR_SIZE, preserving the trailing characters.
+ */
+static const char *allocinfo_str(const char *str)
+{
+	size_t len = strlen(str);
+
+	/* Keep an extra space for the trailing NUL. */
+	if (len >= ALLOCINFO_STR_SIZE)
+		str += (len - ALLOCINFO_STR_SIZE) + 1;
+	return str;
+}
+
+/* Copy a string and trim from the beginning if it's too long */
+static void allocinfo_copy_str(char *dest, const char *src)
+{
+	strscpy_pad(dest, allocinfo_str(src), ALLOCINFO_STR_SIZE);
+}
+
+/*
+ * Populates the UAPI allocinfo_tag_data structure with active runtime
+ * profiling counters extracted from the given kernel codetag.
+ */
+static void allocinfo_to_params(struct codetag *ct,
+				struct allocinfo_tag_data *data)
+{
+	struct alloc_tag *tag = ct_to_alloc_tag(ct);
+	struct alloc_tag_counters counter = alloc_tag_read(tag);
+
+	if (ct->modname)
+		allocinfo_copy_str(data->tag.modname, ct->modname);
+	else
+		data->tag.modname[0] = '\0';
+	allocinfo_copy_str(data->tag.function, ct->function);
+	allocinfo_copy_str(data->tag.filename, ct->filename);
+	data->tag.lineno = ct->lineno;
+	data->counter.bytes = counter.bytes;
+	data->counter.calls = counter.calls;
+	data->counter.accurate = !alloc_tag_is_inaccurate(tag);
+}
+
+/*
+ * Retrieves the unique content ID representing the current allocation tag module
+ * layout, allowing userspace to detect if modules were loaded / unloaded.
+ */
+static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
+{
+	struct allocinfo_content_id params;
+
+	codetag_lock_module_list(alloc_tag_cttype);
+	params.id = codetag_get_content_id(alloc_tag_cttype);
+	codetag_unlock_module_list(alloc_tag_cttype);
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/*
+ * Seeks the ioctl iterator to the specified 0-indexed tag position, reads its
+ * profiling data and returns it to userspace.
+ */
+static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
+{
+	struct allocinfo_private *priv;
+	struct codetag *ct;
+	__u64 pos;
+	struct allocinfo_get_at params = {0};
+
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	priv = m->private;
+	pos = params.pos;
+
+	mutex_lock(&priv->ioctl_lock);
+	codetag_lock_module_list(alloc_tag_cttype);
+
+	if (pos >= codetag_get_count(alloc_tag_cttype)) {
+		codetag_unlock_module_list(alloc_tag_cttype);
+		mutex_unlock(&priv->ioctl_lock);
+		return -ENOENT;
+	}
+
+	/* Find the codetag */
+	priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
+	ct = codetag_next_ct(&priv->ioctl_iter);
+	while (ct && pos--)
+		ct = codetag_next_ct(&priv->ioctl_iter);
+	if (ct) {
+		allocinfo_to_params(ct, &params.data);
+		priv->positioned = true;
+	}
+
+	codetag_unlock_module_list(alloc_tag_cttype);
+	mutex_unlock(&priv->ioctl_lock);
+
+	if (!ct)
+		return -ENOENT;
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/*
+ * Advances the ioctl iterator to the next allocation tag in the sequence and
+ * returns its profiling data to userspace.
+ */
+static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
+{
+	struct allocinfo_private *priv;
+	struct codetag *ct;
+	struct allocinfo_tag_data params;
+	int ret = 0;
+
+	memset(&params, 0, sizeof(params));
+	priv = m->private;
+
+	mutex_lock(&priv->ioctl_lock);
+	codetag_lock_module_list(alloc_tag_cttype);
+
+	if (!priv->positioned) {
+		priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
+		priv->positioned = true;
+	}
+
+	ct = codetag_next_ct(&priv->ioctl_iter);
+	if (ct)
+		allocinfo_to_params(ct, &params);
+
+	if (!ct) {
+		priv->positioned = false;
+		ret = -ENOENT;
+	}
+	codetag_unlock_module_list(alloc_tag_cttype);
+	mutex_unlock(&priv->ioctl_lock);
+
+	if (ret == 0) {
+		if (copy_to_user(arg, &params, sizeof(params)))
+			return -EFAULT;
+	}
+	return ret;
+}
+
+/*
+ * Entry point ioctl function for /proc/allocinfo routing requests to fetch the
+ * layout content ID, seek to a specific tag, or read sequential tags.
+ */
+static long allocinfo_ioctl(struct file *file, unsigned int cmd,
+			    unsigned long __arg)
+{
+	void __user *arg = (void __user *)__arg;
+	int ret;
+
+	switch (cmd) {
+	case ALLOCINFO_IOC_CONTENT_ID:
+		ret = allocinfo_ioctl_get_content_id(file->private_data, arg);
+		break;
+	case ALLOCINFO_IOC_GET_AT:
+		ret = allocinfo_ioctl_get_at(file->private_data, arg);
+		break;
+	case ALLOCINFO_IOC_GET_NEXT:
+		ret = allocinfo_ioctl_get_next(file->private_data, arg);
+		break;
+	default:
+		ret = -ENOIOCTLCMD;
+		break;
+	}
+
+	return ret;
+}
+
+#ifdef CONFIG_COMPAT
+static long allocinfo_compat_ioctl(struct file *file, unsigned int cmd,
+				   unsigned long arg)
+{
+	return allocinfo_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
+}
+#endif
+
+static const struct proc_ops allocinfo_proc_ops = {
+	.proc_open		= allocinfo_open,
+	.proc_read_iter		= seq_read_iter,
+	.proc_lseek		= seq_lseek,
+	.proc_release		= allocinfo_release,
+	.proc_ioctl		= allocinfo_ioctl,
+#ifdef CONFIG_COMPAT
+	.proc_compat_ioctl	= allocinfo_compat_ioctl,
+#endif
+};
+
 size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sleep)
 {
 	struct codetag_iterator iter;
@@ -993,8 +1225,7 @@ static int __init alloc_tag_init(void)
 		return 0;
 	}
 
-	if (!proc_create_seq_private(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_seq_op,
-				     sizeof(struct allocinfo_private), NULL)) {
+	if (!proc_create(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_proc_ops)) {
 		pr_err("Failed to create %s file\n", ALLOCINFO_FILE_NAME);
 		shutdown_mem_profiling(false);
 		return -ENOMEM;
diff --git a/lib/codetag.c b/lib/codetag.c
index 4001a7ea6675..a9cda4c962a3 100644
--- a/lib/codetag.c
+++ b/lib/codetag.c
@@ -19,6 +19,8 @@ struct codetag_type {
 	struct codetag_type_desc desc;
 	/* generates unique sequence number for module load */
 	unsigned long next_mod_seq;
+	/* bumped on every module load and unload */
+	unsigned long content_id;
 };
 
 struct codetag_range {
@@ -50,6 +52,20 @@ void codetag_unlock_module_list(struct codetag_type *cttype)
 	up_read(&cttype->mod_lock);
 }
 
+unsigned long codetag_get_content_id(struct codetag_type *cttype)
+{
+	lockdep_assert_held(&cttype->mod_lock);
+
+	return cttype->content_id;
+}
+
+unsigned int codetag_get_count(struct codetag_type *cttype)
+{
+	lockdep_assert_held(&cttype->mod_lock);
+
+	return cttype->count;
+}
+
 struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype)
 {
 	struct codetag_iterator iter = {
@@ -204,6 +220,7 @@ static int codetag_module_init(struct codetag_type *cttype, struct module *mod)
 
 	down_write(&cttype->mod_lock);
 	cmod->mod_seq = ++cttype->next_mod_seq;
+	++cttype->content_id;
 	mod_id = idr_alloc(&cttype->mod_idr, cmod, 0, 0, GFP_KERNEL);
 	if (mod_id >= 0) {
 		if (cttype->desc.module_load) {
@@ -368,6 +385,7 @@ void codetag_unload_module(struct module *mod)
 			cttype->count -= range_size(cttype, &cmod->range);
 			idr_remove(&cttype->mod_idr, mod_id);
 			kfree(cmod);
+			++cttype->content_id;
 		}
 		up_write(&cttype->mod_lock);
 		if (found && cttype->desc.free_section_mem)
-- 
2.54.0.1136.gdb2ca164c4-goog



* [PATCH v5 0/6] alloc_tag: introduce IOCTL-based filtering for MAP
From: Abhishek Bapat @ 2026-06-15 23:04 UTC (permalink / raw)
  To: Suren Baghdasaryan, Andrew Morton, Kent Overstreet, Hao Ge
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Abhishek Bapat

Currently, memory allocation profiling data is primarily exposed through
/proc/allocinfo. While useful for manual inspection, this text-based
interface poses challenges for production monitoring and large-scale
analysis:

1. Userspace must parse large amounts of text to extract specific
fields.
2. To find specific tags, userspace must read the entire dataset, which
requires many context switches and copies a large amount of data.
3. The kernel currently aggregates per-CPU counters for every allocation
size, even those the user intends to filter out immediately.

This series introduces a new IOCTL-based binary interface for allocinfo
that supports kernel-side filtering. By allowing the user to specify a
filter mask, we significantly reduce the work performed in-kernel and
the amount of data transferred to userspace. The IOCTL mechanism was
chosen for allocinfo to address the per-CPU counter aggregation
bottleneck. A traditional read() operation must report the total
allocation count and sizes for every code tag in the system. Doing so
requires iterating across all CPUs to sum their per-CPU counters for
thousands of tags, which introduces substantial runtime overhead.

The IOCTL interface allows userspace to push selective filtering
criteria directly into the kernel before the per-CPU counter
aggregation. The kernel aggregates per-CPU counters only for a small
subset of tags that match the filter. This results in significant
performance improvement.
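
As a rough illustration of why the ordering matters, the sketch below
runs the cheap string check first and sums per-CPU counters only for
records that pass it. The demo_pcpu_tag type, DEMO_NR_CPUS and the
demo_* functions are hypothetical and far simpler than the kernel's
alloc_tag counters; the call counter exists only to make the saving
visible:

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Hypothetical record with one byte counter per CPU. */
struct demo_pcpu_tag {
	const char *filename;
	unsigned long long percpu_bytes[DEMO_NR_CPUS];
};

/* Number of per-CPU aggregations performed; for illustration only. */
static unsigned long demo_sum_calls;

/* The expensive step: walk every CPU's counter for one tag. */
static unsigned long long demo_sum_bytes(const struct demo_pcpu_tag *t)
{
	unsigned long long total = 0;
	int cpu;

	demo_sum_calls++;
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		total += t->percpu_bytes[cpu];
	return total;
}

/* Total bytes for one file: filter first, aggregate only on a match. */
static unsigned long long demo_filtered_total(const struct demo_pcpu_tag *tags,
					      size_t n, const char *filename)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(tags[i].filename, filename) != 0)
			continue;	/* rejected before any per-CPU work */
		total += demo_sum_bytes(&tags[i]);
	}
	return total;
}
```

For three records of which two match, demo_sum_bytes() runs twice instead
of three times; with thousands of tags and hundreds of CPUs the skipped
work is what the measurements below reflect.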

Beyond fast filtered retrieval, the IOCTL foundation allows introducing
a context capture mechanism in the future to capture the context for
specific allocations.

Performance measurements were conducted on an Intel Xeon Platinum 8481C
(224 CPUs) with caches dropped before each run.

The IOCTL mechanism shows a ~20x performance improvement for
filtered queries. The kernel avoids the expensive per-CPU counter
aggregation (alloc_tag_read) for any tags that fail the initial string
or location filters.

Scenario 1: Specific File Filtering (arch/x86/events/rapl.c)
1. Traditional (cat /proc/allocinfo | grep): 22ms (sys)
2. IOCTL Interface: 1ms (sys)

Scenario 2: Compound Filtering (Filename + Size)
1. Traditional (cat ... | grep | awk): 21ms (sys)
2. IOCTL Interface: 1ms (sys)

Scenario 3: Size-Based Filtering (min_size = 1MB)
1. Traditional (cat ... | awk): 21ms (sys)
2. IOCTL Interface: 14ms (sys)

v5 changes:
- Patch 1/6: Added explicit mutex_destroy.
- Patch 5/6: Self-contained file descriptors to avoid wrap-around errors
  in retry loops.
- Patch 6/6: Fixed minor issues raised by sashiko in v4.

v4 changes:
- Patch 1/6: Fixed a copyright comment inside
  include/uapi/linux/alloc_tag.h
- Patch 3/6: Among other nits, fixed the inadvertent build failure
  introduced in v3.
- Patch 4/6: Included a comment stating that the accurate field in
  struct allocinfo_tag is only used for filtering.
- Patch 5/6: Modified test to trim prefix and keep suffix for entries
  with filenames exceeding the size limit.
- Patch 6/6: Modified test_size_filter such that if content_id changes
  between the moment when procfs and ioctl entries are read, both
  entries are invalidated and re-fetched. Removed the tags->count == 0
  check from test_lineno_filter as it's virtually unreachable.
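
The invalidate-and-re-fetch idea in the Patch 6/6 note can be sketched as
a retry loop around two content ID reads. Everything here is
hypothetical: demo_source abstracts the ioctl calls behind callbacks so
the loop can be exercised without /proc/allocinfo, and demo_fake
simulates a module load racing with the first few reads. It is not the
selftest's actual code:

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for the ioctl-based steps. */
struct demo_source {
	unsigned long long (*content_id)(void *ctx);
	int (*read_all)(void *ctx);	/* 0 on success, negative on error */
	void *ctx;
};

/*
 * Read a snapshot and accept it only if the content ID is the same before
 * and after the read. Returns 0 on a consistent snapshot, the read error if
 * read_all() fails, or -1 once max_tries attempts have been used up.
 */
static int demo_read_consistent(const struct demo_source *src, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		unsigned long long before = src->content_id(src->ctx);
		int ret = src->read_all(src->ctx);

		if (ret < 0)
			return ret;
		if (src->content_id(src->ctx) == before)
			return 0;
		/* ID changed: discard this snapshot and try again. */
	}
	return -1;
}

/* Fake backend: the ID changes during the first `churn` reads. */
struct demo_fake {
	unsigned long long id;
	int churn;	/* reads remaining that race with a module load */
	int reads;	/* total read_all() calls made */
};

static unsigned long long demo_fake_id(void *ctx)
{
	return ((struct demo_fake *)ctx)->id;
}

static int demo_fake_read(void *ctx)
{
	struct demo_fake *f = ctx;

	f->reads++;
	if (f->churn > 0) {
		f->churn--;
		f->id++;	/* simulated module load / unload */
	}
	return 0;
}
```

Bounding the number of attempts keeps a caller from spinning forever on a
system where modules are loaded and unloaded continuously.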

v3 changes:
- Patch 1/6: Modified Documentation to indicate that map supports
  ioctl(). Modified struct allocinfo_counter to use
  __attribute__((aligned(8))) instead of manual padding. Removed
  redundant type-casting. Added comments for static functions in
  lib/alloc_tag.c. Introduced a new seq counter for content_id that gets
  bumped every time a module is loaded / unloaded. Introduced logic to
  validate that the user-specified position is less than the number of
  allocation tags and return early otherwise. Changed strscpy to
  strscpy_pad to not echo arbitrary user data back to the user.
- Patch 2/6: Handled the case where user wants to specifically filter
  for built-in modules. Included some comments for static functions.
- Patch 3/6: Modified logic to only fetch per-CPU counters for codetags
  that satisfy other filters. Included some comments for static
  functions.

v2 changes:
- Patch 1/6: Introduced locking for m->private. Also included the new
  uapi header file in the MAINTAINERS list.
- Patch 2/6: Handled the case where ALLOCINFO_FILTER_MASK_MODNAME is
  passed but ct->modname is NULL.
- Patch 3/6: Moved min_size and max_size outside of struct allocinfo_tag
  into struct allocinfo_filter. Added validation that min_size <=
  max_size. Prefetched alloc_tag_counters if size based filter masks are
  provided to avoid aggregating per-CPU counters twice.
- Patch 5/6: Removed the hardcoded logic to skip the header, instead the
  test will skip lines that don't match the format. Also included the
  newly added alloc_tag selftests directory in the MAINTAINERS list.

Abhishek Bapat (5):
  alloc_tag: add ioctl filters to /proc/allocinfo
  alloc_tag: add size-based filtering to ioctl
  alloc_tag: add accuracy based filtering to ioctl
  kselftest: alloc_tag: add kselftest for ioctl interface
  kselftest: alloc_tag: extend the allocinfo ioctl kselftest

Suren Baghdasaryan (1):
  alloc_tag: add ioctl to /proc/allocinfo

 Documentation/mm/allocation-profiling.rst     |   5 +
 .../userspace-api/ioctl/ioctl-number.rst      |   2 +
 MAINTAINERS                                   |   2 +
 include/linux/codetag.h                       |   2 +
 include/uapi/linux/alloc_tag.h                |  94 ++++
 lib/alloc_tag.c                               | 344 +++++++++++-
 lib/codetag.c                                 |  18 +
 tools/testing/selftests/alloc_tag/Makefile    |   9 +
 .../alloc_tag/allocinfo_ioctl_test.c          | 526 ++++++++++++++++++
 9 files changed, 1000 insertions(+), 2 deletions(-)
 create mode 100644 include/uapi/linux/alloc_tag.h
 create mode 100644 tools/testing/selftests/alloc_tag/Makefile
 create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c

-- 
2.54.0.1136.gdb2ca164c4-goog



* Re: [PATCH net-next v2 3/3] docs: net: fix minor issues with strparser docs
From: Sabrina Dubroca @ 2026-06-15 22:51 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
	linux-doc, john.fastabend, jiri, skhan
In-Reply-To: <20260613165846.2913092-4-kuba@kernel.org>

2026-06-13, 09:58:46 -0700, Jakub Kicinski wrote:
> Not sure if anyone would read this doc, but the API has evolved
> since it was written. Update to:
> - show the int return type for strp_init()
> - refer to strp_data_ready(), not the old strp_tcp_data_ready() name
> - direct users to strp_msg(skb) for strparser metadata instead of
>   treating skb->cb as struct strp_msg directly
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: corbet@lwn.net
> CC: skhan@linuxfoundation.org
> CC: linux-doc@vger.kernel.org
> ---
>  Documentation/networking/strparser.rst | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)

Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>


> @@ -123,9 +123,9 @@ Callbacks
>      should parse the sk_buff as containing the headers for the
>      next application layer message in the stream.

Last time I had a look at strp, I was thinking about removing half of
the callbacks, since they're never used. Not a big simplification, but
it would remove a few indirections when reading the code.

-- 
Sabrina


* Re: [PATCH net-next v2 0/3] docs: net: more adjustments to docs
From: patchwork-bot+netdevbpf @ 2026-06-15 22:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
	linux-doc, john.fastabend, sd, jiri
In-Reply-To: <20260613165846.2913092-1-kuba@kernel.org>

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 13 Jun 2026 09:58:43 -0700 you wrote:
> A few small updates to the docs.
> This is trying to prepare docs for getting fed directly
> into AI reviews.
> 
> v2:
>  - fixes in the tls offload patch
>  - add the strparser patch in place of the already applied XDP md one
> v1: https://lore.kernel.org/20260609201224.1191391-1-kuba@kernel.org
> 
> [...]

Here is the summary with links:
  - [net-next,v2,1/3] docs: net: tls-offload: document tls_dev_del, tls_dev_resync, and rekey
    https://git.kernel.org/netdev/net-next/c/c8ee634048dd
  - [net-next,v2,2/3] docs: net: fix minor issues with devlink docs
    https://git.kernel.org/netdev/net-next/c/e504cf18ef47
  - [net-next,v2,3/3] docs: net: fix minor issues with strparser docs
    https://git.kernel.org/netdev/net-next/c/d9e4dd3c6f1f

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v3] Documentation/process: Add Researcher Guidelines
From: Kees Cook @ 2026-06-15 22:48 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Dan Carpenter, Jonathan Corbet, Greg Kroah-Hartman,
	Stefano Zacchiroli, Steven Rostedt, Laura Abbott, Julia Lawall,
	Wenwen Wang, Gustavo A . R . Silva, Thorsten Leemhuis,
	linux-kernel, linux-doc, linux-hardening, Dawei Feng
In-Reply-To: <ajACprp9UJp2JSJM@black.igk.intel.com>

On Mon, Jun 15, 2026 at 03:48:22PM +0200, Andy Shevchenko wrote:
> On Thu, May 28, 2026 at 01:34:34PM +0300, Dan Carpenter wrote:
> > On Fri, Mar 04, 2022 at 10:14:18AM -0800, Kees Cook wrote:
> 
> ...
> 
> > > +  x86_64 and arm64 defconfig builds with CONFIG_FOO_BAR=y using GCC
> > > +  11.2 show no new warnings, and LeakMagic no longer warns about this
> > > +  code path. As we don't have a FooBar device to test with, no runtime
> > > +  testing was able to be performed.
> > 
> > People have started sending commit messages in this exact template and
> > normally I would ask them resend with the meta commentary from this
> > paragraph below the --- cut off line.
> > 
> > Do we really want this "Compile tested only" stuff in the permanent git
> > log?
> 
> +1 here, can we rather avoid flooding commit messages with the meta, that
> anyways is available in lore.kernel.org archives?

Hm, I have gotten a lot of push-back from maintainers (reasonably)
wanting to know the specific level of testing patches get. In the case
of lacking hardware, this seems like useful information still.

-- 
Kees Cook


* Re: [PATCH v2] docs/mm: describe current criteria for enabling split page table lock for PTE tables
From: Ethan Nelson-Moore @ 2026-06-15 22:40 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
	Andrew Morton, David Hildenbrand, Jonathan Corbet
In-Reply-To: <ajB-ebLDYB-pX-4h@casper.infradead.org>

Hi, Matthew,

On Mon, Jun 15, 2026 at 3:36 PM Matthew Wilcox <willy@infradead.org> wrote:
> is enabled _at_ compile time?

Yes, I think that would be better. This section of the documentation
has a lot of grammar errors, though, and I didn't want to spend my
time fixing them at the moment.

Ethan


* Re: [PATCH v2] docs/mm: describe current criteria for enabling split page table lock for PTE tables
From: Matthew Wilcox @ 2026-06-15 22:36 UTC (permalink / raw)
  To: Ethan Nelson-Moore
  Cc: Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
	Andrew Morton, David Hildenbrand, Jonathan Corbet
In-Reply-To: <20260615222829.17420-1-enelsonmoore@gmail.com>

On Mon, Jun 15, 2026 at 03:28:29PM -0700, Ethan Nelson-Moore wrote:
> -Split page table lock for PTE tables is enabled compile-time if
> -CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less or equal to NR_CPUS.
> +Split page table lock for PTE tables is enabled compile-time (via

is enabled _at_ compile time?

> +CONFIG_SPLIT_PTE_PTLOCKS) if NR_CPUS is greater than or equal to 4 and an
> +MMU is being used. However, it is not supported on some architectures.
>  If split lock is disabled, all tables are guarded by mm->page_table_lock.
>  
>  Split page table lock for PMD tables is enabled, if it's enabled for PTE
> -- 
> 2.43.0
> 
> 


* [PATCH v2] docs/mm: describe current criteria for enabling split page table lock for PTE tables
From: Ethan Nelson-Moore @ 2026-06-15 22:28 UTC (permalink / raw)
  To: Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc
  Cc: Ethan Nelson-Moore, Andrew Morton, David Hildenbrand,
	Jonathan Corbet

The mm documentation regarding split page table lock for PTE tables
refers to the CONFIG_SPLIT_PTLOCK_CPUS config option, which was
superseded by CONFIG_SPLIT_PTE_PTLOCKS in commit 394290cba966 ("mm:
turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PTE_PTLOCKS into Kconfig
options"). Update the documentation to refer to the current option and
document that it is not supported on some architectures.

Discovered while searching for CONFIG_* symbols referenced in the
kernel but not defined in any Kconfig file.

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
---
Changes in v2: Remove list of unsupported architectures for conciseness

 Documentation/mm/split_page_table_lock.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/mm/split_page_table_lock.rst b/Documentation/mm/split_page_table_lock.rst
index cc3cd46abd1b..3a35988f3c16 100644
--- a/Documentation/mm/split_page_table_lock.rst
+++ b/Documentation/mm/split_page_table_lock.rst
@@ -37,8 +37,9 @@ There are helpers to lock/unlock a table and other accessor functions:
  - pmd_lockptr()
 	returns pointer to PMD table lock;
 
-Split page table lock for PTE tables is enabled compile-time if
-CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less or equal to NR_CPUS.
+Split page table lock for PTE tables is enabled compile-time (via
+CONFIG_SPLIT_PTE_PTLOCKS) if NR_CPUS is greater than or equal to 4 and an
+MMU is being used. However, it is not supported on some architectures.
 If split lock is disabled, all tables are guarded by mm->page_table_lock.
 
 Split page table lock for PMD tables is enabled, if it's enabled for PTE
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH net-next v2 1/3] docs: net: tls-offload: document tls_dev_del, tls_dev_resync, and rekey
From: Sabrina Dubroca @ 2026-06-15 22:24 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
	linux-doc, john.fastabend, jiri, skhan
In-Reply-To: <20260613165846.2913092-2-kuba@kernel.org>

2026-06-13, 09:58:44 -0700, Jakub Kicinski wrote:
> Fill in some gaps in the TLS offload doc:
> 
> - describe the tls_dev_del and tls_dev_resync callbacks
> - add a mention of rekeying being out of scope for now
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> v2:
>  - add mentions of the callback in resync text

Thanks!

>  - Stack -> The stack
> v1: https://lore.kernel.org/20260609201224.1191391-1-kuba@kernel.org

Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>

-- 
Sabrina

^ permalink raw reply

* Re: [PATCH v6 01/12] PCI: liveupdate: Set up FLB handler for the PCI core
From: David Matlack @ 2026-06-15 22:22 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: kexec, linux-doc, linux-kernel, linux-mm, linux-pci,
	Adithya Jayachandran, Alexander Graf, Alex Williamson,
	Bjorn Helgaas, Chris Li, David Rientjes, Jacob Pan,
	Jason Gunthorpe, Jonathan Corbet, Josh Hilke, Leon Romanovsky,
	Lukas Wunner, Mike Rapoport, Parav Pandit, Pranjal Shrivastava,
	Pratyush Yadav, Saeed Mahameed, Samiullah Khawaja, Shuah Khan,
	Vipin Sharma, William Tu, Yi Liu
In-Reply-To: <178124130274.908199.14827357870284807134.b4-review@b4>

On 2026-06-12 05:15 AM, Pasha Tatashin wrote:
> On Fri, 22 May 2026 20:23:59 +0000, David Matlack <dmatlack@google.com> wrote:

> > + * PCI device preservation across Live Update is built on top of the Live Update
> > + * Orchestrator's (LUO) support for file preservation across kexec. Drivers
> 
> I prefer to just use the acronyms FLB and LUO, but with links to the
> actual documentation for them.
> 
> So, something like this:
> 
>   * :ref:`FLB <flb>` Data
>   * =====================
>   *
>   * PCI device preservation across Live Update is built on top of the
>   * :ref:`LUO <luo>` support for file preservation across kexec. Drivers
> 
> And also add _luo and _flb to Documentation/core-api/liveupdate.rst
> 
> .. _luo:
> 
>  ========================
>  Live Update Orchestrator
>  ========================
> 
> .. _flb:

Will do.

I guess I will need to add another patch to add the link references to
liveupdate.rst?

> 
>  LUO File Lifecycle Bound Global Data
>  ====================================
> 
> > [ ... skip 17 lines ... ]
> > + *
> > + *  * ``pci_liveupdate_register_flb(driver_file_handler)``
> > + *  * ``pci_liveupdate_unregister_flb(driver_file_handler)``
> > + */
> > +
> > +#define pr_fmt(fmt) "PCI: liveupdate: " fmt
> 
> Nit, may be:

Did you have a suggestion here that got lost?

> 
> > +
> > +#include <linux/io.h>
> > +#include <linux/kexec_handover.h>
> > +#include <linux/kho/abi/pci.h>
> > +#include <linux/liveupdate.h>
> > +#include <linux/mutex.h>
> > +#include <linux/mm.h>
> 
> Please sort alphabetically.

Will do.

> 
> > [ ... skip 12 lines ... ]
> > +	 * future to increase the chances that there is enough room to preserve
> > +	 * devices that are not yet present on the system (e.g. VFs, hot-plugged
> > +	 * devices).
> > +	 */
> > +	for_each_pci_dev(dev)
> > +		max_nr_devices++;
> 
> I think we want to use kho_block [1] (it is in the liveupdate/next branch)
> to allow the number of supported devices to be dynamic.
> 
> To support this, we would redefine the ABI and tracking structures like 
> so:
> 
> /* include/linux/kho/abi/pci.h */
> struct pci_ser {
> 	u64 devices;      /* Phys address of the first block header of kho_block_set */
> 	u64 nr_devices;   /* Total count of active preserved devices */
> } __packed;
> 
> /* drivers/pci/liveupdate.c */
> struct pci_flb_outgoing {
> 	struct pci_ser *ser;            /* Points to the FDT/KHO-allocated ABI struct */
> 	struct kho_block_set block_set;  /* Controls the active blocks on the fly */
> };
> 
> In __pci_liveupdate_preserve_device(), we would search for
> and reuse any inactive pci_dev_ser slot first, and only call
> kho_block_set_grow() to expand if no inactive slots are available.
> 
> In pci_liveupdate_unpreserve_device(), we would simply
> mark the pci_dev_ser as inactive.

Makes sense at a high level. I'll work on switching to kho_block for v7 and
get back to you if I hit any issues.
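
For illustration only, the slot-reuse scheme described above could be
sketched in plain userspace C roughly as follows. This is a sketch under
stated assumptions, not the proposed kernel code: struct slot,
struct slot_set and the slot_*() helpers are hypothetical stand-ins for
pci_dev_ser and kho_block_set, whose real layout and API are not shown in
this thread, and realloc() merely stands in for whatever
kho_block_set_grow() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one preserved-device record. */
struct slot {
	uint16_t bdf;	/* illustrative device identifier */
	bool active;	/* false means the slot may be reused */
};

/* Hypothetical growable array standing in for a kho_block_set. */
struct slot_set {
	struct slot *slots;
	size_t capacity;
	size_t nr_active;
};

/* Double the capacity (starting at 4); new slots start out inactive. */
int slot_set_grow(struct slot_set *set)
{
	size_t new_cap = set->capacity ? set->capacity * 2 : 4;
	struct slot *p = realloc(set->slots, new_cap * sizeof(*p));

	if (!p)
		return -1;
	for (size_t i = set->capacity; i < new_cap; i++) {
		p[i].bdf = 0;
		p[i].active = false;
	}
	set->slots = p;
	set->capacity = new_cap;
	return 0;
}

/*
 * Reuse the first inactive slot; grow only if none is free.
 * Returns the slot index, or -1 if growing failed.
 */
long slot_preserve(struct slot_set *set, uint16_t bdf)
{
	size_t i;

	assert(set);
	for (i = 0; i < set->capacity; i++)
		if (!set->slots[i].active)
			break;
	if (i == set->capacity && slot_set_grow(set))
		return -1;
	/* After a successful grow, index i is the first newly added slot. */
	set->slots[i].bdf = bdf;
	set->slots[i].active = true;
	set->nr_active++;
	return (long)i;
}

/* Unpreserving only marks the slot inactive; nothing is freed or moved. */
void slot_unpreserve(struct slot_set *set, size_t idx)
{
	assert(set && idx < set->capacity);
	if (set->slots[idx].active) {
		set->slots[idx].active = false;
		set->nr_active--;
	}
}
```

The sketch returns indices rather than pointers because growing may move
the array. Whether the real block set has the same property is an open
question here, not something this thread establishes.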

> 
> >
> > diff --git a/include/linux/pci_liveupdate.h b/include/linux/pci_liveupdate.h
> > new file mode 100644
> > index 000000000000..8ec98beefcb4
> > --- /dev/null
> > +++ b/include/linux/pci_liveupdate.h
> > @@ -0,0 +1,30 @@
> > [ ... skip 24 lines ... ]
> > +static inline void pci_liveupdate_unregister_flb(struct liveupdate_file_handler *fh)
> > +{
> > +}
> > +#endif
> > +
> > +#endif /* LINUX_PCI_LIVEUPDATE_H */
> 
> [1] https://lore.kernel.org/all/20260603154402.468928-1-pasha.tatashin@soleen.com/
> 
> Preserving: In  __pci_liveupdate_preserve_device() , we would search for 
> Unpreserving: In  pci_liveupdate_unpreserve_device(), we would simply
> 
> -- 
> Pasha Tatashin <pasha.tatashin@soleen.com>

^ permalink raw reply

* Re: [PATCH v6 01/12] PCI: liveupdate: Set up FLB handler for the PCI core
From: David Matlack @ 2026-06-15 22:19 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: Mike Rapoport, kexec, linux-doc, linux-kernel, linux-mm,
	linux-pci, Adithya Jayachandran, Alexander Graf, Alex Williamson,
	Bjorn Helgaas, Chris Li, David Rientjes, Jacob Pan,
	Jason Gunthorpe, Jonathan Corbet, Josh Hilke, Leon Romanovsky,
	Lukas Wunner, Parav Pandit, Pranjal Shrivastava, Pratyush Yadav,
	Saeed Mahameed, Samiullah Khawaja, Shuah Khan, Vipin Sharma,
	William Tu, Yi Liu
In-Reply-To: <178126123047.908199.12042730612990730743.b4-reply@b4>

On 2026-06-12 10:47 AM, Pasha Tatashin wrote:
> On 2026-06-12 09:54:44+03:00, Mike Rapoport wrote:
> > On Fri, Jun 12, 2026 at 05:15:02AM +0000, Pasha Tatashin wrote:
> > 
> > > On Fri, 22 May 2026 20:23:59 +0000, David Matlack <dmatlack@google.com> wrote:
> > > 
> > > Please add Pratyush, Mike, and myself so we are notified directly of 
> > > incoming patches, the same as with other areas where the liveupdate/ 
> > > tree is specified.
> > 
> > Or we can add PCI liveupdate files to LIVEUPDATE entry.
> 
> That will not work, as we cannot serve as maintainers for 
> PCI/VFIO/IOMMU/KVM, etc. David Matlack will be the maintainer for the 
> PCI components, and we will accept patches once they have been approved 
> by him.
> 
> The simplification we could do is to create an email alias 
> for the live-update tree maintainers. This would allow us to use a 
> single entry instead of listing all three of us individually.

We could create a Live Update mailing list to be CCed on all patches that
must be merged through the Live Update tree. I would also be interested in
subscribing to that list.

But if you want to create a LUO maintainers-only list that would work
fine too.

For now, I'll plan to add you as Reviewers under PCI LIVE UPDATE in v7.

^ permalink raw reply

* Re: [PATCH v4 06/31] dt-bindings: firmware: arm,scmi: Add support for telemetry protocol
From: Rob Herring (Arm) @ 2026-06-15 22:14 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: linux-doc, Conor Dooley, puranjay, usama.arif, philip.radford,
	devicetree, souvik.chakravarty, linux-kernel, jic23, elif.topuz,
	lukasz.luba, sudeep.holla, leitao, vincent.guittot, james.quinlan,
	kernel-team, linux-arm-kernel, kas, arm-scmi, peng.fan,
	linux-fsdevel, michal.simek, brauner, etienne.carriere, d-gole,
	Krzysztof Kozlowski, f.fainelli
In-Reply-To: <20260612223802.1337232-7-cristian.marussi@arm.com>


On Fri, 12 Jun 2026 23:37:36 +0100, Cristian Marussi wrote:
> Add new SCMI v4.0 Telemetry protocol bindings definitions.
> 
> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> ---
> v3 --> v4
>  - changed protocol number to lowercase 1b
>  - fixed misplaced block for protocol 0x1b
> 
> Cc: Rob Herring <robh@kernel.org>
> Cc: Krzysztof Kozlowski <krzk+dt@kernel.org>
> Cc: Conor Dooley <conor+dt@kernel.org>
> Cc: devicetree@vger.kernel.org
> ---
>  Documentation/devicetree/bindings/firmware/arm,scmi.yaml | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 

Acked-by: Rob Herring (Arm) <robh@kernel.org>


^ permalink raw reply

* [GIT PULL] Documentation for 7.2
From: Jonathan Corbet @ 2026-06-15 22:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Shuah Khan, linux-doc, linux-kernel

The following changes since commit f2e65e4e5b4b4b9ecf43f03c3fdbe8c9a8a43a9e:

  docs: threat-model: don't limit root capabilities to CAP_SYS_ADMIN (2026-05-14 06:23:44 -0600)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux.git tags/docs-7.2

for you to fetch changes up to fa34b01aa0f59355206b0807f862cced06c2b7a1:

  docs: pt_BR: Translate 3.Early-stage.rst into Portuguese (2026-06-12 13:34:26 -0600)

----------------------------------------------------------------
Things have calmed down a bit on the docs front, with no earthshaking
changes this time around:

- Ongoing work on the Japanese and Portuguese translations.

- Better integration of the MAINTAINERS file into the rendered documents,
  including a search interface.

- A seemingly infinite supply of fixes for typos, minor grammatical issues,
  and related problems that LLMs find with abandon.

Expect a minor conflict with the kbuild tree on the Portuguese version
of changes.rst.
----------------------------------------------------------------
Akiyoshi Kurita (3):
      docs/ja_JP: translate more of submitting-patches.rst
      docs/ja_JP: translate more of submitting-patches.rst (no-mime)
      docs/ja_JP: translate submitting-patches.rst (interleaved-replies)

Amanda Corrêa (2):
      docs: pt_BR: update maintainer-handbooks
      docs: pt_BR: update "Purpose of Defconfigs" section in maintainer-soc.rst

Baolin Liu (1):
      Documentation: proc: fix section numbering in table of contents

Baruch Siach (1):
      docs: threat-model: add missing closing parenthesis

Brigham Campbell (1):
      docs: Fix minor grammatical error

Chao Gao (1):
      Documentation: core-api/cpu_hotplug: Remove stale cpu0_hotplug docs

Charlie Jenkins (1):
      Documentation/binfmt-misc.rst: Specify aux vector for "O" flag description

Chen-Shi-Hong (1):
      docs: reporting-issues: replace "these advices" with "all of this advice"

Cheng-Han Wu (4):
      docs: admin-guide: fix typos in workload tracing guide
      docs: admin-guide: fix stress-ng command examples
      docs: admin-guide: clarify perf bench all behavior
      docs: admin-guide: add IGNORE_DIRS example for cscope

Clinton Phillips (1):
      docs: sphinx-static: fix typo "wich" -> "which"

Costa Shulyupin (9):
      docs: Remove stale ISDN parameters
      docs: locking: Fix stale dquot.c path
      docs: housekeeping: Fix struct member access in code example
      docs: sysctl/net: Remove ax25, netrom, rose entries
      docs: kernel-parameters: Remove sa1100ir IrDA parameter
      docs: sonypi: Fix stale header file path
      docs: kgdb: Fix stale source file paths
      docs: real-time: Fix duplicated sched(7) text
      docs: kernel-parameters: Fix stale sticore file paths

Daniel Pereira (6):
      docs/pt_BR: process: link maintainer-kvm-x86 in maintainer-handbooks
      docs: pt_BR: translate process/license-rules.rst
      docs: pt_BR: Translate process/kernel-docs.rst into Portuguese
      docs: pt_BR: update minimal software requirements in changes.rst
      docs: pt_BR: add translation for kernel development process guides
      docs: pt_BR: Translate 3.Early-stage.rst into Portuguese

Ethan Nelson-Moore (1):
      docs/{it_it,sp_SP,zh_CN,zh_TW}: update references to removed CONFIG_DEBUG_SLAB

Gabriele Monaco (1):
      Documentation/rv: Replace stale website link

Hyeonjin Kim (1):
      docs: fix typos in kernel documentation

Ilai Levin (1):
      docs: driver-api: eisa: add SPDX license identifier

Jim Cromie (2):
      docs/dyndbg: update examples \012 to \n
      docs/dyndbg: explain flags parse 1st

Jinjie Ruan (1):
      docs: Update nosmt support for arm64

Jonathan Corbet (3):
      Merge branch 'mauro' into docs-mw
      Merge branch 'docs-fixes' into docs-mw
      Merge branch 'mauro' into docs-mw

Krzysztof Kozlowski (1):
      docs: submitting-patches: Clarify that "reviewer" is a person

Manuel Ebner (6):
      Documentation: adopt new coding style of type-aware kmalloc-family
      Documentation: RCU: adopt new coding style of type-aware kmalloc-family
      Documentation: deprecated.rst: kmalloc-family: mark argument as optional
      Documentation: arch: fix brackets
      Documentation: process: fix brackets
      Documentation: bug-hunting.rst: fix grammar

Mauro Carvalho Chehab (25):
      docs: maintainers: add SPDX license to the file
      docs: maintainers_include: auto-generate maintainer profile TOC
      docs: auto-generate maintainer entry profile links
      docs: maintainers_include: use a better title for profiles
      docs: maintainers_include: add external profile URLs
      docs: maintainers_include: preserve names for files under process/
      docs: maintainers_include: Only show main entry for profiles
      docs: maintainers_include: improve its output
      docs: maintainers_include: fix support for O=dir
      docs: maintainers_include: parse MAINTAINERS just once
      docs: maintainers_include: keep hidden TOC sorted
      docs: maintainers_include: split state machine on multiple funcs
      docs: maintainers_include: cleanup the code
      docs: maintainers_include: clean most SPHINXDIRS=process warnings
      docs: maintainers_include: do some coding style cleanups
      docs: maintainers_include: store maintainers entries on a dict
      docs: maintainers_include: properly handle file patterns
      docs: maintainers_include: add a filtering javascript
      docs: maintainers_include: don't ignore invalid profile entries
      docs: maintainers_include: better handle directories
      docs: maintainers_include: better handle doc wildcards
      MAINTAINERS: make clearer about what's expected for "P" field
      MAINTAINERS: use a URL for pin-init maintainer's profile entry
      docs: maintainers_include: restore compatibility with Python 3.6
      docs: maintainers_include: keep the last entry at the end

Mayank Gite (1):
      Documentation: fix typo and formattting in security/credentials.rst

Miguel Martín Gil (1):
      docs: md: fix grammar in speed_limit description

Miles Krause (1):
      Documentation/scheduler: Fix duplicated word in sched-deadline

Myro Demma (1):
      docs: proc: fix minor grammar and formatting issues

Nicolas Pitre (2):
      Documentation: filesystems: cramfs: correct stale hard-link and endianness claims
      cramfs: drop obsolete Future Development notes and update tools URL

Ninad Naik (1):
      Documentation: fix spelling mistake "stucture" -> "structure"

Randy Dunlap (8):
      docs: xforms_lists: allow __maybe_unused in func parameters
      docs: watchdog: mlx-wdt: small fixes
      docs: watchdog: pcwd: fix typo and driver info.
      docs: watchdog-api: general cleaning
      docs: watchdog-kernel-api: general cleanups
      docs: watchdog-parameters: add missing watchdog_core parameters
      kdoc: xforms: move context attrs to function_xforms list
      iommu: Documentation: rearrange, update kernel-parameters

Sakurai Shun (2):
      docs: fix typo in leds-lp55xx.rst
      docs: fix typo in user_mode_linux_howto_v2.rst

Sebastian Andrzej Siewior (1):
      Documentation/kernel-parameters: Remove "Deprecated" from isolcpus=

Shuicheng Lin (1):
      scripts/kernel-doc: Detect mismatched inline member documentation tags

Tudor Ambarus (1):
      docs: kernel-doc: python: strip __counted_by_ptr macro

Uwe Kleine-König (1):
      Documentation: Fix syntax of kmalloc_objs example in coding style doc

Zhan Xusheng (1):
      docs: changes.rst: restore pahole 1.26 minimum (regressed by sort)

Zhang Xiaolei (1):
      docs: staging: fix various typos and grammar issues

 .../RCU/Design/Requirements/Requirements.rst       |   6 +-
 Documentation/RCU/listRCU.rst                      |   2 +-
 Documentation/RCU/whatisRCU.rst                    |   4 +-
 Documentation/admin-guide/binfmt-misc.rst          |   8 +-
 Documentation/admin-guide/bug-hunting.rst          |   4 +-
 Documentation/admin-guide/dynamic-debug-howto.rst  |  35 +-
 Documentation/admin-guide/kernel-parameters.txt    |  72 +--
 Documentation/admin-guide/laptops/sonypi.rst       |   2 +-
 Documentation/admin-guide/md.rst                   |   2 +-
 .../admin-guide/quickly-build-trimmed-linux.rst    |   2 +-
 Documentation/admin-guide/reporting-issues.rst     |   4 +-
 Documentation/admin-guide/sysctl/net.rst           |  11 +-
 Documentation/admin-guide/workload-tracing.rst     |  41 +-
 Documentation/arch/arc/arc.rst                     |   2 +-
 .../arch/arm/samsung/clksrc-change-registers.awk   |   2 +-
 Documentation/arch/arm/vlocks.rst                  |   4 +-
 .../arch/arm64/memory-tagging-extension.rst        |   2 +-
 Documentation/arch/powerpc/vas-api.rst             |   2 +-
 Documentation/arch/sparc/oradax/dax-hv-api.txt     |  18 +-
 Documentation/arch/sparc/oradax/oracle-dax.rst     |   2 +-
 Documentation/arch/x86/x86_64/fsgs.rst             |   4 +-
 Documentation/block/data-integrity.rst             |   2 +-
 Documentation/core-api/cpu_hotplug.rst             |   5 -
 Documentation/core-api/housekeeping.rst            |   2 +-
 Documentation/core-api/kref.rst                    |   4 +-
 Documentation/core-api/list.rst                    |   6 +-
 Documentation/core-api/real-time/theory.rst        |   2 +-
 Documentation/driver-api/eisa.rst                  |   2 +
 Documentation/driver-api/mailbox.rst               |   4 +-
 Documentation/driver-api/media/v4l2-fh.rst         |   2 +-
 Documentation/filesystems/cramfs.rst               |  22 +-
 Documentation/filesystems/locking.rst              |   2 +-
 Documentation/filesystems/proc.rst                 |  18 +-
 Documentation/gpu/drm-uapi.rst                     |   2 +-
 Documentation/kernel-hacking/locking.rst           |   4 +-
 Documentation/leds/leds-lp55xx.rst                 |   2 +-
 Documentation/locking/locktypes.rst                |   4 +-
 .../maintainer/maintainer-entry-profile.rst        |  26 +-
 Documentation/process/changes.rst                  |   7 +-
 Documentation/process/coding-style.rst             |   8 +-
 Documentation/process/debugging/kgdb.rst           |   4 +-
 Documentation/process/deprecated.rst               |  15 +-
 Documentation/process/maintainer-handbooks.rst     |  17 +-
 Documentation/process/maintainer-soc.rst           |   2 +-
 Documentation/process/maintainers.rst              |   2 +
 Documentation/process/submitting-patches.rst       |  12 +-
 Documentation/process/threat-model.rst             |   2 +-
 Documentation/scheduler/sched-deadline.rst         |   2 +-
 Documentation/security/credentials.rst             |   6 +-
 .../sound/kernel-api/writing-an-alsa-driver.rst    |  12 +-
 Documentation/sphinx-static/custom.css             |   2 +-
 Documentation/sphinx/maintainers_include.py        | 516 ++++++++++++++------
 Documentation/spi/spi-summary.rst                  |   4 +-
 Documentation/staging/crc32.rst                    |   2 +-
 Documentation/staging/lzo.rst                      |   2 +-
 Documentation/staging/remoteproc.rst               |   2 +-
 Documentation/staging/static-keys.rst              |   2 +-
 Documentation/trace/histogram-design.rst           |   2 +-
 Documentation/trace/rv/monitor_sched.rst           |   7 +-
 .../translations/it_IT/kernel-hacking/locking.rst  |   4 +-
 .../translations/it_IT/locking/locktypes.rst       |   4 +-
 .../translations/it_IT/process/coding-style.rst    |   2 +-
 .../it_IT/process/submit-checklist.rst             |   2 +-
 .../ja_JP/process/submitting-patches.rst           | 237 +++++++++-
 Documentation/translations/pt_BR/index.rst         |   3 +
 .../translations/pt_BR/process/2.Process.rst       | 520 +++++++++++++++++++++
 .../translations/pt_BR/process/3.Early-stage.rst   | 233 +++++++++
 .../translations/pt_BR/process/changes.rst         |  56 +--
 .../pt_BR/process/development-process.rst          |  22 +
 .../translations/pt_BR/process/kernel-docs.rst     | 373 +++++++++++++++
 .../translations/pt_BR/process/license-rules.rst   | 483 +++++++++++++++++++
 .../pt_BR/process/maintainer-handbooks.rst         |  18 +-
 .../translations/pt_BR/process/maintainer-soc.rst  |  12 +-
 .../translations/sp_SP/process/coding-style.rst    |   2 +-
 .../sp_SP/process/submit-checklist.rst             |   2 +-
 Documentation/translations/zh_CN/core-api/kref.rst |   4 +-
 .../translations/zh_CN/process/coding-style.rst    |   2 +-
 .../zh_CN/process/submit-checklist.rst             |   2 +-
 .../zh_CN/video4linux/v4l2-framework.txt           |   2 +-
 .../translations/zh_TW/process/coding-style.rst    |   2 +-
 .../zh_TW/process/submit-checklist.rst             |   2 +-
 .../virt/uml/user_mode_linux_howto_v2.rst          |   2 +-
 Documentation/watchdog/mlx-wdt.rst                 |   4 +-
 Documentation/watchdog/pcwd-watchdog.rst           |  19 +-
 Documentation/watchdog/watchdog-api.rst            |  38 +-
 Documentation/watchdog/watchdog-kernel-api.rst     |  20 +-
 Documentation/watchdog/watchdog-parameters.rst     |  11 +-
 MAINTAINERS                                        |   8 +-
 fs/cramfs/README                                   |  92 +---
 tools/lib/python/kdoc/kdoc_parser.py               |  54 ++-
 tools/lib/python/kdoc/xforms_lists.py              |  22 +-
 tools/unittests/test_cmatch.py                     |   1 +
 92 files changed, 2665 insertions(+), 559 deletions(-)
 create mode 100644 Documentation/translations/pt_BR/process/2.Process.rst
 create mode 100644 Documentation/translations/pt_BR/process/3.Early-stage.rst
 create mode 100644 Documentation/translations/pt_BR/process/development-process.rst
 create mode 100644 Documentation/translations/pt_BR/process/kernel-docs.rst
 create mode 100644 Documentation/translations/pt_BR/process/license-rules.rst

^ permalink raw reply

* Re: [PATCH v18 net-next 00/11] nbl driver for Nebulamatrix NICs
From: Jakub Kicinski @ 2026-06-15 21:54 UTC (permalink / raw)
  To: illusion.wang
  Cc: dimon.zhao, alvin.wang, sam.chen, netdev, andrew+netdev, corbet,
	horms, linux-doc, pabeni, vadim.fedorenko, lukas.bulwahn,
	edumazet, enelsonmoore, skhan, hkallweit1, open list
In-Reply-To: <20260611044916.2383-1-illusion.wang@nebula-matrix.com>

On Thu, 11 Jun 2026 12:48:59 +0800 illusion.wang wrote:
> This patch series represents the first phase. We plan to integrate it in
> two phases: the first phase covers mailbox and chip configuration,
> while the second phase involves net dev configuration.
> Together, they will provide basic PF-based Ethernet port transmission and
> reception capabilities.

7.1 has been tagged and the merge window for 7.2 has started.
It's a bit late to be adding drivers, so please repost this
after 7.2-rc1 has been tagged and net-next re-opened:
https://netdev.bots.linux.dev/net-next.html
-- 
pw-bot: cr

^ permalink raw reply

