* [PATCH v4 1/3] mm/page_owner: add print_mode filter
2026-04-30 16:32 [PATCH v4 0/3] mm/page_owner: add filter infrastructure for print_mode and NUMA filtering Zhen Ni
@ 2026-04-30 16:32 ` Zhen Ni
2026-04-30 16:32 ` [PATCH v4 2/3] mm/page_owner: add NUMA node filter with nodelist support Zhen Ni
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Zhen Ni @ 2026-04-30 16:32 UTC
To: akpm, vbabka
Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
Zhen Ni
Add a print_mode filter to page_owner that allows users to choose between
printing full stack traces or only stack handles, significantly reducing
output size for debugging and analysis.
The filter provides a string-based interface under
/sys/kernel/debug/page_owner_filter/:
- Reading shows both modes, with square brackets around the active one
- Writing accepts "full_stack" or "stack_handle" strings
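A userspace tool that reads print_mode has to pick the bracketed token out of the line. The following is a minimal sketch of that step, assuming the read format shown in this message ("[full_stack] stack_handle"); active_mode() is a hypothetical helper name, not part of the patch or of any kernel API.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed token from a print_mode line such as
 * "[full_stack] stack_handle\n" into out.
 * Returns 0 on success, -1 if there is no well-formed "[...]" token
 * or it does not fit in out.
 * Hypothetical userspace helper, for illustration only.
 */
static int active_mode(const char *line, char *out, size_t outlen)
{
	const char *open = strchr(line, '[');
	const char *close = open ? strchr(open, ']') : NULL;
	size_t len;

	if (!close)
		return -1;

	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

Called on "full_stack [stack_handle]\n" with a 32-byte buffer, this should leave "stack_handle" in the buffer; a line without brackets is rejected.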
The default full_stack mode maintains backward compatibility with existing
usage, displaying complete stack traces for each page allocation.
The stack_handle mode dramatically reduces log size by showing only
the handle number instead of the full stack trace. The mapping from
handles to actual stack traces can be obtained via the
show_stacks_handles interface.
Example usage:
# echo stack_handle > /sys/kernel/debug/page_owner_filter/print_mode
# cat /sys/kernel/debug/page_owner_filter/print_mode
full_stack [stack_handle]
# cat /sys/kernel/debug/page_owner
Page allocated via order 0, migratetype Unmovable, gfp_mask 0x1100ca,
pid 1, tgid 1 (systemd), ts 123456789 ns
PFN 0x1000 type Unmovable Block 1 type Unmovable
Flags 0x3fffe800000084(referenced|lru|active|private|node=0|zone=1)
handle: 17432583
...
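In stack_handle mode each record ends with a "handle: N" line, so per-stack page counts can be tallied in userspace without the full traces. The sketch below illustrates that post-processing step, assuming the record layout shown in the example above; parse_handle_line() and count_handle() are hypothetical helper names, and the input is taken as an in-memory buffer rather than read from debugfs.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 and store the value if the line is "handle: <n>", else 0.
 * Leading spaces and tabs are skipped by hand so that a blank line
 * never matches the following line's handle.
 */
static int parse_handle_line(const char *line, int *handle)
{
	while (*line == ' ' || *line == '\t')
		line++;
	return sscanf(line, "handle: %d", handle) == 1;
}

/* Count the records in text whose handle line equals wanted. */
static unsigned long count_handle(const char *text, int wanted)
{
	unsigned long n = 0;
	int h;

	while (text && *text) {
		if (parse_handle_line(text, &h) && h == wanted)
			n++;
		/* Advance to the start of the next line, if any. */
		text = strchr(text, '\n');
		if (text)
			text++;
	}
	return n;
}
```

The handle values found this way can then be looked up in the show_stacks_handles output mentioned above to recover the corresponding stack traces.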
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
---
Changes in v4:
- Change from numeric (0/1) to string-based interface ("full_stack"/"stack_handle")
- Merge infrastructure patch into this patch
Changes in v3:
- No code changes
Changes in v2:
- Renamed from 'compact mode' to 'print_mode' for better clarity
- Use enum values (0=full_stack, 1=stack_handle) instead of boolean
- Update debugfs filename from 'compact' to 'print_mode'
---
mm/page_owner.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 91 insertions(+), 2 deletions(-)
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 8178e0be557f..28766c854d02 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/debugfs.h>
+#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
@@ -54,6 +55,24 @@ struct stack_print_ctx {
u8 flags;
};
+enum page_owner_print_mode {
+ PAGE_OWNER_PRINT_FULL_STACK,
+ PAGE_OWNER_PRINT_STACK_HANDLE,
+};
+
+static const char * const page_owner_print_mode_strings[] = {
+ [PAGE_OWNER_PRINT_FULL_STACK] = "full_stack",
+ [PAGE_OWNER_PRINT_STACK_HANDLE] = "stack_handle",
+};
+
+struct page_owner_filter {
+ enum page_owner_print_mode print_mode;
+};
+
+static struct page_owner_filter owner_filter = {
+ .print_mode = PAGE_OWNER_PRINT_FULL_STACK,
+};
+
static bool page_owner_enabled __initdata;
DEFINE_STATIC_KEY_FALSE(page_owner_inited);
@@ -575,7 +594,11 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
migratetype_names[pageblock_mt],
&page->flags);
- ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
+ if (READ_ONCE(owner_filter.print_mode) == PAGE_OWNER_PRINT_STACK_HANDLE) {
+ ret += scnprintf(kbuf + ret, count - ret,
+ "handle: %d\n", handle);
+ } else
+ ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
if (ret >= count)
goto err;
@@ -970,10 +993,71 @@ static int page_owner_threshold_set(void *data, u64 val)
DEFINE_SIMPLE_ATTRIBUTE(page_owner_threshold_fops, &page_owner_threshold_get,
&page_owner_threshold_set, "%llu");
+static ssize_t print_mode_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ const char *output;
+ int mode;
+
+ mode = READ_ONCE(owner_filter.print_mode);
+
+ if (mode == PAGE_OWNER_PRINT_FULL_STACK)
+ output = "[full_stack] stack_handle\n";
+ else
+ output = "full_stack [stack_handle]\n";
+
+ return simple_read_from_buffer(buf, count, ppos, output, strlen(output));
+}
+
+static ssize_t print_mode_write(struct file *file,
+ const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ char *kbuf;
+ int mode;
+ int ret = count;
+
+ /*
+ * Limit input size. Maximum valid input is "stack_handle" (12 chars)
+ * plus newline and null terminator. Use 32 bytes as a reasonable limit.
+ */
+ if (count > 32)
+ return -EINVAL;
+
+ kbuf = kmalloc(count + 1, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ if (strncpy_from_user(kbuf, buf, count) < 0) {
+ ret = -EFAULT;
+ goto out_free;
+ }
+ kbuf[count] = '\0';
+
+ mode = sysfs_match_string(page_owner_print_mode_strings, kbuf);
+ if (mode < 0) {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ WRITE_ONCE(owner_filter.print_mode, mode);
+
+out_free:
+ kfree(kbuf);
+ return ret;
+}
+
+static const struct file_operations page_owner_print_mode_fops = {
+ .owner = THIS_MODULE,
+ .read = print_mode_read,
+ .write = print_mode_write,
+ .llseek = default_llseek,
+};
+
static int __init pageowner_init(void)
{
- struct dentry *dir;
+ struct dentry *dir, *filter_dir;
if (!static_branch_unlikely(&page_owner_inited)) {
pr_info("page_owner is disabled\n");
@@ -981,6 +1065,11 @@ static int __init pageowner_init(void)
}
debugfs_create_file("page_owner", 0400, NULL, NULL, &page_owner_fops);
+
+ filter_dir = debugfs_create_dir("page_owner_filter", NULL);
+ debugfs_create_file("print_mode", 0600, filter_dir, NULL,
+ &page_owner_print_mode_fops);
+
dir = debugfs_create_dir("page_owner_stacks", NULL);
debugfs_create_file("show_stacks", 0400, dir,
(void *)(STACK_PRINT_FLAG_STACK |
--
2.20.1
* [PATCH v4 2/3] mm/page_owner: add NUMA node filter with nodelist support
From: Zhen Ni @ 2026-04-30 16:32 UTC
To: akpm, vbabka
Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
Zhen Ni
Add NUMA node filtering functionality to page_owner to allow filtering
pages by specific NUMA node(s). This is useful for NUMA-aware memory
allocation analysis and debugging.
The filter supports flexible nodelist input formats:
- Single node: echo "0" > nid
- Multiple nodes: echo "0,2,3" > nid
- Node range: echo "0-3" > nid
- Mixed format: echo "0,2-4,7" > nid
- Clear filter: echo > nid (empty string)
The implementation uses nodemask_t for efficient multi-node filtering
and nodelist_parse() for flexible input parsing. Empty input clears
the filter.
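To make the accepted input formats concrete, here is a userspace sketch of a nodelist parser. It is not the kernel's nodelist_parse(); it only mimics the formats listed above, under the assumption of at most 64 nodes so that the mask fits in one uint64_t. The names sketch_nodelist_parse() and SKETCH_MAX_NODES are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_NODES 64	/* assumption: illustrative limit, not MAX_NUMNODES */

/*
 * Parse "0", "0,2,3", "0-3" or "0,2-4,7" into a bitmask.
 * Empty or newline-only input yields an empty mask (filter cleared).
 * Returns 0 on success or -EINVAL; *mask is untouched on error.
 * This sketch tolerates a trailing comma.
 */
static int sketch_nodelist_parse(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	while (*s && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -EINVAL;
		lo = hi = strtoul(s, &end, 10);

		/* Optional "-N" turns a single node into a range. */
		if (*end == '-') {
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -EINVAL;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi >= SKETCH_MAX_NODES)
			return -EINVAL;

		for (; lo <= hi; lo++)
			m |= UINT64_C(1) << lo;

		if (*end == ',')
			end++;
		else if (*end && *end != '\n')
			return -EINVAL;
		s = end;
	}

	*mask = m;
	return 0;
}
```

With this sketch, "0,2-4,7" sets bits 0, 2, 3, 4 and 7, and a newline-only write (what "echo > nid" sends) produces an empty mask, which the patch treats as "no filter".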
Note: Access to nid_mask uses plain loads and stores without locking because
nodemask_t can be too large for READ_ONCE()/WRITE_ONCE() (128 bytes when
MAX_NUMNODES is 1024). This is acceptable for a debug interface: the mask
changes rarely, and a torn read can only cause temporary inconsistency in the
debug output.
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
---
Changes in v4:
- Remove "-1" support, use empty string to clear filter
- Use strncpy_from_user() instead of copy_from_user()
- Add concurrency safety documentation for nid_mask access
- Rename fops to page_owner_nid_filter_fops for consistency
Changes in v3:
- Remove READ_ONCE/WRITE_ONCE for nodemask_t (fixes compilation errors)
* nodemask_t is a large structure (128 bytes) that triggers compile-time asserts
* Direct assignment is safe for this use case
- Add comment explaining input length calculation formula
* 6 bytes = ",NNNNN" (comma + 5-digit node number)
- Simplify "-1" check using kstrtoint() instead of dual strcmp()
- Move nodemask_t mask read outside PFN iteration loop for performance
* Avoids 128-byte structure copy on each iteration
Changes in v2:
- Use nodemask_t instead of int to support multiple nodes
- Implement nodelist_parse() to support flexible input formats
* Single node: "0", "2"
* Multiple nodes: "0,2,3"
* Ranges: "0-3"
* Mixed: "0,2-4,7"
- Use %*pbl format for output (e.g., "0-2", "0,2-4,7")
- Use dynamic memory allocation (kmalloc) to handle variable-length input
- Follow cpuset's max_write_len pattern: (100 + 6 * MAX_NUMNODES)
---
mm/page_owner.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 87 insertions(+)
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 28766c854d02..68dacf01c822 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -67,10 +67,18 @@ static const char * const page_owner_print_mode_strings[] = {
struct page_owner_filter {
enum page_owner_print_mode print_mode;
+ /*
+ * Access uses plain load/store without locking.
+ * nodemask_t is too large (128 bytes) for READ_ONCE/WRITE_ONCE.
+ * Safe for debug use: low-frequency changes, torn reads only cause
+ * temporary inconsistency in debug output.
+ */
+ nodemask_t nid_mask;
};
static struct page_owner_filter owner_filter = {
.print_mode = PAGE_OWNER_PRINT_FULL_STACK,
+ .nid_mask = NODE_MASK_NONE,
};
static bool page_owner_enabled __initdata;
@@ -687,6 +695,7 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
struct page_ext *page_ext;
struct page_owner *page_owner;
depot_stack_handle_t handle;
+ nodemask_t mask;
if (!static_branch_unlikely(&page_owner_inited))
return -EINVAL;
@@ -700,6 +709,8 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
while (!pfn_valid(pfn) && (pfn & (MAX_ORDER_NR_PAGES - 1)) != 0)
pfn++;
+ mask = owner_filter.nid_mask;
+
/* Find an allocated page */
for (; pfn < max_pfn; pfn++) {
/*
@@ -732,6 +743,14 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
if (unlikely(!page_ext))
continue;
+ /* NUMA node filter using bitmask */
+ if (!nodes_empty(mask)) {
+ int nid = page_to_nid(page);
+
+ if (!node_isset(nid, mask))
+ goto ext_put_continue;
+ }
+
/*
* Some pages could be missed by concurrent allocation or free,
* because we don't hold the zone lock.
@@ -1054,6 +1073,72 @@ static const struct file_operations page_owner_print_mode_fops = {
.llseek = default_llseek,
};
+static ssize_t nid_filter_write(struct file *file,
+ const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ char *kbuf;
+ nodemask_t mask;
+ int ret;
+
+ /*
+ * Limit input size to handle worst-case nodelist (all nodes).
+ * Worst case per node: ",NNNNN" (comma + 5-digit node number) = 6 bytes.
+ * Formula: 100 bytes overhead + 6 * MAX_NUMNODES
+ */
+ if (count > (100 + 6 * MAX_NUMNODES))
+ return -EINVAL;
+
+ kbuf = kmalloc(count + 1, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ if (strncpy_from_user(kbuf, buf, count) < 0) {
+ ret = -EFAULT;
+ goto out_free;
+ }
+ kbuf[count] = '\0';
+
+ /* Support nodelist format like "0", "0,2", "0-3", or empty to clear */
+ if (nodelist_parse(kbuf, mask)) {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ owner_filter.nid_mask = mask;
+ ret = count;
+
+out_free:
+ kfree(kbuf);
+ return ret;
+}
+
+static int nid_filter_show(struct seq_file *m, void *v)
+{
+ nodemask_t mask = owner_filter.nid_mask;
+
+ if (nodes_empty(mask))
+ seq_puts(m, "\n");
+ else
+ seq_printf(m, "%*pbl\n", nodemask_pr_args(&mask));
+
+ return 0;
+}
+
+static int nid_filter_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, nid_filter_show, NULL);
+}
+
+static const struct file_operations page_owner_nid_filter_fops = {
+ .owner = THIS_MODULE,
+ .open = nid_filter_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .write = nid_filter_write,
+ .release = single_release,
+};
+
static int __init pageowner_init(void)
{
@@ -1069,6 +1154,8 @@ static int __init pageowner_init(void)
filter_dir = debugfs_create_dir("page_owner_filter", NULL);
debugfs_create_file("print_mode", 0600, filter_dir, NULL,
&page_owner_print_mode_fops);
+ debugfs_create_file("nid", 0600, filter_dir, NULL,
+ &page_owner_nid_filter_fops);
dir = debugfs_create_dir("page_owner_stacks", NULL);
debugfs_create_file("show_stacks", 0400, dir,
--
2.20.1
* [PATCH v4 3/3] mm/page_owner: document page_owner filter features
From: Zhen Ni @ 2026-04-30 16:32 UTC
To: akpm, vbabka
Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
Zhen Ni
Add documentation for the page_owner filter functionality, including:
- Print mode filter (full stack vs stack handle)
- NUMA node filter (single node, multiple nodes, ranges)
- Usage examples for both filters
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
---
Changes in v4:
- Update print_mode documentation to reflect string-based interface
* Change from "0/1" to "full_stack"/"stack_handle"
* Add bracket notation example: "[full_stack] stack_handle"
- Update NUMA filter documentation
* Remove "-1" example
* Add empty string as clear method
- Fix indentation: use tabs instead of spaces in code examples
Changes in v3:
- New patch to document filter features as requested by Andrew Morton
---
Documentation/mm/page_owner.rst | 61 ++++++++++++++++++++++++++++++++-
1 file changed, 60 insertions(+), 1 deletion(-)
diff --git a/Documentation/mm/page_owner.rst b/Documentation/mm/page_owner.rst
index 6b12f3b007ec..178bacfbb3fd 100644
--- a/Documentation/mm/page_owner.rst
+++ b/Documentation/mm/page_owner.rst
@@ -74,7 +74,17 @@ Usage
3) Do the job that you want to debug.
-4) Analyze information from page owner::
+4) (Optional) Use filters to focus on specific memory allocations::
+
+ cd /sys/kernel/debug/page_owner_filter
+
+ # Print only stack handles instead of full traces
+ echo stack_handle > print_mode
+
+ # Filter by NUMA nodes
+ echo "0,2-3" > nid
+
+5) Analyze information from page owner::
cat /sys/kernel/debug/page_owner_stacks/show_stacks > stacks.txt
cat stacks.txt
@@ -238,6 +248,55 @@ Usage
./page_owner_sort <input> <output> --tgid=1,2,3
./page_owner_sort <input> <output> --name name1,name2
+Page Owner Filters
+==================
+
+The page_owner feature provides filtering capabilities to focus on specific
+memory allocations (e.g., by NUMA node). Filters are controlled through debugfs
+files in ``/sys/kernel/debug/page_owner_filter/``.
+
+Print Mode Filter
+-----------------
+
+The ``print_mode`` file controls the level of detail in stack trace output.
+
+Available modes:
+
+- ``full_stack`` (default): Print full stack traces
+- ``stack_handle``: Print only stack handles
+
+Reading the file shows the current mode with brackets around the active option::
+
+ cat /sys/kernel/debug/page_owner_filter/print_mode
+ [full_stack] stack_handle
+
+The ``stack_handle`` mode significantly reduces output size. Instead of full
+stack traces, it prints only the handle number::
+
+ Page allocated via order 0, mask 0x42800(GFP_NOWAIT|__GFP_COMP),
+ pid 1, tgid 1 (systemd), ts 349667370 ns
+ PFN 0xa00a2 type Unmovable Block 1280 type Unmovable
+ Flags 0x33fffe0000004124(...)
+ handle: 17432583
+
+To retrieve the full stack trace for a handle, use::
+
+ cat /sys/kernel/debug/page_owner_stacks/show_stacks_handles
+
+NUMA Node Filter
+----------------
+
+The ``nid`` file filters pages by NUMA node. This is useful for NUMA-aware
+environments to analyze node-specific memory allocation.
+
+Supported input formats:
+
+- Single node: ``echo "2" > nid``
+- Multiple nodes: ``echo "0,2,3" > nid``
+- Node range: ``echo "0-3" > nid``
+- Mixed format: ``echo "0,2-4,7" > nid``
+- Clear filter: ``echo > nid`` (empty string)
+
STANDARD FORMAT SPECIFIERS
==========================
::
--
2.20.1
* Re: [PATCH v4 0/3] mm/page_owner: add filter infrastructure for print_mode and NUMA filtering
From: Andrew Morton @ 2026-04-30 18:22 UTC
To: Zhen Ni
Cc: vbabka, surenb, mhocko, jackmanb, hannes, ziy, linux-mm,
linux-kernel
On Fri, 1 May 2026 00:32:44 +0800 Zhen Ni <zhen.ni@easystack.cn> wrote:
> This patch series introduces filtering capabilities to the page_owner
> feature to address storage and performance challenges in production
> environments.
AI review asks a couple of reasonable-sounding questions:
https://sashiko.dev/#/patchset/20260430163247.13628-1-zhen.ni@easystack.cn
* Re: [PATCH v4 0/3] mm/page_owner: add filter infrastructure for print_mode and NUMA filtering
From: SeongJae Park @ 2026-05-01 0:12 UTC
To: Andrew Morton
Cc: SeongJae Park, Zhen Ni, vbabka, surenb, mhocko, jackmanb, hannes,
ziy, linux-mm, linux-kernel
On Thu, 30 Apr 2026 11:22:45 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 1 May 2026 00:32:44 +0800 Zhen Ni <zhen.ni@easystack.cn> wrote:
>
> > This patch series introduces filtering capabilities to the page_owner
> > feature to address storage and performance challenges in production
> > environments.
>
> AI review asks a couple of reasonable-sounding questions:
> https://sashiko.dev/#/patchset/20260430163247.13628-1-zhen.ni@easystack.cn
I like the idea of this series and am therefore willing to help review it, so I
added a few comments to the previous version. But unfortunately I don't like it
enough to open a web browser to read the Sashiko review on my own. I might be
willing to do that if I could read the review on this mailing list. But that's
not the case, and I'm a lazy and bad reviewer...
Even if Zhen replies saying that Sashiko's review found no real issue, if the
reply doesn't include a reasonable amount of explanation with quotes from the
original Sashiko review, I might still feel that I had better double-check
Zhen's opinion, but again I might not feel like opening a web browser to read
the original Sashiko review.
So I will hold off reviewing this series until I am sure the Sashiko review
found no blocker, or until I forget that there were concerning Sashiko reviews
of this series. I just wanted to make clear why I am not continuing to review
this series, FWIW.
Thanks,
SJ