From: "zhen.ni" <zhen.ni@easystack.cn>
To: Michal Hocko <mhocko@suse.com>
Cc: akpm@linux-foundation.org, vbabka@kernel.org, surenb@google.com,
	jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 0/3] mm/page_owner: add filter infrastructure for print_mode and NUMA filtering
Date: Tue, 12 May 2026 16:16:36 +0800
Message-ID: <d5bd3927-3f80-4ed7-a8d2-0b7d58139286@easystack.cn>
In-Reply-To: <agLWHkhlkBotiLYM@tiehlicka>



On 2026/5/12 15:26, Michal Hocko wrote:
> On Tue 12-05-26 11:11:47, zhen.ni wrote:
>>
>>
>> On 2026/5/11 20:54, Michal Hocko wrote:
>>> On Mon 11-05-26 20:40:07, zhen.ni wrote:
>>>>
>>>>
>>>> On 2026/5/11 20:23, Michal Hocko wrote:
>>>>> On Mon 11-05-26 11:30:14, Zhen Ni wrote:
>>>>>> Solution
>>>>>> ========
>>>>>>
>>>>>> This patch series introduces a flexible filter infrastructure with
>>>>>> two initial filters:
>>>>>>
>>>>>> 1. **Print Mode Filter**: Outputs only stack handles instead of
>>>>>>       full stack traces. The handle-to-stack mapping can be retrieved
>>>>>>       from the existing show_stacks_handles interface. This dramatically
>>>>>>       reduces output size while preserving all allocation metadata.
>>>>>>
>>>>>> 2. **NUMA Node Filter**: Allows filtering pages by specific NUMA node(s)
>>>>>>       using flexible nodelist format, enabling targeted analysis of memory
>>>>>>       issues in NUMA-aware deployments.
>>>>>
>>>>> How does this work when there are multiple consumers of the interface?
>>>>> E.g. a per-NUMA-node tool watching node-local page_owner information?
>>>>>
>>>> I understand your concern about concurrent access. Are you asking
>>>> about this scenario?
>>>>
>>>> Scenario: Multiple tools monitoring different NUMA nodes
>>>>     Tool 1: echo "0" > nid && cat page_owner > node0.log
>>>>     Tool 2: echo "1" > nid && cat page_owner > node1.log
>>>>
>>>> The current global filter implementation would have race conditions
>>>> in this case.
>>>
>>> That makes the interface rather broken in my eyes TBH. Is there any way
>>> to make the filter local to the fd?
>>
>> I agree that the global filter state creates race conditions for
>> concurrent consumers.
>>
>> Regarding per-fd filters, I've looked into this approach. The main
>> challenge is that per-fd filter state would require changing the current
>> simple usage model:
> 
>> Current usage:
>> echo "0" > /sys/kernel/debug/page_owner_filter/nid
>> cat /sys/kernel/debug/page_owner
> 
>> Per-fd implementation would require:
>> - Add ioctl interface and allocate filter state in file->private_data
>> - Change page_owner_fops to add .open/.unlocked_ioctl callbacks
>> - Provide user-space tool (e.g., ./page_owner_tool --node 0)
>> - New UAPI header with ioctl definitions
> 
> ioctl is one option. Have you considered writing the filter state to
> the page_owner fd to create a local state?
> 
>> This would replace the current "echo + cat" interface with a
>> tool-based approach.
> 
> Which doesn't sound all that terrible compared to the non-deterministic
> behavior of this proposal.
> 
>> Alternative: Simple mutex protection to serialize
>> concurrent filter modifications. Though this doesn't fully address
>> concurrent reads, it could mitigate the most obvious race conditions.
>>
>> I'm wondering if you have any thoughts on the trade-off here. Since
>> page_owner is mainly used for debugging (typically not in concurrent
>> scenarios), would a simpler approach like mutex protection or documenting
>> this limitation be sufficient?
> 
> The thing is that unless you own the whole machine you never know who
> might consider information from page_owner interesting to filter and
> read. So you might easily get garbage. Not completely terrible
> considering this is debugging interface but I believe we can do better
> than that.

Thank you for the feedback.

I've been thinking about the per-fd filtering approach you suggested:

## Implementation Plan

1. Add per-fd filtering to page_owner file
    - Add .open/.release/.write callbacks
    - Each file descriptor has its own filter state
    - Write filter commands: "nid=0", "mode=stack_handle"

2. Provide user-space tool
    - Simple CLI: ./page_owner_tool --nid=0
    - Handle fd management internally
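
As a rough illustration of step 1 (not code from the patch series), the
command parsing behind a .write callback could look something like the
user-space sketch below. The struct po_filter layout, the po_filter_parse()
name, and the 64-byte nodelist buffer are all assumptions made for this
example; in the kernel the state would presumably live in
file->private_data, allocated in .open and freed in .release.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-fd filter state (names are illustrative only). */
enum po_mode { PO_MODE_FULL_STACK, PO_MODE_STACK_HANDLE };

struct po_filter {
	bool nid_set;        /* true once a "nid=" command was accepted */
	char nodelist[64];   /* raw nodelist text, e.g. "0,2-3" */
	enum po_mode mode;   /* defaults to full stack traces */
};

/*
 * Parse one "key=value" command into the given filter state.
 * Returns 0 on success, -1 on an unknown key or a bad value.
 */
static int po_filter_parse(struct po_filter *f, const char *cmd)
{
	const char *eq, *val;
	size_t klen, vlen;

	assert(f && cmd);

	eq = strchr(cmd, '=');
	if (!eq || eq == cmd)
		return -1;
	klen = (size_t)(eq - cmd);
	val = eq + 1;
	vlen = strcspn(val, "\n");   /* tolerate a trailing newline */

	if (klen == 3 && strncmp(cmd, "nid", 3) == 0) {
		if (vlen == 0 || vlen >= sizeof(f->nodelist))
			return -1;
		memcpy(f->nodelist, val, vlen);
		f->nodelist[vlen] = '\0';
		f->nid_set = true;
		return 0;
	}
	if (klen == 4 && strncmp(cmd, "mode", 4) == 0) {
		if (vlen == 12 && strncmp(val, "stack_handle", 12) == 0) {
			f->mode = PO_MODE_STACK_HANDLE;
			return 0;
		}
		return -1;
	}
	return -1;
}
```

Because every open file would carry its own struct po_filter, two tools
writing "nid=0" and "nid=1" to separate descriptors would not see each
other's settings.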

## User Experience

Direct access (default: no filter):
   cat /sys/kernel/debug/page_owner

With filtering:
   ./page_owner_tool --nid=0
   ./page_owner_tool --nid=0,2-3
   ./page_owner_tool --nid=0 --mode=stack_handle
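
To make the nodelist syntax used above concrete ("0", "0,2-3"), here is a
simplified user-space sketch that turns such a string into a node bitmask.
The po_parse_nodelist() name and the 64-node limit are assumptions for this
example only. An in-kernel implementation would more likely rely on the
existing nodelist_parse() helper, which accepts a richer syntax than this
sketch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define PO_MAX_NODES 64   /* assumption: the mask fits in one uint64_t */

/*
 * Parse a comma-separated list of node IDs and ranges ("0,2-3")
 * into a bitmask with one bit per node.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int po_parse_nodelist(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	assert(s && mask);

	while (*s) {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		errno = 0;
		lo = strtoul(s, &end, 10);
		if (errno)
			return -1;
		hi = lo;
		if (*end == '-') {           /* range "lo-hi" */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
			if (errno)
				return -1;
		}
		if (lo > hi || hi >= PO_MAX_NODES)
			return -1;
		for (unsigned long n = lo; n <= hi; n++)
			m |= UINT64_C(1) << n;

		if (*end == ',') {
			end++;
			if (*end == '\0')    /* reject a trailing comma */
				return -1;
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}
	if (m == 0)                          /* reject an empty list */
		return -1;
	*mask = m;
	return 0;
}
```

With this sketch, "--nid=0,2-3" would select nodes 0, 2 and 3, i.e. a mask
of 0xd.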

## Benefits

- Eliminates the race on shared filter state between concurrent readers
- Per-fd isolation for concurrent access
- Correct design for multi-consumer scenarios

Does this approach look good to you?

Please let me know if you have any suggestions or concerns.

Thanks,
Zhen


