From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9ebddab1-b7b4-462e-a920-f850cc5c55c5@easystack.cn>
Date: Sat, 9 May 2026 15:27:22 +0800
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v5 2/3] mm/page_owner: add NUMA node filter with nodelist support
To: SeongJae Park
Cc: akpm@linux-foundation.org, vbabka@kernel.org, surenb@google.com,
 mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20260509004417.84229-1-sj@kernel.org>
From: "zhen.ni"
In-Reply-To: <20260509004417.84229-1-sj@kernel.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 2026/5/9 08:44, SeongJae Park wrote:
> On Thu, 7 May 2026 14:46:42 +0800 Zhen Ni wrote:
>
>> Add NUMA node filtering functionality to page_owner to allow filtering
>> pages by specific NUMA node(s). This is useful for NUMA-aware memory
>> allocation analysis and debugging.
>>
>> The filter supports flexible nodelist input formats:
>> - Single node: echo "0" > nid
>> - Multiple nodes: echo "0,2,3" > nid
>> - Node range: echo "0-3" > nid
>> - Mixed format: echo "0,2-4,7" > nid
>> - Clear filter: echo > nid (empty string)
>>
>> The implementation uses nodemask_t for efficient multi-node filtering
>> and nodelist_parse() for flexible input parsing. Empty input clears
>> the filter.
>>
>> Note: Access to nid_mask uses plain load/store without locking because
>> nodemask_t is too large (128 bytes) for READ_ONCE/WRITE_ONCE. This is
>> safe for debug use: low-frequency changes and torn reads would only
>> cause temporary inconsistency in debug output.
>>
>> Signed-off-by: Zhen Ni
>> ---
>>
>> Changes in v5:
>> - Optimize nodes_empty() check in page iteration loop
>> - Add __data_racy qualifier to nid_mask field
>
> Adding links to previous revisions [1] would be helpful.

Will do. I'll add lore links to the previous revisions.

>
>> ---
>> mm/page_owner.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 86 insertions(+)
>>
>> diff --git a/mm/page_owner.c b/mm/page_owner.c
> [...]
>> @@ -700,6 +707,9 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
>> 	while (!pfn_valid(pfn) && (pfn & (MAX_ORDER_NR_PAGES - 1)) != 0)
>> 		pfn++;
>>
>> +	mask = owner_filter.nid_mask;
>> +	bool filter_by_nid = !nodes_empty(mask);
>> +
>
> Shouldn't we separate variable declarations and statements inside a same block?
>

I will fix this in v6 by declaring all variables at the beginning of the
block:

	nodemask_t mask;
	bool filter_by_nid;

	mask = owner_filter.nid_mask;
	filter_by_nid = !nodes_empty(mask);

> [...]
>> +static ssize_t nid_filter_write(struct file *file,
>> +				const char __user *buf,
>> +				size_t count, loff_t *ppos)
>> +{
>> +	char *kbuf;
>> +	nodemask_t mask;
>> +	int ret;
>> +
>> +	/*
>> +	 * Limit input size to handle worst-case nodelist (all nodes).
>> +	 * Worst case per node: ",NNNNN" (comma + 5-digit node number) = 6 bytes.
>> +	 * Formula: 100 bytes overhead + 6 * MAX_NUMNODES
>
> What is the 100 bytes overhead?

The 100 bytes is intended as a safety margin, but it's not strictly
necessary. Maybe I should simplify it to just 6 * MAX_NUMNODES?

>
>> +	 */
>> +	if (count > (100 + 6 * MAX_NUMNODES))
>> +		return -EINVAL;
>> +
>> +	kbuf = kmalloc(count + 1, GFP_KERNEL);
>> +	if (!kbuf)
>> +		return -ENOMEM;
>
> Would it make sense to use kmalloc_objs()?

I'll update the code to use kmalloc_objs(char, count + 1, GFP_KERNEL)

Thanks for the review!

>
> [1] https://docs.kernel.org/process/submitting-patches.html#commentary
>
>
> Thanks,
> SJ
>
> [...]
>
>

Best regards,
Zhen
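
P.S. On the question of whether 6 * MAX_NUMNODES is enough without the 100-byte margin, here is a rough userspace sketch of the arithmetic, not kernel code. It computes the length of a nodelist that names every node individually ("0,1,2,...") and compares it with the simplified limit. The helper names (decimal_digits, all_nodes_list_len, simplified_limit) are made up for illustration, and the 6-bytes-per-node figure assumes node ids below 100000, as in the patch comment. It does not cover inputs with duplicate entries or extra whitespace, which could be longer.

```c
#include <assert.h>
#include <stddef.h>

/* Number of decimal digits needed to print a node id. */
size_t decimal_digits(unsigned int n)
{
	size_t digits = 1;

	while (n >= 10) {
		n /= 10;
		digits++;
	}
	return digits;
}

/*
 * Length in bytes of "0,1,2,...,(nr_nodes - 1)": every node listed
 * individually, no ranges, excluding the trailing newline and NUL.
 */
size_t all_nodes_list_len(unsigned int nr_nodes)
{
	size_t len = 0;
	unsigned int nid;

	assert(nr_nodes > 0);
	for (nid = 0; nid < nr_nodes; nid++)
		len += decimal_digits(nid);
	/* One comma between each pair of adjacent entries. */
	len += nr_nodes - 1;
	return len;
}

/* Hypothetical simplified limit: 6 bytes per node, no extra margin. */
size_t simplified_limit(unsigned int nr_nodes)
{
	return 6 * (size_t)nr_nodes;
}
```

For example, with an assumed MAX_NUMNODES of 1024 (an illustrative value, not taken from the patch), all_nodes_list_len(1024) plus one byte for the newline that echo appends stays below simplified_limit(1024), so the extra 100 bytes would not be needed for that input shape.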