Linux Trace Kernel
* [PATCH v2] tracing: Add NULL pointer check to trigger_data_free()
From: Guenter Roeck @ 2026-03-05 19:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, Guenter Roeck, Miaoqian Lin

If trigger_data_alloc() fails and returns NULL, event_hist_trigger_parse()
jumps to the out_free error path. While kfree() safely handles a NULL
pointer, trigger_data_free() does not. This causes a NULL pointer
dereference in trigger_data_free() when evaluating
data->cmd_ops->set_filter.

Fix the problem by adding a NULL pointer check to trigger_data_free().

The problem was found by an experimental code review agent based on
gemini-3.1-pro while reviewing backports into v6.18.y.

Assisted-by: Gemini:gemini-3.1-pro
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Fixes: 0550069cc25f ("tracing: Properly process error handling in event_hist_trigger_parse()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: Add NULL pointer check to trigger_data_free() to make it more robust
    instead of changing the calling code.
    Note: Changed patch description to reflect new functionality.

 kernel/trace/trace_events_trigger.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index fecbd679d432..d5230b759a2d 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -50,6 +50,9 @@ static int trigger_kthread_fn(void *ignore)
 
 void trigger_data_free(struct event_trigger_data *data)
 {
+	if (!data)
+		return;
+
 	if (data->cmd_ops->set_filter)
 		data->cmd_ops->set_filter(NULL, data, NULL);
 
-- 
2.45.2
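The failure mode described in the commit message above can be sketched in user space. This is only an illustration of the pattern, not the kernel code: the `demo_*` names, the simplified `set_filter` signature, and the counter are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_trigger_data;

/* Hypothetical stand-in for the command ops; not the kernel definition. */
struct demo_cmd_ops {
	/* Optional teardown hook, playing the role of cmd_ops->set_filter. */
	void (*set_filter)(struct demo_trigger_data *data);
};

struct demo_trigger_data {
	const struct demo_cmd_ops *cmd_ops;
};

/* Counts hook invocations so a caller can observe what the free path did. */
static int demo_filters_cleared;

static void demo_set_filter(struct demo_trigger_data *data)
{
	(void)data;
	demo_filters_cleared++;
}

static const struct demo_cmd_ops demo_ops = { .set_filter = demo_set_filter };

/* Returns NULL when 'fail' is set, simulating an allocation failure. */
static struct demo_trigger_data *demo_trigger_data_alloc(int fail)
{
	struct demo_trigger_data *data;

	if (fail)
		return NULL;

	data = calloc(1, sizeof(*data));
	if (data)
		data->cmd_ops = &demo_ops;
	return data;
}

/* Like free(NULL), a NULL argument is a no-op instead of a crash. */
static void demo_trigger_data_free(struct demo_trigger_data *data)
{
	if (!data)
		return;

	assert(data->cmd_ops != NULL);
	if (data->cmd_ops->set_filter)
		data->cmd_ops->set_filter(data);

	free(data);
}
```

With the early return in place, an error path can call `demo_trigger_data_free()` unconditionally, whether or not the allocation succeeded. Without it, the first access to `data->cmd_ops` would dereference NULL.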



* Re: [RFC PATCH 1/2] locking: add mutex_lock_nospin()
From: Waiman Long @ 2026-03-05 19:00 UTC (permalink / raw)
  To: David Laight
  Cc: Yafang Shao, Steven Rostedt, Peter Zijlstra, mingo, will, boqun,
	mhiramat, mark.rutland, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, bpf
In-Reply-To: <20260305093254.61facfb6@pumpkin>

On 3/5/26 4:32 AM, David Laight wrote:
> On Wed, 4 Mar 2026 23:30:40 -0500
> Waiman Long <longman@redhat.com> wrote:
>
>> On 3/4/26 10:08 PM, Yafang Shao wrote:
>>> On Thu, Mar 5, 2026 at 11:00 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>>>> On Thu, 5 Mar 2026 10:33:00 +0800
>>>> Yafang Shao <laoar.shao@gmail.com> wrote:
>>>>   
>>>>> Other tools may also read available_filter_functions, requiring each
>>>>> one to be patched individually to avoid this flaw—a clearly
>>>>> impractical solution.
>>>> What exactly is the issue?
>>> It makes no sense to spin unnecessarily when it can be avoided. We
>>> continuously improve the kernel to do the right thing—and unnecessary
>>> spinning is certainly not the right thing.
>>>   
>>>> If a task does a while 1 in user space, it
>>>> wouldn't be much different.
>>> The while loop in user space performs actual work, whereas useless
>>> spinning does nothing but burn CPU cycles. My point is simple: if this
>>> unnecessary spinning isn't already considered an issue, it should
>>> be—it's something that clearly needs improvement.
>> The whole point of optimistic spinning is to reduce the lock acquisition
>> latency. If the waiter sleeps, the unlock operation will have to wake up
>> the waiter which can have a variable latency depending on how busy the
>> system is at the time. Yes, it is burning CPU cycles while spinning,
>> but most workloads will gain performance with this optimistic spinning
>> feature. You do have a point that system monitoring tools that
>> observe the system behavior shouldn't burn so much CPU time
>> that they affect the performance of the real workload being monitored.
>>
>> BTW, you should expand the commit log of patch 1 to include the
>> rationale of why we should add this feature to mutex as the information
>> in the cover letter won't get included in the git log if this patch
>> series is merged. You should also elaborate in a comment on the
>> conditions under which this new mutex API should be used.
> Isn't changing mutex_lock() the wrong place anyway?
> What you need is for the code holding the lock to indicate that
> it isn't worth waiters spinning because the lock will be held
> for a long time.

I have actually thought about having a flag somewhere in the mutex
itself to indicate that optimistic spinning isn't needed. However, the
owner field is running out of usable flag bits. The other option is to
add it to the osq, as it doesn't really need to use the full 32 bits for
the tail. In this case, we can just initialize the mutex to say that we
don't need optimistic spinning, and no new mutex_lock() API will be needed.

Cheers,
Longman
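The idea of borrowing a bit from the osq tail could look roughly like the following user-space sketch. Everything here is hypothetical: the `demo_osq` layout, the choice of bit 31, and the helper names are assumptions for illustration and do not reflect the kernel's actual osq encoding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: reserve the top bit of the tail word as a "no spin" flag. */
#define DEMO_OSQ_NOSPIN		(UINT32_C(1) << 31)
#define DEMO_OSQ_TAIL_MASK	(~DEMO_OSQ_NOSPIN)

struct demo_osq {
	/* Low 31 bits: encoded queue tail (0 means empty). Bit 31: flag. */
	_Atomic uint32_t tail;
};

/* The flag is chosen once, at init time, by the owner of the lock. */
static void demo_osq_init(struct demo_osq *q, bool nospin)
{
	atomic_init(&q->tail, nospin ? DEMO_OSQ_NOSPIN : 0);
}

/* Waiters check this before deciding to spin or go straight to sleep. */
static bool demo_osq_can_spin(struct demo_osq *q)
{
	uint32_t v = atomic_load_explicit(&q->tail, memory_order_relaxed);

	return !(v & DEMO_OSQ_NOSPIN);
}

/* Update the tail while preserving the flag bit. */
static void demo_osq_set_tail(struct demo_osq *q, uint32_t tail)
{
	uint32_t old, next;

	assert((tail & DEMO_OSQ_NOSPIN) == 0);
	old = atomic_load_explicit(&q->tail, memory_order_relaxed);
	do {
		next = (old & DEMO_OSQ_NOSPIN) | tail;
	} while (!atomic_compare_exchange_weak_explicit(&q->tail, &old, next,
							memory_order_acq_rel,
							memory_order_relaxed));
}

/* Read the tail with the flag bit masked off. */
static uint32_t demo_osq_tail(struct demo_osq *q)
{
	return atomic_load_explicit(&q->tail, memory_order_relaxed) &
	       DEMO_OSQ_TAIL_MASK;
}
```

The point of the design is that the decision lives in the lock rather than in each call site, so existing lock callers would not need a new API.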



* Re: [PATCH v4 4/5] mm: rename zone->lock to zone->_lock
From: Dmitry Ilvokhin @ 2026-03-05 18:59 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: SeongJae Park, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Rafael J. Wysocki, Pavel Machek, Len Brown, Brendan Jackman,
	Johannes Weiner, Zi Yan, Oscar Salvador, Qi Zheng, Shakeel Butt,
	linux-kernel, linux-mm, linux-trace-kernel, linux-pm
In-Reply-To: <aanIdk2i-Eo9M967@shell.ilvokhin.com>

On Thu, Mar 05, 2026 at 06:16:26PM +0000, Dmitry Ilvokhin wrote:
> On Thu, Mar 05, 2026 at 10:27:07AM +0100, Vlastimil Babka (SUSE) wrote:
> > On 3/4/26 16:13, SeongJae Park wrote:
> > > On Wed, 4 Mar 2026 13:01:45 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> > > 
> > >> On Tue, Mar 03, 2026 at 05:50:34PM -0800, SeongJae Park wrote:
> > >> > On Tue, 3 Mar 2026 14:25:55 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> > >> > 
> > >> > > On Mon, Mar 02, 2026 at 02:37:43PM -0800, Andrew Morton wrote:
> > >> > > > On Mon, 2 Mar 2026 15:10:03 +0100 "Vlastimil Babka (SUSE)" <vbabka@kernel.org> wrote:
> > >> > > > 
> > >> > > > > On 2/27/26 17:00, Dmitry Ilvokhin wrote:
> > >> > > > > > This intentionally breaks direct users of zone->lock at compile time so
> > >> > > > > > all call sites are converted to the zone lock wrappers. Without the
> > >> > > > > > rename, present and future out-of-tree code could continue using
> > >> > > > > > spin_lock(&zone->lock) and bypass the wrappers and tracing
> > >> > > > > > infrastructure.
> > >> > > > > > 
> > >> > > > > > No functional change intended.
> > >> > > > > > 
> > >> > > > > > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > >> > > > > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > >> > > > > > Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> > >> > > > > > Acked-by: SeongJae Park <sj@kernel.org>
> > >> > > > > 
> > >> > > > > I see some more instances of 'zone->lock' in comments in
> > >> > > > > include/linux/mmzone.h and under Documentation/ but otherwise LGTM.
> > >> > > > > 
> > >> > > > 
> > >> > > > I fixed (most of) that in the previous version but my fix was lost.
> > >> > > 
> > >> > > Thanks for the fixups, Andrew.
> > >> > > 
> > >> > > I still see a few 'zone->lock' references in Documentation remaining on
> > >> > > mm-new. This patch cleans them up, as noted by Vlastimil.
> > >> > > 
> > >> > > I'm happy to adjust this patch if anything else needs attention.
> > >> > > 
> > >> > > From 9142d5a8b60038fa424a6033253960682e5a51f4 Mon Sep 17 00:00:00 2001
> > >> > > From: Dmitry Ilvokhin <d@ilvokhin.com>
> > >> > > Date: Tue, 3 Mar 2026 06:13:13 -0800
> > >> > > Subject: [PATCH] mm: fix remaining zone->lock references
> > >> > > 
> > >> > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > >> > > ---
> > >> > >  Documentation/mm/physical_memory.rst | 4 ++--
> > >> > >  Documentation/trace/events-kmem.rst  | 8 ++++----
> > >> > >  2 files changed, 6 insertions(+), 6 deletions(-)
> > >> > > 
> > >> > > diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst
> > >> > > index b76183545e5b..e344f93515b6 100644
> > >> > > --- a/Documentation/mm/physical_memory.rst
> > >> > > +++ b/Documentation/mm/physical_memory.rst
> > >> > > @@ -500,11 +500,11 @@ General
> > >> > >  ``nr_isolate_pageblock``
> > >> > >    Number of isolated pageblocks. It is used to solve incorrect freepage counting
> > >> > >    problem due to racy retrieving migratetype of pageblock. Protected by
> > >> > > -  ``zone->lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
> > >> > > +  ``zone_lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
> > >> > 
> > >> > Dmitry's original patch [1] was doing 's/zone->lock/zone->_lock/', which aligns
> > >> > to my expectation.  But this patch is doing 's/zone->lock/zone_lock/'.  Same
> > >> > for the rest of this patch.
> > >> > 
> > >> > I was initially thinking this is just a mistake, but I also found Andrew is
> > >> > doing the same change [2], so I'm a bit confused.  Is this an intentional change?
> > >> > 
> > >> > [1] https://lore.kernel.org/d61500c5784c64e971f4d328c57639303c475f81.1772206930.git.d@ilvokhin.com
> > >> > [2] https://lore.kernel.org/20260302143743.220eed4feb36d7572fe726cc@linux-foundation.org
> > >> > 
> > >> 
> > >> Good catch, thanks for pointing this out, SJ.
> > >> 
> > >> Originally the mechanical rename was indeed zone->lock -> zone->_lock.
> > >> However, in Documentation I intentionally switched references to
> > >> zone_lock instead of zone->_lock. The reasoning is that _lock is now an
> > >> internal implementation detail, and direct access is discouraged. The
> > >> intended interface is via the zone_lock_*() / zone_unlock_*() wrappers,
> > >> so referencing zone_lock in documentation felt more appropriate than
> > >> mentioning the private struct field (zone->_lock).
> > > 
> > > Thank you for this nice and kind clarification, Dmitry!  I agree mentioning
> > > zone_[un]lock_*() helpers instead of the hidden member (zone->_lock) can be
> > > better.
> > > 
> > > But I'm concerned that people like me might not be aware of the intention behind
> > > 'zone_lock'.  If there is a well-known convention that allows people to know it
> > > is for 'zone_[un]lock_*()' helpers, making it more clear would be nice, in my
> > > humble opinion.  If there is such a convention but I'm just missing it, please
> > > ignore.  If I'm not, for example,
> > > 
> > > "protected by ``zone->lock``" could be rewritten to
> > > "protected by ``zone_[un]lock_*()`` locking helpers" or,
> > > "protected by zone lock helper functions (``zone_[un]lock_*()``)" ?
> > > 
> > >> 
> > >> That said, I agree this creates inconsistency with the mechanical
> > >> rename, and I'm happy to adjust either way: either consistently refer
> > >> to the wrapper API, or keep documentation aligned with zone->_lock.
> > >> 
> > >> I slightly prefer referring to the wrapper API, but don't have a strong
> > >> preference as long as we're consistent.
> > > 
> > > I also think both approaches are good.  But for the wrapper approach, I think
> > > giving more context rather than just ``zone_lock`` to readers would be nice.
> > 
> > Grep tells me that we also have comments mentioning simply "zone lock", btw.
> > And it's also a term used often in informal conversations. Maybe we could
> > just standardize on that in comments/documentations as it's easier to read.
> > Discovering that the field is called _lock and that wrappers should be used,
> > is hopefully not that difficult.
> 
> Thanks for the suggestion, Vlastimil. That sounds reasonable to me as
> well. I'll update the comments and documentation to consistently use
> "zone lock".

Following the suggestion from SJ and Vlastimil, I prepared a fixup to
standardize documentation and comments on the term "zone lock".

The patch is based on top of the current mm-new.

Andrew, please let me know if you would prefer a respin of the series
instead.

From 267cda3e0e160f97b346009bc48819bfeed92e52 Mon Sep 17 00:00:00 2001
From: Dmitry Ilvokhin <d@ilvokhin.com>
Date: Thu, 5 Mar 2026 10:36:17 -0800
Subject: [PATCH] mm: documentation: standardize on "zone lock" terminology

During review of the zone lock tracing series it was suggested to
standardize documentation and comments on the term "zone lock"
instead of using zone_lock or referring to the internal field
zone->_lock.

Update references accordingly.

Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 Documentation/mm/physical_memory.rst |  4 ++--
 Documentation/trace/events-kmem.rst  |  8 ++++----
 mm/compaction.c                      |  2 +-
 mm/internal.h                        |  2 +-
 mm/page_alloc.c                      | 12 ++++++------
 mm/page_isolation.c                  |  4 ++--
 mm/page_owner.c                      |  2 +-
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst
index e344f93515b6..2398d87ac156 100644
--- a/Documentation/mm/physical_memory.rst
+++ b/Documentation/mm/physical_memory.rst
@@ -500,11 +500,11 @@ General
 ``nr_isolate_pageblock``
   Number of isolated pageblocks. It is used to solve incorrect freepage counting
   problem due to racy retrieving migratetype of pageblock. Protected by
-  ``zone_lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
+  zone lock. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
 
 ``span_seqlock``
   The seqlock to protect ``zone_start_pfn`` and ``spanned_pages``. It is a
-  seqlock because it has to be read outside of ``zone_lock``, and it is done in
+  seqlock because it has to be read outside of zone lock, and it is done in
   the main allocator path. However, the seqlock is written quite infrequently.
   Defined only when ``CONFIG_MEMORY_HOTPLUG`` is enabled.
 
diff --git a/Documentation/trace/events-kmem.rst b/Documentation/trace/events-kmem.rst
index 3c20a972de27..42f08f3b136c 100644
--- a/Documentation/trace/events-kmem.rst
+++ b/Documentation/trace/events-kmem.rst
@@ -57,7 +57,7 @@ the per-CPU allocator (high performance) or the buddy allocator.
 
 If pages are allocated directly from the buddy allocator, the
 mm_page_alloc_zone_locked event is triggered. This event is important as high
-amounts of activity imply high activity on the zone_lock. Taking this lock
+amounts of activity imply high activity on the zone lock. Taking this lock
 impairs performance by disabling interrupts, dirtying cache lines between
 CPUs and serialising many CPUs.
 
@@ -79,11 +79,11 @@ contention on the lruvec->lru_lock.
   mm_page_pcpu_drain		page=%p pfn=%lu order=%d cpu=%d migratetype=%d
 
 In front of the page allocator is a per-cpu page allocator. It exists only
-for order-0 pages, reduces contention on the zone_lock and reduces the
+for order-0 pages, reduces contention on the zone lock and reduces the
 amount of writing on struct page.
 
 When a per-CPU list is empty or pages of the wrong type are allocated,
-the zone_lock will be taken once and the per-CPU list refilled. The event
+the zone lock will be taken once and the per-CPU list refilled. The event
 triggered is mm_page_alloc_zone_locked for each page allocated with the
 event indicating whether it is for a percpu_refill or not.
 
@@ -92,7 +92,7 @@ which triggers a mm_page_pcpu_drain event.
 
 The individual nature of the events is so that pages can be tracked
 between allocation and freeing. A number of drain or refill pages that occur
-consecutively imply the zone_lock being taken once. Large amounts of per-CPU
+consecutively imply the zone lock being taken once. Large amounts of per-CPU
 refills and drains could imply an imbalance between CPUs where too much work
 is being concentrated in one place. It could also indicate that the per-CPU
 lists should be a larger size. Finally, large amounts of refills on one CPU
diff --git a/mm/compaction.c b/mm/compaction.c
index 143ead2cb10a..32623894a632 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1419,7 +1419,7 @@ static bool suitable_migration_target(struct compact_control *cc,
 		int order = cc->order > 0 ? cc->order : pageblock_order;
 
 		/*
-		 * We are checking page_order without zone->_lock taken. But
+		 * We are checking page_order without zone lock taken. But
 		 * the only small danger is that we skip a potentially suitable
 		 * pageblock, so it's not worth to check order for valid range.
 		 */
diff --git a/mm/internal.h b/mm/internal.h
index f634ac469c87..95b583e7e4f7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -727,7 +727,7 @@ static inline unsigned int buddy_order(struct page *page)
  * (d) a page and its buddy are in the same zone.
  *
  * For recording whether a page is in the buddy system, we set PageBuddy.
- * Setting, clearing, and testing PageBuddy is serialized by zone->_lock.
+ * Setting, clearing, and testing PageBuddy is serialized by zone lock.
  *
  * For recording page's order, we use page_private(page).
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4c95364b7063..75ee81445640 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2440,7 +2440,7 @@ enum rmqueue_mode {
 
 /*
  * Do the hard work of removing an element from the buddy allocator.
- * Call me with the zone->_lock already held.
+ * Call me with the zone lock already held.
  */
 static __always_inline struct page *
 __rmqueue(struct zone *zone, unsigned int order, int migratetype,
@@ -2468,7 +2468,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 	 * fallbacks modes with increasing levels of fragmentation risk.
 	 *
 	 * The fallback logic is expensive and rmqueue_bulk() calls in
-	 * a loop with the zone->_lock held, meaning the freelists are
+	 * a loop with the zone lock held, meaning the freelists are
 	 * not subject to any outside changes. Remember in *mode where
 	 * we found pay dirt, to save us the search on the next call.
 	 */
@@ -7046,7 +7046,7 @@ int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
 	 * pages.  Because of this, we reserve the bigger range and
 	 * once this is done free the pages we are not interested in.
 	 *
-	 * We don't have to hold zone->_lock here because the pages are
+	 * We don't have to hold zone lock here because the pages are
 	 * isolated thus they won't get removed from buddy.
 	 */
 	outer_start = find_large_buddy(start);
@@ -7615,7 +7615,7 @@ void accept_page(struct page *page)
 		return;
 	}
 
-	/* Unlocks zone->_lock */
+	/* Unlocks zone lock */
 	__accept_page(zone, &flags, page);
 }
 
@@ -7632,7 +7632,7 @@ static bool try_to_accept_memory_one(struct zone *zone)
 		return false;
 	}
 
-	/* Unlocks zone->_lock */
+	/* Unlocks zone lock */
 	__accept_page(zone, &flags, page);
 
 	return true;
@@ -7773,7 +7773,7 @@ struct page *alloc_frozen_pages_nolock_noprof(gfp_t gfp_flags, int nid, unsigned
 
 	/*
 	 * Best effort allocation from percpu free list.
-	 * If it's empty attempt to spin_trylock zone->_lock.
+	 * If it's empty attempt to spin_trylock zone lock.
 	 */
 	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
 
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index cf731370e7a7..e8414e9a718a 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -212,7 +212,7 @@ static int set_migratetype_isolate(struct page *page, enum pb_isolate_mode mode,
 	zone_unlock_irqrestore(zone, flags);
 	if (mode == PB_ISOLATE_MODE_MEM_OFFLINE) {
 		/*
-		 * printk() with zone->_lock held will likely trigger a
+		 * printk() with zone lock held will likely trigger a
 		 * lockdep splat, so defer it here.
 		 */
 		dump_page(unmovable, "unmovable page");
@@ -553,7 +553,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn)
 /*
  * Test all pages in the range is free(means isolated) or not.
  * all pages in [start_pfn...end_pfn) must be in the same zone.
- * zone->_lock must be held before call this.
+ * zone lock must be held before call this.
  *
  * Returns the last tested pfn.
  */
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 54a4ba63b14f..109f2f28f5b1 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -799,7 +799,7 @@ static void init_pages_in_zone(struct zone *zone)
 				continue;
 
 			/*
-			 * To avoid having to grab zone->_lock, be a little
+			 * To avoid having to grab zone lock, be a little
 			 * careful when reading buddy page order. The only
 			 * danger is that we skip too much and potentially miss
 			 * some early allocated pages, which is better than
-- 
2.47.3
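The rename-plus-wrappers approach discussed in this thread can be sketched in user space as follows. This is a simplified illustration, not the kernel implementation: the `demo_*` names, the `atomic_flag` spinlock, and the acquisition counter standing in for a tracepoint are all assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_zone {
	/* Private by convention: callers use the wrappers, not this field. */
	atomic_flag _lock;
	/* Stand-in for tracing: counts acquisitions through the wrapper. */
	unsigned long lock_acquisitions;
	unsigned long nr_free;
};

#define DEMO_ZONE_INIT { ._lock = ATOMIC_FLAG_INIT }

/* The single entry point for taking the lock, so it can be instrumented. */
static void demo_zone_lock(struct demo_zone *zone)
{
	assert(zone != NULL);
	while (atomic_flag_test_and_set_explicit(&zone->_lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
	zone->lock_acquisitions++;
}

static void demo_zone_unlock(struct demo_zone *zone)
{
	atomic_flag_clear_explicit(&zone->_lock, memory_order_release);
}

/* Example caller: touches protected state only via the wrappers. */
static void demo_zone_add_free(struct demo_zone *zone, unsigned long n)
{
	demo_zone_lock(zone);
	zone->nr_free += n;
	demo_zone_unlock(zone);
}
```

Because the field is named `_lock`, any leftover code that still refers to `zone->lock` fails to compile, which is the effect the rename in this series aims for. Documentation can then simply say "zone lock" without naming either the field or a specific helper.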



* Re: [RFC PATCH 1/2] locking: add mutex_lock_nospin()
From: Waiman Long @ 2026-03-05 18:44 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Steven Rostedt, David Laight, Peter Zijlstra, mingo, will, boqun,
	mhiramat, mark.rutland, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, bpf
In-Reply-To: <c58807dc-74ea-469e-9c21-0300b80f2a82@redhat.com>

On 3/5/26 1:34 PM, Waiman Long wrote:
> On 3/5/26 12:40 AM, Yafang Shao wrote:
>> On Thu, Mar 5, 2026 at 12:30 PM Waiman Long <longman@redhat.com> wrote:
>>> On 3/4/26 10:08 PM, Yafang Shao wrote:
>>>> On Thu, Mar 5, 2026 at 11:00 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>>>>> On Thu, 5 Mar 2026 10:33:00 +0800
>>>>> Yafang Shao <laoar.shao@gmail.com> wrote:
>>>>>
>>>>>> Other tools may also read available_filter_functions, requiring each
>>>>>> one to be patched individually to avoid this flaw—a clearly
>>>>>> impractical solution.
>>>>> What exactly is the issue?
>>>> It makes no sense to spin unnecessarily when it can be avoided. We
>>>> continuously improve the kernel to do the right thing—and unnecessary
>>>> spinning is certainly not the right thing.
>>>>
>>>>> If a task does a while 1 in user space, it
>>>>> wouldn't be much different.
>>>> The while loop in user space performs actual work, whereas useless
>>>> spinning does nothing but burn CPU cycles. My point is simple: if this
>>>> unnecessary spinning isn't already considered an issue, it should
>>>> be—it's something that clearly needs improvement.
>>> The whole point of optimistic spinning is to reduce the lock acquisition
>>> latency. If the waiter sleeps, the unlock operation will have to wake up
>>> the waiter which can have a variable latency depending on how busy the
>>> system is at the time. Yes, it is burning CPU cycles while spinning,
>>> but most workloads will gain performance with this optimistic spinning
>>> feature. You do have a point that system monitoring tools that
>>> observe the system behavior shouldn't burn so much CPU time
>>> that they affect the performance of the real workload being monitored.
>> Exactly. ftrace is intended for debugging and should not significantly
>> impact real workloads. Therefore, it's reasonable to make it sleep if
>> it cannot acquire the lock immediately, rather than spinning and
>> consuming CPU cycles.
>
> Your patch series uses wording that gives optimistic spinning a
> negative connotation, making it look like a bad thing. In fact, this is
> just a request for a new mutex API for use cases that can tolerate
> higher latency in order to minimize the system overhead they incur. So
> don't bad-mouth optimistic spinning; instead, emphasize the use cases
> you want to support with the new API in your next version.

BTW, for any new mutex API introduced, you should also provide an
equivalent version in kernel/locking/rtmutex_api.c for PREEMPT_RT kernels.

Cheers,
Longman



* Re: [PATCH v12 00/30] Tracefs support for pKVM
From: Vincent Donnefort @ 2026-03-05 18:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marc Zyngier, mhiramat, mathieu.desnoyers, linux-trace-kernel,
	oliver.upton, joey.gouly, suzuki.poulose, yuzenghui, kvmarm,
	linux-arm-kernel, jstultz, qperret, will, aneesh.kumar,
	kernel-team, linux-kernel
In-Reply-To: <20260305111703.6346b243@gandalf.local.home>

On Thu, Mar 05, 2026 at 11:17:03AM -0500, Steven Rostedt wrote:
> On Thu, 19 Feb 2026 19:11:21 +0000
> Marc Zyngier <maz@kernel.org> wrote:
> 
> > > 
> > > Then the arm/kvm folks could start with that branch and add the arm/KVM
> > > portion of this series on top of it. This will prevent major merge
> > > conflicts in linux-next.
> > > 
> > > Does that sound OK?  
> > 
> > That works. Just send us a link to the branch after -rc2 and we'll get
> > that sorted.
> 
> Vincent,
> 
> There are a few minor conflicts between this series and v7.0-rc2, can you
> rebase and send again?

I'll send a rebased new version tomorrow.

> 
> Thanks,
> 
> -- Steve


* Re: [RFC PATCH 1/2] locking: add mutex_lock_nospin()
From: Waiman Long @ 2026-03-05 18:34 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Steven Rostedt, David Laight, Peter Zijlstra, mingo, will, boqun,
	mhiramat, mark.rutland, mathieu.desnoyers, linux-kernel,
	linux-trace-kernel, bpf
In-Reply-To: <CALOAHbAsiVNC7COL7rCJBdaC-8oZQvg+pjudR4QexQQaYLBmgA@mail.gmail.com>

On 3/5/26 12:40 AM, Yafang Shao wrote:
> On Thu, Mar 5, 2026 at 12:30 PM Waiman Long <longman@redhat.com> wrote:
>> On 3/4/26 10:08 PM, Yafang Shao wrote:
>>> On Thu, Mar 5, 2026 at 11:00 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>>>> On Thu, 5 Mar 2026 10:33:00 +0800
>>>> Yafang Shao <laoar.shao@gmail.com> wrote:
>>>>
>>>>> Other tools may also read available_filter_functions, requiring each
>>>>> one to be patched individually to avoid this flaw—a clearly
>>>>> impractical solution.
>>>> What exactly is the issue?
>>> It makes no sense to spin unnecessarily when it can be avoided. We
>>> continuously improve the kernel to do the right thing—and unnecessary
>>> spinning is certainly not the right thing.
>>>
>>>> If a task does a while 1 in user space, it
>>>> wouldn't be much different.
>>> The while loop in user space performs actual work, whereas useless
>>> spinning does nothing but burn CPU cycles. My point is simple: if this
>>> unnecessary spinning isn't already considered an issue, it should
>>> be—it's something that clearly needs improvement.
>> The whole point of optimistic spinning is to reduce the lock acquisition
>> latency. If the waiter sleeps, the unlock operation will have to wake up
>> the waiter which can have a variable latency depending on how busy the
>> system is at the time. Yes, it is burning CPU cycles while spinning,
>> but most workloads will gain performance with this optimistic spinning
>> feature. You do have a point that system monitoring tools that
>> observe the system behavior shouldn't burn so much CPU time
>> that they affect the performance of the real workload being monitored.
> Exactly. ftrace is intended for debugging and should not significantly
> impact real workloads. Therefore, it's reasonable to make it sleep if
> it cannot acquire the lock immediately, rather than spinning and
> consuming CPU cycles.

Your patch series uses wording that gives optimistic spinning a negative
connotation, making it look like a bad thing. In fact, this is just a
request for a new mutex API for use cases that can tolerate higher
latency in order to minimize the system overhead they incur. So don't
bad-mouth optimistic spinning; instead, emphasize the use cases you want
to support with the new API in your next version.

My 2 cents.

Cheers,
Longman
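The trade-off debated in this thread can be modeled with a small user-space sketch. It is a deliberately simplified model: the fixed spin budget, the `demo_*` names, and the returned path enum are assumptions for illustration, and a fixed budget is not how the kernel decides when to stop spinning. A budget of zero plays the role of the proposed no-spin behavior.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Which route a lock attempt took. */
enum demo_path {
	DEMO_FAST,	/* acquired on the first try */
	DEMO_SPIN,	/* acquired while spinning */
	DEMO_SLEEP,	/* not acquired; a real waiter would block here */
};

struct demo_mutex {
	atomic_bool locked;
};

static bool demo_trylock(struct demo_mutex *m)
{
	bool expected = false;

	return atomic_compare_exchange_strong_explicit(&m->locked, &expected,
						       true,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void demo_unlock(struct demo_mutex *m)
{
	atomic_store_explicit(&m->locked, false, memory_order_release);
}

/*
 * Try to take the lock, spinning at most 'spin_budget' times.
 * On DEMO_SLEEP the lock is NOT held; '*spins' reports the CPU
 * iterations burned before giving up or succeeding.
 */
static enum demo_path demo_lock_attempt(struct demo_mutex *m,
					unsigned int spin_budget,
					unsigned int *spins)
{
	assert(m != NULL && spins != NULL);

	*spins = 0;
	if (demo_trylock(m))
		return DEMO_FAST;

	while (*spins < spin_budget) {
		(*spins)++;
		if (demo_trylock(m))
			return DEMO_SPIN;
	}
	return DEMO_SLEEP;
}
```

Against a lock that stays held, a spinning waiter burns its whole budget before sleeping, while a zero-budget waiter sleeps immediately at the cost of a later wakeup. That is the latency-versus-overhead choice the new API would expose.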



* Re: [PATCH v4 4/5] mm: rename zone->lock to zone->_lock
From: Dmitry Ilvokhin @ 2026-03-05 18:16 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: SeongJae Park, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Rafael J. Wysocki, Pavel Machek, Len Brown, Brendan Jackman,
	Johannes Weiner, Zi Yan, Oscar Salvador, Qi Zheng, Shakeel Butt,
	linux-kernel, linux-mm, linux-trace-kernel, linux-pm
In-Reply-To: <ebd994ca-eb04-4dff-a0a8-47aef0934c2c@kernel.org>

On Thu, Mar 05, 2026 at 10:27:07AM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/4/26 16:13, SeongJae Park wrote:
> > On Wed, 4 Mar 2026 13:01:45 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> > 
> >> On Tue, Mar 03, 2026 at 05:50:34PM -0800, SeongJae Park wrote:
> >> > On Tue, 3 Mar 2026 14:25:55 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> >> > 
> >> > > On Mon, Mar 02, 2026 at 02:37:43PM -0800, Andrew Morton wrote:
> >> > > > On Mon, 2 Mar 2026 15:10:03 +0100 "Vlastimil Babka (SUSE)" <vbabka@kernel.org> wrote:
> >> > > > 
> >> > > > > On 2/27/26 17:00, Dmitry Ilvokhin wrote:
> >> > > > > > This intentionally breaks direct users of zone->lock at compile time so
> >> > > > > > all call sites are converted to the zone lock wrappers. Without the
> >> > > > > > rename, present and future out-of-tree code could continue using
> >> > > > > > spin_lock(&zone->lock) and bypass the wrappers and tracing
> >> > > > > > infrastructure.
> >> > > > > > 
> >> > > > > > No functional change intended.
> >> > > > > > 
> >> > > > > > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> >> > > > > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> >> > > > > > Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> >> > > > > > Acked-by: SeongJae Park <sj@kernel.org>
> >> > > > > 
> >> > > > > I see some more instances of 'zone->lock' in comments in
> >> > > > > include/linux/mmzone.h and under Documentation/ but otherwise LGTM.
> >> > > > > 
> >> > > > 
> >> > > > I fixed (most of) that in the previous version but my fix was lost.
> >> > > 
> >> > > Thanks for the fixups, Andrew.
> >> > > 
> >> > > I still see a few 'zone->lock' references in Documentation remaining on
> >> > > mm-new. This patch cleans them up, as noted by Vlastimil.
> >> > > 
> >> > > I'm happy to adjust this patch if anything else needs attention.
> >> > > 
> >> > > From 9142d5a8b60038fa424a6033253960682e5a51f4 Mon Sep 17 00:00:00 2001
> >> > > From: Dmitry Ilvokhin <d@ilvokhin.com>
> >> > > Date: Tue, 3 Mar 2026 06:13:13 -0800
> >> > > Subject: [PATCH] mm: fix remaining zone->lock references
> >> > > 
> >> > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> >> > > ---
> >> > >  Documentation/mm/physical_memory.rst | 4 ++--
> >> > >  Documentation/trace/events-kmem.rst  | 8 ++++----
> >> > >  2 files changed, 6 insertions(+), 6 deletions(-)
> >> > > 
> >> > > diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst
> >> > > index b76183545e5b..e344f93515b6 100644
> >> > > --- a/Documentation/mm/physical_memory.rst
> >> > > +++ b/Documentation/mm/physical_memory.rst
> >> > > @@ -500,11 +500,11 @@ General
> >> > >  ``nr_isolate_pageblock``
> >> > >    Number of isolated pageblocks. It is used to solve incorrect freepage counting
> >> > >    problem due to racy retrieving migratetype of pageblock. Protected by
> >> > > -  ``zone->lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
> >> > > +  ``zone_lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
> >> > 
> >> > Dmitry's original patch [1] was doing 's/zone->lock/zone->_lock/', which aligns
> >> > to my expectation.  But this patch is doing 's/zone->lock/zone_lock/'.  Same
> >> > for the rest of this patch.
> >> > 
> >> > I was initially thinking this is just a mistake, but I also found Andrew is
> >> > doing the same change [2], so I'm a bit confused.  Is this an intentional change?
> >> > 
> >> > [1] https://lore.kernel.org/d61500c5784c64e971f4d328c57639303c475f81.1772206930.git.d@ilvokhin.com
> >> > [2] https://lore.kernel.org/20260302143743.220eed4feb36d7572fe726cc@linux-foundation.org
> >> > 
> >> 
> >> Good catch, thanks for pointing this out, SJ.
> >> 
> >> Originally the mechanical rename was indeed zone->lock -> zone->_lock.
> >> However, in Documentation I intentionally switched references to
> >> zone_lock instead of zone->_lock. The reasoning is that _lock is now an
> >> internal implementation detail, and direct access is discouraged. The
> >> intended interface is via the zone_lock_*() / zone_unlock_*() wrappers,
> >> so referencing zone_lock in documentation felt more appropriate than
> >> mentioning the private struct field (zone->_lock).
> > 
> > Thank you for this nice and kind clarification, Dmitry!  I agree mentioning
> > zone_[un]lock_*() helpers instead of the hidden member (zone->_lock) can be
> > better.
> > 
> > But, I'm concerned that people like me might not be aware of the intention behind
> > 'zone_lock'.  If there is a well-known convention that allows people to know it
> > is for 'zone_[un]lock_*()' helpers, making it more clear would be nice, in my
> > humble opinion.  If there is such a convention but I'm just missing it, please
> > ignore.  If I'm not, for example,
> > 
> > "protected by ``zone->lock``" could be rewritten to
> > "protected by ``zone_[un]lock_*()`` locking helpers" or,
> > "protected by zone lock helper functions (``zone_[un]lock_*()``)" ?
> > 
> >> 
> >> That said, I agree this creates inconsistency with the mechanical
> >> rename, and I'm happy to adjust either way: either consistently refer
> >> to the wrapper API, or keep documentation aligned with zone->_lock.
> >> 
> >> I slightly prefer referring to the wrapper API, but don't have a strong
> >> preference as long as we're consistent.
> > 
> > I also think both approaches are good.  But for the wrapper approach, I think
> > giving readers more context than just ``zone_lock`` would be nice.
> 
> Grep tells me that we also have comments mentioning simply "zone lock", btw.
> And it's also a term used often in informal conversations. Maybe we could
> just standardize on that in comments/documentations as it's easier to read.
> Discovering that the field is called _lock and that wrappers should be used,
> is hopefully not that difficult.

Thanks for the suggestion, Vlastimil. That sounds reasonable to me as
well. I'll update the comments and documentation to consistently use
"zone lock".

^ permalink raw reply

* Re: [PATCH v2] tracefs: Use dentry name snapshots instead of heap allocation
From: Steven Rostedt @ 2026-03-05 18:12 UTC (permalink / raw)
  To: AnishMulay
  Cc: viro, mhiramat, mathieu.desnoyers, linux-trace-kernel,
	linux-kernel
In-Reply-To: <20260227211505.226643-1-anishm7030@gmail.com>

On Fri, 27 Feb 2026 16:15:05 -0500
AnishMulay <anishm7030@gmail.com> wrote:

> In fs/tracefs/inode.c, tracefs_syscall_mkdir() and tracefs_syscall_rmdir()
> previously used a local helper, get_dname(), which allocated a temporary
> buffer on the heap via kmalloc() to hold the dentry name. This introduced
> unnecessary overhead, an ENOMEM failure path, and required manual memory
> cleanup via kfree().
> 
> As suggested by Al Viro, replace this heap allocation with the VFS dentry
> name snapshot API. By stack-allocating a `struct name_snapshot` and using
> take_dentry_name_snapshot() and release_dentry_name_snapshot(), we safely
> capture the dentry name locklessly, eliminate the heap allocation entirely,
> and remove the now-obsolete error handling paths. The get_dname() helper
> is completely removed.
> 
> Testing:
> Booted a custom kernel natively in virtme-ng (ARM64). Triggered tracefs
> inode and dentry allocation by creating and removing a custom directory
> under a temporary tracefs mount. Verified that the instance is created
> successfully and that no memory errors or warnings are emitted in dmesg.
> 
> Signed-off-by: AnishMulay <anishm7030@gmail.com>
> ---

It's nice to add version change history below the "---" with a link to the
previous patch:

  Changes since v1: https://lore.kernel.org/linux-trace-kernel/20260227194453.213095-1-anishm7030@gmail.com/

  - Use the helper functions take/release_dentry_name_snapshot() instead of
    allocating the name. (Al Viro)

When I pull in a patch, my scripts add a link to the patch itself, so
having that patch link to the previous version is always helpful.

(This email serves that purpose for this patch)

-- Steve

^ permalink raw reply

* Re: [PATCH RFC 3/3] locking: Wire up contended_release tracepoint
From: Steven Rostedt @ 2026-03-05 18:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Ilvokhin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Masami Hiramatsu, Mathieu Desnoyers, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, linux-mm, linux-kernel,
	linux-trace-kernel, kernel-team
In-Reply-To: <20260305174223.GA1442992@noisy.programming.kicks-ass.net>

On Thu, 5 Mar 2026 18:42:23 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> I still wish you would accept:
> 
> 	if (trace_foo_enabled() && foo)
> 		__do_trace_foo();
> 
> The compilers can't optimize the static branches and thus you'll get it
> twice for no reason.
> 
> I really wish they would just accept __pure, but alas.

Makes sense, and that could probably be done. It shouldn't be too hard to
do. If I find some time I could look at it, or perhaps someone lurking on
this thread could possibly give it a try! (I may even Cc some people that
want to learn this code).

-- Steve

^ permalink raw reply

* Re: [PATCH v5 2/3] ring-buffer: Handle RB_MISSED_* flags on commit field correctly
From: Steven Rostedt @ 2026-03-05 18:03 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel
In-Reply-To: <177211312362.419230.15156461178245984273.stgit@mhiramat.tok.corp.google.com>

On Thu, 26 Feb 2026 22:38:43 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Since the MSBs of rb_data_page::commit are used for storing
> RB_MISSED_EVENTS and RB_MISSED_STORED, we need to mask out those bits
> when it is used for finding the size of data pages.
> 
> Fixes: 5f3b6e839f3c ("ring-buffer: Validate boot range memory events")
> Fixes: 5b7be9c709e1 ("ring-buffer: Add test to validate the time stamp deltas")
> Cc: stable@vger.kernel.org

This is unneeded for the current way things work.

The missed events flags are added when a page is read, so the commits in
the write buffer should never have those flags set. If they did, the ring
buffer code itself would break.

But as patch 3 is adding a flag, you should likely merge this and patch 3
together, as the only way that flag would get set is if the validator set
it on a previous boot. And then this would be needed for subsequent boots
that did not reset the buffer.

Hmm, I don't think we even need to do that! Because if it is set, it would
simply warn again that a page is invalid, and I think we *want* that! It
would preserve the fact that pages were invalid, instead of having it
cleared by a simple reboot.
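
For readers following along, the masking that the quoted patch describes
can be sketched in plain C as below. This is only an illustration: the
SKETCH_* names and the bit positions (31 and 30) are assumptions made for
the example, not taken from the patch, and sketch_commit_size() is a
hypothetical helper rather than a ring-buffer function.

```c
/* Illustrative flag bits kept in the top of a "commit" word (assumed layout). */
#define SKETCH_MISSED_EVENTS	(1UL << 31)
#define SKETCH_MISSED_STORED	(1UL << 30)
#define SKETCH_MISSED_MASK	(SKETCH_MISSED_EVENTS | SKETCH_MISSED_STORED)

/* Strip the flag bits so that only the data size of the page remains. */
static unsigned long sketch_commit_size(unsigned long commit)
{
	return commit & ~SKETCH_MISSED_MASK;
}

/* Report whether the "missed events" flag is set in the commit word. */
static int sketch_missed_events(unsigned long commit)
{
	return (commit & SKETCH_MISSED_EVENTS) != 0;
}
```

Without the mask, a commit word that carries a flag would be read as a
size far larger than the page, which is the case the patch is guarding
against.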

-- Steve

^ permalink raw reply

* Re: [PATCH RFC 3/3] locking: Wire up contended_release tracepoint
From: Peter Zijlstra @ 2026-03-05 17:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dmitry Ilvokhin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Masami Hiramatsu, Mathieu Desnoyers, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, linux-mm, linux-kernel,
	linux-trace-kernel, kernel-team
In-Reply-To: <20260305105924.7069eb23@gandalf.local.home>

On Thu, Mar 05, 2026 at 10:59:24AM -0500, Steven Rostedt wrote:
> On Wed,  4 Mar 2026 16:56:17 +0000
> Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> 
> > @@ -204,6 +206,8 @@ static inline void rwbase_write_unlock(struct rwbase_rt *rwb)
> >  	unsigned long flags;
> >  
> >  	raw_spin_lock_irqsave(&rtm->wait_lock, flags);
> > +	if (rt_mutex_has_waiters(rtm))
> > +		trace_contended_release(rwb);
> 
> Hmm, if statements should never be used just for tracepoints without a
> static branch. The above should be:
> 
> 	if (trace_contended_release_enabled() && rt_mutex_has_waiters(rtm))
> 		trace_contended_release(rwb);
> 

I still wish you would accept:

	if (trace_foo_enabled() && foo)
		__do_trace_foo();

The compilers can't optimize the static branches and thus you'll get it
twice for no reason.

I really wish they would just accept __pure, but alas.
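
A rough userspace sketch of the double test being described, assuming the
full tracepoint call repeats the enabled check internally. A plain bool
stands in for the static key, and every sketch_* name is hypothetical,
not a kernel symbol.

```c
#include <stdbool.h>

static bool foo_key;	/* stand-in for the tracepoint's static key */
static int key_tests;	/* how many times the key was tested */
static int events;	/* how many events were emitted */

static bool sketch_foo_enabled(void)
{
	key_tests++;
	return foo_key;
}

/* Tracepoint body only: no enabled check of its own. */
static void sketch_do_trace_foo(void)
{
	events++;
}

/* Full tracepoint: tests the key itself before running the body. */
static void sketch_trace_foo(void)
{
	if (sketch_foo_enabled())
		sketch_do_trace_foo();
}

/* Guard plus full tracepoint: the key is tested twice per event. */
static void caller_double(bool foo)
{
	if (sketch_foo_enabled() && foo)
		sketch_trace_foo();
}

/* Guard plus body only: the key is tested once per event. */
static void caller_single(bool foo)
{
	if (sketch_foo_enabled() && foo)
		sketch_do_trace_foo();
}
```

With the key on, caller_double() tests it twice to emit one event, while
caller_single() tests it once.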

^ permalink raw reply

* Re: [PATCH v8 3/6] tracefs: Check file permission even if user has CAP_DAC_OVERRIDE
From: Steven Rostedt @ 2026-03-05 17:33 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel
In-Reply-To: <20260212151515.b384ac24de9b736d10387d21@kernel.org>

On Thu, 12 Feb 2026 15:15:15 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> > With this still not working this late in the game, it will need to wait
> > until the next merge window. I'll take the first two patches of this
> > series now though.  
> 
> OK. I will send the next version without the first 2 patches.

Hi Masami,

Did you send a new version of this yet?

-- Steve

^ permalink raw reply

* Re: [PATCH] tracing: Do not call trigger_data_free with NULL data pointer
From: Guenter Roeck @ 2026-03-05 16:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, Miaoqian Lin
In-Reply-To: <20260305114336.0f2710eb@gandalf.local.home>

On Thu, Mar 05, 2026 at 11:43:36AM -0500, Steven Rostedt wrote:
> On Thu,  5 Mar 2026 08:36:33 -0800
> Guenter Roeck <linux@roeck-us.net> wrote:
> 
> > If trigger_data_alloc() fails and returns NULL, event_hist_trigger_parse()
> > jumps to the out_free error path. While kfree() safely handles a NULL
> > pointer, trigger_data_free() does not. This causes a NULL pointer
> > dereference in trigger_data_free() when evaluating
> > data->cmd_ops->set_filter.
> > 
> > Fix the problem by adding a new goto label and jumping to it if
> > trigger_data_alloc() returns NULL.
> > 
> > The problem was found by an experimental code review agent based on
> > gemini-3.1-pro while reviewing backports into v6.18.y.
> > 
> > Assisted-by: Gemini:gemini-3.1-pro
> > Cc: Miaoqian Lin <linmq006@gmail.com>
> > Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> > Fixes: 0550069cc25f ("tracing: Properly process error handling in event_hist_trigger_parse()")
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> >  kernel/trace/trace_events_hist.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> > index 73ea180cad55..a2abdfe19281 100644
> > --- a/kernel/trace/trace_events_hist.c
> > +++ b/kernel/trace/trace_events_hist.c
> > @@ -6874,7 +6874,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
> >  	trigger_data = trigger_data_alloc(cmd_ops, cmd, param, hist_data);
> >  	if (!trigger_data) {
> >  		ret = -ENOMEM;
> > -		goto out_free;
> > +		goto out_destroy;
> >  	}
> >  
> >  	ret = event_trigger_set_filter(cmd_ops, file, filter, trigger_data);
> > @@ -6942,7 +6942,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
> >  	remove_hist_vars(hist_data);
> >  
> >  	trigger_data_free(trigger_data);
> 
> I'd rather make trigger_data_free() more robust by starting it with:
> 
> 	if (!data)
> 		return;

Sure. No preference on my side. I'll send v2.

Thanks,
Guenter

^ permalink raw reply

* Re: [PATCH] tracing: Do not call trigger_data_free with NULL data pointer
From: Steven Rostedt @ 2026-03-05 16:43 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, Miaoqian Lin
In-Reply-To: <20260305163633.2782210-1-linux@roeck-us.net>

On Thu,  5 Mar 2026 08:36:33 -0800
Guenter Roeck <linux@roeck-us.net> wrote:

> If trigger_data_alloc() fails and returns NULL, event_hist_trigger_parse()
> jumps to the out_free error path. While kfree() safely handles a NULL
> pointer, trigger_data_free() does not. This causes a NULL pointer
> dereference in trigger_data_free() when evaluating
> data->cmd_ops->set_filter.
> 
> Fix the problem by adding a new goto label and jumping to it if
> trigger_data_alloc() returns NULL.
> 
> The problem was found by an experimental code review agent based on
> gemini-3.1-pro while reviewing backports into v6.18.y.
> 
> Assisted-by: Gemini:gemini-3.1-pro
> Cc: Miaoqian Lin <linmq006@gmail.com>
> Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> Fixes: 0550069cc25f ("tracing: Properly process error handling in event_hist_trigger_parse()")
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  kernel/trace/trace_events_hist.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 73ea180cad55..a2abdfe19281 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -6874,7 +6874,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
>  	trigger_data = trigger_data_alloc(cmd_ops, cmd, param, hist_data);
>  	if (!trigger_data) {
>  		ret = -ENOMEM;
> -		goto out_free;
> +		goto out_destroy;
>  	}
>  
>  	ret = event_trigger_set_filter(cmd_ops, file, filter, trigger_data);
> @@ -6942,7 +6942,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
>  	remove_hist_vars(hist_data);
>  
>  	trigger_data_free(trigger_data);

I'd rather make trigger_data_free() more robust by starting it with:

	if (!data)
		return;

-- Steve

> -
> +out_destroy:
>  	destroy_hist_data(hist_data);
>  	goto out;
>  }

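The suggestion above, a free routine that tolerates NULL the way kfree()
does, can be sketched in userspace C as follows. The struct layout and
all sketch_* names are assumptions for illustration only; the real
trigger_data_free() does more work than this.

```c
#include <stdlib.h>

struct sketch_data;

struct sketch_ops {
	void (*set_filter)(struct sketch_data *data);
};

struct sketch_data {
	const struct sketch_ops *ops;
};

static int filter_calls;	/* counts how often the filter hook ran */

static void sketch_clear_filter(struct sketch_data *data)
{
	(void)data;
	filter_calls++;
}

static const struct sketch_ops sketch_default_ops = {
	.set_filter = sketch_clear_filter,
};

/* May return NULL, like any allocator. */
static struct sketch_data *sketch_data_alloc(void)
{
	struct sketch_data *data = malloc(sizeof(*data));

	if (data)
		data->ops = &sketch_default_ops;
	return data;
}

/* Safe to call with NULL, so error paths need no separate label. */
static void sketch_data_free(struct sketch_data *data)
{
	if (!data)
		return;

	if (data->ops->set_filter)
		data->ops->set_filter(data);
	free(data);
}
```

With the check inside the free routine, a caller's shared error path can
call it unconditionally, whether or not the allocation succeeded.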

^ permalink raw reply

* [PATCH] tracing: Do not call trigger_data_free with NULL data pointer
From: Guenter Roeck @ 2026-03-05 16:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, Guenter Roeck, Miaoqian Lin

If trigger_data_alloc() fails and returns NULL, event_hist_trigger_parse()
jumps to the out_free error path. While kfree() safely handles a NULL
pointer, trigger_data_free() does not. This causes a NULL pointer
dereference in trigger_data_free() when evaluating
data->cmd_ops->set_filter.

Fix the problem by adding a new goto label and jumping to it if
trigger_data_alloc() returns NULL.

The problem was found by an experimental code review agent based on
gemini-3.1-pro while reviewing backports into v6.18.y.

Assisted-by: Gemini:gemini-3.1-pro
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Fixes: 0550069cc25f ("tracing: Properly process error handling in event_hist_trigger_parse()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 kernel/trace/trace_events_hist.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 73ea180cad55..a2abdfe19281 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -6874,7 +6874,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
 	trigger_data = trigger_data_alloc(cmd_ops, cmd, param, hist_data);
 	if (!trigger_data) {
 		ret = -ENOMEM;
-		goto out_free;
+		goto out_destroy;
 	}
 
 	ret = event_trigger_set_filter(cmd_ops, file, filter, trigger_data);
@@ -6942,7 +6942,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
 	remove_hist_vars(hist_data);
 
 	trigger_data_free(trigger_data);
-
+out_destroy:
 	destroy_hist_data(hist_data);
 	goto out;
 }
-- 
2.45.2


^ permalink raw reply related

* Re: [PATCH v12 00/30] Tracefs support for pKVM
From: Steven Rostedt @ 2026-03-05 16:17 UTC (permalink / raw)
  To: Vincent Donnefort
  Cc: Marc Zyngier, mhiramat, mathieu.desnoyers, linux-trace-kernel,
	oliver.upton, joey.gouly, suzuki.poulose, yuzenghui, kvmarm,
	linux-arm-kernel, jstultz, qperret, will, aneesh.kumar,
	kernel-team, linux-kernel
In-Reply-To: <87cy20eeg6.wl-maz@kernel.org>

On Thu, 19 Feb 2026 19:11:21 +0000
Marc Zyngier <maz@kernel.org> wrote:

> > 
> > Then the arm/kvm folks could start with that branch and add the arm/KVM
> > portion of this series on top of it. This will prevent major merge
> > conflicts in linux-next.
> > 
> > Does that sound OK?  
> 
> That works. Just send us a link to the branch after -rc2 and we'll get
> that sorted.

Vincent,

There are a few minor conflicts between this series and v7.0-rc2, can you
rebase and send again?

Thanks,

-- Steve

^ permalink raw reply

* Re: [PATCH] tracing: Documentation: Update histogram-design.rst for fn() handling
From: Jonathan Corbet @ 2026-03-05 16:12 UTC (permalink / raw)
  To: Steven Rostedt, linux-doc
  Cc: LKML, Linux Trace Kernel, Masami Hiramatsu, Mathieu Desnoyers,
	Tom Zanussi
In-Reply-To: <20260305110347.31d6bae5@gandalf.local.home>

Steven Rostedt <rostedt@goodmis.org> writes:

> Hi Jon,
>
> Can you take this through your tree?

Sure, will do.

Thanks,

jon

^ permalink raw reply

* Re: [PATCH RFC 2/3] locking/percpu-rwsem: Extract __percpu_up_read_slowpath()
From: Peter Zijlstra @ 2026-03-05 16:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dmitry Ilvokhin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Masami Hiramatsu, Mathieu Desnoyers, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, linux-mm, linux-kernel,
	linux-trace-kernel, kernel-team, Christoph Hellwig
In-Reply-To: <20260305104703.2a1e8151@gandalf.local.home>

On Thu, Mar 05, 2026 at 10:47:03AM -0500, Steven Rostedt wrote:
> On Wed, 4 Mar 2026 23:02:23 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
> > > index c8cb010d655e..89506895365c 100644
> > > --- a/include/linux/percpu-rwsem.h
> > > +++ b/include/linux/percpu-rwsem.h
> > > @@ -107,6 +107,8 @@ static inline bool percpu_down_read_trylock(struct percpu_rw_semaphore *sem)
> > >  	return ret;
> > >  }
> > >  
> > > +void __percpu_up_read_slowpath(struct percpu_rw_semaphore *sem);
> > > +  
> > 
> > extern for consistency with all the other declarations in this header.
> 
> I wonder if a cleanup patch should be added to remove the "extern" from the
> other functions, as that tends to be the way things are going (hch just
> recommended it elsewhere).

Well, I rather like the extern. But yeah, I know hch does not agree.

> > 
> > s/_slowpath//, the corresponding down function also doesn't have
> > _slowpath on.
> > 
> > >  static inline void percpu_up_read(struct percpu_rw_semaphore *sem)
> > >  {
> > >  	rwsem_release(&sem->dep_map, _RET_IP_);  
> 
> And since "slowpath" is more descriptive (and used in the rtmutex code),
> should that be added too?

It already has __ prefix, no point in making the name even longer for no
real benefit.

^ permalink raw reply

* Re: [PATCH] tracing: Documentation: Update histogram-design.rst for fn() handling
From: Steven Rostedt @ 2026-03-05 16:03 UTC (permalink / raw)
  To: linux-doc, Jonathan Corbet
  Cc: LKML, Linux Trace Kernel, Masami Hiramatsu, Mathieu Desnoyers,
	Tom Zanussi
In-Reply-To: <20260126181742.03e8f0d5@gandalf.local.home>


Hi Jon,

Can you take this through your tree?

Thanks,

-- Steve


On Mon, 26 Jan 2026 18:17:42 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> 
> The histogram documentation describes the old method of the histogram
> triggers using the fn() field of the histogram field structure to process
> the field. But due to Spectre mitigation, the function pointer to handle
> the fields at runtime caused a noticeable overhead. It was converted over
> to a fn_num and hist_fn_call() is now used to call the specific functions
> for the fields via a switch statement based on the field's fn_num value.
> 
> Update the documentation to reflect this change.
> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  Documentation/trace/histogram-design.rst | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst
> index ae71b5bf97c6..e92f56ebd0b5 100644
> --- a/Documentation/trace/histogram-design.rst
> +++ b/Documentation/trace/histogram-design.rst
> @@ -69,7 +69,8 @@ So in this histogram, there's a separate bucket for each pid, and each
>  bucket contains a value for that bucket, counting the number of times
>  sched_waking was called for that pid.
>  
> -Each histogram is represented by a hist_data struct.
> +Each histogram is represented by a hist_data struct
> +(struct hist_trigger_data).
>  
>  To keep track of each key and value field in the histogram, hist_data
>  keeps an array of these fields named fields[].  The fields[] array is
> @@ -82,7 +83,7 @@ value or not, which the above histogram does not.
>  
>  Each struct hist_field contains a pointer to the ftrace_event_field
>  from the event's trace_event_file along with various bits related to
> -that such as the size, offset, type, and a hist_field_fn_t function,
> +that such as the size, offset, type, and a hist field function,
>  which is used to grab the field's data from the ftrace event buffer
>  (in most cases - some hist_fields such as hitcount don't directly map
>  to an event field in the trace buffer - in these cases the function
> @@ -241,28 +242,33 @@ it, event_hist_trigger() is called.  event_hist_trigger() first deals
>  with the key: for each subkey in the key (in the above example, there
>  is just one subkey corresponding to pid), the hist_field that
>  represents that subkey is retrieved from hist_data.fields[] and the
> -hist_field_fn_t fn() associated with that field, along with the
> +hist field function associated with that field, along with the
>  field's size and offset, is used to grab that subkey's data from the
>  current trace record.
>  
> +Note, the hist field function used to be a function pointer in the
> +hist_field structure. Due to spectre mitigation, it was converted into
> +a fn_num and hist_fn_call() is used to call the associated hist field
> +function that corresponds to the fn_num of the hist_field structure.
> +
>  Once the complete key has been retrieved, it's used to look that key
>  up in the tracing_map.  If there's no tracing_map_elt associated with
>  that key, an empty one is claimed and inserted in the map for the new
>  key.  In either case, the tracing_map_elt associated with that key is
>  returned.
>  
> -Once a tracing_map_elt available, hist_trigger_elt_update() is called.
> +Once a tracing_map_elt is available, hist_trigger_elt_update() is called.
>  As the name implies, this updates the element, which basically means
>  updating the element's fields.  There's a tracing_map_field associated
>  with each key and value in the histogram, and each of these correspond
>  to the key and value hist_fields created when the histogram was
>  created.  hist_trigger_elt_update() goes through each value hist_field
> -and, as for the keys, uses the hist_field's fn() and size and offset
> +and, as for the keys, uses the hist_field's function and size and offset
>  to grab the field's value from the current trace record.  Once it has
>  that value, it simply adds that value to that field's
>  continually-updated tracing_map_field.sum member.  Some hist_field
> -fn()s, such as for the hitcount, don't actually grab anything from the
> -trace record (the hitcount fn() just increments the counter sum by 1),
> +functions, such as for the hitcount, don't actually grab anything from the
> +trace record (the hitcount function just increments the counter sum by 1),
>  but the idea is the same.
>  
>  Once all the values have been updated, hist_trigger_elt_update() is


^ permalink raw reply

* Re: [PATCH RFC 3/3] locking: Wire up contended_release tracepoint
From: Steven Rostedt @ 2026-03-05 15:59 UTC (permalink / raw)
  To: Dmitry Ilvokhin
  Cc: Dennis Zhou, Tejun Heo, Christoph Lameter, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, linux-mm, linux-kernel,
	linux-trace-kernel, kernel-team
In-Reply-To: <8298e098d3418cb446ef396f119edac58a3414e9.1772642407.git.d@ilvokhin.com>

On Wed,  4 Mar 2026 16:56:17 +0000
Dmitry Ilvokhin <d@ilvokhin.com> wrote:

> @@ -204,6 +206,8 @@ static inline void rwbase_write_unlock(struct rwbase_rt *rwb)
>  	unsigned long flags;
>  
>  	raw_spin_lock_irqsave(&rtm->wait_lock, flags);
> +	if (rt_mutex_has_waiters(rtm))
> +		trace_contended_release(rwb);

Hmm, if statements should never be used just for tracepoints without a
static branch. The above should be:

	if (trace_contended_release_enabled() && rt_mutex_has_waiters(rtm))
		trace_contended_release(rwb);

The above "trace_contended_release_enabled()" is a static_branch that
turns the if statement into a nop when the tracepoint is not enabled, and
a jmp when it is.
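
As a rough userspace illustration of the ordering (not kernel code): a
plain bool stands in for the static key here, so this only shows the
short-circuit effect, not the nop/jmp patching. All sketch_* names and
the flag are hypothetical.

```c
#include <stdbool.h>

static bool contended_release_on;	/* stand-in for the static key */
static int waiter_checks;		/* times the waiter condition ran */
static int events;			/* trace events emitted */

static bool sketch_trace_enabled(void)
{
	return contended_release_on;
}

static bool sketch_has_waiters(void)
{
	waiter_checks++;
	return true;
}

static void sketch_trace(void)
{
	if (contended_release_on)
		events++;
}

/* Unguarded: the waiter condition is evaluated even with tracing off. */
static void unlock_unguarded(void)
{
	if (sketch_has_waiters())
		sketch_trace();
}

/* Guarded: && short-circuits, so the condition is skipped when off. */
static void unlock_guarded(void)
{
	if (sketch_trace_enabled() && sketch_has_waiters())
		sketch_trace();
}
```

With tracing off, unlock_unguarded() still pays for the waiter check,
while unlock_guarded() does not.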


>  	__rwbase_write_unlock(rwb, WRITER_BIAS, flags);
>  }
>  
> @@ -213,6 +217,8 @@ static inline void rwbase_write_downgrade(struct rwbase_rt *rwb)
>  	unsigned long flags;
>  
>  	raw_spin_lock_irqsave(&rtm->wait_lock, flags);
> +	if (rt_mutex_has_waiters(rtm))
> +		trace_contended_release(rwb);

Same here.

-- Steve

>  	/* Release it and account current as reader */
>  	__rwbase_write_unlock(rwb, WRITER_BIAS - 1, flags);
>  }
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 24df4d98f7d2..4e61dc0bb045 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -1360,6 +1360,7 @@ static inline void __up_read(struct rw_semaphore *sem)
>  	if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
>  		      RWSEM_FLAG_WAITERS)) {
>  		clear_nonspinnable(sem);
> +		trace_contended_release(sem);
>  		rwsem_wake(sem);
>  	}
>  	preempt_enable();
> @@ -1383,8 +1384,10 @@ static inline void __up_write(struct rw_semaphore *sem)
>  	preempt_disable();
>  	rwsem_clear_owner(sem);
>  	tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
> -	if (unlikely(tmp & RWSEM_FLAG_WAITERS))
> +	if (unlikely(tmp & RWSEM_FLAG_WAITERS)) {
> +		trace_contended_release(sem);
>  		rwsem_wake(sem);
> +	}
>  	preempt_enable();
>  }
>  
> @@ -1407,8 +1410,10 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
>  	tmp = atomic_long_fetch_add_release(
>  		-RWSEM_WRITER_LOCKED+RWSEM_READER_BIAS, &sem->count);
>  	rwsem_set_reader_owned(sem);
> -	if (tmp & RWSEM_FLAG_WAITERS)
> +	if (tmp & RWSEM_FLAG_WAITERS) {
> +		trace_contended_release(sem);
>  		rwsem_downgrade_wake(sem);
> +	}
>  	preempt_enable();
>  }
>  
> diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
> index 3ef032e22f7e..3cef5ba88f7e 100644
> --- a/kernel/locking/semaphore.c
> +++ b/kernel/locking/semaphore.c
> @@ -231,8 +231,10 @@ void __sched up(struct semaphore *sem)
>  	else
>  		__up(sem, &wake_q);
>  	raw_spin_unlock_irqrestore(&sem->lock, flags);
> -	if (!wake_q_empty(&wake_q))
> +	if (!wake_q_empty(&wake_q)) {
> +		trace_contended_release(sem);
>  		wake_up_q(&wake_q);
> +	}
>  }
>  EXPORT_SYMBOL(up);
>  


^ permalink raw reply

* Re: [PATCH RFC 2/3] locking/percpu-rwsem: Extract __percpu_up_read_slowpath()
From: Steven Rostedt @ 2026-03-05 15:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Ilvokhin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Masami Hiramatsu, Mathieu Desnoyers, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, linux-mm, linux-kernel,
	linux-trace-kernel, kernel-team, Christoph Hellwig
In-Reply-To: <20260304220223.GS606826@noisy.programming.kicks-ass.net>

On Wed, 4 Mar 2026 23:02:23 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> > diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
> > index c8cb010d655e..89506895365c 100644
> > --- a/include/linux/percpu-rwsem.h
> > +++ b/include/linux/percpu-rwsem.h
> > @@ -107,6 +107,8 @@ static inline bool percpu_down_read_trylock(struct percpu_rw_semaphore *sem)
> >  	return ret;
> >  }
> >  
> > +void __percpu_up_read_slowpath(struct percpu_rw_semaphore *sem);
> > +  
> 
> extern for consistency with all the other declarations in this header.

I wonder if a cleanup patch should be added to remove the "extern" from the
other functions, as that tends to be the way things are going (hch just
recommended it elsewhere).

> 
> s/_slowpath//, the corresponding down function also doesn't have
> _slowpath on.
> 
> >  static inline void percpu_up_read(struct percpu_rw_semaphore *sem)
> >  {
> >  	rwsem_release(&sem->dep_map, _RET_IP_);  

And since "slowpath" is more descriptive (and used in the rtmutex code),
should that be added too?

-- Steve

^ permalink raw reply

* Re: [RFC PATCH] tracing: Revert "tracing: Remove pid in task_rename tracing output"
From: Steven Rostedt @ 2026-03-05 15:08 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: mhiramat, mathieu.desnoyers, elver, kees, lorenzo.stoakes,
	brauner, schuster.simon, david, linux-kernel, linux-trace-kernel,
	guohua.yan, ke.wang, xuewen.yan94
In-Reply-To: <20260305080809.3245-1-xuewen.yan@unisoc.com>

On Thu, 5 Mar 2026 16:08:09 +0800
Xuewen Yan <xuewen.yan@unisoc.com> wrote:

> This reverts commit e3f6a42272e028c46695acc83fc7d7c42f2750ad.
> 
> The commit says that the tracepoint only deals with the current task,
> however the following case is not current task:
> 
> comm_write
> set_task_comm
> __set_task_comm
> trace_task_rename

I'm fine with the patch, but the change log can use a bit of work:

  comm_write() {
	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;

	if (same_thread_group(current, p)) {
		set_task_comm(p, buffer);

  where set_task_comm() calls __set_task_comm() which records the update of
  p and not current.

The above states exactly why current isn't the pid that should be recorded.

Other than that.

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


> 
> So revert the patch to show pid.
> 
> Fixes: e3f6a42272e0 ("tracing: Remove pid in task_rename tracing output")
> Reported-by: Guohua Yan <guohua.yan@unisoc.com>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  include/trace/events/task.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/trace/events/task.h b/include/trace/events/task.h
> index 4f0759634306..b9a129eb54d9 100644
> --- a/include/trace/events/task.h
> +++ b/include/trace/events/task.h
> @@ -38,19 +38,22 @@ TRACE_EVENT(task_rename,
>  	TP_ARGS(task, comm),
>  
>  	TP_STRUCT__entry(
> +		__field(	pid_t,	pid)
>  		__array(	char, oldcomm,  TASK_COMM_LEN)
>  		__array(	char, newcomm,  TASK_COMM_LEN)
>  		__field(	short,	oom_score_adj)
>  	),
>  
>  	TP_fast_assign(
> +		__entry->pid = task->pid;
>  		memcpy(entry->oldcomm, task->comm, TASK_COMM_LEN);
>  		strscpy(entry->newcomm, comm, TASK_COMM_LEN);
>  		__entry->oom_score_adj = task->signal->oom_score_adj;
>  	),
>  
> -	TP_printk("oldcomm=%s newcomm=%s oom_score_adj=%hd",
> -		  __entry->oldcomm, __entry->newcomm, __entry->oom_score_adj)
> +	TP_printk("pid=%d oldcomm=%s newcomm=%s oom_score_adj=%hd",
> +		__entry->pid, __entry->oldcomm,
> +		__entry->newcomm, __entry->oom_score_adj)
>  );
>  
>  /**


^ permalink raw reply

* Re: [PATCH v4 4/5] mm: rename zone->lock to zone->_lock
From: SeongJae Park @ 2026-03-05 14:55 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: SeongJae Park, Dmitry Ilvokhin, Andrew Morton, David Hildenbrand,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Rafael J. Wysocki, Pavel Machek, Len Brown, Brendan Jackman,
	Johannes Weiner, Zi Yan, Oscar Salvador, Qi Zheng, Shakeel Butt,
	linux-kernel, linux-mm, linux-trace-kernel, linux-pm
In-Reply-To: <ebd994ca-eb04-4dff-a0a8-47aef0934c2c@kernel.org>

On Thu, 5 Mar 2026 10:27:07 +0100 "Vlastimil Babka (SUSE)" <vbabka@kernel.org> wrote:

> On 3/4/26 16:13, SeongJae Park wrote:
> > On Wed, 4 Mar 2026 13:01:45 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> > 
> >> On Tue, Mar 03, 2026 at 05:50:34PM -0800, SeongJae Park wrote:
> >> > On Tue, 3 Mar 2026 14:25:55 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> >> > 
> >> > > On Mon, Mar 02, 2026 at 02:37:43PM -0800, Andrew Morton wrote:
> >> > > > On Mon, 2 Mar 2026 15:10:03 +0100 "Vlastimil Babka (SUSE)" <vbabka@kernel.org> wrote:
> >> > > > 
> >> > > > > On 2/27/26 17:00, Dmitry Ilvokhin wrote:
> >> > > > > > This intentionally breaks direct users of zone->lock at compile time so
> >> > > > > > all call sites are converted to the zone lock wrappers. Without the
> >> > > > > > rename, present and future out-of-tree code could continue using
> >> > > > > > spin_lock(&zone->lock) and bypass the wrappers and tracing
> >> > > > > > infrastructure.
> >> > > > > > 
> >> > > > > > No functional change intended.
> >> > > > > > 
> >> > > > > > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> >> > > > > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> >> > > > > > Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> >> > > > > > Acked-by: SeongJae Park <sj@kernel.org>
> >> > > > > 
> >> > > > > I see some more instances of 'zone->lock' in comments in
> >> > > > > include/linux/mmzone.h and under Documentation/ but otherwise LGTM.
> >> > > > > 
> >> > > > 
> >> > > > I fixed (most of) that in the previous version but my fix was lost.
> >> > > 
> >> > > Thanks for the fixups, Andrew.
> >> > > 
> >> > > I still see a few 'zone->lock' references remaining in Documentation on
> >> > > mm-new. This patch cleans them up, as noted by Vlastimil.
> >> > > 
> >> > > I'm happy to adjust this patch if anything else needs attention.
> >> > > 
> >> > > From 9142d5a8b60038fa424a6033253960682e5a51f4 Mon Sep 17 00:00:00 2001
> >> > > From: Dmitry Ilvokhin <d@ilvokhin.com>
> >> > > Date: Tue, 3 Mar 2026 06:13:13 -0800
> >> > > Subject: [PATCH] mm: fix remaining zone->lock references
> >> > > 
> >> > > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> >> > > ---
> >> > >  Documentation/mm/physical_memory.rst | 4 ++--
> >> > >  Documentation/trace/events-kmem.rst  | 8 ++++----
> >> > >  2 files changed, 6 insertions(+), 6 deletions(-)
> >> > > 
> >> > > diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst
> >> > > index b76183545e5b..e344f93515b6 100644
> >> > > --- a/Documentation/mm/physical_memory.rst
> >> > > +++ b/Documentation/mm/physical_memory.rst
> >> > > @@ -500,11 +500,11 @@ General
> >> > >  ``nr_isolate_pageblock``
> >> > >    Number of isolated pageblocks. It is used to solve incorrect freepage counting
> >> > >    problem due to racy retrieving migratetype of pageblock. Protected by
> >> > > -  ``zone->lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
> >> > > +  ``zone_lock``. Defined only when ``CONFIG_MEMORY_ISOLATION`` is enabled.
> >> > 
> >> > Dmitry's original patch [1] was doing 's/zone->lock/zone->_lock/', which aligns
> >> > with my expectation.  But this patch is doing 's/zone->lock/zone_lock/'.  The
> >> > same goes for the rest of this patch.
> >> > 
> >> > I initially thought this was just a mistake, but I also found Andrew is
> >> > doing the same change [2], so I'm a bit confused.  Is this an intentional change?
> >> > 
> >> > [1] https://lore.kernel.org/d61500c5784c64e971f4d328c57639303c475f81.1772206930.git.d@ilvokhin.com
> >> > [2] https://lore.kernel.org/20260302143743.220eed4feb36d7572fe726cc@linux-foundation.org
> >> > 
> >> 
> >> Good catch, thanks for pointing this out, SJ.
> >> 
> >> Originally the mechanical rename was indeed zone->lock -> zone->_lock.
> >> However, in Documentation I intentionally switched references to
> >> zone_lock instead of zone->_lock. The reasoning is that _lock is now an
> >> internal implementation detail, and direct access is discouraged. The
> >> intended interface is via the zone_lock_*() / zone_unlock_*() wrappers,
> >> so referencing zone_lock in documentation felt more appropriate than
> >> mentioning the private struct field (zone->_lock).
> > 
> > Thank you for this nice and kind clarification, Dmitry!  I agree mentioning
> > zone_[un]lock_*() helpers instead of the hidden member (zone->_lock) can be
> > better.
> > 
> > But I'm concerned that people like me might not be aware of the intention behind
> > 'zone_lock'.  Unless there is a well-known convention that tells people it
> > refers to the 'zone_[un]lock_*()' helpers, making it clearer would be nice, in my
> > humble opinion.  If there is such a convention and I'm just missing it, please
> > ignore this.  If there is not, for example,
> > 
> > "protected by ``zone->lock``" could be rewritten as
> > "protected by ``zone_[un]lock_*()`` locking helpers" or,
> > "protected by zone lock helper functions (``zone_[un]lock_*()``)" ?
> > 
> >> 
> >> That said, I agree this creates inconsistency with the mechanical
> >> rename, and I'm happy to adjust either way: either consistently refer
> >> to the wrapper API, or keep documentation aligned with zone->_lock.
> >> 
> >> I slightly prefer referring to the wrapper API, but don't have a strong
> >> preference as long as we're consistent.
> > 
> > I also think both approaches are good.  But for the wrapper approach, I think
> > giving readers more context than just ``zone_lock`` would be nice.
> 
> Grep tells me that we also have comments mentioning simply "zone lock", btw.
> And it's also a term often used in informal conversations. Maybe we could
> just standardize on that in comments/documentation as it's easier to read.
> Discovering that the field is called _lock and that wrappers should be used
> is hopefully not that difficult.

Sounds good, that also works for me.


Thanks,
SJ
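
For readers following the thread, the idea behind the rename can be sketched
in user space as below. This is only an illustration under stated
assumptions: struct demo_zone, demo_zone_lock(), demo_zone_unlock() and the
lock_events counter are hypothetical names made up for this sketch, and a C11
atomic_flag stands in for the kernel spinlock. It is not the kernel's
struct zone or its zone lock wrappers.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct zone; NOT the kernel definition. */
struct demo_zone {
	atomic_flag _lock;                  /* private: go through the wrappers */
	unsigned long lock_events;          /* stand-in for a tracepoint counter */
	unsigned long nr_isolate_pageblock; /* example field protected by the zone lock */
};

#define DEMO_ZONE_INIT { ._lock = ATOMIC_FLAG_INIT }

/* All lockers funnel through here, so instrumentation has a single home. */
static void demo_zone_lock(struct demo_zone *zone)
{
	/* Spin until the flag was previously clear, i.e. we took the lock. */
	while (atomic_flag_test_and_set_explicit(&zone->_lock,
						 memory_order_acquire))
		;
	zone->lock_events++;	/* updated while holding the lock */
}

static void demo_zone_unlock(struct demo_zone *zone)
{
	atomic_flag_clear_explicit(&zone->_lock, memory_order_release);
}

/* Example caller: never touches zone->_lock directly. */
static unsigned long demo_isolate_pageblock(struct demo_zone *zone)
{
	unsigned long nr;

	demo_zone_lock(zone);
	nr = ++zone->nr_isolate_pageblock;
	demo_zone_unlock(zone);
	return nr;
}
```

In this sketch, any code still written against a field named "lock" would
fail to compile once the member is called "_lock", which is the effect the
patch description in the quoted thread says it is after.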


* Re: [PATCH v3 12/12] treewide: change inode->i_ino from unsigned long to u64
From: Christoph Hellwig @ 2026-03-05 14:25 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Dan Williams, Eric Biggers,
	Theodore Y. Ts'o, Muchun Song, Oscar Salvador,
	David Hildenbrand, David Howells, Paulo Alcantara, Andreas Dilger,
	Jan Kara, Jaegeuk Kim, Chao Yu, Trond Myklebust, Anna Schumaker,
	Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Steve French, Ronnie Sahlberg, Shyam Prasad N, Bharath SM,
	Alexander Aring, Ryusuke Konishi, Viacheslav Dubeyko,
	Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	Christian Schoenebeck, David Sterba, Marc Dionne, Ian Kent,
	Luis de Bethencourt, Salah Triki, Tigran A. Aivazian,
	Ilya Dryomov, Alex Markuze, Jan Harkes, coda, Nicolas Pitre,
	Tyler Hicks, Amir Goldstein, Christoph Hellwig,
	John Paul Adrian Glaubitz, Yangtao Li, Mikulas Patocka,
	David Woodhouse, Richard Weinberger, Dave Kleikamp,
	Konstantin Komarov, Mark Fasheh, Joel Becker, Joseph Qi,
	Mike Marshall, Martin Brandenburg, Miklos Szeredi, Anders Larsen,
	Zhihao Cheng, Damien Le Moal, Naohiro Aota, Johannes Thumshirn,
	John Johansen, Paul Moore, James Morris, Serge E. Hallyn,
	Mimi Zohar, Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Fan Wu,
	Stephen Smalley, Ondrej Mosnacek, Casey Schaufler, Alex Deucher,
	Christian König, David Airlie, Simona Vetter, Sumit Semwal,
	Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni, Willem de Bruijn,
	David S. Miller, Jakub Kicinski, Simon Horman, Oleg Nesterov,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Darrick J. Wong,
	Martin Schiller, Eric Paris, Joerg Reuter, Marcel Holtmann,
	Johan Hedberg, Luiz Augusto von Dentz, Oliver Hartkopp,
	Marc Kleine-Budde, David Ahern, Neal Cardwell, Steffen Klassert,
	Herbert Xu, Remi Denis-Courmont, Marcelo Ricardo Leitner,
	Xin Long, Magnus Karlsson, Maciej Fijalkowski, Stanislav Fomichev,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, linux-fsdevel, linux-kernel, linux-trace-kernel,
	nvdimm, fsverity, linux-mm, netfs, linux-ext4, linux-f2fs-devel,
	linux-nfs, linux-cifs, samba-technical, linux-nilfs, v9fs,
	linux-afs, autofs, ceph-devel, codalist, ecryptfs, linux-mtd,
	jfs-discussion, ntfs3, ocfs2-devel, devel, linux-unionfs,
	apparmor, linux-security-module, linux-integrity, selinux,
	amd-gfx, dri-devel, linux-media, linaro-mm-sig, netdev,
	linux-perf-users, linux-fscrypt, linux-xfs, linux-hams, linux-x25,
	audit, linux-bluetooth, linux-can, linux-sctp, bpf
In-Reply-To: <20260304-iino-u64-v3-12-2257ad83d372@kernel.org>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v3 01/12] vfs: widen inode hash/lookup functions to u64
From: Christoph Hellwig @ 2026-03-05 14:24 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Dan Williams, Eric Biggers,
	Theodore Y. Ts'o, Muchun Song, Oscar Salvador,
	David Hildenbrand, David Howells, Paulo Alcantara, Andreas Dilger,
	Jan Kara, Jaegeuk Kim, Chao Yu, Trond Myklebust, Anna Schumaker,
	Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Steve French, Ronnie Sahlberg, Shyam Prasad N, Bharath SM,
	Alexander Aring, Ryusuke Konishi, Viacheslav Dubeyko,
	Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	Christian Schoenebeck, David Sterba, Marc Dionne, Ian Kent,
	Luis de Bethencourt, Salah Triki, Tigran A. Aivazian,
	Ilya Dryomov, Alex Markuze, Jan Harkes, coda, Nicolas Pitre,
	Tyler Hicks, Amir Goldstein, Christoph Hellwig,
	John Paul Adrian Glaubitz, Yangtao Li, Mikulas Patocka,
	David Woodhouse, Richard Weinberger, Dave Kleikamp,
	Konstantin Komarov, Mark Fasheh, Joel Becker, Joseph Qi,
	Mike Marshall, Martin Brandenburg, Miklos Szeredi, Anders Larsen,
	Zhihao Cheng, Damien Le Moal, Naohiro Aota, Johannes Thumshirn,
	John Johansen, Paul Moore, James Morris, Serge E. Hallyn,
	Mimi Zohar, Roberto Sassu, Dmitry Kasatkin, Eric Snowberg, Fan Wu,
	Stephen Smalley, Ondrej Mosnacek, Casey Schaufler, Alex Deucher,
	Christian König, David Airlie, Simona Vetter, Sumit Semwal,
	Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni, Willem de Bruijn,
	David S. Miller, Jakub Kicinski, Simon Horman, Oleg Nesterov,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Darrick J. Wong,
	Martin Schiller, Eric Paris, Joerg Reuter, Marcel Holtmann,
	Johan Hedberg, Luiz Augusto von Dentz, Oliver Hartkopp,
	Marc Kleine-Budde, David Ahern, Neal Cardwell, Steffen Klassert,
	Herbert Xu, Remi Denis-Courmont, Marcelo Ricardo Leitner,
	Xin Long, Magnus Karlsson, Maciej Fijalkowski, Stanislav Fomichev,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, linux-fsdevel, linux-kernel, linux-trace-kernel,
	nvdimm, fsverity, linux-mm, netfs, linux-ext4, linux-f2fs-devel,
	linux-nfs, linux-cifs, samba-technical, linux-nilfs, v9fs,
	linux-afs, autofs, ceph-devel, codalist, ecryptfs, linux-mtd,
	jfs-discussion, ntfs3, ocfs2-devel, devel, linux-unionfs,
	apparmor, linux-security-module, linux-integrity, selinux,
	amd-gfx, dri-devel, linux-media, linaro-mm-sig, netdev,
	linux-perf-users, linux-fscrypt, linux-xfs, linux-hams, linux-x25,
	audit, linux-bluetooth, linux-can, linux-sctp, bpf
In-Reply-To: <20260304-iino-u64-v3-1-2257ad83d372@kernel.org>

>  extern struct inode *ilookup5_nowait(struct super_block *sb,
> -		unsigned long hashval, int (*test)(struct inode *, void *),
> +		u64 hashval, int (*test)(struct inode *, void *),
>  		void *data, bool *isnew);
> -extern struct inode *ilookup5(struct super_block *sb, unsigned long hashval,
> +extern struct inode *ilookup5(struct super_block *sb, u64 hashval,
>  		int (*test)(struct inode *, void *), void *data);

...

Can you please drop all these pointless externs while you're at it?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
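
As background for the "pointless externs" remark: in C, a function
declaration at file scope has external linkage whether or not it is spelled
with extern, so the keyword adds nothing on a prototype. A minimal sketch,
assuming made-up function names (demo_lookup_with_extern() and
demo_lookup_without_extern() are hypothetical and not taken from the patch):

```c
#include <stdint.h>

/* Hypothetical prototypes; not the declarations from the quoted patch. */
extern int demo_lookup_with_extern(uint64_t hashval);
int demo_lookup_without_extern(uint64_t hashval);

/*
 * Redeclaring each one with the other spelling is accepted by the compiler,
 * because both forms declare the same function with external linkage.
 */
int demo_lookup_with_extern(uint64_t hashval);
extern int demo_lookup_without_extern(uint64_t hashval);

/* Trivial bodies so the sketch links: return the low byte of the hash. */
int demo_lookup_with_extern(uint64_t hashval)
{
	return (int)(hashval & 0xff);
}

int demo_lookup_without_extern(uint64_t hashval)
{
	return (int)(hashval & 0xff);
}
```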



