Linux Trace Kernel
* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
       [not found] <CY8PR11MB7134346A3E4BB28ECA28D6E989132@CY8PR11MB7134.namprd11.prod.outlook.com>
@ 2026-06-03 13:44 ` David Hildenbrand (Arm)
  2026-06-03 16:17   ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-03 13:44 UTC (permalink / raw)
  To: Zhuo, Qiuxu, rostedt@goodmis.org, mchehab+huawei@kernel.org,
	Luck, Tony, bp@alien8.de, akpm@linux-foundation.org,
	linmiaohe@huawei.com, xieyuanbin1@huawei.com
  Cc: Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org

On 6/3/26 15:11, Zhuo, Qiuxu wrote:
> Hi,
>
> Laiyi reported that the userspace rasdaemon fails to enable memory_failure_event
> on kernels >= v6.19.
>
> Kernel commit 31807483d395 ("mm/memory-failure: remove the selection of RAS"),
> merged in v6.19-rc1, moved the memory_failure_event tracepoint from the "ras"
> subsystem to "memory_failure".
>
> However, rasdaemon still tries to enable:
>
>     ras:memory_failure_event
>
> while on v6.19+ kernels, the tracepoint is:
>
>     memory_failure:memory_failure_event
>
> As a result, rasdaemon fails to start:
>
>     …
>     Can't write to set_event
>     Huh! something got wrong. Aborting.
>     …
>
> Reproducer:
>
>     rasdaemon --enable
>
> Could you please let me know whether the preferred solution is to revert the
> kernel change, or to update rasdaemon to support both tracepoint names for
> backward/forward compatibility?


Likely the latter. BPF [1] documents:

Q: Are tracepoints part of the stable ABI?
A: NO. Tracepoints are tied to internal implementation details hence they are
subject to change and can break with newer kernels. BPF programs need to change
accordingly when this happens.

The Kernel ABI document [3] explicitly doesn't list them AFAIKS.

There were previous discussions on the stability of tracepoints [2]; I don't know
what changed in the meantime. CCing Steve
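
For illustration only, a minimal sketch of what "support both tracepoint
names" could look like on the user-space side. This is not rasdaemon's actual
code: the function name find_event_system() and the candidate list are
hypothetical. It only assumes the tracefs layout events/<system>/<event> and
probes the candidates in order.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return the first subsystem in candidates[] under
 * which `event` exists as a directory in tracefs, or NULL if none does.
 * `tracing_root` would typically be /sys/kernel/tracing.
 */
const char *find_event_system(const char *tracing_root, const char *event,
                              const char *const candidates[], size_t n)
{
    char path[4096];
    struct stat st;

    for (size_t i = 0; i < n; i++) {
        int len = snprintf(path, sizeof(path), "%s/events/%s/%s",
                           tracing_root, candidates[i], event);
        if (len < 0 || (size_t)len >= sizeof(path))
            continue;   /* path did not fit, skip this candidate */
        if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
            return candidates[i];
    }
    return NULL;
}
```

A caller would pass { "ras", "memory_failure" } as candidates and then write
"<system>:memory_failure_event" to set_event using whichever name was found,
so the same binary works on kernels before and after the rename.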

[1] https://www.kernel.org/doc/html/latest/bpf/bpf_design_QA.html
[2] https://lwn.net/Articles/747256/
[3] https://www.kernel.org/doc/html/latest/admin-guide/abi.html

-- 
Cheers,

David


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 13:44 ` mm/memory-failure tracepoint change breaks userspace rasdaemon David Hildenbrand (Arm)
@ 2026-06-03 16:17   ` Steven Rostedt
  2026-06-03 16:19     ` Borislav Petkov
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2026-06-03 16:17 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Zhuo, Qiuxu, mchehab+huawei@kernel.org, Luck, Tony, bp@alien8.de,
	akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On Wed, 3 Jun 2026 15:44:54 +0200
"David Hildenbrand (Arm)" <david@kernel.org> wrote:

> Likely the latter. BPF [1] documents:
> 
> Q: Are tracepoints part of the stable ABI?
> A: NO. Tracepoints are tied to internal implementation details hence they are
> subject to change and can break with newer kernels. BPF programs need to change
> accordingly when this happens.
> 
> The Kernel ABI document explicitly doesn't list them AFAIKS.
> 
> There were previous discussions on the stability of tracepoints [2], I don't know
> what changed in the meantime. CCing Steve
> 
> [1] https://www.kernel.org/doc/html/latest/bpf/bpf_design_QA.html
> [2] https://lwn.net/Articles/747256/
> [3] https://www.kernel.org/doc/html/latest/admin-guide/abi.html

Tracepoints are unstable only for BPF programs. For other applications
they are stable [1].

Adding Linus as he's the Supreme Judge on the matter.

-- Steve

[1] https://lwn.net/Articles/442113/


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 16:17   ` Steven Rostedt
@ 2026-06-03 16:19     ` Borislav Petkov
  2026-06-03 16:26       ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2026-06-03 16:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: David Hildenbrand (Arm), Zhuo, Qiuxu, mchehab+huawei@kernel.org,
	Luck, Tony, akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On Wed, Jun 03, 2026 at 12:17:07PM -0400, Steven Rostedt wrote:
> On Wed, 3 Jun 2026 15:44:54 +0200
> "David Hildenbrand (Arm)" <david@kernel.org> wrote:
> 
> > Likely the latter. BPF [1] documents:
> > 
> > Q: Are tracepoints part of the stable ABI?
> > A: NO. Tracepoints are tied to internal implementation details hence they are
> > subject to change and can break with newer kernels. BPF programs need to change
> > accordingly when this happens.
> > 
> > The Kernel ABI document explicitly doesn't list them AFAIKS.
> > 
> > There were previous discussions on the stability of tracepoints [2], I don't know
> > what changed in the meantime. CCing Steve
> > 
> > [1] https://www.kernel.org/doc/html/latest/bpf/bpf_design_QA.html
> > [2] https://lwn.net/Articles/747256/
> > [3] https://www.kernel.org/doc/html/latest/admin-guide/abi.html
> 
> Tracepoints are unstable only for BPF programs. For other applications
> they are stable [1].
> 
> Adding Linus as he's the Supreme Judge on the matter.

I *think* tools or libtraceevent can't really anticipate the TP namespace
change so we might have to revert, I'm afraid...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 16:19     ` Borislav Petkov
@ 2026-06-03 16:26       ` David Hildenbrand (Arm)
  2026-06-03 17:00         ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-03 16:26 UTC (permalink / raw)
  To: Borislav Petkov, Steven Rostedt
  Cc: Zhuo, Qiuxu, mchehab+huawei@kernel.org, Luck, Tony,
	akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On 6/3/26 18:19, Borislav Petkov wrote:
> On Wed, Jun 03, 2026 at 12:17:07PM -0400, Steven Rostedt wrote:
>> On Wed, 3 Jun 2026 15:44:54 +0200
>> "David Hildenbrand (Arm)" <david@kernel.org> wrote:
>>
>>> Likely the latter. BPF [1] documents:
>>>
>>> Q: Are tracepoints part of the stable ABI?
>>> A: NO. Tracepoints are tied to internal implementation details hence they are
>>> subject to change and can break with newer kernels. BPF programs need to change
>>> accordingly when this happens.
>>>
>>> The Kernel ABI document explicitly doesn't list them AFAIKS.
>>>
>>> There were previous discussions on the stability of tracepoints [2], I don't know
>>> what changed in the meantime. CCing Steve
>>>
>>> [1] https://www.kernel.org/doc/html/latest/bpf/bpf_design_QA.html
>>> [2] https://lwn.net/Articles/747256/
>>> [3] https://www.kernel.org/doc/html/latest/admin-guide/abi.html
>>
>> Tracepoints are unstable only for BPF programs. For other applications
>> they are stable [1].
>>
>> Adding Linus as he's the Supreme Judge on the matter.
> 
> I *think* tools or libtraceevent can't really anticipate the TP namespace
> change so we might have to revert, I'm afraid...

Yeah, I was fearing that when I read in [2]:

	"It has become clear in the past that this promise extends to
	 tracepoints, most notably in 2011 when a tracepoint change broke
	 powertop and had to be reverted."

Which means that I now also fully understand

	"Some kernel maintainers prohibit or severely restrict the addition of
	 tracepoints to their subsystems out of fear that a similar thing could
	 happen to them. "

Whatever the result of this discussion will be, I'll try to document it.

-- 
Cheers,

David


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 16:26       ` David Hildenbrand (Arm)
@ 2026-06-03 17:00         ` Steven Rostedt
  2026-06-03 19:13           ` David Hildenbrand (Arm)
  2026-06-03 19:54           ` Andrew Morton
  0 siblings, 2 replies; 9+ messages in thread
From: Steven Rostedt @ 2026-06-03 17:00 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Borislav Petkov, Zhuo, Qiuxu, mchehab+huawei@kernel.org,
	Luck, Tony, akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On Wed, 3 Jun 2026 18:26:24 +0200
"David Hildenbrand (Arm)" <david@kernel.org> wrote:

> Yeah, I was fearing that when I read in [2]:
> 
> 	"It has become clear in the past that this promise extends to
> 	 tracepoints, most notably in 2011 when a tracepoint change broke
> 	 powertop and had to be reverted."

Technically the issue is with trace events and not tracepoints. The
difference is that a trace event is created via the TRACE_EVENT() macro
which defines what is to be collected from the tracepoint and exposes that
information to tracefs which applications can easily see.

A tracepoint is simply the hook in the code that you can attach to. Trace
events create a callback from that hook to extract the data from the
tracepoint to fill in the fields.

> 
> Which means that I now also fully understand
> 
> 	"Some kernel maintainers prohibit or severely restrict the addition of
> 	 tracepoints to their subsystems out of fear that a similar thing could
> 	 happen to them. "
> 
> Whatever the result of this discussion will be, I'll try to document it.

You can still create a tracepoint without creating a trace event by using
the DECLARE_TRACE() macro. The scheduler subsystem uses that quite
extensively. That creates a tracepoint without exposing it to tracefs. The
runtime verifier uses these hooks to monitor the scheduler.

But you can still connect to these tracepoints from tracefs via a tprobe. A
tprobe hooks to tracepoints that you need the source code to find (just
like a fprobe hooks to any function). Thus applications *can't* rely on
them because there's nothing there to tell you it exists or not.

For example, for the given tracepoint:

 # cd /sys/kernel/tracing
 # echo 't:rfail memory_failure_event pfn=pfn type=type result=result' > dynamic_events
 # cat events/tracepoints/rfail/format 
name: rfail
ID: 1894
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:unsigned long __probe_ip;	offset:8;	size:8;	signed:0;
	field:u64 pfn;	offset:16;	size:8;	signed:0;
	field:s32 type;	offset:24;	size:4;	signed:1;
	field:s32 result;	offset:28;	size:4;	signed:1;

print fmt: "(%lx) pfn=%Lu type=%d result=%d", REC->__probe_ip, REC->pfn, REC->type, REC->result

It requires that BTF exists and the above doesn't annotate the result as
nicely. But you can get data directly from tracepoints this way.
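
As an illustration of how an application can consume such a format file
rather than hard-coding the record layout, here is a minimal sketch that
parses a single "field:" line into its name, offset, size and signedness. The
struct trace_field type and parse_format_field() are hypothetical names for
this example; real tools would normally leave this to libtraceevent.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed "field:" line. */
struct trace_field {
    char name[64];
    unsigned int offset;
    unsigned int size;
    int is_signed;
};

/* Parse one line of a tracefs format file; 0 on success, -1 otherwise. */
int parse_format_field(const char *line, struct trace_field *out)
{
    char decl[128];
    unsigned int offset, size;
    int is_signed;

    /* Whitespace in the pattern also matches the tabs between entries. */
    if (sscanf(line, " field:%127[^;]; offset:%u; size:%u; signed:%d;",
               decl, &offset, &size, &is_signed) != 4)
        return -1;

    /*
     * The field name is the last word of the declaration ("u64 pfn").
     * Array suffixes such as "comm[16]" are kept as part of the name.
     */
    const char *name = strrchr(decl, ' ');
    name = name ? name + 1 : decl;

    snprintf(out->name, sizeof(out->name), "%s", name);
    out->offset = offset;
    out->size = size;
    out->is_signed = is_signed;
    return 0;
}
```

Looking fields up by name at run time this way tolerates offset and size
changes, but it does not help when the event moves to a different directory,
which is the failure reported in this thread.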

-- Steve


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 17:00         ` Steven Rostedt
@ 2026-06-03 19:13           ` David Hildenbrand (Arm)
  2026-06-03 19:30             ` Steven Rostedt
  2026-06-03 19:31             ` Steven Rostedt
  2026-06-03 19:54           ` Andrew Morton
  1 sibling, 2 replies; 9+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-03 19:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Borislav Petkov, Zhuo, Qiuxu, mchehab+huawei@kernel.org,
	Luck, Tony, akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On 6/3/26 19:00, Steven Rostedt wrote:
> On Wed, 3 Jun 2026 18:26:24 +0200
> "David Hildenbrand (Arm)" <david@kernel.org> wrote:
> 
>> Yeah, I was fearing that when I read in [2]:
>>
>> 	"It has become clear in the past that this promise extends to
>> 	 tracepoints, most notably in 2011 when a tracepoint change broke
>> 	 powertop and had to be reverted."
> 
> Technically the issue is with trace events and not tracepoints. The
> difference is that a trace event is created via the TRACE_EVENT() macro
> which defines what is to be collected from the tracepoint and exposes that
> information to tracefs which applications can easily see.
> 
> A tracepoint is simply the hook in the code that you can attach to. Trace
> events create a callback from that hook to extract the data from the
> tracepoint to fill in the fields.
> 
>>
>> Which means that I now also fully understand
>>
>> 	"Some kernel maintainers prohibit or severely restrict the addition of
>> 	 tracepoints to their subsystems out of fear that a similar thing could
>> 	 happen to them. "
>>
>> Whatever the result of this discussion will be, I'll try to document it.
> 
> You can still create a tracepoint without creating a trace event by using
> the DECLARE_TRACE() macro. The scheduler subsystem uses that quite
> extensively. That creates a tracepoint without exposing it to tracefs. The
> runtime verifier uses these hooks to monitor the scheduler.
> 
> But you can still connect to these tracepoints from tracefs via a tprobe. A
> tprobe hooks to tracepoints that you need the source code to find (just
> like a fprobe hooks to any function). Thus applications *can't* rely on
> them because there's nothing there to tell you it exists or not.

Thanks, that makes sense!

So, would it be fair to say that, in general, what's exposed through

	/sys/kernel/tracing/events/

is stable ABI?


Would the following be sufficient to avoid a full revert and the dependency on CONFIG_RAS?

diff --git a/include/trace/events/memory-failure.h b/include/trace/events/memory-failure.h
index aa57cc8f896b..c46b17602578 100644
--- a/include/trace/events/memory-failure.h
+++ b/include/trace/events/memory-failure.h
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #undef TRACE_SYSTEM
-#define TRACE_SYSTEM memory_failure
+/* Some user space relies on ras/memory_failure_event */
+#define TRACE_SYSTEM ras
 #define TRACE_INCLUDE_FILE memory-failure
 
 #if !defined(_TRACE_MEMORY_FAILURE_H) || defined(TRACE_HEADER_MULTI_READ)


-- 
Cheers,

David


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 19:13           ` David Hildenbrand (Arm)
@ 2026-06-03 19:30             ` Steven Rostedt
  2026-06-03 19:31             ` Steven Rostedt
  1 sibling, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2026-06-03 19:30 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Borislav Petkov, Zhuo, Qiuxu, mchehab+huawei@kernel.org,
	Luck, Tony, akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On Wed, 3 Jun 2026 21:13:30 +0200
"David Hildenbrand (Arm)" <david@kernel.org> wrote:

> Would the following be sufficient to avoid a full revert and the dependency on CONFIG_RAS?
> 
> diff --git a/include/trace/events/memory-failure.h b/include/trace/events/memory-failure.h
> index aa57cc8f896b..c46b17602578 100644
> --- a/include/trace/events/memory-failure.h
> +++ b/include/trace/events/memory-failure.h
> @@ -1,6 +1,7 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  #undef TRACE_SYSTEM
> -#define TRACE_SYSTEM memory_failure
> +/* Some user space relies on ras/memory_failure_event */
> +#define TRACE_SYSTEM ras

If that puts back the original path then yeah, all would be good.

-- Steve
  

>  #define TRACE_INCLUDE_FILE memory-failure
>  
>  #if !defined(_TRACE_MEMORY_FAILURE_H) || defined(TRACE_HEADER_MULTI_READ)



* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 19:13           ` David Hildenbrand (Arm)
  2026-06-03 19:30             ` Steven Rostedt
@ 2026-06-03 19:31             ` Steven Rostedt
  1 sibling, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2026-06-03 19:31 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Borislav Petkov, Zhuo, Qiuxu, mchehab+huawei@kernel.org,
	Luck, Tony, akpm@linux-foundation.org, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On Wed, 3 Jun 2026 21:13:30 +0200
"David Hildenbrand (Arm)" <david@kernel.org> wrote:

> Thanks, that makes sense!
> 
> So, would it be fair to say that, in general, what's exposed through
> 
> 	/sys/kernel/tracing/events/
> 
> is stable ABI?

It's only stable if something depends on it. It changes all the time.
It's only when someone complains about it that it becomes "stable"!

-- Steve


* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
  2026-06-03 17:00         ` Steven Rostedt
  2026-06-03 19:13           ` David Hildenbrand (Arm)
@ 2026-06-03 19:54           ` Andrew Morton
  1 sibling, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2026-06-03 19:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: David Hildenbrand (Arm), Borislav Petkov, Zhuo, Qiuxu,
	mchehab+huawei@kernel.org, Luck, Tony, linmiaohe@huawei.com,
	xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org, Linus Torvalds

On Wed, 3 Jun 2026 13:00:06 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 3 Jun 2026 18:26:24 +0200
> "David Hildenbrand (Arm)" <david@kernel.org> wrote:
> 
> > Yeah, I was fearing that when I read in [2]:
> > 
> > 	"It has become clear in the past that this promise extends to
> > 	 tracepoints, most notably in 2011 when a tracepoint change broke
> > 	 powertop and had to be reverted."
> 
> Technically the issue is with trace events and not tracepoints. The
> difference is that a trace event is created via the TRACE_EVENT() macro
> which defines what is to be collected from the tracepoint and exposes that
> information to tracefs which applications can easily see.
> 
> A tracepoint is simply the hook in the code that you can attach to. Trace
> events create a callback from that hook to extract the data from the
> tracepoint to fill in the fields.

The problem here appears to be that "ras:memory_failure_event" became
"memory_failure:memory_failure_event".

Perhaps we can add infrastructure to permit aliasing "ras" onto
"memory_failure".  So if we make these namespace alterations we can
easily preserve back-compatibility?


