Linux Trace Kernel
* Re: [PATCH v14 4/5] ring-buffer: Reset RB_MISSED_* flags on persistent ring buffer
From: Masami Hiramatsu @ 2026-03-31  1:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <20260330143613.42fe5640@gandalf.local.home>

On Mon, 30 Mar 2026 14:36:13 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 30 Mar 2026 21:50:20 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > Reset RB_MISSED_* flags when the persistent ring buffer is
> > validated at boot. Since these flags are used only in reading
> > process, such process should be stopped when reboot and never
> > be restarted. Thus, these flags are meaningless in the next
> > boot. Moreover, it can confuse the read process after reboot.
> 
> Is it meaningless on a second boot?
> 
> Let's say you have a crash, and there's an invalid buffer. On the next boot
> it is flagged as invalid with the RB_MISSED flag. But then you reboot again
> before looking at the buffer. The next boot will clear this flag. Now
> looking at the persistent ring buffer will not show any missed events.
> 
> Ideally, it shouldn't matter how many reboots are made. If the persistent
> ring buffer hasn't started again, it should always show the same output.

Hmm, OK. I'll drop this.

Thanks!

> 
> -- Steve
> 
> 
> > 
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >  Changes in v14:
> >    - Newly added.
> > ---
> >  kernel/trace/ring_buffer.c |    1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index e5178239f2f9..5049cf13021e 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -1903,6 +1903,7 @@ static int rb_validate_buffer(struct buffer_page *bpage, int cpu,
> >  		local_set(&bpage->page->commit, 0);
> >  	} else {
> >  		local_set(&bpage->entries, ret);
> > +		local_set(&bpage->page->commit, tail);
> >  	}
> >  
> >  	return ret;
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* Re: [PATCH v14 2/5] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
From: Masami Hiramatsu @ 2026-03-31  1:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <20260330162822.54f6bd02@gandalf.local.home>

On Mon, 30 Mar 2026 16:28:22 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 30 Mar 2026 16:22:10 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Mon, 30 Mar 2026 21:50:04 +0900
> > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> > 
> > > @@ -2042,7 +2065,8 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
> > >  	local_set(&cpu_buffer->entries, entries);
> > >  	local_set(&cpu_buffer->entries_bytes, entry_bytes);
> > >  
> > > -	pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
> > > +	pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
> > > +		cpu_buffer->cpu, discarded);  
> > 
> > As pages should never be discarded unless something went wrong, let's only
> > print that if there were discarded pages.
> > 
> > 	if (discarded) {
> > 		pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
> > 			cpu_buffer->cpu, discarded);
> > 	} else {
> > 		pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
> > 	}
> 
> Or perhaps:
> 
> 	pr_info("Ring buffer meta [%d] is from previous boot!", cpu_buffer->cpu);
> 	if (discarded)
> 		pr_cont(" (%d pages discarded)", discarded);
> 	pr_cont("\n");

OK. I'll do this.

Thanks for the comment!

> 
> -- Steve
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* [PATCH] tracing: Remove duplicate latency_fsnotify() stub
From: Steven Rostedt @ 2026-03-31  0:58 UTC (permalink / raw)
  To: LKML, Linux Trace Kernel; +Cc: Masami Hiramatsu, Mathieu Desnoyers

From: Steven Rostedt <rostedt@goodmis.org>

When CONFIG_TRACER_SNAPSHOT is defined but CONFIG_FSNOTIFY is not, the
latency_fsnotify() function is turned into a static inline stub. But this
stub was defined in both trace.h and trace_snapshot.c, causing a build
error in that configuration. The stub is not needed in trace_snapshot.c
as it is already defined in trace.h, so remove it from the C file.

Fixes: bade44fe5462 ("tracing: Move snapshot code out of trace.c and into trace_snapshot.c")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603310604.lGE9LDBK-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace.h          | 2 +-
 kernel/trace/trace_snapshot.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a3ea735a9ef6..a59d6acdf95d 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -845,13 +845,13 @@ void update_max_tr_single(struct trace_array *tr,
 #if defined(CONFIG_TRACER_MAX_TRACE) && defined(CONFIG_FSNOTIFY)
 # define LATENCY_FS_NOTIFY
 #endif
+#endif /* CONFIG_TRACER_SNAPSHOT */
 
 #ifdef LATENCY_FS_NOTIFY
 void latency_fsnotify(struct trace_array *tr);
 #else
 static inline void latency_fsnotify(struct trace_array *tr) { }
 #endif
-#endif /* CONFIG_TRACER_SNAPSHOT */
 
 #ifdef CONFIG_STACKTRACE
 void __trace_stack(struct trace_array *tr, unsigned int trace_ctx, int skip);
diff --git a/kernel/trace/trace_snapshot.c b/kernel/trace/trace_snapshot.c
index 8865b2ef2264..07b43c9863a2 100644
--- a/kernel/trace/trace_snapshot.c
+++ b/kernel/trace/trace_snapshot.c
@@ -391,9 +391,8 @@ void latency_fsnotify(struct trace_array *tr)
 	 */
 	irq_work_queue(&tr->fsnotify_irqwork);
 }
-#else
-static inline void latency_fsnotify(struct trace_array *tr) { }
 #endif /* LATENCY_FS_NOTIFY */
+
 static const struct file_operations tracing_max_lat_fops;
 
 void trace_create_maxlat_file(struct trace_array *tr,
-- 
2.51.0



* Re: [PATCH v14 1/5] ring-buffer: Flush and stop persistent ring buffer on panic
From: Masami Hiramatsu @ 2026-03-31  0:29 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <20260330135447.423bc070@gandalf.local.home>

On Mon, 30 Mar 2026 13:54:47 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 30 Mar 2026 21:49:56 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > diff --git a/arch/arm64/include/asm/ring_buffer.h b/arch/arm64/include/asm/ring_buffer.h
> > new file mode 100644
> > index 000000000000..62316c406888
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/ring_buffer.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef _ASM_ARM64_RING_BUFFER_H
> > +#define _ASM_ARM64_RING_BUFFER_H
> > +
> > +#include <asm/cacheflush.h>
> > +
> > +/* Flush D-cache on persistent ring buffer */
> > +#define arch_ring_buffer_flush_range(start, end)	dcache_clean_pop(start, end)
> > +
> > +#endif /* _ASM_ARM64_RING_BUFFER_H */
> 
> You probably need to get an ack from the arm64 folks.

OK, I will add them to the loop.

Thank you,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* Re: [PATCH v2] bootconfig: Apply early options from embedded config
From: Masami Hiramatsu @ 2026-03-31  0:00 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
	linux-doc, oss, paulmck, rostedt, kernel-team
In-Reply-To: <acpzhCBEPh-tKVqg@gmail.com>

On Mon, 30 Mar 2026 06:15:17 -0700
Breno Leitao <leitao@debian.org> wrote:

> On Fri, Mar 27, 2026 at 10:37:44PM +0900, Masami Hiramatsu wrote:
> > On Fri, 27 Mar 2026 03:06:41 -0700
> > Breno Leitao <leitao@debian.org> wrote:
> 
> > > > To fix this, we need to change setup_arch() for each architecture so
> > > > that it calls this bootconfig_apply_early_params().
> > > 
> > > Could we instead integrate this into parse_early_param() itself? That
> > > approach would avoid the need to modify each architecture individually.
> > 
> > Ah, indeed. 
> 
> I investigated integrating bootconfig into parse_early_param() and hit a
> blocker: xbc_init() and xbc_make_cmdline() depend on memblock_alloc(), but on
> most architectures (x86, arm64, arm, s390, riscv) parse_early_param() is called
> from setup_arch() _before_ memblock is initialized.

Yeah, that's right.

> 
> So, bootconfig will not be available as early as parse_early_param(). 
> 
> An alternative is to replace memblock allocations in lib/bootconfig.c with static
> __initdata buffers, similar to Petr's approach in 2023:
> 
> 	https://lore.kernel.org/all/20231121231342.193646-3-oss@malat.biz/
> 
> But there were concerns about the allocation size:
> 
> 	Petr Malat <oss@malat.biz> wrote: 
> 	> To allow handling of early options, it's necessary to eliminate allocations
> 	> from embedded bootconfig handling
> 
> 	"Hm, my concern is that this can introduce some sort of overhead to parse the bootconfig."
> 

As long as we can correctly handle the early params and it is limited to
the embedded bootconfig only, I think it is OK to allocate it statically.

Thank you,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>
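
For illustration only, the static-allocation idea discussed above could be
sketched in plain C roughly as follows. This is a userspace model, not code
from lib/bootconfig.c: the names early_pool, early_pool_alloc() and
EARLY_POOL_SIZE are hypothetical, and in the kernel the buffer would be a
static __initdata array rather than an ordinary static.

```c
#include <stddef.h>

/* Hypothetical size; the real limit would come from the bootconfig code. */
#define EARLY_POOL_SIZE 4096

/* Stand-in for a static __initdata buffer (assumption, userspace model). */
static _Alignas(max_align_t) unsigned char early_pool[EARLY_POOL_SIZE];
static size_t early_pool_used;

/*
 * Bump allocator: hand out aligned chunks from the static buffer and
 * never free them. Returns NULL when the request does not fit.
 */
static void *early_pool_alloc(size_t size)
{
	const size_t align = _Alignof(max_align_t);
	size_t start = (early_pool_used + align - 1) & ~(align - 1);

	if (size == 0 || start > EARLY_POOL_SIZE ||
	    size > EARLY_POOL_SIZE - start)
		return NULL;

	early_pool_used = start + size;
	return &early_pool[start];
}
```

Because nothing here depends on memblock, such an allocator would be usable
before setup_arch() has initialized it; the cost is that the pool size has to
be fixed at build time.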


* Re: [PATCH v14 2/5] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
From: Steven Rostedt @ 2026-03-30 20:28 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <20260330162210.5e37b0c3@gandalf.local.home>

On Mon, 30 Mar 2026 16:22:10 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 30 Mar 2026 21:50:04 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > @@ -2042,7 +2065,8 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
> >  	local_set(&cpu_buffer->entries, entries);
> >  	local_set(&cpu_buffer->entries_bytes, entry_bytes);
> >  
> > -	pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
> > +	pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
> > +		cpu_buffer->cpu, discarded);  
> 
> As pages should never be discarded unless something went wrong, let's only
> print that if there were discarded pages.
> 
> 	if (discarded) {
> 		pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
> 			cpu_buffer->cpu, discarded);
> 	} else {
> 		pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
> 	}

Or perhaps:

	pr_info("Ring buffer meta [%d] is from previous boot!", cpu_buffer->cpu);
	if (discarded)
		pr_cont(" (%d pages discarded)", discarded);
	pr_cont("\n");

-- Steve



* Re: [PATCH v14 5/5] ring-buffer: Add persistent ring buffer selftest
From: Steven Rostedt @ 2026-03-30 20:24 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <177487502763.3463592.7901517545360137050.stgit@mhiramat.tok.corp.google.com>

On Mon, 30 Mar 2026 21:50:27 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> @@ -2558,12 +2577,64 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
>  	kfree(cpu_buffer);
>  }
>  
> +#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
> +static void rb_test_inject_invalid_pages(struct trace_buffer *buffer)
> +{
> +	struct ring_buffer_per_cpu *cpu_buffer;
> +	struct ring_buffer_cpu_meta *meta;
> +	struct buffer_data_page *dpage;
> +	u32 entry_bytes = 0;
> +	unsigned long ptr;
> +	int subbuf_size;
> +	int invalid = 0;
> +	int cpu;
> +	int i;
> +
> +	if (!(buffer->flags & RB_FL_TESTING))
> +		return;
> +
> +	guard(preempt)();
> +	cpu = smp_processor_id();
> +
> +	cpu_buffer = buffer->buffers[cpu];
> +	meta = cpu_buffer->ring_meta;
> +	ptr = (unsigned long)rb_subbufs_from_meta(meta);
> +	subbuf_size = meta->subbuf_size;
> +
> +	for (i = 0; i < meta->nr_subbufs; i++) {
> +		int idx = meta->buffers[i];
> +
> +		dpage = (void *)(ptr + idx * subbuf_size);
> +		/* Skip unused pages */
> +		if (!local_read(&dpage->commit))
> +			continue;
> +
> +		/* Invalidate even pages. */
> +		if (!(i & 0x1)) {
> +			local_add(subbuf_size + 1, &dpage->commit);
> +			invalid++;
> +		} else {
> +			/* Count total commit bytes. */
> +			entry_bytes += local_read(&dpage->commit);
> +		}
> +	}
> +
> +	pr_info("Inject invalidated %d pages on CPU%d, total size: %ld\n",
> +		invalid, cpu, (long)entry_bytes);

This is only enabled when testing. Let's make that a pr_warn() as we really
do want to be able to see it. And it should warn that it is invalidating pages!
(warn as in pr_warn, it doesn't need a warn as in WARN()).

-- Steve


> +	meta->nr_invalid = invalid;
> +	meta->entry_bytes = entry_bytes;
> +}
> +#else /* !CONFIG_RING_BUFFER_PERSISTENT_INJECT */
> +#define rb_test_inject_invalid_pages(buffer)	do { } while (0)
> +#endif
> +
>  /* Stop recording on a persistent buffer and flush cache if needed. */
>  static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
>  {
>  	struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
>  
>  	ring_buffer_record_off(buffer);
> +	rb_test_inject_invalid_pages(buffer);
>  	arch_ring_buffer_flush_range(buffer->range_addr_start, buffer->range_addr_end);
>  	return NOTIFY_DONE;
>  }


* Re: [PATCH v14 2/5] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
From: Steven Rostedt @ 2026-03-30 20:22 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <177487500432.3463592.13753720277119177967.stgit@mhiramat.tok.corp.google.com>

On Mon, 30 Mar 2026 21:50:04 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> @@ -2042,7 +2065,8 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
>  	local_set(&cpu_buffer->entries, entries);
>  	local_set(&cpu_buffer->entries_bytes, entry_bytes);
>  
> -	pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
> +	pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
> +		cpu_buffer->cpu, discarded);

As pages should never be discarded unless something went wrong, let's only
print that if there were discarded pages.

	if (discarded) {
		pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
			cpu_buffer->cpu, discarded);
	} else {
		pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu);
	}

-- Steve




>  	return;
>  
>   invalid:


* Re: [PATCH v2] tracing/osnoise: Add option to align tlat threads
From: Crystal Wood @ 2026-03-30 19:43 UTC (permalink / raw)
  To: Tomas Glozar, Steven Rostedt, Masami Hiramatsu
  Cc: Mathieu Desnoyers, John Kacur, Luis Goncalves, Costa Shulyupin,
	Wander Lairson Costa, LKML, linux-trace-kernel
In-Reply-To: <20260302131316.385987-1-tglozar@redhat.com>

On Mon, 2026-03-02 at 14:13 +0100, Tomas Glozar wrote:
> Add an option called TIMERLAT_ALIGN to osnoise/options, together with a
> corresponding setting osnoise/timerlat_align_us.
> 
> This option sets the alignment of wakeup times between different
> timerlat threads, similarly to cyclictest's -A/--aligned option. If
> TIMERLAT_ALIGN is set, the first thread that reaches the first cycle
> records its first wake-up time. Each following thread sets its first
> wake-up time to a fixed offset from the recorded time, and increments
> it by the same offset.
> 
> Example:
> 
> osnoise/timerlat_period is set to 1000, osnoise/timerlat_align_us is
> set to 20. There are four threads, on CPUs 1 to 4.
> 
> - CPU 4 enters first cycle first. The current time is 20000us, so
> the wake-up of the first cycle is set to 21000us. This time is recorded.
> - CPU 2 enters first cycle next. It reads the recorded time, increments
> it to 21020us, and uses this value as its own wake-up time for the first
> cycle.
> - CPU 3 enters first cycle next. It reads the recorded time, increments
> it to 21040us, and uses the value as its own wake-up time.
> - CPU 1 proceeds analogously.
> 
> In each next cycle, the wake-up time (called "absolute period" in
> timerlat code) is incremented by the (relative) period of 1000us. Thus,
> the wake-ups in the following cycles (provided the times are reached and
> not in the past) will be as follows:
> 
> CPU 1		CPU 2		CPU 3		CPU 4
> 21060us		21020us		21040us		21000us
> 22060us		22020us		22040us		22000us
> ...		...		...		...
> 
> Even if any cycle is skipped due to e.g. the first cycle calculation
> happening later, the alignment stays in place.
> 
> Signed-off-by: Tomas Glozar <tglozar@redhat.com>

Reviewed-by: Crystal Wood <crwood@redhat.com>

-Crystal
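
The wake-up arithmetic described in the quoted changelog can be modelled in
a few lines of C. This is only a single-threaded sketch under stated
assumptions: struct align_state, first_wakeup_us() and wakeup_us() are
hypothetical names, not the osnoise tracer's code, and the real
implementation would need to update the recorded time atomically since the
threads run on different CPUs.

```c
#include <stdint.h>

/* Hypothetical model of the shared state; not the kernel's data structure. */
struct align_state {
	uint64_t recorded_us;	/* last first-cycle wake-up time handed out */
	int initialized;	/* set once the first thread has arrived */
};

/*
 * Return the first wake-up time for a thread reaching its first cycle at
 * now_us. The first arrival records now_us + period_us; every later
 * arrival advances the recorded time by align_us and uses the new value
 * (this model assumes that value is still in the future).
 */
static uint64_t first_wakeup_us(struct align_state *st, uint64_t now_us,
				uint64_t period_us, uint64_t align_us)
{
	if (!st->initialized) {
		st->recorded_us = now_us + period_us;
		st->initialized = 1;
	} else {
		st->recorded_us += align_us;
	}
	return st->recorded_us;
}

/* Wake-up time of cycle n, where n = 0 is the first cycle. */
static uint64_t wakeup_us(uint64_t first_us, uint64_t period_us,
			  unsigned int n)
{
	return first_us + (uint64_t)n * period_us;
}
```

With a period of 1000 and an alignment of 20, a first arrival at 20000us gets
21000us, and the k-th later arrival is placed k * 20us after that. Since every
thread then advances by the same period, the offsets between threads stay
constant from cycle to cycle.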



* Re: [PATCH v14 4/5] ring-buffer: Reset RB_MISSED_* flags on persistent ring buffer
From: Steven Rostedt @ 2026-03-30 18:36 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <177487501981.3463592.2886576368556755178.stgit@mhiramat.tok.corp.google.com>

On Mon, 30 Mar 2026 21:50:20 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Reset RB_MISSED_* flags when the persistent ring buffer is
> validated at boot. Since these flags are used only in reading
> process, such process should be stopped when reboot and never
> be restarted. Thus, these flags are meaningless in the next
> boot. Moreover, it can confuse the read process after reboot.

Is it meaningless on a second boot?

Let's say you have a crash, and there's an invalid buffer. On the next boot
it is flagged as invalid with the RB_MISSED flag. But then you reboot again
before looking at the buffer. The next boot will clear this flag. Now
looking at the persistent ring buffer will not show any missed events.

Ideally, it shouldn't matter how many reboots are made. If the persistent
ring buffer hasn't started again, it should always show the same output.

-- Steve


> 
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  Changes in v14:
>    - Newly added.
> ---
>  kernel/trace/ring_buffer.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index e5178239f2f9..5049cf13021e 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1903,6 +1903,7 @@ static int rb_validate_buffer(struct buffer_page *bpage, int cpu,
>  		local_set(&bpage->page->commit, 0);
>  	} else {
>  		local_set(&bpage->entries, ret);
> +		local_set(&bpage->page->commit, tail);
>  	}
>  
>  	return ret;



* [PATCH v7 2/2] tracing: Preserve repeated trace_trigger boot parameters
From: Wesley Atwell @ 2026-03-30 18:11 UTC (permalink / raw)
  To: rostedt, mhiramat
  Cc: mark.rutland, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
	Wesley Atwell
In-Reply-To: <20260330181103.1851230-1-atwellwea@gmail.com>

trace_trigger= tokenizes bootup_trigger_buf in place and stores pointers
into that buffer for later trigger registration. Repeated trace_trigger=
parameters overwrite the buffer contents from earlier calls, leaving
only the last set of parsed event and trigger strings.

Keep each new trace_trigger= string at the end of bootup_trigger_buf and
parse only the appended range. That preserves the earlier event and
trigger strings while still letting repeated parameters queue additional
boot-time triggers.

This also lets Bootconfig array values work naturally when they expand
to repeated trace_trigger= entries.

Before this change, only the last trace_trigger= instance survived boot.

Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
---
Changes since v6: https://lore.kernel.org/all/20260329184254.1813273-1-atwellwea@gmail.com/
- split trace_trigger= handling into its own patch
- follow Steven Rostedt's suggested bootup_trigger_buf offset approach
- account for the terminating NUL when advancing boot_trigger_buf_len
---
 kernel/trace/trace_events.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 249d1cba72c0..dd26e838d4de 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -3679,20 +3679,27 @@ static struct boot_triggers {
 } bootup_triggers[MAX_BOOT_TRIGGERS];
 
 static char bootup_trigger_buf[COMMAND_LINE_SIZE];
+static int boot_trigger_buf_len;
 static int nr_boot_triggers;
 
 static __init int setup_trace_triggers(char *str)
 {
 	char *trigger;
 	char *buf;
+	int len = boot_trigger_buf_len;
 	int i;
 
-	strscpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE);
+	if (len >= COMMAND_LINE_SIZE)
+		return 1;
+
+	strscpy(bootup_trigger_buf + len, str, COMMAND_LINE_SIZE - len);
 	trace_set_ring_buffer_expanded(NULL);
 	disable_tracing_selftest("running event triggers");
 
-	buf = bootup_trigger_buf;
-	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
+	buf = bootup_trigger_buf + len;
+	boot_trigger_buf_len += strlen(buf) + 1;
+
+	for (i = nr_boot_triggers; i < MAX_BOOT_TRIGGERS; i++) {
 		trigger = strsep(&buf, ",");
 		if (!trigger)
 			break;
-- 
2.43.0
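
As a rough userspace illustration of why the offset has to include the
terminating NUL, the buffer handling in this patch can be modelled as below.
The names (trigger_buf, add_trigger_param(), BUF_SIZE, MAX_TRIGGERS) are
stand-ins chosen for this sketch, the split into event and trigger strings
is left out, and a strchr() loop is used in place of the kernel's strsep().

```c
#include <string.h>

#define BUF_SIZE	256	/* stand-in for COMMAND_LINE_SIZE */
#define MAX_TRIGGERS	8	/* stand-in for MAX_BOOT_TRIGGERS */

static char trigger_buf[BUF_SIZE];
static int trigger_buf_len;		/* bytes used, including NULs */
static char *triggers[MAX_TRIGGERS];	/* pointers into trigger_buf */
static int nr_triggers;

/* Model of one trace_trigger= invocation (hypothetical helper). */
static int add_trigger_param(const char *str)
{
	int len = trigger_buf_len;
	size_t n = strlen(str);
	char *buf;

	if (len >= BUF_SIZE)
		return 1;

	/* Copy after the previous string, truncating if it does not fit. */
	if (n > (size_t)(BUF_SIZE - len - 1))
		n = BUF_SIZE - len - 1;
	memcpy(trigger_buf + len, str, n);
	trigger_buf[len + n] = '\0';

	buf = trigger_buf + len;
	/* Step past the NUL so the next call cannot overwrite it. */
	trigger_buf_len += strlen(buf) + 1;

	/* Tokenize only the newly appended range, in place. */
	while (buf && nr_triggers < MAX_TRIGGERS) {
		char *comma = strchr(buf, ',');

		if (comma)
			*comma++ = '\0';
		triggers[nr_triggers++] = buf;
		buf = comma;
	}
	return 1;
}
```

Without the "+ 1", the second call would start copying at the first string's
terminator, so the last token of the earlier parameter and the first token of
the new one would run together.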



* [PATCH v7 1/2] tracing: Append repeated boot-time tracing parameters
From: Wesley Atwell @ 2026-03-30 18:11 UTC (permalink / raw)
  To: rostedt, mhiramat
  Cc: mark.rutland, mathieu.desnoyers, linux-kernel, linux-trace-kernel,
	Wesley Atwell

Some tracing boot parameters already accept delimited value lists, but
their __setup() handlers keep only the last instance seen at boot.
Make repeated instances append to the same boot-time buffer in the
format each parser already consumes.

Use a shared trace_append_boot_param() helper for the ftrace filters,
trace_options, and kprobe_event boot parameters.

This also lets Bootconfig array values work naturally when they expand
to repeated param=value entries.

Before this change, only the last instance from each repeated
parameter survived boot.

Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
---
Changes since v6: https://lore.kernel.org/all/20260329184254.1813273-1-atwellwea@gmail.com/
- split trace_trigger= handling into a separate patch
- add the requested blank line in trace_append_boot_param()
---
 kernel/trace/ftrace.c       | 12 ++++++++----
 kernel/trace/trace.c        | 31 ++++++++++++++++++++++++++++++-
 kernel/trace/trace.h        |  2 ++
 kernel/trace/trace_kprobe.c |  3 ++-
 4 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 413310912609..8bd3dd1d549c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6841,7 +6841,8 @@ bool ftrace_filter_param __initdata;
 static int __init set_ftrace_notrace(char *str)
 {
 	ftrace_filter_param = true;
-	strscpy(ftrace_notrace_buf, str, FTRACE_FILTER_SIZE);
+	trace_append_boot_param(ftrace_notrace_buf, str, ',',
+				FTRACE_FILTER_SIZE);
 	return 1;
 }
 __setup("ftrace_notrace=", set_ftrace_notrace);
@@ -6849,7 +6850,8 @@ __setup("ftrace_notrace=", set_ftrace_notrace);
 static int __init set_ftrace_filter(char *str)
 {
 	ftrace_filter_param = true;
-	strscpy(ftrace_filter_buf, str, FTRACE_FILTER_SIZE);
+	trace_append_boot_param(ftrace_filter_buf, str, ',',
+				FTRACE_FILTER_SIZE);
 	return 1;
 }
 __setup("ftrace_filter=", set_ftrace_filter);
@@ -6861,14 +6863,16 @@ static int ftrace_graph_set_hash(struct ftrace_hash *hash, char *buffer);
 
 static int __init set_graph_function(char *str)
 {
-	strscpy(ftrace_graph_buf, str, FTRACE_FILTER_SIZE);
+	trace_append_boot_param(ftrace_graph_buf, str, ',',
+				FTRACE_FILTER_SIZE);
 	return 1;
 }
 __setup("ftrace_graph_filter=", set_graph_function);
 
 static int __init set_graph_notrace_function(char *str)
 {
-	strscpy(ftrace_graph_notrace_buf, str, FTRACE_FILTER_SIZE);
+	trace_append_boot_param(ftrace_graph_notrace_buf, str, ',',
+				FTRACE_FILTER_SIZE);
 	return 1;
 }
 __setup("ftrace_graph_notrace=", set_graph_notrace_function);
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a626211ceb9a..652d9f4e7943 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -228,6 +228,34 @@ static int boot_instance_index;
 static char boot_snapshot_info[COMMAND_LINE_SIZE] __initdata;
 static int boot_snapshot_index;
 
+/*
+ * Repeated boot parameters, including Bootconfig array expansions, need
+ * to stay in the delimiter form that the existing parser consumes.
+ */
+void __init trace_append_boot_param(char *buf, const char *str, char sep,
+				    int size)
+{
+	int len, needed, str_len;
+
+	if (!*str)
+		return;
+
+	len = strlen(buf);
+	str_len = strlen(str);
+	needed = len + str_len + 1;
+
+	/* For continuation, account for the separator. */
+	if (len)
+		needed++;
+	if (needed > size)
+		return;
+
+	if (len)
+		buf[len++] = sep;
+
+	strscpy(buf + len, str, size - len);
+}
+
 static int __init set_cmdline_ftrace(char *str)
 {
 	strscpy(bootup_tracer_buf, str, MAX_TRACER_SIZE);
@@ -329,7 +357,8 @@ static char trace_boot_options_buf[MAX_TRACER_SIZE] __initdata;
 
 static int __init set_trace_boot_options(char *str)
 {
-	strscpy(trace_boot_options_buf, str, MAX_TRACER_SIZE);
+	trace_append_boot_param(trace_boot_options_buf, str, ',',
+				MAX_TRACER_SIZE);
 	return 1;
 }
 __setup("trace_options=", set_trace_boot_options);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index b8f3804586a0..1579cdec3f56 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -862,6 +862,8 @@ extern int DYN_FTRACE_TEST_NAME(void);
 #define DYN_FTRACE_TEST_NAME2 trace_selftest_dynamic_test_func2
 extern int DYN_FTRACE_TEST_NAME2(void);
 
+void __init trace_append_boot_param(char *buf, const char *str,
+				    char sep, int size);
 extern void trace_set_ring_buffer_expanded(struct trace_array *tr);
 extern bool tracing_selftest_disabled;
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index a5dbb72528e0..e9f1c55aea64 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -31,7 +31,8 @@ static char kprobe_boot_events_buf[COMMAND_LINE_SIZE] __initdata;
 
 static int __init set_kprobe_boot_events(char *str)
 {
-	strscpy(kprobe_boot_events_buf, str, COMMAND_LINE_SIZE);
+	trace_append_boot_param(kprobe_boot_events_buf, str, ';',
+				COMMAND_LINE_SIZE);
 	disable_tracing_selftest("running kprobe events");
 
 	return 1;

base-commit: e3c33bc767b5512dbfec643a02abf58ce608f3b2
-- 
2.43.0
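
To see the effect of the helper outside the kernel, here is a userspace port
of trace_append_boot_param() from the patch above. It is a sketch for
experimentation only: the name append_boot_param() is made up for this
example, and memcpy() stands in for strscpy(), which is safe here because the
size check has already passed.

```c
#include <string.h>

/* Userspace port of the helper in the patch (hypothetical name). */
static void append_boot_param(char *buf, const char *str, char sep, int size)
{
	int len, needed, str_len;

	if (!*str)
		return;

	len = strlen(buf);
	str_len = strlen(str);
	needed = len + str_len + 1;

	/* A separator is only needed when something is already stored. */
	if (len)
		needed++;
	/* Drop the new value entirely rather than store a truncated one. */
	if (needed > size)
		return;

	if (len)
		buf[len++] = sep;

	/* Copy the value together with its terminating NUL. */
	memcpy(buf + len, str, str_len + 1);
}
```

Calling it with "vfs_*" and then "sched_*" on an empty 16-byte buffer and a
',' separator leaves "vfs_*,sched_*" in the buffer; a further value that does
not fit is ignored and the buffer is left unchanged.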



* Re: [PATCH v14 1/5] ring-buffer: Flush and stop persistent ring buffer on panic
From: Steven Rostedt @ 2026-03-30 17:54 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, Ian Rogers
In-Reply-To: <177487499643.3463592.15413057950716995168.stgit@mhiramat.tok.corp.google.com>

On Mon, 30 Mar 2026 21:49:56 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> diff --git a/arch/arm64/include/asm/ring_buffer.h b/arch/arm64/include/asm/ring_buffer.h
> new file mode 100644
> index 000000000000..62316c406888
> --- /dev/null
> +++ b/arch/arm64/include/asm/ring_buffer.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef _ASM_ARM64_RING_BUFFER_H
> +#define _ASM_ARM64_RING_BUFFER_H
> +
> +#include <asm/cacheflush.h>
> +
> +/* Flush D-cache on persistent ring buffer */
> +#define arch_ring_buffer_flush_range(start, end)	dcache_clean_pop(start, end)
> +
> +#endif /* _ASM_ARM64_RING_BUFFER_H */

You probably need to get an ack from the arm64 folks.

-- Steve


* Re: [PATCH v6] tracing: Preserve repeated boot-time tracing parameters
From: Steven Rostedt @ 2026-03-30 16:42 UTC (permalink / raw)
  To: Wesley Atwell
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel
In-Reply-To: <20260330123743.5cd30e56@gandalf.local.home>

On Mon, 30 Mar 2026 12:37:43 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 30 Mar 2026 10:43:22 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 9928da636c9d..7754a8adb58a 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -3677,20 +3677,24 @@ static struct boot_triggers {
> >  } bootup_triggers[MAX_BOOT_TRIGGERS];
> >  
> >  static char bootup_trigger_buf[COMMAND_LINE_SIZE];
> > +static int boot_trigger_buf_len;
> >  static int nr_boot_triggers;
> >  
> >  static __init int setup_trace_triggers(char *str)
> >  {
> >  	char *trigger;
> >  	char *buf;
> > +	int len = boot_trigger_buf_len;
> >  	int i;
> >  
> > -	strscpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE);
> > +	strscpy(bootup_trigger_buf + len , str, COMMAND_LINE_SIZE - len);
> >  	trace_set_ring_buffer_expanded(NULL);
> >  	disable_tracing_selftest("running event triggers");
> >  
> > -	buf = bootup_trigger_buf;
> > -	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
> > +	buf = bootup_trigger_buf + len;
> > +	boot_trigger_buf_len += strlen(buf);  
> 
> The above needs to skip the '\0' too:
> 
> 	boot_trigger_buf_len += strlen(buf) + 1;
> 

And since this option is different from the rest, let's make it a separate patch.

-- Steve

> 
> > +
> > +	for (i = nr_boot_triggers; i < MAX_BOOT_TRIGGERS; i++) {
> >  		trigger = strsep(&buf, ",");
> >  		if (!trigger)
> >  			break;  
> 



* Re: [PATCH v6] tracing: Preserve repeated boot-time tracing parameters
From: Steven Rostedt @ 2026-03-30 16:37 UTC (permalink / raw)
  To: Wesley Atwell
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel
In-Reply-To: <20260330104322.7403c660@gandalf.local.home>

On Mon, 30 Mar 2026 10:43:22 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 9928da636c9d..7754a8adb58a 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -3677,20 +3677,24 @@ static struct boot_triggers {
>  } bootup_triggers[MAX_BOOT_TRIGGERS];
>  
>  static char bootup_trigger_buf[COMMAND_LINE_SIZE];
> +static int boot_trigger_buf_len;
>  static int nr_boot_triggers;
>  
>  static __init int setup_trace_triggers(char *str)
>  {
>  	char *trigger;
>  	char *buf;
> +	int len = boot_trigger_buf_len;
>  	int i;
>  
> -	strscpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE);
> +	strscpy(bootup_trigger_buf + len , str, COMMAND_LINE_SIZE - len);
>  	trace_set_ring_buffer_expanded(NULL);
>  	disable_tracing_selftest("running event triggers");
>  
> -	buf = bootup_trigger_buf;
> -	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
> +	buf = bootup_trigger_buf + len;
> +	boot_trigger_buf_len += strlen(buf);

The above needs to skip the '\0' too:

	boot_trigger_buf_len += strlen(buf) + 1;


> +
> +	for (i = nr_boot_triggers; i < MAX_BOOT_TRIGGERS; i++) {
>  		trigger = strsep(&buf, ",");
>  		if (!trigger)
>  			break;



* Re: [PATCH] tracing: Move snapshot code out of trace.c and into trace_snapshot.c
From: Steven Rostedt @ 2026-03-30 16:05 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: kernel test robot, LKML, Linux trace kernel, llvm, oe-kbuild-all,
	Masami Hiramatsu, Mathieu Desnoyers
In-Reply-To: <8580f943-4c37-4c66-937d-adee13b72201@app.fastmail.com>

On Mon, 30 Mar 2026 16:06:44 +0200
"Arnd Bergmann" <arnd@arndb.de> wrote:

> I saw the same thing and worked around it by removing the function.
> I then noticed that a bunch of code surrounding it is also unused
> and I removed that as well (see below). This version passes
> my randconfig build tests, but I suspect it is still wrong,
> since the code never had any callers and I don't understand
> why.

Note, this code is in include/linux/trace_printk.h, and is for debugging
purposes (just like trace_printk() is). Hence, it shouldn't be removed.

The purpose is to call tracing_snapshot() when your code detects something
isn't right (but it doesn't crash), and this will take a snapshot of the
current trace that led up to the anomaly.

If anything, I should add more to Documentation/trace/debugging.rst about it.

-- Steve

^ permalink raw reply

* Re: [PATCH] rtla: Fix build without libbpf header
From: Wander Lairson Costa @ 2026-03-30 16:01 UTC (permalink / raw)
  To: Tomas Glozar
  Cc: Steven Rostedt, John Kacur, Luis Goncalves, Crystal Wood,
	Costa Shulyupin, LKML, linux-trace-kernel
In-Reply-To: <20260330091207.16184-1-tglozar@redhat.com>

On Mon, Mar 30, 2026 at 11:12:07AM +0200, Tomas Glozar wrote:
> rtla supports building without libbpf. However, BPF actions
> patchset [1] adds an include of bpf/libbpf.h into timerlat_bpf.h,
> which breaks build on systems that don't have libbpf headers
> installed.
> 
> This is a leftover from a draft version of the patchset where
> timerlat_bpf_set_action() (which takes a struct bpf_program * argument)
> was defined in the header. timerlat_bpf.c already includes bpf/libbpf.h
> via timerlat.skel.h when libbpf is present.
> 
> Remove the redundant include to fix build on systems without libbpf
> headers.
> 
> [1] https://lore.kernel.org/linux-trace-kernel/20251126144205.331954-1-tglozar@redhat.com/T/
> 
> Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Closes: https://lore.kernel.org/linux-trace-kernel/20260329122202.65a8b575@robin/
> Fixes: 8cd0f08ac72e ("rtla/timerlat: Support tail call from BPF program")
> Signed-off-by: Tomas Glozar <tglozar@redhat.com>
> ---
>  tools/tracing/rtla/src/timerlat_bpf.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/tools/tracing/rtla/src/timerlat_bpf.h b/tools/tracing/rtla/src/timerlat_bpf.h
> index 169abeaf4363..f7c5675737fe 100644
> --- a/tools/tracing/rtla/src/timerlat_bpf.h
> +++ b/tools/tracing/rtla/src/timerlat_bpf.h
> @@ -12,7 +12,6 @@ enum summary_field {
>  };
>  
>  #ifndef __bpf__
> -#include <bpf/libbpf.h>
>  #ifdef HAVE_BPF_SKEL
>  int timerlat_bpf_init(struct timerlat_params *params);
>  int timerlat_bpf_attach(void);
> -- 
> 2.53.0
> 

Reviewed-by: Wander Lairson Costa <wander@redhat.com>


^ permalink raw reply

* Re: [PATCH v2] tracing/osnoise: Add option to align tlat threads
From: Wander Lairson Costa @ 2026-03-30 16:00 UTC (permalink / raw)
  To: Tomas Glozar
  Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, John Kacur,
	Luis Goncalves, Crystal Wood, Costa Shulyupin, LKML,
	linux-trace-kernel
In-Reply-To: <20260302131316.385987-1-tglozar@redhat.com>

On Mon, Mar 02, 2026 at 02:13:16PM +0100, Tomas Glozar wrote:
> Add an option called TIMERLAT_ALIGN to osnoise/options, together with a
> corresponding setting osnoise/timerlat_align_us.
> 
> This option sets the alignment of wakeup times between different
> timerlat threads, similarly to cyclictest's -A/--aligned option. If
> TIMERLAT_ALIGN is set, the first thread that reaches the first cycle
> records its first wake-up time. Each following thread sets its first
> wake-up time to a fixed offset from the recorded time, and increments
> it by the same offset.
> 
> Example:
> 
> osnoise/timerlat_period is set to 1000, osnoise/timerlat_align_us is
> set to 20. There are four threads, on CPUs 1 to 4.
> 
> - CPU 4 enters first cycle first. The current time is 20000us, so
> the wake-up of the first cycle is set to 21000us. This time is recorded.
> - CPU 2 enters first cycle next. It reads the recorded time, increments
> it to 21020us, and uses this value as its own wake-up time for the first
> cycle.
> - CPU 3 enters first cycle next. It reads the recorded time, increments
> it to 21040us, and uses the value as its own wake-up time.
> - CPU 1 proceeds analogously.
> 
> In each next cycle, the wake-up time (called "absolute period" in
> timerlat code) is incremented by the (relative) period of 1000us. Thus,
> the wake-ups in the following cycles (provided the times are reached and
> not in the past) will be as follows:
> 
> CPU 1		CPU 2		CPU 3	 	CPU 4
> 21080us		21020us		21040us		21000us
> 22080us		22020us		22040us		22000us
> ...		...		...		...
> 
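
The arithmetic in the quoted example can be sketched as two small helpers. This is only an illustration of the description above, assuming each thread's slot is determined by the order in which it reaches its first cycle; first_wakeup_us() and wakeup_in_cycle_us() are hypothetical names, and the sketch says nothing about how the kernel patch shares the recorded time between CPUs.

```c
#include <stdint.h>

/*
 * First wake-up of the thread that is the nth (0-based) to reach its
 * first cycle, given the wake-up time recorded by the very first thread.
 */
static uint64_t first_wakeup_us(uint64_t recorded_us, uint64_t align_us,
				unsigned int nth)
{
	return recorded_us + (uint64_t)nth * align_us;
}

/* Wake-up in a later cycle: the absolute period advances by period_us. */
static uint64_t wakeup_in_cycle_us(uint64_t first_us, uint64_t period_us,
				   unsigned int cycle)
{
	return first_us + (uint64_t)cycle * period_us;
}
```

With a recorded time of 21000us and an alignment of 20us, the second and third threads to arrive get 21020us and 21040us, and a period of 1000us moves 21020us to 22020us in the next cycle.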

Reviewed-by: Wander Lairson Costa <wander@redhat.com>


^ permalink raw reply

* Re: [PATCH v2] bootconfig: Apply early options from embedded config
From: Breno Leitao @ 2026-03-30 15:04 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
	linux-doc, oss, paulmck, rostedt, kernel-team
In-Reply-To: <acpzhCBEPh-tKVqg@gmail.com>

On Mon, Mar 30, 2026 at 06:15:17AM -0700, Breno Leitao wrote:
> On Fri, Mar 27, 2026 at 10:37:44PM +0900, Masami Hiramatsu wrote:
> > On Fri, 27 Mar 2026 03:06:41 -0700
> > Breno Leitao <leitao@debian.org> wrote:
>
> > > > To fix this, we need to change setup_arch() for each architecture so
> > > > that it calls this bootconfig_apply_early_params().
> > >
> > > Could we instead integrate this into parse_early_param() itself? That
> > > approach would avoid the need to modify each architecture individually.
> >
> > Ah, indeed.
>
> I investigated integrating bootconfig into parse_early_param() and hit a
> blocker: xbc_init() and xbc_make_cmdline() depend on memblock_alloc(), but on
> most architectures (x86, arm64, arm, s390, riscv) parse_early_param() is called
> from setup_arch() _before_ memblock is initialized.

That said, I'd like to propose a simpler approach as a first step:

1) Keep calling bootconfig_apply_early_params() from setup_boot_config().
   This is the least intrusive approach and expands bootconfig support to
   additional early boot parameters.

2) Document that architecture-specific early parameters might be ignored.
   If a parameter is consumed early enough (during setup_arch()), it will
   not see the bootconfig value.

3) Ensure that early bootconfig parameters don't overwrite the boot command
   line. For example, if the boot command line has foo=bar and bootconfig
   later has foo=baz, the command line value should take precedence.
   This prevents early boot code (in setup_arch()) from seeing a parameter
   value that will be changed later.
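
The precedence rule in point 3 can be sketched in userspace as a whole-word lookup on the command line. This is a simplified illustration under the assumption that parameters are separated by single spaces and are not quoted; cmdline_mentions() is a hypothetical name used only here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if param appears on cmdline as a whole word, either as
 * "param", or as "param=value". A bootconfig value for such a
 * parameter would be skipped so the command line keeps precedence.
 */
static bool cmdline_mentions(const char *cmdline, const char *param)
{
	size_t len = strlen(param);
	const char *p = cmdline;

	if (!len)
		return false;

	while ((p = strstr(p, param)) != NULL) {
		/* The match must start the line or follow a space. */
		bool at_start = (p == cmdline) || (p[-1] == ' ');
		char next = p[len];

		if (at_start && (next == '=' || next == ' ' || next == '\0'))
			return true;
		p += len;
	}
	return false;
}
```

For a command line of "console=ttyS0 foo=bar", a lookup of "foo" matches and the bootconfig value would be ignored, while "nofoo=bar" or "foobar=1" would not count as a match for "foo".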


If that is OK, this is what I have right now:

commit dd6e00e41c381e5fef9d22dda02b104aa8f83101
Author: Breno Leitao <leitao@debian.org>
Date:   Mon Mar 30 06:50:28 2026 -0700

    bootconfig: Apply early options from embedded config
    
    Bootconfig currently cannot apply early kernel parameters. For example,
    the "mitigations=" parameter must be passed through traditional boot
    methods because bootconfig parsing happens after these early parameters
    need to be processed.
    
    Add bootconfig_apply_early_params() which walks all kernel.* keys in the
    parsed XBC tree and calls do_early_param() for each one. It is called
    from setup_boot_config() immediately after a successful xbc_init() on
    the embedded data, which happens before parse_early_param() runs in
    start_kernel().
    
    This allows early options such as:
    
      kernel.mitigations = off
    
    to be placed in the embedded bootconfig and take effect, without
    requiring them on the kernel command line.
    
    If the same parameter appears on both the kernel command line and in
    the embedded bootconfig, the command-line value takes precedence:
    bootconfig_apply_early_params() checks boot_command_line and skips
    any parameter already present there.
    
    Known limitations are documented:
    - Early options in initrd bootconfig are still silently ignored, as the
      initrd is only available after the early param window has closed.
    - Arch-specific early params consumed during setup_arch() (e.g. mem=,
      earlycon, noapic) may not take effect from bootconfig.
    
    Signed-off-by: Breno Leitao <leitao@debian.org>

diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
index f712758472d5c..6ed852a0c66d8 100644
--- a/Documentation/admin-guide/bootconfig.rst
+++ b/Documentation/admin-guide/bootconfig.rst
@@ -169,6 +169,15 @@ Boot Kernel With a Boot Config
 There are two options to boot the kernel with bootconfig: attaching the
 bootconfig to the initrd image or embedding it in the kernel itself.
 
+Early options (those registered with ``early_param()``) may only be
+specified in the embedded bootconfig, because the initrd is not yet
+available when early parameters are processed.
+
+Note that embedded bootconfig is parsed after ``setup_arch()``, so
+early options that are consumed during architecture initialization
+(e.g., ``mem=``, ``memmap=``, ``earlycon``, ``noapic``, ``nolapic``,
+``acpi=``, ``numa=``, ``iommu=``) may not take effect from bootconfig.
+
 Attaching a Boot Config to Initrd
 ---------------------------------
 
diff --git a/init/Kconfig b/init/Kconfig
index 7484cd703bc1a..34adcc1feb9b6 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1525,6 +1525,16 @@ config BOOT_CONFIG_EMBED
 	  image. But if the system doesn't support initrd, this option will
 	  help you by embedding a bootconfig file while building the kernel.
 
+	  Unlike bootconfig attached to initrd, the embedded bootconfig also
+	  supports early options (those registered with early_param()). Any
+	  kernel.* key in the embedded bootconfig is applied before
+	  parse_early_param() runs.  Early options in initrd bootconfig will
+	  not be applied.  Early options consumed during setup_arch() (e.g.
+	  mem=, memmap=, earlycon, noapic, acpi=, numa=, iommu=) may not
+	  take effect.  If the same early option
+	  appears in both bootconfig and the kernel command line, the
+	  command line value takes precedence.
+
 	  If unsure, say N.
 
 config BOOT_CONFIG_EMBED_FILE
diff --git a/init/main.c b/init/main.c
index 1cb395dd94e43..487fe86ab5c09 100644
--- a/init/main.c
+++ b/init/main.c
@@ -414,10 +414,112 @@ static int __init warn_bootconfig(char *str)
 	return 0;
 }
 
+/*
+ * do_early_param() is defined later in this file but called from
+ * bootconfig_apply_early_params() below, so we need a forward declaration.
+ */
+static int __init do_early_param(char *param, char *val,
+				 const char *unused, void *arg);
+
+/*
+ * Check if a parameter name appears on the kernel command line.
+ * Returns true if the parameter was explicitly passed by the bootloader.
+ */
+static bool __init cmdline_has_param(const char *param)
+{
+	const char *p = boot_command_line;
+	int len = strlen(param);
+
+	while ((p = strstr(p, param)) != NULL) {
+		/* Check it's a whole-word match: preceded by space/start */
+		if (p != boot_command_line && *(p - 1) != ' ') {
+			p += len;
+			continue;
+		}
+		/* Followed by =, space, or end of string */
+		if (p[len] == '=' || p[len] == ' ' || p[len] == '\0')
+			return true;
+		p += len;
+	}
+	return false;
+}
+
+/*
+ * bootconfig_apply_early_params - apply kernel.* keys from the embedded
+ * bootconfig as early_param() calls.
+ *
+ * early_param() handlers run before most of the kernel initialises.
+ * A bootconfig attached to initrd arrives too late because the initrd is
+ * not mapped when early params are processed.  The embedded bootconfig
+ * lives in the kernel image itself (.init.data), so it is always
+ * reachable.
+ *
+ * Called from setup_boot_config() which runs before parse_early_param()
+ * in start_kernel(), but after setup_arch().  Arch-specific early params
+ * parsed during setup_arch() will not see bootconfig values.
+ */
+static void __init bootconfig_apply_early_params(void)
+{
+	struct xbc_node *knode, *vnode, *root;
+	const char *val;
+	char *val_copy;
+
+	root = xbc_find_node("kernel");
+	if (!root)
+		return;
+
+	xbc_node_for_each_key_value(root, knode, val) {
+		if (xbc_node_compose_key_after(root, knode,
+					       xbc_namebuf,
+					       XBC_KEYLEN_MAX) < 0)
+			continue;
+
+		/* Command-line values take precedence over bootconfig */
+		if (cmdline_has_param(xbc_namebuf)) {
+			pr_info("bootconfig: skipping '%s', already on command line\n",
+				xbc_namebuf);
+			continue;
+		}
+
+		/* Boolean key with no value — pass NULL like parse_args() */
+		if (!xbc_node_get_child(knode)) {
+			do_early_param(xbc_namebuf, NULL, NULL, NULL);
+			continue;
+		}
+
+		/*
+		 * Iterate array values: "foo = bar, buz" becomes two
+		 * calls: do_early_param("foo", "bar") and
+		 * do_early_param("foo", "buz").
+		 */
+		vnode = xbc_node_get_child(knode);
+		xbc_array_for_each_value(vnode, val) {
+			/*
+			 * Some early_param handlers save the pointer to
+			 * val, so each value needs its own persistent
+			 * copy.  memblock is available here since we run
+			 * after setup_arch().  These allocations are
+			 * intentionally never freed because the handlers
+			 * may retain references indefinitely.
+			 */
+			val_copy = memblock_alloc(strlen(val) + 1,
+						  SMP_CACHE_BYTES);
+			if (!val_copy) {
+				pr_err("Failed to allocate bootconfig value for '%s'\n",
+				       xbc_namebuf);
+				continue;
+			}
+			strcpy(val_copy, val);
+			do_early_param(xbc_namebuf, val_copy, NULL, NULL);
+		}
+	}
+}
+
 static void __init setup_boot_config(void)
 {
 	static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
 	const char *msg, *data;
+	bool embedded = false;
 	int pos, ret;
 	size_t size;
 	char *err;
@@ -425,8 +527,11 @@ static void __init setup_boot_config(void)
 	/* Cut out the bootconfig data even if we have no bootconfig option */
 	data = get_boot_config_from_initrd(&size);
 	/* If there is no bootconfig in initrd, try embedded one. */
-	if (!data)
+	if (!data) {
 		data = xbc_get_embedded_bootconfig(&size);
+		/* tag we have embedded data */
+		embedded = !!data;
+	}
 
 	strscpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
 	err = parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
@@ -464,6 +569,8 @@ static void __init setup_boot_config(void)
 	} else {
 		xbc_get_info(&ret, NULL);
 		pr_info("Load bootconfig: %ld bytes %d nodes\n", (long)size, ret);
+		if (embedded)
+			bootconfig_apply_early_params();
 		/* keys starting with "kernel." are passed via cmdline */
 		extra_command_line = xbc_make_cmdline("kernel");
 		/* Also, "init." keys are init arguments */

^ permalink raw reply related

* Re: [PATCH v6] tracing: Preserve repeated boot-time tracing parameters
From: Steven Rostedt @ 2026-03-30 14:43 UTC (permalink / raw)
  To: Wesley Atwell
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel
In-Reply-To: <20260329184254.1813273-1-atwellwea@gmail.com>

On Sun, 29 Mar 2026 12:42:54 -0600
Wesley Atwell <atwellwea@gmail.com> wrote:

BTW, please do not reply to old versions of a patch with new versions. It
makes it much more difficult for maintainers to find the latest patch.

New versions of a patch should *always* start a new thread!

> Some tracing boot parameters already accept delimited value lists, but
> their __setup() handlers keep only the last instance seen at boot.
> Make repeated instances append to the same boot-time buffer in the
> format each parser already consumes.
> 
> Use a shared trace_append_boot_param() helper for the ftrace filters,
> trace_options, and kprobe_event boot parameters. trace_trigger=
> still tokenizes a temporary parse buffer in place, but now copies each
> parsed event/trigger pair into boot-time storage so repeated instances
> do not overwrite earlier ones.
> 
> This also lets Bootconfig array values work naturally when they expand
> to repeated param=value entries.
> 
> Before this change, only the last instance from each repeated
> parameter survived boot.
> 
> Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
> ---
> Changes since v5: https://lore.kernel.org/all/20260328201842.1782806-1-atwellwea@gmail.com/

This is also why I suggested using the above link. The link shows how to
find the old version of the patch, without relying on "In-Reply-To" header.

> - add the separator accounting comment in trace_append_boot_param()
> - keep the existing trace_trigger= temporary buffer and copy each
>   parsed event/trigger pair into boot-time storage instead of tracking
>   a running offset inside that buffer
> 
>  kernel/trace/ftrace.c       | 12 ++++++++----
>  kernel/trace/trace.c        | 30 +++++++++++++++++++++++++++++-
>  kernel/trace/trace.h        |  2 ++
>  kernel/trace/trace_events.c | 24 +++++++++++++++++++++---
>  kernel/trace/trace_kprobe.c |  3 ++-
>  5 files changed, 62 insertions(+), 9 deletions(-)
> 

> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -228,6 +228,33 @@ static int boot_instance_index;
>  static char boot_snapshot_info[COMMAND_LINE_SIZE] __initdata;
>  static int boot_snapshot_index;
>  
> +/*
> + * Repeated boot parameters, including Bootconfig array expansions, need
> + * to stay in the delimiter form that the existing parser consumes.
> + */
> +void __init trace_append_boot_param(char *buf, const char *str, char sep,
> +				    int size)
> +{
> +	int len, needed, str_len;
> +
> +	if (!*str)
> +		return;
> +
> +	len = strlen(buf);
> +	str_len = strlen(str);
> +	needed = len + str_len + 1;

Nit, but it would be nice to have a blank line here.

> +	/* For continuation, account for the separator. */
> +	if (len)
> +		needed++;
> +	if (needed > size)
> +		return;
> +
> +	if (len)
> +		buf[len++] = sep;
> +
> +	strscpy(buf + len, str, size - len);
> +}
> +
>  static int __init set_cmdline_ftrace(char *str)
>  {
>  	strscpy(bootup_tracer_buf, str, MAX_TRACER_SIZE);
> @@ -329,7 +356,8 @@ static char trace_boot_options_buf[MAX_TRACER_SIZE] __initdata;
>  
>  static int __init set_trace_boot_options(char *str)
>  {
> -	strscpy(trace_boot_options_buf, str, MAX_TRACER_SIZE);
> +	trace_append_boot_param(trace_boot_options_buf, str, ',',
> +				MAX_TRACER_SIZE);
>  	return 1;
>  }
>  __setup("trace_options=", set_trace_boot_options);
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index b8f3804586a0..237a0417de1c 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -863,6 +863,8 @@ extern int DYN_FTRACE_TEST_NAME(void);
>  extern int DYN_FTRACE_TEST_NAME2(void);
>  
>  extern void trace_set_ring_buffer_expanded(struct trace_array *tr);
> +void __init trace_append_boot_param(char *buf, const char *str,
> +				    char sep, int size);
>  extern bool tracing_selftest_disabled;
>  
>  #ifdef CONFIG_FTRACE_STARTUP_TEST
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 249d1cba72c0..1c4a4a46169e 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -17,6 +17,7 @@
>  #include <linux/kthread.h>
>  #include <linux/tracefs.h>
>  #include <linux/uaccess.h>
> +#include <linux/memblock.h>
>  #include <linux/module.h>
>  #include <linux/ctype.h>
>  #include <linux/sort.h>
> @@ -3674,7 +3675,7 @@ trace_create_new_event(struct trace_event_call *call,
>  #define MAX_BOOT_TRIGGERS 32
>  
>  static struct boot_triggers {
> -	const char		*event;
> +	char			*event;
>  	char			*trigger;
>  } bootup_triggers[MAX_BOOT_TRIGGERS];
>  
> @@ -3683,6 +3684,7 @@ static int nr_boot_triggers;
>  
>  static __init int setup_trace_triggers(char *str)
>  {
> +	char *event;
>  	char *trigger;
>  	char *buf;
>  	int i;
> @@ -3692,14 +3694,30 @@ static __init int setup_trace_triggers(char *str)
>  	disable_tracing_selftest("running event triggers");
>  
>  	buf = bootup_trigger_buf;
> -	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
> +	for (i = nr_boot_triggers; i < MAX_BOOT_TRIGGERS; i++) {

Let's not make this so complex.

This function isn't the same as the other functions. It doesn't need to add
separators to the temp buffer. It only needs to append it.
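
For contrast, the separator-joining behaviour that the other handlers need can be sketched in userspace as follows. This mirrors the size accounting of the quoted helper but is not the kernel function: append_param() is a hypothetical name, and unlike the quoted code it reports whether the value fit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Append str to the NUL-terminated list in buf, inserting sep between
 * entries. Returns false and leaves buf untouched if it does not fit.
 */
static bool append_param(char *buf, size_t size, const char *str, char sep)
{
	size_t len = strlen(buf);
	size_t str_len = strlen(str);
	size_t needed = len + str_len + 1;	/* +1 for the '\0' */

	if (!str_len)
		return true;	/* nothing to add */

	/* A continuation also needs room for the separator. */
	if (len)
		needed++;
	if (needed > size)
		return false;

	if (len)
		buf[len++] = sep;
	memcpy(buf + len, str, str_len + 1);
	return true;
}
```

Appending "sym-addr" and then "irq-info" with ',' as the separator leaves "sym-addr,irq-info" in the buffer, which is the delimiter form a comma-separated option parser consumes.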


>  		trigger = strsep(&buf, ",");
>  		if (!trigger)
>  			break;
> -		bootup_triggers[i].event = strsep(&trigger, ".");
> +		event = strsep(&trigger, ".");
>  		bootup_triggers[i].trigger = trigger;
>  		if (!bootup_triggers[i].trigger)
>  			break;
> +
> +		/*
> +		 * Keep each parsed trigger outside the temporary setup
> +		 * buffer so repeated trace_trigger= entries do not
> +		 * overwrite earlier ones.
> +		 */
> +		bootup_triggers[i].event =
> +			memblock_alloc_or_panic(strlen(event) + 1,
> +						SMP_CACHE_BYTES);
> +		strscpy(bootup_triggers[i].event, event,
> +			strlen(event) + 1);
> +		bootup_triggers[i].trigger =
> +			memblock_alloc_or_panic(strlen(trigger) + 1,
> +						SMP_CACHE_BYTES);
> +		strscpy(bootup_triggers[i].trigger, trigger,
> +			strlen(trigger) + 1);
>  	}

I believe all you need for the boot triggers is this:

  (Not even compile tested)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 9928da636c9d..7754a8adb58a 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -3677,20 +3677,24 @@ static struct boot_triggers {
 } bootup_triggers[MAX_BOOT_TRIGGERS];
 
 static char bootup_trigger_buf[COMMAND_LINE_SIZE];
+static int boot_trigger_buf_len;
 static int nr_boot_triggers;
 
 static __init int setup_trace_triggers(char *str)
 {
 	char *trigger;
 	char *buf;
+	int len = boot_trigger_buf_len;
 	int i;
 
-	strscpy(bootup_trigger_buf, str, COMMAND_LINE_SIZE);
+	strscpy(bootup_trigger_buf + len , str, COMMAND_LINE_SIZE - len);
 	trace_set_ring_buffer_expanded(NULL);
 	disable_tracing_selftest("running event triggers");
 
-	buf = bootup_trigger_buf;
-	for (i = 0; i < MAX_BOOT_TRIGGERS; i++) {
+	buf = bootup_trigger_buf + len;
+	boot_trigger_buf_len += strlen(buf);
+
+	for (i = nr_boot_triggers; i < MAX_BOOT_TRIGGERS; i++) {
 		trigger = strsep(&buf, ",");
 		if (!trigger)
 			break;

^ permalink raw reply related

* Re: [PATCH v2 1/2] tracing/hist: rebuild full_name on each hist_field_name() call
From: Steven Rostedt @ 2026-03-30 14:22 UTC (permalink / raw)
  To: Pengpeng Hou
  Cc: mhiramat, mathieu.desnoyers, tom.zanussi, linux-kernel,
	linux-trace-kernel
In-Reply-To: <20260330024619.38459-1-pengpeng@iscas.ac.cn>

On Mon, 30 Mar 2026 10:46:19 +0800
Pengpeng Hou <pengpeng@iscas.ac.cn> wrote:


Please resend both patches as a separate thread series. Do not send new
versions of the patch as a reply to the old one. That just makes it much
harder for maintainers to keep track of patches, as they are hidden within
threads.

> hist_field_name() uses a static MAX_FILTER_STR_VAL buffer for fully
> qualified variable-reference names, but it currently appends into that
> buffer with strcat() without rebuilding it first. As a result, repeated
> calls append a new "system.event.field" name onto the previous one,
> which can eventually run past the end of full_name.
> 
> Build the name with snprintf() on each call and return NULL if the fully
> qualified name does not fit in MAX_FILTER_STR_VAL.
> 
> Fixes: 067fe038e70f ("tracing: Add variable reference handling to hist triggers")
> Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
> ---
> v2:

Instead of saying "v2", use:

Changes since v1: https://lore.kernel.org/all/20260329030950.32503-1-pengpeng@iscas.ac.cn/

That keeps the history link of this patch compared to the previous version.

-- Steve


> - rebuild full_name on each call instead of falling back to field->name
> - return NULL on overflow as suggested
> - split out the snprintf() length check instead of using an inline if


^ permalink raw reply

* Re: [PATCH] tracing: Move snapshot code out of trace.c and into trace_snapshot.c
From: Arnd Bergmann @ 2026-03-30 14:06 UTC (permalink / raw)
  To: kernel test robot, Steven Rostedt, LKML, Linux trace kernel
  Cc: llvm, oe-kbuild-all, Masami Hiramatsu, Mathieu Desnoyers
In-Reply-To: <202603070230.Zz4BBLtb-lkp@intel.com>

On Fri, Mar 6, 2026, at 20:07, kernel test robot wrote:
>>> kernel/trace/trace.c:820:5: warning: no previous prototype for function 'tracing_alloc_snapshot' [-Wmissing-prototypes]
>      820 | int tracing_alloc_snapshot(void)
>          |     ^
>    kernel/trace/trace.c:820:1: note: declare 'static' if the function 
> is not intended to be used outside of this translation unit
>      820 | int tracing_alloc_snapshot(void)
>          | ^
>          | static 
>    1 warning generated.

I saw the same thing and worked around it by removing the function.
I then noticed that a bunch of code surrounding it is also unused
and I removed that as well (see below). This version passes
my randconfig build tests, but I suspect it is still wrong,
since the code never had any callers and I don't understand
why.

       Arnd


diff --git a/include/linux/trace_printk.h b/include/linux/trace_printk.h
index 2670ec7f4262..87466d8df147 100644
--- a/include/linux/trace_printk.h
+++ b/include/linux/trace_printk.h
@@ -38,8 +38,6 @@ enum ftrace_dump_mode {
 void tracing_on(void);
 void tracing_off(void);
 int tracing_is_on(void);
-void tracing_snapshot(void);
-void tracing_snapshot_alloc(void);
 
 extern void tracing_start(void);
 extern void tracing_stop(void);
@@ -184,8 +182,6 @@ static inline void trace_dump_stack(int skip) { }
 static inline void tracing_on(void) { }
 static inline void tracing_off(void) { }
 static inline int tracing_is_on(void) { return 0; }
-static inline void tracing_snapshot(void) { }
-static inline void tracing_snapshot_alloc(void) { }
 
 static inline __printf(1, 2)
 int trace_printk(const char *fmt, ...)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index ec2b926436a7..76fe2c758734 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -767,70 +767,6 @@ void tracing_on(void)
 }
 EXPORT_SYMBOL_GPL(tracing_on);
 
-#ifdef CONFIG_TRACER_SNAPSHOT
-/**
- * tracing_snapshot - take a snapshot of the current buffer.
- *
- * This causes a swap between the snapshot buffer and the current live
- * tracing buffer. You can use this to take snapshots of the live
- * trace when some condition is triggered, but continue to trace.
- *
- * Note, make sure to allocate the snapshot with either
- * a tracing_snapshot_alloc(), or by doing it manually
- * with: echo 1 > /sys/kernel/tracing/snapshot
- *
- * If the snapshot buffer is not allocated, it will stop tracing.
- * Basically making a permanent snapshot.
- */
-void tracing_snapshot(void)
-{
-	struct trace_array *tr = &global_trace;
-
-	tracing_snapshot_instance(tr);
-}
-EXPORT_SYMBOL_GPL(tracing_snapshot);
-
-/**
- * tracing_alloc_snapshot - allocate snapshot buffer.
- *
- * This only allocates the snapshot buffer if it isn't already
- * allocated - it doesn't also take a snapshot.
- *
- * This is meant to be used in cases where the snapshot buffer needs
- * to be set up for events that can't sleep but need to be able to
- * trigger a snapshot.
- */
-int tracing_alloc_snapshot(void)
-{
-	struct trace_array *tr = &global_trace;
-	int ret;
-
-	ret = tracing_alloc_snapshot_instance(tr);
-	WARN_ON(ret < 0);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(tracing_alloc_snapshot);
-#else
-void tracing_snapshot(void)
-{
-	WARN_ONCE(1, "Snapshot feature not enabled, but internal snapshot used");
-}
-EXPORT_SYMBOL_GPL(tracing_snapshot);
-int tracing_alloc_snapshot(void)
-{
-	WARN_ONCE(1, "Snapshot feature not enabled, but snapshot allocation used");
-	return -ENODEV;
-}
-EXPORT_SYMBOL_GPL(tracing_alloc_snapshot);
-void tracing_snapshot_alloc(void)
-{
-	/* Give warning */
-	tracing_snapshot();
-}
-EXPORT_SYMBOL_GPL(tracing_snapshot_alloc);
-#endif /* CONFIG_TRACER_SNAPSHOT */
-
 void tracer_tracing_off(struct trace_array *tr)
 {
 	if (tr->array_buffer.buffer)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index e4cf6703b301..6abd9e16ef21 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -2306,7 +2306,6 @@ static inline int register_snapshot_cmd(void) { return 0; }
 # endif
 #else /* !CONFIG_TRACER_SNAPSHOT */
 static inline int trace_allocate_snapshot(struct trace_array *tr, int size) { return 0; }
-static inline int tracing_alloc_snapshot(void) { return 0; }
 static inline void tracing_snapshot_instance(struct trace_array *tr) { }
 static inline int tracing_alloc_snapshot_instance(struct trace_array *tr)
 {
diff --git a/kernel/trace/trace_snapshot.c b/kernel/trace/trace_snapshot.c
index 8865b2ef2264..926f395e5af4 100644
--- a/kernel/trace/trace_snapshot.c
+++ b/kernel/trace/trace_snapshot.c
@@ -237,29 +237,6 @@ void tracing_disarm_snapshot(struct trace_array *tr)
 	spin_unlock(&tr->snapshot_trigger_lock);
 }
 
-/**
- * tracing_snapshot_alloc - allocate and take a snapshot of the current buffer.
- *
- * This is similar to tracing_snapshot(), but it will allocate the
- * snapshot buffer if it isn't already allocated. Use this only
- * where it is safe to sleep, as the allocation may sleep.
- *
- * This causes a swap between the snapshot buffer and the current live
- * tracing buffer. You can use this to take snapshots of the live
- * trace when some condition is triggered, but continue to trace.
- */
-void tracing_snapshot_alloc(void)
-{
-	int ret;
-
-	ret = tracing_alloc_snapshot();
-	if (ret < 0)
-		return;
-
-	tracing_snapshot();
-}
-EXPORT_SYMBOL_GPL(tracing_snapshot_alloc);
-
 /**
  * tracing_snapshot_cond_enable - enable conditional snapshot for an instance
  * @tr:		The tracing instance
@@ -391,8 +368,6 @@ void latency_fsnotify(struct trace_array *tr)
 	 */
 	irq_work_queue(&tr->fsnotify_irqwork);
 }
-#else
-static inline void latency_fsnotify(struct trace_array *tr) { }
 #endif /* LATENCY_FS_NOTIFY */
 static const struct file_operations tracing_max_lat_fops;
 

^ permalink raw reply related

* Re: [PATCH v2] bootconfig: Apply early options from embedded config
From: Breno Leitao @ 2026-03-30 13:15 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
	linux-doc, oss, paulmck, rostedt, kernel-team
In-Reply-To: <20260327223744.f246150adc1671f7605a4f0a@kernel.org>

On Fri, Mar 27, 2026 at 10:37:44PM +0900, Masami Hiramatsu wrote:
> On Fri, 27 Mar 2026 03:06:41 -0700
> Breno Leitao <leitao@debian.org> wrote:

> > > To fix this, we need to change setup_arch() for each architecture so
> > > that it calls this bootconfig_apply_early_params().
> > 
> > Could we instead integrate this into parse_early_param() itself? That
> > approach would avoid the need to modify each architecture individually.
> 
> Ah, indeed. 

I investigated integrating bootconfig into parse_early_param() and hit a
blocker: xbc_init() and xbc_make_cmdline() depend on memblock_alloc(), but on
most architectures (x86, arm64, arm, s390, riscv) parse_early_param() is called
from setup_arch() _before_ memblock is initialized.

So, bootconfig will not be available as early as parse_early_param(). 

An alternative is to replace memblock allocations in lib/bootconfig.c with static
__initdata buffers, similar to Petr's approach in 2023:

	https://lore.kernel.org/all/20231121231342.193646-3-oss@malat.biz/

However, there were concerns about the allocation size:

	Petr Malat <oss@malat.biz> wrote: 
	> To allow handling of early options, it's necessary to eliminate allocations
	> from embedded bootconfig handling

	"Hm, my concern is that this can introduce some sort of overhead to parse the bootconfig."

^ permalink raw reply

* [PATCH v14 5/5] ring-buffer: Add persistent ring buffer selftest
From: Masami Hiramatsu (Google) @ 2026-03-30 12:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, Ian Rogers
In-Reply-To: <177487498530.3463592.12715592581212799257.stgit@mhiramat.tok.corp.google.com>

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add a self-destractive test for the persistent ring buffer. This
will invalidate some sub-buffer pages in the persistent ring buffer
when kernel gets panic, and check whether the number of detected
invalid pages and the total entry_bytes are the same as record
after reboot.

This can ensure the kernel correctly recover partially corrupted
persistent ring buffer when boot.

The test only runs on the persistent ring buffer whose name is
"ptracingtest". And user has to fill it up with events before
kernel panics.

To run the test, enable CONFIG_RING_BUFFER_PERSISTENT_INJECT
and set up the kernel cmdline:

 reserve_mem=20M:2M:trace trace_instance=ptracingtest^traceoff@trace
 panic=1

Then run the following commands after the 1st boot:

 cd /sys/kernel/tracing/instances/ptracingtest
 echo 1 > tracing_on
 echo 1 > events/enable
 sleep 3
 echo c > /proc/sysrq-trigger

After the panic message, the kernel will reboot and run the
verification on the persistent ring buffer, e.g.

 Ring buffer meta [2] invalid buffer page detected
 Ring buffer meta [2] is from previous boot! (318 pages discarded)
 Ring buffer testing [2] invalid pages: PASSED (318/318)
 Ring buffer testing [2] entry_bytes: PASSED (1300476/1300476)
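
For readers who want to follow the inject/verify pairing without the kernel
context, below is a simplified user-space model of it. It is only a sketch
under stated assumptions: the commit counters are a plain uint32_t array,
SUBBUF_SIZE, struct test_result and both function names are hypothetical, and
the boot-side check is reduced to "commit larger than the sub-buffer size",
whereas the kernel's validation path does considerably more.

```c
#include <assert.h>
#include <stdint.h>

#define SUBBUF_SIZE 4096u	/* hypothetical sub-buffer size */

struct test_result {
	uint32_t nr_invalid;	/* number of pages counted as invalid */
	uint32_t entry_bytes;	/* commit bytes summed over valid pages */
};

/*
 * Panic side: corrupt every even-indexed page that is in use by pushing
 * its commit past the sub-buffer size, and record what is left intact.
 */
static struct test_result inject_invalid_pages(uint32_t *commit, int nr_pages)
{
	struct test_result rec = { 0, 0 };

	for (int i = 0; i < nr_pages; i++) {
		if (!commit[i])		/* skip unused pages */
			continue;
		assert(commit[i] <= SUBBUF_SIZE);
		if (!(i & 1)) {
			commit[i] += SUBBUF_SIZE + 1;
			rec.nr_invalid++;
		} else {
			rec.entry_bytes += commit[i];
		}
	}
	return rec;
}

/*
 * Boot side: discard any page whose commit cannot fit in a sub-buffer
 * and sum the bytes of the pages that survive.
 */
static struct test_result validate_pages(uint32_t *commit, int nr_pages)
{
	struct test_result seen = { 0, 0 };

	for (int i = 0; i < nr_pages; i++) {
		if (commit[i] > SUBBUF_SIZE) {
			commit[i] = 0;
			seen.nr_invalid++;
		} else {
			seen.entry_bytes += commit[i];
		}
	}
	return seen;
}
```

The test passes when the values returned by validate_pages() equal those
recorded by inject_invalid_pages(), which corresponds to the two PASSED lines
above.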

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v14:
  - Rename config to CONFIG_RING_BUFFER_PERSISTENT_INJECT.
  - Clear meta->nr_invalid/entry_bytes after testing.
  - Add test commands in config comment.
 Changes in v10:
  - Add entry_bytes test.
  - Do not compile test code if CONFIG_RING_BUFFER_PERSISTENT_SELFTEST=n.
 Changes in v9:
  - Test also reader pages.
---
 include/linux/ring_buffer.h |    1 +
 kernel/trace/Kconfig        |   31 +++++++++++++++++++
 kernel/trace/ring_buffer.c  |   71 +++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace.c        |    4 ++
 4 files changed, 107 insertions(+)

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 994f52b34344..0670742b2d60 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -238,6 +238,7 @@ int ring_buffer_subbuf_size_get(struct trace_buffer *buffer);
 
 enum ring_buffer_flags {
 	RB_FL_OVERWRITE		= 1 << 0,
+	RB_FL_TESTING		= 1 << 1,
 };
 
 #ifdef CONFIG_RING_BUFFER
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index e130da35808f..07305ed6d745 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -1202,6 +1202,37 @@ config RING_BUFFER_VALIDATE_TIME_DELTAS
 	  Only say Y if you understand what this does, and you
 	  still want it enabled. Otherwise say N
 
+config RING_BUFFER_PERSISTENT_INJECT
+	bool "Enable persistent ring buffer error injection test"
+	depends on RING_BUFFER
+	help
+	  Run a selftest on the persistent ring buffer named
+	  "ptracingtest" (and its backup) by invalidating ring
+	  buffer pages when the kernel panics.
+	  To use this, boot kernel with "ptracingtest" persistent
+	  ring buffer, e.g.
+
+	   reserve_mem=20M:2M:trace trace_instance=ptracingtest@trace panic=1
+
+	  After the 1st boot, run the test commands, e.g.
+
+	   cd /sys/kernel/tracing/instances/ptracingtest
+	   echo 1 > events/enable
+	   echo 1 > tracing_on
+	   sleep 3
+	   echo c > /proc/sysrq-trigger
+
+	  After the panic message, the kernel reboots and shows the test
+	  results in the boot log.
+
+	  Note that the user has to enable events on the persistent ring
+	  buffer manually to fill up the ring buffers before rebooting.
+	  Since this invalidates the data on test target ring buffer,
+	  "ptracingtest" persistent ring buffer must not be used for
+	  actual tracing, but only for testing.
+
+	  If unsure, say N
+
 config MMIOTRACE_TEST
 	tristate "Test module for mmiotrace"
 	depends on MMIOTRACE && m
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 5049cf13021e..7f8140c54fce 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -64,6 +64,10 @@ struct ring_buffer_cpu_meta {
 	unsigned long	commit_buffer;
 	__u32		subbuf_size;
 	__u32		nr_subbufs;
+#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
+	__u32		nr_invalid;
+	__u32		entry_bytes;
+#endif
 	int		buffers[];
 };
 
@@ -2078,6 +2082,21 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 
 	pr_info("Ring buffer meta [%d] is from previous boot! (%d pages discarded)\n",
 		cpu_buffer->cpu, discarded);
+
+#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
+	if (meta->nr_invalid)
+		pr_info("Ring buffer testing [%d] invalid pages: %s (%d/%d)\n",
+			cpu_buffer->cpu,
+			(discarded == meta->nr_invalid) ? "PASSED" : "FAILED",
+			discarded, meta->nr_invalid);
+	if (meta->entry_bytes)
+		pr_info("Ring buffer testing [%d] entry_bytes: %s (%ld/%ld)\n",
+			cpu_buffer->cpu,
+			(entry_bytes == meta->entry_bytes) ? "PASSED" : "FAILED",
+			(long)entry_bytes, (long)meta->entry_bytes);
+	meta->nr_invalid = 0;
+	meta->entry_bytes = 0;
+#endif
 	return;
 
  invalid:
@@ -2558,12 +2577,64 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
 	kfree(cpu_buffer);
 }
 
+#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
+static void rb_test_inject_invalid_pages(struct trace_buffer *buffer)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+	struct ring_buffer_cpu_meta *meta;
+	struct buffer_data_page *dpage;
+	u32 entry_bytes = 0;
+	unsigned long ptr;
+	int subbuf_size;
+	int invalid = 0;
+	int cpu;
+	int i;
+
+	if (!(buffer->flags & RB_FL_TESTING))
+		return;
+
+	guard(preempt)();
+	cpu = smp_processor_id();
+
+	cpu_buffer = buffer->buffers[cpu];
+	meta = cpu_buffer->ring_meta;
+	ptr = (unsigned long)rb_subbufs_from_meta(meta);
+	subbuf_size = meta->subbuf_size;
+
+	for (i = 0; i < meta->nr_subbufs; i++) {
+		int idx = meta->buffers[i];
+
+		dpage = (void *)(ptr + idx * subbuf_size);
+		/* Skip unused pages */
+		if (!local_read(&dpage->commit))
+			continue;
+
+		/* Invalidate even pages. */
+		if (!(i & 0x1)) {
+			local_add(subbuf_size + 1, &dpage->commit);
+			invalid++;
+		} else {
+			/* Count total commit bytes. */
+			entry_bytes += local_read(&dpage->commit);
+		}
+	}
+
+	pr_info("Inject invalidated %d pages on CPU%d, total size: %ld\n",
+		invalid, cpu, (long)entry_bytes);
+	meta->nr_invalid = invalid;
+	meta->entry_bytes = entry_bytes;
+}
+#else /* !CONFIG_RING_BUFFER_PERSISTENT_INJECT */
+#define rb_test_inject_invalid_pages(buffer)	do { } while (0)
+#endif
+
 /* Stop recording on a persistent buffer and flush cache if needed. */
 static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
 {
 	struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
 
 	ring_buffer_record_off(buffer);
+	rb_test_inject_invalid_pages(buffer);
 	arch_ring_buffer_flush_range(buffer->range_addr_start, buffer->range_addr_end);
 	return NOTIFY_DONE;
 }
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4189ec9df6a5..108b0d16badf 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -9366,6 +9366,8 @@ static void setup_trace_scratch(struct trace_array *tr,
 	memset(tscratch, 0, size);
 }
 
+#define TRACE_TEST_PTRACING_NAME	"ptracingtest"
+
 static int
 allocate_trace_buffer(struct trace_array *tr, struct array_buffer *buf, unsigned long size)
 {
@@ -9378,6 +9380,8 @@ allocate_trace_buffer(struct trace_array *tr, struct array_buffer *buf, unsigned
 	buf->tr = tr;
 
 	if (tr->range_addr_start && tr->range_addr_size) {
+		if (!strcmp(tr->name, TRACE_TEST_PTRACING_NAME))
+			rb_flags |= RB_FL_TESTING;
 		/* Add scratch buffer to handle 128 modules */
 		buf->buffer = ring_buffer_alloc_range(size, rb_flags, 0,
 						      tr->range_addr_start,


^ permalink raw reply related

* [PATCH v14 4/5] ring-buffer: Reset RB_MISSED_* flags on persistent ring buffer
From: Masami Hiramatsu (Google) @ 2026-03-30 12:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel, Ian Rogers
In-Reply-To: <177487498530.3463592.12715592581212799257.stgit@mhiramat.tok.corp.google.com>

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Reset the RB_MISSED_* flags when the persistent ring buffer is
validated at boot. These flags are used only by the reading
process, which is stopped at reboot and never restarted, so
they are meaningless on the next boot. Moreover, they can
confuse the read process after reboot.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v14:
   - Newly added.
---
 kernel/trace/ring_buffer.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index e5178239f2f9..5049cf13021e 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1903,6 +1903,7 @@ static int rb_validate_buffer(struct buffer_page *bpage, int cpu,
 		local_set(&bpage->page->commit, 0);
 	} else {
 		local_set(&bpage->entries, ret);
+		local_set(&bpage->page->commit, tail);
 	}
 
 	return ret;


^ permalink raw reply related

