public inbox for linux-trace-kernel@vger.kernel.org
* [PATCH v2 0/4] ring-buffer: Making persistent ring buffers robust
@ 2026-02-18 10:14 Masami Hiramatsu (Google)
  2026-02-18 10:14 ` [PATCH v2 1/4] ring-buffer: Fix to check event length before using Masami Hiramatsu (Google)
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-02-18 10:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

Hi,

Here is a series of patches that make persistent ring buffers
robust against failures. It fixes several issues seen with the
persistent ring buffer on real machines.
We naively assumed that event data stored in a persistent ring
buffer would be preserved across reboots. However, event data
is written to the hardware cache, which is not flushed on reboot.
As a result, the data in the ring buffer can be partially
corrupted; startup validation detects this and erases all ring
buffer data. (I have observed this on a real arm64 machine.)

To fix these issues, this series introduces the following patches:

- [1/4] Check the event length before using it. If event data
  is only partially saved, the length can be completely wrong
  and rb_read_data_buffer() will access an invalid address
  (which crashes the kernel at boot).

- [2/4] Flush and stop the persistent ring buffer on panic.
  For the kernel panic case, we can use a callback to stop event
  recording and flush the hardware cache for the persistent memory.
  This ensures that the ring buffer data is written to memory
  in the event of a panic.

- [3/4] Skip invalid sub-buffers when validating the persistent
  ring buffer. Instead of invalidating the whole CPU buffer,
  invalidate only the corrupted sub-buffer.

- [4/4] Record an invalid-buffer event on the invalidated buffer
  to notify users which sub-buffer was corrupted.

[3/4] and [4/4] could be combined, but I have separated them
for ease of review.

Thank you,

---

Masami Hiramatsu (Google) (4):
      ring-buffer: Fix to check event length before using
      ring-buffer: Flush and stop persistent ring buffer on panic
      ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
      ring-buffer: Record invalid buffer event


 kernel/trace/ring_buffer.c   |   84 ++++++++++++++++++++++++++++++++++++------
 kernel/trace/trace.h         |    1 +
 kernel/trace/trace_entries.h |   15 ++++++++
 3 files changed, 87 insertions(+), 13 deletions(-)

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>


* [PATCH v2 1/4] ring-buffer: Fix to check event length before using
  2026-02-18 10:14 [PATCH v2 0/4] ring-buffer: Making persistent ring buffers robust Masami Hiramatsu (Google)
@ 2026-02-18 10:14 ` Masami Hiramatsu (Google)
  2026-02-18 10:14 ` [PATCH v2 2/4] ring-buffer: Flush and stop persistent ring buffer on panic Masami Hiramatsu (Google)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-02-18 10:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Check the event length before using it to advance to the next index in
rb_read_data_buffer(). Since this function is used for validating
possibly broken ring buffers, the length of an event may itself be
corrupted. In that case, the next event (e + len) can point to a wrong
address. To avoid invalid memory access at boot, check that the length
of each event is within the possible range before using it.

Fixes: 5f3b6e839f3c ("ring-buffer: Validate boot range memory events")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/ring_buffer.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index d33103408955..f7fd4bdf6560 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1849,6 +1849,7 @@ static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu
 	struct ring_buffer_event *event;
 	u64 ts, delta;
 	int events = 0;
+	int len;
 	int e;
 
 	*delta_ptr = 0;
@@ -1856,9 +1857,12 @@ static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu
 
 	ts = dpage->time_stamp;
 
-	for (e = 0; e < tail; e += rb_event_length(event)) {
+	for (e = 0; e < tail; e += len) {
 
 		event = (struct ring_buffer_event *)(dpage->data + e);
+		len = rb_event_length(event);
+		if (len <= 0 || len > tail - e)
+			return -1;
 
 		switch (event->type_len) {
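
As an illustration of the check added above, here is a minimal userspace
sketch. It is not the kernel code: the one-byte length header and the
names record_length() and count_records() are hypothetical, chosen only
to show the pattern of validating a length before it is used to advance
the cursor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record format: the first byte is the total record length. */
static int record_length(const uint8_t *rec)
{
	return rec[0];
}

/*
 * Walk the records in data[0..tail) and return how many were found,
 * or -1 if a record claims a length that is zero or runs past 'tail'.
 */
static int count_records(const uint8_t *data, int tail)
{
	int events = 0;
	int len;
	int e;

	for (e = 0; e < tail; e += len) {
		len = record_length(data + e);
		/* Reject the length before it is used to advance 'e'. */
		if (len <= 0 || len > tail - e)
			return -1;
		events++;
	}
	return events;
}
```

With the length read in the loop increment instead, a corrupted length
would already have moved the cursor outside the buffer before any check
could run.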
 



* [PATCH v2 2/4] ring-buffer: Flush and stop persistent ring buffer on panic
  2026-02-18 10:14 [PATCH v2 0/4] ring-buffer: Making persistent ring buffers robust Masami Hiramatsu (Google)
  2026-02-18 10:14 ` [PATCH v2 1/4] ring-buffer: Fix to check event length before using Masami Hiramatsu (Google)
@ 2026-02-18 10:14 ` Masami Hiramatsu (Google)
  2026-02-20 19:53   ` Steven Rostedt
  2026-02-18 10:14 ` [PATCH v2 3/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer Masami Hiramatsu (Google)
  2026-02-18 10:14 ` [PATCH v2 4/4] ring-buffer: Record invalid buffer event Masami Hiramatsu (Google)
  3 siblings, 1 reply; 9+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-02-18 10:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

On a real hardware, since panic and reboot the machine will not
flush hardware cache to the persistent ring buffer, the events
written right before the panic can be lost. Moreover, since
there will be an inconsistency between the commit counter (which
is written atomically via local_set()) and the data, validation
will fail and all data in the persistent ring buffer will be lost.

To avoid this issue, this will stop recording on the ring buffer
and flush cache at the reserved memory on panic.

Fixes: e645535a954a ("tracing: Add option to use memmapped memory for trace boot instance")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/ring_buffer.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index f7fd4bdf6560..d2b69221a94c 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -6,6 +6,7 @@
  */
 #include <linux/sched/isolation.h>
 #include <linux/trace_recursion.h>
+#include <linux/panic_notifier.h>
 #include <linux/trace_events.h>
 #include <linux/ring_buffer.h>
 #include <linux/trace_clock.h>
@@ -589,6 +590,7 @@ struct trace_buffer {
 
 	unsigned long			range_addr_start;
 	unsigned long			range_addr_end;
+	struct notifier_block		flush_nb;
 
 	struct ring_buffer_meta		*meta;
 
@@ -2470,6 +2472,16 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
 	kfree(cpu_buffer);
 }
 
+static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
+{
+	struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
+
+	ring_buffer_record_disable(buffer);
+	flush_kernel_vmap_range((void *)buffer->range_addr_start,
+				buffer->range_addr_end - buffer->range_addr_start);
+	return NOTIFY_DONE;
+}
+
 static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
 					 int order, unsigned long start,
 					 unsigned long end,
@@ -2589,6 +2601,12 @@ static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
 
 	mutex_init(&buffer->mutex);
 
+	/* Persistent ring buffer needs to flush cache before reboot. */
+	if (start && end) {
+		buffer->flush_nb.notifier_call = rb_flush_buffer_cb;
+		atomic_notifier_chain_register(&panic_notifier_list, &buffer->flush_nb);
+	}
+
 	return_ptr(buffer);
 
  fail_free_buffers:
@@ -2676,6 +2694,9 @@ ring_buffer_free(struct trace_buffer *buffer)
 {
 	int cpu;
 
+	if (buffer->range_addr_start && buffer->range_addr_end)
+		atomic_notifier_chain_unregister(&panic_notifier_list, &buffer->flush_nb);
+
 	cpuhp_state_remove_instance(CPUHP_TRACE_RB_PREPARE, &buffer->node);
 
 	irq_work_sync(&buffer->irq_work.work);
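
The callback pattern used in this patch can be sketched in userspace as
follows. This is only an illustration under stated assumptions: struct
notifier, struct demo_buffer and the demo_* functions are hypothetical
stand-ins, and the "flush" is a placeholder flag rather than a real
cache operation. The sketch tests the range with a logical AND, since a
bitwise AND of two non-zero addresses (for example 0x1000 and 0x2000)
can be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Minimal stand-in for struct notifier_block. */
struct notifier {
	int (*call)(struct notifier *nb);
};

/* Hypothetical buffer descriptor with an embedded notifier. */
struct demo_buffer {
	unsigned long range_start;
	unsigned long range_end;
	int recording;
	int flushed;
	struct notifier flush_nb;
};

/* A buffer is treated as persistent only when both bounds are non-zero. */
static int demo_is_persistent(const struct demo_buffer *b)
{
	return b->range_start && b->range_end;
}

/* Panic-time callback: stop recording first, then flush. */
static int demo_flush_cb(struct notifier *nb)
{
	/* Recover the enclosing buffer from the embedded notifier. */
	struct demo_buffer *b = container_of(nb, struct demo_buffer, flush_nb);

	b->recording = 0;
	b->flushed = 1;	/* placeholder for the real cache flush */
	return 0;
}

/* Install the callback only for persistent buffers. */
static void demo_init(struct demo_buffer *b, unsigned long start,
		      unsigned long end)
{
	b->range_start = start;
	b->range_end = end;
	b->recording = 1;
	b->flushed = 0;
	b->flush_nb.call = demo_is_persistent(b) ? demo_flush_cb : NULL;
}
```

Embedding the notifier in the buffer structure avoids a separate lookup
table: the callback receives only the notifier pointer and derives the
owning buffer from it.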



* [PATCH v2 3/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
  2026-02-18 10:14 [PATCH v2 0/4] ring-buffer: Making persistent ring buffers robust Masami Hiramatsu (Google)
  2026-02-18 10:14 ` [PATCH v2 1/4] ring-buffer: Fix to check event length before using Masami Hiramatsu (Google)
  2026-02-18 10:14 ` [PATCH v2 2/4] ring-buffer: Flush and stop persistent ring buffer on panic Masami Hiramatsu (Google)
@ 2026-02-18 10:14 ` Masami Hiramatsu (Google)
  2026-02-20 19:56   ` Steven Rostedt
  2026-02-18 10:14 ` [PATCH v2 4/4] ring-buffer: Record invalid buffer event Masami Hiramatsu (Google)
  3 siblings, 1 reply; 9+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-02-18 10:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Skip invalid sub-buffers when validating the persistent ring buffer
instead of invalidate all ring buffers.

If the cache data in memory fails to be synchronized during a reboot,
the persistent ring buffer may become partially corrupted, but other
sub-buffers may still contain readable event data, allowing users to
recover data from the corrupted ring buffer.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/ring_buffer.c |   22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index d2b69221a94c..0ae2a5ad8c3e 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2045,17 +2045,19 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 		if (ret < 0) {
 			pr_info("Ring buffer meta [%d] invalid buffer page\n",
 				cpu_buffer->cpu);
-			goto invalid;
-		}
-
-		/* If the buffer has content, update pages_touched */
-		if (ret)
-			local_inc(&cpu_buffer->pages_touched);
-
-		entries += ret;
-		entry_bytes += local_read(&head_page->page->commit);
-		local_set(&cpu_buffer->head_page->entries, ret);
+			/* Instead of invalidate whole ring buffer, just clear this subbuffer. */
+			local_set(&head_page->entries, 0);
+			local_set(&head_page->page->commit, 0);
+			/* TODO: commit an event to mark this is broken. */
+		} else {
+			/* If the buffer has content, update pages_touched */
+			if (ret)
+				local_inc(&cpu_buffer->pages_touched);
 
+			entries += ret;
+			entry_bytes += local_read(&head_page->page->commit);
+			local_set(&cpu_buffer->head_page->entries, ret);
+		}
 		if (head_page == cpu_buffer->commit_page)
 			break;
 	}
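
The change in control flow can be sketched in userspace as below. This
is a simplified model, not the kernel implementation: struct
demo_subbuf, struct demo_totals and demo_validate() are hypothetical,
and the per-sub-buffer validation result is assumed to be precomputed
(negative meaning corrupted).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sub-buffer summary. */
struct demo_subbuf {
	int valid_entries;	/* validation result; < 0 means corrupted */
	int commit;		/* bytes committed in this sub-buffer */
	int entries;		/* entry count recorded after validation */
};

struct demo_totals {
	long entries;
	long bytes;
	int skipped;
};

/*
 * Validate every sub-buffer. A corrupted one is cleared and skipped
 * instead of aborting the walk and discarding everything.
 */
static struct demo_totals demo_validate(struct demo_subbuf *sb, size_t n)
{
	struct demo_totals t = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (sb[i].valid_entries < 0) {
			/* Clear only this sub-buffer and keep going. */
			sb[i].entries = 0;
			sb[i].commit = 0;
			t.skipped++;
			continue;
		}
		sb[i].entries = sb[i].valid_entries;
		t.entries += sb[i].valid_entries;
		t.bytes += sb[i].commit;
	}
	return t;
}
```

For three sub-buffers where only the middle one is corrupted, the other
two still contribute their entries and bytes to the totals.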



* [PATCH v2 4/4] ring-buffer: Record invalid buffer event
  2026-02-18 10:14 [PATCH v2 0/4] ring-buffer: Making persistent ring buffers robust Masami Hiramatsu (Google)
                   ` (2 preceding siblings ...)
  2026-02-18 10:14 ` [PATCH v2 3/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer Masami Hiramatsu (Google)
@ 2026-02-18 10:14 ` Masami Hiramatsu (Google)
  2026-02-20 19:59   ` Steven Rostedt
  3 siblings, 1 reply; 9+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-02-18 10:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
	linux-trace-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Record an invalid buffer event on the invalidated sub buffer
so that user can notice how much data is skipped.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/ring_buffer.c   |   43 ++++++++++++++++++++++++++++++++++++------
 kernel/trace/trace.h         |    1 +
 kernel/trace/trace_entries.h |   15 +++++++++++++++
 3 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 0ae2a5ad8c3e..98df5a67de26 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1911,6 +1911,38 @@ static int rb_validate_buffer(struct buffer_data_page *dpage, int cpu)
 	return rb_read_data_buffer(dpage, tail, cpu, &ts, &delta);
 }
 
+/* Inject invalid_buffer event */
+static void rb_record_invalid_buffer(struct buffer_page *buffer,
+				     long commit_bytes, long entries,
+				     int buffer_index)
+{
+	struct buffer_data_page *dpage = buffer->page;
+	struct invalid_subbuf_entry *entry;
+	struct ring_buffer_event *event;
+	long length;
+
+	length = DIV_ROUND_UP(sizeof(*entry), RB_ALIGNMENT);
+
+	/*
+	 * Instead of ring_buffer_lock_reserve(), directly allocate it on
+	 * the first entry of specific buffer_page.
+	 */
+	event = (struct ring_buffer_event *)&dpage->data[0];
+	event->type_len = length;
+	event->time_delta = 0;
+
+	trace_event_setup(event, TRACE_INVALID_BUF, 0);
+
+	entry = ring_buffer_event_data(event);
+	entry->lost_bytes = commit_bytes;
+	entry->lost_entries = entries;
+	entry->buffer_index = buffer_index;
+
+	/* This buffer_page has only one event. */
+	local_set(&buffer->entries, 1);
+	local_set(&buffer->page->commit, rb_event_data_length(event));
+}
+
 /* If the meta data has been validated, now validate the events */
 static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 {
@@ -2043,12 +2075,11 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 
 		ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu);
 		if (ret < 0) {
-			pr_info("Ring buffer meta [%d] invalid buffer page\n",
-				cpu_buffer->cpu);
-			/* Instead of invalidate whole ring buffer, just clear this subbuffer. */
-			local_set(&head_page->entries, 0);
-			local_set(&head_page->page->commit, 0);
-			/* TODO: commit an event to mark this is broken. */
+			/* Discard invalid buffer and record it. */
+			rb_record_invalid_buffer(head_page,
+				local_read(&head_page->page->commit),
+				local_read(&head_page->entries),
+				rb_meta_subbuf_idx(meta, head_page->page));
 		} else {
 			/* If the buffer has content, update pages_touched */
 			if (ret)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 7894bf55743c..667834edb5b9 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -57,6 +57,7 @@ enum trace_type {
 	TRACE_TIMERLAT,
 	TRACE_RAW_DATA,
 	TRACE_FUNC_REPEATS,
+	TRACE_INVALID_BUF,
 
 	__TRACE_LAST_TYPE,
 };
diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
index f6a8d29c0d76..df39fc245ab4 100644
--- a/kernel/trace/trace_entries.h
+++ b/kernel/trace/trace_entries.h
@@ -457,3 +457,18 @@ FTRACE_ENTRY(timerlat, timerlat_entry,
 		 __entry->context,
 		 __entry->timer_latency)
 );
+
+FTRACE_ENTRY(invalid_subbuf, invalid_subbuf_entry,
+	TRACE_INVALID_BUF,
+
+	F_STRUCT(
+		__field(	long,			lost_bytes	)
+		__field(	long,			lost_entries	)
+		__field(	int,			buffer_index	)
+	),
+
+	F_printk("lost_bytes:%ld\tlost_entries:%ld\tbuffer_index:%d\n",
+		 __entry->lost_bytes,
+		 __entry->lost_entries,
+		 __entry->buffer_index)
+);



* Re: [PATCH v2 2/4] ring-buffer: Flush and stop persistent ring buffer on panic
  2026-02-18 10:14 ` [PATCH v2 2/4] ring-buffer: Flush and stop persistent ring buffer on panic Masami Hiramatsu (Google)
@ 2026-02-20 19:53   ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2026-02-20 19:53 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel

On Wed, 18 Feb 2026 19:14:28 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> On a real hardware, since panic and reboot the machine will not
> flush hardware cache to the persistent ring buffer, the events
> written right before the panic can be lost. Moreover, since
> there will be an inconsistency between the commit counter (which
> is written atomically via local_set()) and the data, validation
> will fail and all data in the persistent ring buffer will be lost.

Here's a bit of a fix up on the text:

   On real hardware, panic and machine reboot may not flush hardware cache
   to memory. This means the persistent ring buffer, which relies on a
   coherent state of memory, may not have its events written to the buffer
   and they may be lost. Moreover, there may be inconsistency with the
   counters which are used for validation of the integrity of the
   persistent ring buffer which may cause all data to be discarded.


> 
> To avoid this issue, this will stop recording on the ring buffer
> and flush cache at the reserved memory on panic.

   To avoid this issue, stop recording of the ring buffer on panic and
   flush the cache of the ring buffer's memory.


-- Steve


* Re: [PATCH v2 3/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
  2026-02-18 10:14 ` [PATCH v2 3/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer Masami Hiramatsu (Google)
@ 2026-02-20 19:56   ` Steven Rostedt
  2026-02-23  7:39     ` Masami Hiramatsu
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2026-02-20 19:56 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel

On Wed, 18 Feb 2026 19:14:35 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Skip invalid sub-buffers when validating the persistent ring buffer
> instead of invalidate all ring buffers.

  instead of discarding the entire ring buffer.


> 
> If the cache data in memory fails to be synchronized during a reboot,
> the persistent ring buffer may become partially corrupted, but other
> sub-buffers may still contain readable event data, allowing users to
> recover data from the corrupted ring buffer.

                  ... contain readable event data. Only discard the
                  subbuffers that are found to be corrupted.

> 
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  kernel/trace/ring_buffer.c |   22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index d2b69221a94c..0ae2a5ad8c3e 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -2045,17 +2045,19 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
>  		if (ret < 0) {
>  			pr_info("Ring buffer meta [%d] invalid buffer page\n",
>  				cpu_buffer->cpu);
> -			goto invalid;
> -		}
> -
> -		/* If the buffer has content, update pages_touched */
> -		if (ret)
> -			local_inc(&cpu_buffer->pages_touched);
> -
> -		entries += ret;
> -		entry_bytes += local_read(&head_page->page->commit);
> -		local_set(&cpu_buffer->head_page->entries, ret);
> +			/* Instead of invalidate whole ring buffer, just clear this subbuffer. */
> +			local_set(&head_page->entries, 0);
> +			local_set(&head_page->page->commit, 0);
> +			/* TODO: commit an event to mark this is broken. */

Here's how to fix the TODO:

			local_set(&head_page->page->commit, RB_MISSED_EVENTS);

-- Steve


> +		} else {
> +			/* If the buffer has content, update pages_touched */
> +			if (ret)
> +				local_inc(&cpu_buffer->pages_touched);
>  
> +			entries += ret;
> +			entry_bytes += local_read(&head_page->page->commit);
> +			local_set(&cpu_buffer->head_page->entries, ret);
> +		}
>  		if (head_page == cpu_buffer->commit_page)
>  			break;
>  	}



* Re: [PATCH v2 4/4] ring-buffer: Record invalid buffer event
  2026-02-18 10:14 ` [PATCH v2 4/4] ring-buffer: Record invalid buffer event Masami Hiramatsu (Google)
@ 2026-02-20 19:59   ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2026-02-20 19:59 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel

On Wed, 18 Feb 2026 19:14:43 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Record an invalid buffer event on the invalidated sub buffer
> so that user can notice how much data is skipped.
> 
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

As I showed in patch 3, just mark it as having missed events. We could add
a pr_warn() that says the buffer was corrupted, but we don't need a
"invalid" event.

-- Steve
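
The idea of marking a sub-buffer as having missed events, rather than
writing a dedicated event into it, can be sketched as a flag bit stored
in the commit word. This is a sketch under assumptions: the
DEMO_MISSED_EVENTS name and its bit position are hypothetical and are
not taken from the thread, which only names RB_MISSED_EVENTS.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the top bit of the commit word flags missed events. */
#define DEMO_MISSED_EVENTS	(UINT32_C(1) << 31)

/* Commit value for a discarded sub-buffer: no data, flag set. */
static uint32_t demo_mark_missed(void)
{
	return DEMO_MISSED_EVENTS;
}

/* Number of committed bytes, with the flag bit masked out. */
static uint32_t demo_commit_size(uint32_t commit)
{
	return commit & ~DEMO_MISSED_EVENTS;
}

/* Whether the sub-buffer is flagged as having missed events. */
static int demo_has_missed(uint32_t commit)
{
	return (commit & DEMO_MISSED_EVENTS) != 0;
}
```

A reader can then report the loss from the flag alone, so no new event
type is needed to describe a discarded sub-buffer.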


* Re: [PATCH v2 3/4] ring-buffer: Skip invalid sub-buffers when validating persistent ring buffer
  2026-02-20 19:56   ` Steven Rostedt
@ 2026-02-23  7:39     ` Masami Hiramatsu
  0 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2026-02-23  7:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel

On Fri, 20 Feb 2026 14:56:56 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 18 Feb 2026 19:14:35 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > Skip invalid sub-buffers when validating the persistent ring buffer
> > instead of invalidate all ring buffers.
> 
>   instead of discarding the entire ring buffer.
> 
> 
> > 
> > If the cache data in memory fails to be synchronized during a reboot,
> > the persistent ring buffer may become partially corrupted, but other
> > > sub-buffers may still contain readable event data, allowing users to
> > recover data from the corrupted ring buffer.
> 
>                   ... contain readable event data. Only discard the
>                   subbuffers that are found to be corrupted.
> 
> > 
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >  kernel/trace/ring_buffer.c |   22 ++++++++++++----------
> >  1 file changed, 12 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index d2b69221a94c..0ae2a5ad8c3e 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -2045,17 +2045,19 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
> >  		if (ret < 0) {
> >  			pr_info("Ring buffer meta [%d] invalid buffer page\n",
> >  				cpu_buffer->cpu);
> > -			goto invalid;
> > -		}
> > -
> > -		/* If the buffer has content, update pages_touched */
> > -		if (ret)
> > -			local_inc(&cpu_buffer->pages_touched);
> > -
> > -		entries += ret;
> > -		entry_bytes += local_read(&head_page->page->commit);
> > -		local_set(&cpu_buffer->head_page->entries, ret);
> > +			/* Instead of invalidate whole ring buffer, just clear this subbuffer. */
> > +			local_set(&head_page->entries, 0);
> > +			local_set(&head_page->page->commit, 0);
> > +			/* TODO: commit an event to mark this is broken. */
> 
> Here's how to fix the TODO:
> 
> 			local_set(&head_page->page->commit, RB_MISSED_EVENTS);

Ah, that's a nice flag!

Thanks!

> 
> -- Steve
> 
> 
> > +		} else {
> > +			/* If the buffer has content, update pages_touched */
> > +			if (ret)
> > +				local_inc(&cpu_buffer->pages_touched);
> >  
> > +			entries += ret;
> > +			entry_bytes += local_read(&head_page->page->commit);
> > +			local_set(&cpu_buffer->head_page->entries, ret);
> > +		}
> >  		if (head_page == cpu_buffer->commit_page)
> >  			break;
> >  	}
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

