* [PATCH v5 0/4] tracing: Remove backup instance after read all
@ 2026-01-28 0:09 Masami Hiramatsu (Google)
2026-01-28 0:09 ` [PATCH v5 1/4] tracing: Reset last_boot_info if ring buffer is reset Masami Hiramatsu (Google)
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-01-28 0:09 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel
Hi,
Here is the v5 of the series to improve backup instances of
the persistent ring buffer. The previous version is here:
https://lore.kernel.org/all/176887135615.578403.6988045330349053692.stgit@mhiramat.tok.corp.google.com/
This is just rebased on top of the latest linux-trace/trace/for-next.
Since backup instances are a kind of snapshot of the persistent
ring buffer, it should be readonly. And if it is readonly
there is no reason to keep it after reading all data via trace_pipe
because the data has been consumed. But user should be able to remove
the readonly instance by rmdir or truncating `trace` file.
Thus, [2/4] makes backup instances readonly (not able to write any
events, cleanup trace, change buffer size). Also, [3/4] removes the
backup instance after consuming all data via trace_pipe.
With this improvements, even if we makes a backup instance (using
the same amount of memory of the persistent ring buffer), it will
be removed after reading the data automatically.
---
Masami Hiramatsu (Google) (4):
tracing: Reset last_boot_info if ring buffer is reset
tracing: Make the backup instance non-reusable
tracing: Remove the backup instance automatically after read
tracing/Documentation: Add a section about backup instance
Documentation/trace/debugging.rst | 19 ++++
kernel/trace/trace.c | 164 ++++++++++++++++++++++++++++++-------
kernel/trace/trace.h | 14 +++
kernel/trace/trace_boot.c | 5 +
kernel/trace/trace_events.c | 80 ++++++++++++------
5 files changed, 221 insertions(+), 61 deletions(-)
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread* [PATCH v5 1/4] tracing: Reset last_boot_info if ring buffer is reset 2026-01-28 0:09 [PATCH v5 0/4] tracing: Remove backup instance after read all Masami Hiramatsu (Google) @ 2026-01-28 0:09 ` Masami Hiramatsu (Google) 2026-01-28 0:09 ` [PATCH v5 2/4] tracing: Make the backup instance non-reusable Masami Hiramatsu (Google) ` (2 subsequent siblings) 3 siblings, 0 replies; 16+ messages in thread From: Masami Hiramatsu (Google) @ 2026-01-28 0:09 UTC (permalink / raw) To: Steven Rostedt Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel, linux-trace-kernel From: Masami Hiramatsu (Google) <mhiramat@kernel.org> Commit 32dc0042528d ("tracing: Reset last-boot buffers when reading out all cpu buffers") resets the last_boot_info when user read out all data via trace_pipe* files. But it is not reset when user resets the buffer from other files. (e.g. write `trace` file) Reset it when the corresponding ring buffer is reset too. Fixes: 32dc0042528d ("tracing: Reset last-boot buffers when reading out all cpu buffers") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> --- kernel/trace/trace.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 396d59202438..5c3e4a554143 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4891,6 +4891,8 @@ static int tracing_single_release_tr(struct inode *inode, struct file *file) return single_release(inode, file); } +static bool update_last_data_if_empty(struct trace_array *tr); + static int tracing_open(struct inode *inode, struct file *file) { struct trace_array *tr = inode->i_private; @@ -4915,6 +4917,8 @@ static int tracing_open(struct inode *inode, struct file *file) tracing_reset_online_cpus(trace_buf); else tracing_reset_cpu(trace_buf, cpu); + + update_last_data_if_empty(tr); } if (file->f_mode & FMODE_READ) { @@ -5981,6 +5985,7 @@ tracing_set_trace_read(struct file *filp, char __user *ubuf, int tracer_init(struct tracer *t, struct trace_array *tr) { tracing_reset_online_cpus(&tr->array_buffer); + update_last_data_if_empty(tr); return t->init(tr); } @@ -7799,6 +7804,7 @@ int tracing_set_clock(struct trace_array *tr, const char *clockstr) ring_buffer_set_clock(tr->max_buffer.buffer, trace_clocks[i].func); tracing_reset_online_cpus(&tr->max_buffer); #endif + update_last_data_if_empty(tr); if (tr->scratch && !(tr->flags & TRACE_ARRAY_FL_LAST_BOOT)) { struct trace_scratch *tscratch = tr->scratch; @@ -8036,6 +8042,7 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt, tracing_reset_online_cpus(&tr->max_buffer); else tracing_reset_cpu(&tr->max_buffer, iter->cpu_file); + update_last_data_if_empty(tr); } break; } ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v5 2/4] tracing: Make the backup instance non-reusable 2026-01-28 0:09 [PATCH v5 0/4] tracing: Remove backup instance after read all Masami Hiramatsu (Google) 2026-01-28 0:09 ` [PATCH v5 1/4] tracing: Reset last_boot_info if ring buffer is reset Masami Hiramatsu (Google) @ 2026-01-28 0:09 ` Masami Hiramatsu (Google) 2026-01-29 19:53 ` Steven Rostedt 2026-01-29 20:07 ` Steven Rostedt 2026-01-28 0:10 ` [PATCH v5 3/4] tracing: Remove the backup instance automatically after read Masami Hiramatsu (Google) 2026-01-28 0:10 ` [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance Masami Hiramatsu (Google) 3 siblings, 2 replies; 16+ messages in thread From: Masami Hiramatsu (Google) @ 2026-01-28 0:09 UTC (permalink / raw) To: Steven Rostedt Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel, linux-trace-kernel From: Masami Hiramatsu (Google) <mhiramat@kernel.org> Since there is no reason to reuse the backup instance, make it readonly (but erasable). Note that only backup instances are readonly, because other trace instances will be empty unless it is writable. Only backup instances have copy entries from the original. With this change, most of the trace control files are removed from the backup instance, including eventfs enable/filter etc. # find /sys/kernel/tracing/instances/backup/events/ | wc -l 4093 # find /sys/kernel/tracing/instances/boot_map/events/ | wc -l 9573 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> --- Changes in v5: - Rebased on the latest for-next (and hide show_event_filters/triggers if the instance is readonly. Changes in v4: - Make trace data erasable. (not reusable) Changes in v3: - Resuse the beginning part of event_entries for readonly files. - Remove readonly file_operations and checking readonly flag in each write operation. Changes in v2: - Use readonly file_operations to prohibit writing instead of checking flags in write() callbacks. - Remove writable files from eventfs. --- kernel/trace/trace.c | 93 ++++++++++++++++++++++++++++++------------- kernel/trace/trace.h | 8 +++- kernel/trace/trace_boot.c | 5 +- kernel/trace/trace_events.c | 80 ++++++++++++++++++++++++------------- 4 files changed, 126 insertions(+), 60 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 5c3e4a554143..d39f6509c12a 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5052,6 +5052,11 @@ static ssize_t tracing_write_stub(struct file *filp, const char __user *ubuf, size_t count, loff_t *ppos) { + struct trace_array *tr = file_inode(filp)->i_private; + + if (trace_array_is_readonly(tr)) + return -EPERM; + return count; } @@ -5152,6 +5157,9 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf, cpumask_var_t tracing_cpumask_new; int err; + if (trace_array_is_readonly(tr)) + return -EPERM; + if (count == 0 || count > KMALLOC_MAX_SIZE) return -EINVAL; @@ -6436,6 +6444,9 @@ tracing_set_trace_write(struct file *filp, const char __user *ubuf, size_t ret; int err; + if (trace_array_is_readonly(tr)) + return -EPERM; + ret = cnt; if (cnt > MAX_TRACER_SIZE) @@ -7070,6 +7081,9 @@ tracing_entries_write(struct file *filp, const char __user *ubuf, unsigned long val; int ret; + if (trace_array_is_readonly(tr)) + return -EPERM; + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); if (ret) return ret; @@ -7824,6 +7838,9 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf, const char *clockstr; int ret; + if (trace_array_is_readonly(tr)) + return -EPERM; + if (cnt >= sizeof(buf)) return -EINVAL; @@ -9388,12 +9405,16 @@ static void tracing_init_tracefs_percpu(struct trace_array *tr, long cpu) { struct dentry *d_percpu = tracing_dentry_percpu(tr, cpu); + umode_t writable_mode = TRACE_MODE_WRITE; struct dentry *d_cpu; char cpu_dir[30]; /* 30 characters should be more than enough */ if (!d_percpu) return; + if (trace_array_is_readonly(tr)) + writable_mode = TRACE_MODE_READ; + snprintf(cpu_dir, 30, "cpu%ld", cpu); d_cpu = tracefs_create_dir(cpu_dir, d_percpu); if (!d_cpu) { @@ -9616,7 +9637,6 @@ struct dentry *trace_create_file(const char *name, return ret; } - static struct dentry *trace_options_init_dentry(struct trace_array *tr) { struct dentry *d_tracer; @@ -9846,6 +9866,9 @@ rb_simple_write(struct file *filp, const char __user *ubuf, unsigned long val; int ret; + if (trace_array_is_readonly(tr)) + return -EPERM; + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); if (ret) return ret; @@ -9952,6 +9975,9 @@ buffer_subbuf_size_write(struct file *filp, const char __user *ubuf, int pages; int ret; + if (trace_array_is_readonly(tr)) + return -EPERM; + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); if (ret) return ret; @@ -10632,17 +10658,23 @@ static __init void create_trace_instances(struct dentry *d_tracer) static void init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) { + umode_t writable_mode = TRACE_MODE_WRITE; + bool readonly = trace_array_is_readonly(tr); int cpu; + if (readonly) + writable_mode = TRACE_MODE_READ; + trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer, - tr, &show_traces_fops); + tr, &show_traces_fops); - trace_create_file("current_tracer", TRACE_MODE_WRITE, d_tracer, - tr, &set_tracer_fops); + trace_create_file("current_tracer", writable_mode, d_tracer, + tr, &set_tracer_fops); - trace_create_file("tracing_cpumask", TRACE_MODE_WRITE, d_tracer, + trace_create_file("tracing_cpumask", writable_mode, d_tracer, tr, &tracing_cpumask_fops); + /* Options are used for changing print-format even for readonly instance. */ trace_create_file("trace_options", TRACE_MODE_WRITE, d_tracer, tr, &tracing_iter_fops); @@ -10652,27 +10684,35 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) trace_create_file("trace_pipe", TRACE_MODE_READ, d_tracer, tr, &tracing_pipe_fops); - trace_create_file("buffer_size_kb", TRACE_MODE_WRITE, d_tracer, + trace_create_file("buffer_size_kb", writable_mode, d_tracer, tr, &tracing_entries_fops); trace_create_file("buffer_total_size_kb", TRACE_MODE_READ, d_tracer, tr, &tracing_total_entries_fops); - trace_create_file("free_buffer", 0200, d_tracer, - tr, &tracing_free_buffer_fops); + if (!readonly) { + trace_create_file("free_buffer", 0200, d_tracer, + tr, &tracing_free_buffer_fops); - trace_create_file("trace_marker", 0220, d_tracer, - tr, &tracing_mark_fops); + trace_create_file("trace_marker", 0220, d_tracer, + tr, &tracing_mark_fops); - tr->trace_marker_file = __find_event_file(tr, "ftrace", "print"); + tr->trace_marker_file = __find_event_file(tr, "ftrace", "print"); - trace_create_file("trace_marker_raw", 0220, d_tracer, - tr, &tracing_mark_raw_fops); + trace_create_file("trace_marker_raw", 0220, d_tracer, + tr, &tracing_mark_raw_fops); - trace_create_file("trace_clock", TRACE_MODE_WRITE, d_tracer, tr, + trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, + tr, &buffer_percent_fops); + + trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, + tr, &tracing_syscall_buf_fops); + } + + trace_create_file("trace_clock", writable_mode, d_tracer, tr, &trace_clock_fops); - trace_create_file("tracing_on", TRACE_MODE_WRITE, d_tracer, + trace_create_file("tracing_on", writable_mode, d_tracer, tr, &rb_simple_fops); trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, @@ -10680,41 +10720,38 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) tr->buffer_percent = 50; - trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, - tr, &buffer_percent_fops); - - trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, + trace_create_file("buffer_subbuf_size_kb", writable_mode, d_tracer, tr, &buffer_subbuf_size_fops); - trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, - tr, &tracing_syscall_buf_fops); - create_trace_options_dir(tr); #ifdef CONFIG_TRACER_MAX_TRACE - trace_create_maxlat_file(tr, d_tracer); + if (!readonly) + trace_create_maxlat_file(tr, d_tracer); #endif - if (ftrace_create_function_files(tr, d_tracer)) + if (!readonly && ftrace_create_function_files(tr, d_tracer)) MEM_FAIL(1, "Could not allocate function filter files"); if (tr->range_addr_start) { trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, tr, &last_boot_fops); #ifdef CONFIG_TRACER_SNAPSHOT - } else { + } else if (!readonly) { trace_create_file("snapshot", TRACE_MODE_WRITE, d_tracer, tr, &snapshot_fops); #endif } - trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, - tr, &tracing_err_log_fops); + if (!readonly) + trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, + tr, &tracing_err_log_fops); for_each_tracing_cpu(cpu) tracing_init_tracefs_percpu(tr, cpu); - ftrace_init_tracefs(tr, d_tracer); + if (!readonly) + ftrace_init_tracefs(tr, d_tracer); } #ifdef CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 69e7defba6c6..0adc644084bf 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -33,6 +33,7 @@ #define TRACE_MODE_WRITE 0640 #define TRACE_MODE_READ 0440 +#define TRACE_MODE_WRITE_MASK (TRACE_MODE_WRITE & ~TRACE_MODE_READ) enum trace_type { __TRACE_FIRST_TYPE = 0, @@ -483,6 +484,12 @@ extern bool trace_clock_in_ns(struct trace_array *tr); extern unsigned long trace_adjust_address(struct trace_array *tr, unsigned long addr); +static inline bool trace_array_is_readonly(struct trace_array *tr) +{ + /* backup instance is read only. */ + return tr->flags & TRACE_ARRAY_FL_VMALLOC; +} + /* * The global tracer (top) should be the first trace array added, * but we check the flag anyway. @@ -681,7 +688,6 @@ struct dentry *trace_create_file(const char *name, void *data, const struct file_operations *fops); - /** * tracer_tracing_is_on_cpu - show real state of ring buffer enabled on for a cpu * @tr : the trace array to know if ring buffer is enabled diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index dbe29b4c6a7a..2ca2541c8a58 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -61,7 +61,8 @@ trace_boot_set_instance_options(struct trace_array *tr, struct xbc_node *node) v = memparse(p, NULL); if (v < PAGE_SIZE) pr_err("Buffer size is too small: %s\n", p); - if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) + if (trace_array_is_readonly(tr) || + tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) pr_err("Failed to resize trace buffer to %s\n", p); } @@ -597,7 +598,7 @@ trace_boot_enable_tracer(struct trace_array *tr, struct xbc_node *node) p = xbc_node_find_value(node, "tracer", NULL); if (p && *p != '\0') { - if (tracing_set_tracer(tr, p) < 0) + if (trace_array_is_readonly(tr) || tracing_set_tracer(tr, p) < 0) pr_err("Failed to set given tracer: %s\n", p); } diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 4972e1a2b5f3..2e0db93be557 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -1380,6 +1380,9 @@ static int __ftrace_set_clr_event(struct trace_array *tr, const char *match, { int ret; + if (trace_array_is_readonly(tr)) + return -EPERM; + mutex_lock(&event_mutex); ret = __ftrace_set_clr_event_nolock(tr, match, sub, event, set, mod); mutex_unlock(&event_mutex); @@ -2952,8 +2955,8 @@ event_subsystem_dir(struct trace_array *tr, const char *name, } else __get_system(system); - /* ftrace only has directories no files */ - if (strcmp(name, "ftrace") == 0) + /* ftrace only has directories no files, readonly instance too. */ + if (strcmp(name, "ftrace") == 0 || trace_array_is_readonly(tr)) nr_entries = 0; else nr_entries = ARRAY_SIZE(system_entries); @@ -3118,28 +3121,30 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file) int ret; static struct eventfs_entry event_entries[] = { { - .name = "enable", + .name = "format", .callback = event_callback, - .release = event_release, }, +#ifdef CONFIG_PERF_EVENTS { - .name = "filter", + .name = "id", .callback = event_callback, }, +#endif +#define NR_RO_EVENT_ENTRIES (1 + IS_ENABLED(CONFIG_PERF_EVENTS)) +/* Readonly files must be above this line and counted by NR_RO_EVENT_ENTRIES. */ { - .name = "trigger", + .name = "enable", .callback = event_callback, + .release = event_release, }, { - .name = "format", + .name = "filter", .callback = event_callback, }, -#ifdef CONFIG_PERF_EVENTS { - .name = "id", + .name = "trigger", .callback = event_callback, }, -#endif #ifdef CONFIG_HIST_TRIGGERS { .name = "hist", @@ -3172,9 +3177,13 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file) if (!e_events) return -ENOMEM; - nr_entries = ARRAY_SIZE(event_entries); + if (trace_array_is_readonly(tr)) + nr_entries = NR_RO_EVENT_ENTRIES; + else + nr_entries = ARRAY_SIZE(event_entries); name = trace_event_name(call); + ei = eventfs_create_dir(name, e_events, event_entries, nr_entries, file); if (IS_ERR(ei)) { pr_warn("Could not create tracefs '%s' directory\n", name); @@ -4511,31 +4520,37 @@ create_event_toplevel_files(struct dentry *parent, struct trace_array *tr) int nr_entries; static struct eventfs_entry events_entries[] = { { - .name = "enable", + .name = "header_page", .callback = events_callback, }, { - .name = "header_page", + .name = "header_event", .callback = events_callback, }, +#define NR_RO_TOP_ENTRIES 2 +/* Readonly files must be above this line and counted by NR_RO_TOP_ENTRIES. */ { - .name = "header_event", + .name = "enable", .callback = events_callback, }, }; - entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent, - tr, &ftrace_set_event_fops); - if (!entry) - return -ENOMEM; + if (!trace_array_is_readonly(tr)) { + entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent, + tr, &ftrace_set_event_fops); + if (!entry) + return -ENOMEM; - trace_create_file("show_event_filters", TRACE_MODE_READ, parent, tr, - &ftrace_show_event_filters_fops); + trace_create_file("show_event_filters", TRACE_MODE_READ, parent, tr, + &ftrace_show_event_filters_fops); - trace_create_file("show_event_triggers", TRACE_MODE_READ, parent, tr, - &ftrace_show_event_triggers_fops); + trace_create_file("show_event_triggers", TRACE_MODE_READ, parent, tr, + &ftrace_show_event_triggers_fops); - nr_entries = ARRAY_SIZE(events_entries); + nr_entries = ARRAY_SIZE(events_entries); + } else { + nr_entries = NR_RO_TOP_ENTRIES; + } e_events = eventfs_create_events_dir("events", parent, events_entries, nr_entries, tr); @@ -4544,15 +4559,22 @@ create_event_toplevel_files(struct dentry *parent, struct trace_array *tr) return -ENOMEM; } - /* There are not as crucial, just warn if they are not created */ + if (!trace_array_is_readonly(tr)) { + + entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent, + tr, &ftrace_set_event_fops); + if (!entry) + return -ENOMEM; - trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, - tr, &ftrace_set_event_pid_fops); + /* There are not as crucial, just warn if they are not created */ - trace_create_file("set_event_notrace_pid", - TRACE_MODE_WRITE, parent, tr, - &ftrace_set_event_notrace_pid_fops); + trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, + tr, &ftrace_set_event_pid_fops); + trace_create_file("set_event_notrace_pid", + TRACE_MODE_WRITE, parent, tr, + &ftrace_set_event_notrace_pid_fops); + } tr->event_dir = e_events; return 0; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v5 2/4] tracing: Make the backup instance non-reusable 2026-01-28 0:09 ` [PATCH v5 2/4] tracing: Make the backup instance non-reusable Masami Hiramatsu (Google) @ 2026-01-29 19:53 ` Steven Rostedt 2026-01-30 9:01 ` Masami Hiramatsu 2026-01-29 20:07 ` Steven Rostedt 1 sibling, 1 reply; 16+ messages in thread From: Steven Rostedt @ 2026-01-29 19:53 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Wed, 28 Jan 2026 09:09:56 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > @@ -9388,12 +9405,16 @@ static void > tracing_init_tracefs_percpu(struct trace_array *tr, long cpu) > { > struct dentry *d_percpu = tracing_dentry_percpu(tr, cpu); > + umode_t writable_mode = TRACE_MODE_WRITE; > struct dentry *d_cpu; > char cpu_dir[30]; /* 30 characters should be more than enough */ > > if (!d_percpu) > return; > > + if (trace_array_is_readonly(tr)) > + writable_mode = TRACE_MODE_READ; Hmm, writable_mode is set but not used? Looks like you missed an update. -- Steve > + > snprintf(cpu_dir, 30, "cpu%ld", cpu); > d_cpu = tracefs_create_dir(cpu_dir, d_percpu); > if (!d_cpu) { > @@ -9616,7 +9637,6 @@ struct dentry *trace_create_file(const char *name, > return ret; > } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 2/4] tracing: Make the backup instance non-reusable 2026-01-29 19:53 ` Steven Rostedt @ 2026-01-30 9:01 ` Masami Hiramatsu 0 siblings, 0 replies; 16+ messages in thread From: Masami Hiramatsu @ 2026-01-30 9:01 UTC (permalink / raw) To: Steven Rostedt; +Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Thu, 29 Jan 2026 14:53:33 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > On Wed, 28 Jan 2026 09:09:56 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > @@ -9388,12 +9405,16 @@ static void > > tracing_init_tracefs_percpu(struct trace_array *tr, long cpu) > > { > > struct dentry *d_percpu = tracing_dentry_percpu(tr, cpu); > > + umode_t writable_mode = TRACE_MODE_WRITE; > > struct dentry *d_cpu; > > char cpu_dir[30]; /* 30 characters should be more than enough */ > > > > if (!d_percpu) > > return; > > > > + if (trace_array_is_readonly(tr)) > > + writable_mode = TRACE_MODE_READ; > > Hmm, writable_mode is set but not used? Oops, good catch! I missed to update it. Thanks! > > Looks like you missed an update. > > -- Steve > > > + > > snprintf(cpu_dir, 30, "cpu%ld", cpu); > > d_cpu = tracefs_create_dir(cpu_dir, d_percpu); > > if (!d_cpu) { > > @@ -9616,7 +9637,6 @@ struct dentry *trace_create_file(const char *name, > > return ret; > > } -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 2/4] tracing: Make the backup instance non-reusable 2026-01-28 0:09 ` [PATCH v5 2/4] tracing: Make the backup instance non-reusable Masami Hiramatsu (Google) 2026-01-29 19:53 ` Steven Rostedt @ 2026-01-29 20:07 ` Steven Rostedt 2026-01-30 9:20 ` Masami Hiramatsu 1 sibling, 1 reply; 16+ messages in thread From: Steven Rostedt @ 2026-01-29 20:07 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Wed, 28 Jan 2026 09:09:56 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > @@ -10632,17 +10658,23 @@ static __init void create_trace_instances(struct dentry *d_tracer) > static void > init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > { > + umode_t writable_mode = TRACE_MODE_WRITE; > + bool readonly = trace_array_is_readonly(tr); > int cpu; > > + if (readonly) > + writable_mode = TRACE_MODE_READ; > + > trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer, > - tr, &show_traces_fops); > + tr, &show_traces_fops); > > - trace_create_file("current_tracer", TRACE_MODE_WRITE, d_tracer, > - tr, &set_tracer_fops); > + trace_create_file("current_tracer", writable_mode, d_tracer, > + tr, &set_tracer_fops); > > - trace_create_file("tracing_cpumask", TRACE_MODE_WRITE, d_tracer, > + trace_create_file("tracing_cpumask", writable_mode, d_tracer, > tr, &tracing_cpumask_fops); > > + /* Options are used for changing print-format even for readonly instance. */ > trace_create_file("trace_options", TRACE_MODE_WRITE, d_tracer, > tr, &tracing_iter_fops); > > @@ -10652,27 +10684,35 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > trace_create_file("trace_pipe", TRACE_MODE_READ, d_tracer, > tr, &tracing_pipe_fops); > > - trace_create_file("buffer_size_kb", TRACE_MODE_WRITE, d_tracer, > + trace_create_file("buffer_size_kb", writable_mode, d_tracer, > tr, &tracing_entries_fops); > > trace_create_file("buffer_total_size_kb", TRACE_MODE_READ, d_tracer, > tr, &tracing_total_entries_fops); > > - trace_create_file("free_buffer", 0200, d_tracer, > - tr, &tracing_free_buffer_fops); > + if (!readonly) { > + trace_create_file("free_buffer", 0200, d_tracer, > + tr, &tracing_free_buffer_fops); Hmm, why remove the free_buffer. It just shrinks the buffer down to a minimum. Perhaps its useless, but I it doesn't write to the buffer. Sure it removes data but so does trace_pipe. > > - trace_create_file("trace_marker", 0220, d_tracer, > - tr, &tracing_mark_fops); > + trace_create_file("trace_marker", 0220, d_tracer, > + tr, &tracing_mark_fops); > > - tr->trace_marker_file = __find_event_file(tr, "ftrace", "print"); > + tr->trace_marker_file = __find_event_file(tr, "ftrace", "print"); > > - trace_create_file("trace_marker_raw", 0220, d_tracer, > - tr, &tracing_mark_raw_fops); > + trace_create_file("trace_marker_raw", 0220, d_tracer, > + tr, &tracing_mark_raw_fops); > > - trace_create_file("trace_clock", TRACE_MODE_WRITE, d_tracer, tr, > + trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, > + tr, &buffer_percent_fops); > + > + trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, > + tr, &tracing_syscall_buf_fops); > + } > + > + trace_create_file("trace_clock", writable_mode, d_tracer, tr, > &trace_clock_fops); > > - trace_create_file("tracing_on", TRACE_MODE_WRITE, d_tracer, > + trace_create_file("tracing_on", writable_mode, d_tracer, > tr, &rb_simple_fops); Hmm, should tracing_on exist in read only mode? > > trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, > @@ -10680,41 +10720,38 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > > tr->buffer_percent = 50; > > - trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, > - tr, &buffer_percent_fops); > - > - trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, > + trace_create_file("buffer_subbuf_size_kb", writable_mode, d_tracer, > tr, &buffer_subbuf_size_fops); > > - trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, > - tr, &tracing_syscall_buf_fops); > - > create_trace_options_dir(tr); > > #ifdef CONFIG_TRACER_MAX_TRACE > - trace_create_maxlat_file(tr, d_tracer); > + if (!readonly) > + trace_create_maxlat_file(tr, d_tracer); > #endif > > - if (ftrace_create_function_files(tr, d_tracer)) > + if (!readonly && ftrace_create_function_files(tr, d_tracer)) > MEM_FAIL(1, "Could not allocate function filter files"); > > if (tr->range_addr_start) { > trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, > tr, &last_boot_fops); > #ifdef CONFIG_TRACER_SNAPSHOT > - } else { > + } else if (!readonly) { > trace_create_file("snapshot", TRACE_MODE_WRITE, d_tracer, > tr, &snapshot_fops); > #endif > } > > - trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, > - tr, &tracing_err_log_fops); > + if (!readonly) > + trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, > + tr, &tracing_err_log_fops); Why not move this up into the "if (!readonly) {" block above? > > for_each_tracing_cpu(cpu) > tracing_init_tracefs_percpu(tr, cpu); > > - ftrace_init_tracefs(tr, d_tracer); > + if (!readonly) > + ftrace_init_tracefs(tr, d_tracer); Same here. Or just move the readonly block down to the end of the function. > } > > #ifdef CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h > index 69e7defba6c6..0adc644084bf 100644 > --- a/kernel/trace/trace.h > +++ b/kernel/trace/trace.h > @@ -33,6 +33,7 @@ > > #define TRACE_MODE_WRITE 0640 > #define TRACE_MODE_READ 0440 > +#define TRACE_MODE_WRITE_MASK (TRACE_MODE_WRITE & ~TRACE_MODE_READ) > > enum trace_type { > __TRACE_FIRST_TYPE = 0, > @@ -483,6 +484,12 @@ extern bool trace_clock_in_ns(struct trace_array *tr); > > extern unsigned long trace_adjust_address(struct trace_array *tr, unsigned long addr); > > +static inline bool trace_array_is_readonly(struct trace_array *tr) > +{ > + /* backup instance is read only. */ > + return tr->flags & TRACE_ARRAY_FL_VMALLOC; Hmm, I wonder if we should create a RDONLY flag for the trace_array? -- Steve > +} > + > /* > * The global tracer (top) should be the first trace array added, > * but we check the flag anyway. > @@ -681,7 +688,6 @@ struct dentry *trace_create_file(const char *name, > void *data, > const struct file_operations *fops); > > - > /** > * tracer_tracing_is_on_cpu - show real state of ring buffer enabled on for a cpu > * @tr : the trace array to know if ring buffer is enabled > diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c > index dbe29b4c6a7a..2ca2541c8a58 100644 > --- a/kernel/trace/trace_boot.c > +++ b/kernel/trace/trace_boot.c > @@ -61,7 +61,8 @@ trace_boot_set_instance_options(struct trace_array *tr, struct xbc_node *node) > v = memparse(p, NULL); > if (v < PAGE_SIZE) > pr_err("Buffer size is too small: %s\n", p); > - if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) > + if (trace_array_is_readonly(tr) || > + tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) > pr_err("Failed to resize trace buffer to %s\n", p); > } > > @@ -597,7 +598,7 @@ trace_boot_enable_tracer(struct trace_array *tr, struct xbc_node *node) > > p = xbc_node_find_value(node, "tracer", NULL); > if (p && *p != '\0') { > - if (tracing_set_tracer(tr, p) < 0) > + if (trace_array_is_readonly(tr) || tracing_set_tracer(tr, p) < 0) > pr_err("Failed to set given tracer: %s\n", p); > } > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 2/4] tracing: Make the backup instance non-reusable 2026-01-29 20:07 ` Steven Rostedt @ 2026-01-30 9:20 ` Masami Hiramatsu 2026-01-30 15:21 ` Steven Rostedt 0 siblings, 1 reply; 16+ messages in thread From: Masami Hiramatsu @ 2026-01-30 9:20 UTC (permalink / raw) To: Steven Rostedt; +Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Thu, 29 Jan 2026 15:07:45 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > On Wed, 28 Jan 2026 09:09:56 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > @@ -10632,17 +10658,23 @@ static __init void create_trace_instances(struct dentry *d_tracer) > > static void > > init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > > { > > + umode_t writable_mode = TRACE_MODE_WRITE; > > + bool readonly = trace_array_is_readonly(tr); > > int cpu; > > > > + if (readonly) > > + writable_mode = TRACE_MODE_READ; > > + > > trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer, > > - tr, &show_traces_fops); > > + tr, &show_traces_fops); > > > > - trace_create_file("current_tracer", TRACE_MODE_WRITE, d_tracer, > > - tr, &set_tracer_fops); > > + trace_create_file("current_tracer", writable_mode, d_tracer, > > + tr, &set_tracer_fops); > > > > - trace_create_file("tracing_cpumask", TRACE_MODE_WRITE, d_tracer, > > + trace_create_file("tracing_cpumask", writable_mode, d_tracer, > > tr, &tracing_cpumask_fops); > > > > + /* Options are used for changing print-format even for readonly instance. */ > > trace_create_file("trace_options", TRACE_MODE_WRITE, d_tracer, > > tr, &tracing_iter_fops); > > > > @@ -10652,27 +10684,35 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > > trace_create_file("trace_pipe", TRACE_MODE_READ, d_tracer, > > tr, &tracing_pipe_fops); > > > > - trace_create_file("buffer_size_kb", TRACE_MODE_WRITE, d_tracer, > > + trace_create_file("buffer_size_kb", writable_mode, d_tracer, > > tr, &tracing_entries_fops); > > > > trace_create_file("buffer_total_size_kb", TRACE_MODE_READ, d_tracer, > > tr, &tracing_total_entries_fops); > > > > - trace_create_file("free_buffer", 0200, d_tracer, > > - tr, &tracing_free_buffer_fops); > > + if (!readonly) { > > + trace_create_file("free_buffer", 0200, d_tracer, > > + tr, &tracing_free_buffer_fops); > > Hmm, why remove the free_buffer. It just shrinks the buffer down to a > minimum. Perhaps its useless, but I it doesn't write to the buffer. Sure it > removes data but so does trace_pipe. I can keep it but free_buffer and free instance is a bit different on the persistent ring buffer and its backup. For both cases, since the scratch area needs to be kept, it does not free the reserved memory. (but the minimum ring buffer is dynamically allocated.) IMHO, the buffer resize interfaces are not useful for persistent ring buffer. > > > > > - trace_create_file("trace_marker", 0220, d_tracer, > > - tr, &tracing_mark_fops); > > + trace_create_file("trace_marker", 0220, d_tracer, > > + tr, &tracing_mark_fops); > > > > - tr->trace_marker_file = __find_event_file(tr, "ftrace", "print"); > > + tr->trace_marker_file = __find_event_file(tr, "ftrace", "print"); > > > > - trace_create_file("trace_marker_raw", 0220, d_tracer, > > - tr, &tracing_mark_raw_fops); > > + trace_create_file("trace_marker_raw", 0220, d_tracer, > > + tr, &tracing_mark_raw_fops); > > > > - trace_create_file("trace_clock", TRACE_MODE_WRITE, d_tracer, tr, > > + trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, > > + tr, &buffer_percent_fops); > > + > > + trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, > > + tr, &tracing_syscall_buf_fops); > > + } > > + > > + trace_create_file("trace_clock", writable_mode, d_tracer, tr, > > &trace_clock_fops); > > > > - trace_create_file("tracing_on", TRACE_MODE_WRITE, d_tracer, > > + trace_create_file("tracing_on", writable_mode, d_tracer, > > tr, &rb_simple_fops); > > Hmm, should tracing_on exist in read only mode? Ah, indeed. I'll remove this too. > > > > trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, > > @@ -10680,41 +10720,38 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > > > > tr->buffer_percent = 50; > > > > - trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, > > - tr, &buffer_percent_fops); > > - > > - trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, > > + trace_create_file("buffer_subbuf_size_kb", writable_mode, d_tracer, > > tr, &buffer_subbuf_size_fops); > > > > - trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, > > - tr, &tracing_syscall_buf_fops); > > - > > create_trace_options_dir(tr); > > > > #ifdef CONFIG_TRACER_MAX_TRACE > > - trace_create_maxlat_file(tr, d_tracer); > > + if (!readonly) > > + trace_create_maxlat_file(tr, d_tracer); > > #endif > > > > - if (ftrace_create_function_files(tr, d_tracer)) > > + if (!readonly && ftrace_create_function_files(tr, d_tracer)) > > MEM_FAIL(1, "Could not allocate function filter files"); > > > > if (tr->range_addr_start) { > > trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, > > tr, &last_boot_fops); > > #ifdef CONFIG_TRACER_SNAPSHOT > > - } else { > > + } else if (!readonly) { > > trace_create_file("snapshot", TRACE_MODE_WRITE, d_tracer, > > tr, &snapshot_fops); > > #endif > > } > > > > - trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, > > - tr, &tracing_err_log_fops); > > + if (!readonly) > > + trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, > > + tr, &tracing_err_log_fops); > > Why not move this up into the "if (!readonly) {" block above? OK, I'll move it. > > > > > for_each_tracing_cpu(cpu) > > tracing_init_tracefs_percpu(tr, cpu); > > > > - ftrace_init_tracefs(tr, d_tracer); > > + if (!readonly) > > + ftrace_init_tracefs(tr, d_tracer); > > Same here. Or just move the readonly block down to the end of the function. OK. > > > } > > > > #ifdef CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h > > index 69e7defba6c6..0adc644084bf 100644 > > --- a/kernel/trace/trace.h > > +++ b/kernel/trace/trace.h > > @@ -33,6 +33,7 @@ > > > > #define TRACE_MODE_WRITE 0640 > > #define TRACE_MODE_READ 0440 > > +#define TRACE_MODE_WRITE_MASK (TRACE_MODE_WRITE & ~TRACE_MODE_READ) > > > > enum trace_type { > > __TRACE_FIRST_TYPE = 0, > > @@ -483,6 +484,12 @@ extern bool trace_clock_in_ns(struct trace_array *tr); > > > > extern unsigned long trace_adjust_address(struct trace_array *tr, unsigned long addr); > > > > +static inline bool trace_array_is_readonly(struct trace_array *tr) > > +{ > > + /* backup instance is read only. */ > > + return tr->flags & TRACE_ARRAY_FL_VMALLOC; > > Hmm, I wonder if we should create a RDONLY flag for the trace_array? OK, let me add the RDONLY flag. (maybe it will be reused afterwards) Thanks, > > -- Steve > > > +} > > + > > /* > > * The global tracer (top) should be the first trace array added, > > * but we check the flag anyway. > > @@ -681,7 +688,6 @@ struct dentry *trace_create_file(const char *name, > > void *data, > > const struct file_operations *fops); > > > > - > > /** > > * tracer_tracing_is_on_cpu - show real state of ring buffer enabled on for a cpu > > * @tr : the trace array to know if ring buffer is enabled > > diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c > > index dbe29b4c6a7a..2ca2541c8a58 100644 > > --- a/kernel/trace/trace_boot.c > > +++ b/kernel/trace/trace_boot.c > > @@ -61,7 +61,8 @@ trace_boot_set_instance_options(struct trace_array *tr, struct xbc_node *node) > > v = memparse(p, NULL); > > if (v < PAGE_SIZE) > > pr_err("Buffer size is too small: %s\n", p); > > - if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) > > + if (trace_array_is_readonly(tr) || > > + tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) > > pr_err("Failed to resize trace buffer to %s\n", p); > > } > > > > @@ -597,7 +598,7 @@ trace_boot_enable_tracer(struct trace_array *tr, struct xbc_node *node) > > > > p = xbc_node_find_value(node, "tracer", NULL); > > if (p && *p != '\0') { > > - if (tracing_set_tracer(tr, p) < 0) > > + if (trace_array_is_readonly(tr) || tracing_set_tracer(tr, p) < 0) > > pr_err("Failed to set given tracer: %s\n", p); > > } > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 2/4] tracing: Make the backup instance non-reusable 2026-01-30 9:20 ` Masami Hiramatsu @ 2026-01-30 15:21 ` Steven Rostedt 0 siblings, 0 replies; 16+ messages in thread From: Steven Rostedt @ 2026-01-30 15:21 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Fri, 30 Jan 2026 18:20:58 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > - trace_create_file("free_buffer", 0200, d_tracer, > > > - tr, &tracing_free_buffer_fops); > > > + if (!readonly) { > > > + trace_create_file("free_buffer", 0200, d_tracer, > > > + tr, &tracing_free_buffer_fops); > > > > Hmm, why remove the free_buffer. It just shrinks the buffer down to a > > minimum. Perhaps its useless, but I it doesn't write to the buffer. Sure it > > removes data but so does trace_pipe. > > I can keep it but free_buffer and free instance is a bit different on > the persistent ring buffer and its backup. For both cases, since the > scratch area needs to be kept, it does not free the reserved memory. > (but the minimum ring buffer is dynamically allocated.) > IMHO, the buffer resize interfaces are not useful for persistent ring > buffer. Ah yeah. OK, lets not add free_buffer to read only instances. -- Steve ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v5 3/4] tracing: Remove the backup instance automatically after read 2026-01-28 0:09 [PATCH v5 0/4] tracing: Remove backup instance after read all Masami Hiramatsu (Google) 2026-01-28 0:09 ` [PATCH v5 1/4] tracing: Reset last_boot_info if ring buffer is reset Masami Hiramatsu (Google) 2026-01-28 0:09 ` [PATCH v5 2/4] tracing: Make the backup instance non-reusable Masami Hiramatsu (Google) @ 2026-01-28 0:10 ` Masami Hiramatsu (Google) 2026-01-29 20:13 ` Steven Rostedt 2026-01-28 0:10 ` [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance Masami Hiramatsu (Google) 3 siblings, 1 reply; 16+ messages in thread From: Masami Hiramatsu (Google) @ 2026-01-28 0:10 UTC (permalink / raw) To: Steven Rostedt Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel, linux-trace-kernel From: Masami Hiramatsu (Google) <mhiramat@kernel.org> Since the backup instance is readonly, after reading all data via pipe, no data is left on the instance. Thus it can be removed safely after closing all files. This also removes it if user resets the ring buffer manually via 'trace' file. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> --- Changes in v4: - Update description. --- kernel/trace/trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++- kernel/trace/trace.h | 6 +++++ 2 files changed, 69 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index d39f6509c12a..7d615a74f915 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -590,6 +590,55 @@ void trace_set_ring_buffer_expanded(struct trace_array *tr) tr->ring_buffer_expanded = true; } +static int __remove_instance(struct trace_array *tr); + +static void trace_array_autoremove(struct work_struct *work) +{ + struct trace_array *tr = container_of(work, struct trace_array, autoremove_work); + + guard(mutex)(&event_mutex); + guard(mutex)(&trace_types_lock); + + /* + * This can be fail if someone gets @tr before starting this + * function, but in that case, this will be kicked again when + * putting it. So we don't care the result. + */ + __remove_instance(tr); +} + +static struct workqueue_struct *autoremove_wq; + +static void trace_array_init_autoremove(struct trace_array *tr) +{ + INIT_WORK(&tr->autoremove_work, trace_array_autoremove); +} + +static void trace_array_kick_autoremove(struct trace_array *tr) +{ + if (!work_pending(&tr->autoremove_work) && autoremove_wq) + queue_work(autoremove_wq, &tr->autoremove_work); +} + +static void trace_array_cancel_autoremove(struct trace_array *tr) +{ + if (work_pending(&tr->autoremove_work)) + cancel_work(&tr->autoremove_work); +} + +__init static int trace_array_init_autoremove_wq(void) +{ + autoremove_wq = alloc_workqueue("tr_autoremove_wq", + WQ_UNBOUND | WQ_HIGHPRI, 0); + if (!autoremove_wq) { + pr_err("Unable to allocate tr_autoremove_wq\n"); + return -ENOMEM; + } + return 0; +} + +late_initcall_sync(trace_array_init_autoremove_wq); + LIST_HEAD(ftrace_trace_arrays); int trace_array_get(struct trace_array *this_tr) @@ -598,7 +647,7 @@ int trace_array_get(struct trace_array *this_tr) guard(mutex)(&trace_types_lock); list_for_each_entry(tr, &ftrace_trace_arrays, list) { - if (tr == this_tr) { + if (tr == this_tr && !tr->free_on_close) { tr->ref++; return 0; } @@ -611,6 +660,12 @@ static void __trace_array_put(struct trace_array *this_tr) { WARN_ON(!this_tr->ref); this_tr->ref--; + /* + * When free_on_close is set, prepare removing the array + * when the last reference is released. + */ + if (this_tr->ref == 1 && this_tr->free_on_close) + trace_array_kick_autoremove(this_tr); } /** @@ -6237,6 +6292,10 @@ static void update_last_data(struct trace_array *tr) /* Only if the buffer has previous boot data clear and update it. */ tr->flags &= ~TRACE_ARRAY_FL_LAST_BOOT; + /* If this is a backup instance, mark it for autoremove. */ + if (tr->flags & TRACE_ARRAY_FL_VMALLOC) + tr->free_on_close = true; + /* Reset the module list and reload them */ if (tr->scratch) { struct trace_scratch *tscratch = tr->scratch; @@ -10418,6 +10477,8 @@ trace_array_create_systems(const char *name, const char *systems, if (ftrace_allocate_ftrace_ops(tr) < 0) goto out_free_tr; + trace_array_init_autoremove(tr); + ftrace_init_trace_array(tr); init_trace_flags_index(tr); @@ -10566,6 +10627,7 @@ static int __remove_instance(struct trace_array *tr) if (update_marker_trace(tr, 0)) synchronize_rcu(); + trace_array_cancel_autoremove(tr); tracing_set_nop(tr); clear_ftrace_function_probes(tr); event_trace_del_tracer(tr); diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 0adc644084bf..d9d45dd3c0f3 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -447,6 +447,12 @@ struct trace_array { * we do not waste memory on systems that are not using tracing. */ bool ring_buffer_expanded; + /* + * If the ring buffer is a read only backup instance, it will be + * removed after dumping all data via pipe, because no readable data. + */ + bool free_on_close; + struct work_struct autoremove_work; }; enum { ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v5 3/4] tracing: Remove the backup instance automatically after read 2026-01-28 0:10 ` [PATCH v5 3/4] tracing: Remove the backup instance automatically after read Masami Hiramatsu (Google) @ 2026-01-29 20:13 ` Steven Rostedt 2026-01-30 9:24 ` Masami Hiramatsu 0 siblings, 1 reply; 16+ messages in thread From: Steven Rostedt @ 2026-01-29 20:13 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Wed, 28 Jan 2026 09:10:04 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > Since the backup instance is readonly, after reading all data > via pipe, no data is left on the instance. Thus it can be > removed safely after closing all files. > This also removes it if user resets the ring buffer manually > via 'trace' file. > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > --- > Changes in v4: > - Update description. > --- > kernel/trace/trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++- > kernel/trace/trace.h | 6 +++++ > 2 files changed, 69 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index d39f6509c12a..7d615a74f915 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -590,6 +590,55 @@ void trace_set_ring_buffer_expanded(struct trace_array *tr) > tr->ring_buffer_expanded = true; > } > > +static int __remove_instance(struct trace_array *tr); > + > +static void trace_array_autoremove(struct work_struct *work) > +{ > + struct trace_array *tr = container_of(work, struct trace_array, autoremove_work); > + > + guard(mutex)(&event_mutex); > + guard(mutex)(&trace_types_lock); > + > + /* > + * This can be fail if someone gets @tr before starting this > + * function, but in that case, this will be kicked again when > + * putting it. So we don't care the result. "So we don't care about the result." > + */ > + __remove_instance(tr); > +} > + > +static struct workqueue_struct *autoremove_wq; > + > +static void trace_array_init_autoremove(struct trace_array *tr) > +{ > + INIT_WORK(&tr->autoremove_work, trace_array_autoremove); > +} > + > +static void trace_array_kick_autoremove(struct trace_array *tr) > +{ > + if (!work_pending(&tr->autoremove_work) && autoremove_wq) > + queue_work(autoremove_wq, &tr->autoremove_work); > +} > + > +static void trace_array_cancel_autoremove(struct trace_array *tr) > +{ > + if (work_pending(&tr->autoremove_work)) > + cancel_work(&tr->autoremove_work); > +} > + > +__init static int trace_array_init_autoremove_wq(void) > +{ This isn't needed if there's no backup trace_array right? Instead of creating a work queue when its not needed, just exit out if there's no backup trace_array. Oh, and the above functions should always test autoremove_wq for NULL. > + autoremove_wq = alloc_workqueue("tr_autoremove_wq", > + WQ_UNBOUND | WQ_HIGHPRI, 0); > + if (!autoremove_wq) { > + pr_err("Unable to allocate tr_autoremove_wq\n"); > + return -ENOMEM; > + } > + return 0; > +} > + > +late_initcall_sync(trace_array_init_autoremove_wq); > + > LIST_HEAD(ftrace_trace_arrays); > > int trace_array_get(struct trace_array *this_tr) > @@ -598,7 +647,7 @@ int trace_array_get(struct trace_array *this_tr) > > guard(mutex)(&trace_types_lock); > list_for_each_entry(tr, &ftrace_trace_arrays, list) { > - if (tr == this_tr) { > + if (tr == this_tr && !tr->free_on_close) { > tr->ref++; > return 0; > } Break the above into: if (tr == this_tr) { if (tr->free_on_close) break; tr->ref++; return 0; } Why continue the loop if we found the trace_array but it's in the process of closing? -- Steve ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 3/4] tracing: Remove the backup instance automatically after read 2026-01-29 20:13 ` Steven Rostedt @ 2026-01-30 9:24 ` Masami Hiramatsu 0 siblings, 0 replies; 16+ messages in thread From: Masami Hiramatsu @ 2026-01-30 9:24 UTC (permalink / raw) To: Steven Rostedt; +Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Thu, 29 Jan 2026 15:13:52 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > On Wed, 28 Jan 2026 09:10:04 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > Since the backup instance is readonly, after reading all data > > via pipe, no data is left on the instance. Thus it can be > > removed safely after closing all files. > > This also removes it if user resets the ring buffer manually > > via 'trace' file. > > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > Changes in v4: > > - Update description. > > --- > > kernel/trace/trace.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++- > > kernel/trace/trace.h | 6 +++++ > > 2 files changed, 69 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > > index d39f6509c12a..7d615a74f915 100644 > > --- a/kernel/trace/trace.c > > +++ b/kernel/trace/trace.c > > @@ -590,6 +590,55 @@ void trace_set_ring_buffer_expanded(struct trace_array *tr) > > tr->ring_buffer_expanded = true; > > } > > > > +static int __remove_instance(struct trace_array *tr); > > + > > +static void trace_array_autoremove(struct work_struct *work) > > +{ > > + struct trace_array *tr = container_of(work, struct trace_array, autoremove_work); > > + > > + guard(mutex)(&event_mutex); > > + guard(mutex)(&trace_types_lock); > > + > > + /* > > + * This can be fail if someone gets @tr before starting this > > + * function, but in that case, this will be kicked again when > > + * putting it. So we don't care the result. > > "So we don't care about the result." OK. > > > + */ > > + __remove_instance(tr); > > +} > > + > > +static struct workqueue_struct *autoremove_wq; > > + > > +static void trace_array_init_autoremove(struct trace_array *tr) > > +{ > > + INIT_WORK(&tr->autoremove_work, trace_array_autoremove); > > +} > > + > > +static void trace_array_kick_autoremove(struct trace_array *tr) > > +{ > > + if (!work_pending(&tr->autoremove_work) && autoremove_wq) > > + queue_work(autoremove_wq, &tr->autoremove_work); > > +} > > + > > +static void trace_array_cancel_autoremove(struct trace_array *tr) > > +{ > > + if (work_pending(&tr->autoremove_work)) > > + cancel_work(&tr->autoremove_work); > > +} > > + > > +__init static int trace_array_init_autoremove_wq(void) > > +{ > > This isn't needed if there's no backup trace_array right? > > Instead of creating a work queue when its not needed, just exit out if > there's no backup trace_array. > > Oh, and the above functions should always test autoremove_wq for NULL. Ah, right. OK, let me fix it. > > > + autoremove_wq = alloc_workqueue("tr_autoremove_wq", > > + WQ_UNBOUND | WQ_HIGHPRI, 0); > > + if (!autoremove_wq) { > > + pr_err("Unable to allocate tr_autoremove_wq\n"); > > + return -ENOMEM; > > + } > > + return 0; > > +} > > + > > +late_initcall_sync(trace_array_init_autoremove_wq); > > + > > LIST_HEAD(ftrace_trace_arrays); > > > > int trace_array_get(struct trace_array *this_tr) > > @@ -598,7 +647,7 @@ int trace_array_get(struct trace_array *this_tr) > > > > guard(mutex)(&trace_types_lock); > > list_for_each_entry(tr, &ftrace_trace_arrays, list) { > > - if (tr == this_tr) { > > + if (tr == this_tr && !tr->free_on_close) { > > tr->ref++; > > return 0; > > } > > Break the above into: > > if (tr == this_tr) { > if (tr->free_on_close) > break; > tr->ref++; > return 0; > } > > Why continue the loop if we found the trace_array but it's in the process > of closing? Ah, indeed. OK, I'll fix it. Thank you! > > -- Steve > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance 2026-01-28 0:09 [PATCH v5 0/4] tracing: Remove backup instance after read all Masami Hiramatsu (Google) ` (2 preceding siblings ...) 2026-01-28 0:10 ` [PATCH v5 3/4] tracing: Remove the backup instance automatically after read Masami Hiramatsu (Google) @ 2026-01-28 0:10 ` Masami Hiramatsu (Google) 2026-01-29 20:19 ` Steven Rostedt 3 siblings, 1 reply; 16+ messages in thread From: Masami Hiramatsu (Google) @ 2026-01-28 0:10 UTC (permalink / raw) To: Steven Rostedt Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel, linux-trace-kernel From: Masami Hiramatsu (Google) <mhiramat@kernel.org> Add a section about backup instance to the debugging.rst. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> --- Documentation/trace/debugging.rst | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/Documentation/trace/debugging.rst b/Documentation/trace/debugging.rst index 4d88c346fc38..73e2154bcf98 100644 --- a/Documentation/trace/debugging.rst +++ b/Documentation/trace/debugging.rst @@ -159,3 +159,22 @@ If setting it from the kernel command line, it is recommended to also disable tracing with the "traceoff" flag, and enable tracing after boot up. Otherwise the trace from the most recent boot will be mixed with the trace from the previous boot, and may make it confusing to read. + +Using a backup instance for keeping previous boot data +------------------------------------------------------ + +It is also possible to record trace data at system boot time by specifying +events with the persistent ring buffer, but in this case the data before the +reboot will be lost before it can be read. This problem can be solved by a +backup instance. From the kernel command line:: + + reserve_mem=12M:4096:trace trace_instance=boot_map@trace,sched,irq trace_instance=backup=boot_map + +When the boot up, the previous data in the`boot_map` is copied to "backup" +instance, and the "sched:*" and "irq:*" events for current boot are traced on +"boot_map". Thus user can read the previous boot data from the "backup" instance +without stopping trace. + +Note that this "backup" instance is readonly, and will be removed automatically +if you clear the trace data or read out all trace data from "trace_pipe" or +"trace_pipe_raw" files. \ No newline at end of file ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance 2026-01-28 0:10 ` [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance Masami Hiramatsu (Google) @ 2026-01-29 20:19 ` Steven Rostedt 2026-01-30 13:24 ` Masami Hiramatsu 0 siblings, 1 reply; 16+ messages in thread From: Steven Rostedt @ 2026-01-29 20:19 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Wed, 28 Jan 2026 09:10:11 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > Add a section about backup instance to the debugging.rst. > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > --- > Documentation/trace/debugging.rst | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/Documentation/trace/debugging.rst b/Documentation/trace/debugging.rst > index 4d88c346fc38..73e2154bcf98 100644 > --- a/Documentation/trace/debugging.rst > +++ b/Documentation/trace/debugging.rst > @@ -159,3 +159,22 @@ If setting it from the kernel command line, it is recommended to also > disable tracing with the "traceoff" flag, and enable tracing after boot up. > Otherwise the trace from the most recent boot will be mixed with the trace > from the previous boot, and may make it confusing to read. > + > +Using a backup instance for keeping previous boot data > +------------------------------------------------------ > + > +It is also possible to record trace data at system boot time by specifying > +events with the persistent ring buffer, but in this case the data before the > +reboot will be lost before it can be read. This problem can be solved by a > +backup instance. From the kernel command line:: > + > + reserve_mem=12M:4096:trace trace_instance=boot_map@trace,sched,irq trace_instance=backup=boot_map > + > +When the boot up, the previous data in the`boot_map` is copied to "backup" s/When the boot up/On boot up/ copied to the "backup" > +instance, and the "sched:*" and "irq:*" events for current boot are traced on for the current boot are traced in > +"boot_map". Thus user can read the previous boot data from the "backup" instance Thus the user > +without stopping trace. stopping the trace. > + > +Note that this "backup" instance is readonly, and will be removed automatically > +if you clear the trace data or read out all trace data from "trace_pipe" or > +"trace_pipe_raw" files. > \ No newline at end of file Definite articles are a pain ;-) -- Steve ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance 2026-01-29 20:19 ` Steven Rostedt @ 2026-01-30 13:24 ` Masami Hiramatsu 2026-01-30 14:23 ` Steven Rostedt 0 siblings, 1 reply; 16+ messages in thread From: Masami Hiramatsu @ 2026-01-30 13:24 UTC (permalink / raw) To: Steven Rostedt; +Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Thu, 29 Jan 2026 15:19:00 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > On Wed, 28 Jan 2026 09:10:11 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > Add a section about backup instance to the debugging.rst. > > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > Documentation/trace/debugging.rst | 19 +++++++++++++++++++ > > 1 file changed, 19 insertions(+) > > > > diff --git a/Documentation/trace/debugging.rst b/Documentation/trace/debugging.rst > > index 4d88c346fc38..73e2154bcf98 100644 > > --- a/Documentation/trace/debugging.rst > > +++ b/Documentation/trace/debugging.rst > > @@ -159,3 +159,22 @@ If setting it from the kernel command line, it is recommended to also > > disable tracing with the "traceoff" flag, and enable tracing after boot up. > > Otherwise the trace from the most recent boot will be mixed with the trace > > from the previous boot, and may make it confusing to read. > > + > > +Using a backup instance for keeping previous boot data > > +------------------------------------------------------ > > + > > +It is also possible to record trace data at system boot time by specifying > > +events with the persistent ring buffer, but in this case the data before the > > +reboot will be lost before it can be read. This problem can be solved by a > > +backup instance. From the kernel command line:: > > + > > + reserve_mem=12M:4096:trace trace_instance=boot_map@trace,sched,irq trace_instance=backup=boot_map > > + > > +When the boot up, the previous data in the`boot_map` is copied to "backup" > > s/When the boot up/On boot up/ copied to the "backup" Ah, thanks, > > > +instance, and the "sched:*" and "irq:*" events for current boot are traced on > > for the current boot are traced in > > > > +"boot_map". Thus user can read the previous boot data from the "backup" instance ^ the "boot_map". > > Thus the user > > > +without stopping trace. > > stopping the trace. OK. > > > + > > +Note that this "backup" instance is readonly, and will be removed automatically > > +if you clear the trace data or read out all trace data from "trace_pipe" or > > +"trace_pipe_raw" files. > > \ No newline at end of file > > Definite articles are a pain ;-) Thanks! > > -- Steve -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance 2026-01-30 13:24 ` Masami Hiramatsu @ 2026-01-30 14:23 ` Steven Rostedt 2026-01-30 14:27 ` Steven Rostedt 0 siblings, 1 reply; 16+ messages in thread From: Steven Rostedt @ 2026-01-30 14:23 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Fri, 30 Jan 2026 22:24:49 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > + > > > +When the boot up, the previous data in the`boot_map` is copied to "backup" > > > > s/When the boot up/On boot up/ copied to the "backup" > > Ah, thanks, > > > > > > +instance, and the "sched:*" and "irq:*" events for current boot are traced on > > > > for the current boot are traced in > > > > > > > +"boot_map". Thus user can read the previous boot data from the "backup" instance > > ^ the "boot_map". Nope ;-) In this case, leaving out the "the" is appropriate. Now you would need article if it was: the "boot_map" instance But because you left out "instance" you don't need the article, as "boot_map" is a name of an object and doesn't need the article when used by itself. You wouldn't use an article with: The code was written by Steven. But you would if you said: The code was written by the Steven person. ;-) > > > > Definite articles are a pain ;-) As I said. Definite articles in English are a pain ;) -- Steve ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance 2026-01-30 14:23 ` Steven Rostedt @ 2026-01-30 14:27 ` Steven Rostedt 0 siblings, 0 replies; 16+ messages in thread From: Steven Rostedt @ 2026-01-30 14:27 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel On Fri, 30 Jan 2026 09:23:29 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > Nope ;-) In this case, leaving out the "the" is appropriate. Now you would > need article if it was: the "boot_map" instance "need the article" God, I can't even get it right :-p -- Steve ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2026-01-30 15:21 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-01-28 0:09 [PATCH v5 0/4] tracing: Remove backup instance after read all Masami Hiramatsu (Google) 2026-01-28 0:09 ` [PATCH v5 1/4] tracing: Reset last_boot_info if ring buffer is reset Masami Hiramatsu (Google) 2026-01-28 0:09 ` [PATCH v5 2/4] tracing: Make the backup instance non-reusable Masami Hiramatsu (Google) 2026-01-29 19:53 ` Steven Rostedt 2026-01-30 9:01 ` Masami Hiramatsu 2026-01-29 20:07 ` Steven Rostedt 2026-01-30 9:20 ` Masami Hiramatsu 2026-01-30 15:21 ` Steven Rostedt 2026-01-28 0:10 ` [PATCH v5 3/4] tracing: Remove the backup instance automatically after read Masami Hiramatsu (Google) 2026-01-29 20:13 ` Steven Rostedt 2026-01-30 9:24 ` Masami Hiramatsu 2026-01-28 0:10 ` [PATCH v5 4/4] tracing/Documentation: Add a section about backup instance Masami Hiramatsu (Google) 2026-01-29 20:19 ` Steven Rostedt 2026-01-30 13:24 ` Masami Hiramatsu 2026-01-30 14:23 ` Steven Rostedt 2026-01-30 14:27 ` Steven Rostedt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox