* Re: [GIT PULL] RTLA changes for v7.1
From: Steven Rostedt @ 2026-04-02 16:26 UTC (permalink / raw)
To: Tomas Glozar
Cc: Costa Shulyupin, Wander Lairson Costa, LKML, linux-trace-kernel
In-Reply-To: <CAP4=nvRZ2iEVxSgwHwumgonrhdvkYiG9KsCh0S2kwBwnemMi6A@mail.gmail.com>
On Thu, 2 Apr 2026 17:08:36 +0200
Tomas Glozar <tglozar@redhat.com> wrote:
> After merging the fix for 7.0 [1], there's now a context difference
> caused by commit ea06305ff9920 (tools/rtla: Remove unneeded nr_cpus
> arguments) on merging rtla-v7.1 onto the current master. The context
> difference merges cleanly via three-way merge:
>
> $ git merge rtla-v7.1
> Auto-merging tools/tracing/rtla/src/timerlat_bpf.h
> Merge made by the 'ort' strategy.
> ...
>
> Do you prefer me to rebase this PR on top of 7.0-rc6 once it's tagged
> or to leave the pull request as is and perhaps add a note to your PR
> to Linus the merge difference is expected?
This is fine as is. Linus is used to this. He's even OK with minor merge
conflicts. The only time you really need to tell Linus about something is
if the merge conflicts or a merge causes something to break but merges
cleanly (like removing an extra #endif)
-- Steve
^ permalink raw reply
* Re: [GIT PULL] RTLA changes for v7.1
From: Steven Rostedt @ 2026-04-02 16:28 UTC (permalink / raw)
To: Tomas Glozar
Cc: Costa Shulyupin, Wander Lairson Costa, LKML, linux-trace-kernel
In-Reply-To: <20260402122634.4d750f2c@gandalf.local.home>
On Thu, 2 Apr 2026 12:26:34 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> This is fine as is. Linus is used to this. He's even OK with minor merge
> conflicts. The only time you really need to tell Linus about something is
> if the merge conflicts or a merge causes something to break but merges
> cleanly (like removing an extra #endif)
In fact, Linus actually looks down at rebasing. You should only rebase if
there's something really nasty.
The code is already in linux-next. Which means it should not be rebased at
all.
-- Steve
^ permalink raw reply
* Re: [PATCH bpf v3 1/2] bpf: Reject sleepable kprobe_multi programs at attach time
From: patchwork-bot+netdevbpf @ 2026-04-02 16:50 UTC (permalink / raw)
To: Varun R Mallya
Cc: bpf, ast, daniel, memxor, yonghong.song, jolsa, rostedt, mhiramat,
linux-kernel, linux-trace-kernel
In-Reply-To: <20260401191126.440683-1-varunrmallya@gmail.com>
Hello:
This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Thu, 2 Apr 2026 00:41:25 +0530 you wrote:
> kprobe.multi programs run in atomic/RCU context and cannot sleep.
> However, bpf_kprobe_multi_link_attach() did not validate whether the
> program being attached had the sleepable flag set, allowing sleepable
> helpers such as bpf_copy_from_user() to be invoked from a non-sleepable
> context.
>
> This causes a "sleeping function called from invalid context" splat:
>
> [...]
Here is the summary with links:
- [bpf,v3,1/2] bpf: Reject sleepable kprobe_multi programs at attach time
https://git.kernel.org/bpf/bpf/c/eb7024bfcc5f
- [bpf,v3,2/2] selftests/bpf: Add test to ensure kprobe_multi is not sleepable
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: NULL pointer dereference when booting ppc64_guest_defconfig in QEMU on -next
From: Andrew Morton @ 2026-04-02 19:26 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ritesh Harjani (IBM), Harry Yoo (Oracle), linuxppc-dev, Harry Yoo,
Nathan Chancellor, Thomas Weißschuh, Michal Clapinski,
Thomas Gleixner, Steven Rostedt, Masami Hiramatsu, linux-mm,
linux-trace-kernel, linux-kernel, Srikar Dronamraju,
Madhavan Srinivasan
In-Reply-To: <6a5f2139-0ff8-40b3-88f9-98d2ea020d6f@efficios.com>
On Thu, 2 Apr 2026 11:30:53 -0400 Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> On 2026-03-20 22:21, Andrew Morton wrote:
> > On Sat, 21 Mar 2026 06:42:41 +0530 Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
> >
> >> Looks like this is causing regressions in linux-next with warnings
> >> similar to what Harry also pointed out. Do we have any solution for
> >> this, or are we planning to hold on to this patch[1] and maybe even
> >> remove it temporarily from linux-next, until this is fixed?
> >
> > Yes, I'll disable this patchset.
>
> Hi Andrew,
>
> I have prepared fixes for this issue. On which branch should I rebase
> them ? Do you still have the HPCC series in your branch or should I
> send it anew ?
Cool thanks.
It's best to do a full resend after -rc1 please, presumably against
mainline. Show reviewers the latest version, refresh memories, etc.
^ permalink raw reply
* [PATCH] kernel/trace: fixed static warnings
From: abhijithsriram95 @ 2026-04-02 19:54 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
open list:TRACING, open list:TRACING
Cc: Abhijith Sriram
From: Abhijith Sriram <abhijithsriram95@gmail.com>
The change in the function argument description
was due to the static code checker script reading
the word filter back to back
Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
---
kernel/trace/trace_events_trigger.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 655db2e82513..477d8dee3362 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
}
EXPORT_SYMBOL_GPL(event_triggers_post_call);
-#define SHOW_AVAILABLE_TRIGGERS (void *)(1UL)
+#define SHOW_AVAILABLE_TRIGGERS ((void *)(1UL))
static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
{
@@ -325,6 +325,7 @@ static const struct seq_operations event_triggers_seq_ops = {
static int event_trigger_regex_open(struct inode *inode, struct file *file)
{
int ret;
+ struct seq_file *m = NULL;
ret = security_locked_down(LOCKDOWN_TRACEFS);
if (ret)
@@ -351,7 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_READ) {
ret = seq_open(file, &event_triggers_seq_ops);
if (!ret) {
- struct seq_file *m = file->private_data;
+ *m = file->private_data;
m->private = file;
}
}
@@ -388,9 +389,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
+ char *buf __free(kfree) = NULL;
struct trace_event_file *event_file;
ssize_t ret;
- char *buf __free(kfree) = NULL;
if (!cnt)
return 0;
@@ -633,6 +634,7 @@ clear_event_triggers(struct trace_array *tr)
list_for_each_entry(file, &tr->events, list) {
struct event_trigger_data *data, *n;
+
list_for_each_entry_safe(data, n, &file->triggers, list) {
trace_event_trigger_enable_disable(file, 0);
list_del_rcu(&data->list);
@@ -785,7 +787,7 @@ static void unregister_trigger(char *glob,
* cmd - the trigger command name
* glob - the trigger command name optionally prefaced with '!'
* param_and_filter - text following cmd and ':'
- * param - text following cmd and ':' and stripped of filter
+ * param - text following cmd and ':' and filter removed
* filter - the optional filter text following (and including) 'if'
*
* To illustrate the use of these components, here are some concrete
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v2] bootconfig: Apply early options from embedded config
From: Masami Hiramatsu @ 2026-04-03 2:45 UTC (permalink / raw)
To: Breno Leitao
Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
linux-doc, oss, paulmck, rostedt, kernel-team, Kiryl Shutsemau
In-Reply-To: <ac0wz_eW5Zgi4t45@gmail.com>
On Wed, 1 Apr 2026 08:01:48 -0700
Breno Leitao <leitao@debian.org> wrote:
> On Wed, Apr 01, 2026 at 10:48:53PM +0900, Masami Hiramatsu wrote:
>
> > > The challenge extends beyond that. There are numerous early_parameter()
> > > definitions scattered throughout the kernel that may or may not be
> > > utilized by setup_arch().
> > >
> > > For example, consider `early_param("mitigations", ..)` in
> > > ./kernel/cpu.c. This modifies the cpu_mitigations global variable, which
> > > is referenced in various locations across different architectures.
> > >
> > > It's worth noting that we have over 300 early_parameter() instances in
> > > the kernel.
> > >
> > > Given this, analyzing all these early parameters and examining each one
> > > individually represents a substantial amount of work.
> >
> > Yes, that may require a substantial amount of work. But to improve
> > the kernel framework around the parameter handling, eventually we
> > need to examine each early parameter.
>
> I'm still uncertain about this approach. The goal is to identify and
> categorize the early parameters that are parsed prior to bootconfig
> initialization.
Yes, if we support early parameters in bootconfig, we need to clarify
which parameters are inherently unsupportable, and document it.
Currently it is easy to say that it does not support the parameter
defined with "early_param()". Similary, maybe we should introduce
"arch_param()" or something like it (or support all of them).
>
> Moreover, this work could become obsolete if bootconfig's initialization
> point shifts earlier or later in the boot sequence, necessitating
> another comprehensive analysis.
If we can init it before calling setup_arch(), yes, we don't need to
check it. So that is another option. Do you think it is feasible to
support all of them? (Of course, theologically we can do, but the
question is the use case and requirements.)
> Conversely, if we successfully move bootconfig initialization earlier
> by breaking the dependency of memblock (assuming this is feasible), the
> vast majority of early parameters would execute after bootconfig is
> configured, eliminating the need for this extensive categorization work.
OK, I agreed.
>
> Please, feel free to tell what approach might be better for the project.
>
> > > Are there alternative approaches? At this point, I'm leaning toward
> > > breaking bootconfig's dependency on memblock, allowing us to invoke it
> > > before setup_arch(). Is this the only practical solution available?!
> >
> > Basically, the memblock dependency comes from allocating copy of data.
> > Only for the embedded bootconfig, we can just pass copy memory block
> > to the xbc_init(). Something like;
> >
> > xbc_init() {
> > xbc_data = memblock_alloc();
> > memcpy(xbc_data, data);
> > __xbc_init(xbc_data);
> > }
> >
> > embedded_xbc_init() {
> > __xbc_init(embedded_bootconfig_data);
> > }
> >
> > Afterwards, we can pass mixture of embedded bootcofnigt and initrd
> > bootconfig data to parser again.
> >
> > (But in this case, we must be careful not to override the early
> > parameters that we have already applied.)
>
> Do you have any additional recommendations if I proceed with this
> approach?
OK,
First of all, even if we enable early parameter support in bootconfig,
this is only possible if bootconfig is embedded. In that case, we can
pass memory that has been pre-allocated at compile time to bootconfig
as a working area. However, this will consume a lot of memory, so it
needs to be selectable in Kconfig.
If you're going to embed this, as Kiryl pointed out[1], it might be better
to pass pre-normalized (or compiled) data and avoid using a parser.
Compilation itself is relatively easy if you utilize the tools/bootconfig.
(However, in this case, there doesn't seem to be much point in using
bootconfig in the first place because we also can use embed kernel
cmdline.)
[1] https://lore.kernel.org/all/acueCFv4neO7zQGI@thinkstation/
Can you clarify the main reason of requesting this feature and
examples?
Thank you,
>
> Thank you for your detailed responses and insights.
> --breno
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* [PATCH v2] kernel/trace: fixed static warnings
From: abhijithsriram95 @ 2026-04-03 7:11 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
open list:TRACING, open list:TRACING
Cc: Abhijith Sriram
From: Abhijith Sriram <abhijithsriram95@gmail.com>
The change in the function argument description
was due to the static code checker script reading
the word filter back to back
Changes in v2:
- corrected *m = file->private_data to m = file->private_data
Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
---
kernel/trace/trace_events_trigger.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 655db2e82513..e632a8b77153 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
}
EXPORT_SYMBOL_GPL(event_triggers_post_call);
-#define SHOW_AVAILABLE_TRIGGERS (void *)(1UL)
+#define SHOW_AVAILABLE_TRIGGERS ((void *)(1UL))
static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
{
@@ -325,6 +325,7 @@ static const struct seq_operations event_triggers_seq_ops = {
static int event_trigger_regex_open(struct inode *inode, struct file *file)
{
int ret;
+ struct seq_file *m = NULL;
ret = security_locked_down(LOCKDOWN_TRACEFS);
if (ret)
@@ -351,7 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_READ) {
ret = seq_open(file, &event_triggers_seq_ops);
if (!ret) {
- struct seq_file *m = file->private_data;
+ m = file->private_data;
m->private = file;
}
}
@@ -388,9 +389,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
+ char *buf __free(kfree) = NULL;
struct trace_event_file *event_file;
ssize_t ret;
- char *buf __free(kfree) = NULL;
if (!cnt)
return 0;
@@ -633,6 +634,7 @@ clear_event_triggers(struct trace_array *tr)
list_for_each_entry(file, &tr->events, list) {
struct event_trigger_data *data, *n;
+
list_for_each_entry_safe(data, n, &file->triggers, list) {
trace_event_trigger_enable_disable(file, 0);
list_del_rcu(&data->list);
@@ -785,7 +787,7 @@ static void unregister_trigger(char *glob,
* cmd - the trigger command name
* glob - the trigger command name optionally prefaced with '!'
* param_and_filter - text following cmd and ':'
- * param - text following cmd and ':' and stripped of filter
+ * param - text following cmd and ':' and filter removed
* filter - the optional filter text following (and including) 'if'
*
* To illustrate the use of these components, here are some concrete
--
2.43.0
^ permalink raw reply related
* Re: [PATCH] kernel/trace: fixed static warnings
From: Abhijith Sriram @ 2026-04-03 7:29 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
open list:TRACING, open list:TRACING
In-Reply-To: <20260402195405.21316-1-abhijithsriram95@gmail.com>
On Thu, Apr 2, 2026 at 9:55 PM <abhijithsriram95@gmail.com> wrote:
>
> From: Abhijith Sriram <abhijithsriram95@gmail.com>
>
> The change in the function argument description
> was due to the static code checker script reading
> the word filter back to back
>
> Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
>
I have corrected and made a new patch, please have a look here:
https://lore.kernel.org/linux-trace-kernel/20260403071108.23422-2-abhijithsriram95@gmail.com/
--
Regards
Abhijith Sriram
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Menglong Dong @ 2026-04-03 10:25 UTC (permalink / raw)
To: Yafang Shao
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <CALOAHbDnNba_w_nWH3-S9GAXw0+VKuLTh1gy5hy9Yqgeo4C0iA@mail.gmail.com>
On 2026/4/2 21:20 Yafang Shao <laoar.shao@gmail.com> write:
> On Thu, Apr 2, 2026 at 8:48 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> >
> > On 2026/4/2 17:26, Yafang Shao wrote:
> > > Introduce the ability for kprobes to override the return values of
> > > functions that have been livepatched. This functionality is guarded by the
> > > CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.
> >
> > Hi, Yafang. This is a interesting idea.
> >
[...]
>
> +/* noclone to avoid bond_get_slave_hook.constprop.0 */
> +__attribute__((__noclone__, __noinline__))
> +int bond_get_slave_hook(struct sk_buff *skb, u32 hash, unsigned int count)
> +{
> + return -1;
> +}
Hi, yafang.
I see what you mean now. So you want to allow BPF program override
the return of all the kernel functions in a KLP module.
I think the security problem is a big issue. Image that we have a KLP
in our environment. Any users can crash the kernel by hook a BPF
program on it with the calling of bpf_override_write().
What's more, this is a little weird for me. If we allow to use bpf_override_return()
for the kernel functions in a KLP, why not we allow it in a common kernel
module, as KLP is a kind of kernel module. Then, why not we allow to
use it for all the kernel functions?
Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
your example? Then we can override its return directly. This is a more
reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
anyone can modify the return of this function safely.
WDYT?
BTW, this is a BPF modification, so maybe we can use "bpf: xxx" for the title
of this patch. Then, the BPF maintainers can notice this patch ;)
Thanks!
Menglong Dong
>
> static struct slave *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
> struct sk_buff *skb,
> struct bond_up_slave *slaves)
> {
> struct slave *slave;
> unsigned int count;
> + int slave_idx;
> u32 hash;
>
> hash = bond_xmit_hash(bond, skb);
> @@ -5188,6 +5198,13 @@ static struct slave
> *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
> if (unlikely(!count))
> return NULL;
>
> + /* Try BPF hook first - returns slave index directly */
> + slave_idx = bond_get_slave_hook(skb, hash, count);
> + /* If BPF hook returned valid slave index, use it */
> + if (slave_idx >= 0 && slave_idx < count) {
> + slave = slaves->arr[slave_idx];
> + return slave;
> + }
> slave = slaves->arr[hash % count];
> return slave;
> }
>
> - The BPF program
>
> SEC("kprobe/bond_get_slave_hook")
> int BPF_KPROBE(slave_selector, struct sk_buff *skb, u32 hash, u32 count)
> {
> unsigned short net_hdr_off;
> unsigned char *head;
> struct iphdr iph;
> int *slave_idx;
> __u32 daddr;
>
> __u16 proto = BPF_CORE_READ(skb, protocol);
> if (proto != bpf_htons(0x0800))
> return 0;
>
> head = BPF_CORE_READ(skb, head);
> net_hdr_off = BPF_CORE_READ(skb, network_header);
>
> if (bpf_probe_read_kernel(&iph, sizeof(iph), head + net_hdr_off) != 0)
> return 0;
>
> daddr = iph.daddr;
> slave_idx = bpf_map_lookup_elem(&ip_slave_map, &daddr);
> if (slave_idx) {
> int idx = *slave_idx;
>
> if (idx >= 0 && idx < (int)count)
> bpf_override_return(ctx, idx);
> }
> return 0;
> }
>
> >
> > BTW, if we allow the usage of bpf_override_return() on the KLP patched
> > function, we should allow the usage of BPF_MODIFY_RETURN on this
> > case too, right?
>
> It's a possibility, but I haven't tested that specifically yet.
>
> --
> Regards
> Yafang
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Steven Rostedt @ 2026-04-03 11:30 UTC (permalink / raw)
To: Menglong Dong
Cc: Yafang Shao, jpoimboe, jikos, mbenes, pmladek, joe.lawrence,
mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa,
ast, daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <3036842.e9J7NaK4W3@7940hx>
On Fri, 03 Apr 2026 18:25:59 +0800
Menglong Dong <menglong.dong@linux.dev> wrote:
> I think the security problem is a big issue. Image that we have a KLP
> in our environment. Any users can crash the kernel by hook a BPF
> program on it with the calling of bpf_override_write().
Right, livepatching may allow for rapid experimentation but that is not its
purpose. It is for fixing production systems without having to reboot.
Using BPF to change the return of a function is a huge security issue.
>
> What's more, this is a little weird for me. If we allow to use bpf_override_return()
> for the kernel functions in a KLP, why not we allow it in a common kernel
> module, as KLP is a kind of kernel module. Then, why not we allow to
> use it for all the kernel functions?
Right.
>
> Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
> your example? Then we can override its return directly. This is a more
> reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
> anyone can modify the return of this function safely.
If this were to go in, I say it would require both a kernel config, with
a big warning about this being a security hole, and a kernel command line
option to enable it, so that people don't accidentally have it enabled in
their config.
The command line should be something like:
allow_bpf_to_rootkit_functions
-- Steve
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Yafang Shao @ 2026-04-03 13:26 UTC (permalink / raw)
To: Menglong Dong
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <3036842.e9J7NaK4W3@7940hx>
On Fri, Apr 3, 2026 at 6:26 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2026/4/2 21:20 Yafang Shao <laoar.shao@gmail.com> write:
> > On Thu, Apr 2, 2026 at 8:48 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> > >
> > > On 2026/4/2 17:26, Yafang Shao wrote:
> > > > Introduce the ability for kprobes to override the return values of
> > > > functions that have been livepatched. This functionality is guarded by the
> > > > CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.
> > >
> > > Hi, Yafang. This is a interesting idea.
> > >
> [...]
> >
> > +/* noclone to avoid bond_get_slave_hook.constprop.0 */
> > +__attribute__((__noclone__, __noinline__))
> > +int bond_get_slave_hook(struct sk_buff *skb, u32 hash, unsigned int count)
> > +{
> > + return -1;
> > +}
>
> Hi, yafang.
>
> I see what you mean now. So you want to allow BPF program override
> the return of all the kernel functions in a KLP module.
>
> I think the security problem is a big issue. Image that we have a KLP
> in our environment. Any users can crash the kernel by hook a BPF
> program on it with the calling of bpf_override_write().
This feature is guarded by the CONFIG_KPROBE_OVERRIDE_KLP_FUNC
configuration option, which is disabled by default. Consequently, the
user must explicitly enable this option to utilize the feature.
>
> What's more, this is a little weird for me. If we allow to use bpf_override_return()
> for the kernel functions in a KLP, why not we allow it in a common kernel
> module, as KLP is a kind of kernel module. Then, why not we allow to
> use it for all the kernel functions?
By leveraging KLP, we can rapidly deploy new features without
interrupting production workloads. Accordingly, this feature is
specifically targeted at KLP-patched functions to maintain that
seamless delivery model.
>
> Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
> your example? Then we can override its return directly. This is a more
> reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
> anyone can modify the return of this function safely.
It is unfortunate that ALLOW_ERROR_INJECTION() is incompatible with
KLP-patched functions, as this limits our ability to perform fault
injection on livepatched code
>
> WDYT?
>
> BTW, this is a BPF modification, so maybe we can use "bpf: xxx" for the title
> of this patch. Then, the BPF maintainers can notice this patch ;)
I agree with the suggestion. I will update the subject prefix to
"trace, bpf:" in the next version.
--
Regards
Yafang
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Yafang Shao @ 2026-04-03 13:30 UTC (permalink / raw)
To: Steven Rostedt
Cc: Menglong Dong, jpoimboe, jikos, mbenes, pmladek, joe.lawrence,
mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa,
ast, daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260403073055.031275d9@gandalf.local.home>
On Fri, Apr 3, 2026 at 7:29 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 03 Apr 2026 18:25:59 +0800
> Menglong Dong <menglong.dong@linux.dev> wrote:
>
> > I think the security problem is a big issue. Image that we have a KLP
> > in our environment. Any users can crash the kernel by hook a BPF
> > program on it with the calling of bpf_override_write().
>
> Right, livepatching may allow for rapid experimentation but that is not its
> purpose. It is for fixing production systems without having to reboot.
> Using BPF to change the return of a function is a huge security issue.
>
> >
> > What's more, this is a little weird for me. If we allow to use bpf_override_return()
> > for the kernel functions in a KLP, why not we allow it in a common kernel
> > module, as KLP is a kind of kernel module. Then, why not we allow to
> > use it for all the kernel functions?
>
> Right.
>
> >
> > Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
> > your example? Then we can override its return directly. This is a more
> > reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
> > anyone can modify the return of this function safely.
>
> If this were to go in, I say it would require both a kernel config, with
> a big warning about this being a security hole, and a kernel command line
> option to enable it, so that people don't accidentally have it enabled in
> their config.
>
> The command line should be something like:
>
> allow_bpf_to_rootkit_functions
The feature is currently gated by CONFIG_KPROBE_OVERRIDE_KLP_FUNC. In
the next revision, I will rename this to
CONFIG_ALLOW_BPF_TO_ROOTKIT_FUNCS and introduce a corresponding kernel
command-line parameter, allow_bpf_to_rootkit_functions, to control
it.
--
Regards
Yafang
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Alexei Starovoitov @ 2026-04-03 14:26 UTC (permalink / raw)
To: Yafang Shao
Cc: Steven Rostedt, Menglong Dong, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Masami Hiramatsu,
Mathieu Desnoyers, KP Singh, Matt Bobrowski, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard, Kumar Kartikeya Dwivedi, Yonghong Song,
live-patching, LKML, linux-trace-kernel, bpf
In-Reply-To: <CALOAHbBBf_vWcwZp9kdXhpFOq_oG87X-7Nj2yurZ6LgBpDHwwQ@mail.gmail.com>
On Fri, Apr 3, 2026 at 6:31 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> >
> > If this were to go in, I say it would require both a kernel config, with
> > a big warning about this being a security hole, and a kernel command line
> > option to enable it, so that people don't accidentally have it enabled in
> > their config.
> >
> > The command line should be something like:
> >
> > allow_bpf_to_rootkit_functions
>
> The feature is currently gated by CONFIG_KPROBE_OVERRIDE_KLP_FUNC. In
> the next revision, I will rename this to
> CONFIG_ALLOW_BPF_TO_ROOTKIT_FUNCS and introduce a corresponding kernel
> command-line parameter, allow_bpf_to_rootkit_functions, to control
> it.
No. Even with extra config this is not ok.
^ permalink raw reply
* Re: [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2
From: Ackerley Tng @ 2026-04-03 14:50 UTC (permalink / raw)
To: Michael Roth
Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
ira.weiny, jmattson, jthoughton, oupton, pankaj.gupta, qperret,
rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
aneesh.kumar, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
Baoquan He, Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu,
Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest, linux-mm
In-Reply-To: <CAEvNRgGm9icDK8sK5ZfqHEOEqSbvjwtihE4p9d3vpBq-NfVjmw@mail.gmail.com>
Ackerley Tng <ackerleytng@google.com> writes:
>
> [...snip...]
>
> guest_memfd's populate will first check that the memory is shared, then
> also set the memory to private after the populate.
>
> [...snip...]
>
Looking at this again, the above basically means that the entire
conversion process needs to be performed within populate.
In addition to setting the attributes in guest_memfd as private, for
consistency, populate will also have to do all the associated
operations, especially unmapping from the host, checking refcounts,
and the list of work in conversion will only increase in future with
direct map removal/restoration and huge page merging.
The complexity of conversion also means possible errors (EAGAIN for
elevated refcounts and ENOMEM when we're out of memory), and error
information like the offset where the elevated refcount was.
It doesn't look like there's room for this information to be plumbed out
through the platform-specific ioctls, and even if we make space, it
seems odd to have conversion-related error information returned through
the platform-specific call.
I agree with the goal of not having KVM touch private memory contents as
tracked by guest_memfd, so I'd like to propose that we distinguish:
1. private as tracked by KVM (guest_memfd/vm_memory_attributes)
2. private as tracked by trusted entity
Currently, in TDX's populate flow, KVM doesn't do any copying, it only
instructs TDX to do the copying.
"TDH.MEM.PAGE.ADD Operands Information Definition" in the TDX module
ABI spec says the source is accessed as shared, the destination is
accessed as private, and in the case that source and destination are
the same, the "In-Place Add" section says the source will be converted
to private.
In SNP's populate flow, SNP-specific code in KVM does the copying from
shared memory into the destination, then makes the destination address
private in the RMP table before telling the firmware to do the
in-place encryption.
I think we should think of the populate ioctls as being
platform-specific ioctls that, when called, accept
+ destination address: private (as tracked by guest_memfd)
+ source address: shared (as tracked by guest_memfd) or NULL
KVM doesn't touch private memory contents, even in this case, because
it's really a platform-specific ioctl that handles loading, and the
platform does expect the destination is private for both TDX and SNP
at the firmware boundary.
Since SNP (platform-specific) only allows in-place launch update, and
KVM had to provide an interface that allows a different source address
for support before in-place conversion, then SNP has to continue
supporting the to-be-deprecated path by doing the copying within
platform-specific code.
For consistency, guest_memfd can continue to check that it tracks the
destination address as private, and sev_gmem_populate will then hide
the copying code away just to support the legacy case.
The flow before in-place conversion is
1. Load memory (shared or non-guest_memfd memory)
2. KVM_SEV_SNP_LAUNCH_UPDATE or KVM_TDX_INIT_MEM_REGION (destination:
gfn for separate private memory, source: shared memory)
The proposed flow for in-place conversion is
1. INIT_SHARED or convert to shared
2. Load memory while it is shared
3. Convert to private (PRESERVE, or new flag?)
4. KVM_SEV_SNP_LAUNCH_UPDATE or KVM_TDX_INIT_MEM_REGION (destination:
gfn for converted private memory, source: NULL)
TLDR:
+ Think of populate ioctls not as KVM touching memory, but platform
handling population.
+ KVM code (kvm_gmem_populate) still doesn't touch memory contents
+ post_populate is platform-specific code that handles loading into
private destination memory just to support legacy non-in-place
conversion.
+ Don't complicate populate ioctls by doing conversion just to support
legacy use-cases where platform-specific code has to do copying on
the host.
>>>
>>> [...snip...]
>>>
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Yafang Shao @ 2026-04-03 16:00 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Steven Rostedt, Menglong Dong, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Masami Hiramatsu,
Mathieu Desnoyers, KP Singh, Matt Bobrowski, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard, Kumar Kartikeya Dwivedi, Yonghong Song,
live-patching, LKML, linux-trace-kernel, bpf
In-Reply-To: <CAADnVQJobMwH8Q0LtCrwbD2TkzFaLfRaw9gyVnhDpMWGmLFVvg@mail.gmail.com>
On Fri, Apr 3, 2026 at 10:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Apr 3, 2026 at 6:31 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > >
> > > If this were to go in, I say it would require both a kernel config, with
> > > a big warning about this being a security hole, and a kernel command line
> > > option to enable it, so that people don't accidentally have it enabled in
> > > their config.
> > >
> > > The command line should be something like:
> > >
> > > allow_bpf_to_rootkit_functions
> >
> > The feature is currently gated by CONFIG_KPROBE_OVERRIDE_KLP_FUNC. In
> > the next revision, I will rename this to
> > CONFIG_ALLOW_BPF_TO_ROOTKIT_FUNCS and introduce a corresponding kernel
> > command-line parameter, allow_bpf_to_rootkit_functions, to control
> > it.
>
> No. Even with extra config this is not ok.
I will send patch #3 and #4 as a standalone patchset to upstream the
hybrid livepatch first and then figure out how to move
allow_bpf_to_rootkit_functions forward.
--
Regards
Yafang
^ permalink raw reply
* Re: [RFC PATCH 0/4] trace, livepatch: Allow kprobe return overriding for livepatched functions
From: Song Liu @ 2026-04-03 16:06 UTC (permalink / raw)
To: Yafang Shao
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, jolsa, ast, daniel,
andrii, martin.lau, eddyz87, memxor, yonghong.song, live-patching,
linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260402092607.96430-1-laoar.shao@gmail.com>
Hi Yafang,
On Thu, Apr 2, 2026 at 2:26 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> Livepatching allows for rapid experimentation with new kernel features
> without interrupting production workloads. However, static livepatches lack
> the flexibility required to tune features based on task-specific attributes,
> such as cgroup membership, which is critical in multi-tenant k8s
> environments. Furthermore, hardcoding logic into a livepatch prevents
> dynamic adjustments based on the runtime environment.
>
> To address this, we propose a hybrid approach using BPF. Our production use
> case involves:
>
> 1. Deploying a Livepatch function to serve as a stable BPF hook.
>
> 2. Utilizing bpf_override_return() to dynamically modify the return value
> of that hook based on the current task's context.
Could you please provide a specific use case that can benefit from this?
AFAICT, livepatch is more flexible but risky (may cause crash); while
BPF is safe, but less flexible. The combination you are proposing seems
to get the worse of the two sides. Maybe it can indeed get the benefit of
both sides in some cases, but I cannot think of such examples.
Thanks,
Song
^ permalink raw reply
* Re: [RFC PATCH 3/4] livepatch: Add "replaceable" attribute to klp_patch
From: Song Liu @ 2026-04-03 16:19 UTC (permalink / raw)
To: Yafang Shao
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, jolsa, ast, daniel,
andrii, martin.lau, eddyz87, memxor, yonghong.song, live-patching,
linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260402092607.96430-4-laoar.shao@gmail.com>
On Thu, Apr 2, 2026 at 2:26 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> Add a new replaceable attribute to allow the coexistence of both
> atomic-replace and non-atomic-replace livepatches. If replaceable is set to
> 0, the livepatch will not be replaced by a subsequent atomic-replace
> operation.
>
> This is a preparatory patch for following changes.
IIRC, the use case for this change is when multiple users load various
livepatch modules on the same system. I still don't believe this is the
right way to manage livepatches. That said, I won't really NACK this
if other folks think this is a useful option.
In case we really want a feature like this, shall we add a replaceable
flag to each function (klp_func)? This will give us fine granularity
control. For example, user A has a non-replaceable livepatch on
func_a; later user B wants to patch another version of func_a. Then
some logic can detect such a conflict and reject the load of user B's
livepatch. Does this make sense?
Thanks,
Song
^ permalink raw reply
* Re: [RFC PATCH 3/4] livepatch: Add "replaceable" attribute to klp_patch
From: Dylan Hatch @ 2026-04-03 20:55 UTC (permalink / raw)
To: Song Liu
Cc: Yafang Shao, jpoimboe, jikos, mbenes, pmladek, joe.lawrence,
rostedt, mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski,
jolsa, ast, daniel, andrii, martin.lau, eddyz87, memxor,
yonghong.song, live-patching, linux-kernel, linux-trace-kernel,
bpf
In-Reply-To: <CAPhsuW51Hh7XfN6xXm_uMAoDXBBQoNQ5ynqom+wVNdqCt81f2A@mail.gmail.com>
On Fri, Apr 3, 2026 at 9:19 AM Song Liu <song@kernel.org> wrote:
>
> On Thu, Apr 2, 2026 at 2:26 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > Add a new replaceable attribute to allow the coexistence of both
> > atomic-replace and non-atomic-replace livepatches. If replaceable is set to
> > 0, the livepatch will not be replaced by a subsequent atomic-replace
> > operation.
> >
> > This is a preparatory patch for following changes.
>
> IIRC, the use case for this change is when multiple users load various
> livepatch modules on the same system. I still don't believe this is the
> right way to manage livepatches. That said, I won't really NACK this
> if other folks think this is a useful option.
In our production fleet, we apply exactly one cumulative livepatch
module, and we use per-kernel build "livepatch release" branches to
track the contents of these cumulative livepatches. This model has
worked relatively well for us, but there are some painpoints.
We are often under pressure to selectively deploy a livepatch fix to
certain subpopulations of production. If the subpopulation is running
the same build of everything else, this would require us to introduce
another branching factor to the "livepatch release" branches --
something we do not support due to the added toil and complexity.
However, if we had the ability to build "off-band" livepatch modules
that were marked as non-replaceable, we could support these selective
patches without the additional branching factor. I will have to
circulate the idea internally, but to me this seems like a very useful
option to have in certain cases.
>
> In case we really want a feature like this, shall we add a replaceable
> flag to each function (klp_func)? This will give us fine granularity
> control. For example, user A has a non-replaceable livepatch on
> func_a; later user B wants to patch another version of func_a. Then
> some logic can detect such a conflict and reject the load of user B's
> livepatch. Does this make sense?
I agree with this. The ability to reject livepatches that try to
change functions that have already been patched seems like an
important safeguard.
Thanks,
Dylan
^ permalink raw reply
* Re: [RFC PATCH 3/4] livepatch: Add "replaceable" attribute to klp_patch
From: Song Liu @ 2026-04-03 21:35 UTC (permalink / raw)
To: Dylan Hatch
Cc: Yafang Shao, jpoimboe, jikos, mbenes, pmladek, joe.lawrence,
rostedt, mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski,
jolsa, ast, daniel, andrii, martin.lau, eddyz87, memxor,
yonghong.song, live-patching, linux-kernel, linux-trace-kernel,
bpf
In-Reply-To: <CADBMgpy3e25EH5xbKMN5GeOK47jE6uzviknbt35V49_Y7zFj8A@mail.gmail.com>
On Fri, Apr 3, 2026 at 1:55 PM Dylan Hatch <dylanbhatch@google.com> wrote:
[...]
> > IIRC, the use case for this change is when multiple users load various
> > livepatch modules on the same system. I still don't believe this is the
> > right way to manage livepatches. That said, I won't really NACK this
> > if other folks think this is a useful option.
>
> In our production fleet, we apply exactly one cumulative livepatch
> module, and we use per-kernel build "livepatch release" branches to
> track the contents of these cumulative livepatches. This model has
> worked relatively well for us, but there are some painpoints.
>
> We are often under pressure to selectively deploy a livepatch fix to
> certain subpopulations of production. If the subpopulation is running
> the same build of everything else, this would require us to introduce
> another branching factor to the "livepatch release" branches --
> something we do not support due to the added toil and complexity.
>
> However, if we had the ability to build "off-band" livepatch modules
> that were marked as non-replaceable, we could support these selective
> patches without the additional branching factor. I will have to
> circulate the idea internally, but to me this seems like a very useful
> option to have in certain cases.
IIUC, the plan is:
- The regular livepatches are cumulative, have the replace flag; and
are replaceable.
- The occasional "off-band" livepatches do not have the replace flag,
and are not replaceable.
With this setup, for systems with off-band livepatches loaded, we can
still release a cumulative livepatch to replace the previous cumulative
livepatch. Is this the expected use case?
I think there are a few issues with this:
1. The "off-band" livepatches cannot be replaced atomically. To upgrade
"off-band' livepatches, we will have to unload the old version and load
the new version later.
2. Any conflict with the off-band livepatches and regular livepatches will
be difficult to manage. IOW, we kind removed the benefit of cumulative
livepatches. For example, what shall we do if we really need two fixes
to the same kernel functions: one from the original branch, the other
from the off-band branch?
Thanks,
Song
^ permalink raw reply
* Re: [PATCH] kernel/trace: fixed static warnings
From: Masami Hiramatsu @ 2026-04-04 0:18 UTC (permalink / raw)
To: abhijithsriram95
Cc: Steven Rostedt, Mathieu Desnoyers, open list:TRACING,
open list:TRACING
In-Reply-To: <20260402195405.21316-1-abhijithsriram95@gmail.com>
On Thu, 2 Apr 2026 21:54:04 +0200
abhijithsriram95@gmail.com wrote:
> From: Abhijith Sriram <abhijithsriram95@gmail.com>
>
> The change in the function argument description
> was due to the static code checker script reading
> the word filter back to back
>
> Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
> ---
> kernel/trace/trace_events_trigger.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
> index 655db2e82513..477d8dee3362 100644
> --- a/kernel/trace/trace_events_trigger.c
> +++ b/kernel/trace/trace_events_trigger.c
> @@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
> }
> EXPORT_SYMBOL_GPL(event_triggers_post_call);
>
> -#define SHOW_AVAILABLE_TRIGGERS (void *)(1UL)
> +#define SHOW_AVAILABLE_TRIGGERS ((void *)(1UL))
This is OK.
>
> static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
> {
> @@ -325,6 +325,7 @@ static const struct seq_operations event_triggers_seq_ops = {
> static int event_trigger_regex_open(struct inode *inode, struct file *file)
> {
> int ret;
> + struct seq_file *m = NULL;
>
> ret = security_locked_down(LOCKDOWN_TRACEFS);
> if (ret)
> @@ -351,7 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
> if (file->f_mode & FMODE_READ) {
> ret = seq_open(file, &event_triggers_seq_ops);
> if (!ret) {
> - struct seq_file *m = file->private_data;
> + *m = file->private_data;
Why is this change required?
> m->private = file;
> }
> }
> @@ -388,9 +389,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
> const char __user *ubuf,
> size_t cnt, loff_t *ppos)
> {
> + char *buf __free(kfree) = NULL;
> struct trace_event_file *event_file;
> ssize_t ret;
> - char *buf __free(kfree) = NULL;
What is this change?
Thanks,
>
> if (!cnt)
> return 0;
> @@ -633,6 +634,7 @@ clear_event_triggers(struct trace_array *tr)
>
> list_for_each_entry(file, &tr->events, list) {
> struct event_trigger_data *data, *n;
> +
> list_for_each_entry_safe(data, n, &file->triggers, list) {
> trace_event_trigger_enable_disable(file, 0);
> list_del_rcu(&data->list);
> @@ -785,7 +787,7 @@ static void unregister_trigger(char *glob,
> * cmd - the trigger command name
> * glob - the trigger command name optionally prefaced with '!'
> * param_and_filter - text following cmd and ':'
> - * param - text following cmd and ':' and stripped of filter
> + * param - text following cmd and ':' and filter removed
> * filter - the optional filter text following (and including) 'if'
> *
> * To illustrate the use of these components, here are some concrete
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v6 4/4] selftests/ftrace: Add accept cases for fprobe list syntax
From: Masami Hiramatsu @ 2026-04-04 0:25 UTC (permalink / raw)
To: Ryan Chung
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <CAB1jyqw_6wepbDaKi7087GDUcJ9t1jQO6qP9pa0DWCjti7ABZg@mail.gmail.com>
On Thu, 2 Apr 2026 11:45:42 -0400
Ryan Chung <seokwoo.chung130@gmail.com> wrote:
> Hi Masami,
>
> Thank you for your feedback. Unfortunately, I am not in the position
> to continue working on this patch series for the foreseeable future.
> If you or anyone else on the list would like to pick it up and carry
> it forward, you are welcome to do so. I appreciate your time and
> effort on this.
I see, that's unfortunate, but I understand. I'll continue to fix
and post updates for this patch series.
I appreciate you starting this series.
Thank you.
>
> Best regards,
> Seokwoo Chung
>
> On Tue, 24 Mar 2026 at 00:12, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 5 Feb 2026 08:58:42 -0500
> > "Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
> >
> > > Add fprobe_list.tc to test the comma-separated symbol list syntax
> > > with :entry/:exit suffixes. Three scenarios are covered:
> > >
> > > 1. List with default (entry) behavior and ! exclusion
> > > 2. List with explicit :entry suffix
> > > 3. List with :exit suffix for return probes
> >
> >
> > Could you also add wildcard pattern test?
> >
> > >
> > > Each test verifies that the correct functions appear in
> > > enabled_functions and that excluded (!) symbols are absent.
> > >
> > > Note: The existing tests add_remove_fprobe.tc, fprobe_syntax_errors.tc,
> > > and add_remove_fprobe_repeat.tc check their "requires" line against the
> > > tracefs README for the old "%return" syntax pattern. Since the README
> > > now documents ":entry|:exit" instead, these tests report UNSUPPORTED.
> > > Their "requires" lines need updating in a follow-up patch.
> >
> > This means you'll break the selftest. please fix those test first.
> > (This fix must be done before "tracing/fprobe: Support comma-separated
> > symbols and :entry/:exit" so that we can safely bisect it.)
> >
> > Thank you,
> >
> >
> > >
> > > Signed-off-by: Seokwoo Chung (Ryan) <seokwoo.chung130@gmail.com>
> > > ---
> > > .../ftrace/test.d/dynevent/fprobe_list.tc | 92 +++++++++++++++++++
> > > 1 file changed, 92 insertions(+)
> > > create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> > >
> > > diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> > > new file mode 100644
> > > index 000000000000..45e57c6f487d
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> > > @@ -0,0 +1,92 @@
> > > +#!/bin/sh
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# description: Fprobe event list syntax and :entry/:exit suffixes
> > > +# requires: dynamic_events "f[:[<group>/][<event>]] <func-name>[:entry|:exit] [<args>]":README
> > > +
> > > +# Setup symbols to test. These are common kernel functions.
> > > +PLACE=vfs_read
> > > +PLACE2=vfs_write
> > > +PLACE3=vfs_open
> > > +
> > > +echo 0 > events/enable
> > > +echo > dynamic_events
> > > +
> > > +# Get baseline count of enabled functions (should be 0 if clean, but be safe)
> > > +if [ -f enabled_functions ]; then
> > > + ocnt=`cat enabled_functions | wc -l`
> > > +else
> > > + ocnt=0
> > > +fi
> > > +
> > > +# Test 1: List default (entry) with exclusion
> > > +# Target: Trace vfs_read and vfs_open, but EXCLUDE vfs_write
> > > +echo "f:test/list_entry $PLACE,!$PLACE2,$PLACE3" >> dynamic_events
> > > +grep -q "test/list_entry" dynamic_events
> > > +test -d events/test/list_entry
> > > +
> > > +echo 1 > events/test/list_entry/enable
> > > +
> > > +grep -q "$PLACE" enabled_functions
> > > +grep -q "$PLACE3" enabled_functions
> > > +! grep -q "$PLACE2" enabled_functions
> > > +
> > > +# Check count (Baseline + 2 new functions)
> > > +cnt=`cat enabled_functions | wc -l`
> > > +if [ $cnt -ne $((ocnt + 2)) ]; then
> > > + exit_fail
> > > +fi
> > > +
> > > +# Cleanup Test 1
> > > +echo 0 > events/test/list_entry/enable
> > > +echo "-:test/list_entry" >> dynamic_events
> > > +! grep -q "test/list_entry" dynamic_events
> > > +
> > > +# Count should return to baseline
> > > +cnt=`cat enabled_functions | wc -l`
> > > +if [ $cnt -ne $ocnt ]; then
> > > + exit_fail
> > > +fi
> > > +
> > > +# Test 2: List with explicit :entry suffix
> > > +# (Should behave exactly like Test 1)
> > > +echo "f:test/list_entry_exp $PLACE,!$PLACE2,$PLACE3:entry" >> dynamic_events
> > > +grep -q "test/list_entry_exp" dynamic_events
> > > +test -d events/test/list_entry_exp
> > > +
> > > +echo 1 > events/test/list_entry_exp/enable
> > > +
> > > +grep -q "$PLACE" enabled_functions
> > > +grep -q "$PLACE3" enabled_functions
> > > +! grep -q "$PLACE2" enabled_functions
> > > +
> > > +cnt=`cat enabled_functions | wc -l`
> > > +if [ $cnt -ne $((ocnt + 2)) ]; then
> > > + exit_fail
> > > +fi
> > > +
> > > +# Cleanup Test 2
> > > +echo 0 > events/test/list_entry_exp/enable
> > > +echo "-:test/list_entry_exp" >> dynamic_events
> > > +
> > > +# Test 3: List with :exit suffix
> > > +echo "f:test/list_exit $PLACE,!$PLACE2,$PLACE3:exit" >> dynamic_events
> > > +grep -q "test/list_exit" dynamic_events
> > > +test -d events/test/list_exit
> > > +
> > > +echo 1 > events/test/list_exit/enable
> > > +
> > > +# Even for return probes, enabled_functions lists the attached symbols
> > > +grep -q "$PLACE" enabled_functions
> > > +grep -q "$PLACE3" enabled_functions
> > > +! grep -q "$PLACE2" enabled_functions
> > > +
> > > +cnt=`cat enabled_functions | wc -l`
> > > +if [ $cnt -ne $((ocnt + 2)) ]; then
> > > + exit_fail
> > > +fi
> > > +
> > > +# Cleanup Test 3
> > > +echo 0 > events/test/list_exit/enable
> > > +echo "-:test/list_exit" >> dynamic_events
> > > +
> > > +clear_trace
> > > --
> > > 2.43.0
> > >
> >
> >
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v15 4/5] ring-buffer: Add persistent ring buffer invalid-page inject test
From: Masami Hiramatsu @ 2026-04-04 0:32 UTC (permalink / raw)
To: Steven Rostedt
Cc: Catalin Marinas, Will Deacon, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, Ian Rogers, linux-arm-kernel
In-Reply-To: <20260401184055.2f932674@gandalf.local.home>
On Wed, 1 Apr 2026 18:40:55 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I replied with mostly grammar fixes and some rewrites of text.
Thanks for reviewing! Let me update it.
>
> On Tue, 31 Mar 2026 17:36:30 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
>
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Add a self-destractive test for the persistent ring buffer.
>
> "self-destractive"? Do you mean "self-destructive"?
>
> Probably better to call it a "self-corrupting test".
>
> >
> > This will inject erroneous value to some sub-buffer pages (where
>
> inject an erroneous
>
> > the index is even or multiples of 5) in the persistent ring buffer
> > when kernel gets panic, and check whether the number of detected
>
> when the kernel panics, and checks whether
>
> > invalid pages and the total entry_bytes are the same as recorded
>
> same as the recorded
>
> > values after reboot.
> >
> > This can ensure the kernel correctly recover partially corrupted
>
> This ensures that the kernel can correctly recover a partially corrupted
>
> > persistent ring buffer when boot.
>
> after a reboot or panic.
>
> >
> > The test only runs on the persistent ring buffer whose name is
> > "ptracingtest". And user has to fill it up with events before
>
> The user has to fill it with events before a kernel panic.
>
> > kernel panics.
> >
> > To run the test, enable CONFIG_RING_BUFFER_PERSISTENT_INJECT
> > and you have to setup the kernel cmdline;
>
> and add the following kernel cmdline:
>
> Note, it's more proper to use a colon (:) than a semi-colon (;) when
> referencing an example on the next line.
OK,
>
> >
> > reserve_mem=20M:2M:trace trace_instance=ptracingtest^traceoff@trace
> > panic=1
> >
> > And run following commands after the 1st boot;
>
> Run the following commands after the 1st boot:
>
> >
> > cd /sys/kernel/tracing/instances/ptracingtest
> > echo 1 > tracing_on
> > echo 1 > events/enable
> > sleep 3
> > echo c > /proc/sysrq-trigger
> >
> > After panic message, the kernel will reboot and run the verification
> > on the persistent ring buffer, e.g.
> >
> > Ring buffer meta [2] invalid buffer page detected
> > Ring buffer meta [2] is from previous boot! (318 pages discarded)
> > Ring buffer testing [2] invalid pages: PASSED (318/318)
> > Ring buffer testing [2] entry_bytes: PASSED (1300476/1300476)
> >
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> > Changes in v15:
> > - Use pr_warn() for test result.
> > - Inject errors on the page index is multiples of 5 so that
> > this can reproduce contiguous empty pages.
> > Changes in v14:
> > - Rename config to CONFIG_RING_BUFFER_PERSISTENT_INJECT.
> > - Clear meta->nr_invalid/entry_bytes after testing.
> > - Add test commands in config comment.
> > Changes in v10:
> > - Add entry_bytes test.
> > - Do not compile test code if CONFIG_RING_BUFFER_PERSISTENT_SELFTEST=n.
> > Changes in v9:
> > - Test also reader pages.
> > ---
> > include/linux/ring_buffer.h | 1 +
> > kernel/trace/Kconfig | 31 ++++++++++++++++++
> > kernel/trace/ring_buffer.c | 74 +++++++++++++++++++++++++++++++++++++++++++
> > kernel/trace/trace.c | 4 ++
> > 4 files changed, 110 insertions(+)
> >
> > diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
> > index 994f52b34344..0670742b2d60 100644
> > --- a/include/linux/ring_buffer.h
> > +++ b/include/linux/ring_buffer.h
> > @@ -238,6 +238,7 @@ int ring_buffer_subbuf_size_get(struct trace_buffer *buffer);
> >
> > enum ring_buffer_flags {
> > RB_FL_OVERWRITE = 1 << 0,
> > + RB_FL_TESTING = 1 << 1,
> > };
> >
> > #ifdef CONFIG_RING_BUFFER
> > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > index e130da35808f..07305ed6d745 100644
> > --- a/kernel/trace/Kconfig
> > +++ b/kernel/trace/Kconfig
> > @@ -1202,6 +1202,37 @@ config RING_BUFFER_VALIDATE_TIME_DELTAS
> > Only say Y if you understand what this does, and you
> > still want it enabled. Otherwise say N
> >
> > +config RING_BUFFER_PERSISTENT_INJECT
> > + bool "Enable persistent ring buffer error injection test"
> > + depends on RING_BUFFER
> > + help
> > + Run a selftest on the persistent ring buffer which names
> > + "ptracingtest" (and its backup) when panic_on_reboot by
>
> Does this do anything with the backup?
This meant that if you make a backup of ptracingtest
(instance=backup=ptracingtest) the check runs on backup instance too.
But it may be redundant. I'll drop it.
>
> > + invalidating ring buffer pages.
>
> This option will have the kernel check if the persistent ring
> buffer is named "ptracingtest", and if so, it will corrupt some
> of its pages on a kernel panic. This is used to test if the
> persistent ring buffer can recover from some of its sub-buffers
> being corrupted.
>
> [space]
>
> > + To use this, boot kernel with "ptracingtest" persistent
>
> , boot a kernel with a "ptracingtest" persistent
>
> > + ring buffer, e.g.
> > +
> > + reserve_mem=20M:2M:trace trace_instance=ptracingtest@trace panic=1
> > +
> > + And after the 1st boot, run test command, like;
>
> , run the following commands:
>
> > +
> > + cd /sys/kernel/tracing/instances/ptracingtest
> > + echo 1 > events/enable
> > + echo 1 > tracing_on
> > + sleep 3
> > + echo c > /proc/sysrq-trigger
> > +
> > + After panic message, the kernel reboots and show test results
> > + on the boot log.
>
> After the panic message, the kernel will reboot and will show the
> test results in the console output.
>
> > +
> > + Note that user has to enable events on the persistent ring
> > + buffer manually to fill up ring buffers before rebooting.
>
> Note that events for the ring buffer needs to be enabled prior to
> crashing the kernel so that the ring buffer has content that the
> test will corrupt.
>
> > + Since this invalidates the data on test target ring buffer,
> > + "ptracingtest" persistent ring buffer must not be used for
> > + actual tracing, but only for testing.
>
> As the test will corrupt events in the "ptracingtest" persistent
> ring buffer, it should not be used for any other purpose other
> than his test.
>
>
> > +
> > + If unsure, say N
> > +
> > config MMIOTRACE_TEST
> > tristate "Test module for mmiotrace"
> > depends on MMIOTRACE && m
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index 5ff632ca3858..fb098b0b4505 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -64,6 +64,10 @@ struct ring_buffer_cpu_meta {
> > unsigned long commit_buffer;
> > __u32 subbuf_size;
> > __u32 nr_subbufs;
> > +#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
> > + __u32 nr_invalid;
> > + __u32 entry_bytes;
> > +#endif
> > int buffers[];
> > };
> >
> > @@ -2079,6 +2083,21 @@ static void rb_meta_validate_events(struct
> > ring_buffer_per_cpu *cpu_buffer) if (discarded)
> > pr_cont(" (%d pages discarded)", discarded);
> > pr_cont("\n");
> > +
> > +#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
> > + if (meta->nr_invalid)
> > + pr_warn("Ring buffer testing [%d] invalid pages: %s (%d/%d)\n",
> > + cpu_buffer->cpu,
> > + (discarded == meta->nr_invalid) ? "PASSED" : "FAILED",
> > + discarded, meta->nr_invalid);
> > + if (meta->entry_bytes)
> > + pr_warn("Ring buffer testing [%d] entry_bytes: %s (%ld/%ld)\n",
> > + cpu_buffer->cpu,
> > + (entry_bytes == meta->entry_bytes) ? "PASSED" : "FAILED",
> > + (long)entry_bytes, (long)meta->entry_bytes);
> > + meta->nr_invalid = 0;
> > + meta->entry_bytes = 0;
> > +#endif
> > return;
> >
> > invalid:
> > @@ -2559,12 +2578,67 @@ static void rb_free_cpu_buffer(struct
> > ring_buffer_per_cpu *cpu_buffer) kfree(cpu_buffer);
> > }
> >
> > +#ifdef CONFIG_RING_BUFFER_PERSISTENT_INJECT
> > +static void rb_test_inject_invalid_pages(struct trace_buffer *buffer)
> > +{
> > + struct ring_buffer_per_cpu *cpu_buffer;
> > + struct ring_buffer_cpu_meta *meta;
> > + struct buffer_data_page *dpage;
> > + u32 entry_bytes = 0;
> > + unsigned long ptr;
> > + int subbuf_size;
> > + int invalid = 0;
> > + int cpu;
> > + int i;
> > +
> > + if (!(buffer->flags & RB_FL_TESTING))
> > + return;
> > +
> > + guard(preempt)();
> > + cpu = smp_processor_id();
> > +
> > + cpu_buffer = buffer->buffers[cpu];
> > + meta = cpu_buffer->ring_meta;
> > + ptr = (unsigned long)rb_subbufs_from_meta(meta);
> > + subbuf_size = meta->subbuf_size;
> > +
> > + for (i = 0; i < meta->nr_subbufs; i++) {
> > + int idx = meta->buffers[i];
> > +
> > + dpage = (void *)(ptr + idx * subbuf_size);
> > + /* Skip unused pages */
> > + if (!local_read(&dpage->commit))
> > + continue;
> > +
> > + /*
> > + * Invalidate even pages or multiples of 5. This will lead 3
>
> This will cause 3
OK, thanks for your comments. I'll update it according your
review.
Thank you,
>
> -- Steve
>
> > + * contiguous invalidated(empty) pages.
> > + */
> > + if (!(i & 0x1) || !(i % 5)) {
> > + local_add(subbuf_size + 1, &dpage->commit);
> > + invalid++;
> > + } else {
> > + /* Count total commit bytes. */
> > + entry_bytes += local_read(&dpage->commit);
> > + }
> > + }
> > +
> > + pr_info("Inject invalidated %d pages on CPU%d, total size: %ld\n",
> > + invalid, cpu, (long)entry_bytes);
> > + meta->nr_invalid = invalid;
> > + meta->entry_bytes = entry_bytes;
> > +}
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v2] tracing/probe: reject empty immediate strings
From: Masami Hiramatsu @ 2026-04-04 0:33 UTC (permalink / raw)
To: Pengpeng Hou
Cc: rostedt, mhiramat, mathieu.desnoyers, linux-trace-kernel,
linux-kernel
In-Reply-To: <20260401160315.88518-1-pengpeng@iscas.ac.cn>
On Thu, 2 Apr 2026 00:03:15 +0800
Pengpeng Hou <pengpeng@iscas.ac.cn> wrote:
> parse_probe_arg() accepts quoted immediate strings and passes the body
> after the opening quote to __parse_imm_string(). That helper currently
> computes strlen(str) and immediately dereferences str[len - 1], which
> underflows when the body is empty.
>
> Reject empty immediate strings before checking for the closing quote.
>
Oops, good catch!
Let me pick it.
Thank you!
> Fixes: a42e3c4de964 ("tracing/probe: Add immediate string parameter support")
> Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
> ---
> Changes since v1:
> - resend as a standalone patch instead of part of an accidental
> cross-subsystem 1/4 series
>
> kernel/trace/trace_probe.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index e0a5dc86c07e..e1c73065dae5 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -1068,7 +1068,7 @@ static int __parse_imm_string(char *str, char **pbuf, int offs)
> {
> size_t len = strlen(str);
>
> - if (str[len - 1] != '"') {
> + if (!len || str[len - 1] != '"') {
> trace_probe_log_err(offs + len, IMMSTR_NO_CLOSE);
> return -EINVAL;
> }
> --
> 2.50.1 (Apple Git-155)
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH] kernel/trace: fixed static warnings
From: Abhijith Sriram @ 2026-04-04 6:03 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Steven Rostedt, Mathieu Desnoyers, open list:TRACING,
open list:TRACING
In-Reply-To: <20260404091806.4e1cf73d2775d128f1dc5277@kernel.org>
On Sat, Apr 4, 2026 at 2:18 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 2 Apr 2026 21:54:04 +0200
> abhijithsriram95@gmail.com wrote:
>
> > From: Abhijith Sriram <abhijithsriram95@gmail.com>
> >
> > The change in the function argument description
> > was due to the static code checker script reading
> > the word filter back to back
> >
> > Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
> > ---
> > kernel/trace/trace_events_trigger.c | 10 ++++++----
> > 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
> > index 655db2e82513..477d8dee3362 100644
> > --- a/kernel/trace/trace_events_trigger.c
> > +++ b/kernel/trace/trace_events_trigger.c
> > @@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
> > }
> > EXPORT_SYMBOL_GPL(event_triggers_post_call);
> >
> > -#define SHOW_AVAILABLE_TRIGGERS (void *)(1UL)
> > +#define SHOW_AVAILABLE_TRIGGERS ((void *)(1UL))
>
> This is OK.
>
> >
> > static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
> > {
> > @@ -325,6 +325,7 @@ static const struct seq_operations event_triggers_seq_ops = {
> > static int event_trigger_regex_open(struct inode *inode, struct file *file)
> > {
> > int ret;
> > + struct seq_file *m = NULL;
> >
> > ret = security_locked_down(LOCKDOWN_TRACEFS);
> > if (ret)
> > @@ -351,7 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
> > if (file->f_mode & FMODE_READ) {
> > ret = seq_open(file, &event_triggers_seq_ops);
> > if (!ret) {
> > - struct seq_file *m = file->private_data;
> > + *m = file->private_data;
>
> Why is this change required?
The original warning says "missing blank line after declaration". I
thought it was cleaner to have the
declaration in the beginning of the function. I made a mistake here
which I fixed in the version 2 of
the patch, please have a look here:
https://lore.kernel.org/linux-trace-kernel/20260403071108.23422-2-abhijithsriram95@gmail.com/T/#u
>
> > m->private = file;
> > }
> > }
> > @@ -388,9 +389,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
> > const char __user *ubuf,
> > size_t cnt, loff_t *ppos)
> > {
> > + char *buf __free(kfree) = NULL;
> > struct trace_event_file *event_file;
> > ssize_t ret;
> > - char *buf __free(kfree) = NULL;
>
> What is this change?
The same missing blank lines after declaration was triggered here,
even though there is a blank line after the char *buf.
If I do give an empty line then there is another error "Trailing white
space". So I reordered it and the warning disappeared.
This change I am not super sure since it is usually recommended that
variables of larger size are declared first
for padding purposes. What do you think?
>
> Thanks,
>
> >
> > if (!cnt)
> > return 0;
> > @@ -633,6 +634,7 @@ clear_event_triggers(struct trace_array *tr)
> >
> > list_for_each_entry(file, &tr->events, list) {
> > struct event_trigger_data *data, *n;
> > +
> > list_for_each_entry_safe(data, n, &file->triggers, list) {
> > trace_event_trigger_enable_disable(file, 0);
> > list_del_rcu(&data->list);
> > @@ -785,7 +787,7 @@ static void unregister_trigger(char *glob,
> > * cmd - the trigger command name
> > * glob - the trigger command name optionally prefaced with '!'
> > * param_and_filter - text following cmd and ':'
> > - * param - text following cmd and ':' and stripped of filter
> > + * param - text following cmd and ':' and filter removed
> > * filter - the optional filter text following (and including) 'if'
> > *
> > * To illustrate the use of these components, here are some concrete
> > --
> > 2.43.0
> >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
--
Regards
Abhijith Sriram
^ permalink raw reply
* [PATCH] tracefs: fix default permissions not being applied on initial mount
From: David Carlier @ 2026-04-04 13:47 UTC (permalink / raw)
To: rostedt, mhiramat
Cc: mathieu.desnoyers, kaleshsingh, linux-trace-kernel, linux-kernel,
stable, David Carlier
Commit e4d32142d1de ("tracing: Fix tracefs mount options") moved the
option application from tracefs_fill_super() to tracefs_reconfigure()
called from tracefs_get_tree(). This fixed mount options being ignored
on user-space mounts when the superblock already exists, but introduced
a regression for the initial kernel-internal mount.
On the first mount (via simple_pin_fs during init), sget_fc() transfers
fc->s_fs_info to sb->s_fs_info and sets fc->s_fs_info to NULL. When
tracefs_get_tree() then calls tracefs_reconfigure(), it sees a NULL
fc->s_fs_info and returns early without applying any options. The root
inode keeps mode 0755 from simple_fill_super() instead of the intended
TRACEFS_DEFAULT_MODE (0700).
Furthermore, even on subsequent user-space mounts without an explicit
mode= option, tracefs_apply_options(sb, true) gates the mode behind
fsi->opts & BIT(Opt_mode), which is unset for the defaults. So the
mode is never corrected unless the user explicitly passes mode=0700.
Restore the tracefs_apply_options(sb, false) call in tracefs_fill_super()
to apply default permissions on initial superblock creation, matching
what debugfs does in debugfs_fill_super().
Fixes: e4d32142d1de ("tracing: Fix tracefs mount options")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
---
fs/tracefs/inode.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index edfdf139cb02..36058ba5ee48 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -491,6 +491,7 @@ static int tracefs_fill_super(struct super_block *sb, struct fs_context *fc)
return err;
sb->s_op = &tracefs_super_operations;
+ tracefs_apply_options(sb, false);
set_default_d_op(sb, &tracefs_dentry_operations);
return 0;
--
2.53.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox