* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Alexei Starovoitov @ 2026-04-03 14:26 UTC (permalink / raw)
To: Yafang Shao
Cc: Steven Rostedt, Menglong Dong, Josh Poimboeuf, Jiri Kosina,
Miroslav Benes, Petr Mladek, Joe Lawrence, Masami Hiramatsu,
Mathieu Desnoyers, KP Singh, Matt Bobrowski, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard, Kumar Kartikeya Dwivedi, Yonghong Song,
live-patching, LKML, linux-trace-kernel, bpf
In-Reply-To: <CALOAHbBBf_vWcwZp9kdXhpFOq_oG87X-7Nj2yurZ6LgBpDHwwQ@mail.gmail.com>
On Fri, Apr 3, 2026 at 6:31 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> >
> > If this were to go in, I say it would require both a kernel config, with
> > a big warning about this being a security hole, and a kernel command line
> > option to enable it, so that people don't accidentally have it enabled in
> > their config.
> >
> > The command line should be something like:
> >
> > allow_bpf_to_rootkit_functions
>
> The feature is currently gated by CONFIG_KPROBE_OVERRIDE_KLP_FUNC. In
> the next revision, I will rename this to
> CONFIG_ALLOW_BPF_TO_ROOTKIT_FUNCS and introduce a corresponding kernel
> command-line parameter, allow_bpf_to_rootkit_functions, to control
> it.
No. Even with extra config this is not ok.
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Yafang Shao @ 2026-04-03 13:30 UTC (permalink / raw)
To: Steven Rostedt
Cc: Menglong Dong, jpoimboe, jikos, mbenes, pmladek, joe.lawrence,
mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa,
ast, daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260403073055.031275d9@gandalf.local.home>
On Fri, Apr 3, 2026 at 7:29 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 03 Apr 2026 18:25:59 +0800
> Menglong Dong <menglong.dong@linux.dev> wrote:
>
> > I think the security problem is a big issue. Imagine that we have a KLP
> > in our environment. Any user can crash the kernel by hooking a BPF
> > program on it that calls bpf_override_return().
>
> Right, livepatching may allow for rapid experimentation but that is not its
> purpose. It is for fixing production systems without having to reboot.
> Using BPF to change the return of a function is a huge security issue.
>
> >
> > What's more, this is a little weird for me. If we allow to use bpf_override_return()
> > for the kernel functions in a KLP, why not we allow it in a common kernel
> > module, as KLP is a kind of kernel module. Then, why not we allow to
> > use it for all the kernel functions?
>
> Right.
>
> >
> > Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
> > your example? Then we can override its return directly. This is a more
> > reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
> > anyone can modify the return of this function safely.
>
> If this were to go in, I say it would require both a kernel config, with
> a big warning about this being a security hole, and a kernel command line
> option to enable it, so that people don't accidentally have it enabled in
> their config.
>
> The command line should be something like:
>
> allow_bpf_to_rootkit_functions
The feature is currently gated by CONFIG_KPROBE_OVERRIDE_KLP_FUNC. In
the next revision, I will rename this to
CONFIG_ALLOW_BPF_TO_ROOTKIT_FUNCS and introduce a corresponding kernel
command-line parameter, allow_bpf_to_rootkit_functions, to control
it.
--
Regards
Yafang
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Yafang Shao @ 2026-04-03 13:26 UTC (permalink / raw)
To: Menglong Dong
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <3036842.e9J7NaK4W3@7940hx>
On Fri, Apr 3, 2026 at 6:26 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2026/4/2 21:20 Yafang Shao <laoar.shao@gmail.com> write:
> > On Thu, Apr 2, 2026 at 8:48 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> > >
> > > On 2026/4/2 17:26, Yafang Shao wrote:
> > > > Introduce the ability for kprobes to override the return values of
> > > > functions that have been livepatched. This functionality is guarded by the
> > > > CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.
> > >
> > > Hi, Yafang. This is a interesting idea.
> > >
> [...]
> >
> > +/* noclone to avoid bond_get_slave_hook.constprop.0 */
> > +__attribute__((__noclone__, __noinline__))
> > +int bond_get_slave_hook(struct sk_buff *skb, u32 hash, unsigned int count)
> > +{
> > + return -1;
> > +}
>
> Hi, yafang.
>
> I see what you mean now. So you want to allow BPF program override
> the return of all the kernel functions in a KLP module.
>
> I think the security problem is a big issue. Imagine that we have a KLP
> in our environment. Any user can crash the kernel by hooking a BPF
> program on it that calls bpf_override_return().
This feature is guarded by the CONFIG_KPROBE_OVERRIDE_KLP_FUNC
configuration option, which is disabled by default. Consequently, the
user must explicitly enable this option to utilize the feature.
>
> What's more, this is a little weird for me. If we allow to use bpf_override_return()
> for the kernel functions in a KLP, why not we allow it in a common kernel
> module, as KLP is a kind of kernel module. Then, why not we allow to
> use it for all the kernel functions?
By leveraging KLP, we can rapidly deploy new features without
interrupting production workloads. Accordingly, this feature is
specifically targeted at KLP-patched functions to maintain that
seamless delivery model.
>
> Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
> your example? Then we can override its return directly. This is a more
> reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
> anyone can modify the return of this function safely.
It is unfortunate that ALLOW_ERROR_INJECTION() is incompatible with
KLP-patched functions, as this limits our ability to perform fault
injection on livepatched code.
>
> WDYT?
>
> BTW, this is a BPF modification, so maybe we can use "bpf: xxx" for the title
> of this patch. Then, the BPF maintainers can notice this patch ;)
I agree with the suggestion. I will update the subject prefix to
"trace, bpf:" in the next version.
--
Regards
Yafang
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Steven Rostedt @ 2026-04-03 11:30 UTC (permalink / raw)
To: Menglong Dong
Cc: Yafang Shao, jpoimboe, jikos, mbenes, pmladek, joe.lawrence,
mhiramat, mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa,
ast, daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <3036842.e9J7NaK4W3@7940hx>
On Fri, 03 Apr 2026 18:25:59 +0800
Menglong Dong <menglong.dong@linux.dev> wrote:
> I think the security problem is a big issue. Imagine that we have a KLP
> in our environment. Any user can crash the kernel by hooking a BPF
> program on it that calls bpf_override_return().
Right, livepatching may allow for rapid experimentation but that is not its
purpose. It is for fixing production systems without having to reboot.
Using BPF to change the return of a function is a huge security issue.
>
> What's more, this is a little weird for me. If we allow to use bpf_override_return()
> for the kernel functions in a KLP, why not we allow it in a common kernel
> module, as KLP is a kind of kernel module. Then, why not we allow to
> use it for all the kernel functions?
Right.
>
> Can we mark the "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
> your example? Then we can override its return directly. This is a more
> reasonable for me. With ALLOW_ERROR_INJECTION(), we are telling people that
> anyone can modify the return of this function safely.
If this were to go in, I say it would require both a kernel config, with
a big warning about this being a security hole, and a kernel command line
option to enable it, so that people don't accidentally have it enabled in
their config.
The command line should be something like:
allow_bpf_to_rootkit_functions
-- Steve
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Menglong Dong @ 2026-04-03 10:25 UTC (permalink / raw)
To: Yafang Shao
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <CALOAHbDnNba_w_nWH3-S9GAXw0+VKuLTh1gy5hy9Yqgeo4C0iA@mail.gmail.com>
On 2026/4/2 21:20 Yafang Shao <laoar.shao@gmail.com> write:
> On Thu, Apr 2, 2026 at 8:48 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> >
> > On 2026/4/2 17:26, Yafang Shao wrote:
> > > Introduce the ability for kprobes to override the return values of
> > > functions that have been livepatched. This functionality is guarded by the
> > > CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.
> >
> > Hi, Yafang. This is a interesting idea.
> >
[...]
>
> +/* noclone to avoid bond_get_slave_hook.constprop.0 */
> +__attribute__((__noclone__, __noinline__))
> +int bond_get_slave_hook(struct sk_buff *skb, u32 hash, unsigned int count)
> +{
> + return -1;
> +}
Hi, Yafang.
I see what you mean now. So you want to allow a BPF program to override
the return value of all the kernel functions in a KLP module.
I think the security problem is a big issue. Imagine that we have a KLP
in our environment. Any user can crash the kernel by hooking a BPF
program on it that calls bpf_override_return().
What's more, this is a little weird to me. If we allow bpf_override_return()
for the kernel functions in a KLP, why not allow it in an ordinary kernel
module, since a KLP is a kind of kernel module? Then, why not allow it
for all the kernel functions?
Can we mark "bond_get_slave_hook" with ALLOW_ERROR_INJECTION() in
your example? Then we can override its return directly. This is more
reasonable to me. With ALLOW_ERROR_INJECTION(), we are telling people that
anyone can modify the return of this function safely.
WDYT?
BTW, this is a BPF modification, so maybe we can use "bpf: xxx" for the title
of this patch. Then, the BPF maintainers can notice this patch ;)
Thanks!
Menglong Dong
>
> static struct slave *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
> struct sk_buff *skb,
> struct bond_up_slave *slaves)
> {
> struct slave *slave;
> unsigned int count;
> + int slave_idx;
> u32 hash;
>
> hash = bond_xmit_hash(bond, skb);
> @@ -5188,6 +5198,13 @@ static struct slave
> *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
> if (unlikely(!count))
> return NULL;
>
> + /* Try BPF hook first - returns slave index directly */
> + slave_idx = bond_get_slave_hook(skb, hash, count);
> + /* If BPF hook returned valid slave index, use it */
> + if (slave_idx >= 0 && slave_idx < count) {
> + slave = slaves->arr[slave_idx];
> + return slave;
> + }
> slave = slaves->arr[hash % count];
> return slave;
> }
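As a rough illustration of the fallback logic in the hunk above, here is a
minimal userspace sketch in C. It is not kernel code: model_override is a
hypothetical function pointer standing in for an attached BPF program that
calls bpf_override_return(), and slaves are reduced to plain indices. All
model_* names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an attached BPF program (assumption). */
typedef int (*model_hook_fn)(unsigned int hash, unsigned int count);
static model_hook_fn model_override;

/* Models bond_get_slave_hook(): returns -1 unless something overrides it. */
static int model_get_slave_hook(unsigned int hash, unsigned int count)
{
	if (model_override)
		return model_override(hash, count);
	return -1;
}

/* Models the patched selection: use the hook's index only when it is valid. */
static unsigned int model_pick_slave(unsigned int hash, unsigned int count)
{
	int idx;

	assert(count > 0);	/* the real code returns NULL when count is 0 */

	idx = model_get_slave_hook(hash, count);
	if (idx >= 0 && (unsigned int)idx < count)
		return (unsigned int)idx;

	/* No override, or an out-of-range one: keep the hash-based default. */
	return hash % count;
}

/* Example override that always asks for index 2. */
static int model_always_two(unsigned int hash, unsigned int count)
{
	(void)hash;
	(void)count;
	return 2;
}
```

With no override installed, model_pick_slave(7, 4) falls back to 7 % 4. An
override that returns an index outside [0, count) is ignored in the same way,
which mirrors the range check in the quoted hunk.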
>
> - The BPF program
>
> SEC("kprobe/bond_get_slave_hook")
> int BPF_KPROBE(slave_selector, struct sk_buff *skb, u32 hash, u32 count)
> {
> unsigned short net_hdr_off;
> unsigned char *head;
> struct iphdr iph;
> int *slave_idx;
> __u32 daddr;
>
> __u16 proto = BPF_CORE_READ(skb, protocol);
> if (proto != bpf_htons(0x0800))
> return 0;
>
> head = BPF_CORE_READ(skb, head);
> net_hdr_off = BPF_CORE_READ(skb, network_header);
>
> if (bpf_probe_read_kernel(&iph, sizeof(iph), head + net_hdr_off) != 0)
> return 0;
>
> daddr = iph.daddr;
> slave_idx = bpf_map_lookup_elem(&ip_slave_map, &daddr);
> if (slave_idx) {
> int idx = *slave_idx;
>
> if (idx >= 0 && idx < (int)count)
> bpf_override_return(ctx, idx);
> }
> return 0;
> }
>
> >
> > BTW, if we allow the usage of bpf_override_return() on the KLP patched
> > function, we should allow the usage of BPF_MODIFY_RETURN on this
> > case too, right?
>
> It's a possibility, but I haven't tested that specifically yet.
>
> --
> Regards
> Yafang
^ permalink raw reply
* Re: [PATCH] kernel/trace: fixed static warnings
From: Abhijith Sriram @ 2026-04-03 7:29 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
open list:TRACING, open list:TRACING
In-Reply-To: <20260402195405.21316-1-abhijithsriram95@gmail.com>
On Thu, Apr 2, 2026 at 9:55 PM <abhijithsriram95@gmail.com> wrote:
>
> From: Abhijith Sriram <abhijithsriram95@gmail.com>
>
> The change in the function argument description
> was due to the static code checker script reading
> the word filter back to back
>
> Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
>
I have corrected and made a new patch, please have a look here:
https://lore.kernel.org/linux-trace-kernel/20260403071108.23422-2-abhijithsriram95@gmail.com/
--
Regards
Abhijith Sriram
^ permalink raw reply
* [PATCH v2] kernel/trace: fixed static warnings
From: abhijithsriram95 @ 2026-04-03 7:11 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
open list:TRACING, open list:TRACING
Cc: Abhijith Sriram
From: Abhijith Sriram <abhijithsriram95@gmail.com>
The change in the function argument description
was due to the static code checker script reading
the word filter back to back
Changes in v2:
- corrected *m = file->private_data to m = file->private_data
Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
---
kernel/trace/trace_events_trigger.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 655db2e82513..e632a8b77153 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
}
EXPORT_SYMBOL_GPL(event_triggers_post_call);
-#define SHOW_AVAILABLE_TRIGGERS (void *)(1UL)
+#define SHOW_AVAILABLE_TRIGGERS ((void *)(1UL))
static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
{
@@ -325,6 +325,7 @@ static const struct seq_operations event_triggers_seq_ops = {
static int event_trigger_regex_open(struct inode *inode, struct file *file)
{
int ret;
+ struct seq_file *m = NULL;
ret = security_locked_down(LOCKDOWN_TRACEFS);
if (ret)
@@ -351,7 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_READ) {
ret = seq_open(file, &event_triggers_seq_ops);
if (!ret) {
- struct seq_file *m = file->private_data;
+ m = file->private_data;
m->private = file;
}
}
@@ -388,9 +389,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
+ char *buf __free(kfree) = NULL;
struct trace_event_file *event_file;
ssize_t ret;
- char *buf __free(kfree) = NULL;
if (!cnt)
return 0;
@@ -633,6 +634,7 @@ clear_event_triggers(struct trace_array *tr)
list_for_each_entry(file, &tr->events, list) {
struct event_trigger_data *data, *n;
+
list_for_each_entry_safe(data, n, &file->triggers, list) {
trace_event_trigger_enable_disable(file, 0);
list_del_rcu(&data->list);
@@ -785,7 +787,7 @@ static void unregister_trigger(char *glob,
* cmd - the trigger command name
* glob - the trigger command name optionally prefaced with '!'
* param_and_filter - text following cmd and ':'
- * param - text following cmd and ':' and stripped of filter
+ * param - text following cmd and ':' and filter removed
* filter - the optional filter text following (and including) 'if'
*
* To illustrate the use of these components, here are some concrete
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v2] bootconfig: Apply early options from embedded config
From: Masami Hiramatsu @ 2026-04-03 2:45 UTC (permalink / raw)
To: Breno Leitao
Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
linux-doc, oss, paulmck, rostedt, kernel-team, Kiryl Shutsemau
In-Reply-To: <ac0wz_eW5Zgi4t45@gmail.com>
On Wed, 1 Apr 2026 08:01:48 -0700
Breno Leitao <leitao@debian.org> wrote:
> On Wed, Apr 01, 2026 at 10:48:53PM +0900, Masami Hiramatsu wrote:
>
> > > The challenge extends beyond that. There are numerous early_parameter()
> > > definitions scattered throughout the kernel that may or may not be
> > > utilized by setup_arch().
> > >
> > > For example, consider `early_param("mitigations", ..)` in
> > > ./kernel/cpu.c. This modifies the cpu_mitigations global variable, which
> > > is referenced in various locations across different architectures.
> > >
> > > It's worth noting that we have over 300 early_parameter() instances in
> > > the kernel.
> > >
> > > Given this, analyzing all these early parameters and examining each one
> > > individually represents a substantial amount of work.
> >
> > Yes, that may require a substantial amount of work. But to improve
> > the kernel framework around the parameter handling, eventually we
> > need to examine each early parameter.
>
> I'm still uncertain about this approach. The goal is to identify and
> categorize the early parameters that are parsed prior to bootconfig
> initialization.
Yes, if we support early parameters in bootconfig, we need to clarify
which parameters are inherently unsupportable, and document that.
Currently it is easy to say that bootconfig does not support parameters
defined with "early_param()". Similarly, maybe we should introduce
"arch_param()" or something like it (or support all of them).
>
> Moreover, this work could become obsolete if bootconfig's initialization
> point shifts earlier or later in the boot sequence, necessitating
> another comprehensive analysis.
If we can init it before calling setup_arch(), yes, we don't need to
check it. So that is another option. Do you think it is feasible to
support all of them? (Of course, theoretically we can, but the
question is the use case and requirements.)
> Conversely, if we successfully move bootconfig initialization earlier
> by breaking the dependency of memblock (assuming this is feasible), the
> vast majority of early parameters would execute after bootconfig is
> configured, eliminating the need for this extensive categorization work.
OK, I agree.
>
> Please, feel free to tell what approach might be better for the project.
>
> > > Are there alternative approaches? At this point, I'm leaning toward
> > > breaking bootconfig's dependency on memblock, allowing us to invoke it
> > > before setup_arch(). Is this the only practical solution available?!
> >
> > Basically, the memblock dependency comes from allocating copy of data.
> > Only for the embedded bootconfig, we can just pass copy memory block
> > to the xbc_init(). Something like;
> >
> > xbc_init() {
> > xbc_data = memblock_alloc();
> > memcpy(xbc_data, data);
> > __xbc_init(xbc_data);
> > }
> >
> > embedded_xbc_init() {
> > __xbc_init(embedded_bootconfig_data);
> > }
> >
> > Afterwards, we can pass a mixture of embedded bootconfig and initrd
> > bootconfig data to parser again.
> >
> > (But in this case, we must be careful not to override the early
> > parameters that we have already applied.)
>
> Do you have any additional recommendations if I proceed with this
> approach?
OK,
First of all, even if we enable early parameter support in bootconfig,
this is only possible if bootconfig is embedded. In that case, we can
pass memory that has been pre-allocated at compile time to bootconfig
as a working area. However, this will consume a lot of memory, so it
needs to be selectable in Kconfig.
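To make the two paths concrete, here is a minimal userspace sketch in C. It
only models the idea: the late path copies the data and allocates its working
area (malloc() stands in for memblock_alloc()), while the embedded path
parses the built-in data in place with a working area reserved at compile
time. Every model_* name, MODEL_MAX_NODES, the sample keys, and the trivial
line-based "parser" are assumptions for illustration, not the real bootconfig
code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_NODES 64	/* hypothetical working-area capacity */

struct model_node {
	size_t off;	/* start of the entry inside the data buffer */
	size_t len;	/* length of the entry, newline excluded */
};

/* Toy parser: records every non-empty line in the caller's working area. */
static int model_parse(const char *data, size_t size,
		       struct model_node *work, size_t cap)
{
	size_t start = 0, used = 0, i;

	assert(data != NULL && work != NULL);

	for (i = 0; i <= size; i++) {
		if (i == size || data[i] == '\n') {
			if (i > start) {
				if (used == cap)
					return -1;	/* working area full */
				work[used].off = start;
				work[used].len = i - start;
				used++;
			}
			start = i + 1;
		}
	}
	return (int)used;
}

/* Late path: an allocator exists, so copy the data and allocate nodes. */
static int model_xbc_init(const char *data, size_t size)
{
	struct model_node *work = malloc(MODEL_MAX_NODES * sizeof(*work));
	char *copy = malloc(size + 1);
	int ret = -1;

	if (work && copy) {
		memcpy(copy, data, size);
		copy[size] = '\0';
		ret = model_parse(copy, size, work, MODEL_MAX_NODES);
	}
	free(copy);
	free(work);
	return ret;
}

/* Early path: no allocator yet, so everything is static. */
static const char model_embedded[] =
	"kernel.mitigations = off\nkernel.loglevel = 7\n";
static struct model_node model_static_work[MODEL_MAX_NODES];

static int model_embedded_xbc_init(void)
{
	return model_parse(model_embedded, sizeof(model_embedded) - 1,
			   model_static_work, MODEL_MAX_NODES);
}
```

The static array is what costs memory in every build that enables the
feature, which is why the sketch sizes it with a single constant that a
Kconfig option could control.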
If you're going to embed this, as Kiryl pointed out[1], it might be better
to pass pre-normalized (or compiled) data and avoid using a parser.
Compilation itself is relatively easy if you use tools/bootconfig.
(However, in this case, there doesn't seem to be much point in using
bootconfig in the first place, because we can also use the embedded kernel
cmdline.)
[1] https://lore.kernel.org/all/acueCFv4neO7zQGI@thinkstation/
Can you clarify the main reason for requesting this feature, with
examples?
Thank you,
>
> Thank you for your detailed responses and insights.
> --breno
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* [PATCH] kernel/trace: fixed static warnings
From: abhijithsriram95 @ 2026-04-02 19:54 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
open list:TRACING, open list:TRACING
Cc: Abhijith Sriram
From: Abhijith Sriram <abhijithsriram95@gmail.com>
The change in the function argument description
was due to the static code checker script reading
the word filter back to back
Signed-off-by: Abhijith Sriram <abhijithsriram95@gmail.com>
---
kernel/trace/trace_events_trigger.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 655db2e82513..477d8dee3362 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -246,7 +246,7 @@ event_triggers_post_call(struct trace_event_file *file,
}
EXPORT_SYMBOL_GPL(event_triggers_post_call);
-#define SHOW_AVAILABLE_TRIGGERS (void *)(1UL)
+#define SHOW_AVAILABLE_TRIGGERS ((void *)(1UL))
static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
{
@@ -325,6 +325,7 @@ static const struct seq_operations event_triggers_seq_ops = {
static int event_trigger_regex_open(struct inode *inode, struct file *file)
{
int ret;
+ struct seq_file *m = NULL;
ret = security_locked_down(LOCKDOWN_TRACEFS);
if (ret)
@@ -351,7 +352,7 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_READ) {
ret = seq_open(file, &event_triggers_seq_ops);
if (!ret) {
- struct seq_file *m = file->private_data;
+ *m = file->private_data;
m->private = file;
}
}
@@ -388,9 +389,9 @@ static ssize_t event_trigger_regex_write(struct file *file,
const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
+ char *buf __free(kfree) = NULL;
struct trace_event_file *event_file;
ssize_t ret;
- char *buf __free(kfree) = NULL;
if (!cnt)
return 0;
@@ -633,6 +634,7 @@ clear_event_triggers(struct trace_array *tr)
list_for_each_entry(file, &tr->events, list) {
struct event_trigger_data *data, *n;
+
list_for_each_entry_safe(data, n, &file->triggers, list) {
trace_event_trigger_enable_disable(file, 0);
list_del_rcu(&data->list);
@@ -785,7 +787,7 @@ static void unregister_trigger(char *glob,
* cmd - the trigger command name
* glob - the trigger command name optionally prefaced with '!'
* param_and_filter - text following cmd and ':'
- * param - text following cmd and ':' and stripped of filter
+ * param - text following cmd and ':' and filter removed
* filter - the optional filter text following (and including) 'if'
*
* To illustrate the use of these components, here are some concrete
--
2.43.0
^ permalink raw reply related
* Re: NULL pointer dereference when booting ppc64_guest_defconfig in QEMU on -next
From: Andrew Morton @ 2026-04-02 19:26 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ritesh Harjani (IBM), Harry Yoo (Oracle), linuxppc-dev, Harry Yoo,
Nathan Chancellor, Thomas Weißschuh, Michal Clapinski,
Thomas Gleixner, Steven Rostedt, Masami Hiramatsu, linux-mm,
linux-trace-kernel, linux-kernel, Srikar Dronamraju,
Madhavan Srinivasan
In-Reply-To: <6a5f2139-0ff8-40b3-88f9-98d2ea020d6f@efficios.com>
On Thu, 2 Apr 2026 11:30:53 -0400 Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> On 2026-03-20 22:21, Andrew Morton wrote:
> > On Sat, 21 Mar 2026 06:42:41 +0530 Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
> >
> >> Looks like this is causing regressions in linux-next with warnings
> >> similar to what Harry also pointed out. Do we have any solution for
> >> this, or are we planning to hold on to this patch[1] and maybe even
> >> remove it temporarily from linux-next, until this is fixed?
> >
> > Yes, I'll disable this patchset.
>
> Hi Andrew,
>
> I have prepared fixes for this issue. On which branch should I rebase
> them ? Do you still have the HPCC series in your branch or should I
> send it anew ?
Cool thanks.
It's best to do a full resend after -rc1 please, presumably against
mainline. Show reviewers the latest version, refresh memories, etc.
^ permalink raw reply
* Re: [PATCH bpf v3 1/2] bpf: Reject sleepable kprobe_multi programs at attach time
From: patchwork-bot+netdevbpf @ 2026-04-02 16:50 UTC (permalink / raw)
To: Varun R Mallya
Cc: bpf, ast, daniel, memxor, yonghong.song, jolsa, rostedt, mhiramat,
linux-kernel, linux-trace-kernel
In-Reply-To: <20260401191126.440683-1-varunrmallya@gmail.com>
Hello:
This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Thu, 2 Apr 2026 00:41:25 +0530 you wrote:
> kprobe.multi programs run in atomic/RCU context and cannot sleep.
> However, bpf_kprobe_multi_link_attach() did not validate whether the
> program being attached had the sleepable flag set, allowing sleepable
> helpers such as bpf_copy_from_user() to be invoked from a non-sleepable
> context.
>
> This causes a "sleeping function called from invalid context" splat:
>
> [...]
Here is the summary with links:
- [bpf,v3,1/2] bpf: Reject sleepable kprobe_multi programs at attach time
https://git.kernel.org/bpf/bpf/c/eb7024bfcc5f
- [bpf,v3,2/2] selftests/bpf: Add test to ensure kprobe_multi is not sleepable
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [GIT PULL] RTLA changes for v7.1
From: Steven Rostedt @ 2026-04-02 16:28 UTC (permalink / raw)
To: Tomas Glozar
Cc: Costa Shulyupin, Wander Lairson Costa, LKML, linux-trace-kernel
In-Reply-To: <20260402122634.4d750f2c@gandalf.local.home>
On Thu, 2 Apr 2026 12:26:34 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> This is fine as is. Linus is used to this. He's even OK with minor merge
> conflicts. The only time you really need to tell Linus about something is
> if the merge conflicts or a merge causes something to break but merges
> cleanly (like removing an extra #endif)
In fact, Linus actually looks down on rebasing. You should only rebase if
there's something really nasty.
The code is already in linux-next. Which means it should not be rebased at
all.
-- Steve
^ permalink raw reply
* Re: [GIT PULL] RTLA changes for v7.1
From: Steven Rostedt @ 2026-04-02 16:26 UTC (permalink / raw)
To: Tomas Glozar
Cc: Costa Shulyupin, Wander Lairson Costa, LKML, linux-trace-kernel
In-Reply-To: <CAP4=nvRZ2iEVxSgwHwumgonrhdvkYiG9KsCh0S2kwBwnemMi6A@mail.gmail.com>
On Thu, 2 Apr 2026 17:08:36 +0200
Tomas Glozar <tglozar@redhat.com> wrote:
> After merging the fix for 7.0 [1], there's now a context difference
> caused by commit ea06305ff9920 (tools/rtla: Remove unneeded nr_cpus
> arguments) on merging rtla-v7.1 onto the current master. The context
> difference merges cleanly via three-way merge:
>
> $ git merge rtla-v7.1
> Auto-merging tools/tracing/rtla/src/timerlat_bpf.h
> Merge made by the 'ort' strategy.
> ...
>
> Do you prefer me to rebase this PR on top of 7.0-rc6 once it's tagged
> or to leave the pull request as is and perhaps add a note to your PR
> to Linus the merge difference is expected?
This is fine as is. Linus is used to this. He's even OK with minor merge
conflicts. The only time you really need to tell Linus about something is
if the merge conflicts, or if it merges cleanly but the merge causes
something to break (like removing an extra #endif).
-- Steve
^ permalink raw reply
* Re: [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2
From: Ackerley Tng @ 2026-04-02 16:20 UTC (permalink / raw)
To: Michael Roth
Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
ira.weiny, jmattson, jroedel, jthoughton, oupton, pankaj.gupta,
qperret, rick.p.edgecombe, rientjes, shivankg, steven.price,
tabba, willy, wyihan, yan.y.zhao, forkloop, pratyush,
suzuki.poulose, aneesh.kumar, Paolo Bonzini, Sean Christopherson,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
Kemeng Shi, Nhat Pham, Baoquan He, Barry Song, Axel Rasmussen,
Yuanchu Xie, Wei Xu, Jason Gunthorpe, Vlastimil Babka, kvm,
linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
linux-mm
In-Reply-To: <CAEvNRgFkusZeKxGctUpTTbYjdi7nZL1ZZar-gT7XRUOCZ2xtpw@mail.gmail.com>
Ackerley Tng <ackerleytng@google.com> writes:
>
> [...snip...]
>
>> In the case of SNP, there is a
>> documentation/parameter check in snp_launch_update() that needs to be
>> relaxed in order for userspace to be able to pass in a NULL 'src'
>> parameter (since, for in-place conversion, it would be initialized in place
>> as shared memory prior to the call, since by the time kvm_gmem_populate()
>> it will have been set to private and therefore cannot be faulted in via
>> GUP (and if it could, we'd be unnecessarily copying the src back on top
>> of itself since src/dst are the same).
>
>
> [...snip...]
>
>
> Btw, if snp_launch_update() is going to accept a NULL src parameter and
> launch-update the src in-place:
>
> + Will userspace have to set that memory to private before calling launch
> update?
> + If yes, then would we need some other mode of conversion that is
> not ZERO and not quite PRESERVE (since PRESERVE is defined as that
> the guest will see what the host wrote post-encryption, but it
> sounds like launch update is doing the encryption)
> + Or should launch update be called when that memory is shared? Will
> launch update then also set that memory to private in guest_memfd?
>
Update after today's guest_memfd biweekly:
guest_memfd's populate will first check that the memory is shared, then
also set the memory to private after the populate.
KVM must not make assumptions about any memory that is private, so it
should actually only be operating on memory that is shared. This is
aligned with pre-in-place-conversion, since before this series, there
was no way to populate from private memory anyway.
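The rule above can be sketched as a tiny state model in userspace C. This is
only an illustration of the ordering (check shared, populate, then flip to
private); the model_* names, MODEL_PAGE_SIZE, the -EINVAL return values, and
the treatment of a NULL src as "populate in place" are assumptions for this
sketch, not the actual guest_memfd interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16	/* toy page size, assumption */

enum model_attr { MODEL_SHARED, MODEL_PRIVATE };

struct model_page {
	enum model_attr attr;
	unsigned char data[MODEL_PAGE_SIZE];
};

/*
 * Populate only operates on shared memory and converts it to private
 * afterwards. A NULL src models in-place population: the contents were
 * already written while the page was shared.
 */
static int model_populate(struct model_page *page,
			  const unsigned char *src, size_t len)
{
	assert(page != NULL);

	if (page->attr != MODEL_SHARED)
		return -EINVAL;	/* never touch memory that is already private */
	if (len > MODEL_PAGE_SIZE)
		return -EINVAL;

	if (src)
		memcpy(page->data, src, len);

	page->attr = MODEL_PRIVATE;
	return 0;
}
```

A second model_populate() call on the same page fails, because the first call
already converted it to private.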
>>
>> [...snip...]
>>
^ permalink raw reply
* Re: [PATCH v6 4/4] selftests/ftrace: Add accept cases for fprobe list syntax
From: Ryan Chung @ 2026-04-02 15:45 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: rostedt, corbet, shuah, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <20260324131204.735c60133288e94718f20d31@kernel.org>
Hi Masami,
Thank you for your feedback. Unfortunately, I am not in a position
to continue working on this patch series for the foreseeable future.
If you or anyone else on the list would like to pick it up and carry
it forward, you are welcome to do so. I appreciate your time and
effort on this.
Best regards,
Seokwoo Chung
On Tue, 24 Mar 2026 at 00:12, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 5 Feb 2026 08:58:42 -0500
> "Seokwoo Chung (Ryan)" <seokwoo.chung130@gmail.com> wrote:
>
> > Add fprobe_list.tc to test the comma-separated symbol list syntax
> > with :entry/:exit suffixes. Three scenarios are covered:
> >
> > 1. List with default (entry) behavior and ! exclusion
> > 2. List with explicit :entry suffix
> > 3. List with :exit suffix for return probes
>
>
> Could you also add wildcard pattern test?
>
> >
> > Each test verifies that the correct functions appear in
> > enabled_functions and that excluded (!) symbols are absent.
> >
> > Note: The existing tests add_remove_fprobe.tc, fprobe_syntax_errors.tc,
> > and add_remove_fprobe_repeat.tc check their "requires" line against the
> > tracefs README for the old "%return" syntax pattern. Since the README
> > now documents ":entry|:exit" instead, these tests report UNSUPPORTED.
> > Their "requires" lines need updating in a follow-up patch.
>
> This means you'll break the selftests. Please fix those tests first.
> (This fix must be done before "tracing/fprobe: Support comma-separated
> symbols and :entry/:exit" so that we can safely bisect it.)
>
> Thank you,
>
>
> >
> > Signed-off-by: Seokwoo Chung (Ryan) <seokwoo.chung130@gmail.com>
> > ---
> > .../ftrace/test.d/dynevent/fprobe_list.tc | 92 +++++++++++++++++++
> > 1 file changed, 92 insertions(+)
> > create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> >
> > diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> > new file mode 100644
> > index 000000000000..45e57c6f487d
> > --- /dev/null
> > +++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
> > @@ -0,0 +1,92 @@
> > +#!/bin/sh
> > +# SPDX-License-Identifier: GPL-2.0
> > +# description: Fprobe event list syntax and :entry/:exit suffixes
> > +# requires: dynamic_events "f[:[<group>/][<event>]] <func-name>[:entry|:exit] [<args>]":README
> > +
> > +# Setup symbols to test. These are common kernel functions.
> > +PLACE=vfs_read
> > +PLACE2=vfs_write
> > +PLACE3=vfs_open
> > +
> > +echo 0 > events/enable
> > +echo > dynamic_events
> > +
> > +# Get baseline count of enabled functions (should be 0 if clean, but be safe)
> > +if [ -f enabled_functions ]; then
> > + ocnt=`cat enabled_functions | wc -l`
> > +else
> > + ocnt=0
> > +fi
> > +
> > +# Test 1: List default (entry) with exclusion
> > +# Target: Trace vfs_read and vfs_open, but EXCLUDE vfs_write
> > +echo "f:test/list_entry $PLACE,!$PLACE2,$PLACE3" >> dynamic_events
> > +grep -q "test/list_entry" dynamic_events
> > +test -d events/test/list_entry
> > +
> > +echo 1 > events/test/list_entry/enable
> > +
> > +grep -q "$PLACE" enabled_functions
> > +grep -q "$PLACE3" enabled_functions
> > +! grep -q "$PLACE2" enabled_functions
> > +
> > +# Check count (Baseline + 2 new functions)
> > +cnt=`cat enabled_functions | wc -l`
> > +if [ $cnt -ne $((ocnt + 2)) ]; then
> > + exit_fail
> > +fi
> > +
> > +# Cleanup Test 1
> > +echo 0 > events/test/list_entry/enable
> > +echo "-:test/list_entry" >> dynamic_events
> > +! grep -q "test/list_entry" dynamic_events
> > +
> > +# Count should return to baseline
> > +cnt=`cat enabled_functions | wc -l`
> > +if [ $cnt -ne $ocnt ]; then
> > + exit_fail
> > +fi
> > +
> > +# Test 2: List with explicit :entry suffix
> > +# (Should behave exactly like Test 1)
> > +echo "f:test/list_entry_exp $PLACE,!$PLACE2,$PLACE3:entry" >> dynamic_events
> > +grep -q "test/list_entry_exp" dynamic_events
> > +test -d events/test/list_entry_exp
> > +
> > +echo 1 > events/test/list_entry_exp/enable
> > +
> > +grep -q "$PLACE" enabled_functions
> > +grep -q "$PLACE3" enabled_functions
> > +! grep -q "$PLACE2" enabled_functions
> > +
> > +cnt=`cat enabled_functions | wc -l`
> > +if [ $cnt -ne $((ocnt + 2)) ]; then
> > + exit_fail
> > +fi
> > +
> > +# Cleanup Test 2
> > +echo 0 > events/test/list_entry_exp/enable
> > +echo "-:test/list_entry_exp" >> dynamic_events
> > +
> > +# Test 3: List with :exit suffix
> > +echo "f:test/list_exit $PLACE,!$PLACE2,$PLACE3:exit" >> dynamic_events
> > +grep -q "test/list_exit" dynamic_events
> > +test -d events/test/list_exit
> > +
> > +echo 1 > events/test/list_exit/enable
> > +
> > +# Even for return probes, enabled_functions lists the attached symbols
> > +grep -q "$PLACE" enabled_functions
> > +grep -q "$PLACE3" enabled_functions
> > +! grep -q "$PLACE2" enabled_functions
> > +
> > +cnt=`cat enabled_functions | wc -l`
> > +if [ $cnt -ne $((ocnt + 2)) ]; then
> > + exit_fail
> > +fi
> > +
> > +# Cleanup Test 3
> > +echo 0 > events/test/list_exit/enable
> > +echo "-:test/list_exit" >> dynamic_events
> > +
> > +clear_trace
> > --
> > 2.43.0
> >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: NULL pointer dereference when booting ppc64_guest_defconfig in QEMU on -next
From: Mathieu Desnoyers @ 2026-04-02 15:30 UTC (permalink / raw)
To: Andrew Morton, Ritesh Harjani (IBM)
Cc: Harry Yoo (Oracle), linuxppc-dev, Harry Yoo, Nathan Chancellor,
Thomas Weißschuh, Michal Clapinski, Thomas Gleixner,
Steven Rostedt, Masami Hiramatsu, linux-mm, linux-trace-kernel,
linux-kernel, Srikar Dronamraju, Madhavan Srinivasan
In-Reply-To: <20260320192153.759d6fec57f04fb653a0dac7@linux-foundation.org>
On 2026-03-20 22:21, Andrew Morton wrote:
> On Sat, 21 Mar 2026 06:42:41 +0530 Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
>> Looks like this is causing regressions in linux-next with warnings
>> similar to what Harry also pointed out. Do we have any solution for
>> this, or are we planning to hold on to this patch[1] and maybe even
>> remove it temporarily from linux-next, until this is fixed?
>
> Yes, I'll disable this patchset.
Hi Andrew,
I have prepared fixes for this issue. On which branch should I rebase
them? Do you still have the HPCC series in your branch, or should I
send it anew?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
^ permalink raw reply
* Re: [GIT PULL] RTLA changes for v7.1
From: Tomas Glozar @ 2026-04-02 15:08 UTC (permalink / raw)
To: Steven Rostedt
Cc: Costa Shulyupin, Wander Lairson Costa, LKML, linux-trace-kernel
In-Reply-To: <20260329122202.65a8b575@robin>
On Sun, 29 Mar 2026 at 18:22, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> That should probably be fixed on top of v7.0-rcX so that it is not
> broken in 7.0.
>
> -- Steve
>
After merging the fix for 7.0 [1], there's now a context difference
caused by commit ea06305ff9920 (tools/rtla: Remove unneeded nr_cpus
arguments) on merging rtla-v7.1 onto the current master. The context
difference merges cleanly via three-way merge:
$ git merge rtla-v7.1
Auto-merging tools/tracing/rtla/src/timerlat_bpf.h
Merge made by the 'ort' strategy.
...
Would you prefer that I rebase this PR on top of 7.0-rc6 once it's tagged,
or leave the pull request as is and perhaps add a note to your PR
to Linus that the merge difference is expected?
[1] https://lore.kernel.org/all/177490453553.1933951.12021005257041359513.pr-tracker-bot@kernel.org/
Tomas
^ permalink raw reply
* Re: [PATCH v9 2/3] tracing: Remove the backup instance automatically after read
From: Steven Rostedt @ 2026-04-02 14:52 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel
In-Reply-To: <20260402221943.e0ba663a6a223f7f857adaf1@kernel.org>
On Thu, 2 Apr 2026 22:19:43 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> >
> > rmdir() doesn't use __trace_array_get(), it uses trace_array_find() which
> > we shouldn't need to modify.
> >
> Oops, OK it must be updated too.
No it doesn't. Use trace_array_destroy() (as mentioned below) and all will
be fine.
-- Steve
> > > >
> > > > What would prevent this is to use trace_array_destroy(), which checks
> > > > this and also adds the proper locking:
> > > >
> > > > static void trace_array_autoremove(struct work_struct *work)
> > > > {
> > > > struct trace_array *tr = container_of(work, struct trace_array, autoremove_work);
> > > >
> > > > trace_array_destroy(tr);
> > > > }
> > >
> > > OK, let's use it.
> >
> > Yes, by using trace_array_destroy(), it will fix this.
> >
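The pattern discussed above (re-validate a pointer under the lock before removing it) can be sketched in userspace as follows. This is only a minimal model under stated assumptions: `struct instance`, `registry_add()` and `registry_destroy()` are hypothetical names, the singly linked list stands in for the list of trace arrays, and a pthread mutex stands in for trace_types_lock. It is not the kernel implementation.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct trace_array. */
struct instance {
	struct instance *next;
	char name[32];
};

/* Hypothetical stand-ins for the instance list and its lock. */
static struct instance *registry_head;
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a new instance; returns NULL on allocation failure. */
struct instance *registry_add(const char *name)
{
	struct instance *inst = calloc(1, sizeof(*inst));

	if (!inst)
		return NULL;
	strncpy(inst->name, name, sizeof(inst->name) - 1);

	pthread_mutex_lock(&registry_lock);
	inst->next = registry_head;
	registry_head = inst;
	pthread_mutex_unlock(&registry_lock);
	return inst;
}

/*
 * Destroy by pointer. The pointer is only unlinked and freed if it is
 * still found on the list while the lock is held; a caller holding a
 * pointer that is not (or no longer) registered gets -ENODEV.
 */
int registry_destroy(struct instance *target)
{
	struct instance **link;
	int ret = -ENODEV;

	if (!target)
		return -EINVAL;

	pthread_mutex_lock(&registry_lock);
	for (link = &registry_head; *link; link = &(*link)->next) {
		if (*link == target) {
			*link = target->next;
			free(target);
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&registry_lock);
	return ret;
}
```

In this model a deferred remover that loses the race with a by-name removal sees -ENODEV rather than operating on an instance that is already gone.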
^ permalink raw reply
* Re: [PATCH v2 0/2] Fix trace remotes read with an offline CPU
From: Marc Zyngier @ 2026-04-02 13:37 UTC (permalink / raw)
To: rostedt, mhiramat, mathieu.desnoyers, linux-trace-kernel,
Vincent Donnefort
Cc: kernel-team, linux-kernel
In-Reply-To: <20260401045100.3394299-1-vdonnefort@google.com>
On Wed, 01 Apr 2026 05:50:58 +0100, Vincent Donnefort wrote:
> This small series is fixing non-consuming read of a trace remote when the
> trace_buffer is created after a CPU is offline.
>
> It also extends hotplug testing coverage to include this test case.
>
> I have based this series on top of kvmarm/next which contains the hypervisor
> tracing patches.
>
> [...]
Applied to next, thanks!
[1/2] tracing: Non-consuming read for trace remotes with an offline CPU
commit: ce47b798ed1e44a6ae2c2966cdf7cba6b428083e
[2/2] tracing: selftests: Extend hotplug testing for trace remotes
commit: ec07906bdc52848bd7dc93d1d44e642dcdc7a15a
Cheers,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Yafang Shao @ 2026-04-02 13:20 UTC (permalink / raw)
To: Menglong Dong
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <2261072.irdbgypaU6@7950hx>
On Thu, Apr 2, 2026 at 8:48 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2026/4/2 17:26, Yafang Shao wrote:
> > Introduce the ability for kprobes to override the return values of
> > functions that have been livepatched. This functionality is guarded by the
> > CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.
>
> Hi, Yafang. This is an interesting idea.
>
> For now, bpf_override_return() can only be used on kernel
> functions that allow error injection, to prevent the BPF program from
> crashing the kernel. If we use it on kernel functions that are patched
> by KLP, we can easily crash the kernel by returning an invalid value
> with bpf_override_return(), right? (Of course, we can crash the kernel
> easily with KLP too ;)
Right.
Livepatch already grants the power to modify the kernel at will;
allowing BPF to override a patched function simply adds a layer of
runtime programmability to an existing modification.
>
> I haven't figured out the use case yet. Can KLP be used together with
> a BPF program that uses bpf_override_return()?
The two mechanisms do not target the same entry point: while KLP
modifies the original kernel function, bpf_override_return() is
applied to the newly patched function provided by the KLP module.
> The KLP will modify
> the RIP on the stack, and the bpf_override_return() will modify it too.
> AFAIK, there can't be two ftrace_ops that both have the
> FTRACE_OPS_FL_IPMODIFY flag. Did I miss something?
Correct, but as noted, they target different functions.
>
> It will be helpful for me to understand the use case if a selftest is
> offered :)
Here is a recent use case from our production environment.
- The livepatch
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e378bbe5705f..047e937bfa6d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5175,12 +5175,22 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	return ret;
 }
+/* noclone to avoid bond_get_slave_hook.constprop.0 */
+__attribute__((__noclone__, __noinline__))
+int bond_get_slave_hook(struct sk_buff *skb, u32 hash, unsigned int count)
+{
+	return -1;
+}
 static struct slave *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
 						  struct sk_buff *skb,
 						  struct bond_up_slave *slaves)
 {
 	struct slave *slave;
 	unsigned int count;
+	int slave_idx;
 	u32 hash;
 	hash = bond_xmit_hash(bond, skb);
@@ -5188,6 +5198,13 @@ static struct slave *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
 	if (unlikely(!count))
 		return NULL;
+	/* Try BPF hook first - returns slave index directly */
+	slave_idx = bond_get_slave_hook(skb, hash, count);
+	/* If BPF hook returned valid slave index, use it */
+	if (slave_idx >= 0 && slave_idx < count) {
+		slave = slaves->arr[slave_idx];
+		return slave;
+	}
 	slave = slaves->arr[hash % count];
 	return slave;
 }
- The BPF program
SEC("kprobe/bond_get_slave_hook")
int BPF_KPROBE(slave_selector, struct sk_buff *skb, u32 hash, u32 count)
{
	unsigned short net_hdr_off;
	unsigned char *head;
	struct iphdr iph;
	int *slave_idx;
	__u32 daddr;
	__u16 proto = BPF_CORE_READ(skb, protocol);
	if (proto != bpf_htons(0x0800))
		return 0;
	head = BPF_CORE_READ(skb, head);
	net_hdr_off = BPF_CORE_READ(skb, network_header);
	if (bpf_probe_read_kernel(&iph, sizeof(iph), head + net_hdr_off) != 0)
		return 0;
	daddr = iph.daddr;
	slave_idx = bpf_map_lookup_elem(&ip_slave_map, &daddr);
	if (slave_idx) {
		int idx = *slave_idx;
		if (idx >= 0 && idx < (int)count)
			bpf_override_return(ctx, idx);
	}
	return 0;
}
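The selection logic that the livepatch and the BPF program implement together can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: `slave_hook_fn`, `select_slave_index()`, `default_slave_hook()` and `pin_to_index_two()` are hypothetical names, and a function pointer stands in for the return value that bpf_override_return() would supply. It shows the fallback rule only: a hook result is used when it is a valid index, otherwise the hash decides.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical hook type: returns a slave index, or a negative value
 * meaning "no opinion".
 */
typedef int (*slave_hook_fn)(uint32_t hash, unsigned int count);

/* Mirrors the stub in the livepatch, which always returns -1. */
int default_slave_hook(uint32_t hash, unsigned int count)
{
	(void)hash;
	(void)count;
	return -1;
}

/* Example policy (an assumption for illustration): always pick index 2. */
int pin_to_index_two(uint32_t hash, unsigned int count)
{
	(void)hash;
	(void)count;
	return 2;
}

/*
 * Returns the chosen index, or -1 when there are no slaves. An
 * out-of-range hook result is ignored and hash % count is used instead.
 */
int select_slave_index(slave_hook_fn hook, uint32_t hash, unsigned int count)
{
	int idx;

	if (count == 0)
		return -1;

	idx = hook ? hook(hash, count) : -1;
	if (idx >= 0 && (unsigned int)idx < count)
		return idx;

	return (int)(hash % count);
}
```

With the default hook and hash 7 over 3 slaves the model falls back to 7 % 3; with the pinning hook it picks index 2 only while at least 3 slaves exist.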
>
> BTW, if we allow the usage of bpf_override_return() on the KLP patched
> function, we should allow the usage of BPF_MODIFY_RETURN on this
> case too, right?
It's a possibility, but I haven't tested that specifically yet.
--
Regards
Yafang
^ permalink raw reply related
* Re: [PATCH v9 2/3] tracing: Remove the backup instance automatically after read
From: Masami Hiramatsu @ 2026-04-02 13:19 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel
In-Reply-To: <20260401104001.5461c5f0@gandalf.local.home>
On Wed, 1 Apr 2026 10:40:01 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 1 Apr 2026 12:19:57 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> > >
> > > CPU 0 CPU 1
> > > ----- -----
> > > open(trace_pipe);
> > > read(..);
> > > close(trace_pipe);
> > > kick the work queue to delete it....
> > > rmdir();
> > > [instance deleted]
> >
> > I thought this requires trace_types_lock, and after kicking the queue,
> > can rmdir() get the tr? (__trace_array_get() returns an error if
> > tr->free_on_close is set)
>
> rmdir() doesn't use __trace_array_get(), it uses trace_array_find() which
> we shouldn't need to modify.
>
> static int instance_rmdir(const char *name)
> {
> struct trace_array *tr;
>
> guard(mutex)(&event_mutex);
> guard(mutex)(&trace_types_lock);
>
> tr = trace_array_find(name);
> if (!tr)
> return -ENODEV;
>
> return __remove_instance(tr);
> }
Oops, OK it must be updated too.
Thanks,
>
> >
> > >
> > > __remove_instance();
> > >
> > > [ now the tr is freed, and the remove will crash!]
> > >
> > >
> > > What would prevent this is to use trace_array_destroy(), which checks
> > > this and also adds the proper locking:
> > >
> > > static void trace_array_autoremove(struct work_struct *work)
> > > {
> > > struct trace_array *tr = container_of(work, struct trace_array, autoremove_work);
> > >
> > > trace_array_destroy(tr);
> > > }
> >
> > OK, let's use it.
>
> Yes, by using trace_array_destroy(), it will fix this.
>
> Thanks,
>
> -- Steve
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [RFC PATCH 1/4] trace: Simplify kprobe overridable function check
From: Masami Hiramatsu @ 2026-04-02 13:13 UTC (permalink / raw)
To: Yafang Shao
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260402092607.96430-2-laoar.shao@gmail.com>
On Thu, 2 Apr 2026 17:26:04 +0800
Yafang Shao <laoar.shao@gmail.com> wrote:
> Simplify the logic for checking overridable kprobe functions by removing
> redundant code.
>
> No functional change.
NACK.
trace_kprobe must stay hidden inside trace_kprobe.c. It is not
designed to be exposed.
Thank you,
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> kernel/trace/bpf_trace.c | 13 ++++++---
> kernel/trace/trace_kprobe.c | 40 +++++----------------------
> kernel/trace/trace_probe.h | 54 ++++++++++++++++++++++++++-----------
> 3 files changed, 54 insertions(+), 53 deletions(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 0b040a417442..c901ace836cb 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1929,10 +1929,15 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
> * Kprobe override only works if they are on the function entry,
> * and only if they are on the opt-in list.
> */
> - if (prog->kprobe_override &&
> - (!trace_kprobe_on_func_entry(event->tp_event) ||
> - !trace_kprobe_error_injectable(event->tp_event)))
> - return -EINVAL;
> + if (prog->kprobe_override) {
> + struct trace_kprobe *tp = trace_kprobe_primary_from_call(event->tp_event);
> +
> + if (!tp)
> + return -EINVAL;
> + if (!trace_kprobe_on_func_entry(tp) ||
> + !trace_kprobe_error_injectable(tp))
> + return -EINVAL;
> + }
>
> mutex_lock(&bpf_event_mutex);
>
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index a5dbb72528e0..768702674a5c 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -53,17 +53,6 @@ static struct dyn_event_operations trace_kprobe_ops = {
> .match = trace_kprobe_match,
> };
>
> -/*
> - * Kprobe event core functions
> - */
> -struct trace_kprobe {
> - struct dyn_event devent;
> - struct kretprobe rp; /* Use rp.kp for kprobe use */
> - unsigned long __percpu *nhit;
> - const char *symbol; /* symbol name */
> - struct trace_probe tp;
> -};
> -
> static bool is_trace_kprobe(struct dyn_event *ev)
> {
> return ev->ops == &trace_kprobe_ops;
> @@ -212,33 +201,16 @@ unsigned long trace_kprobe_address(struct trace_kprobe *tk)
> return addr;
> }
>
> -static nokprobe_inline struct trace_kprobe *
> -trace_kprobe_primary_from_call(struct trace_event_call *call)
> -{
> - struct trace_probe *tp;
> -
> - tp = trace_probe_primary_from_call(call);
> - if (WARN_ON_ONCE(!tp))
> - return NULL;
> -
> - return container_of(tp, struct trace_kprobe, tp);
> -}
> -
> -bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> +bool trace_kprobe_on_func_entry(struct trace_kprobe *tp)
> {
> - struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
> -
> - return tk ? (kprobe_on_func_entry(tk->rp.kp.addr,
> - tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
> - tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0) : false;
> + return !kprobe_on_func_entry(tp->rp.kp.addr,
> + tp->rp.kp.addr ? NULL : tp->rp.kp.symbol_name,
> + tp->rp.kp.addr ? 0 : tp->rp.kp.offset);
> }
>
> -bool trace_kprobe_error_injectable(struct trace_event_call *call)
> +bool trace_kprobe_error_injectable(struct trace_kprobe *tp)
> {
> - struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
> -
> - return tk ? within_error_injection_list(trace_kprobe_address(tk)) :
> - false;
> + return within_error_injection_list(trace_kprobe_address(tp));
> }
>
> static int register_kprobe_event(struct trace_kprobe *tk);
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 9fc56c937130..958eb78a9068 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -30,6 +30,7 @@
>
> #include "trace.h"
> #include "trace_output.h"
> +#include "trace_dynevent.h"
>
> #define MAX_TRACE_ARGS 128
> #define MAX_ARGSTR_LEN 63
> @@ -210,21 +211,6 @@ DECLARE_BASIC_PRINT_TYPE_FUNC(symbol);
> #define ASSIGN_FETCH_TYPE_END {}
> #define MAX_ARRAY_LEN 64
>
> -#ifdef CONFIG_KPROBE_EVENTS
> -bool trace_kprobe_on_func_entry(struct trace_event_call *call);
> -bool trace_kprobe_error_injectable(struct trace_event_call *call);
> -#else
> -static inline bool trace_kprobe_on_func_entry(struct trace_event_call *call)
> -{
> - return false;
> -}
> -
> -static inline bool trace_kprobe_error_injectable(struct trace_event_call *call)
> -{
> - return false;
> -}
> -#endif /* CONFIG_KPROBE_EVENTS */
> -
> struct probe_arg {
> struct fetch_insn *code;
> bool dynamic;/* Dynamic array (string) is used */
> @@ -271,6 +257,32 @@ struct event_file_link {
> struct list_head list;
> };
>
> +/*
> + * Kprobe event core functions
> + */
> +struct trace_kprobe {
> + struct dyn_event devent;
> + struct kretprobe rp; /* Use rp.kp for kprobe use */
> + unsigned long __percpu *nhit;
> + const char *symbol; /* symbol name */
> + struct trace_probe tp;
> +};
> +
> +#ifdef CONFIG_KPROBE_EVENTS
> +bool trace_kprobe_on_func_entry(struct trace_kprobe *tp);
> +bool trace_kprobe_error_injectable(struct trace_kprobe *tp);
> +#else
> +static inline bool trace_kprobe_on_func_entry(struct trace_kprobe *tp)
> +{
> + return false;
> +}
> +
> +static inline bool trace_kprobe_error_injectable(struct trace_kprobe *tp)
> +{
> + return false;
> +}
> +#endif /* CONFIG_KPROBE_EVENTS */
> +
> static inline unsigned int trace_probe_load_flag(struct trace_probe *tp)
> {
> return smp_load_acquire(&tp->event->flags);
> @@ -329,6 +341,18 @@ trace_probe_primary_from_call(struct trace_event_call *call)
> return list_first_entry_or_null(&tpe->probes, struct trace_probe, list);
> }
>
> +static nokprobe_inline struct trace_kprobe *
> +trace_kprobe_primary_from_call(struct trace_event_call *call)
> +{
> + struct trace_probe *tp;
> +
> + tp = trace_probe_primary_from_call(call);
> + if (WARN_ON_ONCE(!tp))
> + return NULL;
> +
> + return container_of(tp, struct trace_kprobe, tp);
> +}
> +
> static inline struct list_head *trace_probe_probe_list(struct trace_probe *tp)
> {
> return &tp->event->probes;
> --
> 2.47.3
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [RFC PATCH 2/4] trace: Allow kprobes to override livepatched functions
From: Menglong Dong @ 2026-04-02 12:48 UTC (permalink / raw)
To: Yafang Shao
Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
mathieu.desnoyers, kpsingh, mattbobrowski, song, jolsa, ast,
daniel, andrii, martin.lau, eddyz87, memxor, yonghong.song,
Yafang Shao, live-patching, linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <20260402092607.96430-3-laoar.shao@gmail.com>
On 2026/4/2 17:26, Yafang Shao wrote:
> Introduce the ability for kprobes to override the return values of
> functions that have been livepatched. This functionality is guarded by the
> CONFIG_KPROBE_OVERRIDE_KLP_FUNC configuration option.
Hi, Yafang. This is an interesting idea.
For now, bpf_override_return() can only be used on kernel
functions that allow error injection, to prevent the BPF program from
crashing the kernel. If we use it on kernel functions that are patched
by KLP, we can easily crash the kernel by returning an invalid value
with bpf_override_return(), right? (Of course, we can crash the kernel
easily with KLP too ;)
I haven't figured out the use case yet. Can KLP be used together with
a BPF program that uses bpf_override_return()? The KLP will modify
the RIP on the stack, and the bpf_override_return() will modify it too.
AFAIK, there can't be two ftrace_ops that both have the
FTRACE_OPS_FL_IPMODIFY flag. Did I miss something?
It will be helpful for me to understand the use case if a selftest is
offered :)
BTW, if we allow the usage of bpf_override_return() on the KLP patched
function, we should allow the usage of BPF_MODIFY_RETURN on this
case too, right?
Thanks!
Menglong Dong
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> kernel/trace/Kconfig | 14 ++++++++++++++
> kernel/trace/bpf_trace.c | 3 ++-
> kernel/trace/trace_kprobe.c | 17 +++++++++++++++++
> kernel/trace/trace_probe.h | 5 +++++
> 4 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 49de13cae428..db712c8cb745 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -1279,6 +1279,20 @@ config HIST_TRIGGERS_DEBUG
>
> If unsure, say N.
>
> +config KPROBE_OVERRIDE_KLP_FUNC
> + bool "Allow kprobes to override livepatched functions"
> + depends on KPROBES && LIVEPATCH
> + help
> + This option allows BPF programs to use kprobes to override functions
> + that have already been patched by Livepatch (KLP).
> +
> + Enabling this provides a mechanism to dynamically control execution
> + flow without requiring a reboot or a new livepatch module. It
> + effectively combines the persistence of livepatching with the
> + programmability of BPF.
> +
> + If unsure, say N.
> +
> source "kernel/trace/rv/Kconfig"
>
> endif # FTRACE
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index c901ace836cb..08ae2b1a912c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1935,7 +1935,8 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
> if (!tp)
> return -EINVAL;
> if (!trace_kprobe_on_func_entry(tp) ||
> - !trace_kprobe_error_injectable(tp))
> + (!trace_kprobe_error_injectable(tp) &&
> + !trace_kprobe_klp_func_overridable(tp)))
> return -EINVAL;
> }
>
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 768702674a5c..6f05451fbc76 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -213,6 +213,23 @@ bool trace_kprobe_error_injectable(struct trace_kprobe *tp)
> return within_error_injection_list(trace_kprobe_address(tp));
> }
>
> +bool trace_kprobe_klp_func_overridable(struct trace_kprobe *tp)
> +{
> + bool overridable = false;
> +#ifdef CONFIG_KPROBE_OVERRIDE_KLP_FUNC
> + struct module *mod;
> + unsigned long addr;
> +
> + addr = trace_kprobe_address(tp);
> + rcu_read_lock();
> + mod = __module_address(addr);
> + if (mod && mod->klp)
> + overridable = true;
> + rcu_read_unlock();
> +#endif
> + return overridable;
> +}
> +
> static int register_kprobe_event(struct trace_kprobe *tk);
> static int unregister_kprobe_event(struct trace_kprobe *tk);
>
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 958eb78a9068..84bd2617db7c 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -271,6 +271,7 @@ struct trace_kprobe {
> #ifdef CONFIG_KPROBE_EVENTS
> bool trace_kprobe_on_func_entry(struct trace_kprobe *tp);
> bool trace_kprobe_error_injectable(struct trace_kprobe *tp);
> +bool trace_kprobe_klp_func_overridable(struct trace_kprobe *tp);
> #else
> static inline bool trace_kprobe_on_func_entry(struct trace_kprobe *tp)
> {
> @@ -281,6 +282,10 @@ static inline bool trace_kprobe_error_injectable(struct trace_kprobe *tp)
> {
> return false;
> }
> +static inline bool trace_kprobe_klp_func_overridable(struct trace_kprobe *tp)
> +{
> + return false;
> +}
> #endif /* CONFIG_KPROBE_EVENTS */
>
> static inline unsigned int trace_probe_load_flag(struct trace_probe *tp)
>
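The attach-time condition in the quoted bpf_trace.c hunk can be restated as a small predicate. This is only a sketch of the boolean logic as it reads in the patch: `override_attach_check()` and its three parameters are hypothetical names, each flag standing in for the result of the corresponding trace_kprobe_*() helper.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical restatement of the check: an overriding program is
 * accepted only at function entry, and only if the target is either on
 * the error-injection list or lives in a livepatch module.
 */
int override_attach_check(bool on_func_entry, bool error_injectable,
			  bool in_klp_module)
{
	if (!on_func_entry)
		return -EINVAL;
	if (!error_injectable && !in_klp_module)
		return -EINVAL;
	return 0;
}
```

When the config option is disabled, the livepatch flag is always false in the quoted patch, so the predicate reduces to the existing entry-plus-error-injection rule.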
^ permalink raw reply
* Re: [PATCH bpf v3 1/2] bpf: Reject sleepable kprobe_multi programs at attach time
From: Jiri Olsa @ 2026-04-02 9:47 UTC (permalink / raw)
To: Varun R Mallya
Cc: bpf, ast, daniel, memxor, yonghong.song, rostedt, mhiramat,
linux-kernel, linux-trace-kernel
In-Reply-To: <20260401191126.440683-1-varunrmallya@gmail.com>
On Thu, Apr 02, 2026 at 12:41:25AM +0530, Varun R Mallya wrote:
> kprobe.multi programs run in atomic/RCU context and cannot sleep.
> However, bpf_kprobe_multi_link_attach() did not validate whether the
> program being attached had the sleepable flag set, allowing sleepable
> helpers such as bpf_copy_from_user() to be invoked from a non-sleepable
> context.
>
> This causes a "sleeping function called from invalid context" splat:
>
> BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:169
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1787, name: sudo
> preempt_count: 1, expected: 0
> RCU nest depth: 2, expected: 0
>
> Fix this by rejecting sleepable programs early in
> bpf_kprobe_multi_link_attach(), before any further processing.
>
> Fixes: 0dcac2725406 ("bpf: Add multi kprobe link")
> Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
nice catch!
Acked-by: Jiri Olsa <jolsa@kernel.org>
thanks,
jirka
> ---
> kernel/trace/bpf_trace.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 0b040a417442..af7079aa0f36 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2752,6 +2752,10 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
> if (!is_kprobe_multi(prog))
> return -EINVAL;
>
> + /* kprobe_multi is not allowed to be sleepable. */
> + if (prog->sleepable)
> + return -EINVAL;
> +
> /* Writing to context is not allowed for kprobes. */
> if (prog->aux->kprobe_write_ctx)
> return -EINVAL;
> --
> 2.53.0
>
^ permalink raw reply
* Re: [PATCH bpf v3 2/2] selftests/bpf: Add test to ensure kprobe_multi is not sleepable
From: Jiri Olsa @ 2026-04-02 9:46 UTC (permalink / raw)
To: Kumar Kartikeya Dwivedi
Cc: Varun R Mallya, bpf, ast, daniel, yonghong.song, rostedt,
mhiramat, linux-kernel, linux-trace-kernel
In-Reply-To: <CAP01T74cudrCFGAJhhNUWdCS+D1Gn5yFNccaS85YcoX8vdgzBQ@mail.gmail.com>
On Thu, Apr 02, 2026 at 12:50:10AM +0200, Kumar Kartikeya Dwivedi wrote:
> On Wed, 1 Apr 2026 at 21:11, Varun R Mallya <varunrmallya@gmail.com> wrote:
> >
> > Add a selftest to ensure that kprobe_multi programs cannot be attached
> > using the BPF_F_SLEEPABLE flag. This test succeeds when the kernel
> > rejects attachment of kprobe_multi when the BPF_F_SLEEPABLE flag is set.
> >
> > Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
> > ---
>
> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>
> > .../bpf/prog_tests/kprobe_multi_test.c | 41 +++++++++++++++++++
> > .../bpf/progs/kprobe_multi_sleepable.c | 13 ++++++
> > 2 files changed, 54 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/progs/kprobe_multi_sleepable.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
> > index 78c974d4ea33..f02fec2b6fda 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
> > @@ -10,6 +10,7 @@
> > #include "kprobe_multi_session_cookie.skel.h"
> > #include "kprobe_multi_verifier.skel.h"
> > #include "kprobe_write_ctx.skel.h"
> > +#include "kprobe_multi_sleepable.skel.h"
> > #include "bpf/libbpf_internal.h"
> > #include "bpf/hashmap.h"
> >
> > @@ -633,6 +634,44 @@ static void test_attach_write_ctx(void)
> > }
> > #endif
> >
> > +static void test_attach_multi_sleepable(void)
> > +{
> > + struct kprobe_multi_sleepable *skel;
> > + int err;
> > +
> > + skel = kprobe_multi_sleepable__open();
> > + if (!ASSERT_OK_PTR(skel, "kprobe_multi_sleepable__open"))
> > + return;
> > +
> > + err = bpf_program__set_flags(skel->progs.handle_kprobe_multi_sleepable,
> > + BPF_F_SLEEPABLE);
> > + if (!ASSERT_OK(err, "bpf_program__set_flags"))
> > + goto cleanup;
> > +
> > + /* Load should succeed even with BPF_F_SLEEPABLE for KPROBE types */
> > + err = kprobe_multi_sleepable__load(skel);
> > + if (!ASSERT_OK(err, "kprobe_multi_sleepable__load"))
> > + goto cleanup;
> > +
> > + /* Attachment must fail for kprobe.multi + BPF_F_SLEEPABLE.
> > + * Also chosen a stable symbol to send into opts
> > + */
> > + LIBBPF_OPTS(bpf_kprobe_multi_opts, opts);
> > + const char *sym = "vfs_read";
> > +
> > + opts.syms = &sym;
> > + opts.cnt = 1;
> > +
> > + skel->links.handle_kprobe_multi_sleepable =
> > + bpf_program__attach_kprobe_multi_opts(skel->progs.handle_kprobe_multi_sleepable,
> > + NULL, &opts);
> > + ASSERT_ERR_PTR(skel->links.handle_kprobe_multi_sleepable,
> > + "bpf_program__attach_kprobe_multi_opts");
>
> Nit: While vfs_read will likely remain stable, the check could
> probably be stronger to distinguish an attach error from -EINVAL?
> I added a typo to vfs_read and it still passed, because it failed to
> attach instead of getting rejected on unfixed kernel.
> May not be a big deal since vfs_read is unlikely to break.
> I verified it works by adding bpf_copy_from_user to the program and
> attaching to SYS_PREFIX sys_getpid and invoking the splat though, so
> LGTM otherwise.
why not use bpf_fentry_test2 ? you could also put it in pattern argument
and bypass opts completely (up to you)
also there's test_attach_api_fails test, please move it over there
thanks,
jirka
^ permalink raw reply