* [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
@ 2025-09-22 9:57 Menglong Dong
2025-09-22 14:08 ` Song Liu
2025-09-23 7:52 ` kernel test robot
0 siblings, 2 replies; 6+ messages in thread
From: Menglong Dong @ 2025-09-22 9:57 UTC (permalink / raw)
To: jolsa
Cc: kpsingh, mattbobrowski, song, ast, daniel, andrii, martin.lau,
eddyz87, yonghong.song, john.fastabend, sdf, haoluo, rostedt,
mhiramat, mathieu.desnoyers, bpf, linux-kernel,
linux-trace-kernel
The "data" in struct bpf_session_run_ctx is always 8-bytes aligned.
Therefore, we can store the "is_return" to the last bit of the "data",
which can make bpf_session_run_ctx 8-bytes aligned and save memory.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
kernel/trace/bpf_trace.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f2360579658e..b6a34aa039f6 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2269,7 +2269,6 @@ fs_initcall(bpf_event_init);
struct bpf_session_run_ctx {
struct bpf_run_ctx run_ctx;
- bool is_return;
void *data;
};
@@ -2535,8 +2534,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
{
struct bpf_kprobe_multi_run_ctx run_ctx = {
.session_ctx = {
- .is_return = is_return,
- .data = data,
+ .data = (void *)((unsigned long)data | !!is_return),
},
.link = link,
.entry_ip = entry_ip,
@@ -3058,8 +3056,7 @@ static int uprobe_prog_run(struct bpf_uprobe *uprobe,
struct bpf_uprobe_multi_link *link = uprobe->link;
struct bpf_uprobe_multi_run_ctx run_ctx = {
.session_ctx = {
- .is_return = is_return,
- .data = data,
+ .data = (void *)((unsigned long)data | !!is_return),
},
.entry_ip = entry_ip,
.uprobe = uprobe,
@@ -3316,7 +3313,7 @@ __bpf_kfunc bool bpf_session_is_return(void)
struct bpf_session_run_ctx *session_ctx;
session_ctx = container_of(current->bpf_ctx, struct bpf_session_run_ctx, run_ctx);
- return session_ctx->is_return;
+ return (unsigned long)session_ctx->data & 1;
}
__bpf_kfunc __u64 *bpf_session_cookie(void)
@@ -3324,7 +3321,7 @@ __bpf_kfunc __u64 *bpf_session_cookie(void)
struct bpf_session_run_ctx *session_ctx;
session_ctx = container_of(current->bpf_ctx, struct bpf_session_run_ctx, run_ctx);
- return session_ctx->data;
+ return (__u64 *)((unsigned long)session_ctx->data & ~1);
}
__bpf_kfunc_end_defs();
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
2025-09-22 9:57 [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx Menglong Dong
@ 2025-09-22 14:08 ` Song Liu
2025-09-22 14:10 ` Menglong Dong
2025-09-23 7:52 ` kernel test robot
1 sibling, 1 reply; 6+ messages in thread
From: Song Liu @ 2025-09-22 14:08 UTC (permalink / raw)
To: Menglong Dong
Cc: jolsa, kpsingh, mattbobrowski, ast, daniel, andrii, martin.lau,
eddyz87, yonghong.song, john.fastabend, sdf, haoluo, rostedt,
mhiramat, mathieu.desnoyers, bpf, linux-kernel,
linux-trace-kernel
On Mon, Sep 22, 2025 at 11:57 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> The "data" in struct bpf_session_run_ctx is always 8-bytes aligned.
> Therefore, we can store the "is_return" to the last bit of the "data",
> which can make bpf_session_run_ctx 8-bytes aligned and save memory.
Does this really save anything? AFAICT, bpf_session_run_ctx is
only allocated on the stack. Therefore, we don't save any memory
unless there is potential risk of stack overflow.
OTOH, this last-bit logic is confusing and error prone. I would argue
against this type of optimization.
Thanks,
Song
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
2025-09-22 14:08 ` Song Liu
@ 2025-09-22 14:10 ` Menglong Dong
2025-09-23 19:23 ` Alexei Starovoitov
0 siblings, 1 reply; 6+ messages in thread
From: Menglong Dong @ 2025-09-22 14:10 UTC (permalink / raw)
To: Song Liu
Cc: jolsa, kpsingh, mattbobrowski, ast, daniel, andrii, martin.lau,
eddyz87, yonghong.song, john.fastabend, sdf, haoluo, rostedt,
mhiramat, mathieu.desnoyers, bpf, linux-kernel,
linux-trace-kernel
On Mon, Sep 22, 2025 at 10:08 PM Song Liu <song@kernel.org> wrote:
>
> On Mon, Sep 22, 2025 at 11:57 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > The "data" in struct bpf_session_run_ctx is always 8-bytes aligned.
> > Therefore, we can store the "is_return" to the last bit of the "data",
> > which can make bpf_session_run_ctx 8-bytes aligned and save memory.
>
> Does this really save anything? AFAICT, bpf_session_run_ctx is
> only allocated on the stack. Therefore, we don't save any memory
> unless there is potential risk of stack overflow.
Hi, Song. My original intention is to save the usage of the
stack to prevent potential stack overflow, especially when we
trace all the kernel functions with kprobe-multi.
The most thing for me is that the unaligned field in the struct
looks very awkward, and it consumes 8-bytes only for a bit.
>
> OTOH, this last-bit logic is confusing and error prone. I would argue
> against this type of optimization.
Ah, you are right about this part. It does make the code more
confusing :/
Thanks!
Menglong Dong
>
> Thanks,
> Song
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
2025-09-22 9:57 [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx Menglong Dong
2025-09-22 14:08 ` Song Liu
@ 2025-09-23 7:52 ` kernel test robot
1 sibling, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-09-23 7:52 UTC (permalink / raw)
To: Menglong Dong, jolsa
Cc: oe-kbuild-all, kpsingh, mattbobrowski, song, ast, daniel, andrii,
martin.lau, eddyz87, yonghong.song, john.fastabend, sdf, haoluo,
rostedt, mhiramat, mathieu.desnoyers, bpf, linux-kernel,
linux-trace-kernel
Hi Menglong,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Menglong-Dong/bpf-remove-is_return-in-struct-bpf_session_run_ctx/20250922-175833
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20250922095705.252519-1-dongml2%40chinatelecom.cn
patch subject: [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
config: i386-randconfig-r113-20250923 (https://download.01.org/0day-ci/archive/20250923/202509231550.BWhLcP9e-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250923/202509231550.BWhLcP9e-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202509231550.BWhLcP9e-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
kernel/trace/bpf_trace.c:833:41: sparse: sparse: incorrect type in assignment (different address spaces) @@ expected void [noderef] __user *[addressable] [assigned] [usertype] sival_ptr @@ got void * @@
kernel/trace/bpf_trace.c:833:41: sparse: expected void [noderef] __user *[addressable] [assigned] [usertype] sival_ptr
kernel/trace/bpf_trace.c:833:41: sparse: got void *
>> kernel/trace/bpf_trace.c:3059:62: sparse: sparse: dubious: x | !y
kernel/trace/bpf_trace.c:3512:52: sparse: sparse: cast removes address space '__user' of expression
kernel/trace/bpf_trace.c:3526:56: sparse: sparse: cast removes address space '__user' of expression
kernel/trace/bpf_trace.c:3540:52: sparse: sparse: cast removes address space '__user' of expression
kernel/trace/bpf_trace.c:3547:56: sparse: sparse: cast removes address space '__user' of expression
kernel/trace/bpf_trace.c:3555:52: sparse: sparse: cast removes address space '__user' of expression
kernel/trace/bpf_trace.c:3563:56: sparse: sparse: cast removes address space '__user' of expression
kernel/trace/bpf_trace.c: note: in included file (through include/linux/rbtree.h, include/linux/mm_types.h, include/linux/mmzone.h, ...):
include/linux/rcupdate.h:869:9: sparse: sparse: context imbalance in 'uprobe_prog_run' - unexpected unlock
vim +3059 kernel/trace/bpf_trace.c
3050
3051 static int uprobe_prog_run(struct bpf_uprobe *uprobe,
3052 unsigned long entry_ip,
3053 struct pt_regs *regs,
3054 bool is_return, void *data)
3055 {
3056 struct bpf_uprobe_multi_link *link = uprobe->link;
3057 struct bpf_uprobe_multi_run_ctx run_ctx = {
3058 .session_ctx = {
> 3059 .data = (void *)((unsigned long)data | !!is_return),
3060 },
3061 .entry_ip = entry_ip,
3062 .uprobe = uprobe,
3063 };
3064 struct bpf_prog *prog = link->link.prog;
3065 bool sleepable = prog->sleepable;
3066 struct bpf_run_ctx *old_run_ctx;
3067 int err;
3068
3069 if (link->task && !same_thread_group(current, link->task))
3070 return 0;
3071
3072 if (sleepable)
3073 rcu_read_lock_trace();
3074 else
3075 rcu_read_lock();
3076
3077 migrate_disable();
3078
3079 old_run_ctx = bpf_set_run_ctx(&run_ctx.session_ctx.run_ctx);
3080 err = bpf_prog_run(link->link.prog, regs);
3081 bpf_reset_run_ctx(old_run_ctx);
3082
3083 migrate_enable();
3084
3085 if (sleepable)
3086 rcu_read_unlock_trace();
3087 else
3088 rcu_read_unlock();
3089 return err;
3090 }
3091
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
2025-09-22 14:10 ` Menglong Dong
@ 2025-09-23 19:23 ` Alexei Starovoitov
2025-09-24 2:03 ` menglong.dong
0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2025-09-23 19:23 UTC (permalink / raw)
To: Menglong Dong
Cc: Song Liu, Jiri Olsa, KP Singh, Matt Bobrowski, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, bpf, LKML,
linux-trace-kernel
On Mon, Sep 22, 2025 at 7:11 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> On Mon, Sep 22, 2025 at 10:08 PM Song Liu <song@kernel.org> wrote:
> >
> > On Mon, Sep 22, 2025 at 11:57 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > The "data" in struct bpf_session_run_ctx is always 8-bytes aligned.
> > > Therefore, we can store the "is_return" to the last bit of the "data",
> > > which can make bpf_session_run_ctx 8-bytes aligned and save memory.
> >
> > Does this really save anything? AFAICT, bpf_session_run_ctx is
> > only allocated on the stack. Therefore, we don't save any memory
> > unless there is potential risk of stack overflow.
>
> Hi, Song. My original intention is to save the usage of the
> stack to prevent potential stack overflow,
8 bytes won't matter, but wasting 8 bytes for 1 bit is indeed annoying.
> especially when we
> trace all the kernel functions with kprobe-multi.
What do you mean? kprobe-multi won't recurse,
so tracing all or a few functions is the same concern
from stack overflow pov, no ?
> The most thing for me is that the unaligned field in the struct
> looks very awkward, and it consumes 8-bytes only for a bit.
let's keep it as-is. If stack overflow is indeed an issue we need
a generic way to detect it and prevent it.
We've been thinking whether vmap stack guard pages
can become JIT's extable-like things, so when stack overflow
happens we unwind stack and stop bpf prog instead of panicing.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx
2025-09-23 19:23 ` Alexei Starovoitov
@ 2025-09-24 2:03 ` menglong.dong
0 siblings, 0 replies; 6+ messages in thread
From: menglong.dong @ 2025-09-24 2:03 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Song Liu, Jiri Olsa, KP Singh, Matt Bobrowski, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, bpf, LKML,
linux-trace-kernel
On 2025/9/24 03:23 Alexei Starovoitov <alexei.starovoitov@gmail.com> write:
> On Mon, Sep 22, 2025 at 7:11 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > On Mon, Sep 22, 2025 at 10:08 PM Song Liu <song@kernel.org> wrote:
> > >
> > > On Mon, Sep 22, 2025 at 11:57 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > The "data" in struct bpf_session_run_ctx is always 8-bytes aligned.
> > > > Therefore, we can store the "is_return" to the last bit of the "data",
> > > > which can make bpf_session_run_ctx 8-bytes aligned and save memory.
> > >
> > > Does this really save anything? AFAICT, bpf_session_run_ctx is
> > > only allocated on the stack. Therefore, we don't save any memory
> > > unless there is potential risk of stack overflow.
> >
> > Hi, Song. My original intention is to save the usage of the
> > stack to prevent potential stack overflow,
>
> 8 bytes won't matter, but wasting 8 bytes for 1 bit is indeed annoying.
>
> > especially when we
> > trace all the kernel functions with kprobe-multi.
>
> What do you mean? kprobe-multi won't recurse,
> so tracing all or a few functions is the same concern
> from stack overflow pov, no ?
You are right, I made something wrong. I mixed it with origin
call case of the bpf trampoline, which will store all the things
in the stack for every function call.
>
> > The most thing for me is that the unaligned field in the struct
> > looks very awkward, and it consumes 8-bytes only for a bit.
>
> let's keep it as-is. If stack overflow is indeed an issue we need
> a generic way to detect it and prevent it.
> We've been thinking whether vmap stack guard pages
> can become JIT's extable-like things, so when stack overflow
Interesting
> happens we unwind stack and stop bpf prog instead of panicing.
Yeah, I think it's OK to keep it still.
Thanks!
Menglong Dong
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-09-24 2:04 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-22 9:57 [PATCH bpf-next] bpf: remove is_return in struct bpf_session_run_ctx Menglong Dong
2025-09-22 14:08 ` Song Liu
2025-09-22 14:10 ` Menglong Dong
2025-09-23 19:23 ` Alexei Starovoitov
2025-09-24 2:03 ` menglong.dong
2025-09-23 7:52 ` kernel test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).