* bpf_probe_read_user EFAULT
@ 2022-12-27 14:56 Victor Laforet
2022-12-27 16:00 ` Jiri Olsa
0 siblings, 1 reply; 11+ messages in thread
From: Victor Laforet @ 2022-12-27 14:56 UTC (permalink / raw)
To: bpf
Hi all,
I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
I am seeking your help to try and make this code work.
Thank you!
My goal is to read the variable pid on every bpf event.
Here is a full example:
(cat /sys/kernel/debug/tracing/trace_pipe to read the output).
sched_switch.bpf.c
```
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
int *input_pid;
char _license[4] SEC("license") = "GPL";
SEC("tp_btf/sched_switch")
int handle_sched_switch(u64 *ctx)
{
int pid;
int err;
err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
if (err != 0)
{
bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
return 0;
}
bpf_printk("pid %d.\n", pid);
return 0;
}
```
sched_switch.c
```
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <bpf/libbpf.h>
#include "sched_switch.skel.h"
#include <time.h>
static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
{
return vfprintf(stderr, format, args);
}
int main(int argc, char **argv)
{
struct sched_switch_bpf *skel;
int err;
int pid = getpid();
libbpf_set_print(libbpf_print_fn);
skel = sched_switch_bpf__open();
if (!skel)
{
fprintf(stderr, "Failed to open BPF skeleton\n");
return 1;
}
skel->bss->input_pid = &pid;
err = sched_switch_bpf__load(skel);
if (err)
{
fprintf(stderr, "Failed to load and verify BPF skeleton\n");
goto cleanup;
}
err = sched_switch_bpf__attach(skel);
if (err)
{
fprintf(stderr, "Failed to attach BPF skeleton\n");
goto cleanup;
}
while (1);
cleanup:
sched_switch_bpf__destroy(skel);
return -err;
}
```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2022-12-27 14:56 bpf_probe_read_user EFAULT Victor Laforet
@ 2022-12-27 16:00 ` Jiri Olsa
2022-12-28 14:00 ` Victor Laforet
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2022-12-27 16:00 UTC (permalink / raw)
To: Victor Laforet; +Cc: bpf
On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
> Hi all,
>
> I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
>
> I am seeking your help to try and make this code work.
> Thank you!
>
> My goal is to read the variable pid on every bpf event.
> Here is a full example:
> (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
>
> sched_switch.bpf.c
> ```
> #include "vmlinux.h"
> #include <bpf/bpf_helpers.h>
>
> int *input_pid;
>
> char _license[4] SEC("license") = "GPL";
>
> SEC("tp_btf/sched_switch")
> int handle_sched_switch(u64 *ctx)
you might want to filter for your task, because sched_switch
tracepoint is called for any task scheduler switch
check BPF_PROG macro in bpf selftests on how to access tp_btf
arguments from context, for sched_switch it's:
TP_PROTO(bool preempt,
struct task_struct *prev,
struct task_struct *next,
unsigned int prev_state),
and call the read helper only for prev->pid == 'your app pid',
there's bpf_copy_from_user_task helper you could use to read
another task's user memory reliably, but it needs to be called
from sleepable probe and you need to have the task pointer
jirka
> {
> int pid;
> int err;
>
> err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
> if (err != 0)
> {
> bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
> return 0;
> }
>
> bpf_printk("pid %d.\n", pid);
> return 0;
> }
> ```
>
> sched_switch.c
> ```
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/resource.h>
> #include <bpf/libbpf.h>
> #include "sched_switch.skel.h"
> #include <time.h>
>
> static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> {
> return vfprintf(stderr, format, args);
> }
>
> int main(int argc, char **argv)
> {
> struct sched_switch_bpf *skel;
> int err;
> int pid = getpid();
>
> libbpf_set_print(libbpf_print_fn);
>
> skel = sched_switch_bpf__open();
> if (!skel)
> {
> fprintf(stderr, "Failed to open BPF skeleton\n");
> return 1;
> }
>
> skel->bss->input_pid = &pid;
>
> err = sched_switch_bpf__load(skel);
> if (err)
> {
> fprintf(stderr, "Failed to load and verify BPF skeleton\n");
> goto cleanup;
> }
>
> err = sched_switch_bpf__attach(skel);
> if (err)
> {
> fprintf(stderr, "Failed to attach BPF skeleton\n");
> goto cleanup;
> }
>
> while (1);
>
> cleanup:
> sched_switch_bpf__destroy(skel);
> return -err;
> }
> ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2022-12-27 16:00 ` Jiri Olsa
@ 2022-12-28 14:00 ` Victor Laforet
2022-12-28 19:41 ` Yonghong Song
0 siblings, 1 reply; 11+ messages in thread
From: Victor Laforet @ 2022-12-28 14:00 UTC (permalink / raw)
To: Jiri Olsa; +Cc: bpf
Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
I don’t really understand why bpf_probe_read_user would not work for this use case.
Victor
----- Mail original -----
De: "Jiri Olsa" <olsajiri@gmail.com>
À: "Victor Laforet" <victor.laforet@ip-paris.fr>
Cc: "bpf" <bpf@vger.kernel.org>
Envoyé: Mardi 27 Décembre 2022 17:00:42
Objet: Re: bpf_probe_read_user EFAULT
On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
> Hi all,
>
> I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
>
> I am seeking your help to try and make this code work.
> Thank you!
>
> My goal is to read the variable pid on every bpf event.
> Here is a full example:
> (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
>
> sched_switch.bpf.c
> ```
> #include "vmlinux.h"
> #include <bpf/bpf_helpers.h>
>
> int *input_pid;
>
> char _license[4] SEC("license") = "GPL";
>
> SEC("tp_btf/sched_switch")
> int handle_sched_switch(u64 *ctx)
you might want to filter for your task, because sched_switch
tracepoint is called for any task scheduler switch
check BPF_PROG macro in bpf selftests on how to access tp_btf
arguments from context, for sched_switch it's:
TP_PROTO(bool preempt,
struct task_struct *prev,
struct task_struct *next,
unsigned int prev_state),
and call the read helper only for prev->pid == 'your app pid',
there's bpf_copy_from_user_task helper you could use to read
another task's user memory reliably, but it needs to be called
from sleepable probe and you need to have the task pointer
jirka
> {
> int pid;
> int err;
>
> err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
> if (err != 0)
> {
> bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
> return 0;
> }
>
> bpf_printk("pid %d.\n", pid);
> return 0;
> }
> ```
>
> sched_switch.c
> ```
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/resource.h>
> #include <bpf/libbpf.h>
> #include "sched_switch.skel.h"
> #include <time.h>
>
> static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> {
> return vfprintf(stderr, format, args);
> }
>
> int main(int argc, char **argv)
> {
> struct sched_switch_bpf *skel;
> int err;
> int pid = getpid();
>
> libbpf_set_print(libbpf_print_fn);
>
> skel = sched_switch_bpf__open();
> if (!skel)
> {
> fprintf(stderr, "Failed to open BPF skeleton\n");
> return 1;
> }
>
> skel->bss->input_pid = &pid;
>
> err = sched_switch_bpf__load(skel);
> if (err)
> {
> fprintf(stderr, "Failed to load and verify BPF skeleton\n");
> goto cleanup;
> }
>
> err = sched_switch_bpf__attach(skel);
> if (err)
> {
> fprintf(stderr, "Failed to attach BPF skeleton\n");
> goto cleanup;
> }
>
> while (1);
>
> cleanup:
> sched_switch_bpf__destroy(skel);
> return -err;
> }
> ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2022-12-28 14:00 ` Victor Laforet
@ 2022-12-28 19:41 ` Yonghong Song
2023-01-02 22:10 ` Victor Laforet
[not found] ` <5692f180-5b78-48e0-b974-b60bd58c0839@Spark>
0 siblings, 2 replies; 11+ messages in thread
From: Yonghong Song @ 2022-12-28 19:41 UTC (permalink / raw)
To: Victor Laforet, Jiri Olsa; +Cc: bpf
On 12/28/22 6:00 AM, Victor Laforet wrote:
> Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
>
> Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
> I don’t really understand why bpf_probe_read_user would not work for this use case.
Right, bpf_copy_from_user_task() is better than bpf_probe_read_user().
You could also use bpf_copy_from_user() if you have target_pid checking.
It is possible that the user variable you intended to access is not in
memory. In such cases, bpf_probe_read_user() will return EFAULT. But
bpf_copy_from_user() and bpf_copy_from_user_task() will go through
page fault process to bring the variable to the memory.
Also because of this extra work, bpf_copy_from_user() and
bpf_copy_from_user_task() only work for sleepable programs.
>
> Victor
>
> ----- Mail original -----
> De: "Jiri Olsa" <olsajiri@gmail.com>
> À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> Cc: "bpf" <bpf@vger.kernel.org>
> Envoyé: Mardi 27 Décembre 2022 17:00:42
> Objet: Re: bpf_probe_read_user EFAULT
>
> On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
>> Hi all,
>>
>> I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
>>
>> I am seeking your help to try and make this code work.
>> Thank you!
>>
>> My goal is to read the variable pid on every bpf event.
>> Here is a full example:
>> (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
>>
>> sched_switch.bpf.c
>> ```
>> #include "vmlinux.h"
>> #include <bpf/bpf_helpers.h>
>>
>> int *input_pid;
>>
>> char _license[4] SEC("license") = "GPL";
>>
>> SEC("tp_btf/sched_switch")
>> int handle_sched_switch(u64 *ctx)
>
> you might want to filter for your task, because sched_switch
> tracepoint is called for any task scheduler switch
>
> check BPF_PROG macro in bpf selftests on how to access tp_btf
> arguments from context, for sched_switch it's:
>
> TP_PROTO(bool preempt,
> struct task_struct *prev,
> struct task_struct *next,
> unsigned int prev_state),
>
> and call the read helper only for prev->pid == 'your app pid',
>
> there's bpf_copy_from_user_task helper you could use to read
> another task's user memory reliably, but it needs to be called
> from sleepable probe and you need to have the task pointer
>
> jirka
>
>> {
>> int pid;
>> int err;
>>
>> err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
>> if (err != 0)
>> {
>> bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
>> return 0;
>> }
>>
>> bpf_printk("pid %d.\n", pid);
>> return 0;
>> }
>> ```
>>
>> sched_switch.c
>> ```
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <sys/resource.h>
>> #include <bpf/libbpf.h>
>> #include "sched_switch.skel.h"
>> #include <time.h>
>>
>> static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
>> {
>> return vfprintf(stderr, format, args);
>> }
>>
>> int main(int argc, char **argv)
>> {
>> struct sched_switch_bpf *skel;
>> int err;
>> int pid = getpid();
>>
>> libbpf_set_print(libbpf_print_fn);
>>
>> skel = sched_switch_bpf__open();
>> if (!skel)
>> {
>> fprintf(stderr, "Failed to open BPF skeleton\n");
>> return 1;
>> }
>>
>> skel->bss->input_pid = &pid;
>>
>> err = sched_switch_bpf__load(skel);
>> if (err)
>> {
>> fprintf(stderr, "Failed to load and verify BPF skeleton\n");
>> goto cleanup;
>> }
>>
>> err = sched_switch_bpf__attach(skel);
>> if (err)
>> {
>> fprintf(stderr, "Failed to attach BPF skeleton\n");
>> goto cleanup;
>> }
>>
>> while (1);
>>
>> cleanup:
>> sched_switch_bpf__destroy(skel);
>> return -err;
>> }
>> ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2022-12-28 19:41 ` Yonghong Song
@ 2023-01-02 22:10 ` Victor Laforet
[not found] ` <5692f180-5b78-48e0-b974-b60bd58c0839@Spark>
1 sibling, 0 replies; 11+ messages in thread
From: Victor Laforet @ 2023-01-02 22:10 UTC (permalink / raw)
To: Yonghong Song; +Cc: Jiri Olsa, bpf
Thanks!
I have tried to use bpf_copy_from_user_task() in place of bpf_probe_read_user() however I cannot seem to run my program. It fails with 'unknown func bpf_copy_from_user_task’.
If I understood correctly, this function should be in ‘bpf/bpf_helpers.h’?
Another quick question:
I have set the bpf program as sleepable using ‘ bpf_program__set_flags(skel, BPF_F_SLEEPABLE);'
I couldn’t find any other way to do that. Is it the right way to set it sleepable?
Victor
De: "Yonghong Song" <yhs@meta.com>
À: "Victor Laforet" <victor.laforet@ip-paris.fr>, "Jiri Olsa" <olsajiri@gmail.com>
Cc: "bpf" <bpf@vger.kernel.org>
Envoyé: Mercredi 28 Décembre 2022 20:41:33
Objet: Re: bpf_probe_read_user EFAULT
On 12/28/22 6:00 AM, Victor Laforet wrote:
> Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
>
> Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
> I don’t really understand why bpf_probe_read_user would not work for this use case.
Right, bpf_copy_from_user_task() is better than bpf_probe_read_user().
You could also use bpf_copy_from_user() if you have target_pid checking.
It is possible that the user variable you intended to access is not in
memory. In such cases, bpf_probe_read_user() will return EFAULT. But
bpf_copy_from_user() and bpf_copy_from_user_task() will go through
page fault process to bring the variable to the memory.
Also because of this extra work, bpf_copy_from_user() and
bpf_copy_from_user_task() only work for sleepable programs.
>
> Victor
>
> ----- Mail original -----
> De: "Jiri Olsa" <olsajiri@gmail.com>
> À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> Cc: "bpf" <bpf@vger.kernel.org>
> Envoyé: Mardi 27 Décembre 2022 17:00:42
> Objet: Re: bpf_probe_read_user EFAULT
>
> On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
>> Hi all,
>>
>> I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
>>
>> I am seeking your help to try and make this code work.
>> Thank you!
>>
>> My goal is to read the variable pid on every bpf event.
>> Here is a full example:
>> (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
>>
>> sched_switch.bpf.c
>> ```
>> #include "vmlinux.h"
>> #include <bpf/bpf_helpers.h>
>>
>> int *input_pid;
>>
>> char _license[4] SEC("license") = "GPL";
>>
>> SEC("tp_btf/sched_switch")
>> int handle_sched_switch(u64 *ctx)
>
> you might want to filter for your task, because sched_switch
> tracepoint is called for any task scheduler switch
>
> check BPF_PROG macro in bpf selftests on how to access tp_btf
> arguments from context, for sched_switch it's:
>
> TP_PROTO(bool preempt,
> struct task_struct *prev,
> struct task_struct *next,
> unsigned int prev_state),
>
> and call the read helper only for prev->pid == 'your app pid',
>
> there's bpf_copy_from_user_task helper you could use to read
> another task's user memory reliably, but it needs to be called
> from sleepable probe and you need to have the task pointer
>
> jirka
>
>> {
>> int pid;
>> int err;
>>
>> err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
>> if (err != 0)
>> {
>> bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
>> return 0;
>> }
>>
>> bpf_printk("pid %d.\n", pid);
>> return 0;
>> }
>> ```
>>
>> sched_switch.c
>> ```
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <sys/resource.h>
>> #include <bpf/libbpf.h>
>> #include "sched_switch.skel.h"
>> #include <time.h>
>>
>> static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
>> {
>> return vfprintf(stderr, format, args);
>> }
>>
>> int main(int argc, char **argv)
>> {
>> struct sched_switch_bpf *skel;
>> int err;
>> int pid = getpid();
>>
>> libbpf_set_print(libbpf_print_fn);
>>
>> skel = sched_switch_bpf__open();
>> if (!skel)
>> {
>> fprintf(stderr, "Failed to open BPF skeleton\n");
>> return 1;
>> }
>>
>> skel->bss->input_pid = &pid;
>>
>> err = sched_switch_bpf__load(skel);
>> if (err)
>> {
>> fprintf(stderr, "Failed to load and verify BPF skeleton\n");
>> goto cleanup;
>> }
>>
>> err = sched_switch_bpf__attach(skel);
>> if (err)
>> {
>> fprintf(stderr, "Failed to attach BPF skeleton\n");
>> goto cleanup;
>> }
>>
>> while (1);
>>
>> cleanup:
>> sched_switch_bpf__destroy(skel);
>> return -err;
>> }
>> ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
[not found] ` <5692f180-5b78-48e0-b974-b60bd58c0839@Spark>
@ 2023-01-03 8:03 ` Jiri Olsa
2023-01-04 15:24 ` Victor Laforet
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-01-03 8:03 UTC (permalink / raw)
To: Victor Laforet; +Cc: Jiri Olsa, Yonghong Song, bpf
On Mon, Jan 02, 2023 at 11:07:50PM +0100, Victor Laforet wrote:
> Thanks!
>
> I have tried to use bpf_copy_from_user_task() in place of bpf_probe_read_user() however I cannot seem to run my program. It fails with 'unknown func bpf_copy_from_user_task’.
> If I understood correctly, this function should be in ‘bpf/bpf_helpers.h’?
the declaration is in bpf_helper_defs.h, which is included by
bpf_helpers.h, so you need to #include it
>
> Another quick question:
> I have set the bpf program as sleepable using ‘ bpf_program__set_flags(skel, BPF_F_SLEEPABLE);'
> I couldn’t find any other way to do that. Is it the right way to set it sleepable?
should work, but you could specify that directly in the program
section name, like SEC("fentry.s/...")
and it's just certain program types that can sleep:
[jolsa@krava bpf]$ grep SEC_SLEEPABLE libbpf.c
...
SEC_DEF("uprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
SEC_DEF("uretprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
SEC_DEF("iter.s+", TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
SEC_DEF("syscall", SYSCALL, 0, SEC_SLEEPABLE),
jirka
>
> Victor
> On 28 Dec 2022 at 20:41 +0100, Yonghong Song <yhs@meta.com>, wrote:
> >
> >
> > On 12/28/22 6:00 AM, Victor Laforet wrote:
> > > Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
> > >
> > > Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
> > > I don’t really understand why bpf_probe_read_user would not work for this use case.
> >
> > Right, bpf_copy_from_user_task() is better than bpf_probe_read_user().
> > You could also use bpf_copy_from_user() if you have target_pid checking.
> >
> > It is possible that the user variable you intended to access is not in
> > memory. In such cases, bpf_probe_read_user() will return EFAULT. But
> > bpf_copy_from_user() and bpf_copy_from_user_task() will go through
> > page fault process to bring the variable to the memory.
> > Also because of this extra work, bpf_copy_from_user() and
> > bpf_copy_from_user_task() only work for sleepable programs.
> >
> > >
> > > Victor
> > >
> > > ----- Mail original -----
> > > De: "Jiri Olsa" <olsajiri@gmail.com>
> > > À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> > > Cc: "bpf" <bpf@vger.kernel.org>
> > > Envoyé: Mardi 27 Décembre 2022 17:00:42
> > > Objet: Re: bpf_probe_read_user EFAULT
> > >
> > > On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
> > > > Hi all,
> > > >
> > > > I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
> > > >
> > > > I am seeking your help to try and make this code work.
> > > > Thank you!
> > > >
> > > > My goal is to read the variable pid on every bpf event.
> > > > Here is a full example:
> > > > (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
> > > >
> > > > sched_switch.bpf.c
> > > > ```
> > > > #include "vmlinux.h"
> > > > #include <bpf/bpf_helpers.h>
> > > >
> > > > int *input_pid;
> > > >
> > > > char _license[4] SEC("license") = "GPL";
> > > >
> > > > SEC("tp_btf/sched_switch")
> > > > int handle_sched_switch(u64 *ctx)
> > >
> > > you might want to filter for your task, because sched_switch
> > > tracepoint is called for any task scheduler switch
> > >
> > > check BPF_PROG macro in bpf selftests on how to access tp_btf
> > > arguments from context, for sched_switch it's:
> > >
> > > TP_PROTO(bool preempt,
> > > struct task_struct *prev,
> > > struct task_struct *next,
> > > unsigned int prev_state),
> > >
> > > and call the read helper only for prev->pid == 'your app pid',
> > >
> > > there's bpf_copy_from_user_task helper you could use to read
> > > another task's user memory reliably, but it needs to be called
> > > from sleepable probe and you need to have the task pointer
> > >
> > > jirka
> > >
> > > > {
> > > > int pid;
> > > > int err;
> > > >
> > > > err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
> > > > if (err != 0)
> > > > {
> > > > bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
> > > > return 0;
> > > > }
> > > >
> > > > bpf_printk("pid %d.\n", pid);
> > > > return 0;
> > > > }
> > > > ```
> > > >
> > > > sched_switch.c
> > > > ```
> > > > #include <stdio.h>
> > > > #include <unistd.h>
> > > > #include <sys/resource.h>
> > > > #include <bpf/libbpf.h>
> > > > #include "sched_switch.skel.h"
> > > > #include <time.h>
> > > >
> > > > static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> > > > {
> > > > return vfprintf(stderr, format, args);
> > > > }
> > > >
> > > > int main(int argc, char **argv)
> > > > {
> > > > struct sched_switch_bpf *skel;
> > > > int err;
> > > > int pid = getpid();
> > > >
> > > > libbpf_set_print(libbpf_print_fn);
> > > >
> > > > skel = sched_switch_bpf__open();
> > > > if (!skel)
> > > > {
> > > > fprintf(stderr, "Failed to open BPF skeleton\n");
> > > > return 1;
> > > > }
> > > >
> > > > skel->bss->input_pid = &pid;
> > > >
> > > > err = sched_switch_bpf__load(skel);
> > > > if (err)
> > > > {
> > > > fprintf(stderr, "Failed to load and verify BPF skeleton\n");
> > > > goto cleanup;
> > > > }
> > > >
> > > > err = sched_switch_bpf__attach(skel);
> > > > if (err)
> > > > {
> > > > fprintf(stderr, "Failed to attach BPF skeleton\n");
> > > > goto cleanup;
> > > > }
> > > >
> > > > while (1);
> > > >
> > > > cleanup:
> > > > sched_switch_bpf__destroy(skel);
> > > > return -err;
> > > > }
> > > > ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2023-01-03 8:03 ` Jiri Olsa
@ 2023-01-04 15:24 ` Victor Laforet
2023-01-04 21:41 ` Jiri Olsa
0 siblings, 1 reply; 11+ messages in thread
From: Victor Laforet @ 2023-01-04 15:24 UTC (permalink / raw)
To: Jiri Olsa; +Cc: bpf
Ok thanks. As I understand, tp_btf/+ probes (specifically tp_btf/sched_switch that I need) cannot be sleepable? It is then not possible to read user space memory from the bpf code?
----- Mail original -----
De: "Jiri Olsa" <olsajiri@gmail.com>
À: "Victor Laforet" <victor.laforet@ip-paris.fr>
Cc: "Jiri Olsa" <olsajiri@gmail.com>, "Yonghong Song" <yhs@meta.com>, "bpf" <bpf@vger.kernel.org>
Envoyé: Mardi 3 Janvier 2023 09:03:38
Objet: Re: bpf_probe_read_user EFAULT
On Mon, Jan 02, 2023 at 11:07:50PM +0100, Victor Laforet wrote:
> Thanks!
>
> I have tried to use bpf_copy_from_user_task() in place of bpf_probe_read_user() however I cannot seem to run my program. It fails with 'unknown func bpf_copy_from_user_task’.
> If I understood correctly, this function should be in ‘bpf/bpf_helpers.h’?
the declaration is in bpf_helper_defs.h, which is included by
bpf_helpers.h, so you need to #include it
>
> Another quick question:
> I have set the bpf program as sleepable using ‘ bpf_program__set_flags(skel, BPF_F_SLEEPABLE);'
> I couldn’t find any other way to do that. Is it the right way to set it sleepable?
should work, but you could specify that directly in the program
section name, like SEC("fentry.s/...")
and it's just certain program types that can sleep:
[jolsa@krava bpf]$ grep SEC_SLEEPABLE libbpf.c
...
SEC_DEF("uprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
SEC_DEF("uretprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
SEC_DEF("iter.s+", TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
SEC_DEF("syscall", SYSCALL, 0, SEC_SLEEPABLE),
jirka
>
> Victor
> On 28 Dec 2022 at 20:41 +0100, Yonghong Song <yhs@meta.com>, wrote:
> >
> >
> > On 12/28/22 6:00 AM, Victor Laforet wrote:
> > > Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
> > >
> > > Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
> > > I don’t really understand why bpf_probe_read_user would not work for this use case.
> >
> > Right, bpf_copy_from_user_task() is better than bpf_probe_read_user().
> > You could also use bpf_copy_from_user() if you have target_pid checking.
> >
> > It is possible that the user variable you intended to access is not in
> > memory. In such cases, bpf_probe_read_user() will return EFAULT. But
> > bpf_copy_from_user() and bpf_copy_from_user_task() will go through
> > page fault process to bring the variable to the memory.
> > Also because of this extra work, bpf_copy_from_user() and
> > bpf_copy_from_user_task() only work for sleepable programs.
> >
> > >
> > > Victor
> > >
> > > ----- Mail original -----
> > > De: "Jiri Olsa" <olsajiri@gmail.com>
> > > À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> > > Cc: "bpf" <bpf@vger.kernel.org>
> > > Envoyé: Mardi 27 Décembre 2022 17:00:42
> > > Objet: Re: bpf_probe_read_user EFAULT
> > >
> > > On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
> > > > Hi all,
> > > >
> > > > I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
> > > >
> > > > I am seeking your help to try and make this code work.
> > > > Thank you!
> > > >
> > > > My goal is to read the variable pid on every bpf event.
> > > > Here is a full example:
> > > > (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
> > > >
> > > > sched_switch.bpf.c
> > > > ```
> > > > #include "vmlinux.h"
> > > > #include <bpf/bpf_helpers.h>
> > > >
> > > > int *input_pid;
> > > >
> > > > char _license[4] SEC("license") = "GPL";
> > > >
> > > > SEC("tp_btf/sched_switch")
> > > > int handle_sched_switch(u64 *ctx)
> > >
> > > you might want to filter for your task, because sched_switch
> > > tracepoint is called for any task scheduler switch
> > >
> > > check BPF_PROG macro in bpf selftests on how to access tp_btf
> > > arguments from context, for sched_switch it's:
> > >
> > > TP_PROTO(bool preempt,
> > > struct task_struct *prev,
> > > struct task_struct *next,
> > > unsigned int prev_state),
> > >
> > > and call the read helper only for prev->pid == 'your app pid',
> > >
> > > there's bpf_copy_from_user_task helper you could use to read
> > > another task's user memory reliably, but it needs to be called
> > > from sleepable probe and you need to have the task pointer
> > >
> > > jirka
> > >
> > > > {
> > > > int pid;
> > > > int err;
> > > >
> > > > err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
> > > > if (err != 0)
> > > > {
> > > > bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
> > > > return 0;
> > > > }
> > > >
> > > > bpf_printk("pid %d.\n", pid);
> > > > return 0;
> > > > }
> > > > ```
> > > >
> > > > sched_switch.c
> > > > ```
> > > > #include <stdio.h>
> > > > #include <unistd.h>
> > > > #include <sys/resource.h>
> > > > #include <bpf/libbpf.h>
> > > > #include "sched_switch.skel.h"
> > > > #include <time.h>
> > > >
> > > > static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> > > > {
> > > > return vfprintf(stderr, format, args);
> > > > }
> > > >
> > > > int main(int argc, char **argv)
> > > > {
> > > > struct sched_switch_bpf *skel;
> > > > int err;
> > > > int pid = getpid();
> > > >
> > > > libbpf_set_print(libbpf_print_fn);
> > > >
> > > > skel = sched_switch_bpf__open();
> > > > if (!skel)
> > > > {
> > > > fprintf(stderr, "Failed to open BPF skeleton\n");
> > > > return 1;
> > > > }
> > > >
> > > > skel->bss->input_pid = &pid;
> > > >
> > > > err = sched_switch_bpf__load(skel);
> > > > if (err)
> > > > {
> > > > fprintf(stderr, "Failed to load and verify BPF skeleton\n");
> > > > goto cleanup;
> > > > }
> > > >
> > > > err = sched_switch_bpf__attach(skel);
> > > > if (err)
> > > > {
> > > > fprintf(stderr, "Failed to attach BPF skeleton\n");
> > > > goto cleanup;
> > > > }
> > > >
> > > > while (1);
> > > >
> > > > cleanup:
> > > > sched_switch_bpf__destroy(skel);
> > > > return -err;
> > > > }
> > > > ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2023-01-04 15:24 ` Victor Laforet
@ 2023-01-04 21:41 ` Jiri Olsa
2023-01-04 22:08 ` Jiri Olsa
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-01-04 21:41 UTC (permalink / raw)
To: Victor Laforet; +Cc: Jiri Olsa, bpf
On Wed, Jan 04, 2023 at 04:24:27PM +0100, Victor Laforet wrote:
> Ok thanks. As I understand, tp_btf/+ probes (specifically tp_btf/sched_switch that I need) cannot be sleepable? It is then not possible to read user space memory from the bpf code?
yes, only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable
you could use kprobe.. I was able to hook to finish_task_switch with kprobe,
which gives you the next process (as current) and previous process in first
argument
# bpftrace -e 'kprobe:finish_task_switch.isra.0 { printf("%d:%d\n", cpu, pid) }' | head
Attaching 1 probe...
1:0
1:4579
1:3189
1:4579
0:40
0:430
3:4581
3:0
but perhaps there's better function to hook, check around the sched_swith tracepoint call
jirka
>
> ----- Mail original -----
> De: "Jiri Olsa" <olsajiri@gmail.com>
> À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> Cc: "Jiri Olsa" <olsajiri@gmail.com>, "Yonghong Song" <yhs@meta.com>, "bpf" <bpf@vger.kernel.org>
> Envoyé: Mardi 3 Janvier 2023 09:03:38
> Objet: Re: bpf_probe_read_user EFAULT
>
> On Mon, Jan 02, 2023 at 11:07:50PM +0100, Victor Laforet wrote:
> > Thanks!
> >
> > I have tried to use bpf_copy_from_user_task() in place of bpf_probe_read_user() however I cannot seem to run my program. It fails with 'unknown func bpf_copy_from_user_task’.
> > If I understood correctly, this function should be in ‘bpf/bpf_helpers.h’?
>
> the declaration is in bpf_helper_defs.h, which is included by
> bpf_helpers.h, so you need to #include it
>
> >
> > Another quick question:
> > I have set the bpf program as sleepable using ‘ bpf_program__set_flags(skel, BPF_F_SLEEPABLE);'
> > I couldn’t find any other way to do that. Is it the right way to set it sleepable?
>
> should work, but you could specify that directly in the program
> section name, like SEC("fentry.s/...")
>
> and it's just certain program types that can sleep:
>
> [jolsa@krava bpf]$ grep SEC_SLEEPABLE libbpf.c
> ...
> SEC_DEF("uprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
> SEC_DEF("uretprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
> SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
> SEC_DEF("iter.s+", TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
> SEC_DEF("syscall", SYSCALL, 0, SEC_SLEEPABLE),
>
> jirka
>
>
> >
> > Victor
> > On 28 Dec 2022 at 20:41 +0100, Yonghong Song <yhs@meta.com>, wrote:
> > >
> > >
> > > On 12/28/22 6:00 AM, Victor Laforet wrote:
> > > > Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
> > > >
> > > > Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
> > > > I don’t really understand why bpf_probe_read_user would not work for this use case.
> > >
> > > Right, bpf_copy_from_user_task() is better than bpf_probe_read_user().
> > > You could also use bpf_copy_from_user() if you have target_pid checking.
> > >
> > > It is possible that the user variable you intended to access is not in
> > > memory. In such cases, bpf_probe_read_user() will return EFAULT. But
> > > bpf_copy_from_user() and bpf_copy_from_user_task() will go through
> > > page fault process to bring the variable to the memory.
> > > Also because of this extra work, bpf_copy_from_user() and
> > > bpf_copy_from_user_task() only work for sleepable programs.
> > >
> > > >
> > > > Victor
> > > >
> > > > ----- Mail original -----
> > > > De: "Jiri Olsa" <olsajiri@gmail.com>
> > > > À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> > > > Cc: "bpf" <bpf@vger.kernel.org>
> > > > Envoyé: Mardi 27 Décembre 2022 17:00:42
> > > > Objet: Re: bpf_probe_read_user EFAULT
> > > >
> > > > On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
> > > > > Hi all,
> > > > >
> > > > > I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
> > > > >
> > > > > I am seeking your help to try and make this code work.
> > > > > Thank you!
> > > > >
> > > > > My goal is to read the variable pid on every bpf event.
> > > > > Here is a full example:
> > > > > (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
> > > > >
> > > > > sched_switch.bpf.c
> > > > > ```
> > > > > #include "vmlinux.h"
> > > > > #include <bpf/bpf_helpers.h>
> > > > >
> > > > > int *input_pid;
> > > > >
> > > > > char _license[4] SEC("license") = "GPL";
> > > > >
> > > > > SEC("tp_btf/sched_switch")
> > > > > int handle_sched_switch(u64 *ctx)
> > > >
> > > > you might want to filter for your task, because sched_switch
> > > > tracepoint is called for any task scheduler switch
> > > >
> > > > check BPF_PROG macro in bpf selftests on how to access tp_btf
> > > > arguments from context, for sched_switch it's:
> > > >
> > > > TP_PROTO(bool preempt,
> > > > struct task_struct *prev,
> > > > struct task_struct *next,
> > > > unsigned int prev_state),
> > > >
> > > > and call the read helper only for prev->pid == 'your app pid',
> > > >
> > > > there's bpf_copy_from_user_task helper you could use to read
> > > > another task's user memory reliably, but it needs to be called
> > > > from sleepable probe and you need to have the task pointer
> > > >
> > > > jirka
> > > >
> > > > > {
> > > > > int pid;
> > > > > int err;
> > > > >
> > > > > err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
> > > > > if (err != 0)
> > > > > {
> > > > > bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
> > > > > return 0;
> > > > > }
> > > > >
> > > > > bpf_printk("pid %d.\n", pid);
> > > > > return 0;
> > > > > }
> > > > > ```
> > > > >
> > > > > sched_switch.c
> > > > > ```
> > > > > #include <stdio.h>
> > > > > #include <unistd.h>
> > > > > #include <sys/resource.h>
> > > > > #include <bpf/libbpf.h>
> > > > > #include "sched_switch.skel.h"
> > > > > #include <time.h>
> > > > >
> > > > > static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> > > > > {
> > > > > return vfprintf(stderr, format, args);
> > > > > }
> > > > >
> > > > > int main(int argc, char **argv)
> > > > > {
> > > > > struct sched_switch_bpf *skel;
> > > > > int err;
> > > > > int pid = getpid();
> > > > >
> > > > > libbpf_set_print(libbpf_print_fn);
> > > > >
> > > > > skel = sched_switch_bpf__open();
> > > > > if (!skel)
> > > > > {
> > > > > fprintf(stderr, "Failed to open BPF skeleton\n");
> > > > > return 1;
> > > > > }
> > > > >
> > > > > skel->bss->input_pid = &pid;
> > > > >
> > > > > err = sched_switch_bpf__load(skel);
> > > > > if (err)
> > > > > {
> > > > > fprintf(stderr, "Failed to load and verify BPF skeleton\n");
> > > > > goto cleanup;
> > > > > }
> > > > >
> > > > > err = sched_switch_bpf__attach(skel);
> > > > > if (err)
> > > > > {
> > > > > fprintf(stderr, "Failed to attach BPF skeleton\n");
> > > > > goto cleanup;
> > > > > }
> > > > >
> > > > > while (1);
> > > > >
> > > > > cleanup:
> > > > > sched_switch_bpf__destroy(skel);
> > > > > return -err;
> > > > > }
> > > > > ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2023-01-04 21:41 ` Jiri Olsa
@ 2023-01-04 22:08 ` Jiri Olsa
2023-01-05 9:47 ` Jiri Olsa
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-01-04 22:08 UTC (permalink / raw)
To: Victor Laforet; +Cc: Jiri Olsa, bpf
On Wed, Jan 04, 2023 at 10:42:02PM +0100, Jiri Olsa wrote:
> On Wed, Jan 04, 2023 at 04:24:27PM +0100, Victor Laforet wrote:
> > Ok thanks. As I understand, tp_btf/+ probes (specifically tp_btf/sched_switch that I need) cannot be sleepable? It is then not possible to read user space memory from the bpf code?
>
> yes, only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable
>
> you could use kprobe.. I was able to hook to finish_task_switch with kprobe,
> which gives you the next process (as current) and previous process in first
> argument
>
> # bpftrace -e 'kprobe:finish_task_switch.isra.0 { printf("%d:%d\n", cpu, pid) }' | head
> Attaching 1 probe...
> 1:0
> 1:4579
> 1:3189
> 1:4579
> 0:40
> 0:430
> 3:4581
> 3:0
>
> but perhaps there's better function to hook, check around the sched_swith tracepoint call
ugh, actually you can't sleep under schedule.. so I don't think you can
reliably read user memory from there
jirka
>
> jirka
>
>
> >
> > ----- Mail original -----
> > De: "Jiri Olsa" <olsajiri@gmail.com>
> > À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> > Cc: "Jiri Olsa" <olsajiri@gmail.com>, "Yonghong Song" <yhs@meta.com>, "bpf" <bpf@vger.kernel.org>
> > Envoyé: Mardi 3 Janvier 2023 09:03:38
> > Objet: Re: bpf_probe_read_user EFAULT
> >
> > On Mon, Jan 02, 2023 at 11:07:50PM +0100, Victor Laforet wrote:
> > > Thanks!
> > >
> > > I have tried to use bpf_copy_from_user_task() in place of bpf_probe_read_user() however I cannot seem to run my program. It fails with 'unknown func bpf_copy_from_user_task’.
> > > If I understood correctly, this function should be in ‘bpf/bpf_helpers.h’?
> >
> > the declaration is in bpf_helper_defs.h, which is included by
> > bpf_helpers.h, so you need to #include it
> >
> > >
> > > Another quick question:
> > > I have set the bpf program as sleepable using ‘ bpf_program__set_flags(skel, BPF_F_SLEEPABLE);'
> > > I couldn’t find any other way to do that. Is it the right way to set it sleepable?
> >
> > should work, but you could specify that directly in the program
> > section name, like SEC("fentry.s/...")
> >
> > and it's just certain program types that can sleep:
> >
> > [jolsa@krava bpf]$ grep SEC_SLEEPABLE libbpf.c
> > ...
> > SEC_DEF("uprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
> > SEC_DEF("uretprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
> > SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
> > SEC_DEF("iter.s+", TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
> > SEC_DEF("syscall", SYSCALL, 0, SEC_SLEEPABLE),
> >
> > jirka
> >
> >
> > >
> > > Victor
> > > On 28 Dec 2022 at 20:41 +0100, Yonghong Song <yhs@meta.com>, wrote:
> > > >
> > > >
> > > > On 12/28/22 6:00 AM, Victor Laforet wrote:
> > > > > Yes I am sorry I did not mention that the example I sent was a minimal working example. I am filtering the events to select only preempted and events with the right pid as prev.
> > > > >
> > > > > Would bpf_copy_from_user_task work better in this setting than bpf_probe_read_user ?
> > > > > I don’t really understand why bpf_probe_read_user would not work for this use case.
> > > >
> > > > Right, bpf_copy_from_user_task() is better than bpf_probe_read_user().
> > > > You could also use bpf_copy_from_user() if you have target_pid checking.
> > > >
> > > > It is possible that the user variable you intended to access is not in
> > > > memory. In such cases, bpf_probe_read_user() will return EFAULT. But
> > > > bpf_copy_from_user() and bpf_copy_from_user_task() will go through
> > > > page fault process to bring the variable to the memory.
> > > > Also because of this extra work, bpf_copy_from_user() and
> > > > bpf_copy_from_user_task() only work for sleepable programs.
> > > >
> > > > >
> > > > > Victor
> > > > >
> > > > > ----- Mail original -----
> > > > > De: "Jiri Olsa" <olsajiri@gmail.com>
> > > > > À: "Victor Laforet" <victor.laforet@ip-paris.fr>
> > > > > Cc: "bpf" <bpf@vger.kernel.org>
> > > > > Envoyé: Mardi 27 Décembre 2022 17:00:42
> > > > > Objet: Re: bpf_probe_read_user EFAULT
> > > > >
> > > > > On Tue, Dec 27, 2022 at 03:56:06PM +0100, Victor Laforet wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > I am trying to use bpf_probe_read_user to read a user space value from BPF. The issue is that I am getting -14 (-EFAULT) result from bpf_probe_read_user. I haven’t been able to make this function work reliably. Sometimes I get no error code then it goes back to EFAULT.
> > > > > >
> > > > > > I am seeking your help to try and make this code work.
> > > > > > Thank you!
> > > > > >
> > > > > > My goal is to read the variable pid on every bpf event.
> > > > > > Here is a full example:
> > > > > > (cat /sys/kernel/debug/tracing/trace_pipe to read the output).
> > > > > >
> > > > > > sched_switch.bpf.c
> > > > > > ```
> > > > > > #include "vmlinux.h"
> > > > > > #include <bpf/bpf_helpers.h>
> > > > > >
> > > > > > int *input_pid;
> > > > > >
> > > > > > char _license[4] SEC("license") = "GPL";
> > > > > >
> > > > > > SEC("tp_btf/sched_switch")
> > > > > > int handle_sched_switch(u64 *ctx)
> > > > >
> > > > > you might want to filter for your task, because sched_switch
> > > > > tracepoint is called for any task scheduler switch
> > > > >
> > > > > check BPF_PROG macro in bpf selftests on how to access tp_btf
> > > > > arguments from context, for sched_switch it's:
> > > > >
> > > > > TP_PROTO(bool preempt,
> > > > > struct task_struct *prev,
> > > > > struct task_struct *next,
> > > > > unsigned int prev_state),
> > > > >
> > > > > and call the read helper only for prev->pid == 'your app pid',
> > > > >
> > > > > there's bpf_copy_from_user_task helper you could use to read
> > > > > another task's user memory reliably, but it needs to be called
> > > > > from sleepable probe and you need to have the task pointer
> > > > >
> > > > > jirka
> > > > >
> > > > > > {
> > > > > > int pid;
> > > > > > int err;
> > > > > >
> > > > > > err = bpf_probe_read_user(&pid, sizeof(int), (void *)input_pid);
> > > > > > if (err != 0)
> > > > > > {
> > > > > > bpf_printk("Error on bpf_probe_read_user(pid) -> %d.\n", err);
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > bpf_printk("pid %d.\n", pid);
> > > > > > return 0;
> > > > > > }
> > > > > > ```
> > > > > >
> > > > > > sched_switch.c
> > > > > > ```
> > > > > > #include <stdio.h>
> > > > > > #include <unistd.h>
> > > > > > #include <sys/resource.h>
> > > > > > #include <bpf/libbpf.h>
> > > > > > #include "sched_switch.skel.h"
> > > > > > #include <time.h>
> > > > > >
> > > > > > static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> > > > > > {
> > > > > > return vfprintf(stderr, format, args);
> > > > > > }
> > > > > >
> > > > > > int main(int argc, char **argv)
> > > > > > {
> > > > > > struct sched_switch_bpf *skel;
> > > > > > int err;
> > > > > > int pid = getpid();
> > > > > >
> > > > > > libbpf_set_print(libbpf_print_fn);
> > > > > >
> > > > > > skel = sched_switch_bpf__open();
> > > > > > if (!skel)
> > > > > > {
> > > > > > fprintf(stderr, "Failed to open BPF skeleton\n");
> > > > > > return 1;
> > > > > > }
> > > > > >
> > > > > > skel->bss->input_pid = &pid;
> > > > > >
> > > > > > err = sched_switch_bpf__load(skel);
> > > > > > if (err)
> > > > > > {
> > > > > > fprintf(stderr, "Failed to load and verify BPF skeleton\n");
> > > > > > goto cleanup;
> > > > > > }
> > > > > >
> > > > > > err = sched_switch_bpf__attach(skel);
> > > > > > if (err)
> > > > > > {
> > > > > > fprintf(stderr, "Failed to attach BPF skeleton\n");
> > > > > > goto cleanup;
> > > > > > }
> > > > > >
> > > > > > while (1);
> > > > > >
> > > > > > cleanup:
> > > > > > sched_switch_bpf__destroy(skel);
> > > > > > return -err;
> > > > > > }
> > > > > > ```
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2023-01-04 22:08 ` Jiri Olsa
@ 2023-01-05 9:47 ` Jiri Olsa
2023-01-06 3:26 ` Alexei Starovoitov
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-01-05 9:47 UTC (permalink / raw)
To: Jiri Olsa, Alexei Starovoitov; +Cc: Victor Laforet, bpf
On Wed, Jan 04, 2023 at 11:08:23PM +0100, Jiri Olsa wrote:
> On Wed, Jan 04, 2023 at 10:42:02PM +0100, Jiri Olsa wrote:
> > On Wed, Jan 04, 2023 at 04:24:27PM +0100, Victor Laforet wrote:
> > > Ok thanks. As I understand, tp_btf/+ probes (specifically tp_btf/sched_switch that I need) cannot be sleepable? It is then not possible to read user space memory from the bpf code?
> >
> > yes, only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable
we actually allow to create tp_btf program with BPF_F_SLEEPABLE flag,
because it's TRACING prog type, but still bpf program can't sleep when
executed in tracepoint context.. so I wonder we should not allow to
load it, Alexei?
jirka
---
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 98a8051ce316..390621d79fbb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16755,10 +16755,14 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
return -EINVAL;
}
- if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
- prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_KPROBE) {
- verbose(env, "Only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable\n");
- return -EINVAL;
+ if (prog->aux->sleepable) {
+ if ((prog->type == BPF_PROG_TYPE_TRACING &&
+ prog->expected_attach_type == BPF_TRACE_RAW_TP) ||
+ (prog->type != BPF_PROG_TYPE_TRACING &&
+ prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_KPROBE)) {
+ verbose(env, "Only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable\n");
+ return -EINVAL;
+ }
}
if (prog->type == BPF_PROG_TYPE_STRUCT_OPS)
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: bpf_probe_read_user EFAULT
2023-01-05 9:47 ` Jiri Olsa
@ 2023-01-06 3:26 ` Alexei Starovoitov
0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2023-01-06 3:26 UTC (permalink / raw)
To: Jiri Olsa; +Cc: Victor Laforet, bpf
On Thu, Jan 5, 2023 at 1:47 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Wed, Jan 04, 2023 at 11:08:23PM +0100, Jiri Olsa wrote:
> > On Wed, Jan 04, 2023 at 10:42:02PM +0100, Jiri Olsa wrote:
> > > On Wed, Jan 04, 2023 at 04:24:27PM +0100, Victor Laforet wrote:
> > > > Ok thanks. As I understand, tp_btf/+ probes (specifically tp_btf/sched_switch that I need) cannot be sleepable? It is then not possible to read user space memory from the bpf code?
> > >
> > > yes, only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable
>
> we actually allow to create tp_btf program with BPF_F_SLEEPABLE flag,
> because it's TRACING prog type, but still bpf program can't sleep when
> executed in tracepoint context.. so I wonder we should not allow to
> load it, Alexei?
>
> jirka
>
>
> ---
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 98a8051ce316..390621d79fbb 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -16755,10 +16755,14 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
> return -EINVAL;
> }
>
> - if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
> - prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_KPROBE) {
> - verbose(env, "Only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable\n");
> - return -EINVAL;
> + if (prog->aux->sleepable) {
> + if ((prog->type == BPF_PROG_TYPE_TRACING &&
> + prog->expected_attach_type == BPF_TRACE_RAW_TP) ||
> + (prog->type != BPF_PROG_TYPE_TRACING &&
> + prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_KPROBE)) {
> + verbose(env, "Only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable\n");
> + return -EINVAL;
Ahh. good catch. The sleepable flag makes no difference for tp_btf.
Indeed, let's disable this combo.
I was thinking whether explicit list of
fentry/fexit/fmod_ret || lsm || kprobe would be cleaner?
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-01-06 3:27 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-27 14:56 bpf_probe_read_user EFAULT Victor Laforet
2022-12-27 16:00 ` Jiri Olsa
2022-12-28 14:00 ` Victor Laforet
2022-12-28 19:41 ` Yonghong Song
2023-01-02 22:10 ` Victor Laforet
[not found] ` <5692f180-5b78-48e0-b974-b60bd58c0839@Spark>
2023-01-03 8:03 ` Jiri Olsa
2023-01-04 15:24 ` Victor Laforet
2023-01-04 21:41 ` Jiri Olsa
2023-01-04 22:08 ` Jiri Olsa
2023-01-05 9:47 ` Jiri Olsa
2023-01-06 3:26 ` Alexei Starovoitov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox