* [PATCH bpf-next 1/2] bpf: Add __rcu_read_{lock,unlock} into btf id deny list
2023-04-24 16:11 [PATCH bpf-next 0/2] bpf: Fix issues caused by bpf trampoline Yafang Shao
@ 2023-04-24 16:11 ` Yafang Shao
2023-04-24 16:11 ` [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init Yafang Shao
2023-04-24 21:30 ` [PATCH bpf-next 0/2] bpf: Fix issues caused by bpf trampoline patchwork-bot+netdevbpf
2 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2023-04-24 16:11 UTC (permalink / raw)
To: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa
Cc: bpf, Yafang Shao
The tracing recursion prevention mechanism must be protected by rcu, that
leaves __rcu_read_{lock,unlock} unprotected by this mechanism. If we trace
them, the recursion will happen. Let's add them into the btf id deny list.
When CONFIG_PREEMPT_RCU is enabled, it can be reproduced with a simple bpf
program as such:
SEC("fentry/__rcu_read_lock")
int fentry_run()
{
return 0;
}
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
kernel/bpf/verifier.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5dae11e..83fb94f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18645,6 +18645,10 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
BTF_ID(func, preempt_count_add)
BTF_ID(func, preempt_count_sub)
#endif
+#ifdef CONFIG_PREEMPT_RCU
+BTF_ID(func, __rcu_read_lock)
+BTF_ID(func, __rcu_read_unlock)
+#endif
BTF_SET_END(btf_id_deny)
static bool can_be_sleepable(struct bpf_prog *prog)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-04-24 16:11 [PATCH bpf-next 0/2] bpf: Fix issues caused by bpf trampoline Yafang Shao
2023-04-24 16:11 ` [PATCH bpf-next 1/2] bpf: Add __rcu_read_{lock,unlock} into btf id deny list Yafang Shao
@ 2023-04-24 16:11 ` Yafang Shao
2023-04-24 21:13 ` Alexei Starovoitov
2023-04-24 21:30 ` [PATCH bpf-next 0/2] bpf: Fix issues caused by bpf trampoline patchwork-bot+netdevbpf
2 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2023-04-24 16:11 UTC (permalink / raw)
To: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa
Cc: bpf, Yafang Shao
The kernel will panic as follows when attaching fexit to mm_init,
[ 86.549700] ------------[ cut here ]------------
[ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
[ 86.549713] #PF: supervisor read access in kernel mode
[ 86.549715] #PF: error_code(0x0000) - not-present page
[ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
[ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
[ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
[ 86.549754] Call Trace:
[ 86.549755] <TASK>
[ 86.549757] check_preempt_curr+0x5e/0x70
[ 86.549761] ttwu_do_activate+0xab/0x350
[ 86.549763] try_to_wake_up+0x314/0x680
[ 86.549765] wake_up_process+0x15/0x20
[ 86.549767] insert_work+0xb2/0xd0
[ 86.549772] __queue_work+0x20a/0x400
[ 86.549774] queue_work_on+0x7b/0x90
[ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
[ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
[ 86.549813] soft_cursor+0x1cb/0x250
[ 86.549816] bit_cursor+0x3ce/0x630
[ 86.549818] fbcon_cursor+0x139/0x1c0
[ 86.549821] ? __pfx_bit_cursor+0x10/0x10
[ 86.549822] hide_cursor+0x31/0xd0
[ 86.549825] vt_console_print+0x477/0x4e0
[ 86.549828] console_flush_all+0x182/0x440
[ 86.549832] console_unlock+0x58/0xf0
[ 86.549834] vprintk_emit+0x1ae/0x200
[ 86.549837] vprintk_default+0x1d/0x30
[ 86.549839] vprintk+0x5c/0x90
[ 86.549841] _printk+0x58/0x80
[ 86.549843] __warn_printk+0x7e/0x1a0
[ 86.549845] ? trace_preempt_off+0x1b/0x70
[ 86.549848] ? trace_preempt_on+0x1b/0x70
[ 86.549849] ? __percpu_counter_init+0x8e/0xb0
[ 86.549853] refcount_warn_saturate+0x9f/0x150
[ 86.549855] mm_init+0x379/0x390
[ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
[ 86.549862] mm_init+0x5/0x390
[ 86.549865] ? mm_alloc+0x4e/0x60
[ 86.549866] alloc_bprm+0x8a/0x2e0
[ 86.549869] do_execveat_common.isra.0+0x67/0x240
[ 86.549872] __x64_sys_execve+0x37/0x50
[ 86.549874] do_syscall_64+0x38/0x90
[ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
The reason is that when we attach the btf id of the function mm_init we
actually attach the mm_init defined in init/main.c rather than the
function defined in kernel/fork.c. That can be proved by parsing
/sys/kernel/btf/vmlinux:
[2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
[2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
'buf' type_id=57
[2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
[2496] FUNC 'mm_init' type_id=118 linkage=static
[2497] FUNC 'trap_init' type_id=118 linkage=static
[2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
From the above information we can find that the FUNCs above and below
mm_init are all defined in init/main.c. So there's no doubt that the
mm_init is also the function defined in init/main.c.
So when a task calls mm_init and thus the bpf trampoline is triggered it
will use the information of the mm_init defined in init/main.c. Then the
panic will occur.
It seems that there're issues in btf, for example it is unnecessary to
generate btf for the functions annonated with __init. We need to improve
btf. However we also need to change the function defined in
kernel/fork.c to task_mm_init to better distinguish them. After it is
renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
[13970] FUNC 'mm_alloc' type_id=13969 linkage=static
[13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
'mm' type_id=204
'p' type_id=197
'user_ns' type_id=452
[13972] FUNC 'task_mm_init' type_id=13971 linkage=static
[13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
[13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
'orig' type_id=197
'node' type_id=21
[13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
And then attaching task_mm_init won't panic. Improving the btf will be
handled later.
This issue can be reproduced with a simple bpf program as such:
SEC("fexit/mm_init")
int fentry_run()
{
return 0;
}
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
kernel/fork.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 0c92f22..af8110d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1116,7 +1116,7 @@ static void mm_init_uprobes_state(struct mm_struct *mm)
#endif
}
-static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
+static struct mm_struct *task_mm_init(struct mm_struct *mm, struct task_struct *p,
struct user_namespace *user_ns)
{
int i;
@@ -1193,7 +1193,7 @@ struct mm_struct *mm_alloc(void)
return NULL;
memset(mm, 0, sizeof(*mm));
- return mm_init(mm, current, current_user_ns());
+ return task_mm_init(mm, current, current_user_ns());
}
static inline void __mmput(struct mm_struct *mm)
@@ -1542,7 +1542,7 @@ static struct mm_struct *dup_mm(struct task_struct *tsk,
memcpy(mm, oldmm, sizeof(*mm));
- if (!mm_init(mm, tsk, mm->user_ns))
+ if (!task_mm_init(mm, tsk, mm->user_ns))
goto fail_nomem;
err = dup_mmap(mm, oldmm);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-04-24 16:11 ` [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init Yafang Shao
@ 2023-04-24 21:13 ` Alexei Starovoitov
2023-04-27 11:34 ` Yafang Shao
0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-04-24 21:13 UTC (permalink / raw)
To: Yafang Shao, Alan Maguire
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf
On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> The kernel will panic as follows when attaching fexit to mm_init,
>
> [ 86.549700] ------------[ cut here ]------------
> [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
> [ 86.549713] #PF: supervisor read access in kernel mode
> [ 86.549715] #PF: error_code(0x0000) - not-present page
> [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
> [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
> [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
> [ 86.549754] Call Trace:
> [ 86.549755] <TASK>
> [ 86.549757] check_preempt_curr+0x5e/0x70
> [ 86.549761] ttwu_do_activate+0xab/0x350
> [ 86.549763] try_to_wake_up+0x314/0x680
> [ 86.549765] wake_up_process+0x15/0x20
> [ 86.549767] insert_work+0xb2/0xd0
> [ 86.549772] __queue_work+0x20a/0x400
> [ 86.549774] queue_work_on+0x7b/0x90
> [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
> [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
> [ 86.549813] soft_cursor+0x1cb/0x250
> [ 86.549816] bit_cursor+0x3ce/0x630
> [ 86.549818] fbcon_cursor+0x139/0x1c0
> [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
> [ 86.549822] hide_cursor+0x31/0xd0
> [ 86.549825] vt_console_print+0x477/0x4e0
> [ 86.549828] console_flush_all+0x182/0x440
> [ 86.549832] console_unlock+0x58/0xf0
> [ 86.549834] vprintk_emit+0x1ae/0x200
> [ 86.549837] vprintk_default+0x1d/0x30
> [ 86.549839] vprintk+0x5c/0x90
> [ 86.549841] _printk+0x58/0x80
> [ 86.549843] __warn_printk+0x7e/0x1a0
> [ 86.549845] ? trace_preempt_off+0x1b/0x70
> [ 86.549848] ? trace_preempt_on+0x1b/0x70
> [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
> [ 86.549853] refcount_warn_saturate+0x9f/0x150
> [ 86.549855] mm_init+0x379/0x390
> [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
> [ 86.549862] mm_init+0x5/0x390
> [ 86.549865] ? mm_alloc+0x4e/0x60
> [ 86.549866] alloc_bprm+0x8a/0x2e0
> [ 86.549869] do_execveat_common.isra.0+0x67/0x240
> [ 86.549872] __x64_sys_execve+0x37/0x50
> [ 86.549874] do_syscall_64+0x38/0x90
> [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>
> The reason is that when we attach the btf id of the function mm_init we
> actually attach the mm_init defined in init/main.c rather than the
> function defined in kernel/fork.c. That can be proved by parsing
> /sys/kernel/btf/vmlinux:
>
> [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
> [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
> 'buf' type_id=57
> [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
> [2496] FUNC 'mm_init' type_id=118 linkage=static
> [2497] FUNC 'trap_init' type_id=118 linkage=static
> [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
>
> From the above information we can find that the FUNCs above and below
> mm_init are all defined in init/main.c. So there's no doubt that the
> mm_init is also the function defined in init/main.c.
>
> So when a task calls mm_init and thus the bpf trampoline is triggered it
> will use the information of the mm_init defined in init/main.c. Then the
> panic will occur.
>
> It seems that there're issues in btf, for example it is unnecessary to
> generate btf for the functions annonated with __init. We need to improve
> btf. However we also need to change the function defined in
> kernel/fork.c to task_mm_init to better distinguish them. After it is
> renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
>
> [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
> [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
> 'mm' type_id=204
> 'p' type_id=197
> 'user_ns' type_id=452
> [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
> [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
> [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
> 'orig' type_id=197
> 'node' type_id=21
> [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
>
> And then attaching task_mm_init won't panic. Improving the btf will be
> handled later.
We're not going to hack the kernel to workaround pahole issue.
Let's fix pahole instead.
cc-ing Alan for ideas.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-04-24 21:13 ` Alexei Starovoitov
@ 2023-04-27 11:34 ` Yafang Shao
2023-05-02 3:40 ` pahole issue. " Alexei Starovoitov
0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2023-04-27 11:34 UTC (permalink / raw)
To: Alexei Starovoitov, Alan Maguire
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf
On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > The kernel will panic as follows when attaching fexit to mm_init,
> >
> > [ 86.549700] ------------[ cut here ]------------
> > [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
> > [ 86.549713] #PF: supervisor read access in kernel mode
> > [ 86.549715] #PF: error_code(0x0000) - not-present page
> > [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
> > [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
> > [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
> > [ 86.549754] Call Trace:
> > [ 86.549755] <TASK>
> > [ 86.549757] check_preempt_curr+0x5e/0x70
> > [ 86.549761] ttwu_do_activate+0xab/0x350
> > [ 86.549763] try_to_wake_up+0x314/0x680
> > [ 86.549765] wake_up_process+0x15/0x20
> > [ 86.549767] insert_work+0xb2/0xd0
> > [ 86.549772] __queue_work+0x20a/0x400
> > [ 86.549774] queue_work_on+0x7b/0x90
> > [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
> > [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
> > [ 86.549813] soft_cursor+0x1cb/0x250
> > [ 86.549816] bit_cursor+0x3ce/0x630
> > [ 86.549818] fbcon_cursor+0x139/0x1c0
> > [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
> > [ 86.549822] hide_cursor+0x31/0xd0
> > [ 86.549825] vt_console_print+0x477/0x4e0
> > [ 86.549828] console_flush_all+0x182/0x440
> > [ 86.549832] console_unlock+0x58/0xf0
> > [ 86.549834] vprintk_emit+0x1ae/0x200
> > [ 86.549837] vprintk_default+0x1d/0x30
> > [ 86.549839] vprintk+0x5c/0x90
> > [ 86.549841] _printk+0x58/0x80
> > [ 86.549843] __warn_printk+0x7e/0x1a0
> > [ 86.549845] ? trace_preempt_off+0x1b/0x70
> > [ 86.549848] ? trace_preempt_on+0x1b/0x70
> > [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
> > [ 86.549853] refcount_warn_saturate+0x9f/0x150
> > [ 86.549855] mm_init+0x379/0x390
> > [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
> > [ 86.549862] mm_init+0x5/0x390
> > [ 86.549865] ? mm_alloc+0x4e/0x60
> > [ 86.549866] alloc_bprm+0x8a/0x2e0
> > [ 86.549869] do_execveat_common.isra.0+0x67/0x240
> > [ 86.549872] __x64_sys_execve+0x37/0x50
> > [ 86.549874] do_syscall_64+0x38/0x90
> > [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >
> > The reason is that when we attach the btf id of the function mm_init we
> > actually attach the mm_init defined in init/main.c rather than the
> > function defined in kernel/fork.c. That can be proved by parsing
> > /sys/kernel/btf/vmlinux:
> >
> > [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
> > [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
> > 'buf' type_id=57
> > [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
> > [2496] FUNC 'mm_init' type_id=118 linkage=static
> > [2497] FUNC 'trap_init' type_id=118 linkage=static
> > [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
> >
> > From the above information we can find that the FUNCs above and below
> > mm_init are all defined in init/main.c. So there's no doubt that the
> > mm_init is also the function defined in init/main.c.
> >
> > So when a task calls mm_init and thus the bpf trampoline is triggered it
> > will use the information of the mm_init defined in init/main.c. Then the
> > panic will occur.
> >
> > It seems that there're issues in btf, for example it is unnecessary to
> > generate btf for the functions annonated with __init. We need to improve
> > btf. However we also need to change the function defined in
> > kernel/fork.c to task_mm_init to better distinguish them. After it is
> > renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
> >
> > [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
> > [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
> > 'mm' type_id=204
> > 'p' type_id=197
> > 'user_ns' type_id=452
> > [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
> > [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
> > [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
> > 'orig' type_id=197
> > 'node' type_id=21
> > [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
> >
> > And then attaching task_mm_init won't panic. Improving the btf will be
> > handled later.
>
> We're not going to hack the kernel to workaround pahole issue.
> Let's fix pahole instead.
> cc-ing Alan for ideas.
Any comment on it, Alan ?
I think we can just skip generating BTF for the functions in
__section(".init.text"), as these functions will be freed after
kernel init. There won't be use cases for them.
--
Regards
Yafang
^ permalink raw reply [flat|nested] 15+ messages in thread* pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-04-27 11:34 ` Yafang Shao
@ 2023-05-02 3:40 ` Alexei Starovoitov
2023-05-09 15:35 ` Yafang Shao
2023-05-09 18:43 ` Alan Maguire
0 siblings, 2 replies; 15+ messages in thread
From: Alexei Starovoitov @ 2023-05-02 3:40 UTC (permalink / raw)
To: Yafang Shao
Cc: Alan Maguire, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf
Alan,
wdyt on below?
On Thu, Apr 27, 2023 at 4:35 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > The kernel will panic as follows when attaching fexit to mm_init,
> > >
> > > [ 86.549700] ------------[ cut here ]------------
> > > [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
> > > [ 86.549713] #PF: supervisor read access in kernel mode
> > > [ 86.549715] #PF: error_code(0x0000) - not-present page
> > > [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
> > > [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
> > > [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
> > > [ 86.549754] Call Trace:
> > > [ 86.549755] <TASK>
> > > [ 86.549757] check_preempt_curr+0x5e/0x70
> > > [ 86.549761] ttwu_do_activate+0xab/0x350
> > > [ 86.549763] try_to_wake_up+0x314/0x680
> > > [ 86.549765] wake_up_process+0x15/0x20
> > > [ 86.549767] insert_work+0xb2/0xd0
> > > [ 86.549772] __queue_work+0x20a/0x400
> > > [ 86.549774] queue_work_on+0x7b/0x90
> > > [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
> > > [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
> > > [ 86.549813] soft_cursor+0x1cb/0x250
> > > [ 86.549816] bit_cursor+0x3ce/0x630
> > > [ 86.549818] fbcon_cursor+0x139/0x1c0
> > > [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
> > > [ 86.549822] hide_cursor+0x31/0xd0
> > > [ 86.549825] vt_console_print+0x477/0x4e0
> > > [ 86.549828] console_flush_all+0x182/0x440
> > > [ 86.549832] console_unlock+0x58/0xf0
> > > [ 86.549834] vprintk_emit+0x1ae/0x200
> > > [ 86.549837] vprintk_default+0x1d/0x30
> > > [ 86.549839] vprintk+0x5c/0x90
> > > [ 86.549841] _printk+0x58/0x80
> > > [ 86.549843] __warn_printk+0x7e/0x1a0
> > > [ 86.549845] ? trace_preempt_off+0x1b/0x70
> > > [ 86.549848] ? trace_preempt_on+0x1b/0x70
> > > [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
> > > [ 86.549853] refcount_warn_saturate+0x9f/0x150
> > > [ 86.549855] mm_init+0x379/0x390
> > > [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
> > > [ 86.549862] mm_init+0x5/0x390
> > > [ 86.549865] ? mm_alloc+0x4e/0x60
> > > [ 86.549866] alloc_bprm+0x8a/0x2e0
> > > [ 86.549869] do_execveat_common.isra.0+0x67/0x240
> > > [ 86.549872] __x64_sys_execve+0x37/0x50
> > > [ 86.549874] do_syscall_64+0x38/0x90
> > > [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > >
> > > The reason is that when we attach the btf id of the function mm_init we
> > > actually attach the mm_init defined in init/main.c rather than the
> > > function defined in kernel/fork.c. That can be proved by parsing
> > > /sys/kernel/btf/vmlinux:
> > >
> > > [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
> > > [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
> > > 'buf' type_id=57
> > > [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
> > > [2496] FUNC 'mm_init' type_id=118 linkage=static
> > > [2497] FUNC 'trap_init' type_id=118 linkage=static
> > > [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
> > >
> > > From the above information we can find that the FUNCs above and below
> > > mm_init are all defined in init/main.c. So there's no doubt that the
> > > mm_init is also the function defined in init/main.c.
> > >
> > > So when a task calls mm_init and thus the bpf trampoline is triggered it
> > > will use the information of the mm_init defined in init/main.c. Then the
> > > panic will occur.
> > >
> > > It seems that there're issues in btf, for example it is unnecessary to
> > > generate btf for the functions annonated with __init. We need to improve
> > > btf. However we also need to change the function defined in
> > > kernel/fork.c to task_mm_init to better distinguish them. After it is
> > > renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
> > >
> > > [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
> > > [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
> > > 'mm' type_id=204
> > > 'p' type_id=197
> > > 'user_ns' type_id=452
> > > [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
> > > [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
> > > [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
> > > 'orig' type_id=197
> > > 'node' type_id=21
> > > [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
> > >
> > > And then attaching task_mm_init won't panic. Improving the btf will be
> > > handled later.
> >
> > We're not going to hack the kernel to workaround pahole issue.
> > Let's fix pahole instead.
> > cc-ing Alan for ideas.
>
> Any comment on it, Alan ?
> I think we can just skip generating BTF for the functions in
> __section(".init.text"), as these functions will be freed after
> kernel init. There won't be use cases for them.
>
> --
> Regards
> Yafang
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-02 3:40 ` pahole issue. " Alexei Starovoitov
@ 2023-05-09 15:35 ` Yafang Shao
2023-05-09 17:18 ` Alexei Starovoitov
2023-05-09 18:43 ` Alan Maguire
1 sibling, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2023-05-09 15:35 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Alan Maguire, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf
On Tue, May 2, 2023 at 11:40 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> Alan,
>
> wdyt on below?
>
Hi Alexei,
Per my understanding, not only does pahole have issues, but also there
are issues in the kernel.
This panic is caused by the inconsistency between BTF and kallsyms as such:
bpf_check_attach_target
tname = btf_name_by_offset(btf, t->name_off); // btf
addr = kallsyms_lookup_name(tname); // kallsyms
So if the function displayed in /proc/sys/btf/vmlinux is not the same
with the function displayed in /proc/kallsyms, we will get a wrong
addr. I think it is not proper to rely wholly on the userspace tools
to make them the same. The kernel should also imrpve the verifier to
make sure they are really the same function. WDYT?
> On Thu, Apr 27, 2023 at 4:35 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > The kernel will panic as follows when attaching fexit to mm_init,
> > > >
> > > > [ 86.549700] ------------[ cut here ]------------
> > > > [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
> > > > [ 86.549713] #PF: supervisor read access in kernel mode
> > > > [ 86.549715] #PF: error_code(0x0000) - not-present page
> > > > [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
> > > > [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > > [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
> > > > [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
> > > > [ 86.549754] Call Trace:
> > > > [ 86.549755] <TASK>
> > > > [ 86.549757] check_preempt_curr+0x5e/0x70
> > > > [ 86.549761] ttwu_do_activate+0xab/0x350
> > > > [ 86.549763] try_to_wake_up+0x314/0x680
> > > > [ 86.549765] wake_up_process+0x15/0x20
> > > > [ 86.549767] insert_work+0xb2/0xd0
> > > > [ 86.549772] __queue_work+0x20a/0x400
> > > > [ 86.549774] queue_work_on+0x7b/0x90
> > > > [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
> > > > [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
> > > > [ 86.549813] soft_cursor+0x1cb/0x250
> > > > [ 86.549816] bit_cursor+0x3ce/0x630
> > > > [ 86.549818] fbcon_cursor+0x139/0x1c0
> > > > [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
> > > > [ 86.549822] hide_cursor+0x31/0xd0
> > > > [ 86.549825] vt_console_print+0x477/0x4e0
> > > > [ 86.549828] console_flush_all+0x182/0x440
> > > > [ 86.549832] console_unlock+0x58/0xf0
> > > > [ 86.549834] vprintk_emit+0x1ae/0x200
> > > > [ 86.549837] vprintk_default+0x1d/0x30
> > > > [ 86.549839] vprintk+0x5c/0x90
> > > > [ 86.549841] _printk+0x58/0x80
> > > > [ 86.549843] __warn_printk+0x7e/0x1a0
> > > > [ 86.549845] ? trace_preempt_off+0x1b/0x70
> > > > [ 86.549848] ? trace_preempt_on+0x1b/0x70
> > > > [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
> > > > [ 86.549853] refcount_warn_saturate+0x9f/0x150
> > > > [ 86.549855] mm_init+0x379/0x390
> > > > [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
> > > > [ 86.549862] mm_init+0x5/0x390
> > > > [ 86.549865] ? mm_alloc+0x4e/0x60
> > > > [ 86.549866] alloc_bprm+0x8a/0x2e0
> > > > [ 86.549869] do_execveat_common.isra.0+0x67/0x240
> > > > [ 86.549872] __x64_sys_execve+0x37/0x50
> > > > [ 86.549874] do_syscall_64+0x38/0x90
> > > > [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > > >
> > > > The reason is that when we attach the btf id of the function mm_init we
> > > > actually attach the mm_init defined in init/main.c rather than the
> > > > function defined in kernel/fork.c. That can be proved by parsing
> > > > /sys/kernel/btf/vmlinux:
> > > >
> > > > [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
> > > > [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
> > > > 'buf' type_id=57
> > > > [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
> > > > [2496] FUNC 'mm_init' type_id=118 linkage=static
> > > > [2497] FUNC 'trap_init' type_id=118 linkage=static
> > > > [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
> > > >
> > > > From the above information we can find that the FUNCs above and below
> > > > mm_init are all defined in init/main.c. So there's no doubt that the
> > > > mm_init is also the function defined in init/main.c.
> > > >
> > > > So when a task calls mm_init and thus the bpf trampoline is triggered it
> > > > will use the information of the mm_init defined in init/main.c. Then the
> > > > panic will occur.
> > > >
> > > > It seems that there're issues in btf, for example it is unnecessary to
> > > > generate btf for the functions annonated with __init. We need to improve
> > > > btf. However we also need to change the function defined in
> > > > kernel/fork.c to task_mm_init to better distinguish them. After it is
> > > > renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
> > > >
> > > > [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
> > > > [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
> > > > 'mm' type_id=204
> > > > 'p' type_id=197
> > > > 'user_ns' type_id=452
> > > > [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
> > > > [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
> > > > [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
> > > > 'orig' type_id=197
> > > > 'node' type_id=21
> > > > [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
> > > >
> > > > And then attaching task_mm_init won't panic. Improving the btf will be
> > > > handled later.
> > >
> > > We're not going to hack the kernel to workaround pahole issue.
> > > Let's fix pahole instead.
> > > cc-ing Alan for ideas.
> >
> > Any comment on it, Alan ?
> > I think we can just skip generating BTF for the functions in
> > __section(".init.text"), as these functions will be freed after
> > kernel init. There won't be use cases for them.
> >
> > --
> > Regards
> > Yafang
--
Regards
Yafang
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-09 15:35 ` Yafang Shao
@ 2023-05-09 17:18 ` Alexei Starovoitov
2023-05-10 5:38 ` Yafang Shao
0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-05-09 17:18 UTC (permalink / raw)
To: Yafang Shao
Cc: Alan Maguire, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf
On Tue, May 9, 2023 at 8:36 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Tue, May 2, 2023 at 11:40 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > Alan,
> >
> > wdyt on below?
> >
>
> Hi Alexei,
>
> Per my understanding, not only does pahole have issues, but also there
> are issues in the kernel.
> This panic is caused by the inconsistency between BTF and kallsyms as such:
> bpf_check_attach_target
> tname = btf_name_by_offset(btf, t->name_off); // btf
> addr = kallsyms_lookup_name(tname); // kallsyms
>
> So if the function displayed in /proc/sys/btf/vmlinux is not the same
> with the function displayed in /proc/kallsyms, we will get a wrong
> addr. I think it is not proper to rely wholly on the userspace tools
> to make them the same. The kernel should also imrpve the verifier to
> make sure they are really the same function. WDYT?
Are you saying it's not proper to rely on compilers
and linkers to build the kernel?
pahole, resolved_btfid, kallsym gen, objtool are part of the
compilation process.
The bugs in them are discovered from time to time and
have to be fixed. Just like compiler and linker bugs.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-09 17:18 ` Alexei Starovoitov
@ 2023-05-10 5:38 ` Yafang Shao
2023-05-10 13:54 ` Yonghong Song
0 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2023-05-10 5:38 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Alan Maguire, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf
On Wed, May 10, 2023 at 1:18 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, May 9, 2023 at 8:36 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Tue, May 2, 2023 at 11:40 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > Alan,
> > >
> > > wdyt on below?
> > >
> >
> > Hi Alexei,
> >
> > Per my understanding, not only does pahole have issues, but also there
> > are issues in the kernel.
> > This panic is caused by the inconsistency between BTF and kallsyms as such:
> > bpf_check_attach_target
> > tname = btf_name_by_offset(btf, t->name_off); // btf
> > addr = kallsyms_lookup_name(tname); // kallsyms
> >
> > So if the function displayed in /proc/sys/btf/vmlinux is not the same
> > with the function displayed in /proc/kallsyms, we will get a wrong
> > addr. I think it is not proper to rely wholly on the userspace tools
> > to make them the same. The kernel should also imrpve the verifier to
> > make sure they are really the same function. WDYT?
>
> Are you saying it's not proper to rely on compilers
> and linkers to build the kernel?
> pahole, resolved_btfid, kallsym gen, objtool are part of the
> compilation process.
> The bugs in them are discovered from time to time and
> have to be fixed. Just like compiler and linker bugs.
I was wondering if it is possible to add BTF_ID into kallsyms or to
add function address into BTF. Because the function name is not
unique, while the function ID is unique. So with the function ID we
can always get what we want.
For example,
$ cat /proc/kallsyms | awk '{if ($2=="t"||$2=="T") {print $3}}' |
sort| uniq -c | sort -n -r | less
56 __pfx_cleanup_module
56 cleanup_module
47 __pfx_cpumask_weight.constprop.0
47 cpumask_weight.constprop.0
21 __pfx_jhash
21 __pfx_cpumask_weight
21 jhash
21 cpumask_weight
17 type_show
17 __pfx_type_show
14 __rhashtable_insert_fast.constprop.0
14 __pfx___rhashtable_insert_fast.constprop.0
12 __rhashtable_remove_fast_one.cold
12 __rhashtable_remove_fast_one
12 __pfx___rhashtable_remove_fast_one
11 __xfrm_policy_check2.constprop.0
11 __pfx___xfrm_policy_check2.constprop.0
11 __pfx_modalias_show
11 modalias_show
10 rht_key_get_hash.isra.0
10 __pfx_rht_key_get_hash.isra.0
10 __pfx_name_show
10 __pfx_init_once
10 name_show
10 init_once
9 __pfx_event_show
9 event_show
8 __pfx_dst_output
8 dst_output
7 state_show
7 size_show
7 __pfx_state_show
7 __pfx_size_show
kallsyms_lookup_name() always returns the first function and ignores
the others, so it is impossible to trace the other functions with the
same name AFAIK.
--
Regards
Yafang
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-10 5:38 ` Yafang Shao
@ 2023-05-10 13:54 ` Yonghong Song
0 siblings, 0 replies; 15+ messages in thread
From: Yonghong Song @ 2023-05-10 13:54 UTC (permalink / raw)
To: Yafang Shao, Alexei Starovoitov
Cc: Alan Maguire, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf
On 5/9/23 10:38 PM, Yafang Shao wrote:
> On Wed, May 10, 2023 at 1:18 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Tue, May 9, 2023 at 8:36 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>>>
>>> On Tue, May 2, 2023 at 11:40 AM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>
>>>> Alan,
>>>>
>>>> wdyt on below?
>>>>
>>>
>>> Hi Alexei,
>>>
>>> Per my understanding, not only does pahole have issues, but also there
>>> are issues in the kernel.
>>> This panic is caused by the inconsistency between BTF and kallsyms as such:
>>> bpf_check_attach_target
>>> tname = btf_name_by_offset(btf, t->name_off); // btf
>>> addr = kallsyms_lookup_name(tname); // kallsyms
>>>
>>> So if the function displayed in /proc/sys/btf/vmlinux is not the same
>>> with the function displayed in /proc/kallsyms, we will get a wrong
>>> addr. I think it is not proper to rely wholly on the userspace tools
>>> to make them the same. The kernel should also imrpve the verifier to
>>> make sure they are really the same function. WDYT?
>>
>> Are you saying it's not proper to rely on compilers
>> and linkers to build the kernel?
>> pahole, resolved_btfid, kallsym gen, objtool are part of the
>> compilation process.
>> The bugs in them are discovered from time to time and
>> have to be fixed. Just like compiler and linker bugs.
>
> I was wondering if it is possible to add BTF_ID into kallsyms or to
> add function address into BTF. Because the function name is not
> unique, while the function ID is unique. So with the function ID we
> can always get what we want.
We just had some discussions during LSFMMBPF led by Jiri
about how to resolve such issues. Yes, adding 'addr' to BTF
is one solution. Also, if this patch set
(https://lore.kernel.org/lkml/20221205163157.269335-1-nick.alcock@oracle.com/)
can make it
into the kernel, adding 'file_path' to the BTF should also work.
So ya, two possible solutions here.
>
> For example,
>
> $ cat /proc/kallsyms | awk '{if ($2=="t"||$2=="T") {print $3}}' |
> sort| uniq -c | sort -n -r | less
> 56 __pfx_cleanup_module
> 56 cleanup_module
> 47 __pfx_cpumask_weight.constprop.0
> 47 cpumask_weight.constprop.0
> 21 __pfx_jhash
> 21 __pfx_cpumask_weight
> 21 jhash
> 21 cpumask_weight
> 17 type_show
> 17 __pfx_type_show
> 14 __rhashtable_insert_fast.constprop.0
> 14 __pfx___rhashtable_insert_fast.constprop.0
> 12 __rhashtable_remove_fast_one.cold
> 12 __rhashtable_remove_fast_one
> 12 __pfx___rhashtable_remove_fast_one
> 11 __xfrm_policy_check2.constprop.0
> 11 __pfx___xfrm_policy_check2.constprop.0
> 11 __pfx_modalias_show
> 11 modalias_show
> 10 rht_key_get_hash.isra.0
> 10 __pfx_rht_key_get_hash.isra.0
> 10 __pfx_name_show
> 10 __pfx_init_once
> 10 name_show
> 10 init_once
> 9 __pfx_event_show
> 9 event_show
> 8 __pfx_dst_output
> 8 dst_output
> 7 state_show
> 7 size_show
> 7 __pfx_state_show
> 7 __pfx_size_show
>
> kallsyms_lookup_name() always returns the first function and ignores
> the others, so it is impossible to trace the other functions with the
> same name AFAIK.
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-02 3:40 ` pahole issue. " Alexei Starovoitov
2023-05-09 15:35 ` Yafang Shao
@ 2023-05-09 18:43 ` Alan Maguire
2023-05-09 21:21 ` Alexei Starovoitov
1 sibling, 1 reply; 15+ messages in thread
From: Alan Maguire @ 2023-05-09 18:43 UTC (permalink / raw)
To: Alexei Starovoitov, Yafang Shao
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf
On 02/05/2023 04:40, Alexei Starovoitov wrote:
> Alan,
>
> wdyt on below?
>
apologies, missed this; see below..
> On Thu, Apr 27, 2023 at 4:35 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>>
>> On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>>
>>> On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>>>>
>>>> The kernel will panic as follows when attaching fexit to mm_init,
>>>>
>>>> [ 86.549700] ------------[ cut here ]------------
>>>> [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
>>>> [ 86.549713] #PF: supervisor read access in kernel mode
>>>> [ 86.549715] #PF: error_code(0x0000) - not-present page
>>>> [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
>>>> [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>> [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
>>>> [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
>>>> [ 86.549754] Call Trace:
>>>> [ 86.549755] <TASK>
>>>> [ 86.549757] check_preempt_curr+0x5e/0x70
>>>> [ 86.549761] ttwu_do_activate+0xab/0x350
>>>> [ 86.549763] try_to_wake_up+0x314/0x680
>>>> [ 86.549765] wake_up_process+0x15/0x20
>>>> [ 86.549767] insert_work+0xb2/0xd0
>>>> [ 86.549772] __queue_work+0x20a/0x400
>>>> [ 86.549774] queue_work_on+0x7b/0x90
>>>> [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
>>>> [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
>>>> [ 86.549813] soft_cursor+0x1cb/0x250
>>>> [ 86.549816] bit_cursor+0x3ce/0x630
>>>> [ 86.549818] fbcon_cursor+0x139/0x1c0
>>>> [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
>>>> [ 86.549822] hide_cursor+0x31/0xd0
>>>> [ 86.549825] vt_console_print+0x477/0x4e0
>>>> [ 86.549828] console_flush_all+0x182/0x440
>>>> [ 86.549832] console_unlock+0x58/0xf0
>>>> [ 86.549834] vprintk_emit+0x1ae/0x200
>>>> [ 86.549837] vprintk_default+0x1d/0x30
>>>> [ 86.549839] vprintk+0x5c/0x90
>>>> [ 86.549841] _printk+0x58/0x80
>>>> [ 86.549843] __warn_printk+0x7e/0x1a0
>>>> [ 86.549845] ? trace_preempt_off+0x1b/0x70
>>>> [ 86.549848] ? trace_preempt_on+0x1b/0x70
>>>> [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
>>>> [ 86.549853] refcount_warn_saturate+0x9f/0x150
>>>> [ 86.549855] mm_init+0x379/0x390
>>>> [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
>>>> [ 86.549862] mm_init+0x5/0x390
>>>> [ 86.549865] ? mm_alloc+0x4e/0x60
>>>> [ 86.549866] alloc_bprm+0x8a/0x2e0
>>>> [ 86.549869] do_execveat_common.isra.0+0x67/0x240
>>>> [ 86.549872] __x64_sys_execve+0x37/0x50
>>>> [ 86.549874] do_syscall_64+0x38/0x90
>>>> [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>>
>>>> The reason is that when we attach the btf id of the function mm_init we
>>>> actually attach the mm_init defined in init/main.c rather than the
>>>> function defined in kernel/fork.c. That can be proved by parsing
>>>> /sys/kernel/btf/vmlinux:
>>>>
>>>> [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
>>>> [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
>>>> 'buf' type_id=57
>>>> [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
>>>> [2496] FUNC 'mm_init' type_id=118 linkage=static
>>>> [2497] FUNC 'trap_init' type_id=118 linkage=static
>>>> [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
>>>>
>>>> From the above information we can find that the FUNCs above and below
>>>> mm_init are all defined in init/main.c. So there's no doubt that the
>>>> mm_init is also the function defined in init/main.c.
>>>>
>>>> So when a task calls mm_init and thus the bpf trampoline is triggered it
>>>> will use the information of the mm_init defined in init/main.c. Then the
>>>> panic will occur.
>>>>
>>>> It seems that there're issues in btf, for example it is unnecessary to
>>>> generate btf for the functions annonated with __init. We need to improve
>>>> btf. However we also need to change the function defined in
>>>> kernel/fork.c to task_mm_init to better distinguish them. After it is
>>>> renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
>>>>
>>>> [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
>>>> [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
>>>> 'mm' type_id=204
>>>> 'p' type_id=197
>>>> 'user_ns' type_id=452
>>>> [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
>>>> [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
>>>> [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
>>>> 'orig' type_id=197
>>>> 'node' type_id=21
>>>> [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
>>>>
>>>> And then attaching task_mm_init won't panic. Improving the btf will be
>>>> handled later.
>>>
>>> We're not going to hack the kernel to workaround pahole issue.
>>> Let's fix pahole instead.
>>> cc-ing Alan for ideas.
>>
>> Any comment on it, Alan ?
>> I think we can just skip generating BTF for the functions in
>> __section(".init.text"), as these functions will be freed after
>> kernel init. There won't be use cases for them.
>>
won't the pahole v1.25 changes help here; can you try applying
https://lore.kernel.org/bpf/1675949331-27935-1-git-send-email-alan.maguire@oracle.com/
...and build using pahole; this should eliminate any functions
with inconsistent prototypes via
--skip_encoding_btf_inconsistent_proto
Do not encode functions with multiple inconsistent prototypes or
unexpected register use for their parameters, where the regis‐
ters used do not match calling conventions.
I'll check this at my end too.
Alexei, if this works should we look at applying the above
again to bpf-next? If so I'll resend the patch.
Thanks!
Alan
>> --
>> Regards
>> Yafang
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-09 18:43 ` Alan Maguire
@ 2023-05-09 21:21 ` Alexei Starovoitov
2023-05-10 13:08 ` Alan Maguire
0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-05-09 21:21 UTC (permalink / raw)
To: Alan Maguire, Arnaldo Carvalho de Melo
Cc: Yafang Shao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf
On Tue, May 9, 2023 at 11:44 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 02/05/2023 04:40, Alexei Starovoitov wrote:
> > Alan,
> >
> > wdyt on below?
> >
>
> apologies, missed this; see below..
>
> > On Thu, Apr 27, 2023 at 4:35 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >>
> >> On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
> >> <alexei.starovoitov@gmail.com> wrote:
> >>>
> >>> On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >>>>
> >>>> The kernel will panic as follows when attaching fexit to mm_init,
> >>>>
> >>>> [ 86.549700] ------------[ cut here ]------------
> >>>> [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
> >>>> [ 86.549713] #PF: supervisor read access in kernel mode
> >>>> [ 86.549715] #PF: error_code(0x0000) - not-present page
> >>>> [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
> >>>> [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
> >>>> [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
> >>>> [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
> >>>> [ 86.549754] Call Trace:
> >>>> [ 86.549755] <TASK>
> >>>> [ 86.549757] check_preempt_curr+0x5e/0x70
> >>>> [ 86.549761] ttwu_do_activate+0xab/0x350
> >>>> [ 86.549763] try_to_wake_up+0x314/0x680
> >>>> [ 86.549765] wake_up_process+0x15/0x20
> >>>> [ 86.549767] insert_work+0xb2/0xd0
> >>>> [ 86.549772] __queue_work+0x20a/0x400
> >>>> [ 86.549774] queue_work_on+0x7b/0x90
> >>>> [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
> >>>> [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
> >>>> [ 86.549813] soft_cursor+0x1cb/0x250
> >>>> [ 86.549816] bit_cursor+0x3ce/0x630
> >>>> [ 86.549818] fbcon_cursor+0x139/0x1c0
> >>>> [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
> >>>> [ 86.549822] hide_cursor+0x31/0xd0
> >>>> [ 86.549825] vt_console_print+0x477/0x4e0
> >>>> [ 86.549828] console_flush_all+0x182/0x440
> >>>> [ 86.549832] console_unlock+0x58/0xf0
> >>>> [ 86.549834] vprintk_emit+0x1ae/0x200
> >>>> [ 86.549837] vprintk_default+0x1d/0x30
> >>>> [ 86.549839] vprintk+0x5c/0x90
> >>>> [ 86.549841] _printk+0x58/0x80
> >>>> [ 86.549843] __warn_printk+0x7e/0x1a0
> >>>> [ 86.549845] ? trace_preempt_off+0x1b/0x70
> >>>> [ 86.549848] ? trace_preempt_on+0x1b/0x70
> >>>> [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
> >>>> [ 86.549853] refcount_warn_saturate+0x9f/0x150
> >>>> [ 86.549855] mm_init+0x379/0x390
> >>>> [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
> >>>> [ 86.549862] mm_init+0x5/0x390
> >>>> [ 86.549865] ? mm_alloc+0x4e/0x60
> >>>> [ 86.549866] alloc_bprm+0x8a/0x2e0
> >>>> [ 86.549869] do_execveat_common.isra.0+0x67/0x240
> >>>> [ 86.549872] __x64_sys_execve+0x37/0x50
> >>>> [ 86.549874] do_syscall_64+0x38/0x90
> >>>> [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >>>>
> >>>> The reason is that when we attach the btf id of the function mm_init we
> >>>> actually attach the mm_init defined in init/main.c rather than the
> >>>> function defined in kernel/fork.c. That can be proved by parsing
> >>>> /sys/kernel/btf/vmlinux:
> >>>>
> >>>> [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
> >>>> [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
> >>>> 'buf' type_id=57
> >>>> [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
> >>>> [2496] FUNC 'mm_init' type_id=118 linkage=static
> >>>> [2497] FUNC 'trap_init' type_id=118 linkage=static
> >>>> [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
> >>>>
> >>>> From the above information we can find that the FUNCs above and below
> >>>> mm_init are all defined in init/main.c. So there's no doubt that the
> >>>> mm_init is also the function defined in init/main.c.
> >>>>
> >>>> So when a task calls mm_init and thus the bpf trampoline is triggered it
> >>>> will use the information of the mm_init defined in init/main.c. Then the
> >>>> panic will occur.
> >>>>
> >>>> It seems that there're issues in btf, for example it is unnecessary to
> >>>> generate btf for the functions annonated with __init. We need to improve
> >>>> btf. However we also need to change the function defined in
> >>>> kernel/fork.c to task_mm_init to better distinguish them. After it is
> >>>> renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
> >>>>
> >>>> [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
> >>>> [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
> >>>> 'mm' type_id=204
> >>>> 'p' type_id=197
> >>>> 'user_ns' type_id=452
> >>>> [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
> >>>> [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
> >>>> [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
> >>>> 'orig' type_id=197
> >>>> 'node' type_id=21
> >>>> [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
> >>>>
> >>>> And then attaching task_mm_init won't panic. Improving the btf will be
> >>>> handled later.
> >>>
> >>> We're not going to hack the kernel to workaround pahole issue.
> >>> Let's fix pahole instead.
> >>> cc-ing Alan for ideas.
> >>
> >> Any comment on it, Alan ?
> >> I think we can just skip generating BTF for the functions in
> >> __section(".init.text"), as these functions will be freed after
> >> kernel init. There won't be use cases for them.
> >>
>
> won't the pahole v1.25 changes help here; can you try applying
>
> https://lore.kernel.org/bpf/1675949331-27935-1-git-send-email-alan.maguire@oracle.com/
>
> ...and build using pahole; this should eliminate any functions
> with inconsistent prototypes via
>
> --skip_encoding_btf_inconsistent_proto
> Do not encode functions with multiple inconsistent prototypes or
> unexpected register use for their parameters, where the regis‐
> ters used do not match calling conventions.
>
>
> I'll check this at my end too.
>
> Alexei, if this works should we look at applying the above
> again to bpf-next? If so I'll resend the patch.
I've lost the track with pahole fixes.
Did Arnaldo re-tag pahole 1.25 or released 1.26 with the fixes?
Alan,
please submit a fresh patch for bpf-next to enable
--skip_encoding_btf_inconsistent_proto, so it can go through CI.
I cannot test all combinations manually.
Thanks!
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-09 21:21 ` Alexei Starovoitov
@ 2023-05-10 13:08 ` Alan Maguire
2023-05-10 15:28 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 15+ messages in thread
From: Alan Maguire @ 2023-05-10 13:08 UTC (permalink / raw)
To: Alexei Starovoitov, Arnaldo Carvalho de Melo
Cc: Yafang Shao, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf
On 09/05/2023 22:21, Alexei Starovoitov wrote:
> On Tue, May 9, 2023 at 11:44 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> On 02/05/2023 04:40, Alexei Starovoitov wrote:
>>> Alan,
>>>
>>> wdyt on below?
>>>
>>
>> apologies, missed this; see below..
>>
>>> On Thu, Apr 27, 2023 at 4:35 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>>>>
>>>> On Tue, Apr 25, 2023 at 5:13 AM Alexei Starovoitov
>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>
>>>>> On Mon, Apr 24, 2023 at 9:12 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>>>>>>
>>>>>> The kernel will panic as follows when attaching fexit to mm_init,
>>>>>>
>>>>>> [ 86.549700] ------------[ cut here ]------------
>>>>>> [ 86.549712] BUG: kernel NULL pointer dereference, address: 0000000000000078
>>>>>> [ 86.549713] #PF: supervisor read access in kernel mode
>>>>>> [ 86.549715] #PF: error_code(0x0000) - not-present page
>>>>>> [ 86.549716] PGD 10308f067 P4D 10308f067 PUD 11754e067 PMD 0
>>>>>> [ 86.549719] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>>>> [ 86.549722] CPU: 9 PID: 9829 Comm: main_amd64 Kdump: loaded Not tainted 6.3.0-rc6+ #12
>>>>>> [ 86.549725] RIP: 0010:check_preempt_wakeup+0xd1/0x310
>>>>>> [ 86.549754] Call Trace:
>>>>>> [ 86.549755] <TASK>
>>>>>> [ 86.549757] check_preempt_curr+0x5e/0x70
>>>>>> [ 86.549761] ttwu_do_activate+0xab/0x350
>>>>>> [ 86.549763] try_to_wake_up+0x314/0x680
>>>>>> [ 86.549765] wake_up_process+0x15/0x20
>>>>>> [ 86.549767] insert_work+0xb2/0xd0
>>>>>> [ 86.549772] __queue_work+0x20a/0x400
>>>>>> [ 86.549774] queue_work_on+0x7b/0x90
>>>>>> [ 86.549778] drm_fb_helper_sys_imageblit+0xd7/0xf0 [drm_kms_helper]
>>>>>> [ 86.549801] drm_fbdev_fb_imageblit+0x5b/0xb0 [drm_kms_helper]
>>>>>> [ 86.549813] soft_cursor+0x1cb/0x250
>>>>>> [ 86.549816] bit_cursor+0x3ce/0x630
>>>>>> [ 86.549818] fbcon_cursor+0x139/0x1c0
>>>>>> [ 86.549821] ? __pfx_bit_cursor+0x10/0x10
>>>>>> [ 86.549822] hide_cursor+0x31/0xd0
>>>>>> [ 86.549825] vt_console_print+0x477/0x4e0
>>>>>> [ 86.549828] console_flush_all+0x182/0x440
>>>>>> [ 86.549832] console_unlock+0x58/0xf0
>>>>>> [ 86.549834] vprintk_emit+0x1ae/0x200
>>>>>> [ 86.549837] vprintk_default+0x1d/0x30
>>>>>> [ 86.549839] vprintk+0x5c/0x90
>>>>>> [ 86.549841] _printk+0x58/0x80
>>>>>> [ 86.549843] __warn_printk+0x7e/0x1a0
>>>>>> [ 86.549845] ? trace_preempt_off+0x1b/0x70
>>>>>> [ 86.549848] ? trace_preempt_on+0x1b/0x70
>>>>>> [ 86.549849] ? __percpu_counter_init+0x8e/0xb0
>>>>>> [ 86.549853] refcount_warn_saturate+0x9f/0x150
>>>>>> [ 86.549855] mm_init+0x379/0x390
>>>>>> [ 86.549859] bpf_trampoline_6442453440_0+0x23/0x1000
>>>>>> [ 86.549862] mm_init+0x5/0x390
>>>>>> [ 86.549865] ? mm_alloc+0x4e/0x60
>>>>>> [ 86.549866] alloc_bprm+0x8a/0x2e0
>>>>>> [ 86.549869] do_execveat_common.isra.0+0x67/0x240
>>>>>> [ 86.549872] __x64_sys_execve+0x37/0x50
>>>>>> [ 86.549874] do_syscall_64+0x38/0x90
>>>>>> [ 86.549877] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>>>>
>>>>>> The reason is that when we attach the btf id of the function mm_init we
>>>>>> actually attach the mm_init defined in init/main.c rather than the
>>>>>> function defined in kernel/fork.c. That can be proved by parsing
>>>>>> /sys/kernel/btf/vmlinux:
>>>>>>
>>>>>> [2493] FUNC 'initcall_blacklist' type_id=2477 linkage=static
>>>>>> [2494] FUNC_PROTO '(anon)' ret_type_id=21 vlen=1
>>>>>> 'buf' type_id=57
>>>>>> [2495] FUNC 'early_randomize_kstack_offset' type_id=2494 linkage=static
>>>>>> [2496] FUNC 'mm_init' type_id=118 linkage=static
>>>>>> [2497] FUNC 'trap_init' type_id=118 linkage=static
>>>>>> [2498] FUNC 'thread_stack_cache_init' type_id=118 linkage=static
>>>>>>
>>>>>> From the above information we can find that the FUNCs above and below
>>>>>> mm_init are all defined in init/main.c. So there's no doubt that the
>>>>>> mm_init is also the function defined in init/main.c.
>>>>>>
>>>>>> So when a task calls mm_init and thus the bpf trampoline is triggered it
>>>>>> will use the information of the mm_init defined in init/main.c. Then the
>>>>>> panic will occur.
>>>>>>
>>>>>> It seems that there're issues in btf, for example it is unnecessary to
>>>>>> generate btf for the functions annonated with __init. We need to improve
>>>>>> btf. However we also need to change the function defined in
>>>>>> kernel/fork.c to task_mm_init to better distinguish them. After it is
>>>>>> renamed to task_mm_init, the /sys/kernel/btf/vmlinux will be:
>>>>>>
>>>>>> [13970] FUNC 'mm_alloc' type_id=13969 linkage=static
>>>>>> [13971] FUNC_PROTO '(anon)' ret_type_id=204 vlen=3
>>>>>> 'mm' type_id=204
>>>>>> 'p' type_id=197
>>>>>> 'user_ns' type_id=452
>>>>>> [13972] FUNC 'task_mm_init' type_id=13971 linkage=static
>>>>>> [13973] FUNC 'coredump_filter_setup' type_id=3804 linkage=static
>>>>>> [13974] FUNC_PROTO '(anon)' ret_type_id=197 vlen=2
>>>>>> 'orig' type_id=197
>>>>>> 'node' type_id=21
>>>>>> [13975] FUNC 'dup_task_struct' type_id=13974 linkage=static
>>>>>>
>>>>>> And then attaching task_mm_init won't panic. Improving the btf will be
>>>>>> handled later.
>>>>>
>>>>> We're not going to hack the kernel to workaround pahole issue.
>>>>> Let's fix pahole instead.
>>>>> cc-ing Alan for ideas.
>>>>
>>>> Any comment on it, Alan ?
>>>> I think we can just skip generating BTF for the functions in
>>>> __section(".init.text"), as these functions will be freed after
>>>> kernel init. There won't be use cases for them.
>>>>
>>
>> won't the pahole v1.25 changes help here; can you try applying
>>
>> https://lore.kernel.org/bpf/1675949331-27935-1-git-send-email-alan.maguire@oracle.com/
>>
>> ...and build using pahole; this should eliminate any functions
>> with inconsistent prototypes via
>>
>> --skip_encoding_btf_inconsistent_proto
>> Do not encode functions with multiple inconsistent prototypes or
>> unexpected register use for their parameters, where the regis‐
>> ters used do not match calling conventions.
>>
>>
>> I'll check this at my end too.
>>
>> Alexei, if this works should we look at applying the above
>> again to bpf-next? If so I'll resend the patch.
>
> I've lost the track with pahole fixes.
> Did Arnaldo re-tag pahole 1.25 or released 1.26 with the fixes?
>
My understanding is the pahole repo is prepped for v1.25,
meaning that it will build with version 1.25 but it is
not officially released yet; Arnaldo will correct me if
I've got that wrong.
> Alan,
> please submit a fresh patch for bpf-next to enable
> --skip_encoding_btf_inconsistent_proto, so it can go through CI.
> I cannot test all combinations manually.
>
Done; see [1]. If CI picks up the latest version from
the dwarves repo, it will see version 1.25.
I've tested the above specific case along with general testing
using latest pahole.
When running pahole with --verbose and
--skip_encoding_btf_inconsistent_proto
we see:
skipping addition of 'mm_init'(mm_init) due to multiple inconsistent
function prototypes
...and
bpftool btf dump file vmlinux |grep mm_init
shows the function is not encoded in BTF due to these inconsistencies.
Thanks!
Alan
[1]
https://lore.kernel.org/bpf/20230510130241.1696561-1-alan.maguire@oracle.com/
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: pahole issue. Re: [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init
2023-05-10 13:08 ` Alan Maguire
@ 2023-05-10 15:28 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-05-10 15:28 UTC (permalink / raw)
To: Alan Maguire
Cc: Alexei Starovoitov, Yafang Shao, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, Jiri Olsa, bpf
Em Wed, May 10, 2023 at 02:08:49PM +0100, Alan Maguire escreveu:
> On 09/05/2023 22:21, Alexei Starovoitov wrote:
> > On Tue, May 9, 2023 at 11:44 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >> On 02/05/2023 04:40, Alexei Starovoitov wrote:
> >> won't the pahole v1.25 changes help here; can you try applying
> >> https://lore.kernel.org/bpf/1675949331-27935-1-git-send-email-alan.maguire@oracle.com/
> >> ...and build using pahole; this should eliminate any functions
> >> with inconsistent prototypes via
> >> --skip_encoding_btf_inconsistent_proto
> >> Do not encode functions with multiple inconsistent prototypes or
> >> unexpected register use for their parameters, where the regis‐
> >> ters used do not match calling conventions.
> >> I'll check this at my end too.
> >> Alexei, if this works should we look at applying the above
> >> again to bpf-next? If so I'll resend the patch.
> > I've lost the track with pahole fixes.
> > Did Arnaldo re-tag pahole 1.25 or released 1.26 with the fixes?
> My understanding is the pahole repo is prepped for v1.25,
> meaning that it will build with version 1.25 but it is
> not officially released yet; Arnaldo will correct me if
> I've got that wrong.
Right, pahole 1.25 was released with those options/fixes.
- Arnaldo
> > Alan,
> > please submit a fresh patch for bpf-next to enable
> > --skip_encoding_btf_inconsistent_proto, so it can go through CI.
> > I cannot test all combinations manually.
> >
>
> Done; see [1]. If CI picks up the latest version from
> the dwarves repo, it will see version 1.25.
>
> I've tested the above specific case along with general testing
> using latest pahole.
>
> When running pahole with --verbose and
> --skip_encoding_btf_inconsistent_proto
>
> we see:
>
> skipping addition of 'mm_init'(mm_init) due to multiple inconsistent
> function prototypes
>
>
> ...and
>
> bpftool btf dump file vmlinux |grep mm_init
>
> shows the function is not encoded in BTF due to these inconsistencies.
>
> Thanks!
>
> Alan
>
> [1]
> https://lore.kernel.org/bpf/20230510130241.1696561-1-alan.maguire@oracle.com/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 0/2] bpf: Fix issues caused by bpf trampoline
2023-04-24 16:11 [PATCH bpf-next 0/2] bpf: Fix issues caused by bpf trampoline Yafang Shao
2023-04-24 16:11 ` [PATCH bpf-next 1/2] bpf: Add __rcu_read_{lock,unlock} into btf id deny list Yafang Shao
2023-04-24 16:11 ` [PATCH bpf-next 2/2] fork: Rename mm_init to task_mm_init Yafang Shao
@ 2023-04-24 21:30 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-04-24 21:30 UTC (permalink / raw)
To: Yafang Shao
Cc: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
kpsingh, sdf, haoluo, jolsa, bpf
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Mon, 24 Apr 2023 16:11:02 +0000 you wrote:
> The panic caused by fentry[1] drives me to write a testcase[2] to check if
> it safe to attach other kernel functions. Unsurprisingly it catches some
> issues. This patchset fixes them.
>
> [1]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=c11bd046485d7bf1ca200db0e7d0bdc4bafdd395
> [2]. https://github.com/laoar/ebpf/tree/main/fentry
>
> [...]
Here is the summary with links:
- [bpf-next,1/2] bpf: Add __rcu_read_{lock,unlock} into btf id deny list
https://git.kernel.org/bpf/bpf-next/c/a0c109dcafb1
- [bpf-next,2/2] fork: Rename mm_init to task_mm_init
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 15+ messages in thread