* Kernel bug from adding bpf actions in tc
@ 2018-04-05 15:23 Lucas Bates
2018-04-05 17:27 ` Davide Caratti
0 siblings, 1 reply; 2+ messages in thread
From: Lucas Bates @ 2018-04-05 15:23 UTC (permalink / raw)
To: dcaratti; +Cc: Linux Kernel Network Developers
Hi Davide,
Our overnight tc test runs of net-next revealed a kernel bug on one of
the BPF tests you submitted, d959. The add action completes
successfully, but the bug occurs on the verify when tdc does a get of
the action that was just added. Here's the text of the dump:
[ 61.973632] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 61.974366] PGD 8000000081881067 P4D 8000000081881067 PUD 83784067 PMD 0
[ 61.974986] Oops: 0000 [#1] SMP PTI
[ 61.975309] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd psmouse glue_helper cryptd serio_raw
[ 61.976800] CPU: 28 PID: 1087 Comm: tc Not tainted 4.16.0+ #28
[ 61.977329] RIP: 0010:__bpf_prog_put+0x5/0xe0
[ 61.977731] RSP: 0018:ffff9647c4823788 EFLAGS: 00010202
[ 61.978204] RAX: 0000000000000000 RBX: ffff9647c48237a0 RCX: 000000000000176c
[ 61.978845] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[ 61.979484] RBP: 0000000000000000 R08: 0000000000025be0 R09: ffffffffa4794077
[ 61.980121] R10: ffffdc1bc2130b00 R11: 0000000000000000 R12: 0000000000000000
[ 61.980763] R13: 0000000000000001 R14: ffff8889869969f0 R15: 00000000ffffffea
[ 61.981398] FS: 00007faa72489700(0000) GS:ffff888988f00000(0000) knlGS:0000000000000000
[ 61.982114] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 61.982627] CR2: 0000000000000020 CR3: 0000000085938004 CR4: 00000000003606a0
[ 61.983263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 61.983897] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 61.984531] Call Trace:
[ 61.984766] tcf_bpf_cfg_cleanup+0x2f/0x40
[ 61.985139] tcf_bpf_cleanup+0x3c/0x50
[ 61.985479] ? uncore_event_cpu_online+0x80/0x3c0
[ 61.985922] __tcf_idr_release+0x72/0x150
[ 61.986297] tcf_bpf_init+0x102/0x3e0
[ 61.986637] ? perf_trace_sched_process_exec+0xf4/0x140
[ 61.987108] tcf_action_init_1+0x36c/0x410
[ 61.987482] ? ___slab_alloc+0x218/0x4b0
[ 61.987841] tcf_action_init+0x106/0x190
[ 61.988204] tc_ctl_action+0x11a/0x220
[ 61.988551] rtnetlink_rcv_msg+0x243/0x2f0
[ 61.988931] ? _cond_resched+0x16/0x40
[ 61.989277] ? __kmalloc_node_track_caller+0x1e6/0x2a0
[ 61.989746] ? rtnl_calcit.isra.29+0xe0/0xe0
[ 61.990137] netlink_rcv_skb+0xde/0x110
[ 61.990494] netlink_unicast+0x16d/0x220
[ 61.990858] netlink_sendmsg+0x293/0x370
[ 61.991224] sock_sendmsg+0x36/0x40
[ 61.991559] ___sys_sendmsg+0x2cb/0x2e0
[ 61.991913] ? pagecache_get_page+0x27/0x220
[ 61.992302] ? filemap_fault+0xa2/0x650
[ 61.992651] ? page_add_file_rmap+0x108/0x200
[ 61.993057] ? alloc_set_pte+0x2aa/0x530
[ 61.993419] ? finish_fault+0x4e/0x70
[ 61.993758] ? __handle_mm_fault+0xbfc/0x1110
[ 61.994160] ? __sys_sendmsg+0x53/0x80
[ 61.994506] __sys_sendmsg+0x53/0x80
[ 61.994849] do_syscall_64+0x6e/0x120
[ 61.995189] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 61.995662] RIP: 0033:0x7faa71885bd0
[ 61.995991] RSP: 002b:00007ffccf65bf28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 61.996693] RAX: ffffffffffffffda RBX: 00007ffccf65c050 RCX: 00007faa71885bd0
[ 61.997350] RDX: 0000000000000000 RSI: 00007ffccf65bfa0 RDI: 0000000000000003
[ 61.998001] RBP: 000000005ac513e0 R08: 0000000000000001 R09: 0000000000000000
[ 61.998649] R10: 00000000000005e7 R11: 0000000000000246 R12: 0000000000000000
[ 61.999287] R13: 00007ffccf6600b0 R14: 0000000000000001 R15: 0000000000674600
[ 61.999942] Code: c6 72 00 48 8b 43 20 48 c7 c7 d0 61 9b a5 c7 40 10 00 00 00 00 5b e9 1b d9 74 00 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 <48> 8b 47 20 f0 ff 08 74 01 c3 41 54 41 89 f4 55 48 89 fd 53 66
[ 62.001660] RIP: __bpf_prog_put+0x5/0xe0 RSP: ffff9647c4823788
[ 62.002183] CR2: 0000000000000020
[ 62.002488] ---[ end trace 19b56d1a66dd8e2a ]---
I'm sending this your way because you were the last one to touch this
part of the code. Have you seen this in your own testing? (This
can't be replicated by hand, only by running tdc).
- Lucas
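
The register dump and the "Code:" line above can be cross-checked by hand. The byte marked <48> is the one at the faulting RIP, and the four bytes 48 8b 47 20 decode as a 64-bit load from RDI plus an 8-bit displacement. Since RDI is 0 in the dump, the access lands at 0x20, which matches CR2. The sketch below works through that decode in C. It handles only this one encoding form, and the function names are made up for illustration; they are not kernel or disassembler APIs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode only the form seen in the oops: REX.W + 8B /r with mod=01,
 * i.e. "mov disp8(%base), %reg". Anything else is rejected.
 * Returns 0 on success and reports the base register number and
 * the signed 8-bit displacement.
 */
static int decode_mov_disp8(const uint8_t *code, size_t len,
			    unsigned *base_reg, int8_t *disp)
{
	uint8_t modrm;

	if (len < 4)
		return -1;
	if (code[0] != 0x48 || code[1] != 0x8b)	/* REX.W, MOV r64, r/m64 */
		return -1;

	modrm = code[2];
	if ((modrm >> 6) != 1)		/* mod=01: one displacement byte follows */
		return -1;
	if ((modrm & 7) == 4)		/* rm=100 would mean a SIB byte; not handled */
		return -1;

	*base_reg = modrm & 7;		/* 7 is RDI when REX.B is clear */
	*disp = (int8_t)code[3];
	return 0;
}

/* Effective address of the load: base register value plus displacement. */
static uint64_t fault_address(uint64_t base_value, int8_t disp)
{
	return base_value + (uint64_t)(int64_t)disp;
}
```

Feeding it the bytes {0x48, 0x8b, 0x47, 0x20} gives base register 7 (RDI) and displacement 0x20, and fault_address(0, 0x20) reproduces the 0x20 reported in CR2. This is consistent with __bpf_prog_put() being handed a NULL pointer and reading a field at offset 0x20 from it.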
* Re: Kernel bug from adding bpf actions in tc
From: Davide Caratti @ 2018-04-05 17:27 UTC (permalink / raw)
To: Lucas Bates; +Cc: Linux Kernel Network Developers
On Thu, 2018-04-05 at 11:23 -0400, Lucas Bates wrote:
> Hi Davide,
>
> Our overnight tc test runs of net-next revealed a kernel bug on one of
> the BPF tests you submitted, d959. The add action completes
> successfully, but the bug occurs on the verify when tdc does a get of
> the action that was just added. Here's the text of the dump:
>
Looking at the call trace, I think cfg->filter is NULL when
tcf_bpf_cleanup() is called. Apparently we are in the error path of
tcf_bpf_init(), at a point where

	prog->bpf_ops = cfg.bpf_ops;
	...
	rcu_assign_pointer(prog->filter, cfg.filter);

have not been executed yet.

If tcf_idr_release() is called in this situation, cfg->is_ebpf is set to
true, and bpf_prog_put() can dereference a NULL pointer.
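
The failure mode described above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: every struct and function name here (model_act, model_cfg, and so on) is invented, and it assumes that the action is treated as eBPF whenever bpf_ops is still NULL, which is what the message implies. The guarded variant shows one possible way to avoid the dereference; it is not presented as the patch that was eventually written.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel definitions. */
struct model_prog {
	int refcnt;
};

struct model_act {
	const void *bpf_ops;		/* still NULL on the early error path */
	struct model_prog *filter;	/* still NULL on the early error path */
};

struct model_cfg {
	struct model_prog *filter;
	bool is_ebpf;
};

/* Assumption: "eBPF" is inferred from the absence of classic BPF ops. */
static void model_fill_cfg(const struct model_act *act, struct model_cfg *cfg)
{
	cfg->is_ebpf = (act->bpf_ops == NULL);
	cfg->filter = act->filter;
}

/* Unguarded cleanup: reports whether it would have touched a NULL filter. */
static bool model_cleanup_unguarded(const struct model_cfg *cfg)
{
	if (cfg->is_ebpf)
		return cfg->filter == NULL;	/* a real put would read filter->... */
	return false;
}

/* Guarded cleanup: bails out when there is no filter to release. */
static bool model_cleanup_guarded(const struct model_cfg *cfg)
{
	if (!cfg->filter)
		return false;
	if (cfg->is_ebpf)
		cfg->filter->refcnt--;
	return false;
}

/* Simulate the error path: the action is released before init filled it in. */
static bool model_error_path_hits_null(bool guarded)
{
	struct model_act act = { .bpf_ops = NULL, .filter = NULL };
	struct model_cfg cfg;

	model_fill_cfg(&act, &cfg);
	return guarded ? model_cleanup_guarded(&cfg)
		       : model_cleanup_unguarded(&cfg);
}
```

In this model, model_error_path_hits_null(false) returns true because is_ebpf ends up set while filter is NULL, and model_error_path_hits_null(true) returns false because the guard skips the release.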
I will try to reproduce this in the next few hours and then follow up with a
patch.
thanks!
regards,
--
davide