* Re: [PATCH bpf] bpf: fix recursive lock when verdict program return SK_PASS
From: Martin KaFai Lau @ 2024-11-08 21:07 UTC
To: mrpre, John Fastabend, Jakub Sitnicki
Cc: edumazet, davem, dsahern, kuba, pabeni, netdev, bpf, linux-kernel,
    Vincent Whitchurch

On 11/8/24 1:03 PM, Martin KaFai Lau wrote:
> On 11/6/24 4:44 AM, mrpre wrote:
>> When the stream_verdict program returns SK_PASS, it places the received skb
>> into its own receive queue, but a recursive lock eventually occurs, leading
>> to an operating system deadlock. This issue has been present since v6.9.
>>
>> '''
>> sk_psock_strp_data_ready
>>     write_lock_bh(&sk->sk_callback_lock)
>>     strp_data_ready
>>       strp_read_sock
>>         read_sock -> tcp_read_sock
>>           strp_recv
>>             cb.rcv_msg -> sk_psock_strp_read
>>               # now stream_verdict return SK_PASS without peer sock assign
>>               __SK_PASS = sk_psock_map_verd(SK_PASS, NULL)
>>               sk_psock_verdict_apply
>>                 sk_psock_skb_ingress_self
>>                   sk_psock_skb_ingress_enqueue
>>                     sk_psock_data_ready
>>                       read_lock_bh(&sk->sk_callback_lock) <= dead lock
>>
>> '''
>>
>> This topic has been discussed before, but it has not been fixed.
>> Previous discussion:
>> https://lore.kernel.org/all/6684a5864ec86_403d20898@john.notmuch
>
> Is the selftest included in this link still useful to reproduce this bug?
> If yes, please include that also.
>
>>
>> Fixes: 6648e613226e ("bpf, skmsg: Fix NULL pointer dereference in
>> sk_psock_skb_ingress_enqueue")
>> Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
>> Signed-off-by: Jiayuan Chen <mrpre@163.com>
>
> Please also use the real name in the author (i.e. the email sender). The patch
> needs a real author name also. I had manually fixed one of your earlier
> lock_sock fix before applying.

and the bpf mailing list address has a typo in the original patch email... I
fixed that in this reply.

>
> pw-bot: cr
>
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>
> The patch and the earlier discussion make sense to me.
> John and JakubS, please help to take another look in the next respin.
>
* Re: [PATCH bpf] bpf: fix recursive lock when verdict program return SK_PASS
From: Jiayuan Chen @ 2024-11-10 5:04 UTC
To: Martin KaFai Lau; +Cc: bpf

On Fri, Nov 08, 2024 at 01:07:57PM +0800, Martin KaFai Lau wrote:
> On 11/8/24 1:03 PM, Martin KaFai Lau wrote:
> > On 11/6/24 4:44 AM, mrpre wrote:
> > > When the stream_verdict program returns SK_PASS, it places the received skb
> > > into its own receive queue, but a recursive lock eventually occurs, leading
> > > to an operating system deadlock. This issue has been present since v6.9.
> > >
> > > '''
> > > sk_psock_strp_data_ready
> > >     write_lock_bh(&sk->sk_callback_lock)
> > >     strp_data_ready
> > >       strp_read_sock
> > >         read_sock -> tcp_read_sock
> > >           strp_recv
> > >             cb.rcv_msg -> sk_psock_strp_read
> > >               # now stream_verdict return SK_PASS without peer sock assign
> > >               __SK_PASS = sk_psock_map_verd(SK_PASS, NULL)
> > >               sk_psock_verdict_apply
> > >                 sk_psock_skb_ingress_self
> > >                   sk_psock_skb_ingress_enqueue
> > >                     sk_psock_data_ready
> > >                       read_lock_bh(&sk->sk_callback_lock) <= dead lock
> > >
> > > '''
> > >
> > > This topic has been discussed before, but it has not been fixed.
> > > Previous discussion:
> > > https://lore.kernel.org/all/6684a5864ec86_403d20898@john.notmuch
> >
> > Is the selftest included in this link still useful to reproduce this bug?
> > If yes, please include that also.
> >
> > >
> > > Fixes: 6648e613226e ("bpf, skmsg: Fix NULL pointer dereference in
> > > sk_psock_skb_ingress_enqueue")
> > > Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> > > Signed-off-by: Jiayuan Chen <mrpre@163.com>
> >
> > Please also use the real name in the author (i.e. the email sender). The
> > patch needs a real author name also. I had manually fixed one of your
> > earlier lock_sock fix before applying.
>
> and the bpf mailing list address has a typo in the original patch email... I
> fixed that in this reply.
>
> >
> > pw-bot: cr
> >
> > > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> >
> > The patch and the earlier discussion make sense to me.
> > John and JakubS, please help to take another look in the next respin.
> >

Hi Martin,

Thank you for the reminder. I've added a test case in the new patch, and I
found that the deadlock can be reproduced 100% of the time whenever the test
cases are run. This is indeed a very dangerous defect.

New patch:
https://lore.kernel.org/bpf/20241109150305.141759-1-mrpre@163.com/T/#t

(Additionally, I followed your guidance and used the correct names in the new
patch. Thanks again.)
Thread overview: 2+ messages
[not found] <20241106124431.5583-1-mrpre@163.com>
[not found] ` <2939664d-e38d-4ac4-b8cf-3ef60c5fd5c6@linux.dev>
2024-11-08 21:07 ` [PATCH bpf] bpf: fix recursive lock when verdict program return SK_PASS Martin KaFai Lau
2024-11-10 5:04 ` Jiayuan Chen