* mptcp splat @ 2024-03-27 16:43 Alexei Starovoitov 2024-03-27 16:56 ` Paolo Abeni 0 siblings, 1 reply; 10+ messages in thread From: Alexei Starovoitov @ 2024-03-27 16:43 UTC (permalink / raw) To: Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Mat Martineau, Jakub Kicinski, Martin KaFai Lau Hi, I ffwded bpf tree with the recent net fixes and caught this: [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 subflow_data_ready+0x147/0x1c0 [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: bpf_test_no_cfi(O)] [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G O 6.8.0-12873-g2c43c33bfd23 #1014 #[ 48.467143] Call Trace: [ 48.469094] <TASK> [ 48.472159] ? __warn+0x80/0x180 [ 48.475019] ? subflow_data_ready+0x147/0x1c0 [ 48.478068] ? report_bug+0x189/0x1c0 [ 48.480725] ? handle_bug+0x36/0x70 [ 48.483061] ? exc_invalid_op+0x13/0x60 [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 [ 48.488754] ? subflow_data_ready+0x147/0x1c0 [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 [ 48.495026] sk_setsockopt+0x6c0/0x1540 It doesn't reproduce all the time though. Some race? Known issue? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 16:43 mptcp splat Alexei Starovoitov @ 2024-03-27 16:56 ` Paolo Abeni 2024-03-27 17:00 ` Alexei Starovoitov 0 siblings, 1 reply; 10+ messages in thread From: Paolo Abeni @ 2024-03-27 16:56 UTC (permalink / raw) To: Alexei Starovoitov, Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Mat Martineau, Jakub Kicinski, Martin KaFai Lau On Wed, 2024-03-27 at 09:43 -0700, Alexei Starovoitov wrote: > I ffwded bpf tree with the recent net fixes and caught this: > > [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 > subflow_data_ready+0x147/0x1c0 > [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: > bpf_test_no_cfi(O)] > [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G > O 6.8.0-12873-g2c43c33bfd23 #1014 > #[ 48.467143] Call Trace: > [ 48.469094] <TASK> > [ 48.472159] ? __warn+0x80/0x180 > [ 48.475019] ? subflow_data_ready+0x147/0x1c0 > [ 48.478068] ? report_bug+0x189/0x1c0 > [ 48.480725] ? handle_bug+0x36/0x70 > [ 48.483061] ? exc_invalid_op+0x13/0x60 > [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 > [ 48.488754] ? subflow_data_ready+0x147/0x1c0 > [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 > [ 48.495026] sk_setsockopt+0x6c0/0x1540 > > It doesn't reproduce all the time though. > Some race? > Known issue? It was not known to me. Looks like something related to not so recent changes (rcvlowat support). Definitely looks lie a race. If you could share more info about the running context and/or a full decoded splat it could help, thanks! Paolo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 16:56 ` Paolo Abeni @ 2024-03-27 17:00 ` Alexei Starovoitov 2024-03-27 18:32 ` Paolo Abeni 0 siblings, 1 reply; 10+ messages in thread From: Alexei Starovoitov @ 2024-03-27 17:00 UTC (permalink / raw) To: Paolo Abeni Cc: Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Mat Martineau, Jakub Kicinski, Martin KaFai Lau On Wed, Mar 27, 2024 at 9:56 AM Paolo Abeni <pabeni@redhat.com> wrote: > > On Wed, 2024-03-27 at 09:43 -0700, Alexei Starovoitov wrote: > > I ffwded bpf tree with the recent net fixes and caught this: > > > > [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 > > subflow_data_ready+0x147/0x1c0 > > [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: > > bpf_test_no_cfi(O)] > > [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G > > O 6.8.0-12873-g2c43c33bfd23 #1014 > > #[ 48.467143] Call Trace: > > [ 48.469094] <TASK> > > [ 48.472159] ? __warn+0x80/0x180 > > [ 48.475019] ? subflow_data_ready+0x147/0x1c0 > > [ 48.478068] ? report_bug+0x189/0x1c0 > > [ 48.480725] ? handle_bug+0x36/0x70 > > [ 48.483061] ? exc_invalid_op+0x13/0x60 > > [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 > > [ 48.488754] ? subflow_data_ready+0x147/0x1c0 > > [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 > > [ 48.495026] sk_setsockopt+0x6c0/0x1540 > > > > It doesn't reproduce all the time though. > > Some race? > > Known issue? > > It was not known to me. Looks like something related to not so recent > changes (rcvlowat support). > > Definitely looks lie a race. > > If you could share more info about the running context and/or a full > decoded splat it could help, thanks! This is just running bpf selftests in parallel: test_progs -j The end of the splat: [ 48.500075] __bpf_setsockopt+0x6f/0x90 [ 48.503124] bpf_sock_ops_setsockopt+0x3c/0x90 [ 48.506053] bpf_prog_509ce5db2c7f9981_bpf_test_sockopt_int+0xb4/0x11b [ 48.510178] bpf_prog_dce07e362d941d2b_bpf_test_socket_sockopt+0x12b/0x132 [ 48.515070] bpf_prog_348c9b5faaf10092_skops_sockopt+0x954/0xe86 [ 48.519050] __cgroup_bpf_run_filter_sock_ops+0xbc/0x250 [ 48.523836] tcp_connect+0x879/0x1160 [ 48.527239] ? ktime_get_with_offset+0x8d/0x140 [ 48.531362] tcp_v6_connect+0x50c/0x870 [ 48.534609] ? mptcp_connect+0x129/0x280 [ 48.538483] mptcp_connect+0x129/0x280 [ 48.542436] __inet_stream_connect+0xce/0x370 [ 48.546664] ? rcu_is_watching+0xd/0x40 [ 48.549063] ? lock_release+0x1c4/0x280 [ 48.553497] ? inet_stream_connect+0x22/0x50 [ 48.557289] ? rcu_is_watching+0xd/0x40 [ 48.560430] inet_stream_connect+0x36/0x50 [ 48.563604] bpf_trampoline_6442491565+0x49/0xef [ 48.567770] ? security_socket_connect+0x34/0x50 [ 48.575400] inet_stream_connect+0x5/0x50 [ 48.577721] __sys_connect+0x63/0x90 [ 48.580189] ? bpf_trace_run2+0xb0/0x1a0 [ 48.583171] ? rcu_is_watching+0xd/0x40 [ 48.585802] ? syscall_trace_enter+0xfb/0x1e0 [ 48.588836] __x64_sys_connect+0x14/0x20 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 17:00 ` Alexei Starovoitov @ 2024-03-27 18:32 ` Paolo Abeni 2024-03-27 18:45 ` Mat Martineau 2024-03-27 18:50 ` Alexei Starovoitov 0 siblings, 2 replies; 10+ messages in thread From: Paolo Abeni @ 2024-03-27 18:32 UTC (permalink / raw) To: Alexei Starovoitov Cc: Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Mat Martineau, Jakub Kicinski, Martin KaFai Lau On Wed, 2024-03-27 at 10:00 -0700, Alexei Starovoitov wrote: > On Wed, Mar 27, 2024 at 9:56 AM Paolo Abeni <pabeni@redhat.com> wrote: > > > > On Wed, 2024-03-27 at 09:43 -0700, Alexei Starovoitov wrote: > > > I ffwded bpf tree with the recent net fixes and caught this: > > > > > > [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 > > > subflow_data_ready+0x147/0x1c0 > > > [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: > > > bpf_test_no_cfi(O)] > > > [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G > > > O 6.8.0-12873-g2c43c33bfd23 #1014 > > > #[ 48.467143] Call Trace: > > > [ 48.469094] <TASK> > > > [ 48.472159] ? __warn+0x80/0x180 > > > [ 48.475019] ? subflow_data_ready+0x147/0x1c0 > > > [ 48.478068] ? report_bug+0x189/0x1c0 > > > [ 48.480725] ? handle_bug+0x36/0x70 > > > [ 48.483061] ? exc_invalid_op+0x13/0x60 > > > [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 > > > [ 48.488754] ? subflow_data_ready+0x147/0x1c0 > > > [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 > > > [ 48.495026] sk_setsockopt+0x6c0/0x1540 > > > > > > It doesn't reproduce all the time though. > > > Some race? > > > Known issue? > > > > It was not known to me. Looks like something related to not so recent > > changes (rcvlowat support). > > > > Definitely looks lie a race. > > > > If you could share more info about the running context and/or a full > > decoded splat it could help, thanks! > > This is just running bpf selftests in parallel: > test_progs -j > > The end of the splat: > [ 48.500075] __bpf_setsockopt+0x6f/0x90 > [ 48.503124] bpf_sock_ops_setsockopt+0x3c/0x90 > [ 48.506053] bpf_prog_509ce5db2c7f9981_bpf_test_sockopt_int+0xb4/0x11b > [ 48.510178] bpf_prog_dce07e362d941d2b_bpf_test_socket_sockopt+0x12b/0x132 > [ 48.515070] bpf_prog_348c9b5faaf10092_skops_sockopt+0x954/0xe86 > [ 48.519050] __cgroup_bpf_run_filter_sock_ops+0xbc/0x250 > [ 48.523836] tcp_connect+0x879/0x1160 > [ 48.527239] ? ktime_get_with_offset+0x8d/0x140 > [ 48.531362] tcp_v6_connect+0x50c/0x870 > [ 48.534609] ? mptcp_connect+0x129/0x280 > [ 48.538483] mptcp_connect+0x129/0x280 > [ 48.542436] __inet_stream_connect+0xce/0x370 > [ 48.546664] ? rcu_is_watching+0xd/0x40 > [ 48.549063] ? lock_release+0x1c4/0x280 > [ 48.553497] ? inet_stream_connect+0x22/0x50 > [ 48.557289] ? rcu_is_watching+0xd/0x40 > [ 48.560430] inet_stream_connect+0x36/0x50 > [ 48.563604] bpf_trampoline_6442491565+0x49/0xef > [ 48.567770] ? security_socket_connect+0x34/0x50 > [ 48.575400] inet_stream_connect+0x5/0x50 > [ 48.577721] __sys_connect+0x63/0x90 > [ 48.580189] ? bpf_trace_run2+0xb0/0x1a0 > [ 48.583171] ? rcu_is_watching+0xd/0x40 > [ 48.585802] ? syscall_trace_enter+0xfb/0x1e0 > [ 48.588836] __x64_sys_connect+0x14/0x20 Ouch, it looks bad. BPF should not allow any action on mptcp subflows that go through sk_socket. They touch the mptcp main socket, which is _not_ protected by the subflow socket lock. AFICS currently the relevant set of racing sockopt allowed by bpf boils down to SO_RCVLOWAT only - sk_setsockopt(SO_RCVLOWAT) will call sk- >sk_socket->ops->set_rcvlowat() So something like the following (completely untested) should possibly address the issue at hand, but I think it would be better/safer completely disable ebpf on mptcp subflows, WDYT? Thanks, Paolo --- diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index dcd1c76d2a3b..6e5e64c2cf89 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -1493,6 +1493,9 @@ int mptcp_set_rcvlowat(struct sock *sk, int val) struct mptcp_subflow_context *subflow; int space, cap; + if (has_current_bpf_ctx()) + return -EINVAL; + if (sk->sk_userlocks & SOCK_RCVBUF_LOCK) cap = sk->sk_rcvbuf >> 1; else ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 18:32 ` Paolo Abeni @ 2024-03-27 18:45 ` Mat Martineau 2024-03-27 18:50 ` Alexei Starovoitov 1 sibling, 0 replies; 10+ messages in thread From: Mat Martineau @ 2024-03-27 18:45 UTC (permalink / raw) To: Paolo Abeni Cc: Alexei Starovoitov, Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Jakub Kicinski, Martin KaFai Lau [-- Attachment #1: Type: text/plain, Size: 3641 bytes --] On Wed, 27 Mar 2024, Paolo Abeni wrote: > On Wed, 2024-03-27 at 10:00 -0700, Alexei Starovoitov wrote: >> On Wed, Mar 27, 2024 at 9:56 AM Paolo Abeni <pabeni@redhat.com> wrote: >>> >>> On Wed, 2024-03-27 at 09:43 -0700, Alexei Starovoitov wrote: >>>> I ffwded bpf tree with the recent net fixes and caught this: >>>> >>>> [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 >>>> subflow_data_ready+0x147/0x1c0 >>>> [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: >>>> bpf_test_no_cfi(O)] >>>> [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G >>>> O 6.8.0-12873-g2c43c33bfd23 #1014 >>>> #[ 48.467143] Call Trace: >>>> [ 48.469094] <TASK> >>>> [ 48.472159] ? __warn+0x80/0x180 >>>> [ 48.475019] ? subflow_data_ready+0x147/0x1c0 >>>> [ 48.478068] ? report_bug+0x189/0x1c0 >>>> [ 48.480725] ? handle_bug+0x36/0x70 >>>> [ 48.483061] ? exc_invalid_op+0x13/0x60 >>>> [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 >>>> [ 48.488754] ? subflow_data_ready+0x147/0x1c0 >>>> [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 >>>> [ 48.495026] sk_setsockopt+0x6c0/0x1540 >>>> >>>> It doesn't reproduce all the time though. >>>> Some race? >>>> Known issue? >>> >>> It was not known to me. Looks like something related to not so recent >>> changes (rcvlowat support). >>> >>> Definitely looks lie a race. >>> >>> If you could share more info about the running context and/or a full >>> decoded splat it could help, thanks! >> >> This is just running bpf selftests in parallel: >> test_progs -j >> >> The end of the splat: >> [ 48.500075] __bpf_setsockopt+0x6f/0x90 >> [ 48.503124] bpf_sock_ops_setsockopt+0x3c/0x90 >> [ 48.506053] bpf_prog_509ce5db2c7f9981_bpf_test_sockopt_int+0xb4/0x11b >> [ 48.510178] bpf_prog_dce07e362d941d2b_bpf_test_socket_sockopt+0x12b/0x132 >> [ 48.515070] bpf_prog_348c9b5faaf10092_skops_sockopt+0x954/0xe86 >> [ 48.519050] __cgroup_bpf_run_filter_sock_ops+0xbc/0x250 >> [ 48.523836] tcp_connect+0x879/0x1160 >> [ 48.527239] ? ktime_get_with_offset+0x8d/0x140 >> [ 48.531362] tcp_v6_connect+0x50c/0x870 >> [ 48.534609] ? mptcp_connect+0x129/0x280 >> [ 48.538483] mptcp_connect+0x129/0x280 >> [ 48.542436] __inet_stream_connect+0xce/0x370 >> [ 48.546664] ? rcu_is_watching+0xd/0x40 >> [ 48.549063] ? lock_release+0x1c4/0x280 >> [ 48.553497] ? inet_stream_connect+0x22/0x50 >> [ 48.557289] ? rcu_is_watching+0xd/0x40 >> [ 48.560430] inet_stream_connect+0x36/0x50 >> [ 48.563604] bpf_trampoline_6442491565+0x49/0xef >> [ 48.567770] ? security_socket_connect+0x34/0x50 >> [ 48.575400] inet_stream_connect+0x5/0x50 >> [ 48.577721] __sys_connect+0x63/0x90 >> [ 48.580189] ? bpf_trace_run2+0xb0/0x1a0 >> [ 48.583171] ? rcu_is_watching+0xd/0x40 >> [ 48.585802] ? syscall_trace_enter+0xfb/0x1e0 >> [ 48.588836] __x64_sys_connect+0x14/0x20 > > Ouch, it looks bad. BPF should not allow any action on mptcp subflows > that go through sk_socket. They touch the mptcp main socket, which is > _not_ protected by the subflow socket lock. > > AFICS currently the relevant set of racing sockopt allowed by bpf boils > down to SO_RCVLOWAT only - sk_setsockopt(SO_RCVLOWAT) will call sk- >> sk_socket->ops->set_rcvlowat() > > So something like the following (completely untested) should possibly > address the issue at hand, but I think it would be better/safer > completely disable ebpf on mptcp subflows, WDYT? > Paolo - I agree that the MPTCP socket needs to manage any changes to the subflow sockets, so ebpf will only exercise control of subflows through the MPTCP socket. - Mat ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 18:32 ` Paolo Abeni 2024-03-27 18:45 ` Mat Martineau @ 2024-03-27 18:50 ` Alexei Starovoitov 2024-03-27 22:16 ` Martin KaFai Lau 1 sibling, 1 reply; 10+ messages in thread From: Alexei Starovoitov @ 2024-03-27 18:50 UTC (permalink / raw) To: Paolo Abeni Cc: Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Mat Martineau, Jakub Kicinski, Martin KaFai Lau On Wed, Mar 27, 2024 at 11:33 AM Paolo Abeni <pabeni@redhat.com> wrote: > > On Wed, 2024-03-27 at 10:00 -0700, Alexei Starovoitov wrote: > > On Wed, Mar 27, 2024 at 9:56 AM Paolo Abeni <pabeni@redhat.com> wrote: > > > > > > On Wed, 2024-03-27 at 09:43 -0700, Alexei Starovoitov wrote: > > > > I ffwded bpf tree with the recent net fixes and caught this: > > > > > > > > [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 > > > > subflow_data_ready+0x147/0x1c0 > > > > [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: > > > > bpf_test_no_cfi(O)] > > > > [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G > > > > O 6.8.0-12873-g2c43c33bfd23 #1014 > > > > #[ 48.467143] Call Trace: > > > > [ 48.469094] <TASK> > > > > [ 48.472159] ? __warn+0x80/0x180 > > > > [ 48.475019] ? subflow_data_ready+0x147/0x1c0 > > > > [ 48.478068] ? report_bug+0x189/0x1c0 > > > > [ 48.480725] ? handle_bug+0x36/0x70 > > > > [ 48.483061] ? exc_invalid_op+0x13/0x60 > > > > [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 > > > > [ 48.488754] ? subflow_data_ready+0x147/0x1c0 > > > > [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 > > > > [ 48.495026] sk_setsockopt+0x6c0/0x1540 > > > > > > > > It doesn't reproduce all the time though. > > > > Some race? > > > > Known issue? > > > > > > It was not known to me. Looks like something related to not so recent > > > changes (rcvlowat support). > > > > > > Definitely looks lie a race. > > > > > > If you could share more info about the running context and/or a full > > > decoded splat it could help, thanks! > > > > This is just running bpf selftests in parallel: > > test_progs -j > > > > The end of the splat: > > [ 48.500075] __bpf_setsockopt+0x6f/0x90 > > [ 48.503124] bpf_sock_ops_setsockopt+0x3c/0x90 > > [ 48.506053] bpf_prog_509ce5db2c7f9981_bpf_test_sockopt_int+0xb4/0x11b > > [ 48.510178] bpf_prog_dce07e362d941d2b_bpf_test_socket_sockopt+0x12b/0x132 > > [ 48.515070] bpf_prog_348c9b5faaf10092_skops_sockopt+0x954/0xe86 > > [ 48.519050] __cgroup_bpf_run_filter_sock_ops+0xbc/0x250 > > [ 48.523836] tcp_connect+0x879/0x1160 > > [ 48.527239] ? ktime_get_with_offset+0x8d/0x140 > > [ 48.531362] tcp_v6_connect+0x50c/0x870 > > [ 48.534609] ? mptcp_connect+0x129/0x280 > > [ 48.538483] mptcp_connect+0x129/0x280 > > [ 48.542436] __inet_stream_connect+0xce/0x370 > > [ 48.546664] ? rcu_is_watching+0xd/0x40 > > [ 48.549063] ? lock_release+0x1c4/0x280 > > [ 48.553497] ? inet_stream_connect+0x22/0x50 > > [ 48.557289] ? rcu_is_watching+0xd/0x40 > > [ 48.560430] inet_stream_connect+0x36/0x50 > > [ 48.563604] bpf_trampoline_6442491565+0x49/0xef > > [ 48.567770] ? security_socket_connect+0x34/0x50 > > [ 48.575400] inet_stream_connect+0x5/0x50 > > [ 48.577721] __sys_connect+0x63/0x90 > > [ 48.580189] ? bpf_trace_run2+0xb0/0x1a0 > > [ 48.583171] ? rcu_is_watching+0xd/0x40 > > [ 48.585802] ? syscall_trace_enter+0xfb/0x1e0 > > [ 48.588836] __x64_sys_connect+0x14/0x20 > > Ouch, it looks bad. BPF should not allow any action on mptcp subflows > that go through sk_socket. They touch the mptcp main socket, which is > _not_ protected by the subflow socket lock. > > AFICS currently the relevant set of racing sockopt allowed by bpf boils > down to SO_RCVLOWAT only - sk_setsockopt(SO_RCVLOWAT) will call sk- > >sk_socket->ops->set_rcvlowat() > > So something like the following (completely untested) should possibly > address the issue at hand, but I think it would be better/safer > completely disable ebpf on mptcp subflows, WDYT? > > Thanks, > > Paolo > > --- > diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c > index dcd1c76d2a3b..6e5e64c2cf89 100644 > --- a/net/mptcp/sockopt.c > +++ b/net/mptcp/sockopt.c > @@ -1493,6 +1493,9 @@ int mptcp_set_rcvlowat(struct sock *sk, int val) > struct mptcp_subflow_context *subflow; > int space, cap; > > + if (has_current_bpf_ctx()) > + return -EINVAL; > + Looks fine to me. Martin, Do you have any better ideas? The splat explains the race. In this case setget_sockopt test happen to run in parallel with mptcp/bpf test and the former one was TCP connect request but it was for subflow. We can disable that callback when tcp flow is a subflow, but that doesn't feel right. I think Paolo's targeted fix is cleaner. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 18:50 ` Alexei Starovoitov @ 2024-03-27 22:16 ` Martin KaFai Lau 2024-03-28 17:35 ` Matthieu Baerts 0 siblings, 1 reply; 10+ messages in thread From: Martin KaFai Lau @ 2024-03-27 22:16 UTC (permalink / raw) To: Alexei Starovoitov, Paolo Abeni Cc: Network Development, bpf, MPTCP Upstream, Matthieu Baerts, Mat Martineau, Jakub Kicinski, Martin KaFai Lau On 3/27/24 11:50 AM, Alexei Starovoitov wrote: > On Wed, Mar 27, 2024 at 11:33 AM Paolo Abeni <pabeni@redhat.com> wrote: >> >> On Wed, 2024-03-27 at 10:00 -0700, Alexei Starovoitov wrote: >>> On Wed, Mar 27, 2024 at 9:56 AM Paolo Abeni <pabeni@redhat.com> wrote: >>>> >>>> On Wed, 2024-03-27 at 09:43 -0700, Alexei Starovoitov wrote: >>>>> I ffwded bpf tree with the recent net fixes and caught this: >>>>> >>>>> [ 48.386337] WARNING: CPU: 32 PID: 3276 at net/mptcp/subflow.c:1430 >>>>> subflow_data_ready+0x147/0x1c0 >>>>> [ 48.392012] Modules linked in: dummy bpf_testmod(O) [last unloaded: >>>>> bpf_test_no_cfi(O)] >>>>> [ 48.396609] CPU: 32 PID: 3276 Comm: test_progs Tainted: G >>>>> O 6.8.0-12873-g2c43c33bfd23 #1014 >>>>> #[ 48.467143] Call Trace: >>>>> [ 48.469094] <TASK> >>>>> [ 48.472159] ? __warn+0x80/0x180 >>>>> [ 48.475019] ? subflow_data_ready+0x147/0x1c0 >>>>> [ 48.478068] ? report_bug+0x189/0x1c0 >>>>> [ 48.480725] ? handle_bug+0x36/0x70 >>>>> [ 48.483061] ? exc_invalid_op+0x13/0x60 >>>>> [ 48.485809] ? asm_exc_invalid_op+0x16/0x20 >>>>> [ 48.488754] ? subflow_data_ready+0x147/0x1c0 >>>>> [ 48.492159] mptcp_set_rcvlowat+0x79/0x1d0 >>>>> [ 48.495026] sk_setsockopt+0x6c0/0x1540 >>>>> >>>>> It doesn't reproduce all the time though. >>>>> Some race? >>>>> Known issue? >>>> >>>> It was not known to me. Looks like something related to not so recent >>>> changes (rcvlowat support). >>>> >>>> Definitely looks lie a race. >>>> >>>> If you could share more info about the running context and/or a full >>>> decoded splat it could help, thanks! >>> >>> This is just running bpf selftests in parallel: >>> test_progs -j >>> >>> The end of the splat: >>> [ 48.500075] __bpf_setsockopt+0x6f/0x90 >>> [ 48.503124] bpf_sock_ops_setsockopt+0x3c/0x90 >>> [ 48.506053] bpf_prog_509ce5db2c7f9981_bpf_test_sockopt_int+0xb4/0x11b >>> [ 48.510178] bpf_prog_dce07e362d941d2b_bpf_test_socket_sockopt+0x12b/0x132 >>> [ 48.515070] bpf_prog_348c9b5faaf10092_skops_sockopt+0x954/0xe86 >>> [ 48.519050] __cgroup_bpf_run_filter_sock_ops+0xbc/0x250 >>> [ 48.523836] tcp_connect+0x879/0x1160 >>> [ 48.527239] ? ktime_get_with_offset+0x8d/0x140 >>> [ 48.531362] tcp_v6_connect+0x50c/0x870 >>> [ 48.534609] ? mptcp_connect+0x129/0x280 >>> [ 48.538483] mptcp_connect+0x129/0x280 >>> [ 48.542436] __inet_stream_connect+0xce/0x370 >>> [ 48.546664] ? rcu_is_watching+0xd/0x40 >>> [ 48.549063] ? lock_release+0x1c4/0x280 >>> [ 48.553497] ? inet_stream_connect+0x22/0x50 >>> [ 48.557289] ? rcu_is_watching+0xd/0x40 >>> [ 48.560430] inet_stream_connect+0x36/0x50 >>> [ 48.563604] bpf_trampoline_6442491565+0x49/0xef >>> [ 48.567770] ? security_socket_connect+0x34/0x50 >>> [ 48.575400] inet_stream_connect+0x5/0x50 >>> [ 48.577721] __sys_connect+0x63/0x90 >>> [ 48.580189] ? bpf_trace_run2+0xb0/0x1a0 >>> [ 48.583171] ? rcu_is_watching+0xd/0x40 >>> [ 48.585802] ? syscall_trace_enter+0xfb/0x1e0 >>> [ 48.588836] __x64_sys_connect+0x14/0x20 >> >> Ouch, it looks bad. BPF should not allow any action on mptcp subflows >> that go through sk_socket. They touch the mptcp main socket, which is >> _not_ protected by the subflow socket lock. >> >> AFICS currently the relevant set of racing sockopt allowed by bpf boils >> down to SO_RCVLOWAT only - sk_setsockopt(SO_RCVLOWAT) will call sk- >>> sk_socket->ops->set_rcvlowat() >> >> So something like the following (completely untested) should possibly >> address the issue at hand, but I think it would be better/safer >> completely disable ebpf on mptcp subflows, WDYT? >> >> Thanks, >> >> Paolo >> >> --- >> diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c >> index dcd1c76d2a3b..6e5e64c2cf89 100644 >> --- a/net/mptcp/sockopt.c >> +++ b/net/mptcp/sockopt.c >> @@ -1493,6 +1493,9 @@ int mptcp_set_rcvlowat(struct sock *sk, int val) >> struct mptcp_subflow_context *subflow; >> int space, cap; >> >> + if (has_current_bpf_ctx()) >> + return -EINVAL; >> + > > Looks fine to me. > > Martin, > > Do you have any better ideas? > > The splat explains the race. > In this case setget_sockopt test happen to run in parallel > with mptcp/bpf test and the former one was TCP connect request > but it was for subflow. > > We can disable that callback when tcp flow is a subflow, > but that doesn't feel right. I am also not sure if we can disable all set/getsockopt for tcp subflows. Not clear if there is use case that depends on this to setsockopt the subflow. > I think Paolo's targeted fix is cleaner. I will also go with Paolo's fix. The radius is smaller. I don't have a better idea. Unrelated, is there a way to tell if a tcp_sock is a subflow? bpf prog can use it to decide if it wants to setsockopt on a subflow or not. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-27 22:16 ` Martin KaFai Lau @ 2024-03-28 17:35 ` Matthieu Baerts 2024-03-29 16:26 ` Alexei Starovoitov 0 siblings, 1 reply; 10+ messages in thread From: Matthieu Baerts @ 2024-03-28 17:35 UTC (permalink / raw) To: Martin KaFai Lau, Alexei Starovoitov, Paolo Abeni Cc: Network Development, bpf, MPTCP Upstream, Mat Martineau, Jakub Kicinski, Martin KaFai Lau, Geliang Tang Hi Martin, On 27/03/2024 23:16, Martin KaFai Lau wrote: (...) > Unrelated, is there a way to tell if a tcp_sock is a subflow? Yes, you can use "sk_is_mptcp(sk)". Please note that this 'sk' *has* to be a tcp_sock, this is not checked by the helper. That's what is used with bpf_mptcp_sock_from_subflow() https://elixir.bootlin.com/linux/latest/source/net/mptcp/bpf.c#L15 > bpf prog > can use it to decide if it wants to setsockopt on a subflow or not. I think it is important to keep the possibility to set socket options per subflow. If the original issue discussed here is limited to set_rcvlowat(), best to address it there. Cheers, Matt -- Sponsored by the NGI0 Core fund. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-28 17:35 ` Matthieu Baerts @ 2024-03-29 16:26 ` Alexei Starovoitov 2024-03-29 16:34 ` Paolo Abeni 0 siblings, 1 reply; 10+ messages in thread From: Alexei Starovoitov @ 2024-03-29 16:26 UTC (permalink / raw) To: Matthieu Baerts Cc: Martin KaFai Lau, Paolo Abeni, Network Development, bpf, MPTCP Upstream, Mat Martineau, Jakub Kicinski, Martin KaFai Lau, Geliang Tang On Thu, Mar 28, 2024 at 10:35 AM Matthieu Baerts <matttbe@kernel.org> wrote: > > Hi Martin, > > On 27/03/2024 23:16, Martin KaFai Lau wrote: > > (...) > > > Unrelated, is there a way to tell if a tcp_sock is a subflow? > > Yes, you can use "sk_is_mptcp(sk)". Please note that this 'sk' *has* to > be a tcp_sock, this is not checked by the helper. > > That's what is used with bpf_mptcp_sock_from_subflow() > > https://elixir.bootlin.com/linux/latest/source/net/mptcp/bpf.c#L15 > > > bpf prog > > can use it to decide if it wants to setsockopt on a subflow or not. > I think it is important to keep the possibility to set socket options > per subflow. If the original issue discussed here is limited to > set_rcvlowat(), best to address it there. All makes sense to me. Paolo, could you send an official patch? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: mptcp splat 2024-03-29 16:26 ` Alexei Starovoitov @ 2024-03-29 16:34 ` Paolo Abeni 0 siblings, 0 replies; 10+ messages in thread From: Paolo Abeni @ 2024-03-29 16:34 UTC (permalink / raw) To: Alexei Starovoitov, Matthieu Baerts Cc: Martin KaFai Lau, Network Development, bpf, MPTCP Upstream, Mat Martineau, Jakub Kicinski, Martin KaFai Lau, Geliang Tang On Fri, 2024-03-29 at 09:26 -0700, Alexei Starovoitov wrote: > On Thu, Mar 28, 2024 at 10:35 AM Matthieu Baerts <matttbe@kernel.org> wrote: > > > > Hi Martin, > > > > On 27/03/2024 23:16, Martin KaFai Lau wrote: > > > > (...) > > > > > Unrelated, is there a way to tell if a tcp_sock is a subflow? > > > > Yes, you can use "sk_is_mptcp(sk)". Please note that this 'sk' *has* to > > be a tcp_sock, this is not checked by the helper. > > > > That's what is used with bpf_mptcp_sock_from_subflow() > > > > https://elixir.bootlin.com/linux/latest/source/net/mptcp/bpf.c#L15 > > > > > bpf prog > > > can use it to decide if it wants to setsockopt on a subflow or not. > > I think it is important to keep the possibility to set socket options > > per subflow. If the original issue discussed here is limited to > > set_rcvlowat(), best to address it there. > > > All makes sense to me. > > Paolo, > could you send an official patch? Sure, thank you for reminding me. This was falling off my radar. I'll send it after some testing. Thanks! Paolo ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-03-29 16:34 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-27 16:43 mptcp splat Alexei Starovoitov 2024-03-27 16:56 ` Paolo Abeni 2024-03-27 17:00 ` Alexei Starovoitov 2024-03-27 18:32 ` Paolo Abeni 2024-03-27 18:45 ` Mat Martineau 2024-03-27 18:50 ` Alexei Starovoitov 2024-03-27 22:16 ` Martin KaFai Lau 2024-03-28 17:35 ` Matthieu Baerts 2024-03-29 16:26 ` Alexei Starovoitov 2024-03-29 16:34 ` Paolo Abeni
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox