From: Stanislav Fomichev <sdf@google.com>
To: Aditi Ghag <aditi.ghag@isovalent.com>
Cc: bpf@vger.kernel.org, kafai@fb.com, edumazet@google.com
Subject: Re: [PATCH v4 bpf-next 3/4] bpf,tcp: Avoid taking fast sock lock in iterator
Date: Fri, 24 Mar 2023 14:45:17 -0700 [thread overview]
Message-ID: <ZB4Z7cnF+RDMaKvW@google.com> (raw)
In-Reply-To: <20230323200633.3175753-4-aditi.ghag@isovalent.com>
On 03/23, Aditi Ghag wrote:
> Previously, BPF TCP iterator was acquiring fast version of sock lock that
> disables the BH. This introduced a circular dependency with code paths
> that
> later acquire sockets hash table bucket lock.
> Replace the fast version of sock lock with slow that faciliates BPF
> programs executed from the iterator to destroy TCP listening sockets
> using the bpf_sock_destroy kfunc.
> Here is a stack trace that motivated this change:
> ```
> 1) sock_lock with BH disabled + bucket lock
> lock_acquire+0xcd/0x330
> _raw_spin_lock_bh+0x38/0x50
> inet_unhash+0x96/0xd0
> tcp_set_state+0x6a/0x210
> tcp_abort+0x12b/0x230
> bpf_prog_f4110fb1100e26b5_iter_tcp6_server+0xa3/0xaa
> bpf_iter_run_prog+0x1ff/0x340
> bpf_iter_tcp_seq_show+0xca/0x190
> bpf_seq_read+0x177/0x450
> vfs_read+0xc6/0x300
> ksys_read+0x69/0xf0
> do_syscall_64+0x3c/0x90
> entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 2) sock lock with BH enable
> [ 1.499968] lock_acquire+0xcd/0x330
> [ 1.500316] _raw_spin_lock+0x33/0x40
> [ 1.500670] sk_clone_lock+0x146/0x520
> [ 1.501030] inet_csk_clone_lock+0x1b/0x110
> [ 1.501433] tcp_create_openreq_child+0x22/0x3f0
> [ 1.501873] tcp_v6_syn_recv_sock+0x96/0x940
> [ 1.502284] tcp_check_req+0x137/0x660
> [ 1.502646] tcp_v6_rcv+0xa63/0xe80
> [ 1.502994] ip6_protocol_deliver_rcu+0x78/0x590
> [ 1.503434] ip6_input_finish+0x72/0x140
> [ 1.503818] __netif_receive_skb_one_core+0x63/0xa0
> [ 1.504281] process_backlog+0x79/0x260
> [ 1.504668] __napi_poll.constprop.0+0x27/0x170
> [ 1.505104] net_rx_action+0x14a/0x2a0
> [ 1.505469] __do_softirq+0x165/0x510
> [ 1.505842] do_softirq+0xcd/0x100
> [ 1.506172] __local_bh_enable_ip+0xcc/0xf0
> [ 1.506588] ip6_finish_output2+0x2a8/0xb00
> [ 1.506988] ip6_finish_output+0x274/0x510
> [ 1.507377] ip6_xmit+0x319/0x9b0
> [ 1.507726] inet6_csk_xmit+0x12b/0x2b0
> [ 1.508096] __tcp_transmit_skb+0x549/0xc40
> [ 1.508498] tcp_rcv_state_process+0x362/0x1180
> ```
> Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Don't need fixes because it doesn't trigger without your new
bpf_sock_destroy?
> ---
> net/ipv4/tcp_ipv4.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index ea370afa70ed..f2d370a9450f 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2962,7 +2962,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file
> *seq, void *v)
> struct bpf_iter_meta meta;
> struct bpf_prog *prog;
> struct sock *sk = v;
> - bool slow;
> uid_t uid;
> int ret;
> @@ -2970,7 +2969,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file
> *seq, void *v)
> return 0;
> if (sk_fullsock(sk))
> - slow = lock_sock_fast(sk);
> + lock_sock(sk);
> if (unlikely(sk_unhashed(sk))) {
> ret = SEQ_SKIP;
> @@ -2994,7 +2993,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file
> *seq, void *v)
> unlock:
> if (sk_fullsock(sk))
> - unlock_sock_fast(sk, slow);
> + release_sock(sk);
> return ret;
> }
> --
> 2.34.1
next prev parent reply other threads:[~2023-03-24 21:45 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-23 20:06 [PATCH v4 bpf-next 0/5] bpf-nex: Add socket destroy capability Aditi Ghag
2023-03-23 20:06 ` [PATCH v4 bpf-next 1/4] bpf: Implement batching in UDP iterator Aditi Ghag
2023-03-24 21:56 ` Stanislav Fomichev
2023-03-27 15:52 ` Aditi Ghag
2023-03-27 16:52 ` Stanislav Fomichev
2023-03-27 22:28 ` Martin KaFai Lau
2023-03-28 17:06 ` Aditi Ghag
2023-03-28 21:33 ` Martin KaFai Lau
2023-03-29 16:20 ` Aditi Ghag
2023-03-23 20:06 ` [PATCH v4 bpf-next 2/4] bpf: Add bpf_sock_destroy kfunc Aditi Ghag
2023-03-23 23:58 ` Martin KaFai Lau
2023-03-24 21:37 ` Stanislav Fomichev
2023-03-30 14:42 ` Aditi Ghag
2023-03-30 16:32 ` Stanislav Fomichev
2023-03-30 17:30 ` Martin KaFai Lau
2023-04-03 15:58 ` Aditi Ghag
2023-03-23 20:06 ` [PATCH v4 bpf-next 3/4] bpf,tcp: Avoid taking fast sock lock in iterator Aditi Ghag
2023-03-24 21:45 ` Stanislav Fomichev [this message]
2023-03-28 15:20 ` Aditi Ghag
2023-03-27 22:34 ` Martin KaFai Lau
2023-03-23 20:06 ` [PATCH v4 bpf-next 4/4] selftests/bpf: Add tests for bpf_sock_destroy Aditi Ghag
2023-03-24 21:52 ` Stanislav Fomichev
2023-03-27 15:57 ` Aditi Ghag
2023-03-27 16:54 ` Stanislav Fomichev
2023-03-28 17:50 ` Aditi Ghag
2023-03-28 18:35 ` Stanislav Fomichev
2023-03-29 23:13 ` Aditi Ghag
2023-03-29 23:25 ` Aditi Ghag
2023-03-29 23:25 ` Stanislav Fomichev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZB4Z7cnF+RDMaKvW@google.com \
--to=sdf@google.com \
--cc=aditi.ghag@isovalent.com \
--cc=bpf@vger.kernel.org \
--cc=edumazet@google.com \
--cc=kafai@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox