From: Stanislav Fomichev <sdf@google.com>
To: Aditi Ghag <aditi.ghag@isovalent.com>
Cc: bpf@vger.kernel.org, kafai@fb.com, edumazet@google.com
Subject: Re: [PATCH v4 bpf-next 3/4] bpf,tcp: Avoid taking fast sock lock in iterator
Date: Fri, 24 Mar 2023 14:45:17 -0700
Message-ID: <ZB4Z7cnF+RDMaKvW@google.com>
In-Reply-To: <20230323200633.3175753-4-aditi.ghag@isovalent.com>

On 03/23, Aditi Ghag wrote:
> Previously, the BPF TCP iterator acquired the fast version of the sock
> lock, which disables BH. This introduced a circular dependency with code
> paths that later acquire the sockets hash table bucket lock.
> Replace the fast version of the sock lock with the slow one, which lets
> BPF programs executed from the iterator destroy TCP listening sockets
> using the bpf_sock_destroy kfunc.

> Here is a stack trace that motivated this change:

> ```
> 1) sock_lock with BH disabled + bucket lock

> lock_acquire+0xcd/0x330
> _raw_spin_lock_bh+0x38/0x50
> inet_unhash+0x96/0xd0
> tcp_set_state+0x6a/0x210
> tcp_abort+0x12b/0x230
> bpf_prog_f4110fb1100e26b5_iter_tcp6_server+0xa3/0xaa
> bpf_iter_run_prog+0x1ff/0x340
> bpf_iter_tcp_seq_show+0xca/0x190
> bpf_seq_read+0x177/0x450
> vfs_read+0xc6/0x300
> ksys_read+0x69/0xf0
> do_syscall_64+0x3c/0x90
> entry_SYSCALL_64_after_hwframe+0x72/0xdc

> 2) sock lock with BH enabled

> [    1.499968]   lock_acquire+0xcd/0x330
> [    1.500316]   _raw_spin_lock+0x33/0x40
> [    1.500670]   sk_clone_lock+0x146/0x520
> [    1.501030]   inet_csk_clone_lock+0x1b/0x110
> [    1.501433]   tcp_create_openreq_child+0x22/0x3f0
> [    1.501873]   tcp_v6_syn_recv_sock+0x96/0x940
> [    1.502284]   tcp_check_req+0x137/0x660
> [    1.502646]   tcp_v6_rcv+0xa63/0xe80
> [    1.502994]   ip6_protocol_deliver_rcu+0x78/0x590
> [    1.503434]   ip6_input_finish+0x72/0x140
> [    1.503818]   __netif_receive_skb_one_core+0x63/0xa0
> [    1.504281]   process_backlog+0x79/0x260
> [    1.504668]   __napi_poll.constprop.0+0x27/0x170
> [    1.505104]   net_rx_action+0x14a/0x2a0
> [    1.505469]   __do_softirq+0x165/0x510
> [    1.505842]   do_softirq+0xcd/0x100
> [    1.506172]   __local_bh_enable_ip+0xcc/0xf0
> [    1.506588]   ip6_finish_output2+0x2a8/0xb00
> [    1.506988]   ip6_finish_output+0x274/0x510
> [    1.507377]   ip6_xmit+0x319/0x9b0
> [    1.507726]   inet6_csk_xmit+0x12b/0x2b0
> [    1.508096]   __tcp_transmit_skb+0x549/0xc40
> [    1.508498]   tcp_rcv_state_process+0x362/0x1180

> ```
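
To illustrate the kind of circular dependency described above, here is a
minimal userspace sketch in C of a lock-order graph, loosely in the spirit
of what lockdep tracks. It is not kernel code. The lock class names and the
helpers (graph_init, record_order) are hypothetical, and the reverse
ordering (bucket lock held, then sock spinlock taken) is an assumption made
for illustration; the traces above do not spell it out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this sketch; not kernel identifiers. */
enum lock_class { SOCK_SLOCK, HASH_BUCKET, NR_CLASSES };

/* order[a][b] is true once b has been acquired while a was held. */
struct lock_graph {
	bool order[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->order[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that `acquired` was taken while `held` was held.
 * Returns false, without recording the edge, if it would close a cycle.
 */
static bool record_order(struct lock_graph *g, enum lock_class held,
			 enum lock_class acquired)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(g, acquired, held, seen))
		return false;
	g->order[held][acquired] = true;
	return true;
}
```

In this model, recording SOCK_SLOCK -> HASH_BUCKET (the iterator path in
trace 1) succeeds, and a later attempt to record HASH_BUCKET -> SOCK_SLOCK
is rejected as an inversion.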

> Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>

Acked-by: Stanislav Fomichev <sdf@google.com>

We don't need a Fixes tag because this doesn't trigger without your new
bpf_sock_destroy, right?


> ---
>   net/ipv4/tcp_ipv4.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)

> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index ea370afa70ed..f2d370a9450f 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2962,7 +2962,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
>   	struct bpf_iter_meta meta;
>   	struct bpf_prog *prog;
>   	struct sock *sk = v;
> -	bool slow;
>   	uid_t uid;
>   	int ret;

> @@ -2970,7 +2969,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
>   		return 0;

>   	if (sk_fullsock(sk))
> -		slow = lock_sock_fast(sk);
> +		lock_sock(sk);

>   	if (unlikely(sk_unhashed(sk))) {
>   		ret = SEQ_SKIP;
> @@ -2994,7 +2993,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)

>   unlock:
>   	if (sk_fullsock(sk))
> -		unlock_sock_fast(sk, slow);
> +		release_sock(sk);
>   	return ret;

>   }
> --
> 2.34.1
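
For readers following the diff, here is a rough userspace model in C of why
the swap matters. It is a sketch under stated assumptions, not the kernel
implementation: the struct, its fields, and the model_* helpers are
hypothetical stand-ins, and only the uncontended case is modeled (nobody
else owns the socket). The point it shows is that the fast variant returns
with the spinlock still held, while the slow variant drops the spinlock and
keeps only the ownership flag.

```c
#include <stdbool.h>

/* Userspace stand-in for the socket lock state; names are illustrative. */
struct model_sock_lock {
	bool slock_held; /* models the sock spinlock (BH disabled while held) */
	bool owned;      /* models "a process context owns the socket" */
};

/*
 * Models lock_sock_fast() in the uncontended case: the spinlock is taken
 * and stays held until the matching unlock. Returns false (fast path).
 */
static bool model_lock_sock_fast(struct model_sock_lock *l)
{
	l->slock_held = true;
	return false;
}

/* Models unlock_sock_fast(): undo whichever path the lock call took. */
static void model_unlock_sock_fast(struct model_sock_lock *l, bool slow)
{
	if (slow)
		l->owned = false;
	else
		l->slock_held = false;
}

/* Models lock_sock(): take the spinlock briefly, mark ownership, drop it. */
static void model_lock_sock(struct model_sock_lock *l)
{
	l->slock_held = true;
	l->owned = true;
	l->slock_held = false;
}

/* Models release_sock(): clear ownership under a briefly held spinlock. */
static void model_release_sock(struct model_sock_lock *l)
{
	l->slock_held = true;
	l->owned = false;
	l->slock_held = false;
}

/* True if a callee taking another spinlock would nest it under the slock. */
static bool nests_under_slock(const struct model_sock_lock *l)
{
	return l->slock_held;
}
```

In this model, code run between model_lock_sock_fast() and
model_unlock_sock_fast() executes with the spinlock held, so any bucket
lock it takes nests under it. Between model_lock_sock() and
model_release_sock(), only the ownership flag is set.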



Thread overview: 29+ messages
2023-03-23 20:06 [PATCH v4 bpf-next 0/5] bpf-nex: Add socket destroy capability Aditi Ghag
2023-03-23 20:06 ` [PATCH v4 bpf-next 1/4] bpf: Implement batching in UDP iterator Aditi Ghag
2023-03-24 21:56   ` Stanislav Fomichev
2023-03-27 15:52     ` Aditi Ghag
2023-03-27 16:52       ` Stanislav Fomichev
2023-03-27 22:28   ` Martin KaFai Lau
2023-03-28 17:06     ` Aditi Ghag
2023-03-28 21:33       ` Martin KaFai Lau
2023-03-29 16:20         ` Aditi Ghag
2023-03-23 20:06 ` [PATCH v4 bpf-next 2/4] bpf: Add bpf_sock_destroy kfunc Aditi Ghag
2023-03-23 23:58   ` Martin KaFai Lau
2023-03-24 21:37   ` Stanislav Fomichev
2023-03-30 14:42     ` Aditi Ghag
2023-03-30 16:32       ` Stanislav Fomichev
2023-03-30 17:30         ` Martin KaFai Lau
2023-04-03 15:58           ` Aditi Ghag
2023-03-23 20:06 ` [PATCH v4 bpf-next 3/4] bpf,tcp: Avoid taking fast sock lock in iterator Aditi Ghag
2023-03-24 21:45   ` Stanislav Fomichev [this message]
2023-03-28 15:20     ` Aditi Ghag
2023-03-27 22:34   ` Martin KaFai Lau
2023-03-23 20:06 ` [PATCH v4 bpf-next 4/4] selftests/bpf: Add tests for bpf_sock_destroy Aditi Ghag
2023-03-24 21:52   ` Stanislav Fomichev
2023-03-27 15:57     ` Aditi Ghag
2023-03-27 16:54       ` Stanislav Fomichev
2023-03-28 17:50         ` Aditi Ghag
2023-03-28 18:35           ` Stanislav Fomichev
2023-03-29 23:13             ` Aditi Ghag
2023-03-29 23:25               ` Aditi Ghag
2023-03-29 23:25               ` Stanislav Fomichev
