From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <gnault@redhat.com>
Cc: <davem@davemloft.net>, <dsahern@kernel.org>,
	<edumazet@google.com>, <kuba@kernel.org>, <kuniyu@amazon.com>,
	<mkubecek@suse.cz>, <netdev@vger.kernel.org>, <pabeni@redhat.com>
Subject: Re: [PATCH net-next v2] tcp: Dump bound-only sockets in inet_diag.
Date: Fri, 24 Nov 2023 17:39:42 -0800	[thread overview]
Message-ID: <20231125013942.80997-1-kuniyu@amazon.com> (raw)
In-Reply-To: <bfb52b5103de808cda022e2d16bac6cf3ef747d6.1700780828.git.gnault@redhat.com>

From: Guillaume Nault <gnault@redhat.com>
Date: Fri, 24 Nov 2023 00:11:42 +0100
> Walk the hashinfo->bhash2 table so that inet_diag can dump TCP sockets
> that are bound but haven't yet called connect() or listen().
> 
> This allows ss to dump bound-only TCP sockets, together with listening
> sockets (as there's no specific state for bound-only sockets). This is
> similar to the UDP behaviour for which bound-only sockets are already
> dumped by ss -lu.
> 
> The code is inspired by the ->lhash2 loop. However there's no manual
> test of the source port, since this kind of filtering is already
> handled by inet_diag_bc_sk(). Also, a maximum of 16 sockets are dumped
> at a time, to avoid running with bh disabled for too long.
> 
> No change is needed for ss. With an IPv4, an IPv6 and an IPv6-only
> socket, bound respectively to 40000, 64000, 60000, the result is:
> 
>   $ ss -lt
>   State  Recv-Q Send-Q Local Address:Port  Peer Address:PortProcess
>   UNCONN 0      0            0.0.0.0:40000      0.0.0.0:*
>   UNCONN 0      0               [::]:60000         [::]:*
>   UNCONN 0      0                  *:64000            *:*
> 
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
> ---
> 
> v2:
>   * Use ->bhash2 instead of ->bhash (Kuniyuki Iwashima).
>   * Process no more than 16 sockets at a time (Kuniyuki Iwashima).
> 
>  net/ipv4/inet_diag.c | 88 +++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 87 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
> index 7d0e7aaa71e0..d7fb6a625cb7 100644
> --- a/net/ipv4/inet_diag.c
> +++ b/net/ipv4/inet_diag.c
> @@ -1077,10 +1077,96 @@ void inet_diag_dump_icsk(struct inet_hashinfo *hashinfo, struct sk_buff *skb,
>  		s_i = num = s_num = 0;
>  	}
>  
> +/* Process a maximum of SKARR_SZ sockets at a time when walking hash buckets
> + * with bh disabled.
> + */
> +#define SKARR_SZ 16
> +
> +	/* Dump bound-only sockets */
> +	if (cb->args[0] == 1) {
> +		if (!(idiag_states & TCPF_CLOSE))
> +			goto skip_bind_ht;
> +
> +		for (i = s_i; i < hashinfo->bhash_size; i++) {
> +			struct inet_bind_hashbucket *ibb;
> +			struct inet_bind2_bucket *tb2;
> +			struct sock *sk_arr[SKARR_SZ];
> +			int num_arr[SKARR_SZ];
> +			int idx, accum, res;
> +
> +resume_bind_walk:
> +			num = 0;
> +			accum = 0;
> +			ibb = &hashinfo->bhash2[i];
> +
> +			spin_lock_bh(&ibb->lock);
> +			inet_bind_bucket_for_each(tb2, &ibb->chain) {
> +				if (!net_eq(ib2_net(tb2), net))
> +					continue;
> +
> +				sk_for_each_bound_bhash2(sk, &tb2->owners) {
> +					struct inet_sock *inet = inet_sk(sk);
> +
> +					if (num < s_num)
> +						goto next_bind;
> +
> +					if (sk->sk_state != TCP_CLOSE ||
> +					    !inet->inet_num)
> +						goto next_bind;
> +
> +					if (r->sdiag_family != AF_UNSPEC &&
> +					    r->sdiag_family != sk->sk_family)
> +						goto next_bind;
> +
> +					if (!inet_diag_bc_sk(bc, sk))
> +						goto next_bind;
> +
> +					if (!refcount_inc_not_zero(&sk->sk_refcnt))
> +						goto next_bind;

I guess this was copied from the ehash code below, but can
refcount_inc_not_zero() actually fail for a socket in bhash2 while
spin_lock_bh() is held?
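
For readers less familiar with the primitive in question: refcount_inc_not_zero() takes a reference only if the count is still non-zero, so it fails only for an object whose last reference has already been dropped. The following is a minimal userspace sketch of that semantic using C11 atomics. The toy_ref type and the toy_ref_* helpers are hypothetical names invented for illustration and are not the kernel implementation. The sketch does not try to answer whether a socket still linked in bhash2 can reach a zero count while the bucket lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's refcount_t (illustration only). */
struct toy_ref {
	atomic_uint refs;
};

/* Take a reference only if the object is still live (count != 0). */
static bool toy_ref_inc_not_zero(struct toy_ref *r)
{
	unsigned int old = atomic_load(&r->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
			return true;
	}
	return false;	/* count was zero: object is on its way out */
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool toy_ref_put(struct toy_ref *r)
{
	unsigned int old = atomic_fetch_sub(&r->refs, 1);

	assert(old > 0);	/* dropping a reference nobody holds is a bug */
	return old == 1;
}
```

In this model the "inc" can only fail after some other path has already called the final put, which is the situation the question above is asking about.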


> +
> +					num_arr[accum] = num;
> +					sk_arr[accum] = sk;
> +					if (++accum == SKARR_SZ)
> +						goto pause_bind_walk;
> +next_bind:
> +					num++;
> +				}
> +			}
> +pause_bind_walk:
> +			spin_unlock_bh(&ibb->lock);
> +
> +			res = 0;
> +			for (idx = 0; idx < accum; idx++) {
> +				if (res >= 0) {
> +					res = inet_sk_diag_fill(sk_arr[idx],
> +								NULL, skb, cb,
> +								r, NLM_F_MULTI,
> +								net_admin);
> +					if (res < 0)
> +						num = num_arr[idx];
> +				}
> +				sock_gen_put(sk_arr[idx]);
> +			}
> +			if (res < 0)
> +				goto done;
> +
> +			cond_resched();
> +
> +			if (accum == SKARR_SZ) {
> +				s_num = num + 1;
> +				goto resume_bind_walk;
> +			}
> +
> +			s_num = 0;
> +		}
> +skip_bind_ht:
> +		cb->args[0] = 2;
> +		s_i = num = s_num = 0;
> +	}
> +
>  	if (!(idiag_states & ~TCPF_LISTEN))
>  		goto out;
>  
> -#define SKARR_SZ 16
>  	for (i = s_i; i <= hashinfo->ehash_mask; i++) {
>  		struct inet_ehash_bucket *head = &hashinfo->ehash[i];
>  		spinlock_t *lock = inet_ehash_lockp(hashinfo, i);
> -- 
> 2.39.2
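
The batching scheme in the quoted patch (collect at most SKARR_SZ entries under the bucket lock, release the lock, process them, then resume where the walk stopped) can be sketched in userspace as below. This is only a rough model under stated assumptions: toy_bucket, toy_lock(), toy_unlock(), toy_walk() and toy_sum() are hypothetical names, the bucket is a fixed array that does not change between passes, and items are copied by value, so no reference counting is needed. In the patch, sockets are pinned with a reference before the lock is dropped, and the resume position is tracked as a count (num/s_num) rather than an array index.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define BATCH_SZ 16	/* mirrors SKARR_SZ in the patch */

/* Hypothetical bucket: a fixed array guarded by a toy spinlock. */
struct toy_bucket {
	atomic_flag lock;
	const int *items;
	size_t nitems;
};

static void toy_lock(struct toy_bucket *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void toy_unlock(struct toy_bucket *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Example visitor: accumulate the visited values into a long. */
static void toy_sum(int value, void *arg)
{
	*(long *)arg += value;
}

/*
 * Visit every non-negative item (negative values play the role of entries
 * rejected by the filters), holding the lock only while collecting at most
 * BATCH_SZ items. Returns the number of items passed to visit().
 */
static size_t toy_walk(struct toy_bucket *b, void (*visit)(int, void *),
		       void *arg)
{
	size_t start = 0, visited = 0;

	assert(visit != NULL);

	for (;;) {
		int batch[BATCH_SZ];
		size_t accum = 0, pos, i;

		toy_lock(b);
		for (pos = start; pos < b->nitems; pos++) {
			if (b->items[pos] < 0)
				continue;	/* filtered out */
			batch[accum] = b->items[pos];
			if (++accum == BATCH_SZ) {
				pos++;	/* resume after this item */
				break;
			}
		}
		toy_unlock(b);

		/* Potentially slow work happens outside the lock. */
		for (i = 0; i < accum; i++)
			visit(batch[i], arg);
		visited += accum;

		if (accum < BATCH_SZ)
			return visited;	/* reached the end of the bucket */
		start = pos;		/* full batch: take another pass */
	}
}
```

A caller would initialise the lock with ATOMIC_FLAG_INIT, point items at its array, and call toy_walk(&bucket, toy_sum, &total). The point of the design is that the time spent with the lock held is bounded by the batch size rather than by the bucket length.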
