From: Guillaume Nault <gnault@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: David Miller <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, David Ahern <dsahern@kernel.org>,
Kuniyuki Iwashima <kuniyu@amazon.com>,
Michal Kubecek <mkubecek@suse.cz>
Subject: Re: [PATCH net-next v2] tcp: Dump bound-only sockets in inet_diag.
Date: Wed, 29 Nov 2023 18:52:53 +0100 [thread overview]
Message-ID: <ZWd6dWXOorUv/PRc@debian> (raw)
In-Reply-To: <CANn89iLYsaZ+uDJA_4M-46XS0fbp0foiumhMdjtfw-Jg9bNq+w@mail.gmail.com>
On Wed, Nov 29, 2023 at 04:39:55PM +0100, Eric Dumazet wrote:
> On Tue, Nov 28, 2023 at 11:18 PM Guillaume Nault <gnault@redhat.com> wrote:
> >
> > On Tue, Nov 28, 2023 at 11:14:28AM +0100, Eric Dumazet wrote:
> > > On Fri, Nov 24, 2023 at 12:11 AM Guillaume Nault <gnault@redhat.com> wrote:
> > > >
> > > > Walk the hashinfo->bhash2 table so that inet_diag can dump TCP sockets
> > > > that are bound but haven't yet called connect() or listen().
> > > >
> > > > This allows ss to dump bound-only TCP sockets, together with listening
> > > > sockets (as there's no specific state for bound-only sockets). This is
> > > > similar to the UDP behaviour for which bound-only sockets are already
> > > > dumped by ss -lu.
> > > >
> > > > The code is inspired by the ->lhash2 loop. However there's no manual
> > > > test of the source port, since this kind of filtering is already
> > > > handled by inet_diag_bc_sk(). Also, a maximum of 16 sockets are dumped
> > > > at a time, to avoid running with bh disabled for too long.
> > > >
> > > > No change is needed for ss. With an IPv4, an IPv6 and an IPv6-only
> > > > socket, bound respectively to 40000, 64000, 60000, the result is:
> > > >
> > > > $ ss -lt
> > > > State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
> > > > UNCONN 0 0 0.0.0.0:40000 0.0.0.0:*
> > > > UNCONN 0 0 [::]:60000 [::]:*
> > > > UNCONN 0 0 *:64000 *:*
> > >
> > >
> > > Hmm... "ss -l" is supposed to only list listening sockets.
> > >
> > > So this change might confuse some users ?
> > >
> >
> > On the other hand I can't find a more sensible solution. The problem is
> > that "ss -l" sets both the TCPF_LISTEN and the TCPF_CLOSE flags. And
> > since we don't have a way to express "bound but not yet listening"
> > sockets, these sockets fall into the CLOSE category. So we're really
> > just returning what ss asked for.
> >
> > If we can't rely on TCPF_CLOSE, then I don't see what kind of filter we
> > could use to request a dump of these TCP sockets. Using "-a" doesn't
> > help as it just sets all the TCPF_* flags (appart from
> > TCPF_NEW_SYN_RECV). Adding a new option wouldn't help either as we
> > couldn't map it to any of the TCPF_* flags. In any case, we still need
> > to rely on TCPF_CLOSE.
> >
> > So maybe we can just improve the ss man page for "-l" and explain that
> > it also lists closed sockets, which includes the bound-only ones
> > (this is already true for non-TCP sockets anyway). We could also tell
> > the user to run "ss state listening" for getting listening sockets
> > exclusively (or we could implement a new option, like "-L", to make
> > that shorter if necessary).
>
> This exists already : ss -t state LISTENING
>
> >
> > What do you think?
> >
>
> We might need a new bit in r->idiag_state (we have a lot of free bits
> there), different from
> the combination used by "ss -l" which unfortunately used ( (1 <<
> SS_LISTEN) | (1 << SS_CLOSE) ) )
>
> "ss -t state bound" (or ss -tB ???) would then set this new bit ( 1
> << SS_BOUND) and the kernel would handle this pseudo state ?
>
> (mapped to CLOSED and in bhash2)
I'll do that in v3. Thanks.
> diff --git a/include/net/tcp_states.h b/include/net/tcp_states.h
> index cc00118acca1b695a534bd73984b9d1f1794db25..97238c0f64aa6643cf492a856e8d67ddcca1a729
> 100644
> --- a/include/net/tcp_states.h
> +++ b/include/net/tcp_states.h
> @@ -22,6 +22,7 @@ enum {
> TCP_LISTEN,
> TCP_CLOSING, /* Now a valid state */
> TCP_NEW_SYN_RECV,
> + TCP_BOUND,
>
> TCP_MAX_STATES /* Leave at the end! */
> };
> @@ -43,6 +44,7 @@ enum {
> TCPF_LISTEN = (1 << TCP_LISTEN),
> TCPF_CLOSING = (1 << TCP_CLOSING),
> TCPF_NEW_SYN_RECV = (1 << TCP_NEW_SYN_RECV),
> + TCPF_BOUND = (1 << TCP_BOUND),
> };
>
> #endif /* _LINUX_TCP_STATES_H */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index a62423360a5681525496f8840bfe1d37ea3dc504..c3d22edb9e9563672356750153167884c92eae67
> 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -6570,6 +6570,7 @@ enum {
> BPF_TCP_LISTEN,
> BPF_TCP_CLOSING, /* Now a valid state */
> BPF_TCP_NEW_SYN_RECV,
> + BPF_TCP_BOUND,
>
> BPF_TCP_MAX_STATES /* Leave at the end! */
> };
>
prev parent reply other threads:[~2023-11-29 17:53 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-23 23:11 [PATCH net-next v2] tcp: Dump bound-only sockets in inet_diag Guillaume Nault
2023-11-25 1:39 ` Kuniyuki Iwashima
2023-11-27 17:26 ` Guillaume Nault
2023-11-27 17:56 ` Kuniyuki Iwashima
2023-11-28 1:13 ` Guillaume Nault
2023-11-28 10:14 ` Eric Dumazet
2023-11-28 22:18 ` Guillaume Nault
2023-11-29 15:39 ` Eric Dumazet
2023-11-29 17:52 ` Guillaume Nault [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZWd6dWXOorUv/PRc@debian \
--to=gnault@redhat.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=kuniyu@amazon.com \
--cc=mkubecek@suse.cz \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.