From: Jordan Rife <jrife@google.com>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Andrii Nakryiko <andrii@kernel.org>,
	Yusuke Suzuki <yusuke.suzuki@isovalent.com>,
	Jakub Kicinski <kuba@kernel.org>
Subject: Re: [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
Date: Wed, 1 Apr 2026 20:50:41 +0000
Message-ID: <ac1qqMc8M96aDORI@google.com>
In-Reply-To: <CAAVpQUDk1ywgsuCtreVCV4yVJRm_mcjsnC-+LuUwLRLHWavtZA@mail.gmail.com>
On Mon, Mar 30, 2026 at 06:21:39PM -0700, Kuniyuki Iwashima wrote:
> On Mon, Mar 30, 2026 at 2:57 PM Jordan Rife <jrife@google.com> wrote:
> >
> > Adjust lookups and scoring to keep their results equivalent to before
> > even if inet_daddr+inet_dport are left intact after disconnecting a
> > socket (sk_state == TCP_CLOSE). sk_state == TCP_ESTABLISHED implies that
> > *daddr is non-zero, so remove redundant checks for that at the same
> > time. Note that __udp6_lib_demux_lookup already checks if sk_state ==
> > TCP_ESTABLISHED, so no change was needed there [1].
> >
> > I could find no discernible difference in performance in
> > udp4_lib_lookup2 before and after the change in compute_score.
>
> What workload did you test the series with ?

These measurements were taken on the server side while running a netperf
UDP_STREAM test over a 100 Gbps link.

> I think we want to see results under DDoS.

Intuitively, performance should be similar. sk_state resides in the same
cache line as inet_daddr, inet_dport, and sk_v6_daddr, and we trade a
comparison with inet_daddr/sk_v6_daddr for one with sk_state. Of course,
code-level intuition can be wrong, so I'm happy to do more extensive
testing if you feel it's warranted to make sure that performance isn't
regressing.

> >
> > (AMD Ryzen 9 9900X)
> >
> > kprobe:udp4_lib_lookup2 {
> > 	@start[cpu] = nsecs;
> > }
> > kretprobe:udp4_lib_lookup2 {
> > 	@lookup[cpu] = hist(nsecs - @start[cpu], 2);
> > }
> >
> > BEFORE
> > ======
> > @lookup[11]:
> > [80, 96)     1387077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [96, 112)     364973 |@@@@@@@@@@@@@                                       |
> > [112, 128)     34261 |@                                                   |
> > [128, 160)      7246 |                                                    |
> > [160, 192)       215 |                                                    |
> > [192, 224)       126 |                                                    |
> >
> > AFTER
> > =====
> > @lookup[11]:
> > [80, 96)     1408594 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [96, 112)     340568 |@@@@@@@@@@@@                                        |
> > [112, 128)     30753 |@                                                   |
> > [128, 160)      8019 |                                                    |
> > [160, 192)       231 |                                                    |
> > [192, 224)       157 |                                                    |
> >
> > [1]: https://lore.kernel.org/netdev/20170623222537.130493-1-tracywwnj@gmail.com/
> >
> > Signed-off-by: Jordan Rife <jrife@google.com>
> > ---
> > net/ipv4/udp.c | 20 +++++++++++---------
> > net/ipv6/udp.c | 18 +++++++++---------
> > 2 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index b60fad393e18..d91c587c3657 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -385,16 +385,16 @@ static int compute_score(struct sock *sk, const struct net *net,
> >  	score = (sk->sk_family == PF_INET) ? 2 : 1;
> >
> >  	inet = inet_sk(sk);
> > -	if (inet->inet_daddr) {
> > +	if (sk->sk_state == TCP_ESTABLISHED) {
> >  		if (inet->inet_daddr != saddr)
> >  			return -1;
> >  		score += 4;
> > -	}
> >
> > -	if (inet->inet_dport) {
> > -		if (inet->inet_dport != sport)
> > -			return -1;
> > -		score += 4;
> > +		if (inet->inet_dport) {
> > +			if (inet->inet_dport != sport)
> > +				return -1;
> > +			score += 4;
> > +		}
> >  	}
> >
> >  	dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
> > @@ -796,8 +796,9 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
> >
> >  	if (!net_eq(sock_net(sk), net) ||
> >  	    udp_sk(sk)->udp_port_hash != hnum ||
> > -	    (inet->inet_daddr && inet->inet_daddr != rmt_addr) ||
> > -	    (inet->inet_dport != rmt_port && inet->inet_dport) ||
> > +	    (sk->sk_state == TCP_ESTABLISHED &&
> > +	     (inet->inet_daddr != rmt_addr ||
> > +	      (inet->inet_dport != rmt_port && inet->inet_dport))) ||
> >  	    (inet->inet_rcv_saddr && inet->inet_rcv_saddr != loc_addr) ||
> >  	    ipv6_only_sock(sk) ||
> >  	    !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
> > @@ -2854,7 +2855,8 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
> >  	ports = INET_COMBINED_PORTS(rmt_port, hnum);
> >
> >  	udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> > -		if (inet_match(net, sk, acookie, ports, dif, sdif))
> > +		if (sk->sk_state == TCP_ESTABLISHED &&
> > +		    inet_match(net, sk, acookie, ports, dif, sdif))
> >  			return sk;
> >  		/* Only check first socket in chain */
> >  		break;
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 010b909275dd..b93a9a3e7678 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -147,16 +147,16 @@ static int compute_score(struct sock *sk, const struct net *net,
> >  	score = 0;
> >  	inet = inet_sk(sk);
> >
> > -	if (inet->inet_dport) {
> > +	if (sk->sk_state == TCP_ESTABLISHED) {
> >  		if (inet->inet_dport != sport)
> >  			return -1;
> >  		score++;
> > -	}
> >
> > -	if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
> > -		if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> > -			return -1;
> > -		score++;
> > +		if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
>
> This looks unnecessary.

Yep, thanks. This is inverted. It should have mirrored the IPv4 logic,
where the now-unnecessary `if (inet->inet_daddr)` check was removed after
adding `if (sk->sk_state == TCP_ESTABLISHED)`:

	if (sk->sk_state == TCP_ESTABLISHED) {
		if (inet->inet_dport) {
			if (inet->inet_dport != sport)
				return -1;
			score++;
		}
		if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
			return -1;
		score++;
	}
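
To make the intended semantics easier to check outside the kernel, here is a
rough userspace sketch of the peer-matching part of the corrected scoring. It
is only a model under stated assumptions: struct model_sock, enum model_state,
model_peer_score() and model_self_check() are hypothetical names invented for
this illustration, the address is a plain 16-byte array, and none of this is
kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for sk_state values; not the kernel's constants. */
enum model_state { MODEL_CLOSE, MODEL_ESTABLISHED };

/* Hypothetical, simplified socket: only the fields the peer check reads. */
struct model_sock {
	enum model_state state;
	uint16_t dport;     /* 0 means no peer port is recorded */
	uint8_t daddr[16];  /* peer IPv6 address, may be stale after disconnect */
};

/*
 * Peer-related part of the score: -1 on mismatch, otherwise the number of
 * peer fields that matched. Peer fields are only consulted when the socket
 * is in the connected state, so stale values on a disconnected socket are
 * ignored.
 */
int model_peer_score(const struct model_sock *sk,
		     const uint8_t saddr[16], uint16_t sport)
{
	int score = 0;

	if (sk->state == MODEL_ESTABLISHED) {
		if (sk->dport) {
			if (sk->dport != sport)
				return -1;
			score++;
		}
		/* No "address is any" test: connected implies a peer address. */
		if (memcmp(sk->daddr, saddr, sizeof(sk->daddr)) != 0)
			return -1;
		score++;
	}
	return score;
}

/* Usage sketch: stale peer info must not affect a disconnected socket. */
void model_self_check(void)
{
	struct model_sock sk = { .state = MODEL_ESTABLISHED, .dport = 53,
				 .daddr = { 1 } };
	const uint8_t peer[16] = { 1 };
	const uint8_t other[16] = { 2 };

	assert(model_peer_score(&sk, peer, 53) == 2);   /* both fields match */
	assert(model_peer_score(&sk, other, 53) == -1); /* wrong peer address */

	sk.state = MODEL_CLOSE;                         /* daddr/dport left intact */
	assert(model_peer_score(&sk, other, 54) == 0);  /* treated as unconnected */
}
```

The last assertion is the case the series cares about: once the socket is no
longer connected, leftover daddr/dport values neither reject the packet nor add
to the score.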
>
> > +			if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> > +				return -1;
> > +			score++;
> > +		}
> >  	}
> >
> >  	bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
> > @@ -949,9 +949,9 @@ static bool __udp_v6_is_mcast_sock(struct net *net, const struct sock *sk,
> >
> >  	if (udp_sk(sk)->udp_port_hash != hnum ||
> >  	    sk->sk_family != PF_INET6 ||
> > -	    (inet->inet_dport && inet->inet_dport != rmt_port) ||
> > -	    (!ipv6_addr_any(&sk->sk_v6_daddr) &&
> > -	    !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
> > +	    (sk->sk_state == TCP_ESTABLISHED &&
> > +	     ((inet->inet_dport && inet->inet_dport != rmt_port) ||
> > +	      !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))) ||
> >  	    !udp_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif, sdif) ||
> >  	    (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) &&
> >  	    !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr)))
> > --
> > 2.53.0.1118.gaef5881109-goog
> >
Thanks,
Jordan