All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jordan Rife <jrife@google.com>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
	 Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	 Stanislav Fomichev <sdf@fomichev.me>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Yusuke Suzuki <yusuke.suzuki@isovalent.com>,
	Jakub Kicinski <kuba@kernel.org>
Subject: Re: [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
Date: Wed, 1 Apr 2026 20:50:41 +0000	[thread overview]
Message-ID: <ac1qqMc8M96aDORI@google.com> (raw)
In-Reply-To: <CAAVpQUDk1ywgsuCtreVCV4yVJRm_mcjsnC-+LuUwLRLHWavtZA@mail.gmail.com>

On Mon, Mar 30, 2026 at 06:21:39PM -0700, Kuniyuki Iwashima wrote:
> On Mon, Mar 30, 2026 at 2:57 PM Jordan Rife <jrife@google.com> wrote:
> >
> > Adjust lookups and scoring to keep their results equivalent to before
> > even if inet_daddr+inet_dport are left intact after disconnecting a
> > socket (sk_state == TCP_CLOSE). sk_state == TCP_ESTABLISHED implies that
> > *daddr is non-zero, so remove redundant checks for that at the same
> > time. Note that __udp6_lib_demux_lookup already checks if sk_state ==
> > TCP_ESTABLISHED, so no change was needed there [1].
> >
> > I could find no discernible difference in performance in
> > udp4_lib_lookup2 before and after the change in compute_score.
> 
> What workload did you test the series with ?

These measurements were taken on the server side while running a netperf
UDP_STREAM test over a 100 Gbps link.

> I think we want to see results under DDoS.

Intuitively, it seems like the performance should be similar.
sk_state resides in the same cache line as inet_daddr, inet_dport, and
sk_v6_daddr, and we trade a comparison with inet_daddr/sk_v6_daddr for
one with sk_state. Of course, code-level intuition can be wrong, so I'm
happy to do some more extensive testing if you feel it's warranted to
make sure that performance isn't regressing.
 
> >
> > (AMD Ryzen 9 9900X)
> >
> > kprobe:udp4_lib_lookup2 {
> >         @start[cpu] = nsecs;
> > }
> > kretprobe:udp4_lib_lookup2 {
> >         @lookup[cpu] = hist(nsecs - @start[cpu], 2);
> > }
> >
> > BEFORE
> > ======
> > @lookup[11]:
> > [80, 96)         1387077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [96, 112)         364973 |@@@@@@@@@@@@@                                       |
> > [112, 128)         34261 |@                                                   |
> > [128, 160)          7246 |                                                    |
> > [160, 192)           215 |                                                    |
> > [192, 224)           126 |                                                    |
> >
> > AFTER
> > =====
> > @lookup[11]:
> > [80, 96)         1408594 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [96, 112)         340568 |@@@@@@@@@@@@                                        |
> > [112, 128)         30753 |@                                                   |
> > [128, 160)          8019 |                                                    |
> > [160, 192)           231 |                                                    |
> > [192, 224)           157 |                                                    |
> >
> > [1]: https://lore.kernel.org/netdev/20170623222537.130493-1-tracywwnj@gmail.com/
> >
> > Signed-off-by: Jordan Rife <jrife@google.com>
> > ---
> >  net/ipv4/udp.c | 20 +++++++++++---------
> >  net/ipv6/udp.c | 18 +++++++++---------
> >  2 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index b60fad393e18..d91c587c3657 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -385,16 +385,16 @@ static int compute_score(struct sock *sk, const struct net *net,
> >         score = (sk->sk_family == PF_INET) ? 2 : 1;
> >
> >         inet = inet_sk(sk);
> > -       if (inet->inet_daddr) {
> > +       if (sk->sk_state == TCP_ESTABLISHED) {
> >                 if (inet->inet_daddr != saddr)
> >                         return -1;
> >                 score += 4;
> > -       }
> >
> > -       if (inet->inet_dport) {
> > -               if (inet->inet_dport != sport)
> > -                       return -1;
> > -               score += 4;
> > +               if (inet->inet_dport) {
> > +                       if (inet->inet_dport != sport)
> > +                               return -1;
> > +                       score += 4;
> > +               }
> >         }
> >
> >         dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
> > @@ -796,8 +796,9 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
> >
> >         if (!net_eq(sock_net(sk), net) ||
> >             udp_sk(sk)->udp_port_hash != hnum ||
> > -           (inet->inet_daddr && inet->inet_daddr != rmt_addr) ||
> > -           (inet->inet_dport != rmt_port && inet->inet_dport) ||
> > +           (sk->sk_state == TCP_ESTABLISHED &&
> > +            (inet->inet_daddr != rmt_addr ||
> > +            (inet->inet_dport != rmt_port && inet->inet_dport))) ||
> >             (inet->inet_rcv_saddr && inet->inet_rcv_saddr != loc_addr) ||
> >             ipv6_only_sock(sk) ||
> >             !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
> > @@ -2854,7 +2855,8 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
> >         ports = INET_COMBINED_PORTS(rmt_port, hnum);
> >
> >         udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> > -               if (inet_match(net, sk, acookie, ports, dif, sdif))
> > +               if (sk->sk_state == TCP_ESTABLISHED &&
> > +                   inet_match(net, sk, acookie, ports, dif, sdif))
> >                         return sk;
> >                 /* Only check first socket in chain */
> >                 break;
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 010b909275dd..b93a9a3e7678 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -147,16 +147,16 @@ static int compute_score(struct sock *sk, const struct net *net,
> >         score = 0;
> >         inet = inet_sk(sk);
> >
> > -       if (inet->inet_dport) {
> > +       if (sk->sk_state == TCP_ESTABLISHED) {
> >                 if (inet->inet_dport != sport)
> >                         return -1;
> >                 score++;
> > -       }
> >
> > -       if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
> > -               if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> > -                       return -1;
> > -               score++;
> > +               if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
> 
> This looks unnecessary.

Yep, thanks. This is inverted. It should have been similar to the ipv4
logic where the unnecessary `if (inet->inet_daddr)` check was removed
after adding if `(sk->sk_state == TCP_ESTABLISHED)`:

	if (sk->sk_state == TCP_ESTABLISHED) {
		if (inet->inet_dport) {
			if (inet->inet_dport != sport)
				return -1;
			score++;
		}

		if (!ipv6_addr_equal(&sk->sk_v6_addr, saddr))
			return -1;
		score++;
	}
 
> 
> > +                       if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> > +                               return -1;
> > +                       score++;
> > +               }
> >         }
> >
> >         bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
> > @@ -949,9 +949,9 @@ static bool __udp_v6_is_mcast_sock(struct net *net, const struct sock *sk,
> >
> >         if (udp_sk(sk)->udp_port_hash != hnum ||
> >             sk->sk_family != PF_INET6 ||
> > -           (inet->inet_dport && inet->inet_dport != rmt_port) ||
> > -           (!ipv6_addr_any(&sk->sk_v6_daddr) &&
> > -                   !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
> > +           (sk->sk_state == TCP_ESTABLISHED &&
> > +            ((inet->inet_dport && inet->inet_dport != rmt_port) ||
> > +            !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))) ||
> >             !udp_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif, sdif) ||
> >             (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) &&
> >                     !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr)))
> > --
> > 2.53.0.1118.gaef5881109-goog
> >

Thanks,
Jordan

  reply	other threads:[~2026-04-01 20:50 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
2026-03-31  1:21   ` Kuniyuki Iwashima
2026-04-01 20:50     ` Jordan Rife [this message]
2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
2026-03-31 16:51   ` kernel test robot
2026-03-31 17:33   ` kernel test robot
2026-03-31 17:42   ` kernel test robot
2026-03-31 17:55   ` kernel test robot
2026-03-31 18:49   ` kernel test robot
2026-04-25  2:45   ` sashiko-bot
2026-03-30 21:57 ` [PATCH net-next v3 3/4] udp: Preserve destination address info after abort Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 4/4] selftests/bpf: Ensure dst addr/port are preserved after socket abort Jordan Rife
2026-04-25  4:31   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ac1qqMc8M96aDORI@google.com \
    --to=jrife@google.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@fomichev.me \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yusuke.suzuki@isovalent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.