From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <edumazet@google.com>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, eric.dumazet@gmail.com,
Martin KaFai Lau <kafai@fb.com>, Joe Stringer <joe@wand.net.nz>,
Alexei Starovoitov <ast@kernel.org>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
Kuniyuki Iwashima <kuniyu@amazon.com>
Subject: Re: [PATCH net-next] udp: no longer touch sk->sk_refcnt in early demux
Date: Fri, 08 Mar 2024 10:19:59 +0100
Message-ID: <77f54006d8127bc76c8fb81c7cfa8df1723e317e.camel@redhat.com>
In-Reply-To: <d149f4511c39f39fa6dc8e7c7324962434ae82e9.camel@redhat.com>

On Fri, 2024-03-08 at 09:37 +0100, Paolo Abeni wrote:
> On Thu, 2024-03-07 at 22:00 +0000, Eric Dumazet wrote:
> > After commits ca065d0cf80f ("udp: no longer use SLAB_DESTROY_BY_RCU")
> > and 7ae215d23c12 ("bpf: Don't refcount LISTEN sockets in sk_assign()")
> > UDP early demux no longer needs to grab a refcount on the UDP socket.
> >
> > This saves two atomic operations per incoming packet for connected
> > sockets.
>
> This reminds me of an old series:
>
> https://lore.kernel.org/netdev/cover.1506114055.git.pabeni@redhat.com/
>
> and I'm wondering if we could reconsider such an option.
>
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Martin KaFai Lau <kafai@fb.com>
> > Cc: Joe Stringer <joe@wand.net.nz>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> > Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> > net/ipv4/udp.c | 5 +++--
> > net/ipv6/udp.c | 5 +++--
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index a8acea17b4e5344d022ae8f8eb674d1a36f8035a..e43ad1d846bdc2ddf5767606b78bbd055f692aa8 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -2570,11 +2570,12 @@ int udp_v4_early_demux(struct sk_buff *skb)
> > uh->source, iph->saddr, dif, sdif);
> > }
> >
> > - if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
> > + if (!sk)
> > return 0;
> >
> > skb->sk = sk;
> > - skb->destructor = sock_efree;
> > + DEBUG_NET_WARN_ON_ONCE(sk_is_refcounted(sk));
> > + skb->destructor = sock_pfree;
>
> I *think* that the skb may escape the current RCU section, e.g. if it
> matches an nf dup target in the input tables.
>
> Back then I tried to implement some debug infra to track such accesses:
>
> https://lore.kernel.org/lkml/cover.1507294365.git.pabeni@redhat.com/
>
> which was buggy (prone to false negatives). I think it can be improved
> to something more reliable, perhaps I should revamp it?
>
> I'm also wondering whether the DEBUG_NET_WARN_ON_ONCE is worth it: the
> sk is a hashed UDP socket, so it is a full sock and has the
> SOCK_RCU_FREE bit set.
>
> Perhaps we could use a simple 'noop' destructor as in:
>
> https://lore.kernel.org/netdev/b16163e3a4fa4d772edeabd8743acb4a07206bb9.1506114055.git.pabeni@redhat.com/

Please ignore this last part: I noticed too late that we need
'sock_pfree' to let inet_steal_sock() work as expected.

Paolo
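
For illustration only, here is a rough userspace model of why the destructor choice matters to the socket-stealing path. It is a sketch under assumptions, not kernel code: the struct layouts are invented, steal_sock() is a simplified stand-in for the helper behind inet_steal_sock(), and sock_noop() and refcnt_after_rx() are hypothetical names introduced here. The idea it models is that the stealing code recognises a prefetched socket by comparing skb->destructor against sock_pfree, and only then consults sk_is_refcounted() to decide whether the caller owns a reference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; the layouts are invented. */
struct sock {
	int  refcnt;    /* models sk->sk_refcnt */
	bool fullsock;  /* models sk_fullsock(sk) */
	bool rcu_free;  /* models sock_flag(sk, SOCK_RCU_FREE) */
};

struct sk_buff;
typedef void (*skb_destructor_t)(struct sk_buff *skb);

struct sk_buff {
	struct sock *sk;
	skb_destructor_t destructor;
};

/* A full socket with the RCU-free bit set carries no reference here. */
static bool sk_is_refcounted(const struct sock *sk)
{
	return !sk->fullsock || !sk->rcu_free;
}

/* Modelled on sock_pfree: drop a reference only if one is held. */
static void sock_pfree(struct sk_buff *skb)
{
	if (sk_is_refcounted(skb->sk))
		skb->sk->refcnt--;
}

/* Hypothetical 'noop' destructor, as floated in the quoted text. */
static void sock_noop(struct sk_buff *skb)
{
	(void)skb;
}

/* Simplified model: detach the socket and report reference ownership. */
static struct sock *steal_sock(struct sk_buff *skb, bool *refcounted)
{
	struct sock *sk = skb->sk;

	if (!sk) {
		*refcounted = false;
		return NULL;
	}
	/* Default assumption: whoever attached the socket took a reference. */
	*refcounted = true;
	/* Only the sock_pfree marker lets us ask the socket itself. */
	if (skb->destructor == sock_pfree)
		*refcounted = sk_is_refcounted(sk);
	skb->destructor = NULL;
	skb->sk = NULL;
	return sk;
}

/* Hypothetical helper: refcount left after the receive path is done. */
static int refcnt_after_rx(skb_destructor_t destructor)
{
	/* Hashed UDP socket: full sock, RCU-free, one reference held elsewhere. */
	struct sock sk = { .refcnt = 1, .fullsock = true, .rcu_free = true };
	struct sk_buff skb = { .sk = &sk, .destructor = destructor };
	bool refcounted;
	struct sock *stolen = steal_sock(&skb, &refcounted);

	if (stolen && refcounted)
		stolen->refcnt--; /* models the caller's sock_put() */
	return sk.refcnt;
}
```

With these toy definitions, refcnt_after_rx(sock_pfree) leaves the count at 1, while refcnt_after_rx(sock_noop) drops it to 0, i.e. a reference that early demux never took gets released.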
Thread overview: 8+ messages
2024-03-07 22:00 [PATCH net-next] udp: no longer touch sk->sk_refcnt in early demux Eric Dumazet
2024-03-08 8:37 ` Paolo Abeni
2024-03-08 9:19 ` Paolo Abeni [this message]
2024-03-08 9:23 ` Eric Dumazet
2024-03-08 9:21 ` Eric Dumazet
2024-03-08 11:10 ` Paolo Abeni
2024-03-08 12:40 ` Eric Dumazet
2024-03-11 19:39 ` patchwork-bot+netdevbpf