From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Florian Westphal <fw@strlen.de>,
Eric Dumazet <edumazet@google.com>,
Hannes Frederic Sowa <hannes@stressinduktion.org>
Subject: Re: [PATCH net-next 1/5] net: add support for noref skb->sk
Date: Thu, 21 Sep 2017 11:14:57 +0200
Message-ID: <1505985297.2560.39.camel@redhat.com>
In-Reply-To: <1505929295.29839.103.camel@edumazet-glaptop3.roam.corp.google.com>
Hi,
Thank you for looking at it!
On Wed, 2017-09-20 at 10:41 -0700, Eric Dumazet wrote:
> On Wed, 2017-09-20 at 18:54 +0200, Paolo Abeni wrote:
> > Noref sk do not carry a socket refcount, are valid
> > only inside the current RCU section and must be
> > explicitly cleared before exiting such section.
> >
> > They will be used in a later patch to allow early demux
> > without sock refcounting.
>
> > +/* dummy destructor used by noref sockets */
> > +void sock_dummyfree(struct sk_buff *skb)
> > +{
>
> BUG();
>
> > +}
> > +EXPORT_SYMBOL(sock_dummyfree);
> > +
sock_dummyfree() can be called in legitimate paths (see below), so a
BUG() there is not an option, but we can add a:
	WARN_ON_ONCE(!rcu_read_lock_held());
here and in skb_clear_noref_sk(). That should go a long way towards
catching possible bugs.
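To make the proposed check concrete, here is a rough userspace sketch,
not the actual patch. The struct definitions, the rcu_depth counter,
warn_on_once(), model_set_noref_sk() and model_kfree_skb() are all
stand-ins I made up for illustration; only the names sock_dummyfree()
and skb_clear_noref_sk() come from the series, and their bodies here are
assumptions.

```c
/* Userspace model only: stand-in types, not the kernel definitions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct sk_buff;

struct sock {
	int refcnt;
};

struct sk_buff {
	struct sock *sk;
	void (*destructor)(struct sk_buff *skb);
};

/* Hypothetical stand-in for the RCU read-side nesting depth. */
static int rcu_depth;
/* Number of warnings actually emitted, for inspection. */
static int warn_count;

static void rcu_read_lock(void) { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }
static bool rcu_read_lock_held(void) { return rcu_depth > 0; }

/* Rough analogue of WARN_ON_ONCE(): report only the first violation. */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Dummy destructor: nothing to release, but it must run inside RCU. */
static void sock_dummyfree(struct sk_buff *skb)
{
	(void)skb;
	warn_on_once(!rcu_read_lock_held(), "noref sk freed outside RCU");
}

/* Hypothetical helper: attach a socket without taking a reference. */
static void model_set_noref_sk(struct sk_buff *skb, struct sock *sk)
{
	skb->sk = sk;
	skb->destructor = sock_dummyfree;
}

/* Drop the noref pointer; only valid inside the RCU section. */
static void skb_clear_noref_sk(struct sk_buff *skb)
{
	warn_on_once(!rcu_read_lock_held(), "noref sk cleared outside RCU");
	if (skb->destructor == sock_dummyfree) {
		skb->sk = NULL;
		skb->destructor = NULL;
	}
}

/* Hypothetical free path: run the destructor, then forget the socket. */
static void model_kfree_skb(struct sk_buff *skb)
{
	if (skb->destructor)
		skb->destructor(skb);
	skb->sk = NULL;
	skb->destructor = NULL;
}
```

In this model, freeing or clearing a noref skb between rcu_read_lock()
and rcu_read_unlock() is silent, while doing so after the unlock trips
the warning once. It only catches a misuse at the time the skb is freed
or cleared, so it is a debugging aid rather than a guarantee.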
> I do not see how you ensure we do not leave RCU section with an skb
> destructor pointing to this sock_dummyfree()
>
> This patch series looks quite dangerous to me.
The idea is to explicitly clear the noref sk references before leaving
the RCU section. This is much like what we currently do for noref dst,
but here the only place where we get a noref socket is the socket early
demux, so the scope of this change is more limited than what we have
with noref dst_entries.
The relevant code is in the next 2 patches; after the demux we preserve
the noref sk only if the skb has a local destination. UDP will then set
the noref sk on early demux lookup, and the skb will either:
* land on the corresponding UDP socket: the receive function will steal
the noref sk
* be dropped by some nft/iptables target: the dummy destructor is
called
* be forwarded by some nft/iptables target outside the input path: we
clear the noref sk explicitly in such targets.
Currently only a handful of places are affected, and we can simplify
the code by dropping the early demux result for locally terminated
multicast sockets on a host acting as a multicast router; please see
the comment on the next patch.
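The three outcomes listed above can be modelled in a few lines of
userspace C. This is only a sketch under my own assumptions:
model_early_demux(), model_clear_noref_sk(), model_input() and the
model_verdict enum are hypothetical names, and the struct layouts are
stand-ins rather than the kernel ones.

```c
/* Userspace model only: stand-in types, not the kernel definitions. */
#include <stddef.h>

struct sk_buff;

struct sock {
	int refcnt;
};

struct sk_buff {
	struct sock *sk;
	void (*destructor)(struct sk_buff *skb);
};

/* Dummy destructor: a noref skb holds no reference, so nothing to do. */
static void sock_dummyfree(struct sk_buff *skb)
{
	(void)skb;
}

/* Hypothetical outcomes once early demux has attached a noref socket. */
enum model_verdict { MODEL_DELIVER, MODEL_DROP, MODEL_FORWARD };

/* Early demux: remember the socket without touching its refcount. */
static void model_early_demux(struct sk_buff *skb, struct sock *sk)
{
	skb->sk = sk;
	skb->destructor = sock_dummyfree;
}

/* Forget the noref socket; a refcounted skb->sk is left alone. */
static void model_clear_noref_sk(struct sk_buff *skb)
{
	if (skb->destructor == sock_dummyfree) {
		skb->sk = NULL;
		skb->destructor = NULL;
	}
}

/* Returns the socket the skb was delivered to, or NULL otherwise. */
static struct sock *model_input(struct sk_buff *skb, enum model_verdict v)
{
	struct sock *sk = NULL;

	switch (v) {
	case MODEL_DELIVER:
		/* The receive function steals the noref pointer. */
		sk = skb->sk;
		model_clear_noref_sk(skb);
		break;
	case MODEL_DROP:
		/* Freeing the skb runs the dummy destructor. */
		if (skb->destructor)
			skb->destructor(skb);
		skb->sk = NULL;
		skb->destructor = NULL;
		break;
	case MODEL_FORWARD:
		/* Leaving the input path: clear the noref explicitly. */
		model_clear_noref_sk(skb);
		break;
	}
	return sk;
}
```

In all three branches the skb ends up with no socket pointer and the
socket refcount is never touched, which is the invariant that has to
hold before the RCU section ends.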
> Do we really have real applications using connected UDP sockets and
> wanting very high pps throughput ?
The ultimate goal is to improve the unconnected UDP socket scenario;
we do actually have use cases for that: DNS servers and VoIP SBCs.
Thanks,
Paolo