From: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
To: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [RFC v3] Add TCP encap_rcv hook
Date: Thu, 12 Apr 2012 22:10:03 +0900
Message-ID: <20120412130959.GA31379@verge.net.au>
In-Reply-To: <1334218829.5300.5903.camel@edumazet-glaptop>
On Thu, Apr 12, 2012 at 10:20:29AM +0200, Eric Dumazet wrote:
> On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> > This hook is based on a hook of the same name provided by UDP. It provides
> > a way to receive packets that have a TCP header and treat them in some
> > alternate way.
> >
> > It is intended to be used by an implementation of the STT tunneling
> > protocol within Open vSwitch's datapath. A prototype of such an
> > implementation has been made.
> >
> > The STT draft is available at
> > http://tools.ietf.org/html/draft-davie-stt-01
> >
> > My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> > The first version can be found at:
> > http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> >
> > Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> >
>
> Hi Simon
>
> Oh well, this is insane :(
>
> > ---
> > include/linux/tcp.h | 3 +++
> > net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
> > 2 files changed, 25 insertions(+), 1 deletion(-)
> >
> > v3
> > * First post to netdev
> > * Replace more UDP references with TCP
> > * Move socket accesses to inside socket lock
> > and release lock on return.
> >
> > v2
> > * Fix comment to refer to TCP rather than UDP
> > * Allow skb to continue traversing the stack if
> > the encap_rcv callback returns a positive value.
> > This is the same behaviour as the UDP hook.
> >
> > diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> > index b6c62d2..7210b23 100644
> > --- a/include/linux/tcp.h
> > +++ b/include/linux/tcp.h
> > @@ -472,6 +472,9 @@ struct tcp_sock {
> >  	 * contains related tcp_cookie_transactions fields.
> >  	 */
> >  	struct tcp_cookie_values *cookie_values;
> > +
> > +	/* For encapsulation sockets. */
> > +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >  };
> >
>
> This adds a new cache miss for all incoming tcp frames...
>
> > static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 3a25cf7..9898f71 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >  	const struct iphdr *iph;
> >  	const struct tcphdr *th;
> >  	struct sock *sk;
> > +	struct tcp_sock *tp;
> >  	int ret;
> >  	struct net *net = dev_net(skb->dev);
> > +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >
> >  	if (skb->pkt_type != PACKET_HOST)
> >  		goto discard_it;
> > @@ -1726,9 +1728,27 @@ process:
> >
> >  	bh_lock_sock_nested(sk);
> >  	ret = 0;
> > +
> > +	tp = tcp_sk(sk);
> > +	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> > +	if (encap_rcv != NULL) {
>
> and a new conditional...
>
> > +		/*
> > +		 * This is an encapsulation socket so pass the skb to
> > +		 * the socket's tcp_encap_rcv() hook. Otherwise, just
> > +		 * fall through and pass this up the TCP socket.
> > +		 * up->encap_rcv() returns the following value:
> > +		 * <=0 if skb was successfully passed to the encap
> > +		 *     handler or was discarded by it.
> > +		 * >0 if skb should be passed on to TCP.
> > +		 */
> > +		if (encap_rcv(sk, skb) <= 0) {
> > +			ret = 0;
> > +			goto unlock_sock;
> > +		}
> > +	}
> > +
> >  	if (!sock_owned_by_user(sk)) {
> >  #ifdef CONFIG_NET_DMA
> > -		struct tcp_sock *tp = tcp_sk(sk);
> >  		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> >  			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> >  		if (tp->ucopy.dma_chan)
> > @@ -1744,6 +1764,7 @@ process:
> >  		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> >  		goto discard_and_relse;
> >  	}
> > +unlock_sock:
> >  	bh_unlock_sock(sk);
> >
> >  	sock_put(sk);
>
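For readers following the return-value convention described in the patch
comment above, here is a minimal userspace sketch of it. This is only a
model under stated assumptions: struct sock_model, struct pkt_model,
demo_encap_rcv() and model_rcv() are hypothetical names made up for
illustration, and nothing here is the kernel's actual receive path.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's sk_buff or sock. */
struct pkt_model {
	int is_tunnel;		/* 1 if the payload belongs to the tunnel */
};

struct sock_model {
	int (*encap_rcv)(struct sock_model *sk, struct pkt_model *pkt);
	int tunnel_pkts;	/* packets consumed by the hook */
	int tcp_pkts;		/* packets left to the regular path */
};

/* Example hook: consume tunnel packets, hand everything else back. */
static int demo_encap_rcv(struct sock_model *sk, struct pkt_model *pkt)
{
	if (pkt->is_tunnel) {
		sk->tunnel_pkts++;
		return 0;	/* <= 0: consumed or discarded by the hook */
	}
	return 1;		/* > 0: continue with normal processing */
}

/*
 * Models the control flow of the patched receive path.
 * Returns 0 if the hook took the packet, 1 if the regular path did.
 */
static int model_rcv(struct sock_model *sk, struct pkt_model *pkt)
{
	int (*encap_rcv)(struct sock_model *, struct pkt_model *) = sk->encap_rcv;

	if (encap_rcv != NULL && encap_rcv(sk, pkt) <= 0)
		return 0;

	sk->tcp_pkts++;
	return 1;
}
```

A socket whose encap_rcv pointer is NULL always takes the regular path; a
socket with demo_encap_rcv installed only does so for packets the hook
declines by returning a positive value.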
> I don't know, this sounds like a hack. Since you obviously spent a lot of
> time on this stuff, let's be constructive.

Hi Eric,

Thanks, I didn't really expect my patch to go in smoothly as is.
Though it may well be my first brush with insanity.

>
> I really suggest you take a look at <linux/static_key.h>
>
> So that on machines without any need for this encap_rcv, we don't even
> need to fetch tp->encap_rcv
>
> 	if (static_key_false(&stt_active)) {
> 		/* stt might be used on this socket */
> 		encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> 		if (encap_rcv) {
> 			...
> 		}
> 	}
>
> This way, if stt is not used/loaded, we have a single NOP
>
> If stt is used, NOP is patched to a JMP stt_code
>
>
> I will probably implement this idea for UDP shortly so that you can have a
> reference for your implementation.
Thanks, I see your UDP code now. I'll see about getting the same thing
working for TCP.
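
As a rough illustration of the gating idea discussed above, here is a
userspace sketch. It is an analogue only: userspace has no static keys, so
a plain global counter (encap_needed) stands in for the key, and an
ordinary branch stands in for the NOP that the kernel would patch into a
jump. All names (sk_model, encap_enable, consume_all, model_rx,
hook_field_reads) are hypothetical and chosen for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a socket carrying an optional hook. */
struct sk_model {
	int (*encap_rcv)(struct sk_model *sk, int payload);
};

static int encap_needed;	/* stand-in for the static key */
static int hook_field_reads;	/* how often rx touched sk->encap_rcv */

/* Install a hook and switch the global gate on. */
static void encap_enable(struct sk_model *sk,
			 int (*fn)(struct sk_model *, int))
{
	sk->encap_rcv = fn;
	encap_needed++;
}

/* Example hook that consumes every packet it is given. */
static int consume_all(struct sk_model *sk, int payload)
{
	(void)sk;
	(void)payload;
	return 0;
}

/*
 * Receive path model: the per-socket field is only read once at least
 * one hook has been installed anywhere.
 * Returns 0 if a hook took the packet, 1 otherwise.
 */
static int model_rx(struct sk_model *sk, int payload)
{
	if (encap_needed > 0) {
		int (*fn)(struct sk_model *, int) = sk->encap_rcv;

		hook_field_reads++;
		if (fn != NULL && fn(sk, payload) <= 0)
			return 0;
	}
	return 1;
}
```

Until encap_enable() has been called once, model_rx() never reads
sk->encap_rcv at all, which is the property the static key is meant to
provide. Once the gate is on, every socket pays for the read, including
sockets without a hook.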
Thread overview: 9+ messages
2012-04-12 7:42 [RFC v3] Add TCP encap_rcv hook Simon Horman
[not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-12 8:20 ` Eric Dumazet
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
2012-04-12 9:10 ` Eric Dumazet
2012-04-12 14:35 ` Simon Horman
[not found] ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-12 14:40 ` [RFC v4] Add TCP encap_rcv hook Simon Horman
2012-04-13 17:41 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
[not found] ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-04-13 17:45 ` Benjamin LaHaise
2012-04-12 13:10 ` Simon Horman [this message]