netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v3] Add TCP encap_rcv hook
@ 2012-04-12  7:42 Simon Horman
       [not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2012-04-12  7:42 UTC (permalink / raw)
  To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

This hook is based on a hook of the same name provided by UDP.  It provides
a way for to receive packets that have a TCP header and treat them in some
alternate way.

It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwtich's datapath. A prototype of such an
implementation has been made.

The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01

My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
The first version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html

Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>

---
 include/linux/tcp.h |    3 +++
 net/ipv4/tcp_ipv4.c |   23 ++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

v3
* First post to netdev
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
  and release lock on return.

v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
  the encap_rcv callback returns a positive value.
  This is the same behaviour as the UDP hook.

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
 	 * contains related tcp_cookie_transactions fields.
 	 */
 	struct tcp_cookie_values  *cookie_values;
+
+	/* For encapsulation sockets. */
+	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 };
 
 static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..9898f71 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	const struct iphdr *iph;
 	const struct tcphdr *th;
 	struct sock *sk;
+	struct tcp_sock *tp;
 	int ret;
 	struct net *net = dev_net(skb->dev);
+	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 
 	if (skb->pkt_type != PACKET_HOST)
 		goto discard_it;
@@ -1726,9 +1728,27 @@ process:
 
 	bh_lock_sock_nested(sk);
 	ret = 0;
+
+	tp = tcp_sk(sk);
+	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+	if (encap_rcv != NULL) {
+		/*
+		 * This is an encapsulation socket so pass the skb to
+		 * the socket's tcp_encap_rcv() hook. Otherwise, just
+		 * fall through and pass this up the TCP socket.
+		 * up->encap_rcv() returns the following value:
+		 * <=0 if skb was successfully passed to the encap
+		 *     handler or was discarded by it.
+		 * >0 if skb should be passed on to TCP.
+		 */
+		if (encap_rcv(sk, skb) <= 0) {
+			ret = 0;
+			goto unlock_sock;
+		}
+	}
+
 	if (!sock_owned_by_user(sk)) {
 #ifdef CONFIG_NET_DMA
-		struct tcp_sock *tp = tcp_sk(sk);
 		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
 			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
 		if (tp->ucopy.dma_chan)
@@ -1744,6 +1764,7 @@ process:
 		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
 		goto discard_and_relse;
 	}
+unlock_sock:
 	bh_unlock_sock(sk);
 
 	sock_put(sk);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC v3] Add TCP encap_rcv hook
       [not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2012-04-12  8:20   ` Eric Dumazet
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
  2012-04-12 13:10     ` [RFC v3] Add TCP encap_rcv hook Simon Horman
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12  8:20 UTC (permalink / raw)
  To: Simon Horman; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> This hook is based on a hook of the same name provided by UDP.  It provides
> a way for to receive packets that have a TCP header and treat them in some
> alternate way.
> 
> It is intended to be used by an implementation of the STT tunneling
> protocol within Open vSwtich's datapath. A prototype of such an
> implementation has been made.
> 
> The STT draft is available at
> http://tools.ietf.org/html/draft-davie-stt-01
> 
> My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> The first version can be found at:
> http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> 
> Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> 

Hi Simon

Oh well, this is insane :(

> ---
>  include/linux/tcp.h |    3 +++
>  net/ipv4/tcp_ipv4.c |   23 ++++++++++++++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> v3
> * First post to netdev
> * Replace more UDP references with TCP
> * Move socket accesses to inside socket lock
>   and release lock on return.
> 
> v2
> * Fix comment to refer to TCP rather than UDP
> * Allow skb to continue traversing the stack if
>   the encap_rcv callback returns a positive value.
>   This is the same behaviour as the UDP hook.
> 
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index b6c62d2..7210b23 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -472,6 +472,9 @@ struct tcp_sock {
>  	 * contains related tcp_cookie_transactions fields.
>  	 */
>  	struct tcp_cookie_values  *cookie_values;
> +
> +	/* For encapsulation sockets. */
> +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>  };
>  

This adds a new cache miss for all incoming tcp frames...

>  static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 3a25cf7..9898f71 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  	const struct iphdr *iph;
>  	const struct tcphdr *th;
>  	struct sock *sk;
> +	struct tcp_sock *tp;
>  	int ret;
>  	struct net *net = dev_net(skb->dev);
> +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>  
>  	if (skb->pkt_type != PACKET_HOST)
>  		goto discard_it;
> @@ -1726,9 +1728,27 @@ process:
>  
>  	bh_lock_sock_nested(sk);
>  	ret = 0;
> +
> +	tp = tcp_sk(sk);
> +	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> +	if (encap_rcv != NULL) {

and a new conditional...

> +		/*
> +		 * This is an encapsulation socket so pass the skb to
> +		 * the socket's tcp_encap_rcv() hook. Otherwise, just
> +		 * fall through and pass this up the TCP socket.
> +		 * up->encap_rcv() returns the following value:
> +		 * <=0 if skb was successfully passed to the encap
> +		 *     handler or was discarded by it.
> +		 * >0 if skb should be passed on to TCP.
> +		 */
> +		if (encap_rcv(sk, skb) <= 0) {
> +			ret = 0;
> +			goto unlock_sock;
> +		}
> +	}
> +
>  	if (!sock_owned_by_user(sk)) {
>  #ifdef CONFIG_NET_DMA
> -		struct tcp_sock *tp = tcp_sk(sk);
>  		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
>  			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
>  		if (tp->ucopy.dma_chan)
> @@ -1744,6 +1764,7 @@ process:
>  		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
>  		goto discard_and_relse;
>  	}
> +unlock_sock:
>  	bh_unlock_sock(sk);
>  
>  	sock_put(sk);

I dont know, this sounds as a hack. Since you obviously spent a lot of
time on this stuff, lets be constructive.

I really suggest you take a look at <linux/static_key.h>

So that on machines without any need for this encap_rcv, we dont even
need to fetch tp->encap_rcv

if (static_key_false(&stt_active)) {
	/* stt might be used on this socket */
	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
	if (encap_rcv) {
		...
	}
}

This way, if stt is not used/loaded, we have a single NOP

If stt is used, NOP is patched to a JMP stt_code


I probably implement this idea on UDP shortly so that you can have a
reference for your implementation.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  8:20   ` Eric Dumazet
@ 2012-04-12  9:05     ` Eric Dumazet
  2012-04-12  9:10       ` Eric Dumazet
                         ` (2 more replies)
  2012-04-12 13:10     ` [RFC v3] Add TCP encap_rcv hook Simon Horman
  1 sibling, 3 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12  9:05 UTC (permalink / raw)
  To: Simon Horman, David Miller
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

Most machines dont use UDP encapsulation (L2TP)

Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
test if L2TP never setup the encap_rcv on a socket.

Idea of this patch came after Simon Horman proposal to add a hook on TCP
as well.

If static_key is not yet enabled, the fast path does a single JMP .

When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.

Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
---
 include/net/udp.h    |    1 +
 net/ipv4/udp.c       |   12 +++++++++++-
 net/l2tp/l2tp_core.c |    1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index 5d606d9..9671f5f 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -267,4 +267,5 @@ extern void udp_init(void);
 extern int udp4_ufo_send_check(struct sk_buff *skb);
 extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
 	netdev_features_t features);
+extern void udp_encap_enable(void);
 #endif	/* _UDP_H */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index fe14105..ad1e0dd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -107,6 +107,7 @@
 #include <net/checksum.h>
 #include <net/xfrm.h>
 #include <trace/events/udp.h>
+#include <linux/static_key.h>
 #include "udp_impl.h"
 
 struct udp_table udp_table __read_mostly;
@@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 
 }
 
+static struct static_key udp_encap_needed __read_mostly;
+void udp_encap_enable(void)
+{
+	if (!static_key_enabled(&udp_encap_needed))
+		static_key_slow_inc(&udp_encap_needed);
+}
+EXPORT_SYMBOL(udp_encap_enable);
+
 /* returns:
  *  -1: error
  *   0: success
@@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 		goto drop;
 	nf_reset(skb);
 
-	if (up->encap_type) {
+	if (static_key_false(&udp_encap_needed) && up->encap_type) {
 		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 
 		/*
@@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
 			/* FALLTHROUGH */
 		case UDP_ENCAP_L2TPINUDP:
 			up->encap_type = val;
+			udp_encap_enable();
 			break;
 		default:
 			err = -ENOPROTOOPT;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 89ff8c6..f6732b6 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 		/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
 		udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
 		udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
+		udp_encap_enable();
 	}
 
 	sk->sk_user_data = tunnel;

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
@ 2012-04-12  9:10       ` Eric Dumazet
  2012-04-12 14:35       ` Simon Horman
  2012-04-13 17:41       ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
  2 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12  9:10 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	David Miller

On Thu, 2012-04-12 at 11:05 +0200, Eric Dumazet wrote:

> If static_key is not yet enabled, the fast path does a single JMP .
> 
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.

Small note Simon,

The jump trick is effective on x86 (and maybe some other arches) when

CONFIG_JUMP_LABEL=y

Else, its replaced by atomic_read(...) > 0, a cnditional jump but
reading a read_mostly/shared variable, instead of a per socket field.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC v3] Add TCP encap_rcv hook
  2012-04-12  8:20   ` Eric Dumazet
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
@ 2012-04-12 13:10     ` Simon Horman
  1 sibling, 0 replies; 9+ messages in thread
From: Simon Horman @ 2012-04-12 13:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 12, 2012 at 10:20:29AM +0200, Eric Dumazet wrote:
> On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> > This hook is based on a hook of the same name provided by UDP.  It provides
> > a way for to receive packets that have a TCP header and treat them in some
> > alternate way.
> > 
> > It is intended to be used by an implementation of the STT tunneling
> > protocol within Open vSwtich's datapath. A prototype of such an
> > implementation has been made.
> > 
> > The STT draft is available at
> > http://tools.ietf.org/html/draft-davie-stt-01
> > 
> > My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> > The first version can be found at:
> > http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> > 
> > Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> > 
> 
> Hi Simon
> 
> Oh well, this is insane :(
> 
> > ---
> >  include/linux/tcp.h |    3 +++
> >  net/ipv4/tcp_ipv4.c |   23 ++++++++++++++++++++++-
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > v3
> > * First post to netdev
> > * Replace more UDP references with TCP
> > * Move socket accesses to inside socket lock
> >   and release lock on return.
> > 
> > v2
> > * Fix comment to refer to TCP rather than UDP
> > * Allow skb to continue traversing the stack if
> >   the encap_rcv callback returns a positive value.
> >   This is the same behaviour as the UDP hook.
> > 
> > diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> > index b6c62d2..7210b23 100644
> > --- a/include/linux/tcp.h
> > +++ b/include/linux/tcp.h
> > @@ -472,6 +472,9 @@ struct tcp_sock {
> >  	 * contains related tcp_cookie_transactions fields.
> >  	 */
> >  	struct tcp_cookie_values  *cookie_values;
> > +
> > +	/* For encapsulation sockets. */
> > +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >  };
> >  
> 
> This adds a new cache miss for all incoming tcp frames...
> 
> >  static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 3a25cf7..9898f71 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >  	const struct iphdr *iph;
> >  	const struct tcphdr *th;
> >  	struct sock *sk;
> > +	struct tcp_sock *tp;
> >  	int ret;
> >  	struct net *net = dev_net(skb->dev);
> > +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >  
> >  	if (skb->pkt_type != PACKET_HOST)
> >  		goto discard_it;
> > @@ -1726,9 +1728,27 @@ process:
> >  
> >  	bh_lock_sock_nested(sk);
> >  	ret = 0;
> > +
> > +	tp = tcp_sk(sk);
> > +	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> > +	if (encap_rcv != NULL) {
> 
> and a new conditional...
> 
> > +		/*
> > +		 * This is an encapsulation socket so pass the skb to
> > +		 * the socket's tcp_encap_rcv() hook. Otherwise, just
> > +		 * fall through and pass this up the TCP socket.
> > +		 * up->encap_rcv() returns the following value:
> > +		 * <=0 if skb was successfully passed to the encap
> > +		 *     handler or was discarded by it.
> > +		 * >0 if skb should be passed on to TCP.
> > +		 */
> > +		if (encap_rcv(sk, skb) <= 0) {
> > +			ret = 0;
> > +			goto unlock_sock;
> > +		}
> > +	}
> > +
> >  	if (!sock_owned_by_user(sk)) {
> >  #ifdef CONFIG_NET_DMA
> > -		struct tcp_sock *tp = tcp_sk(sk);
> >  		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> >  			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> >  		if (tp->ucopy.dma_chan)
> > @@ -1744,6 +1764,7 @@ process:
> >  		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> >  		goto discard_and_relse;
> >  	}
> > +unlock_sock:
> >  	bh_unlock_sock(sk);
> >  
> >  	sock_put(sk);
> 
> I dont know, this sounds as a hack. Since you obviously spent a lot of
> time on this stuff, lets be constructive.

Hi Eric,

Thanks, I didn't really expect my patch to go in smoothly as is.
Though it may well be my first brush with insanity.

> 
> I really suggest you take a look at <linux/static_key.h>
> 
> So that on machines without any need for this encap_rcv, we dont even
> need to fetch tp->encap_rcv
> 
> if (static_key_false(&stt_active)) {
> 	/* stt might be used on this socket */
> 	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> 	if (encap_rcv) {
> 		...
> 	}
> }
> 
> This way, if stt is not used/loaded, we have a single NOP
> 
> If stt is used, NOP is patched to a JMP stt_code
> 
> 
> I probably implement this idea on UDP shortly so that you can have a
> reference for your implementation.

Thanks, I see your UDP code now. I'll see about getting the same thing
working for TCP.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
  2012-04-12  9:10       ` Eric Dumazet
@ 2012-04-12 14:35       ` Simon Horman
       [not found]         ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  2012-04-13 17:41       ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
  2 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2012-04-12 14:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	David Miller

On Thu, Apr 12, 2012 at 11:05:28AM +0200, Eric Dumazet wrote:
> Most machines dont use UDP encapsulation (L2TP)
> 
> Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
> test if L2TP never setup the encap_rcv on a socket.
> 
> Idea of this patch came after Simon Horman proposal to add a hook on TCP
> as well.
> 
> If static_key is not yet enabled, the fast path does a single JMP .
> 
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.

Thanks Eric,

I have not had a chance to test your code, though it should be easy enough
to do so in the context of Open vSwitch as its CAPWAP implementation makes
use of UDP's encap_rcv (which is how I arrived at adding hook to TCP to
implement STT for Open vSwtich).

I have incorporated your static_key code into a new version of my TCP
encap_rcv patch and that does appear to work. I will post it ASAP.

> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
> ---
>  include/net/udp.h    |    1 +
>  net/ipv4/udp.c       |   12 +++++++++++-
>  net/l2tp/l2tp_core.c |    1 +
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 5d606d9..9671f5f 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -267,4 +267,5 @@ extern void udp_init(void);
>  extern int udp4_ufo_send_check(struct sk_buff *skb);
>  extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
>  	netdev_features_t features);
> +extern void udp_encap_enable(void);
>  #endif	/* _UDP_H */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index fe14105..ad1e0dd 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -107,6 +107,7 @@
>  #include <net/checksum.h>
>  #include <net/xfrm.h>
>  #include <trace/events/udp.h>
> +#include <linux/static_key.h>
>  #include "udp_impl.h"
>  
>  struct udp_table udp_table __read_mostly;
> @@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  
>  }
>  
> +static struct static_key udp_encap_needed __read_mostly;
> +void udp_encap_enable(void)
> +{
> +	if (!static_key_enabled(&udp_encap_needed))
> +		static_key_slow_inc(&udp_encap_needed);
> +}
> +EXPORT_SYMBOL(udp_encap_enable);
> +
>  /* returns:
>   *  -1: error
>   *   0: success
> @@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  		goto drop;
>  	nf_reset(skb);
>  
> -	if (up->encap_type) {
> +	if (static_key_false(&udp_encap_needed) && up->encap_type) {
>  		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>  
>  		/*
> @@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
>  			/* FALLTHROUGH */
>  		case UDP_ENCAP_L2TPINUDP:
>  			up->encap_type = val;
> +			udp_encap_enable();
>  			break;
>  		default:
>  			err = -ENOPROTOOPT;
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 89ff8c6..f6732b6 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
>  		/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
>  		udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
>  		udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
> +		udp_encap_enable();
>  	}
>  
>  	sk->sk_user_data = tunnel;
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC v4] Add TCP encap_rcv hook
       [not found]         ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2012-04-12 14:40           ` Simon Horman
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2012-04-12 14:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	David Miller

This hook is based on a hook of the same name provided by UDP.  It provides
a way for to receive packets that have a TCP header and treat them in some
alternate way.

It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwtich's datapath. A prototype of such an
implementation has been made.

The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01

My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
The second version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg09001.html
It needs to be updated to call tcp_encap_enable()

Cc: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>

---
v4
* Make use of static_key,
  a tonic for insanity suggested by Eric Dumazet

v3
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
  and release lock on return.

v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
  the encap_rcv callback returns a positive value.
  This is the same behaviour as the UDP hook.
---
 include/linux/tcp.h |    3 +++
 include/net/tcp.h   |    1 +
 net/ipv4/tcp_ipv4.c |   34 +++++++++++++++++++++++++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
 	 * contains related tcp_cookie_transactions fields.
 	 */
 	struct tcp_cookie_values  *cookie_values;
+
+	/* For encapsulation sockets. */
+	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 };
 
 static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f75a04d..f2c4ac0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1575,5 +1575,6 @@ static inline struct tcp_extend_values *tcp_xv(struct request_values *rvp)
 
 extern void tcp_v4_init(void);
 extern void tcp_init(void);
+extern void tcp_encap_enable(void);
 
 #endif	/* _TCP_H */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..dadcec6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,6 +62,7 @@
 #include <linux/init.h>
 #include <linux/times.h>
 #include <linux/slab.h>
+#include <linux/static_key.h>
 
 #include <net/net_namespace.h>
 #include <net/icmp.h>
@@ -1657,6 +1658,14 @@ csum_err:
 }
 EXPORT_SYMBOL(tcp_v4_do_rcv);
 
+static struct static_key tcp_encap_needed __read_mostly;
+void tcp_encap_enable(void)
+{
+	if (!static_key_enabled(&tcp_encap_needed))
+		static_key_slow_inc(&tcp_encap_needed);
+}
+EXPORT_SYMBOL(tcp_encap_enable);
+
 /*
  *	From tcp_input.c
  */
@@ -1666,6 +1675,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	const struct iphdr *iph;
 	const struct tcphdr *th;
 	struct sock *sk;
+	struct tcp_sock *tp;
 	int ret;
 	struct net *net = dev_net(skb->dev);
 
@@ -1726,9 +1736,30 @@ process:
 
 	bh_lock_sock_nested(sk);
 	ret = 0;
+
+	tp = tcp_sk(sk);
+	if (static_key_false(&tcp_encap_needed)) {
+		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
+		encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+		if (encap_rcv != NULL) {
+			/*
+			 * This is an encapsulation socket so pass the skb to
+			 * the socket's tcp_encap_rcv() hook. Otherwise, just
+			 * fall through and pass this up the TCP socket.
+			 * up->encap_rcv() returns the following value:
+			 * <=0 if skb was successfully passed to the encap
+			 *     handler or was discarded by it.
+			 * >0 if skb should be passed on to TCP.
+			 */
+			if (encap_rcv(sk, skb) <= 0) {
+				ret = 0;
+				goto unlock_sock;
+			}
+		}
+	}
+
 	if (!sock_owned_by_user(sk)) {
 #ifdef CONFIG_NET_DMA
-		struct tcp_sock *tp = tcp_sk(sk);
 		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
 			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
 		if (tp->ucopy.dma_chan)
@@ -1744,6 +1775,7 @@ process:
 		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
 		goto discard_and_relse;
 	}
+unlock_sock:
 	bh_unlock_sock(sk);
 
 	sock_put(sk);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
  2012-04-12  9:10       ` Eric Dumazet
  2012-04-12 14:35       ` Simon Horman
@ 2012-04-13 17:41       ` David Miller
       [not found]         ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2012-04-13 17:41 UTC (permalink / raw)
  To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	bcrl-Bw31MaZKKs3YtjvyW6yDsg

From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Thu, 12 Apr 2012 11:05:28 +0200

> Most machines dont use UDP encapsulation (L2TP)
> 
> Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
> test if L2TP never setup the encap_rcv on a socket.
> 
> Idea of this patch came after Simon Horman proposal to add a hook on TCP
> as well.
> 
> If static_key is not yet enabled, the fast path does a single JMP .
> 
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Applied to net-next, thanks Eric.

Ban, please incorporate this scheme when you respin your
fixed ipv6 encap/l2tp patches.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
       [not found]         ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-04-13 17:45           ` Benjamin LaHaise
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin LaHaise @ 2012-04-13 17:45 UTC (permalink / raw)
  To: David Miller
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w

Hi folks,

On Fri, Apr 13, 2012 at 01:41:08PM -0400, David Miller wrote:
> Ban, please incorporate this scheme when you respin your
> fixed ipv6 encap/l2tp patches.
> 
> Thanks.

Okay, will do.  Thanks for the feedback and heads up.

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-04-13 17:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-04-12  7:42 [RFC v3] Add TCP encap_rcv hook Simon Horman
     [not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-12  8:20   ` Eric Dumazet
2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
2012-04-12  9:10       ` Eric Dumazet
2012-04-12 14:35       ` Simon Horman
     [not found]         ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-12 14:40           ` [RFC v4] Add TCP encap_rcv hook Simon Horman
2012-04-13 17:41       ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
     [not found]         ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-04-13 17:45           ` Benjamin LaHaise
2012-04-12 13:10     ` [RFC v3] Add TCP encap_rcv hook Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).