* [RFC v3] Add TCP encap_rcv hook
@ 2012-04-12 7:42 Simon Horman
[not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2012-04-12 7:42 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
This hook is based on a hook of the same name provided by UDP. It provides
a way for to receive packets that have a TCP header and treat them in some
alternate way.
It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwtich's datapath. A prototype of such an
implementation has been made.
The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01
My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
The first version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
include/linux/tcp.h | 3 +++
net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
2 files changed, 25 insertions(+), 1 deletion(-)
v3
* First post to netdev
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
and release lock on return.
v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
the encap_rcv callback returns a positive value.
This is the same behaviour as the UDP hook.
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
* contains related tcp_cookie_transactions fields.
*/
struct tcp_cookie_values *cookie_values;
+
+ /* For encapsulation sockets. */
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
};
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..9898f71 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
const struct iphdr *iph;
const struct tcphdr *th;
struct sock *sk;
+ struct tcp_sock *tp;
int ret;
struct net *net = dev_net(skb->dev);
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
if (skb->pkt_type != PACKET_HOST)
goto discard_it;
@@ -1726,9 +1728,27 @@ process:
bh_lock_sock_nested(sk);
ret = 0;
+
+ tp = tcp_sk(sk);
+ encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+ if (encap_rcv != NULL) {
+ /*
+ * This is an encapsulation socket so pass the skb to
+ * the socket's tcp_encap_rcv() hook. Otherwise, just
+ * fall through and pass this up the TCP socket.
+ * up->encap_rcv() returns the following value:
+ * <=0 if skb was successfully passed to the encap
+ * handler or was discarded by it.
+ * >0 if skb should be passed on to TCP.
+ */
+ if (encap_rcv(sk, skb) <= 0) {
+ ret = 0;
+ goto unlock_sock;
+ }
+ }
+
if (!sock_owned_by_user(sk)) {
#ifdef CONFIG_NET_DMA
- struct tcp_sock *tp = tcp_sk(sk);
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
if (tp->ucopy.dma_chan)
@@ -1744,6 +1764,7 @@ process:
NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
goto discard_and_relse;
}
+unlock_sock:
bh_unlock_sock(sk);
sock_put(sk);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC v3] Add TCP encap_rcv hook
[not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2012-04-12 8:20 ` Eric Dumazet
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
2012-04-12 13:10 ` [RFC v3] Add TCP encap_rcv hook Simon Horman
0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12 8:20 UTC (permalink / raw)
To: Simon Horman; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> This hook is based on a hook of the same name provided by UDP. It provides
> a way for to receive packets that have a TCP header and treat them in some
> alternate way.
>
> It is intended to be used by an implementation of the STT tunneling
> protocol within Open vSwtich's datapath. A prototype of such an
> implementation has been made.
>
> The STT draft is available at
> http://tools.ietf.org/html/draft-davie-stt-01
>
> My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> The first version can be found at:
> http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
>
> Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
>
Hi Simon
Oh well, this is insane :(
> ---
> include/linux/tcp.h | 3 +++
> net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
> 2 files changed, 25 insertions(+), 1 deletion(-)
>
> v3
> * First post to netdev
> * Replace more UDP references with TCP
> * Move socket accesses to inside socket lock
> and release lock on return.
>
> v2
> * Fix comment to refer to TCP rather than UDP
> * Allow skb to continue traversing the stack if
> the encap_rcv callback returns a positive value.
> This is the same behaviour as the UDP hook.
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index b6c62d2..7210b23 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -472,6 +472,9 @@ struct tcp_sock {
> * contains related tcp_cookie_transactions fields.
> */
> struct tcp_cookie_values *cookie_values;
> +
> + /* For encapsulation sockets. */
> + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> };
>
This adds a new cache miss for all incoming tcp frames...
> static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 3a25cf7..9898f71 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> const struct iphdr *iph;
> const struct tcphdr *th;
> struct sock *sk;
> + struct tcp_sock *tp;
> int ret;
> struct net *net = dev_net(skb->dev);
> + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>
> if (skb->pkt_type != PACKET_HOST)
> goto discard_it;
> @@ -1726,9 +1728,27 @@ process:
>
> bh_lock_sock_nested(sk);
> ret = 0;
> +
> + tp = tcp_sk(sk);
> + encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> + if (encap_rcv != NULL) {
and a new conditional...
> + /*
> + * This is an encapsulation socket so pass the skb to
> + * the socket's tcp_encap_rcv() hook. Otherwise, just
> + * fall through and pass this up the TCP socket.
> + * up->encap_rcv() returns the following value:
> + * <=0 if skb was successfully passed to the encap
> + * handler or was discarded by it.
> + * >0 if skb should be passed on to TCP.
> + */
> + if (encap_rcv(sk, skb) <= 0) {
> + ret = 0;
> + goto unlock_sock;
> + }
> + }
> +
> if (!sock_owned_by_user(sk)) {
> #ifdef CONFIG_NET_DMA
> - struct tcp_sock *tp = tcp_sk(sk);
> if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> if (tp->ucopy.dma_chan)
> @@ -1744,6 +1764,7 @@ process:
> NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> goto discard_and_relse;
> }
> +unlock_sock:
> bh_unlock_sock(sk);
>
> sock_put(sk);
I dont know, this sounds as a hack. Since you obviously spent a lot of
time on this stuff, lets be constructive.
I really suggest you take a look at <linux/static_key.h>
So that on machines without any need for this encap_rcv, we dont even
need to fetch tp->encap_rcv
if (static_key_false(&stt_active)) {
/* stt might be used on this socket */
encap_rcv = ACCESS_ONCE(tp->encap_rcv);
if (encap_rcv) {
...
}
}
This way, if stt is not used/loaded, we have a single NOP
If stt is used, NOP is patched to a JMP stt_code
I probably implement this idea on UDP shortly so that you can have a
reference for your implementation.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net-next] udp: intoduce udp_encap_needed static_key
2012-04-12 8:20 ` Eric Dumazet
@ 2012-04-12 9:05 ` Eric Dumazet
2012-04-12 9:10 ` Eric Dumazet
` (2 more replies)
2012-04-12 13:10 ` [RFC v3] Add TCP encap_rcv hook Simon Horman
1 sibling, 3 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12 9:05 UTC (permalink / raw)
To: Simon Horman, David Miller
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
Most machines dont use UDP encapsulation (L2TP)
Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
test if L2TP never setup the encap_rcv on a socket.
Idea of this patch came after Simon Horman proposal to add a hook on TCP
as well.
If static_key is not yet enabled, the fast path does a single JMP .
When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.
Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
---
include/net/udp.h | 1 +
net/ipv4/udp.c | 12 +++++++++++-
net/l2tp/l2tp_core.c | 1 +
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/include/net/udp.h b/include/net/udp.h
index 5d606d9..9671f5f 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -267,4 +267,5 @@ extern void udp_init(void);
extern int udp4_ufo_send_check(struct sk_buff *skb);
extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
netdev_features_t features);
+extern void udp_encap_enable(void);
#endif /* _UDP_H */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index fe14105..ad1e0dd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -107,6 +107,7 @@
#include <net/checksum.h>
#include <net/xfrm.h>
#include <trace/events/udp.h>
+#include <linux/static_key.h>
#include "udp_impl.h"
struct udp_table udp_table __read_mostly;
@@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
}
+static struct static_key udp_encap_needed __read_mostly;
+void udp_encap_enable(void)
+{
+ if (!static_key_enabled(&udp_encap_needed))
+ static_key_slow_inc(&udp_encap_needed);
+}
+EXPORT_SYMBOL(udp_encap_enable);
+
/* returns:
* -1: error
* 0: success
@@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
goto drop;
nf_reset(skb);
- if (up->encap_type) {
+ if (static_key_false(&udp_encap_needed) && up->encap_type) {
int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
/*
@@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
/* FALLTHROUGH */
case UDP_ENCAP_L2TPINUDP:
up->encap_type = val;
+ udp_encap_enable();
break;
default:
err = -ENOPROTOOPT;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 89ff8c6..f6732b6 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
+ udp_encap_enable();
}
sk->sk_user_data = tunnel;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
@ 2012-04-12 9:10 ` Eric Dumazet
2012-04-12 14:35 ` Simon Horman
2012-04-13 17:41 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
2 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12 9:10 UTC (permalink / raw)
To: Simon Horman
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
David Miller
On Thu, 2012-04-12 at 11:05 +0200, Eric Dumazet wrote:
> If static_key is not yet enabled, the fast path does a single JMP .
>
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.
Small note Simon,
The jump trick is effective on x86 (and maybe some other arches) when
CONFIG_JUMP_LABEL=y
Else, its replaced by atomic_read(...) > 0, a cnditional jump but
reading a read_mostly/shared variable, instead of a per socket field.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC v3] Add TCP encap_rcv hook
2012-04-12 8:20 ` Eric Dumazet
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
@ 2012-04-12 13:10 ` Simon Horman
1 sibling, 0 replies; 9+ messages in thread
From: Simon Horman @ 2012-04-12 13:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
On Thu, Apr 12, 2012 at 10:20:29AM +0200, Eric Dumazet wrote:
> On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> > This hook is based on a hook of the same name provided by UDP. It provides
> > a way for to receive packets that have a TCP header and treat them in some
> > alternate way.
> >
> > It is intended to be used by an implementation of the STT tunneling
> > protocol within Open vSwtich's datapath. A prototype of such an
> > implementation has been made.
> >
> > The STT draft is available at
> > http://tools.ietf.org/html/draft-davie-stt-01
> >
> > My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> > The first version can be found at:
> > http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> >
> > Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> >
>
> Hi Simon
>
> Oh well, this is insane :(
>
> > ---
> > include/linux/tcp.h | 3 +++
> > net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
> > 2 files changed, 25 insertions(+), 1 deletion(-)
> >
> > v3
> > * First post to netdev
> > * Replace more UDP references with TCP
> > * Move socket accesses to inside socket lock
> > and release lock on return.
> >
> > v2
> > * Fix comment to refer to TCP rather than UDP
> > * Allow skb to continue traversing the stack if
> > the encap_rcv callback returns a positive value.
> > This is the same behaviour as the UDP hook.
> >
> > diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> > index b6c62d2..7210b23 100644
> > --- a/include/linux/tcp.h
> > +++ b/include/linux/tcp.h
> > @@ -472,6 +472,9 @@ struct tcp_sock {
> > * contains related tcp_cookie_transactions fields.
> > */
> > struct tcp_cookie_values *cookie_values;
> > +
> > + /* For encapsulation sockets. */
> > + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> > };
> >
>
> This adds a new cache miss for all incoming tcp frames...
>
> > static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 3a25cf7..9898f71 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > const struct iphdr *iph;
> > const struct tcphdr *th;
> > struct sock *sk;
> > + struct tcp_sock *tp;
> > int ret;
> > struct net *net = dev_net(skb->dev);
> > + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >
> > if (skb->pkt_type != PACKET_HOST)
> > goto discard_it;
> > @@ -1726,9 +1728,27 @@ process:
> >
> > bh_lock_sock_nested(sk);
> > ret = 0;
> > +
> > + tp = tcp_sk(sk);
> > + encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> > + if (encap_rcv != NULL) {
>
> and a new conditional...
>
> > + /*
> > + * This is an encapsulation socket so pass the skb to
> > + * the socket's tcp_encap_rcv() hook. Otherwise, just
> > + * fall through and pass this up the TCP socket.
> > + * up->encap_rcv() returns the following value:
> > + * <=0 if skb was successfully passed to the encap
> > + * handler or was discarded by it.
> > + * >0 if skb should be passed on to TCP.
> > + */
> > + if (encap_rcv(sk, skb) <= 0) {
> > + ret = 0;
> > + goto unlock_sock;
> > + }
> > + }
> > +
> > if (!sock_owned_by_user(sk)) {
> > #ifdef CONFIG_NET_DMA
> > - struct tcp_sock *tp = tcp_sk(sk);
> > if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> > tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> > if (tp->ucopy.dma_chan)
> > @@ -1744,6 +1764,7 @@ process:
> > NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> > goto discard_and_relse;
> > }
> > +unlock_sock:
> > bh_unlock_sock(sk);
> >
> > sock_put(sk);
>
> I dont know, this sounds as a hack. Since you obviously spent a lot of
> time on this stuff, lets be constructive.
Hi Eric,
Thanks, I didn't really expect my patch to go in smoothly as is.
Though it may well be my first brush with insanity.
>
> I really suggest you take a look at <linux/static_key.h>
>
> So that on machines without any need for this encap_rcv, we dont even
> need to fetch tp->encap_rcv
>
> if (static_key_false(&stt_active)) {
> /* stt might be used on this socket */
> encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> if (encap_rcv) {
> ...
> }
> }
>
> This way, if stt is not used/loaded, we have a single NOP
>
> If stt is used, NOP is patched to a JMP stt_code
>
>
> I probably implement this idea on UDP shortly so that you can have a
> reference for your implementation.
Thanks, I see your UDP code now. I'll see about getting the same thing
working for TCP.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
2012-04-12 9:10 ` Eric Dumazet
@ 2012-04-12 14:35 ` Simon Horman
[not found] ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-13 17:41 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
2 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2012-04-12 14:35 UTC (permalink / raw)
To: Eric Dumazet
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
David Miller
On Thu, Apr 12, 2012 at 11:05:28AM +0200, Eric Dumazet wrote:
> Most machines dont use UDP encapsulation (L2TP)
>
> Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
> test if L2TP never setup the encap_rcv on a socket.
>
> Idea of this patch came after Simon Horman proposal to add a hook on TCP
> as well.
>
> If static_key is not yet enabled, the fast path does a single JMP .
>
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.
Thanks Eric,
I have not had a chance to test your code, though it should be easy enough
to do so in the context of Open vSwitch as its CAPWAP implementation makes
use of UDP's encap_rcv (which is how I arrived at adding hook to TCP to
implement STT for Open vSwtich).
I have incorporated your static_key code into a new version of my TCP
encap_rcv patch and that does appear to work. I will post it ASAP.
>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
> ---
> include/net/udp.h | 1 +
> net/ipv4/udp.c | 12 +++++++++++-
> net/l2tp/l2tp_core.c | 1 +
> 3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 5d606d9..9671f5f 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -267,4 +267,5 @@ extern void udp_init(void);
> extern int udp4_ufo_send_check(struct sk_buff *skb);
> extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
> netdev_features_t features);
> +extern void udp_encap_enable(void);
> #endif /* _UDP_H */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index fe14105..ad1e0dd 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -107,6 +107,7 @@
> #include <net/checksum.h>
> #include <net/xfrm.h>
> #include <trace/events/udp.h>
> +#include <linux/static_key.h>
> #include "udp_impl.h"
>
> struct udp_table udp_table __read_mostly;
> @@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>
> }
>
> +static struct static_key udp_encap_needed __read_mostly;
> +void udp_encap_enable(void)
> +{
> + if (!static_key_enabled(&udp_encap_needed))
> + static_key_slow_inc(&udp_encap_needed);
> +}
> +EXPORT_SYMBOL(udp_encap_enable);
> +
> /* returns:
> * -1: error
> * 0: success
> @@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> goto drop;
> nf_reset(skb);
>
> - if (up->encap_type) {
> + if (static_key_false(&udp_encap_needed) && up->encap_type) {
> int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>
> /*
> @@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
> /* FALLTHROUGH */
> case UDP_ENCAP_L2TPINUDP:
> up->encap_type = val;
> + udp_encap_enable();
> break;
> default:
> err = -ENOPROTOOPT;
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 89ff8c6..f6732b6 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
> /* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
> udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
> udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
> + udp_encap_enable();
> }
>
> sk->sk_user_data = tunnel;
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [RFC v4] Add TCP encap_rcv hook
[not found] ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2012-04-12 14:40 ` Simon Horman
0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2012-04-12 14:40 UTC (permalink / raw)
To: Eric Dumazet
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
David Miller
This hook is based on a hook of the same name provided by UDP. It provides
a way for to receive packets that have a TCP header and treat them in some
alternate way.
It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwtich's datapath. A prototype of such an
implementation has been made.
The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01
My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
The second version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg09001.html
It needs to be updated to call tcp_encap_enable()
Cc: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* Make use of static_key,
a tonic for insanity suggested by Eric Dumazet
v3
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
and release lock on return.
v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
the encap_rcv callback returns a positive value.
This is the same behaviour as the UDP hook.
---
include/linux/tcp.h | 3 +++
include/net/tcp.h | 1 +
net/ipv4/tcp_ipv4.c | 34 +++++++++++++++++++++++++++++++++-
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
* contains related tcp_cookie_transactions fields.
*/
struct tcp_cookie_values *cookie_values;
+
+ /* For encapsulation sockets. */
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
};
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f75a04d..f2c4ac0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1575,5 +1575,6 @@ static inline struct tcp_extend_values *tcp_xv(struct request_values *rvp)
extern void tcp_v4_init(void);
extern void tcp_init(void);
+extern void tcp_encap_enable(void);
#endif /* _TCP_H */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..dadcec6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,6 +62,7 @@
#include <linux/init.h>
#include <linux/times.h>
#include <linux/slab.h>
+#include <linux/static_key.h>
#include <net/net_namespace.h>
#include <net/icmp.h>
@@ -1657,6 +1658,14 @@ csum_err:
}
EXPORT_SYMBOL(tcp_v4_do_rcv);
+static struct static_key tcp_encap_needed __read_mostly;
+void tcp_encap_enable(void)
+{
+ if (!static_key_enabled(&tcp_encap_needed))
+ static_key_slow_inc(&tcp_encap_needed);
+}
+EXPORT_SYMBOL(tcp_encap_enable);
+
/*
* From tcp_input.c
*/
@@ -1666,6 +1675,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
const struct iphdr *iph;
const struct tcphdr *th;
struct sock *sk;
+ struct tcp_sock *tp;
int ret;
struct net *net = dev_net(skb->dev);
@@ -1726,9 +1736,30 @@ process:
bh_lock_sock_nested(sk);
ret = 0;
+
+ tp = tcp_sk(sk);
+ if (static_key_false(&tcp_encap_needed)) {
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
+ encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+ if (encap_rcv != NULL) {
+ /*
+ * This is an encapsulation socket so pass the skb to
+ * the socket's tcp_encap_rcv() hook. Otherwise, just
+ * fall through and pass this up the TCP socket.
+ * up->encap_rcv() returns the following value:
+ * <=0 if skb was successfully passed to the encap
+ * handler or was discarded by it.
+ * >0 if skb should be passed on to TCP.
+ */
+ if (encap_rcv(sk, skb) <= 0) {
+ ret = 0;
+ goto unlock_sock;
+ }
+ }
+ }
+
if (!sock_owned_by_user(sk)) {
#ifdef CONFIG_NET_DMA
- struct tcp_sock *tp = tcp_sk(sk);
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
if (tp->ucopy.dma_chan)
@@ -1744,6 +1775,7 @@ process:
NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
goto discard_and_relse;
}
+unlock_sock:
bh_unlock_sock(sk);
sock_put(sk);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
2012-04-12 9:10 ` Eric Dumazet
2012-04-12 14:35 ` Simon Horman
@ 2012-04-13 17:41 ` David Miller
[not found] ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2012-04-13 17:41 UTC (permalink / raw)
To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
bcrl-Bw31MaZKKs3YtjvyW6yDsg
From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Thu, 12 Apr 2012 11:05:28 +0200
> Most machines dont use UDP encapsulation (L2TP)
>
> Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
> test if L2TP never setup the encap_rcv on a socket.
>
> Idea of this patch came after Simon Horman proposal to add a hook on TCP
> as well.
>
> If static_key is not yet enabled, the fast path does a single JMP .
>
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.
>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Applied to net-next, thanks Eric.
Ban, please incorporate this scheme when you respin your
fixed ipv6 encap/l2tp patches.
Thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
[not found] ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-04-13 17:45 ` Benjamin LaHaise
0 siblings, 0 replies; 9+ messages in thread
From: Benjamin LaHaise @ 2012-04-13 17:45 UTC (permalink / raw)
To: David Miller
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
Hi folks,
On Fri, Apr 13, 2012 at 01:41:08PM -0400, David Miller wrote:
> Ban, please incorporate this scheme when you respin your
> fixed ipv6 encap/l2tp patches.
>
> Thanks.
Okay, will do. Thanks for the feedback and heads up.
-ben
--
"Thought is the essence of where you are now."
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-04-13 17:45 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-04-12 7:42 [RFC v3] Add TCP encap_rcv hook Simon Horman
[not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-12 8:20 ` Eric Dumazet
2012-04-12 9:05 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
2012-04-12 9:10 ` Eric Dumazet
2012-04-12 14:35 ` Simon Horman
[not found] ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2012-04-12 14:40 ` [RFC v4] Add TCP encap_rcv hook Simon Horman
2012-04-13 17:41 ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
[not found] ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-04-13 17:45 ` Benjamin LaHaise
2012-04-12 13:10 ` [RFC v3] Add TCP encap_rcv hook Simon Horman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).