* [RFC v3] Add TCP encap_rcv hook
From: Simon Horman @ 2012-04-12 7:42 UTC
To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
This hook is based on a hook of the same name provided by UDP. It provides
a way to receive packets that have a TCP header and treat them in some
alternate way.
It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwitch's datapath. A prototype of such an
implementation has been made.
The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01
My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
The first version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
include/linux/tcp.h | 3 +++
net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
2 files changed, 25 insertions(+), 1 deletion(-)
v3
* First post to netdev
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
and release lock on return.
v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
the encap_rcv callback returns a positive value.
This is the same behaviour as the UDP hook.
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
* contains related tcp_cookie_transactions fields.
*/
struct tcp_cookie_values *cookie_values;
+
+ /* For encapsulation sockets. */
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
};
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..9898f71 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
const struct iphdr *iph;
const struct tcphdr *th;
struct sock *sk;
+ struct tcp_sock *tp;
int ret;
struct net *net = dev_net(skb->dev);
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
if (skb->pkt_type != PACKET_HOST)
goto discard_it;
@@ -1726,9 +1728,27 @@ process:
bh_lock_sock_nested(sk);
ret = 0;
+
+ tp = tcp_sk(sk);
+ encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+ if (encap_rcv != NULL) {
+ /*
+ * This is an encapsulation socket so pass the skb to
+ * the socket's tcp_encap_rcv() hook. Otherwise, just
+ * fall through and pass this up the TCP socket.
+ * tp->encap_rcv() returns the following value:
+ * <=0 if skb was successfully passed to the encap
+ * handler or was discarded by it.
+ * >0 if skb should be passed on to TCP.
+ */
+ if (encap_rcv(sk, skb) <= 0) {
+ ret = 0;
+ goto unlock_sock;
+ }
+ }
+
if (!sock_owned_by_user(sk)) {
#ifdef CONFIG_NET_DMA
- struct tcp_sock *tp = tcp_sk(sk);
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
if (tp->ucopy.dma_chan)
@@ -1744,6 +1764,7 @@ process:
NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
goto discard_and_relse;
}
+unlock_sock:
bh_unlock_sock(sk);
sock_put(sk);
--
1.7.9.5
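
The return-value convention described in the patch comment (<=0 means the hook consumed or dropped the packet, >0 means normal TCP processing continues) can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not kernel code: struct pkt, struct sock_model, demo_encap_rcv() and rcv() are hypothetical stand-ins for struct sk_buff, struct tcp_sock, an STT handler and the relevant branch of tcp_v4_rcv(), and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct pkt {
    int is_tunnel;
};

/* Hypothetical stand-in for struct tcp_sock. */
struct sock_model {
    /* NULL for ordinary sockets, set for encapsulation sockets. */
    int (*encap_rcv)(struct sock_model *sk, struct pkt *p);
    int tunnel_pkts; /* packets consumed by the hook */
    int tcp_pkts;    /* packets handed to normal TCP processing */
};

/* Example hook: consume tunnel packets (0), pass everything else on (1). */
static int demo_encap_rcv(struct sock_model *sk, struct pkt *p)
{
    if (!p->is_tunnel)
        return 1;
    sk->tunnel_pkts++;
    return 0;
}

/* Models the new branch: a result <= 0 means the hook took the packet. */
static void rcv(struct sock_model *sk, struct pkt *p)
{
    assert(sk != NULL && p != NULL);
    if (sk->encap_rcv != NULL && sk->encap_rcv(sk, p) <= 0)
        return;
    sk->tcp_pkts++;
}
```

A socket whose encap_rcv is NULL behaves exactly as before, which is why the cost under discussion is only the extra load and branch.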
* Re: [RFC v3] Add TCP encap_rcv hook
From: Eric Dumazet @ 2012-04-12 8:20 UTC
To: Simon Horman; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> This hook is based on a hook of the same name provided by UDP. It provides
> a way to receive packets that have a TCP header and treat them in some
> alternate way.
>
> It is intended to be used by an implementation of the STT tunneling
> protocol within Open vSwitch's datapath. A prototype of such an
> implementation has been made.
>
> The STT draft is available at
> http://tools.ietf.org/html/draft-davie-stt-01
>
> My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> The first version can be found at:
> http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
>
> Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
>

Hi Simon

Oh well, this is insane :(

> ---
> include/linux/tcp.h | 3 +++
> net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
> 2 files changed, 25 insertions(+), 1 deletion(-)
>
> v3
> * First post to netdev
> * Replace more UDP references with TCP
> * Move socket accesses to inside socket lock
> and release lock on return.
>
> v2
> * Fix comment to refer to TCP rather than UDP
> * Allow skb to continue traversing the stack if
> the encap_rcv callback returns a positive value.
> This is the same behaviour as the UDP hook.
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index b6c62d2..7210b23 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -472,6 +472,9 @@ struct tcp_sock {
> * contains related tcp_cookie_transactions fields.
> */
> struct tcp_cookie_values *cookie_values;
> +
> + /* For encapsulation sockets. */
> + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> };
>

This adds a new cache miss for all incoming tcp frames...

> static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 3a25cf7..9898f71 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> const struct iphdr *iph;
> const struct tcphdr *th;
> struct sock *sk;
> + struct tcp_sock *tp;
> int ret;
> struct net *net = dev_net(skb->dev);
> + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>
> if (skb->pkt_type != PACKET_HOST)
> goto discard_it;
> @@ -1726,9 +1728,27 @@ process:
>
> bh_lock_sock_nested(sk);
> ret = 0;
> +
> + tp = tcp_sk(sk);
> + encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> + if (encap_rcv != NULL) {

and a new conditional...

> + /*
> + * This is an encapsulation socket so pass the skb to
> + * the socket's tcp_encap_rcv() hook. Otherwise, just
> + * fall through and pass this up the TCP socket.
> + * tp->encap_rcv() returns the following value:
> + * <=0 if skb was successfully passed to the encap
> + * handler or was discarded by it.
> + * >0 if skb should be passed on to TCP.
> + */
> + if (encap_rcv(sk, skb) <= 0) {
> + ret = 0;
> + goto unlock_sock;
> + }
> + }
> +
> if (!sock_owned_by_user(sk)) {
> #ifdef CONFIG_NET_DMA
> - struct tcp_sock *tp = tcp_sk(sk);
> if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> if (tp->ucopy.dma_chan)
> @@ -1744,6 +1764,7 @@ process:
> NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> goto discard_and_relse;
> }
> +unlock_sock:
> bh_unlock_sock(sk);
>
> sock_put(sk);

I don't know, this sounds like a hack. Since you obviously spent a lot of
time on this stuff, let's be constructive.

I really suggest you take a look at <linux/static_key.h>

So that on machines without any need for this encap_rcv, we don't even
need to fetch tp->encap_rcv

    if (static_key_false(&stt_active)) {
        /* stt might be used on this socket */
        encap_rcv = ACCESS_ONCE(tp->encap_rcv);
        if (encap_rcv) {
            ...
        }
    }

This way, if stt is not used/loaded, we have a single NOP

If stt is used, NOP is patched to a JMP stt_code

I will probably implement this idea on UDP shortly so that you can have a
reference for your implementation.
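
The point of the suggested gate is that the per-socket field is not even loaded unless some user has switched the feature on. A userspace sketch of that ordering follows. It rests on assumptions: the plain bool encap_needed stands in for the static key (the kernel patches code rather than testing a variable when jump labels are available), and struct sock_model, consume_all() and rcv_gated() are hypothetical names. The field_reads counter exists only to make the skipped load visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct pkt {
    int id;
};

/* Hypothetical stand-in for struct tcp_sock. */
struct sock_model {
    int (*encap_rcv)(struct sock_model *sk, struct pkt *p);
    int field_reads; /* how often the per-socket hook field was fetched */
    int tcp_pkts;    /* packets handed to normal TCP processing */
};

/* Global gate, assumed here to be a plain flag instead of a static key. */
static bool encap_needed;

/* Example hook that consumes every packet. */
static int consume_all(struct sock_model *sk, struct pkt *p)
{
    (void)sk;
    (void)p;
    return 0;
}

/* Check the global gate first; touch sk->encap_rcv only when it is on. */
static void rcv_gated(struct sock_model *sk, struct pkt *p)
{
    assert(sk != NULL && p != NULL);
    if (encap_needed) {
        int (*hook)(struct sock_model *, struct pkt *) = sk->encap_rcv;

        sk->field_reads++;
        if (hook != NULL && hook(sk, p) <= 0)
            return;
    }
    sk->tcp_pkts++;
}
```

With the gate off, rcv_gated() never reads encap_rcv, which corresponds to avoiding the extra cache miss raised in the review.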
* [PATCH net-next] udp: intoduce udp_encap_needed static_key
From: Eric Dumazet @ 2012-04-12 9:05 UTC
To: Simon Horman, David Miller
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

Most machines don't use UDP encapsulation (L2TP)

Adds a static_key so that udp_queue_rcv_skb() doesn't have to perform a
test if L2TP never set up the encap_rcv on a socket.

Idea of this patch came after Simon Horman's proposal to add a hook on TCP
as well.

If static_key is not yet enabled, the fast path does a single JMP.

When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.

Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
---
include/net/udp.h | 1 +
net/ipv4/udp.c | 12 +++++++++++-
net/l2tp/l2tp_core.c | 1 +
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/include/net/udp.h b/include/net/udp.h
index 5d606d9..9671f5f 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -267,4 +267,5 @@ extern void udp_init(void);
extern int udp4_ufo_send_check(struct sk_buff *skb);
extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
netdev_features_t features);
+extern void udp_encap_enable(void);
#endif /* _UDP_H */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index fe14105..ad1e0dd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -107,6 +107,7 @@
#include <net/checksum.h>
#include <net/xfrm.h>
#include <trace/events/udp.h>
+#include <linux/static_key.h>
#include "udp_impl.h"
struct udp_table udp_table __read_mostly;
@@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
}
+static struct static_key udp_encap_needed __read_mostly;
+void udp_encap_enable(void)
+{
+ if (!static_key_enabled(&udp_encap_needed))
+ static_key_slow_inc(&udp_encap_needed);
+}
+EXPORT_SYMBOL(udp_encap_enable);
+
/* returns:
* -1: error
* 0: success
@@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
goto drop;
nf_reset(skb);
- if (up->encap_type) {
+ if (static_key_false(&udp_encap_needed) && up->encap_type) {
int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
/*
@@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
/* FALLTHROUGH */
case UDP_ENCAP_L2TPINUDP:
up->encap_type = val;
+ udp_encap_enable();
break;
default:
err = -ENOPROTOOPT;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 89ff8c6..f6732b6 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
+ udp_encap_enable();
}
sk->sk_user_data = tunnel;
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
From: Eric Dumazet @ 2012-04-12 9:10 UTC
To: Simon Horman
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller

On Thu, 2012-04-12 at 11:05 +0200, Eric Dumazet wrote:
> If static_key is not yet enabled, the fast path does a single JMP.
>
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.

Small note, Simon:

The jump trick is effective on x86 (and maybe some other arches) when
CONFIG_JUMP_LABEL=y

Otherwise, it's replaced by atomic_read(...) > 0: a conditional jump, but one
reading a read_mostly/shared variable instead of a per-socket field.
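
The fallback described in this note (a shared counter tested with "> 0" when jump labels are not available) together with the enable-once shape of udp_encap_enable() can be sketched in userspace with C11 atomics. The names struct key_model, key_enabled() and encap_enable() are hypothetical and only mirror the structure of the patch; this is not the kernel's static_key implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a static key when code patching is unavailable. */
struct key_model {
    atomic_int enabled;
};

/* Fallback test: the key counts as enabled once the counter is above zero. */
static bool key_enabled(struct key_model *k)
{
    assert(k != NULL);
    return atomic_load(&k->enabled) > 0;
}

/* Same shape as udp_encap_enable(): increment only if not yet enabled. */
static void encap_enable(struct key_model *k)
{
    if (!key_enabled(k))
        atomic_fetch_add(&k->enabled, 1);
}
```

The check and the increment are two separate steps, so two concurrent callers could both increment. In this model that only leaves the counter at 2, which still reads as enabled.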
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
From: Simon Horman @ 2012-04-12 14:35 UTC
To: Eric Dumazet
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller

On Thu, Apr 12, 2012 at 11:05:28AM +0200, Eric Dumazet wrote:
> Most machines don't use UDP encapsulation (L2TP)
>
> Adds a static_key so that udp_queue_rcv_skb() doesn't have to perform a
> test if L2TP never set up the encap_rcv on a socket.
>
> Idea of this patch came after Simon Horman's proposal to add a hook on TCP
> as well.
>
> If static_key is not yet enabled, the fast path does a single JMP.
>
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.

Thanks Eric,

I have not had a chance to test your code, though it should be easy
enough to do so in the context of Open vSwitch as its CAPWAP implementation
makes use of UDP's encap_rcv (which is how I arrived at adding a hook to TCP
to implement STT for Open vSwitch).

I have incorporated your static_key code into a new version of my TCP
encap_rcv patch and that does appear to work. I will post it ASAP.

>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
> ---
> include/net/udp.h | 1 +
> net/ipv4/udp.c | 12 +++++++++++-
> net/l2tp/l2tp_core.c | 1 +
> 3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 5d606d9..9671f5f 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -267,4 +267,5 @@ extern void udp_init(void);
> extern int udp4_ufo_send_check(struct sk_buff *skb);
> extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
> netdev_features_t features);
> +extern void udp_encap_enable(void);
> #endif /* _UDP_H */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index fe14105..ad1e0dd 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -107,6 +107,7 @@
> #include <net/checksum.h>
> #include <net/xfrm.h>
> #include <trace/events/udp.h>
> +#include <linux/static_key.h>
> #include "udp_impl.h"
>
> struct udp_table udp_table __read_mostly;
> @@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>
> }
>
> +static struct static_key udp_encap_needed __read_mostly;
> +void udp_encap_enable(void)
> +{
> + if (!static_key_enabled(&udp_encap_needed))
> + static_key_slow_inc(&udp_encap_needed);
> +}
> +EXPORT_SYMBOL(udp_encap_enable);
> +
> /* returns:
> * -1: error
> * 0: success
> @@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> goto drop;
> nf_reset(skb);
>
> - if (up->encap_type) {
> + if (static_key_false(&udp_encap_needed) && up->encap_type) {
> int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>
> /*
> @@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
> /* FALLTHROUGH */
> case UDP_ENCAP_L2TPINUDP:
> up->encap_type = val;
> + udp_encap_enable();
> break;
> default:
> err = -ENOPROTOOPT;
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 89ff8c6..f6732b6 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
> /* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
> udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
> udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
> + udp_encap_enable();
> }
>
> sk->sk_user_data = tunnel;
>
* [RFC v4] Add TCP encap_rcv hook
From: Simon Horman @ 2012-04-12 14:40 UTC
To: Eric Dumazet
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller

This hook is based on a hook of the same name provided by UDP. It provides
a way to receive packets that have a TCP header and treat them in some
alternate way.
It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwitch's datapath. A prototype of such an
implementation has been made.
The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01
My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
The second version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg09001.html
It needs to be updated to call tcp_encap_enable()
Cc: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* Make use of static_key, a tonic for insanity suggested by Eric Dumazet
v3
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
and release lock on return.
v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
the encap_rcv callback returns a positive value.
This is the same behaviour as the UDP hook.
---
include/linux/tcp.h | 3 +++
include/net/tcp.h | 1 +
net/ipv4/tcp_ipv4.c | 34 +++++++++++++++++++++++++++++++++-
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
* contains related tcp_cookie_transactions fields.
*/
struct tcp_cookie_values *cookie_values;
+
+ /* For encapsulation sockets. */
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
};
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f75a04d..f2c4ac0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1575,5 +1575,6 @@ static inline struct tcp_extend_values *tcp_xv(struct request_values *rvp)
extern void tcp_v4_init(void);
extern void tcp_init(void);
+extern void tcp_encap_enable(void);
#endif /* _TCP_H */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..dadcec6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,6 +62,7 @@
#include <linux/init.h>
#include <linux/times.h>
#include <linux/slab.h>
+#include <linux/static_key.h>
#include <net/net_namespace.h>
#include <net/icmp.h>
@@ -1657,6 +1658,14 @@ csum_err:
}
EXPORT_SYMBOL(tcp_v4_do_rcv);
+static struct static_key tcp_encap_needed __read_mostly;
+void tcp_encap_enable(void)
+{
+ if (!static_key_enabled(&tcp_encap_needed))
+ static_key_slow_inc(&tcp_encap_needed);
+}
+EXPORT_SYMBOL(tcp_encap_enable);
+
/*
* From tcp_input.c
*/
@@ -1666,6 +1675,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
const struct iphdr *iph;
const struct tcphdr *th;
struct sock *sk;
+ struct tcp_sock *tp;
int ret;
struct net *net = dev_net(skb->dev);
@@ -1726,9 +1736,30 @@ process:
bh_lock_sock_nested(sk);
ret = 0;
+
+ tp = tcp_sk(sk);
+ if (static_key_false(&tcp_encap_needed)) {
+ int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
+ encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+ if (encap_rcv != NULL) {
+ /*
+ * This is an encapsulation socket so pass the skb to
+ * the socket's tcp_encap_rcv() hook. Otherwise, just
+ * fall through and pass this up the TCP socket.
+ * tp->encap_rcv() returns the following value:
+ * <=0 if skb was successfully passed to the encap
+ * handler or was discarded by it.
+ * >0 if skb should be passed on to TCP.
+ */
+ if (encap_rcv(sk, skb) <= 0) {
+ ret = 0;
+ goto unlock_sock;
+ }
+ }
+ }
+
if (!sock_owned_by_user(sk)) {
#ifdef CONFIG_NET_DMA
- struct tcp_sock *tp = tcp_sk(sk);
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
if (tp->ucopy.dma_chan)
@@ -1744,6 +1775,7 @@ process:
NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
goto discard_and_relse;
}
+unlock_sock:
bh_unlock_sock(sk);
sock_put(sk);
--
1.7.9.5
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
From: David Miller @ 2012-04-13 17:41 UTC
To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, bcrl-Bw31MaZKKs3YtjvyW6yDsg

From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Thu, 12 Apr 2012 11:05:28 +0200

> Most machines don't use UDP encapsulation (L2TP)
>
> Adds a static_key so that udp_queue_rcv_skb() doesn't have to perform a
> test if L2TP never set up the encap_rcv on a socket.
>
> Idea of this patch came after Simon Horman's proposal to add a hook on TCP
> as well.
>
> If static_key is not yet enabled, the fast path does a single JMP.
>
> When static_key is enabled, JMP destination is patched to reach the real
> encap_type/encap_rcv logic, possibly adding cache misses.
>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Applied to net-next, thanks Eric.

Ben, please incorporate this scheme when you respin your
fixed ipv6 encap/l2tp patches.

Thanks.
* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
From: Benjamin LaHaise @ 2012-04-13 17:45 UTC
To: David Miller
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w

Hi folks,

On Fri, Apr 13, 2012 at 01:41:08PM -0400, David Miller wrote:
> Ben, please incorporate this scheme when you respin your
> fixed ipv6 encap/l2tp patches.
>
> Thanks.

Okay, will do. Thanks for the feedback and heads up.

-ben
--
"Thought is the essence of where you are now."
* Re: [RFC v3] Add TCP encap_rcv hook
From: Simon Horman @ 2012-04-12 13:10 UTC
To: Eric Dumazet; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 12, 2012 at 10:20:29AM +0200, Eric Dumazet wrote:
> On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> > This hook is based on a hook of the same name provided by UDP. It provides
> > a way to receive packets that have a TCP header and treat them in some
> > alternate way.
> >
> > It is intended to be used by an implementation of the STT tunneling
> > protocol within Open vSwitch's datapath. A prototype of such an
> > implementation has been made.
> >
> > The STT draft is available at
> > http://tools.ietf.org/html/draft-davie-stt-01
> >
> > My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> > The first version can be found at:
> > http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> >
> > Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> >
>
> Hi Simon
>
> Oh well, this is insane :(
>
> > ---
> > include/linux/tcp.h | 3 +++
> > net/ipv4/tcp_ipv4.c | 23 ++++++++++++++++++++++-
> > 2 files changed, 25 insertions(+), 1 deletion(-)
> >
> > v3
> > * First post to netdev
> > * Replace more UDP references with TCP
> > * Move socket accesses to inside socket lock
> > and release lock on return.
> >
> > v2
> > * Fix comment to refer to TCP rather than UDP
> > * Allow skb to continue traversing the stack if
> > the encap_rcv callback returns a positive value.
> > This is the same behaviour as the UDP hook.
> >
> > diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> > index b6c62d2..7210b23 100644
> > --- a/include/linux/tcp.h
> > +++ b/include/linux/tcp.h
> > @@ -472,6 +472,9 @@ struct tcp_sock {
> > * contains related tcp_cookie_transactions fields.
> > */
> > struct tcp_cookie_values *cookie_values;
> > +
> > + /* For encapsulation sockets. */
> > + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> > };
> >
>
> This adds a new cache miss for all incoming tcp frames...
>
> > static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 3a25cf7..9898f71 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > const struct iphdr *iph;
> > const struct tcphdr *th;
> > struct sock *sk;
> > + struct tcp_sock *tp;
> > int ret;
> > struct net *net = dev_net(skb->dev);
> > + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >
> > if (skb->pkt_type != PACKET_HOST)
> > goto discard_it;
> > @@ -1726,9 +1728,27 @@ process:
> >
> > bh_lock_sock_nested(sk);
> > ret = 0;
> > +
> > + tp = tcp_sk(sk);
> > + encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> > + if (encap_rcv != NULL) {
>
> and a new conditional...
>
> > + /*
> > + * This is an encapsulation socket so pass the skb to
> > + * the socket's tcp_encap_rcv() hook. Otherwise, just
> > + * fall through and pass this up the TCP socket.
> > + * tp->encap_rcv() returns the following value:
> > + * <=0 if skb was successfully passed to the encap
> > + * handler or was discarded by it.
> > + * >0 if skb should be passed on to TCP.
> > + */
> > + if (encap_rcv(sk, skb) <= 0) {
> > + ret = 0;
> > + goto unlock_sock;
> > + }
> > + }
> > +
> > if (!sock_owned_by_user(sk)) {
> > #ifdef CONFIG_NET_DMA
> > - struct tcp_sock *tp = tcp_sk(sk);
> > if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> > tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> > if (tp->ucopy.dma_chan)
> > @@ -1744,6 +1764,7 @@ process:
> > NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> > goto discard_and_relse;
> > }
> > +unlock_sock:
> > bh_unlock_sock(sk);
> >
> > sock_put(sk);
>
> I don't know, this sounds like a hack. Since you obviously spent a lot of
> time on this stuff, let's be constructive.

Hi Eric,

Thanks, I didn't really expect my patch to go in smoothly as is.
Though it may well be my first brush with insanity.

>
> I really suggest you take a look at <linux/static_key.h>
>
> So that on machines without any need for this encap_rcv, we don't even
> need to fetch tp->encap_rcv
>
>     if (static_key_false(&stt_active)) {
>         /* stt might be used on this socket */
>         encap_rcv = ACCESS_ONCE(tp->encap_rcv);
>         if (encap_rcv) {
>             ...
>         }
>     }
>
> This way, if stt is not used/loaded, we have a single NOP
>
> If stt is used, NOP is patched to a JMP stt_code
>
>
> I will probably implement this idea on UDP shortly so that you can have a
> reference for your implementation.

Thanks, I see your UDP code now.
I'll see about getting the same thing working for TCP.