netfilter-devel.vger.kernel.org archive mirror
* [net-next PATCH 00/16] Transparent proxying patches, take six
@ 2008-10-01 14:24 KOVACS Krisztian
  2008-10-01 14:24 ` [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible KOVACS Krisztian
                   ` (16 more replies)
  0 siblings, 17 replies; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Hi Dave,

This is the sixth round of transparent proxying patches recently
discussed at the Netfilter Workshop. Since the last incarnation [1]
we've added support for related ICMP packets in the socket
match. Should apply cleanly on top of net-next-2.6. Could you please
apply patches 1-11 (those touching core networking parts) and I'll ask
Patrick McHardy to take care of patches 12-16 (the Netfilter parts).

The aim of the patchset is to make non-locally bound sockets work both
for receiving and sending. The target is IPv4 TCP/UDP at the moment.

Speaking of the patches, there are two big parts:

 * Output path (patches 1-7): these modifications make it possible to
   send IPv4 datagrams with non-local source IP address by:

   - Introducing a new flowi flag (FLOWI_FLAG_ANYSRC) which disables
     source address checking in ip_route_output_slow(). This is
     also necessary for some of the tricks LVS does. [2]

   - Adding the IP_TRANSPARENT socket option (setting this requires
     CAP_NET_ADMIN to prevent source address spoofing).

   - Gluing these together across the TCP/UDP code.

 * Input path (patches 8-15): these changes add redirection support
   for TCP along with an iptables target implementing NAT-less traffic
   interception, and an iptables match to make ahead-of-time socket
   lookups on PREROUTING. These combined with a set of iptables rules
   and policy routing make non-locally bound sockets work.

   - The IPv4 TCP and UDP input paths are modified to use the socket
     reference stored in the skb, if one is present.

   - Netfilter IPv4 defragmentation is split into a separate
     module. (This could make sense independently of tproxy and
     conntrack, for example to have a stateless firewall which still
     does fragment reassembly.)

   - The 'socket' iptables match does a socket lookup on the
     destination address and matches if a socket was found.

   - The 'TPROXY' iptables target provides a way to intercept traffic
     without NAT -- it does an ahead-of-time socket lookup on the
     configured address and caches the socket reference in the skb.

The last patch adds a short introduction to using tproxy. A trivial patch
for netcat demonstrating the necessary modifications for proxies is
available separately at [3]. Squid has support for it in the 3.HEAD
(3.1) branch.
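
As a rough illustration of the kind of change a proxy needs (this is not
the netcat patch referenced above, only a minimal user-space sketch), a
listener could set IP_TRANSPARENT before binding. The helper name
tproxy_listener() and the backlog of 16 are assumptions made for the
example; the fallback value 19 for IP_TRANSPARENT is the one patch 02
adds to include/linux/in.h.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* value added by patch 02 of this series */
#endif

/*
 * Hypothetical helper: open a TCP listener that may be bound to a
 * non-local address. Returns the descriptor, or -errno on failure.
 */
static int tproxy_listener(const char *ip, unsigned short port)
{
	struct sockaddr_in sin;
	int fd, one = 1, err;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return -EINVAL;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -errno;

	/* Setting the option needs CAP_NET_ADMIN; expect EPERM without it. */
	if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(fd, 16) < 0) {
		err = errno;
		close(fd);
		return -err;
	}
	return fd;
}
```

The socket alone is not sufficient: as described above, iptables rules
and policy routing are still needed to steer traffic to it.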


References:
[1] http://lwn.net/Articles/254527/
[2] http://marc.info/?l=linux-netdev&m=118065358510836&...
[3] http://people.netfilter.org/hidden/tproxy/netcat-ip_trans...

-- 
KOVACS Krisztian




* [net-next PATCH 02/16] Implement IP_TRANSPARENT socket option
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
  2008-10-01 14:24 ` [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:30   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 15/16] iptables TPROXY target KOVACS Krisztian
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

This patch introduces the IP_TRANSPARENT socket option: enabling it makes
IPv4 routing omit the non-local source address check on output. Setting
IP_TRANSPARENT requires the CAP_NET_ADMIN capability.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---
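
For reference, the option can be exercised from user space roughly as
follows. This is only a sketch: set_transparent() and get_transparent()
are hypothetical wrapper names, and the fallback define assumes the
value 19 introduced by the diff below.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* value from the include/linux/in.h hunk */
#endif

/* Enable or disable the option. Returns 0, or -errno on failure. */
static int set_transparent(int fd, int on)
{
	/* Without CAP_NET_ADMIN the kernel side answers with EPERM. */
	if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &on, sizeof(on)) < 0)
		return -errno;
	return 0;
}

/* Read the option back. Returns the stored value, or -errno on failure. */
static int get_transparent(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &val, &len) < 0)
		return -errno;
	return val;
}
```

Since the setsockopt handler stores `!!val`, any non-zero value passed
to set_transparent() should read back as 1.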

 include/linux/in.h               |    1 +
 include/net/inet_sock.h          |    3 ++-
 include/net/inet_timewait_sock.h |    3 ++-
 net/ipv4/inet_timewait_sock.c    |    1 +
 net/ipv4/ip_sockglue.c           |   15 ++++++++++++++-
 5 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/linux/in.h b/include/linux/in.h
index 4065313..db458be 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -75,6 +75,7 @@ struct in_addr {
 #define IP_IPSEC_POLICY	16
 #define IP_XFRM_POLICY	17
 #define IP_PASSSEC	18
+#define IP_TRANSPARENT	19
 
 /* BSD compatibility */
 #define IP_RECVRETOPTS	IP_RETOPTS
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 643e26b..e97b66e 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -129,7 +129,8 @@ struct inet_sock {
 				is_icsk:1,
 				freebind:1,
 				hdrincl:1,
-				mc_loop:1;
+				mc_loop:1,
+				transparent:1;
 	int			mc_index;
 	__be32			mc_addr;
 	struct ip_mc_socklist	*mc_list;
diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index 9132490..80e4977 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -128,7 +128,8 @@ struct inet_timewait_sock {
 	__be16			tw_dport;
 	__u16			tw_num;
 	/* And these are ours. */
-	__u8			tw_ipv6only:1;
+	__u8			tw_ipv6only:1,
+				tw_transparent:1;
 	/* 15 bits hole, try to pack */
 	__u16			tw_ipv6_offset;
 	unsigned long		tw_ttd;
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 743f011..1c5fd38 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -126,6 +126,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
 		tw->tw_reuse	    = sk->sk_reuse;
 		tw->tw_hash	    = sk->sk_hash;
 		tw->tw_ipv6only	    = 0;
+		tw->tw_transparent  = inet->transparent;
 		tw->tw_prot	    = sk->sk_prot_creator;
 		twsk_net_set(tw, hold_net(sock_net(sk)));
 		atomic_set(&tw->tw_refcnt, 1);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 105d92a..465abf0 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -419,7 +419,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 			     (1<<IP_TTL) | (1<<IP_HDRINCL) |
 			     (1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
 			     (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
-			     (1<<IP_PASSSEC))) ||
+			     (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT))) ||
 	    optname == IP_MULTICAST_TTL ||
 	    optname == IP_MULTICAST_LOOP) {
 		if (optlen >= sizeof(int)) {
@@ -878,6 +878,16 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 		err = xfrm_user_policy(sk, optname, optval, optlen);
 		break;
 
+	case IP_TRANSPARENT:
+		if (!capable(CAP_NET_ADMIN)) {
+			err = -EPERM;
+			break;
+		}
+		if (optlen < 1)
+			goto e_inval;
+		inet->transparent = !!val;
+		break;
+
 	default:
 		err = -ENOPROTOOPT;
 		break;
@@ -1130,6 +1140,9 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname,
 	case IP_FREEBIND:
 		val = inet->freebind;
 		break;
+	case IP_TRANSPARENT:
+		val = inet->transparent;
+		break;
 	default:
 		release_sock(sk);
 		return -ENOPROTOOPT;




* [net-next PATCH 03/16] Allow binding to non-local addresses if IP_TRANSPARENT is set
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (14 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 04/16] Make inet_sock.h independent of route.h KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:31   ` David Miller
  2008-10-02 13:20 ` [net-next PATCH 00/16] Transparent proxying patches, take six Amos Jeffries
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Setting IP_TRANSPARENT is not really useful without allowing non-local
binds for the socket. To keep user-space code simple, we allow such binds
whenever IP_TRANSPARENT is set, even if IP_FREEBIND is not.

Signed-off-by: Tóth László Attila <panther@balabit.hu>
---
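
The resulting bind policy can be modelled in user space as a small
predicate. This is a sketch of the condition visible in the hunk below,
not kernel code: struct bind_ctx and bind_rejected() are hypothetical,
and the address-type tests (INADDR_ANY, RTN_LOCAL, RTN_MULTICAST, ...)
are folded into a single boolean.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs to the check in inet_bind(). */
struct bind_ctx {
	bool sysctl_nonlocal_bind;	/* sysctl_ip_nonlocal_bind */
	bool freebind;			/* inet->freebind */
	bool transparent;		/* inet->transparent */
	bool addr_acceptable;		/* INADDR_ANY, local, multicast, ... */
};

/* True if the bind would fail with -EADDRNOTAVAIL under this model. */
static bool bind_rejected(const struct bind_ctx *c)
{
	return !c->sysctl_nonlocal_bind &&
	       !(c->freebind || c->transparent) &&
	       !c->addr_acceptable;
}
```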

 net/ipv4/af_inet.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8a3ac1f..1fbff5f 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -469,7 +469,7 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	 */
 	err = -EADDRNOTAVAIL;
 	if (!sysctl_ip_nonlocal_bind &&
-	    !inet->freebind &&
+	    !(inet->freebind || inet->transparent) &&
 	    addr->sin_addr.s_addr != htonl(INADDR_ANY) &&
 	    chk_addr_ret != RTN_LOCAL &&
 	    chk_addr_ret != RTN_MULTICAST &&




* [net-next PATCH 01/16] Loosen source address check on IPv4 output
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (9 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 11/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:28   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 12/16] Split Netfilter IPv4 defragmentation into a separate module KOVACS Krisztian
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

ip_route_output_slow() contains a check to make sure that no flows with
non-local source IP addresses are routed. This makes it impossible to
use such addresses as the source of outgoing traffic.

This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---
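
The control flow after this change can be summarised by the following
user-space model. It is a sketch only: struct flow_model and
src_check_passes() are hypothetical names, and the booleans stand in
for ip_dev_find() and the destination tests in the hunks below.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOWI_FLAG_ANYSRC 0x01	/* as defined in the flow.h hunk */

/* Hypothetical model of a flow that carries an explicit source address. */
struct flow_model {
	uint8_t flags;			/* flowi flags */
	bool src_is_local;		/* ip_dev_find(src) would succeed */
	bool dst_mcast_or_bcast;	/* multicast or limited broadcast dst */
	bool oif_set;			/* oif != 0 */
};

/* True if the source address check lets the flow through. */
static bool src_check_passes(const struct flow_model *f)
{
	/* Multicast/broadcast without oif still needs a local source. */
	if (!f->oif_set && f->dst_mcast_or_bcast)
		return f->src_is_local;

	/* Otherwise the check is skipped only when ANYSRC is set. */
	if (!(f->flags & FLOWI_FLAG_ANYSRC))
		return f->src_is_local;

	return true;
}
```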

 include/net/flow.h |    2 ++
 net/ipv4/route.c   |   20 +++++++++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index 228b247..b45a5e4 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -47,6 +47,8 @@ struct flowi {
 #define fl4_scope	nl_u.ip4_u.scope
 
 	__u8	proto;
+	__u8	flags;
+#define FLOWI_FLAG_ANYSRC 0x01
 	union {
 		struct {
 			__be16	sport;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f62187b..a6d7c58 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2361,11 +2361,6 @@ static int ip_route_output_slow(struct net *net, struct rtable **rp,
 		    ipv4_is_zeronet(oldflp->fl4_src))
 			goto out;
 
-		/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
-		dev_out = ip_dev_find(net, oldflp->fl4_src);
-		if (dev_out == NULL)
-			goto out;
-
 		/* I removed check for oif == dev_out->oif here.
 		   It was wrong for two reasons:
 		   1. ip_dev_find(net, saddr) can return wrong iface, if saddr
@@ -2377,6 +2372,11 @@ static int ip_route_output_slow(struct net *net, struct rtable **rp,
 		if (oldflp->oif == 0
 		    && (ipv4_is_multicast(oldflp->fl4_dst) ||
 			oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
+			/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
+			dev_out = ip_dev_find(net, oldflp->fl4_src);
+			if (dev_out == NULL)
+				goto out;
+
 			/* Special hack: user can direct multicasts
 			   and limited broadcast via necessary interface
 			   without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
@@ -2395,9 +2395,15 @@ static int ip_route_output_slow(struct net *net, struct rtable **rp,
 			fl.oif = dev_out->ifindex;
 			goto make_route;
 		}
-		if (dev_out)
+
+		if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) {
+			/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
+			dev_out = ip_dev_find(net, oldflp->fl4_src);
+			if (dev_out == NULL)
+				goto out;
 			dev_put(dev_out);
-		dev_out = NULL;
+			dev_out = NULL;
+		}
 	}
 
 




* [net-next PATCH 04/16] Make inet_sock.h independent of route.h
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (13 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 14/16] iptables socket match KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:34   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 03/16] Allow binding to non-local addresses if IP_TRANSPARENT is set KOVACS Krisztian
  2008-10-02 13:20 ` [net-next PATCH 00/16] Transparent proxying patches, take six Amos Jeffries
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

inet_iif() in inet_sock.h requires route.h. Since users of inet_iif()
usually require other route.h functionality anyway, this patch moves
inet_iif() to route.h.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/inet_sock.h            |    7 -------
 include/net/ip_vs.h                |    1 +
 include/net/route.h                |    5 +++++
 net/ipv4/netfilter/nf_nat_helper.c |    1 +
 net/ipv6/af_inet6.c                |    1 +
 5 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index e97b66e..139b78b 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -24,7 +24,6 @@
 #include <net/flow.h>
 #include <net/sock.h>
 #include <net/request_sock.h>
-#include <net/route.h>
 #include <net/netns/hash.h>
 
 /** struct ip_options - IP Options
@@ -195,12 +194,6 @@ static inline int inet_sk_ehashfn(const struct sock *sk)
 	return inet_ehashfn(net, laddr, lport, faddr, fport);
 }
 
-
-static inline int inet_iif(const struct sk_buff *skb)
-{
-	return skb->rtable->rt_iif;
-}
-
 static inline struct request_sock *inet_reqsk_alloc(struct request_sock_ops *ops)
 {
 	struct request_sock *req = reqsk_alloc(ops);
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 33e2ac6..0b2071d 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -22,6 +22,7 @@
 
 #include <net/checksum.h>
 #include <linux/netfilter.h>		/* for union nf_inet_addr */
+#include <linux/ip.h>
 #include <linux/ipv6.h>			/* for struct ipv6hdr */
 #include <net/ipv6.h>			/* for ipv6_addr_copy */
 
diff --git a/include/net/route.h b/include/net/route.h
index 4f0d8c1..31d1485 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -204,4 +204,9 @@ static inline struct inet_peer *rt_get_peer(struct rtable *rt)
 	return rt->peer;
 }
 
+static inline int inet_iif(const struct sk_buff *skb)
+{
+	return skb->rtable->rt_iif;
+}
+
 #endif	/* _ROUTE_H */
diff --git a/net/ipv4/netfilter/nf_nat_helper.c b/net/ipv4/netfilter/nf_nat_helper.c
index 11976ea..112dcfa 100644
--- a/net/ipv4/netfilter/nf_nat_helper.c
+++ b/net/ipv4/netfilter/nf_nat_helper.c
@@ -16,6 +16,7 @@
 #include <linux/udp.h>
 #include <net/checksum.h>
 #include <net/tcp.h>
+#include <net/route.h>
 
 #include <linux/netfilter_ipv4.h>
 #include <net/netfilter/nf_conntrack.h>
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 95055f8..f018704 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -50,6 +50,7 @@
 #include <net/ipip.h>
 #include <net/protocol.h>
 #include <net/inet_common.h>
+#include <net/route.h>
 #include <net/transp_v6.h>
 #include <net/ip6_route.h>
 #include <net/addrconf.h>




* [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (7 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 16/16] Add documentation KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:42   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 11/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
incoming packets. The non-local source address check on output bites
us again, as replies for transparently redirected traffic won't have a
chance to leave the node.

This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing
the route lookup for those replies. Transparent replies are enabled if
the listening socket has the transparent socket flag set.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---
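
How the transparent bit travels from the socket to the reply flags can
be modelled as below. This is a user-space sketch with hypothetical
struct and function names; only the field names mirror the diff. The
value of IP_REPLY_ARG_NOSRCCHECK is an assumption, since its definition
is not part of the hunks shown here.

```c
#include <stdbool.h>
#include <stddef.h>

#define IP_REPLY_ARG_NOSRCCHECK 1	/* assumed value, not taken from this diff */

struct listener { bool transparent; };		/* models inet_sock */
struct request  { bool no_srccheck; };		/* models inet_request_sock */
struct timewait { bool tw_transparent; };	/* models inet_timewait_sock */

/* SYN received: the request inherits the listener's setting. */
static void conn_request(const struct listener *l, struct request *r)
{
	r->no_srccheck = l->transparent;
}

/* Flags for the ACK sent on behalf of a request socket. */
static int reqsk_ack_flags(const struct request *r)
{
	return r->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0;
}

/* Flags for the ACK sent on behalf of a time-wait socket. */
static int timewait_ack_flags(const struct timewait *tw)
{
	return tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0;
}

/* Flags for a RST; the socket may be absent. */
static int reset_flags(const struct listener *sk)
{
	return (sk && sk->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
}
```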

 include/net/inet_sock.h |    8 +++++++-
 net/ipv4/tcp_ipv4.c     |   12 +++++++++---
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 139b78b..dced3f6 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -72,7 +72,8 @@ struct inet_request_sock {
 				sack_ok	   : 1,
 				wscale_ok  : 1,
 				ecn_ok	   : 1,
-				acked	   : 1;
+				acked	   : 1,
+				no_srccheck: 1;
 	struct ip_options	*opt;
 };
 
@@ -204,4 +205,9 @@ static inline struct request_sock *inet_reqsk_alloc(struct request_sock_ops *ops
 	return req;
 }
 
+static inline __u8 inet_sk_flowi_flags(const struct sock *sk)
+{
+	return inet_sk(sk)->transparent ? FLOWI_FLAG_ANYSRC : 0;
+}
+
 #endif	/* _INET_SOCK_H */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 44aef1c..1ac4d05 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -591,6 +591,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 				      ip_hdr(skb)->saddr, /* XXX */
 				      sizeof(struct tcphdr), IPPROTO_TCP, 0);
 	arg.csumoffset = offsetof(struct tcphdr, check) / 2;
+	arg.flags = (sk && inet_sk(sk)->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
 
 	net = dev_net(skb->dst->dev);
 	ip_send_reply(net->ipv4.tcp_sock, skb,
@@ -606,7 +607,8 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 
 static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 			    u32 win, u32 ts, int oif,
-			    struct tcp_md5sig_key *key)
+			    struct tcp_md5sig_key *key,
+			    int reply_flags)
 {
 	struct tcphdr *th = tcp_hdr(skb);
 	struct {
@@ -659,6 +661,7 @@ static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 				    ip_hdr(skb)->daddr, &rep.th);
 	}
 #endif
+	arg.flags = reply_flags;
 	arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr,
 				      ip_hdr(skb)->saddr, /* XXX */
 				      arg.iov[0].iov_len, IPPROTO_TCP, 0);
@@ -681,7 +684,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
 			tcptw->tw_ts_recent,
 			tw->tw_bound_dev_if,
-			tcp_twsk_md5_key(tcptw)
+			tcp_twsk_md5_key(tcptw),
+			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0
 			);
 
 	inet_twsk_put(tw);
@@ -694,7 +698,8 @@ static void tcp_v4_reqsk_send_ack(struct sock *sk, struct sk_buff *skb,
 			tcp_rsk(req)->rcv_isn + 1, req->rcv_wnd,
 			req->ts_recent,
 			0,
-			tcp_v4_md5_do_lookup(sk, ip_hdr(skb)->daddr));
+			tcp_v4_md5_do_lookup(sk, ip_hdr(skb)->daddr),
+			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0);
 }
 
 /*
@@ -1244,6 +1249,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	ireq = inet_rsk(req);
 	ireq->loc_addr = daddr;
 	ireq->rmt_addr = saddr;
+	ireq->no_srccheck = inet_sk(sk)->transparent;
 	ireq->opt = tcp_v4_save_options(sk, skb);
 	if (!want_cookie)
 		TCP_ECN_create_request(req, tcp_hdr(skb));




* [net-next PATCH 05/16] Conditionally enable transparent flow flag when connecting
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (5 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 13/16] iptables tproxy core KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:36   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 16/16] Add documentation KOVACS Krisztian
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Set FLOWI_FLAG_ANYSRC in flowi->flags if the socket has the
transparent socket option set. This way we selectively enable certain
connections with non-local source addresses to be routed.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---
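
From user space, the connect path touched here would be reached along
these lines. This is a minimal sketch, assuming a hypothetical helper
connect_from() and the IP_TRANSPARENT value 19 from patch 02: the
socket is marked transparent, bound to a possibly non-local source
address, and then connected.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* value added by patch 02 of this series */
#endif

/*
 * Hypothetical helper: connect to dst_ip:dst_port using src_ip as the
 * source address. Returns the descriptor, or -errno on failure.
 */
static int connect_from(const char *src_ip, const char *dst_ip,
			unsigned short dst_port)
{
	struct sockaddr_in src, dst;
	int fd, one = 1, err;

	memset(&src, 0, sizeof(src));
	memset(&dst, 0, sizeof(dst));
	src.sin_family = AF_INET;
	dst.sin_family = AF_INET;
	dst.sin_port = htons(dst_port);
	if (inet_pton(AF_INET, src_ip, &src.sin_addr) != 1 ||
	    inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1)
		return -EINVAL;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -errno;

	/* The transparent bit is what makes the route lookup set ANYSRC. */
	if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0 ||
	    connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		err = errno;
		close(fd);
		return -err;
	}
	return fd;
}
```

Replies to such a connection only reach the socket if routing on the
return path leads back to this host.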

 include/net/route.h |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index 31d1485..4e8cae0 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -27,7 +27,7 @@
 #include <net/dst.h>
 #include <net/inetpeer.h>
 #include <net/flow.h>
-#include <net/sock.h>
+#include <net/inet_sock.h>
 #include <linux/in_route.h>
 #include <linux/rtnetlink.h>
 #include <linux/route.h>
@@ -161,6 +161,10 @@ static inline int ip_route_connect(struct rtable **rp, __be32 dst,
 
 	int err;
 	struct net *net = sock_net(sk);
+
+	if (inet_sk(sk)->transparent)
+		fl.flags |= FLOWI_FLAG_ANYSRC;
+
 	if (!dst || !src) {
 		err = __ip_route_output_key(net, rp, &fl);
 		if (err)




* [net-next PATCH 12/16] Split Netfilter IPv4 defragmentation into a separate module
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (10 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 01/16] Loosen source address check on IPv4 output KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-02  9:18   ` Patrick McHardy
  2008-10-01 14:24 ` [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Netfilter connection tracking requires all IPv4 packets to be defragmented.
Both the socket match and the TPROXY target depend on this functionality
too, so this patch moves the Netfilter IPv4 defrag hooks into a module of
their own.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---
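
The fragment test used by the defrag hook can be tried in user space
with the sketch below. FRAG_MF and FRAG_OFFSET are local names chosen
for the example; they carry the same values as the kernel's IP_MF and
IP_OFFSET. is_fragment() is a hypothetical helper.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_MF		0x2000	/* "more fragments" flag (kernel: IP_MF) */
#define FRAG_OFFSET	0x1FFF	/* fragment offset mask (kernel: IP_OFFSET) */

/*
 * frag_off_be is the IPv4 header's frag_off field in network byte
 * order. A packet is a fragment if MF is set or the offset is non-zero.
 */
static bool is_fragment(uint16_t frag_off_be)
{
	return (frag_off_be & htons(FRAG_MF | FRAG_OFFSET)) != 0;
}
```

Note that the DF bit (0x4000) is outside the mask, so an unfragmented
packet with "don't fragment" set is not handed to reassembly.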

 include/net/netfilter/ipv4/nf_defrag_ipv4.h    |    6 ++
 net/ipv4/netfilter/Kconfig                     |    5 +
 net/ipv4/netfilter/Makefile                    |    3 +
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   56 +-------------
 net/ipv4/netfilter/nf_defrag_ipv4.c            |   96 ++++++++++++++++++++++++
 5 files changed, 113 insertions(+), 53 deletions(-)

diff --git a/include/net/netfilter/ipv4/nf_defrag_ipv4.h b/include/net/netfilter/ipv4/nf_defrag_ipv4.h
new file mode 100644
index 0000000..6b00ea3
--- /dev/null
+++ b/include/net/netfilter/ipv4/nf_defrag_ipv4.h
@@ -0,0 +1,6 @@
+#ifndef _NF_DEFRAG_IPV4_H
+#define _NF_DEFRAG_IPV4_H
+
+extern void nf_defrag_ipv4_enable(void);
+
+#endif /* _NF_DEFRAG_IPV4_H */
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 90eb7cb..d011677 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -5,10 +5,15 @@
 menu "IP: Netfilter Configuration"
 	depends on INET && NETFILTER
 
+config NF_DEFRAG_IPV4
+	tristate
+	default n
+
 config NF_CONNTRACK_IPV4
 	tristate "IPv4 connection tracking support (required for NAT)"
 	depends on NF_CONNTRACK
 	default m if NETFILTER_ADVANCED=n
+	select NF_DEFRAG_IPV4
 	---help---
 	  Connection tracking keeps a record of what packets have passed
 	  through your machine, in order to figure out how they are related
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 3f31291..dded5e9 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -18,6 +18,9 @@ obj-$(CONFIG_NF_CONNTRACK_IPV4) += nf_conntrack_ipv4.o
 
 obj-$(CONFIG_NF_NAT) += nf_nat.o
 
+# defrag
+obj-$(CONFIG_NF_DEFRAG_IPV4) += nf_defrag_ipv4.o
+
 # NAT helpers (nf_conntrack)
 obj-$(CONFIG_NF_NAT_AMANDA) += nf_nat_amanda.o
 obj-$(CONFIG_NF_NAT_FTP) += nf_nat_ftp.o
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 5a955c4..abf82c3 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -1,3 +1,4 @@
+
 /* (C) 1999-2001 Paul `Rusty' Russell
  * (C) 2002-2004 Netfilter Core Team <coreteam@netfilter.org>
  *
@@ -24,6 +25,7 @@
 #include <net/netfilter/nf_conntrack_core.h>
 #include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
 #include <net/netfilter/nf_nat_helper.h>
+#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
 
 int (*nf_nat_seq_adjust_hook)(struct sk_buff *skb,
 			      struct nf_conn *ct,
@@ -63,23 +65,6 @@ static int ipv4_print_tuple(struct seq_file *s,
 			  NIPQUAD(tuple->dst.u3.ip));
 }
 
-/* Returns new sk_buff, or NULL */
-static int nf_ct_ipv4_gather_frags(struct sk_buff *skb, u_int32_t user)
-{
-	int err;
-
-	skb_orphan(skb);
-
-	local_bh_disable();
-	err = ip_defrag(skb, user);
-	local_bh_enable();
-
-	if (!err)
-		ip_send_check(ip_hdr(skb));
-
-	return err;
-}
-
 static int ipv4_get_l4proto(const struct sk_buff *skb, unsigned int nhoff,
 			    unsigned int *dataoff, u_int8_t *protonum)
 {
@@ -144,28 +129,6 @@ out:
 	return nf_conntrack_confirm(skb);
 }
 
-static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
-					  struct sk_buff *skb,
-					  const struct net_device *in,
-					  const struct net_device *out,
-					  int (*okfn)(struct sk_buff *))
-{
-	/* Previously seen (loopback)?  Ignore.  Do this before
-	   fragment check. */
-	if (skb->nfct)
-		return NF_ACCEPT;
-
-	/* Gather fragments. */
-	if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
-		if (nf_ct_ipv4_gather_frags(skb,
-					    hooknum == NF_INET_PRE_ROUTING ?
-					    IP_DEFRAG_CONNTRACK_IN :
-					    IP_DEFRAG_CONNTRACK_OUT))
-			return NF_STOLEN;
-	}
-	return NF_ACCEPT;
-}
-
 static unsigned int ipv4_conntrack_in(unsigned int hooknum,
 				      struct sk_buff *skb,
 				      const struct net_device *in,
@@ -195,13 +158,6 @@ static unsigned int ipv4_conntrack_local(unsigned int hooknum,
    make it the first hook. */
 static struct nf_hook_ops ipv4_conntrack_ops[] __read_mostly = {
 	{
-		.hook		= ipv4_conntrack_defrag,
-		.owner		= THIS_MODULE,
-		.pf		= PF_INET,
-		.hooknum	= NF_INET_PRE_ROUTING,
-		.priority	= NF_IP_PRI_CONNTRACK_DEFRAG,
-	},
-	{
 		.hook		= ipv4_conntrack_in,
 		.owner		= THIS_MODULE,
 		.pf		= PF_INET,
@@ -209,13 +165,6 @@ static struct nf_hook_ops ipv4_conntrack_ops[] __read_mostly = {
 		.priority	= NF_IP_PRI_CONNTRACK,
 	},
 	{
-		.hook           = ipv4_conntrack_defrag,
-		.owner          = THIS_MODULE,
-		.pf             = PF_INET,
-		.hooknum        = NF_INET_LOCAL_OUT,
-		.priority       = NF_IP_PRI_CONNTRACK_DEFRAG,
-	},
-	{
 		.hook		= ipv4_conntrack_local,
 		.owner		= THIS_MODULE,
 		.pf		= PF_INET,
@@ -422,6 +371,7 @@ static int __init nf_conntrack_l3proto_ipv4_init(void)
 	int ret = 0;
 
 	need_conntrack();
+	nf_defrag_ipv4_enable();
 
 	ret = nf_register_sockopt(&so_getorigdst);
 	if (ret < 0) {
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
new file mode 100644
index 0000000..aa2c50a
--- /dev/null
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -0,0 +1,96 @@
+/* (C) 1999-2001 Paul `Rusty' Russell
+ * (C) 2002-2004 Netfilter Core Team <coreteam@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/types.h>
+#include <linux/ip.h>
+#include <linux/netfilter.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <net/route.h>
+#include <net/ip.h>
+
+#include <linux/netfilter_ipv4.h>
+#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
+
+/* Returns new sk_buff, or NULL */
+static int nf_ct_ipv4_gather_frags(struct sk_buff *skb, u_int32_t user)
+{
+	int err;
+
+	skb_orphan(skb);
+
+	local_bh_disable();
+	err = ip_defrag(skb, user);
+	local_bh_enable();
+
+	if (!err)
+		ip_send_check(ip_hdr(skb));
+
+	return err;
+}
+
+static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
+					  struct sk_buff *skb,
+					  const struct net_device *in,
+					  const struct net_device *out,
+					  int (*okfn)(struct sk_buff *))
+{
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+	/* Previously seen (loopback)?  Ignore.  Do this before
+	   fragment check. */
+	if (skb->nfct)
+		return NF_ACCEPT;
+#endif
+
+	/* Gather fragments. */
+	if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
+		if (nf_ct_ipv4_gather_frags(skb,
+					    hooknum == NF_INET_PRE_ROUTING ?
+					    IP_DEFRAG_CONNTRACK_IN :
+					    IP_DEFRAG_CONNTRACK_OUT))
+			return NF_STOLEN;
+	}
+	return NF_ACCEPT;
+}
+
+static struct nf_hook_ops ipv4_defrag_ops[] = {
+	{
+		.hook		= ipv4_conntrack_defrag,
+		.owner		= THIS_MODULE,
+		.pf		= PF_INET,
+		.hooknum	= NF_INET_PRE_ROUTING,
+		.priority	= NF_IP_PRI_CONNTRACK_DEFRAG,
+	},
+	{
+		.hook           = ipv4_conntrack_defrag,
+		.owner          = THIS_MODULE,
+		.pf             = PF_INET,
+		.hooknum        = NF_INET_LOCAL_OUT,
+		.priority       = NF_IP_PRI_CONNTRACK_DEFRAG,
+	},
+};
+
+static int __init nf_defrag_init(void)
+{
+	return nf_register_hooks(ipv4_defrag_ops, ARRAY_SIZE(ipv4_defrag_ops));
+}
+
+static void __exit nf_defrag_fini(void)
+{
+	nf_unregister_hooks(ipv4_defrag_ops, ARRAY_SIZE(ipv4_defrag_ops));
+}
+
+void nf_defrag_ipv4_enable(void)
+{
+}
+EXPORT_SYMBOL_GPL(nf_defrag_ipv4_enable);
+
+module_init(nf_defrag_init);
+module_exit(nf_defrag_fini);
+
+MODULE_LICENSE("GPL");




* [net-next PATCH 09/16] Export UDP socket lookup function
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (3 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 08/16] Port redirection support for TCP KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:48   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 13/16] iptables tproxy core KOVACS Krisztian
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

The iptables tproxy code has to be able to do UDP socket hash lookups,
so we have to provide an exported lookup function for this purpose.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/udp.h |    4 ++++
 net/ipv4/udp.c    |    7 +++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index addcdc6..d38f6f2 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -148,6 +148,10 @@ extern int 	udp_lib_setsockopt(struct sock *sk, int level, int optname,
 				   char __user *optval, int optlen,
 				   int (*push_pending_frames)(struct sock *));
 
+extern struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
+				    __be32 daddr, __be16 dport,
+				    int dif);
+
 DECLARE_SNMP_STAT(struct udp_mib, udp_stats_in6);
 
 /* UDP-Lite does not have a standardized MIB yet, so we inherit from UDP */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8e42fbb..28c3c31 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -302,6 +302,13 @@ static struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 	return result;
 }
 
+struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
+			     __be32 daddr, __be16 dport, int dif)
+{
+	return __udp4_lib_lookup(net, saddr, sport, daddr, dport, dif, udp_hash);
+}
+EXPORT_SYMBOL_GPL(udp4_lib_lookup);
+
 static inline struct sock *udp_v4_mcast_next(struct sock *sk,
 					     __be16 loc_port, __be32 loc_addr,
 					     __be16 rmt_port, __be32 rmt_addr,




* [net-next PATCH 14/16] iptables socket match
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (12 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-02  9:26   ` Patrick McHardy
  2008-10-01 14:24 ` [net-next PATCH 04/16] Make inet_sock.h independent of route.h KOVACS Krisztian
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Add iptables 'socket' match, which matches packets for which a TCP/UDP
socket lookup succeeds.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 net/netfilter/Kconfig     |   15 ++++
 net/netfilter/Makefile    |    1 
 net/netfilter/xt_socket.c |  182 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 198 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index ff1b0e6..a4b8006 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -760,6 +760,21 @@ config NETFILTER_XT_MATCH_SCTP
 	  If you want to compile it as a module, say M here and read
 	  <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
 
+config NETFILTER_XT_MATCH_SOCKET
+	tristate '"socket" match support (EXPERIMENTAL)'
+	depends on EXPERIMENTAL
+	depends on NETFILTER_TPROXY
+	depends on NETFILTER_XTABLES
+	depends on NETFILTER_ADVANCED
+	select NF_DEFRAG_IPV4
+	help
+	  This option adds a `socket' match, which can be used to match
+	  packets for which a TCP or UDP socket lookup finds a valid socket.
+	  It can be used in combination with the MARK target and policy
+	  routing to implement full featured non-locally bound sockets.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_STATE
 	tristate '"state" match support'
 	depends on NETFILTER_XTABLES
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 1b8cb7f..c386755 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -80,6 +80,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_RATEEST) += xt_rateest.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_SCTP) += xt_sctp.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_SOCKET) += xt_socket.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_STATE) += xt_state.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_STATISTIC) += xt_statistic.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_STRING) += xt_string.o
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
new file mode 100644
index 0000000..b726c99
--- /dev/null
+++ b/net/netfilter/xt_socket.c
@@ -0,0 +1,182 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (C) 2007-2008 BalaBit IT Ltd.
+ * Author: Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <net/tcp.h>
+#include <net/udp.h>
+#include <net/icmp.h>
+#include <net/sock.h>
+#include <net/inet_sock.h>
+#include <net/netfilter/nf_tproxy_core.h>
+#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
+
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+#define XT_SOCKET_HAVE_CONNTRACK 1
+#include <net/netfilter/nf_conntrack.h>
+#endif
+
+static int
+extract_icmp_fields(const struct sk_buff *skb,
+		    u8 *protocol,
+		    __be32 *raddr,
+		    __be32 *laddr,
+		    __be16 *rport,
+		    __be16 *lport)
+{
+	struct iphdr *outside_iph = ip_hdr(skb);
+	struct iphdr *inside_iph, _inside_iph;
+	struct icmphdr *icmph, _icmph;
+	__be16 *ports, _ports[2];
+
+	icmph = skb_header_pointer(skb, outside_iph->ihl << 2, sizeof(_icmph), &_icmph);
+	if (icmph == NULL)
+		return 1;
+
+	switch (icmph->type) {
+	case ICMP_DEST_UNREACH:
+	case ICMP_SOURCE_QUENCH:
+	case ICMP_REDIRECT:
+	case ICMP_TIME_EXCEEDED:
+	case ICMP_PARAMETERPROB:
+		break;
+	default:
+		return 1;
+	}
+
+	inside_iph = skb_header_pointer(skb, (outside_iph->ihl << 2) + sizeof(struct icmphdr), sizeof(_inside_iph), &_inside_iph);
+	if (inside_iph == NULL)
+		return -EINVAL;
+
+	if (inside_iph->protocol != IPPROTO_TCP &&
+	    inside_iph->protocol != IPPROTO_UDP)
+		return 1;
+
+	ports = skb_header_pointer(skb, (outside_iph->ihl << 2) + sizeof(struct icmphdr) + (inside_iph->ihl << 2), sizeof(_ports), &_ports);
+	if (ports == NULL)
+		return 1;
+
+	/* the inside IP packet is the one quoted from our side, thus its saddr is the local address */
+	*protocol = inside_iph->protocol;
+	*laddr = inside_iph->saddr;
+	*lport = ports[0];
+	*raddr = inside_iph->daddr;
+	*rport = ports[1];
+
+	return 0;
+}
+
+
+static bool
+socket_mt(const struct sk_buff *skb,
+	  const struct net_device *in,
+	  const struct net_device *out,
+	  const struct xt_match *match,
+	  const void *matchinfo,
+	  int offset,
+	  unsigned int protoff,
+	  bool *hotdrop)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+	struct udphdr _hdr, *hp = NULL;
+	struct sock *sk;
+	__be32 daddr, saddr;
+	__be16 dport, sport;
+	u8 protocol;
+#ifdef XT_SOCKET_HAVE_CONNTRACK
+	struct nf_conn const *ct;
+	enum ip_conntrack_info ctinfo;
+#endif
+
+	if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) {
+		hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &_hdr);
+		if (hp == NULL)
+			return false;
+
+		protocol = iph->protocol;
+		saddr = iph->saddr;
+		sport = hp->source;
+		daddr = iph->daddr;
+		dport = hp->dest;
+
+	}
+	else if (iph->protocol == IPPROTO_ICMP) {
+		if (extract_icmp_fields(skb, &protocol, &saddr, &daddr, &sport, &dport))
+			return false;
+	}
+	else {
+		return false;
+	}
+
+#ifdef XT_SOCKET_HAVE_CONNTRACK
+	/* Do the lookup with the original socket address in case this is a
+	 * reply packet of an established SNAT-ted connection. */
+
+	ct = nf_ct_get(skb, &ctinfo);
+	if (ct && (ct != &nf_conntrack_untracked) &&
+	    ((iph->protocol != IPPROTO_ICMP && ctinfo == IP_CT_IS_REPLY + IP_CT_ESTABLISHED) ||
+	     (iph->protocol == IPPROTO_ICMP && ctinfo == IP_CT_IS_REPLY + IP_CT_RELATED)) &&
+	    (ct->status & IPS_SRC_NAT_DONE)) {
+
+		daddr = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip;
+		dport = (iph->protocol == IPPROTO_TCP) ?
+			ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.tcp.port :
+			ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.udp.port;
+	}
+#endif
+
+	sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), protocol,
+				   saddr, daddr, sport, dport, in, false);
+	if (sk != NULL) {
+		bool wildcard = (inet_sk(sk)->rcv_saddr == 0);
+
+		nf_tproxy_put_sock(sk);
+		if (wildcard)
+			sk = NULL;
+	}
+
+	pr_debug("socket match: proto %u %08x:%u -> %08x:%u (orig %08x:%u) sock %p\n",
+		 protocol, ntohl(saddr), ntohs(sport),
+		 ntohl(daddr), ntohs(dport),
+		 ntohl(iph->daddr), hp ? ntohs(hp->dest) : 0, sk);
+
+	return (sk != NULL);
+}
+
+static struct xt_match socket_mt_reg __read_mostly = {
+	.name		= "socket",
+	.family		= AF_INET,
+	.match		= socket_mt,
+	.hooks		= 1 << NF_INET_PRE_ROUTING,
+	.me		= THIS_MODULE,
+};
+
+static int __init socket_mt_init(void)
+{
+	nf_defrag_ipv4_enable();
+	return xt_register_match(&socket_mt_reg);
+}
+
+static void __exit socket_mt_exit(void)
+{
+	xt_unregister_match(&socket_mt_reg);
+}
+
+module_init(socket_mt_init);
+module_exit(socket_mt_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs");
+MODULE_DESCRIPTION("x_tables socket match module");
+MODULE_ALIAS("ipt_socket");
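For readers following the ICMP branch of the match: extract_icmp_fields() skips the outer IP header and the 8-byte ICMP header, then reads the quoted inner IP header and the first four bytes of the transport header. Because the quoted packet was sent from the local side, its source address and port are the local ones. The following is only a user-space sketch of that parsing over a raw byte buffer, not the kernel code. The names quoted_tuple and extract_quoted_tuple are hypothetical, fields are decoded to host byte order for readability, and the minimum-header-length checks are an addition the patch does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tuple the match looks up (host byte order). */
struct quoted_tuple {
	uint8_t  protocol;
	uint32_t laddr, raddr;
	uint16_t lport, rport;
};

static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 and fills *t on success, -1 if the packet is not a usable ICMP error. */
int extract_quoted_tuple(const uint8_t *pkt, size_t len, struct quoted_tuple *t)
{
	size_t outer_hl, inner, inner_hl, ports;

	assert(pkt != NULL && t != NULL);

	if (len < 20 || pkt[9] != 1)		/* outer protocol must be ICMP */
		return -1;
	outer_hl = (size_t)(pkt[0] & 0x0f) * 4;
	if (outer_hl < 20 || len < outer_hl + 8)
		return -1;

	switch (pkt[outer_hl]) {		/* ICMP type */
	case 3:					/* destination unreachable */
	case 4:					/* source quench */
	case 5:					/* redirect */
	case 11:				/* time exceeded */
	case 12:				/* parameter problem */
		break;
	default:
		return -1;
	}

	inner = outer_hl + 8;			/* quoted IP header follows the ICMP header */
	if (len < inner + 20)
		return -1;
	inner_hl = (size_t)(pkt[inner] & 0x0f) * 4;
	ports = inner + inner_hl;
	if (inner_hl < 20 || len < ports + 4)
		return -1;
	if (pkt[inner + 9] != 6 && pkt[inner + 9] != 17)	/* TCP or UDP only */
		return -1;

	/* The quoted packet came from our side: its source is the local end. */
	t->protocol = pkt[inner + 9];
	t->laddr = rd32(pkt + inner + 12);
	t->raddr = rd32(pkt + inner + 16);
	t->lport = rd16(pkt + ports);
	t->rport = rd16(pkt + ports + 2);
	return 0;
}
```

In the patch itself the values stay in network byte order and are passed directly to nf_tproxy_get_sock_v4().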




* [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (11 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 12/16] Split Netfilter IPv4 defragmentation into a separate module KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:50   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 14/16] iptables socket match KOVACS Krisztian
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Use the socket attached to the skb by the TPROXY target, if present,
instead of doing a new TCP socket lookup.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 net/ipv4/tcp_ipv4.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1ac4d05..0029db9 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1577,8 +1577,17 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags	 = iph->tos;
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
+#if defined(CONFIG_NETFILTER_TPROXY) || defined(CONFIG_NETFILTER_TPROXY_MODULE)
+	if (unlikely(skb->sk)) {
+		/* steal reference */
+		sk = skb->sk;
+		skb->destructor = NULL;
+		skb->sk = NULL;
+	} else
+#endif
 	sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
 			th->source, iph->daddr, th->dest, inet_iif(skb));
+
 	if (!sk)
 		goto no_tcp_socket;
 




* [net-next PATCH 08/16] Port redirection support for TCP
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (2 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 15/16] iptables TPROXY target KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:47   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 09/16] Export UDP socket lookup function KOVACS Krisztian
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Current TCP code relies on the local port of the listening socket
being the same as the destination port of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port.

This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/inet_sock.h         |    2 +-
 include/net/tcp.h               |    1 +
 net/ipv4/inet_connection_sock.c |    2 ++
 net/ipv4/syncookies.c           |    1 +
 net/ipv4/tcp_output.c           |    2 +-
 5 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index dced3f6..de0ecc7 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -61,8 +61,8 @@ struct inet_request_sock {
 	struct request_sock	req;
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 	u16			inet6_rsk_offset;
-	/* 2 bytes hole, try to pack */
 #endif
+	__be16			loc_port;
 	__be32			loc_addr;
 	__be32			rmt_addr;
 	__be16			rmt_port;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 12c9b4f..f6cc341 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -976,6 +976,7 @@ static inline void tcp_openreq_init(struct request_sock *req,
 	ireq->acked = 0;
 	ireq->ecn_ok = 0;
 	ireq->rmt_port = tcp_hdr(skb)->source;
+	ireq->loc_port = tcp_hdr(skb)->dest;
 }
 
 extern void tcp_enter_memory_pressure(struct sock *sk);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 432c570..21fcc5a 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -516,6 +516,8 @@ struct sock *inet_csk_clone(struct sock *sk, const struct request_sock *req,
 		newicsk->icsk_bind_hash = NULL;
 
 		inet_sk(newsk)->dport = inet_rsk(req)->rmt_port;
+		inet_sk(newsk)->num = ntohs(inet_rsk(req)->loc_port);
+		inet_sk(newsk)->sport = inet_rsk(req)->loc_port;
 		newsk->sk_write_space = sk_stream_write_space;
 
 		newicsk->icsk_retransmits = 0;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 929302b..d346c22 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -297,6 +297,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	treq->rcv_isn		= ntohl(th->seq) - 1;
 	treq->snt_isn		= cookie;
 	req->mss		= mss;
+	ireq->loc_port		= th->dest;
 	ireq->rmt_port		= th->source;
 	ireq->loc_addr		= ip_hdr(skb)->daddr;
 	ireq->rmt_addr		= ip_hdr(skb)->saddr;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a8499ef..493553c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2275,7 +2275,7 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
 	th->syn = 1;
 	th->ack = 1;
 	TCP_ECN_make_synack(req, th);
-	th->source = inet_sk(sk)->sport;
+	th->source = ireq->loc_port;
 	th->dest = ireq->rmt_port;
 	/* Setting of flags are superfluous here for callers (and ECE is
 	 * not even correctly set)




* [net-next PATCH 15/16] iptables TPROXY target
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
  2008-10-01 14:24 ` [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible KOVACS Krisztian
  2008-10-01 14:24 ` [net-next PATCH 02/16] Implement IP_TRANSPARENT socket option KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-02  9:28   ` Patrick McHardy
  2008-10-01 14:24 ` [net-next PATCH 08/16] Port redirection support for TCP KOVACS Krisztian
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

The TPROXY target implements redirection of non-local TCP/UDP traffic to local
sockets. Additionally, it's possible to manipulate the packet mark if and only
if a socket has been found. (We need this because we cannot use multiple
targets in the same iptables rule.)

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/linux/netfilter/xt_TPROXY.h |   14 ++++
 net/netfilter/Kconfig               |   15 +++++
 net/netfilter/Makefile              |    1 
 net/netfilter/xt_TPROXY.c           |  112 +++++++++++++++++++++++++++++++++++
 4 files changed, 142 insertions(+), 0 deletions(-)

diff --git a/include/linux/netfilter/xt_TPROXY.h b/include/linux/netfilter/xt_TPROXY.h
new file mode 100644
index 0000000..152e8f9
--- /dev/null
+++ b/include/linux/netfilter/xt_TPROXY.h
@@ -0,0 +1,14 @@
+#ifndef _XT_TPROXY_H_target
+#define _XT_TPROXY_H_target
+
+/* TPROXY target is capable of marking the packet to perform
+ * redirection. We can get rid of that whenever we get support for
+ * multiple targets in the same rule. */
+struct xt_tproxy_target_info {
+	u_int32_t mark_mask;
+	u_int32_t mark_value;
+	__be32 laddr;
+	__be16 lport;
+};
+
+#endif /* _XT_TPROXY_H_target */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index a4b8006..bef2144 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -421,6 +421,21 @@ config NETFILTER_XT_TARGET_RATEEST
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
+config NETFILTER_XT_TARGET_TPROXY
+	tristate '"TPROXY" target support (EXPERIMENTAL)'
+	depends on EXPERIMENTAL
+	depends on NETFILTER_TPROXY
+	depends on NETFILTER_XTABLES
+	depends on NETFILTER_ADVANCED
+	select NF_DEFRAG_IPV4
+	help
+	  This option adds a `TPROXY' target, which is somewhat similar to
+	  REDIRECT.  It can only be used in the mangle table and is useful
+	  to redirect traffic to a transparent proxy.  It does _not_ depend
+	  on Netfilter connection tracking and NAT, unlike REDIRECT.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_TARGET_TRACE
 	tristate  '"TRACE" target support'
 	depends on NETFILTER_XTABLES
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index c386755..fdb72f8 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -51,6 +51,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_NFQUEUE) += xt_NFQUEUE.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_NOTRACK) += xt_NOTRACK.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_RATEEST) += xt_RATEEST.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_SECMARK) += xt_SECMARK.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_TPROXY) += xt_TPROXY.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_TCPMSS) += xt_TCPMSS.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP) += xt_TCPOPTSTRIP.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_TRACE) += xt_TRACE.o
diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
new file mode 100644
index 0000000..183f251
--- /dev/null
+++ b/net/netfilter/xt_TPROXY.c
@@ -0,0 +1,112 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2006-2007 BalaBit IT Ltd.
+ * Author: Balazs Scheidler, Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/ip.h>
+#include <net/checksum.h>
+#include <net/udp.h>
+#include <net/inet_sock.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <linux/netfilter/xt_TPROXY.h>
+
+#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
+#include <net/netfilter/nf_tproxy_core.h>
+
+static unsigned int
+tproxy_tg(struct sk_buff *skb,
+	  const struct net_device *in,
+	  const struct net_device *out,
+	  unsigned int hooknum,
+	  const struct xt_target *target,
+	  const void *targinfo)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+	const struct xt_tproxy_target_info *tgi = targinfo;
+	struct udphdr _hdr, *hp;
+	struct sock *sk;
+
+	hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &_hdr);
+	if (hp == NULL)
+		return NF_DROP;
+
+	sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), iph->protocol,
+				   iph->saddr, tgi->laddr ? tgi->laddr : iph->daddr,
+				   hp->source, tgi->lport ? tgi->lport : hp->dest,
+				   in, true);
+
+	/* NOTE: assign_sock consumes our sk reference */
+	if (sk && nf_tproxy_assign_sock(skb, sk)) {
+		/* This should be in a separate target, but we don't do multiple
+		   targets on the same rule yet */
+		skb->mark = (skb->mark & ~tgi->mark_mask) ^ tgi->mark_value;
+
+		pr_debug("redirecting: proto %u %08x:%u -> %08x:%u, mark: %x\n",
+			 iph->protocol, ntohl(iph->daddr), ntohs(hp->dest),
+			 ntohl(tgi->laddr), ntohs(tgi->lport), skb->mark);
+		return NF_ACCEPT;
+	}
+
+	pr_debug("no socket, dropping: proto %u %08x:%u -> %08x:%u, mark: %x\n",
+		 iph->protocol, ntohl(iph->daddr), ntohs(hp->dest),
+		 ntohl(tgi->laddr), ntohs(tgi->lport), skb->mark);
+	return NF_DROP;
+}
+
+static bool
+tproxy_tg_check(const char *tablename,
+		const void *entry,
+		const struct xt_target *target,
+		void *targetinfo,
+		unsigned int hook_mask)
+{
+	const struct ipt_ip *i = entry;
+
+	if ((i->proto == IPPROTO_TCP || i->proto == IPPROTO_UDP)
+	    && !(i->invflags & IPT_INV_PROTO))
+		return true;
+
+	pr_info("xt_TPROXY: Can be used only in combination with "
+		"either -p tcp or -p udp\n");
+	return false;
+}
+
+static struct xt_target tproxy_tg_reg __read_mostly = {
+	.name		= "TPROXY",
+	.family		= AF_INET,
+	.table		= "mangle",
+	.target		= tproxy_tg,
+	.targetsize	= sizeof(struct xt_tproxy_target_info),
+	.checkentry	= tproxy_tg_check,
+	.hooks		= 1 << NF_INET_PRE_ROUTING,
+	.me		= THIS_MODULE,
+};
+
+static int __init tproxy_tg_init(void)
+{
+	nf_defrag_ipv4_enable();
+	return xt_register_target(&tproxy_tg_reg);
+}
+
+static void __exit tproxy_tg_exit(void)
+{
+	xt_unregister_target(&tproxy_tg_reg);
+}
+
+module_init(tproxy_tg_init);
+module_exit(tproxy_tg_exit);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs");
+MODULE_DESCRIPTION("Netfilter transparent proxy (TPROXY) target module.");
+MODULE_ALIAS("ipt_TPROXY");




* [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:45   ` David Miller
  2008-10-01 14:24 ` [net-next PATCH 02/16] Implement IP_TRANSPARENT socket option KOVACS Krisztian
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Netfilter's ip_route_me_harder() tries to re-route packets either generated or
re-routed by Netfilter. This patch changes ip_route_me_harder() to handle
packets from non-locally-bound sockets with IP_TRANSPARENT set as local and to
set the appropriate flowi flags when re-doing the routing lookup.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/ip.h                |    9 +++++++++
 net/ipv4/inet_connection_sock.c |    1 +
 net/ipv4/ip_output.c            |    4 +++-
 net/ipv4/netfilter.c            |    3 +++
 net/ipv4/syncookies.c           |    2 ++
 5 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 250e6ef..d678ea3 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -29,6 +29,7 @@
 
 #include <net/inet_sock.h>
 #include <net/snmp.h>
+#include <net/flow.h>
 
 struct sock;
 
@@ -140,12 +141,20 @@ static inline void ip_tr_mc_map(__be32 addr, char *buf)
 
 struct ip_reply_arg {
 	struct kvec iov[1];   
+	int	    flags;
 	__wsum 	    csum;
 	int	    csumoffset; /* u16 offset of csum in iov[0].iov_base */
 				/* -1 if not needed */ 
 	int	    bound_dev_if;
 }; 
 
+#define IP_REPLY_ARG_NOSRCCHECK 1
+
+static inline __u8 ip_reply_arg_flowi_flags(const struct ip_reply_arg *arg)
+{
+	return (arg->flags & IP_REPLY_ARG_NOSRCCHECK) ? FLOWI_FLAG_ANYSRC : 0;
+}
+
 void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *arg,
 		   unsigned int len); 
 
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 0c1ae68..432c570 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -335,6 +335,7 @@ struct dst_entry* inet_csk_route_req(struct sock *sk,
 					.saddr = ireq->loc_addr,
 					.tos = RT_CONN_FLAGS(sk) } },
 			    .proto = sk->sk_protocol,
+			    .flags = inet_sk_flowi_flags(sk),
 			    .uli_u = { .ports =
 				       { .sport = inet_sk(sk)->sport,
 					 .dport = ireq->rmt_port } } };
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d533a89..d2a8f8b 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -340,6 +340,7 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
 							.saddr = inet->saddr,
 							.tos = RT_CONN_FLAGS(sk) } },
 					    .proto = sk->sk_protocol,
+					    .flags = inet_sk_flowi_flags(sk),
 					    .uli_u = { .ports =
 						       { .sport = inet->sport,
 							 .dport = inet->dport } } };
@@ -1371,7 +1372,8 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar
 				    .uli_u = { .ports =
 					       { .sport = tcp_hdr(skb)->dest,
 						 .dport = tcp_hdr(skb)->source } },
-				    .proto = sk->sk_protocol };
+				    .proto = sk->sk_protocol,
+				    .flags = ip_reply_arg_flowi_flags(arg) };
 		security_skb_classify_flow(skb, &fl);
 		if (ip_route_output_key(sock_net(sk), &rt, &fl))
 			return;
diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
index f8edacd..01671ad 100644
--- a/net/ipv4/netfilter.c
+++ b/net/ipv4/netfilter.c
@@ -20,6 +20,8 @@ int ip_route_me_harder(struct sk_buff *skb, unsigned addr_type)
 	unsigned int type;
 
 	type = inet_addr_type(&init_net, iph->saddr);
+	if (skb->sk && inet_sk(skb->sk)->transparent)
+		type = RTN_LOCAL;
 	if (addr_type == RTN_UNSPEC)
 		addr_type = type;
 
@@ -33,6 +35,7 @@ int ip_route_me_harder(struct sk_buff *skb, unsigned addr_type)
 		fl.nl_u.ip4_u.tos = RT_TOS(iph->tos);
 		fl.oif = skb->sk ? skb->sk->sk_bound_dev_if : 0;
 		fl.mark = skb->mark;
+		fl.flags = skb->sk ? inet_sk_flowi_flags(skb->sk) : 0;
 		if (ip_route_output_key(&init_net, &rt, &fl) != 0)
 			return -1;
 
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 9d38005..929302b 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -16,6 +16,7 @@
 #include <linux/cryptohash.h>
 #include <linux/kernel.h>
 #include <net/tcp.h>
+#include <net/route.h>
 
 /* Timestamps: lowest 9 bits store TCP options */
 #define TSBITS 9
@@ -337,6 +338,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 						.saddr = ireq->loc_addr,
 						.tos = RT_CONN_FLAGS(sk) } },
 				    .proto = IPPROTO_TCP,
+				    .flags = inet_sk_flowi_flags(sk),
 				    .uli_u = { .ports =
 					       { .sport = th->dest,
 						 .dport = th->source } } };




* [net-next PATCH 11/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (8 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 14:24 ` [net-next PATCH 01/16] Loosen source address check on IPv4 output KOVACS Krisztian
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Use the socket attached to the skb by the TPROXY target, if present,
instead of doing a new UDP socket lookup.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 net/ipv4/udp.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 28c3c31..230cd8c 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -360,6 +360,14 @@ void __udp4_lib_err(struct sk_buff *skb, u32 info, struct hlist_head udptable[])
 	int err;
 	struct net *net = dev_net(skb->dev);
 
+#if defined(CONFIG_NETFILTER_TPROXY) || defined(CONFIG_NETFILTER_TPROXY_MODULE)
+	if (unlikely(skb->sk)) {
+		/* steal reference */
+		sk = skb->sk;
+		skb->destructor = NULL;
+		skb->sk = NULL;
+	} else
+#endif
 	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
 			iph->saddr, uh->source, skb->dev->ifindex, udptable);
 	if (sk == NULL) {
@@ -1198,6 +1206,14 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 		return __udp4_lib_mcast_deliver(net, skb, uh,
 				saddr, daddr, udptable);
 
+#if defined(CONFIG_NETFILTER_TPROXY) || defined(CONFIG_NETFILTER_TPROXY_MODULE)
+	if (unlikely(skb->sk)) {
+		/* steal reference */
+		sk = skb->sk;
+		skb->destructor = NULL;
+		skb->sk = NULL;
+	} else
+#endif
 	sk = __udp4_lib_lookup(net, saddr, uh->source, daddr,
 			uh->dest, inet_iif(skb), udptable);
 




* [net-next PATCH 16/16] Add documentation
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (6 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 05/16] Conditionally enable transparent flow flag when connecting KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-01 16:22   ` Randy Dunlap
  2008-10-03 14:01   ` [net-next " Jan Engelhardt
  2008-10-01 14:24 ` [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency KOVACS Krisztian
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

Add basic usage instructions to Documentation/networking.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 Documentation/networking/tproxy.txt |   85 +++++++++++++++++++++++++++++++++++
 1 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt
new file mode 100644
index 0000000..cf79e60
--- /dev/null
+++ b/Documentation/networking/tproxy.txt
@@ -0,0 +1,85 @@
+Transparent proxy support
+=========================
+
+This feature adds Linux 2.2-like transparent proxy support to current kernels.
+To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in
+your kernel config. You will need policy routing too, so be sure to enable that
+as well.
+
+
+1. Making non-local sockets work
+================================
+
+The idea is that you identify packets with destination address matching a local
+socket on your box, set the packet mark to a certain value, and then match on that
+value using policy routing to have those packets delivered locally:
+
+# iptables -t mangle -N DIVERT
+# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
+# iptables -t mangle -A DIVERT -j MARK --set-mark 1
+# iptables -t mangle -A DIVERT -j ACCEPT
+
+# ip rule add fwmark 1 lookup 100
+# ip route add local 0.0.0.0/0 dev lo table 100
+
+Because of certain restrictions in the IPv4 routing output code you'll have to
+modify your application to allow it to send datagrams _from_ non-local IP
+addresses. All you have to do is to enable the (SOL_IP, IP_TRANSPARENT) socket
+option before calling bind:
+
+fd = socket(AF_INET, SOCK_STREAM, 0);
+/* - 8< -*/
+int value = 1;
+setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
+/* - 8< -*/
+name.sin_family = AF_INET;
+name.sin_port = htons(0xCAFE);
+name.sin_addr.s_addr = htonl(0xDEADBEEF);
+bind(fd, (struct sockaddr *)&name, sizeof(name));
+
+A trivial patch for netcat is available here:
+http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch
+
+
+2. Redirecting traffic
+======================
+
+Transparent proxying often involves "intercepting" traffic on a router. This is
+usually done with the iptables REDIRECT target; however, that method has
+serious limitations. One of the major issues is that it actually
+modifies the packets to change the destination address -- which might not be
+acceptable in certain situations. (Think of proxying UDP for example: you won't
+be able to find out the original destination address. Even in case of TCP
+getting the original destination address is racy.)
+
+The 'TPROXY' target provides similar functionality without relying on NAT. Simply
+add rules like this to the iptables ruleset above:
+
+# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
+  --tproxy-mark 0x1/0x1 --on-port 50080
+
+Note that for this to work you'll have to modify the proxy to enable (SOL_IP,
+IP_TRANSPARENT) for the listening socket.
+
+
+3. Iptables extensions
+======================
+
+To use tproxy you'll need to have the 'socket' and 'TPROXY' modules
+compiled for iptables. A patched version of iptables is available
+here: http://git.balabit.hu/?p=bazsi/iptables-tproxy.git
+
+
+4. Application support
+======================
+
+4.1. Squid
+----------
+
+Squid 3.HEAD has support built-in. To use it, pass
+'--enable-linux-netfilter' to configure and set the 'tproxy' option on
+the HTTP listener you redirect traffic to with the TPROXY iptables
+target.
+
+For more information please consult the following page on the Squid
+wiki: http://wiki.squid-cache.org/Features/Tproxy4
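
For reference, the setsockopt()/bind() fragment in section 1 of the documentation above can be expanded into a complete helper with error checking. This is only a sketch under stated assumptions: the names fill_addr() and transparent_bind() are made up for illustration, and the fallback value for IP_TRANSPARENT is taken from <linux/in.h> for C libraries whose headers do not yet define it. Without CAP_NET_ADMIN the setsockopt() call is expected to fail with EPERM.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SOL_IP
#define SOL_IP IPPROTO_IP
#endif
#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* assumption: value from <linux/in.h> */
#endif

/* Hypothetical helper: build a sockaddr_in from host-order address and port. */
static void fill_addr(struct sockaddr_in *name, uint32_t addr, uint16_t port)
{
	memset(name, 0, sizeof(*name));
	name->sin_family = AF_INET;
	name->sin_port = htons(port);
	name->sin_addr.s_addr = htonl(addr);
}

/*
 * Hypothetical helper: create a TCP socket, enable IP_TRANSPARENT and only
 * then bind to a (possibly non-local) address. Returns the descriptor, or
 * -1 with errno set by the call that failed.
 */
static int transparent_bind(uint32_t addr, uint16_t port)
{
	struct sockaddr_in name;
	int value = 1;
	int saved;
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Must happen before bind(); needs CAP_NET_ADMIN. */
	if (setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value)) < 0)
		goto fail;

	fill_addr(&name, addr, port);
	if (bind(fd, (struct sockaddr *)&name, sizeof(name)) < 0)
		goto fail;

	return fd;

fail:
	saved = errno;
	close(fd);
	errno = saved;
	return -1;
}
```

A caller would use transparent_bind(0xDEADBEEF, 0xCAFE) to mirror the values in the documentation and should check errno on failure.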




* [net-next PATCH 13/16] iptables tproxy core
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (4 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 09/16] Export UDP socket lookup function KOVACS Krisztian
@ 2008-10-01 14:24 ` KOVACS Krisztian
  2008-10-02  9:19   ` Patrick McHardy
  2008-10-01 14:24 ` [net-next PATCH 05/16] Conditionally enable transparent flow flag when connecting KOVACS Krisztian
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: Patrick McHardy, netdev, netfilter-devel

The iptables tproxy core is a module that contains the common routines used by
the various tproxy-related modules (the TPROXY target and the socket match).

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/netfilter/nf_tproxy_core.h |   32 +++++++++++
 net/netfilter/Kconfig                  |   15 +++++
 net/netfilter/Makefile                 |    3 +
 net/netfilter/nf_tproxy_core.c         |   96 ++++++++++++++++++++++++++++++++
 4 files changed, 146 insertions(+), 0 deletions(-)

diff --git a/include/net/netfilter/nf_tproxy_core.h b/include/net/netfilter/nf_tproxy_core.h
new file mode 100644
index 0000000..208b46f
--- /dev/null
+++ b/include/net/netfilter/nf_tproxy_core.h
@@ -0,0 +1,32 @@
+#ifndef _NF_TPROXY_CORE_H
+#define _NF_TPROXY_CORE_H
+
+#include <linux/types.h>
+#include <linux/in.h>
+#include <linux/skbuff.h>
+#include <net/sock.h>
+#include <net/inet_sock.h>
+#include <net/tcp.h>
+
+/* look up and get a reference to a matching socket */
+extern struct sock *
+nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
+		      const __be32 saddr, const __be32 daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in, bool listening);
+
+static inline void
+nf_tproxy_put_sock(struct sock *sk)
+{
+	/* TIME_WAIT inet sockets have to be handled differently */
+	if ((sk->sk_protocol == IPPROTO_TCP) && (sk->sk_state == TCP_TIME_WAIT))
+		inet_twsk_put(inet_twsk(sk));
+	else
+		sock_put(sk);
+}
+
+/* assign a socket to the skb -- consumes sk */
+int
+nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk);
+
+#endif
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index ee898e7..ff1b0e6 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -287,6 +287,21 @@ config NF_CT_NETLINK
 	help
 	  This option enables support for a netlink-based userspace interface
 
+# transparent proxy support
+config NETFILTER_TPROXY
+	tristate "Transparent proxying support (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	depends on IP_NF_MANGLE
+	depends on NETFILTER_ADVANCED
+	help
+	  This option enables transparent proxying support, that is,
+	  support for handling non-locally bound IPv4 TCP and UDP sockets.
+	  For it to work you will have to configure certain iptables rules
+	  and use policy routing. For more information on how to set it up
+	  see Documentation/networking/tproxy.txt.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XTABLES
 	tristate "Netfilter Xtables support (required for ip_tables)"
 	default m if NETFILTER_ADVANCED=n
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 3bd2cc5..1b8cb7f 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -34,6 +34,9 @@ obj-$(CONFIG_NF_CONNTRACK_SANE) += nf_conntrack_sane.o
 obj-$(CONFIG_NF_CONNTRACK_SIP) += nf_conntrack_sip.o
 obj-$(CONFIG_NF_CONNTRACK_TFTP) += nf_conntrack_tftp.o
 
+# transparent proxy support
+obj-$(CONFIG_NETFILTER_TPROXY) += nf_tproxy_core.o
+
 # generic X tables 
 obj-$(CONFIG_NETFILTER_XTABLES) += x_tables.o xt_tcpudp.o
 
diff --git a/net/netfilter/nf_tproxy_core.c b/net/netfilter/nf_tproxy_core.c
new file mode 100644
index 0000000..fe34f4b
--- /dev/null
+++ b/net/netfilter/nf_tproxy_core.c
@@ -0,0 +1,96 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2006-2007 BalaBit IT Ltd.
+ * Author: Balazs Scheidler, Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/version.h>
+#include <linux/module.h>
+
+#include <linux/net.h>
+#include <linux/if.h>
+#include <linux/netdevice.h>
+#include <net/udp.h>
+#include <net/netfilter/nf_tproxy_core.h>
+
+struct sock *
+nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
+		      const __be32 saddr, const __be32 daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in, bool listening_only)
+{
+	struct sock *sk;
+
+	/* look up socket */
+	switch (protocol) {
+	case IPPROTO_TCP:
+		if (listening_only)
+			sk = __inet_lookup_listener(net, &tcp_hashinfo,
+						    daddr, ntohs(dport),
+						    in->ifindex);
+		else
+			sk = __inet_lookup(net, &tcp_hashinfo,
+					   saddr, sport, daddr, dport,
+					   in->ifindex);
+		break;
+	case IPPROTO_UDP:
+		sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
+				     in->ifindex);
+		break;
+	default:
+		WARN_ON(1);
+		sk = NULL;
+	}
+
+	pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, listener only: %d, sock %p\n",
+		 protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), listening_only, sk);
+
+	return sk;
+}
+EXPORT_SYMBOL_GPL(nf_tproxy_get_sock_v4);
+
+static void
+nf_tproxy_destructor(struct sk_buff *skb)
+{
+	struct sock *sk = skb->sk;
+
+	skb->sk = NULL;
+	skb->destructor = NULL;
+
+	if (sk)
+		nf_tproxy_put_sock(sk);
+}
+
+/* consumes sk */
+int
+nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk)
+{
+	if (inet_sk(sk)->transparent) {
+		skb->sk = sk;
+		skb->destructor = nf_tproxy_destructor;
+		return 1;
+	} else
+		nf_tproxy_put_sock(sk);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nf_tproxy_assign_sock);
+
+static int __init nf_tproxy_init(void)
+{
+	pr_info("NF_TPROXY: Transparent proxy support initialized, version 4.1.0\n");
+	pr_info("NF_TPROXY: Copyright (c) 2006-2007 BalaBit IT Ltd.\n");
+	return 0;
+}
+
+module_init(nf_tproxy_init);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs");
+MODULE_DESCRIPTION("Transparent proxy support core routines");
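
The "consumes sk" contract of nf_tproxy_assign_sock() in the patch above is easy to misread, so here is a small userspace model of it. Everything in this sketch is hypothetical: struct model_sock and struct model_skb are stand-ins for struct sock and struct sk_buff, the reference count is a plain int, and none of this is kernel code. It only illustrates that the caller's reference is either handed over to the skb (transparent socket) or dropped immediately (non-transparent socket).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock. */
struct model_sock {
	int refcnt;
	bool transparent;
};

/* Hypothetical stand-in for struct sk_buff. */
struct model_skb {
	struct model_sock *sk;
	void (*destructor)(struct model_skb *skb);
};

/* Models nf_tproxy_put_sock(): drop one reference. */
static void model_put_sock(struct model_sock *sk)
{
	sk->refcnt--;
}

/* Models nf_tproxy_destructor(): detach the socket and drop its reference. */
static void model_destructor(struct model_skb *skb)
{
	struct model_sock *sk = skb->sk;

	skb->sk = NULL;
	skb->destructor = NULL;

	if (sk)
		model_put_sock(sk);
}

/* Models nf_tproxy_assign_sock(): always consumes the caller's reference. */
static int model_assign_sock(struct model_skb *skb, struct model_sock *sk)
{
	if (sk->transparent) {
		/* Reference is now owned by the skb until it is freed. */
		skb->sk = sk;
		skb->destructor = model_destructor;
		return 1;
	}

	/* Not transparent: the reference is released right away. */
	model_put_sock(sk);
	return 0;
}

/* Models freeing the skb: run the destructor if one is attached. */
static void model_free_skb(struct model_skb *skb)
{
	if (skb->destructor)
		skb->destructor(skb);
}
```

In both branches the caller must not touch or put the socket again after the call, which is the point of the "consumes sk" comment.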




* Re: [net-next PATCH 01/16] Loosen source address check on IPv4 output
  2008-10-01 14:24 ` [net-next PATCH 01/16] Loosen source address check on IPv4 output KOVACS Krisztian
@ 2008-10-01 14:28   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:28 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> ip_route_output() contains a check to make sure that no flows with
> non-local source IP addresses are routed. This obviously makes using
> such addresses impossible.
> 
> This patch introduces a flowi flag which makes omitting this check
> possible. The new flag provides a way of handling transparent and
> non-transparent connections differently.
> 
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6


* Re: [net-next PATCH 02/16] Implement IP_TRANSPARENT socket option
  2008-10-01 14:24 ` [net-next PATCH 02/16] Implement IP_TRANSPARENT socket option KOVACS Krisztian
@ 2008-10-01 14:30   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:30 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> This patch introduces the IP_TRANSPARENT socket option: enabling that will make
> the IPv4 routing omit the non-local source address check on output. Setting
> IP_TRANSPARENT requires NET_ADMIN capability.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6


* Re: [net-next PATCH 03/16] Allow binding to non-local addresses if IP_TRANSPARENT is set
  2008-10-01 14:24 ` [net-next PATCH 03/16] Allow binding to non-local addresses if IP_TRANSPARENT is set KOVACS Krisztian
@ 2008-10-01 14:31   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:31 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> Setting IP_TRANSPARENT is not really useful without allowing non-local
> binds for the socket. To make user-space code simpler we allow these binds
> even if IP_TRANSPARENT is set but IP_FREEBIND is not.
> 
> Signed-off-by: Tóth László Attila <panther@balabit.hu>

Applied to net-next-2.6



* Re: [net-next PATCH 04/16] Make inet_sock.h independent of route.h
  2008-10-01 14:24 ` [net-next PATCH 04/16] Make inet_sock.h independent of route.h KOVACS Krisztian
@ 2008-10-01 14:34   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:34 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> inet_iif() in inet_sock.h requires route.h. Since users of inet_iif()
> usually require other route.h functionality anyway this patch moves
> inet_iif() to route.h.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6

This may create some build fallout for some configurations
on some architectures, because some files may have been
depending upon that implicit inclusion of net/route.h.

We'll have to fix them up, so this is just a heads up.


* Re: [net-next PATCH 05/16] Conditionally enable transparent flow flag when connecting
  2008-10-01 14:24 ` [net-next PATCH 05/16] Conditionally enable transparent flow flag when connecting KOVACS Krisztian
@ 2008-10-01 14:36   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:36 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> Set FLOWI_FLAG_ANYSRC in flowi->flags if the socket has the
> transparent socket option set. This way we selectively enable certain
> connections with non-local source addresses to be routed.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6


* Re: [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency
  2008-10-01 14:24 ` [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency KOVACS Krisztian
@ 2008-10-01 14:42   ` David Miller
  2008-10-01 14:46     ` KOVACS Krisztian
  0 siblings, 1 reply; 64+ messages in thread
From: David Miller @ 2008-10-01 14:42 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
> incoming packets. The non-local source address check on output bites
> us again, as replies for transparently redirected traffic won't have a
> chance to leave the node.
> 
> This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing
> the route lookup for those replies. Transparent replies are enabled if
> the listening socket has the transparent socket flag set.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

I had to make some modifications to make this build.

I took two include/net/ip.h modifications from patch 7:

1) Adding flags to ip_reply_arg struct
2) definition of IP_REPLY_ARG_NOSRCCHECK

and the result is included below and added to net-next-2.6

tcp: Handle TCP SYN+ACK/ACK/RST transparency

The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
incoming packets. The non-local source address check on output bites
us again, as replies for transparently redirected traffic won't have a
chance to leave the node.

This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing the
route lookup for those replies. Transparent replies are enabled if the
listening socket has the transparent socket flag set.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/inet_sock.h |    8 +++++++-
 include/net/ip.h        |    3 +++
 net/ipv4/tcp_ipv4.c     |   12 +++++++++---
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 139b78b..dced3f6 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -72,7 +72,8 @@ struct inet_request_sock {
 				sack_ok	   : 1,
 				wscale_ok  : 1,
 				ecn_ok	   : 1,
-				acked	   : 1;
+				acked	   : 1,
+				no_srccheck: 1;
 	struct ip_options	*opt;
 };
 
@@ -204,4 +205,9 @@ static inline struct request_sock *inet_reqsk_alloc(struct request_sock_ops *ops
 	return req;
 }
 
+static inline __u8 inet_sk_flowi_flags(const struct sock *sk)
+{
+	return inet_sk(sk)->transparent ? FLOWI_FLAG_ANYSRC : 0;
+}
+
 #endif	/* _INET_SOCK_H */
diff --git a/include/net/ip.h b/include/net/ip.h
index 250e6ef..90b27f6 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -140,12 +140,15 @@ static inline void ip_tr_mc_map(__be32 addr, char *buf)
 
 struct ip_reply_arg {
 	struct kvec iov[1];   
+	int	    flags;
 	__wsum 	    csum;
 	int	    csumoffset; /* u16 offset of csum in iov[0].iov_base */
 				/* -1 if not needed */ 
 	int	    bound_dev_if;
 }; 
 
+#define IP_REPLY_ARG_NOSRCCHECK 1
+
 void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *arg,
 		   unsigned int len); 
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d13688e..8b24bd8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -591,6 +591,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 				      ip_hdr(skb)->saddr, /* XXX */
 				      sizeof(struct tcphdr), IPPROTO_TCP, 0);
 	arg.csumoffset = offsetof(struct tcphdr, check) / 2;
+	arg.flags = (sk && inet_sk(sk)->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
 
 	net = dev_net(skb->dst->dev);
 	ip_send_reply(net->ipv4.tcp_sock, skb,
@@ -606,7 +607,8 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 
 static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 			    u32 win, u32 ts, int oif,
-			    struct tcp_md5sig_key *key)
+			    struct tcp_md5sig_key *key,
+			    int reply_flags)
 {
 	struct tcphdr *th = tcp_hdr(skb);
 	struct {
@@ -659,6 +661,7 @@ static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 				    ip_hdr(skb)->daddr, &rep.th);
 	}
 #endif
+	arg.flags = reply_flags;
 	arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr,
 				      ip_hdr(skb)->saddr, /* XXX */
 				      arg.iov[0].iov_len, IPPROTO_TCP, 0);
@@ -681,7 +684,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale,
 			tcptw->tw_ts_recent,
 			tw->tw_bound_dev_if,
-			tcp_twsk_md5_key(tcptw)
+			tcp_twsk_md5_key(tcptw),
+			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0
 			);
 
 	inet_twsk_put(tw);
@@ -694,7 +698,8 @@ static void tcp_v4_reqsk_send_ack(struct sock *sk, struct sk_buff *skb,
 			tcp_rsk(req)->rcv_isn + 1, req->rcv_wnd,
 			req->ts_recent,
 			0,
-			tcp_v4_md5_do_lookup(sk, ip_hdr(skb)->daddr));
+			tcp_v4_md5_do_lookup(sk, ip_hdr(skb)->daddr),
+			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0);
 }
 
 /*
@@ -1244,6 +1249,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	ireq = inet_rsk(req);
 	ireq->loc_addr = daddr;
 	ireq->rmt_addr = saddr;
+	ireq->no_srccheck = inet_sk(sk)->transparent;
 	ireq->opt = tcp_v4_save_options(sk, skb);
 	if (!want_cookie)
 		TCP_ECN_create_request(req, tcp_hdr(skb));
-- 
1.5.6.5



* Re: [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible
  2008-10-01 14:24 ` [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible KOVACS Krisztian
@ 2008-10-01 14:45   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:45 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> Netfilter's ip_route_me_harder() tries to re-route packets either generated or
> re-routed by Netfilter. This patch changes ip_route_me_harder() to handle
> packets from non-locally-bound sockets with IP_TRANSPARENT set as local and to
> set the appropriate flowi flags when re-doing the routing lookup.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6, but obviously without the parts
I had to backport into patch 6.


* Re: [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency
  2008-10-01 14:42   ` David Miller
@ 2008-10-01 14:46     ` KOVACS Krisztian
  0 siblings, 0 replies; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 14:46 UTC (permalink / raw)
  To: David Miller; +Cc: hidden, kaber, netdev, netfilter-devel

Hi,

On Wed, Oct 01, 2008 at 07:42:50AM -0700, David Miller wrote:
> From: KOVACS Krisztian <hidden@sch.bme.hu>
> Date: Wed, 01 Oct 2008 16:24:31 +0200
> 
> > The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
> > incoming packets. The non-local source address check on output bites
> > us again, as replies for transparently redirected traffic won't have a
> > chance to leave the node.
> > 
> > This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing
> > the route lookup for those replies. Transparent replies are enabled if
> > the listening socket has the transparent socket flag set.
> > 
> > Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
> 
> I had to make some modifications to make this build.
> 
> I took two include/net/ip.h modifications from patch 7:
> 
> 1) Adding flags to ip_reply_arg struct
> 2) definition of IP_REPLY_ARG_NOSRCCHECK
> 
> and the result is included below and added to net-next-2.6

Oops, my fault, sorry. Should have been more careful when juggling around
with patches yesterday...

-- 
KOVACS Krisztian


* Re: [net-next PATCH 08/16] Port redirection support for TCP
  2008-10-01 14:24 ` [net-next PATCH 08/16] Port redirection support for TCP KOVACS Krisztian
@ 2008-10-01 14:47   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:47 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> Current TCP code relies on the local port of the listening socket
> being the same as the destination port of the incoming
> connection. Port redirection used by many transparent proxying
> techniques obviously breaks this, so we have to store the original
> destination port.
> 
> This patch extends struct inet_request_sock and stores the incoming
> destination port value there. It also modifies the handshake code to
> use that value as the source port when sending reply packets.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6
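
From the proxy's point of view, the practical effect of storing the original destination port should be that the accepted socket's local address is the address and port the client originally dialled, not the port the proxy listens on. That is my understanding of the behaviour rather than something stated in the patch description, so treat it as an assumption. A minimal sketch of reading it back with getsockname() follows; the helper name local_addr_of() is made up for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch the local IPv4 address/port of a connected
 * socket. On a transparently redirected connection this is assumed to be
 * the original destination. Returns 0 on success, -1 on error.
 */
static int local_addr_of(int fd, struct sockaddr_in *out)
{
	socklen_t len = sizeof(*out);

	memset(out, 0, sizeof(*out));
	if (getsockname(fd, (struct sockaddr *)out, &len) < 0)
		return -1;

	/* Only IPv4 is handled in this sketch. */
	return out->sin_family == AF_INET ? 0 : -1;
}
```

On an ordinary, non-redirected connection the same call simply returns the listener's own address and port, so the helper behaves identically in both setups.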


* Re: [net-next PATCH 09/16] Export UDP socket lookup function
  2008-10-01 14:24 ` [net-next PATCH 09/16] Export UDP socket lookup function KOVACS Krisztian
@ 2008-10-01 14:48   ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-01 14:48 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> The iptables tproxy code has to be able to do UDP socket hash lookups,
> so we have to provide an exported lookup function for this purpose.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Applied to net-next-2.6


* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-01 14:24 ` [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
@ 2008-10-01 14:50   ` David Miller
  2008-10-01 15:38     ` KOVACS Krisztian
  0 siblings, 1 reply; 64+ messages in thread
From: David Miller @ 2008-10-01 14:50 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 16:24:31 +0200

> Use the socket cached in the TPROXY target if it's present.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Ok, this starts to get into controversial territory.
:-)

At the very least I think:

1) We should do this unconditionally, and even put
   an "unlikely" there in the test.

2) Actually, the whole operation belongs in a generic
   net/sock.h helper function, and this includes the
   leading if() test.

In the resubmitted patch you can include both UDP
and TCP and the change adding the generic helper all
at once.

Thanks.


* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-01 14:50   ` David Miller
@ 2008-10-01 15:38     ` KOVACS Krisztian
  2008-10-01 15:51       ` David Miller
  0 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-01 15:38 UTC (permalink / raw)
  To: David Miller; +Cc: kaber, netdev, netfilter-devel

Hi,

On Wed, 2008-10-01 at 07:50 -0700, David Miller wrote:
> From: KOVACS Krisztian <hidden@sch.bme.hu>
> Date: Wed, 01 Oct 2008 16:24:31 +0200
> 
> > Use the socket cached in the TPROXY target if it's present.
> > 
> > Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
> 
> Ok, this starts to get into controversial territory.
> :-)
> 
> At the very least I think:
> 
> 1) We should do this unconditionally, and even put
>    an "unlikely" there in the test.
> 
> 2) Actually, the whole operation belongs in a generic
>    net/sock.h helper function, and this includes the
>    leading if() test.

The problem is that if you include the if() test then you have to
include the lookup call as well and that's different for TCP/UDP.

Of course we could create a generic helper that then calls the
appropriate lookup function but that's also an unnecessary extra branch,
isn't it?

-- 
KOVACS Krisztian



* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-01 15:38     ` KOVACS Krisztian
@ 2008-10-01 15:51       ` David Miller
  2008-10-02 15:43         ` KOVACS Krisztian
  0 siblings, 1 reply; 64+ messages in thread
From: David Miller @ 2008-10-01 15:51 UTC (permalink / raw)
  To: hidden; +Cc: kaber, netdev, netfilter-devel

From: KOVACS Krisztian <hidden@sch.bme.hu>
Date: Wed, 01 Oct 2008 17:38:20 +0200

> The problem is that if you include the if() test then you have to
> include the lookup call as well and that's different for TCP/UDP.

No, I only mean to make a helper for this construct:

	if (unlikely(skb->sk)) {
		...
	}

so, something like:

static inline struct sock *sock_skb_steal(struct sk_buff *skb)
{
	if (unlikely(skb->sk)) {
		struct sock *sk = skb->sk;

		skb->destructor = NULL;
		skb->sk = NULL;
		return sk;
	}
	return NULL;
}

and then also get rid of the ifdefs at the place where
these calls are made (TCP and UDP).
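
To make the suggestion concrete, here is a userspace model of how a receive path might use such a helper: steal the socket cached on the skb if there is one, otherwise fall back to the protocol's own lookup. All names here are hypothetical (struct model_sock, struct model_skb, model_lookup(), model_rcv()); the real call sites in TCP and UDP are not shown in this thread, so this is only a sketch of the pattern, with the protocol-specific lookup reduced to a stub.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint as used in the kernel; relies on a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct model_sock {
	int id;
};

struct model_skb {
	struct model_sock *sk;
	void (*destructor)(struct model_skb *skb);
};

/* Mirrors the sock_skb_steal() helper proposed above. */
static struct model_sock *model_skb_steal(struct model_skb *skb)
{
	if (unlikely(skb->sk)) {
		struct model_sock *sk = skb->sk;

		skb->destructor = NULL;
		skb->sk = NULL;
		return sk;
	}
	return NULL;
}

/* Stub for the protocol-specific hash lookup (differs for TCP and UDP). */
static struct model_sock *model_lookup(struct model_sock *table_entry)
{
	return table_entry;
}

/* Call-site pattern: prefer the cached socket, else do the normal lookup. */
static struct model_sock *model_rcv(struct model_skb *skb,
				    struct model_sock *table_entry)
{
	struct model_sock *sk = model_skb_steal(skb);

	if (!sk)
		sk = model_lookup(table_entry);
	return sk;
}
```

With this split the if() test lives in the generic helper, while the lookup stays at the protocol-specific call site, so no extra branch is needed to pick between the TCP and UDP lookups.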


* Re: [net-next PATCH 16/16] Add documentation
  2008-10-01 14:24 ` [net-next PATCH 16/16] Add documentation KOVACS Krisztian
@ 2008-10-01 16:22   ` Randy Dunlap
  2008-10-02  9:37     ` [RESEND net-next " KOVACS Krisztian
  2008-10-03 14:01   ` [net-next " Jan Engelhardt
  1 sibling, 1 reply; 64+ messages in thread
From: Randy Dunlap @ 2008-10-01 16:22 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, Patrick McHardy, netdev, netfilter-devel

On Wed, 01 Oct 2008 16:24:31 +0200 KOVACS Krisztian wrote:

> Add basic usage instructions to Documentation/networking.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
> ---
> 
>  Documentation/networking/tproxy.txt |   85 +++++++++++++++++++++++++++++++++++
>  1 files changed, 85 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt
> new file mode 100644
> index 0000000..cf79e60
> --- /dev/null
> +++ b/Documentation/networking/tproxy.txt
> @@ -0,0 +1,85 @@
> +Transparent proxy support
> +=========================
> +
> +This feature adds Linux 2.2-like transparent proxy support to current kernels.
> +To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in
> +your kernel config. You will need policy routing too, so be sure to enable that
> +as well.
> +
> +
> +1. Making non-local sockets work
> +================================
> +
> +The idea is that you identify packets with destination address matching a local
> +socket your box, set the packet mark to a certain value, and then match on that

          on your box   (?)

> +value using policy routing to have those packets delivered locally:
> +
> +# iptables -t mangle -N DIVERT
> +# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
> +# iptables -t mangle -A DIVERT -j MARK --set-mark 1
> +# iptables -t mangle -A DIVERT -j ACCEPT
> +
> +# ip rule add fwmark 1 lookup 100
> +# ip route add local 0.0.0.0/0 dev lo table 100
> +
> +Because of certain restrictions in the IPv4 routing output code you'll have to
> +modify your application to allow it sending datagrams _from_ non-local IP

                                       to send datagrams

> +addresses. All you have to do is to enable the (SOL_IP, IP_TRANSPARENT) socket

                                 is enable the

> +option before calling bind:
> +
> +fd = socket(AF_INET, SOCK_STREAM, 0);
> +/* - 8< -*/
> +int value = 1;
> +setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
> +/* - 8< -*/
> +name.sin_family = AF_INET;
> +name.sin_port = htons(0xCAFE);
> +name.sin_addr.s_addr = htonl(0xDEADBEEF);
> +bind(fd, &name, sizeof(name));
> +
> +A trivial patch for netcat is available here:
> +http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch
> +
> +
> +2. Redirecting traffic
> +======================
> +
> +Transparent proxying often involves "intercepting" traffic on a router. This is
> +usually done with the iptables REDIRECT target, however, there are serious

                                           target;

> +limitations of that method. One of the major issues is that it actually
> +modifies the packets to change the destination address -- which might not be
> +acceptable in certain situations. (Think of proxying UDP for example: you won't
> +be able to find out the original destination address. Even in case of TCP
> +getting the original destination address is racy.)
> +
> +The 'TPROXY' target provides similar functionality without relying on NAT. Simply
> +add rules like this to the iptables ruleset above:
> +
> +# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
> +  --tproxy-mark 0x1/0x1 --on-port 50080
> +
> +Note that for this to work you'll have to modify the proxy to enable (SOL_IP,
> +IP_TRANSPARENT) for the listening socket.

Thanks.
---
~Randy


* Re: [net-next PATCH 12/16] Split Netfilter IPv4 defragmentation into a separate module
  2008-10-01 14:24 ` [net-next PATCH 12/16] Split Netfilter IPv4 defragmentation into a separate module KOVACS Krisztian
@ 2008-10-02  9:18   ` Patrick McHardy
  0 siblings, 0 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02  9:18 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, netdev, netfilter-devel

KOVACS Krisztian wrote:
> Netfilter connection tracking requires all IPv4 packets to be defragmented.
> Both the socket match and the TPROXY target depend on this functionality, so
> this patch separates the Netfilter IPv4 defrag hooks into a separate module.
>
>   

Applied, thanks.



* Re: [net-next PATCH 13/16] iptables tproxy core
  2008-10-01 14:24 ` [net-next PATCH 13/16] iptables tproxy core KOVACS Krisztian
@ 2008-10-02  9:19   ` Patrick McHardy
  0 siblings, 0 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02  9:19 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, netdev, netfilter-devel

KOVACS Krisztian wrote:
> The iptables tproxy core is a module that contains the common routines used by
> various tproxy related modules (TPROXY target and socket match)
>
>   

Applied, thanks.



* Re: [net-next PATCH 14/16] iptables socket match
  2008-10-01 14:24 ` [net-next PATCH 14/16] iptables socket match KOVACS Krisztian
@ 2008-10-02  9:26   ` Patrick McHardy
  2008-10-02 10:26     ` KOVACS Krisztian
  2008-10-03 14:04     ` Jan Engelhardt
  0 siblings, 2 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02  9:26 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, netdev, netfilter-devel

KOVACS Krisztian wrote:
> Add iptables 'socket' match, which matches packets for which a TCP/UDP
> socket lookup succeeds.
>   

It seems sufficiently different from what xt_owner does to justify a
separate module.

Minor nitpicking:

> +static int
> +extract_icmp_fields(const struct sk_buff *skb,
> +		    u8 *protocol,
> +		    __be32 *raddr,
> +		    __be32 *laddr,
> +		    __be16 *rport,
> +		    __be16 *lport)
> +{
> +	struct iphdr *outside_iph = ip_hdr(skb);
> +	struct iphdr *inside_iph, _inside_iph;
> +	struct icmphdr *icmph, _icmph;
> +	__be16 *ports, _ports[2];
> +
> +	icmph = skb_header_pointer(skb, outside_iph->ihl << 2, sizeof(_icmph), &_icmph);
>   

The "ihl << 2" is repeated multiple times. Maybe just store it in a
variable; it would also be nicer to use ip_hdrlen().

> +	if (icmph == NULL)
> +		return 1;
> +
> +	switch (icmph->type) {
> +	case ICMP_DEST_UNREACH:
> +	case ICMP_SOURCE_QUENCH:
> +	case ICMP_REDIRECT:
> +	case ICMP_TIME_EXCEEDED:
> +	case ICMP_PARAMETERPROB:
> +		break;
> +	default:
> +		return 1;
> +	}
> +
> +	inside_iph = skb_header_pointer(skb, (outside_iph->ihl << 2) + sizeof(struct icmphdr), sizeof(_inside_iph), &_inside_iph);
>   

And these lines (few more below) should break at 80 characters.

> +	if (inside_iph == NULL)
> +		return -EINVAL;
>   
What is the return convention here? It seems this should also return 1,
like the other exit paths.

> +static bool
> +socket_mt(const struct sk_buff *skb,
> +	  const struct net_device *in,
> +	  const struct net_device *out,
> +	  const struct xt_match *match,
> +	  const void *matchinfo,
> +	  int offset,
> +	  unsigned int protoff,
> +	  bool *hotdrop)
> +{
>   
...
> +	if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) {
> +		hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &_hdr);
> +		if (hp == NULL)
> +			return false;
> +
> +		protocol = iph->protocol;
> +		saddr = iph->saddr;
> +		sport = hp->source;
> +		daddr = iph->daddr;
> +		dport = hp->dest;
> +
> +	}
> +	else if (iph->protocol == IPPROTO_ICMP) {
>   

Please put the else on the same line as the closing brace.



* Re: [net-next PATCH 15/16] iptables TPROXY target
  2008-10-01 14:24 ` [net-next PATCH 15/16] iptables TPROXY target KOVACS Krisztian
@ 2008-10-02  9:28   ` Patrick McHardy
  0 siblings, 0 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02  9:28 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, netdev, netfilter-devel

KOVACS Krisztian wrote:
> The TPROXY target implements redirection of non-local TCP/UDP traffic to local
> sockets. Additionally, it's possible to manipulate the packet mark if and only
> if a socket has been found. (We need this because we cannot use multiple
> targets in the same iptables rule.)
>
>   

Applied, thanks.



* Re: [RESEND net-next PATCH 16/16] Add documentation
  2008-10-01 16:22   ` Randy Dunlap
@ 2008-10-02  9:37     ` KOVACS Krisztian
  2008-10-02  9:38       ` Patrick McHardy
  0 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-02  9:37 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: David Miller, Patrick McHardy, netdev, netfilter-devel

Hi,

On Wed, 2008-10-01 at 09:22 -0700, Randy Dunlap wrote:
> Thanks.
> ---
> ~Randy

Fixed, thanks a lot.
---

Add basic usage instructions to Documentation/networking.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 Documentation/networking/tproxy.txt |   85 +++++++++++++++++++++++++++++++++++
 1 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt
new file mode 100644
index 0000000..7b5996d
--- /dev/null
+++ b/Documentation/networking/tproxy.txt
@@ -0,0 +1,85 @@
+Transparent proxy support
+=========================
+
+This feature adds Linux 2.2-like transparent proxy support to current kernels.
+To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in
+your kernel config. You will need policy routing too, so be sure to enable that
+as well.
+
+
+1. Making non-local sockets work
+================================
+
+The idea is that you identify packets with destination address matching a local
+socket on your box, set the packet mark to a certain value, and then match on that
+value using policy routing to have those packets delivered locally:
+
+# iptables -t mangle -N DIVERT
+# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
+# iptables -t mangle -A DIVERT -j MARK --set-mark 1
+# iptables -t mangle -A DIVERT -j ACCEPT
+
+# ip rule add fwmark 1 lookup 100
+# ip route add local 0.0.0.0/0 dev lo table 100
+
+Because of certain restrictions in the IPv4 routing output code you'll have to
+modify your application to allow it to send datagrams _from_ non-local IP
+addresses. All you have to do is enable the (SOL_IP, IP_TRANSPARENT) socket
+option before calling bind:
+
+fd = socket(AF_INET, SOCK_STREAM, 0);
+/* - 8< -*/
+int value = 1;
+setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
+/* - 8< -*/
+name.sin_family = AF_INET;
+name.sin_port = htons(0xCAFE);
+name.sin_addr.s_addr = htonl(0xDEADBEEF);
+bind(fd, &name, sizeof(name));
+
+A trivial patch for netcat is available here:
+http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch
+
+
+2. Redirecting traffic
+======================
+
+Transparent proxying often involves "intercepting" traffic on a router. This is
+usually done with the iptables REDIRECT target; however, there are serious
+limitations of that method. One of the major issues is that it actually
+modifies the packets to change the destination address -- which might not be
+acceptable in certain situations. (Think of proxying UDP for example: you won't
+be able to find out the original destination address. Even in case of TCP
+getting the original destination address is racy.)
+
+The 'TPROXY' target provides similar functionality without relying on NAT. Simply
+add rules like this to the iptables ruleset above:
+
+# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
+  --tproxy-mark 0x1/0x1 --on-port 50080
+
+Note that for this to work you'll have to modify the proxy to enable (SOL_IP,
+IP_TRANSPARENT) for the listening socket.
+
+
+3. Iptables extensions
+======================
+
+To use tproxy you'll need to have the 'socket' and 'TPROXY' modules
+compiled for iptables. A patched version of iptables is available
+here: http://git.balabit.hu/?p=bazsi/iptables-tproxy.git
+
+
+4. Application support
+======================
+
+4.1. Squid
+----------
+
+Squid 3.HEAD has support built-in. To use it, pass
+'--enable-linux-netfilter' to configure and set the 'tproxy' option on
+the HTTP listener you redirect traffic to with the TPROXY iptables
+target.
+
+For more information please consult the following page on the Squid
+wiki: http://wiki.squid-cache.org/Features/Tproxy4
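
For readers who want to try section 1 of the documentation above, the
setsockopt()/bind() fragment could be fleshed out roughly as follows. This
is only a minimal user-space sketch and not part of the patch: the helper
names fill_addr() and bind_transparent() are made up for illustration, and
the fallback value for IP_TRANSPARENT assumes the Linux UAPI definition.
Setting the option is expected to fail with EPERM without CAP_NET_ADMIN.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older libc headers (assumption: Linux values). */
#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19
#endif
#ifndef SOL_IP
#define SOL_IP 0
#endif

/* Hypothetical helper: build a sockaddr_in from host-order values. */
void fill_addr(struct sockaddr_in *name, uint32_t addr, uint16_t port)
{
	memset(name, 0, sizeof(*name));
	name->sin_family = AF_INET;
	name->sin_port = htons(port);
	name->sin_addr.s_addr = htonl(addr);
}

/* Hypothetical helper: returns a bound fd, or -1 with errno set. */
int bind_transparent(uint32_t addr, uint16_t port)
{
	struct sockaddr_in name;
	int value = 1;
	int saved;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	/* The option has to be enabled before bind() is called. */
	if (setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value)) < 0)
		goto fail;

	fill_addr(&name, addr, port);
	if (bind(fd, (struct sockaddr *)&name, sizeof(name)) < 0)
		goto fail;

	return fd;
fail:
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	errno = saved;
	return -1;
}
```

Calling bind_transparent(0xDEADBEEF, 0xCAFE) mirrors the values used in the
documentation; note the explicit cast to struct sockaddr * in the bind()
call, which the shortened fragment leaves out.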




* Re: [RESEND net-next PATCH 16/16] Add documentation
  2008-10-02  9:37     ` [RESEND net-next " KOVACS Krisztian
@ 2008-10-02  9:38       ` Patrick McHardy
  0 siblings, 0 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02  9:38 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: Randy Dunlap, David Miller, netdev, netfilter-devel

KOVACS Krisztian wrote:
> On Wed, 2008-10-01 at 09:22 -0700, Randy Dunlap wrote:
>   
>> Thanks.
>> ---
>> ~Randy
>>     
>
> Fixed, thanks a lot.
> ---
>
> Add basic usage instructions to Documentation/networking.

Applied, thanks.



* Re: [net-next PATCH 14/16] iptables socket match
  2008-10-02  9:26   ` Patrick McHardy
@ 2008-10-02 10:26     ` KOVACS Krisztian
  2008-10-02 10:35       ` Patrick McHardy
  2008-10-03 14:04     ` Jan Engelhardt
  1 sibling, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-02 10:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David Miller, netdev, netfilter-devel

Hi,

On Thu, 2008-10-02 at 11:26 +0200, Patrick McHardy wrote:
> KOVACS Krisztian wrote:
> > Add iptables 'socket' match, which matches packets for which a TCP/UDP
> > socket lookup succeeds.
> >   
> 
> It seems sufficiently different from what xt_owner does to justify a 
> separate module.
> Minor nitpicking:
> [...]

Sorry, should be better this time.

- 8< -

Add iptables 'socket' match, which matches packets for which a TCP/UDP
socket lookup succeeds.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 net/netfilter/Kconfig     |   15 ++++
 net/netfilter/Makefile    |    1 
 net/netfilter/xt_socket.c |  192 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 208 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index ff1b0e6..a4b8006 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -760,6 +760,21 @@ config NETFILTER_XT_MATCH_SCTP
 	  If you want to compile it as a module, say M here and read
 	  <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
 
+config NETFILTER_XT_MATCH_SOCKET
+	tristate '"socket" match support (EXPERIMENTAL)'
+	depends on EXPERIMENTAL
+	depends on NETFILTER_TPROXY
+	depends on NETFILTER_XTABLES
+	depends on NETFILTER_ADVANCED
+	select NF_DEFRAG_IPV4
+	help
+	  This option adds a `socket' match, which can be used to match
+	  packets for which a TCP or UDP socket lookup finds a valid socket.
+	  It can be used in combination with the MARK target and policy
+	  routing to implement full featured non-locally bound sockets.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_STATE
 	tristate '"state" match support'
 	depends on NETFILTER_XTABLES
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 1b8cb7f..c386755 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -80,6 +80,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_RATEEST) += xt_rateest.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_SCTP) += xt_sctp.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_SOCKET) += xt_socket.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_STATE) += xt_state.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_STATISTIC) += xt_statistic.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_STRING) += xt_string.o
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
new file mode 100644
index 0000000..83eeff3
--- /dev/null
+++ b/net/netfilter/xt_socket.c
@@ -0,0 +1,192 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (C) 2007-2008 BalaBit IT Ltd.
+ * Author: Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <net/tcp.h>
+#include <net/udp.h>
+#include <net/icmp.h>
+#include <net/sock.h>
+#include <net/inet_sock.h>
+#include <net/netfilter/nf_tproxy_core.h>
+#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
+
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+#define XT_SOCKET_HAVE_CONNTRACK 1
+#include <net/netfilter/nf_conntrack.h>
+#endif
+
+static int
+extract_icmp_fields(const struct sk_buff *skb,
+		    u8 *protocol,
+		    __be32 *raddr,
+		    __be32 *laddr,
+		    __be16 *rport,
+		    __be16 *lport)
+{
+	unsigned int outside_hdrlen = ip_hdrlen(skb);
+	struct iphdr *inside_iph, _inside_iph;
+	struct icmphdr *icmph, _icmph;
+	__be16 *ports, _ports[2];
+
+	icmph = skb_header_pointer(skb, outside_hdrlen,
+				   sizeof(_icmph), &_icmph);
+	if (icmph == NULL)
+		return 1;
+
+	switch (icmph->type) {
+	case ICMP_DEST_UNREACH:
+	case ICMP_SOURCE_QUENCH:
+	case ICMP_REDIRECT:
+	case ICMP_TIME_EXCEEDED:
+	case ICMP_PARAMETERPROB:
+		break;
+	default:
+		return 1;
+	}
+
+	inside_iph = skb_header_pointer(skb, outside_hdrlen +
+					sizeof(struct icmphdr),
+					sizeof(_inside_iph), &_inside_iph);
+	if (inside_iph == NULL)
+		return 1;
+
+	if (inside_iph->protocol != IPPROTO_TCP &&
+	    inside_iph->protocol != IPPROTO_UDP)
+		return 1;
+
+	ports = skb_header_pointer(skb, outside_hdrlen + 
+				   sizeof(struct icmphdr) +
+				   (inside_iph->ihl << 2),
+				   sizeof(_ports), &_ports);
+	if (ports == NULL)
+		return 1;
+
+	/* the inside IP packet is the one quoted from our side, thus
+	 * its saddr is the local address */
+	*protocol = inside_iph->protocol;
+	*laddr = inside_iph->saddr;
+	*lport = ports[0];
+	*raddr = inside_iph->daddr;
+	*rport = ports[1];
+
+	return 0;
+}
+
+
+static bool
+socket_mt(const struct sk_buff *skb,
+	  const struct net_device *in,
+	  const struct net_device *out,
+	  const struct xt_match *match,
+	  const void *matchinfo,
+	  int offset,
+	  unsigned int protoff,
+	  bool *hotdrop)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+	struct udphdr _hdr, *hp = NULL;
+	struct sock *sk;
+	__be32 daddr, saddr;
+	__be16 dport, sport;
+	u8 protocol;
+#ifdef XT_SOCKET_HAVE_CONNTRACK
+	struct nf_conn const *ct;
+	enum ip_conntrack_info ctinfo;
+#endif
+
+	if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) {
+		hp = skb_header_pointer(skb, ip_hdrlen(skb),
+					sizeof(_hdr), &_hdr);
+		if (hp == NULL)
+			return false;
+
+		protocol = iph->protocol;
+		saddr = iph->saddr;
+		sport = hp->source;
+		daddr = iph->daddr;
+		dport = hp->dest;
+
+	} else if (iph->protocol == IPPROTO_ICMP) {
+		if (extract_icmp_fields(skb, &protocol, &saddr, &daddr,
+					&sport, &dport))
+			return false;
+	} else {
+		return false;
+	}
+
+#ifdef XT_SOCKET_HAVE_CONNTRACK
+	/* Do the lookup with the original socket address in case this is a
+	 * reply packet of an established SNAT-ted connection. */
+
+	ct = nf_ct_get(skb, &ctinfo);
+	if (ct && (ct != &nf_conntrack_untracked) &&
+	    ((iph->protocol != IPPROTO_ICMP &&
+	      ctinfo == IP_CT_IS_REPLY + IP_CT_ESTABLISHED) ||
+	     (iph->protocol == IPPROTO_ICMP &&
+	      ctinfo == IP_CT_IS_REPLY + IP_CT_RELATED)) &&
+	    (ct->status & IPS_SRC_NAT_DONE)) {
+
+		daddr = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip;
+		dport = (iph->protocol == IPPROTO_TCP) ?
+			ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.tcp.port :
+			ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.udp.port;
+	}
+#endif
+
+	sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), protocol,
+				   saddr, daddr, sport, dport, in, false);
+	if (sk != NULL) {
+		bool wildcard = (inet_sk(sk)->rcv_saddr == 0);
+
+		nf_tproxy_put_sock(sk);
+		if (wildcard)
+			sk = NULL;
+	}
+
+	pr_debug("socket match: proto %u %08x:%u -> %08x:%u "
+		 "(orig %08x:%u) sock %p\n",
+		 protocol, ntohl(saddr), ntohs(sport),
+		 ntohl(daddr), ntohs(dport),
+		 ntohl(iph->daddr), hp ? ntohs(hp->dest) : 0, sk);
+
+	return (sk != NULL);
+}
+
+static struct xt_match socket_mt_reg __read_mostly = {
+	.name		= "socket",
+	.family		= AF_INET,
+	.match		= socket_mt,
+	.hooks		= 1 << NF_INET_PRE_ROUTING,
+	.me		= THIS_MODULE,
+};
+
+static int __init socket_mt_init(void)
+{
+	nf_defrag_ipv4_enable();
+	return xt_register_match(&socket_mt_reg);
+}
+
+static void __exit socket_mt_exit(void)
+{
+	xt_unregister_match(&socket_mt_reg);
+}
+
+module_init(socket_mt_init);
+module_exit(socket_mt_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs, Balazs Scheidler");
+MODULE_DESCRIPTION("x_tables socket match module");
+MODULE_ALIAS("ipt_socket");
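
The address swap done by extract_icmp_fields() above can be tried outside
the kernel with a plain byte buffer. The following is only a user-space
sketch under stated assumptions, not the module's code: struct tuple and
extract_icmp_tuple() are hypothetical names, header fields are read by
offset instead of through skb_header_pointer(), and the minimum-length
checks on the IP headers are additions of the sketch. It keeps the single
"non-zero means no match" return convention and computes the outer header
length once, as requested in review.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROTO_TCP 6
#define PROTO_UDP 17

/* Hypothetical result type; all fields stay in network byte order. */
struct tuple {
	uint8_t  protocol;
	uint32_t laddr, raddr;
	uint16_t lport, rport;
};

/* Returns 0 and fills *t on success, 1 on every failure path. */
int extract_icmp_tuple(const uint8_t *pkt, size_t len, struct tuple *t)
{
	size_t outer_hlen, inner_off, inner_hlen, ports_off;

	if (len < 20)
		return 1;
	outer_hlen = (size_t)(pkt[0] & 0x0f) << 2;	/* computed once */
	if (outer_hlen < 20 || len < outer_hlen + 8)
		return 1;

	switch (pkt[outer_hlen]) {	/* ICMP type */
	case 3:		/* destination unreachable */
	case 4:		/* source quench */
	case 5:		/* redirect */
	case 11:	/* time exceeded */
	case 12:	/* parameter problem */
		break;
	default:
		return 1;
	}

	inner_off = outer_hlen + 8;	/* the ICMP header is 8 bytes long */
	if (len < inner_off + 20)
		return 1;
	inner_hlen = (size_t)(pkt[inner_off] & 0x0f) << 2;
	if (inner_hlen < 20)
		return 1;
	if (pkt[inner_off + 9] != PROTO_TCP && pkt[inner_off + 9] != PROTO_UDP)
		return 1;

	ports_off = inner_off + inner_hlen;
	if (len < ports_off + 4)
		return 1;

	/* The quoted packet was sent by us, so its source is the local side. */
	t->protocol = pkt[inner_off + 9];
	memcpy(&t->laddr, pkt + inner_off + 12, 4);	/* inner saddr */
	memcpy(&t->raddr, pkt + inner_off + 16, 4);	/* inner daddr */
	memcpy(&t->lport, pkt + ports_off, 2);		/* inner source port */
	memcpy(&t->rport, pkt + ports_off + 2, 2);	/* inner dest port */
	return 0;
}
```

Feeding it an ICMP destination-unreachable message that quotes a TCP
segment yields the quoted segment's source address and port as the local
end, which is what the socket lookup needs.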




* Re: [net-next PATCH 14/16] iptables socket match
  2008-10-02 10:26     ` KOVACS Krisztian
@ 2008-10-02 10:35       ` Patrick McHardy
  0 siblings, 0 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02 10:35 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, netdev, netfilter-devel

KOVACS Krisztian wrote:
> Add iptables 'socket' match, which matches packets for which a TCP/UDP
> socket lookup succeeds.
>   

Applied, thanks.




* Re: [net-next PATCH 00/16] Transparent proxying patches, take six
  2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
                   ` (15 preceding siblings ...)
  2008-10-01 14:24 ` [net-next PATCH 03/16] Allow binding to non-local addresses if IP_TRANSPARENT is set KOVACS Krisztian
@ 2008-10-02 13:20 ` Amos Jeffries
  2008-10-02 15:38   ` Patrick McHardy
  16 siblings, 1 reply; 64+ messages in thread
From: Amos Jeffries @ 2008-10-02 13:20 UTC (permalink / raw)
  To: netfilter-devel

KOVACS Krisztian wrote:
> Hi Dave,
> 
> This is the sixth round of transparent proxying patches recently
> discussed on the Netfilter Workshop. Since the last incarnation [1]
> we've added support for related ICMP packets in the socket
> match. Should apply cleanly on top of net-next-2.6. Could you please
> apply patches 1-11 (those touching core networking parts) and I'll ask
> Patrick McHardy to take care of patches 12-16 (the Netfilter parts).
>
<snip>
> 
> The last patch adds a short intro on how to use it. A trivial patch
> for netcat demonstrating the necessary modifications for proxies is
> available separately at [3]. Squid has support for it in the 3.HEAD
> (3.1) branch.
> 
> 
> References:
> [1] http://lwn.net/Articles/254527/
> [2] http://marc.info/?l=linux-netdev&m=118065358510836&...
> [3] http://people.netfilter.org/hidden/tproxy/netcat-ip_trans...
> 

Now that these patches have been accepted, can anyone tell me which
release of iptables and/or the kernel is expected to have native TPROXY
target support?

Cheers and thanks for everything guys

Amos Jeffries
Squid Developer


* Re: [net-next PATCH 00/16] Transparent proxying patches, take six
  2008-10-02 13:20 ` [net-next PATCH 00/16] Transparent proxying patches, take six Amos Jeffries
@ 2008-10-02 15:38   ` Patrick McHardy
  0 siblings, 0 replies; 64+ messages in thread
From: Patrick McHardy @ 2008-10-02 15:38 UTC (permalink / raw)
  To: Amos Jeffries; +Cc: netfilter-devel

Amos Jeffries wrote:
>
> Now that these patches have been accepted, can anyone tell me which
> release of iptables and/or the kernel is expected to have native TPROXY
> target support?

Kernel 2.6.28 and the corresponding iptables release (which should be
1.4.3) will contain all of this.




* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-01 15:51       ` David Miller
@ 2008-10-02 15:43         ` KOVACS Krisztian
  2008-10-02 17:09           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-02 15:43 UTC (permalink / raw)
  To: David Miller; +Cc: kaber, netdev, netfilter-devel

Hi,

On Wed, 2008-10-01 at 08:51 -0700, David Miller wrote:
> From: KOVACS Krisztian <hidden@sch.bme.hu>
> Date: Wed, 01 Oct 2008 17:38:20 +0200
> 
> > The problem is that if you include the if() test then you have to
> > include the lookup call as well and that's different for TCP/UDP.
> 
> No, I only mean to make a helper for this construct:
> 
> 	if (unlikely(skb->sk)) {
> 		...
> 	}
> 
> so, something like:
> 
> static inline struct sock *sock_skb_steal(struct sk_buff *skb)
> {
> 	if (unlikely(skb->sk)) {
> 		struct sock *sk = skb->sk;
> 
> 		skb->destructor = NULL;
> 		skb->sk = NULL;
> 		return sk;
> 	}
> 	return NULL;
> }
> 
> and then also get rid of the ifdefs at the place where
> these calls are made (TCP and UDP).

Something like this?

- 8< -

Use the socket reference cached in the skb if present. The iptables
TPROXY rule for example does a socket lookup early and attaches the
socket reference to the skb. So in the respective TCP/UDP/DCCP rcv()
routine we don't do a lookup but simply steal the socket reference from
the skb. 

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/sock.h  |   12 ++++++++++++
 net/dccp/ipv4.c     |    7 ++++---
 net/dccp/ipv6.c     |    9 +++++----
 net/ipv4/tcp_ipv4.c |    6 ++++--
 net/ipv4/udp.c      |    5 +++--
 net/ipv6/tcp_ipv6.c |    9 +++++----
 net/ipv6/udp.c      |    5 +++--
 7 files changed, 36 insertions(+), 17 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 75a312d..b60fdc1 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1324,6 +1324,18 @@ static inline void sk_change_net(struct sock *sk, struct net *net)
 	sock_net_set(sk, hold_net(net));
 }
 
+static inline struct sock *sock_skb_steal(struct sk_buff *skb)
+{
+	if (unlikely(skb->sk)) {
+		struct sock *sk = skb->sk;
+
+		skb->destructor = NULL;
+		skb->sk = NULL;
+		return sk;
+	}
+	return NULL;
+}
+
 extern void sock_enable_timestamp(struct sock *sk);
 extern int sock_get_timestamp(struct sock *, struct timeval __user *);
 extern int sock_get_timestampns(struct sock *, struct timespec __user *);
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 882c5c4..6968ac8 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -811,9 +811,10 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			   iph->saddr, dh->dccph_sport,
-			   iph->daddr, dh->dccph_dport, inet_iif(skb));
+	if (likely((sk = sock_skb_steal(skb)) == NULL))
+		sk = __inet_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
+				   iph->saddr, dh->dccph_sport,
+				   iph->daddr, dh->dccph_dport, inet_iif(skb));
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 5e1ee0d..97d1405 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -805,10 +805,11 @@ static int dccp_v6_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet6_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			    &ipv6_hdr(skb)->saddr, dh->dccph_sport,
-			    &ipv6_hdr(skb)->daddr, ntohs(dh->dccph_dport),
-			    inet6_iif(skb));
+	if (likely((sk = sock_skb_steal(skb)) == NULL))
+		sk = __inet6_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
+				    &ipv6_hdr(skb)->saddr, dh->dccph_sport,
+				    &ipv6_hdr(skb)->daddr, ntohs(dh->dccph_dport),
+				    inet6_iif(skb));
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1ac4d05..e879037 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1577,8 +1577,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags	 = iph->tos;
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
-	sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
-			th->source, iph->daddr, th->dest, inet_iif(skb));
+	if (likely((sk = sock_skb_steal(skb)) == NULL))
+		sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
+				th->source, iph->daddr, th->dest, inet_iif(skb));
+
 	if (!sk)
 		goto no_tcp_socket;
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 28c3c31..874bceb 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1198,8 +1198,9 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 		return __udp4_lib_mcast_deliver(net, skb, uh,
 				saddr, daddr, udptable);
 
-	sk = __udp4_lib_lookup(net, saddr, uh->source, daddr,
-			uh->dest, inet_iif(skb), udptable);
+	if (likely((sk = sock_skb_steal(skb)) == NULL))
+		sk = __udp4_lib_lookup(net, saddr, uh->source, daddr,
+				uh->dest, inet_iif(skb), udptable);
 
 	if (sk != NULL) {
 		int ret = 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e85f377..c44b6d7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1681,10 +1681,11 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags = ipv6_get_dsfield(ipv6_hdr(skb));
 	TCP_SKB_CB(skb)->sacked = 0;
 
-	sk = __inet6_lookup(net, &tcp_hashinfo,
-			&ipv6_hdr(skb)->saddr, th->source,
-			&ipv6_hdr(skb)->daddr, ntohs(th->dest),
-			inet6_iif(skb));
+	if (likely((sk = sock_skb_steal(skb)) == NULL))
+		sk = __inet6_lookup(net, &tcp_hashinfo,
+				&ipv6_hdr(skb)->saddr, th->source,
+				&ipv6_hdr(skb)->daddr, ntohs(th->dest),
+				inet6_iif(skb));
 
 	if (!sk)
 		goto no_tcp_socket;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index a6aecf7..e2741aa 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -488,8 +488,9 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 	 * check socket cache ... must talk to Alan about his plans
 	 * for sock caches... i'll skip this for now.
 	 */
-	sk = __udp6_lib_lookup(net, saddr, uh->source,
-			       daddr, uh->dest, inet6_iif(skb), udptable);
+	if (likely((sk = sock_skb_steal(skb)) == NULL))
+		sk = __udp6_lib_lookup(net, saddr, uh->source,
+				       daddr, uh->dest, inet6_iif(skb), udptable);
 
 	if (sk == NULL) {
 		if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))




* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-02 15:43         ` KOVACS Krisztian
@ 2008-10-02 17:09           ` Arnaldo Carvalho de Melo
  2008-10-02 19:58             ` David Miller
  2008-10-03  8:57             ` KOVACS Krisztian
  0 siblings, 2 replies; 64+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-10-02 17:09 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, kaber, netdev, netfilter-devel

On Thu, Oct 02, 2008 at 05:43:20PM +0200, KOVACS Krisztian wrote:
> Hi,
> 
> On Wed, 2008-10-01 at 08:51 -0700, David Miller wrote:
> > From: KOVACS Krisztian <hidden@sch.bme.hu>
> > Date: Wed, 01 Oct 2008 17:38:20 +0200
> > 
> > > The problem is that if you include the if() test then you have to
> > > include the lookup call as well and that's different for TCP/UDP.
> > 
> > No, I only mean to make a helper for this construct:
> > 
> > 	if (unlikely(skb->sk)) {
> > 		...
> > 	}
> > 
> > so, something like:
> > 
> > static inline struct sock *sock_skb_steal(struct sk_buff *skb)
> > {
> > 	if (unlikely(skb->sk)) {
> > 		struct sock *sk = skb->sk;
> > 
> > 		skb->destructor = NULL;
> > 		skb->sk = NULL;
> > 		return sk;
> > 	}
> > 	return NULL;
> > }
> > 
> > and then also get rid of the ifdefs at the place where
> > these calls are made (TCP and UDP).
> 
> Something like this?
> 
> - 8< -

Why don't you add it to __inet_lookup, __inet6_lookup and the udp_lib
lookup routines? And please rename it to skb_steal_sock, as it acts on a
skb, not on a sock.

- Arnaldo


* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-02 17:09           ` Arnaldo Carvalho de Melo
@ 2008-10-02 19:58             ` David Miller
  2008-10-03  8:57             ` KOVACS Krisztian
  1 sibling, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-02 19:58 UTC (permalink / raw)
  To: acme; +Cc: hidden, kaber, netdev, netfilter-devel

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Thu, 2 Oct 2008 14:09:35 -0300

> Why don't you add it to __inet_lookup, __inet6_lookup and the udp_lib
> lookup routines? And please rename it to skb_steal_sock, as it acts on a
> skb, not on a sock.

That seems to make sense to me.


* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-02 17:09           ` Arnaldo Carvalho de Melo
  2008-10-02 19:58             ` David Miller
@ 2008-10-03  8:57             ` KOVACS Krisztian
  2008-10-03 13:47               ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-03  8:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, KOVACS Krisztian, David Miller, kaber,
	netdev, netfilter-devel

Hi,

On Thu, Oct 02, 2008 at 02:09:35PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Oct 02, 2008 at 05:43:20PM +0200, KOVACS Krisztian wrote:
> > Hi,
> > 
> > On Wed, 2008-10-01 at 08:51 -0700, David Miller wrote:
> > > From: KOVACS Krisztian <hidden@sch.bme.hu>
> > > Date: Wed, 01 Oct 2008 17:38:20 +0200
> > > 
> > > > The problem is that if you include the if() test then you have to
> > > > include the lookup call as well and that's different for TCP/UDP.
> > > 
> > > No, I only mean to make a helper for this construct:
> > > 
> > > 	if (unlikely(skb->sk)) {
> > > 		...
> > > 	}
> > > 
> > > so, something like:
> > > 
> > > static inline struct sock *sock_skb_steal(struct sk_buff *skb)
> > > {
> > > 	if (unlikely(skb->sk)) {
> > > 		struct sock *sk = skb->sk;
> > > 
> > > 		skb->destructor = NULL;
> > > 		skb->sk = NULL;
> > > 		return sk;
> > > 	}
> > > 	return NULL;
> > > }
> > > 
> > > and then also get rid of the ifdefs at the place where
> > > these calls are made (TCP and UDP).
> > 
> > Something like this?
> > 
> > - 8< -
> 
> Why don't you add it to __inet_lookup, __inet6_lookup and the udp_lib
> lookup routines? And please rename it to skb_steal_sock, as it acts on a
> skb, not on a sock.

Those functions don't have access to the skb so unless we change the
signature they won't be able to steal the reference.

The renaming totally makes sense, the current name is misleading.

- 8< -
Use the socket reference cached in the skb if present. The iptables
TPROXY rule for example does a socket lookup early and attaches the
socket reference to the skb. So in the respective TCP/UDP/DCCP rcv()
routine we don't do a lookup but simply steal the socket reference from
the skb. 

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/sock.h  |   12 ++++++++++++
 net/dccp/ipv4.c     |    7 ++++---
 net/dccp/ipv6.c     |    9 +++++----
 net/ipv4/tcp_ipv4.c |    6 ++++--
 net/ipv4/udp.c      |    5 +++--
 net/ipv6/tcp_ipv6.c |    9 +++++----
 net/ipv6/udp.c      |    5 +++--
 7 files changed, 36 insertions(+), 17 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 75a312d..18f9670 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1324,6 +1324,18 @@ static inline void sk_change_net(struct sock *sk, struct net *net)
 	sock_net_set(sk, hold_net(net));
 }
 
+static inline struct sock *skb_steal_sock(struct sk_buff *skb)
+{
+	if (unlikely(skb->sk)) {
+		struct sock *sk = skb->sk;
+
+		skb->destructor = NULL;
+		skb->sk = NULL;
+		return sk;
+	}
+	return NULL;
+}
+
 extern void sock_enable_timestamp(struct sock *sk);
 extern int sock_get_timestamp(struct sock *, struct timeval __user *);
 extern int sock_get_timestampns(struct sock *, struct timespec __user *);
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 882c5c4..bb329e8 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -811,9 +811,10 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			   iph->saddr, dh->dccph_sport,
-			   iph->daddr, dh->dccph_dport, inet_iif(skb));
+	if (likely((sk = skb_steal_sock(skb)) == NULL))
+		sk = __inet_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
+				   iph->saddr, dh->dccph_sport,
+				   iph->daddr, dh->dccph_dport, inet_iif(skb));
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 5e1ee0d..66de2a9 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -805,10 +805,11 @@ static int dccp_v6_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet6_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			    &ipv6_hdr(skb)->saddr, dh->dccph_sport,
-			    &ipv6_hdr(skb)->daddr, ntohs(dh->dccph_dport),
-			    inet6_iif(skb));
+	if (likely((sk = skb_steal_sock(skb)) == NULL))
+		sk = __inet6_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
+				    &ipv6_hdr(skb)->saddr, dh->dccph_sport,
+				    &ipv6_hdr(skb)->daddr, ntohs(dh->dccph_dport),
+				    inet6_iif(skb));
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1ac4d05..397bc1e 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1577,8 +1577,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags	 = iph->tos;
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
-	sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
-			th->source, iph->daddr, th->dest, inet_iif(skb));
+	if (likely((sk = skb_steal_sock(skb)) == NULL))
+		sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
+				th->source, iph->daddr, th->dest, inet_iif(skb));
+
 	if (!sk)
 		goto no_tcp_socket;
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 28c3c31..4bdb4f5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1198,8 +1198,9 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 		return __udp4_lib_mcast_deliver(net, skb, uh,
 				saddr, daddr, udptable);
 
-	sk = __udp4_lib_lookup(net, saddr, uh->source, daddr,
-			uh->dest, inet_iif(skb), udptable);
+	if (likely((sk = skb_steal_sock(skb)) == NULL))
+		sk = __udp4_lib_lookup(net, saddr, uh->source, daddr,
+				uh->dest, inet_iif(skb), udptable);
 
 	if (sk != NULL) {
 		int ret = 0;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e85f377..f477b1d 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1681,10 +1681,11 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags = ipv6_get_dsfield(ipv6_hdr(skb));
 	TCP_SKB_CB(skb)->sacked = 0;
 
-	sk = __inet6_lookup(net, &tcp_hashinfo,
-			&ipv6_hdr(skb)->saddr, th->source,
-			&ipv6_hdr(skb)->daddr, ntohs(th->dest),
-			inet6_iif(skb));
+	if (likely((sk = skb_steal_sock(skb)) == NULL))
+		sk = __inet6_lookup(net, &tcp_hashinfo,
+				&ipv6_hdr(skb)->saddr, th->source,
+				&ipv6_hdr(skb)->daddr, ntohs(th->dest),
+				inet6_iif(skb));
 
 	if (!sk)
 		goto no_tcp_socket;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index a6aecf7..659c6a4 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -488,8 +488,9 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 	 * check socket cache ... must talk to Alan about his plans
 	 * for sock caches... i'll skip this for now.
 	 */
-	sk = __udp6_lib_lookup(net, saddr, uh->source,
-			       daddr, uh->dest, inet6_iif(skb), udptable);
+	if (likely((sk = skb_steal_sock(skb)) == NULL))
+		sk = __udp6_lib_lookup(net, saddr, uh->source,
+				       daddr, uh->dest, inet6_iif(skb), udptable);
 
 	if (sk == NULL) {
 		if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))





* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-03  8:57             ` KOVACS Krisztian
@ 2008-10-03 13:47               ` Arnaldo Carvalho de Melo
  2008-10-07  7:36                 ` KOVACS Krisztian
                                   ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-10-03 13:47 UTC (permalink / raw)
  To: KOVACS Krisztian
  Cc: Arnaldo Carvalho de Melo, David Miller, kaber, netdev,
	netfilter-devel

On Fri, Oct 03, 2008 at 10:57:48AM +0200, KOVACS Krisztian wrote:
> Hi,
> 
> On Thu, Oct 02, 2008 at 02:09:35 -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, Oct 02, 2008 at 05:43:20PM +0200, KOVACS Krisztian wrote:
> > > Hi,
> > > 
> > > On Wed, 2008-10-01 at 08:51 -0700, David Miller wrote:
> > > > From: KOVACS Krisztian <hidden@sch.bme.hu>
> > > > Date: Wed, 01 Oct 2008 17:38:20 +0200
> > > > 
> > > > > The problem is that if you include the if() test then you have to
> > > > > include the lookup call as well and that's different for TCP/UDP.
> > > > 
> > > > No, I only mean to make a helper for this construct:
> > > > 
> > > > 	if (unlikely(skb->sk)) {
> > > > 		...
> > > > 	}
> > > > 
> > > > so, something like:
> > > > 
> > > > static inline struct sock *sock_skb_steal(struct sk_buff *skb)
> > > > {
> > > > 	if (unlikely(skb->sk)) {
> > > > 		struct sock *sk = skb->sk;
> > > > 
> > > > 		skb->destructor = NULL;
> > > > 		skb->sk = NULL;
> > > > 		return sk;
> > > > 	}
> > > > 	return NULL;
> > > > }
> > > > 
> > > > and then also get rid of the ifdefs at the place where
> > > > these calls are made (TCP and UDP).
> > > 
> > > Something like this?
> > > 
> > > - 8< -
> > 
> > Why don't you add it to __inet_lookup, __inet6_lookup and the udp_lib
> > lookup routines? And please rename it to skb_steal_sock, as it acts on a
> > skb, not on a sock.
> 
> Those functions don't have access to the skb so unless we change the
> signature they won't be able to steal the reference.

Indeed, but we should try to have the main TCP code flow clean, ditto for
DCCP, free of such details, so after this activity settles down I'll
submit something like the patch below.

If Dave agrees and you feel like merging it on your current patchset,
feel free to do it.
 
> The renaming totally makes sense, the current name is misleading.

Thanks,

- Arnaldo

inet_hashtables: Add inet_lookup_skb helpers

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index e48989f..995efbb 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -91,6 +91,17 @@ static inline struct sock *__inet6_lookup(struct net *net,
 	return inet6_lookup_listener(net, hashinfo, daddr, hnum, dif);
 }
 
+static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
+					      struct sk_buff *skb,
+					      const __be16 sport,
+					      const __be16 dport)
+{
+	return __inet6_lookup(dev_net(skb->dst->dev), hashinfo,
+			      &ipv6_hdr(skb)->saddr, sport,
+			      &ipv6_hdr(skb)->daddr, ntohs(dport),
+			      inet6_iif(skb));
+}
+
 extern struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
 				 const struct in6_addr *saddr, const __be16 sport,
 				 const struct in6_addr *daddr, const __be16 dport,
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index bb619d8..481681d 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -371,6 +371,18 @@ static inline struct sock *inet_lookup(struct net *net,
 	return sk;
 }
 
+static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
+					     struct sk_buff *skb,
+					     const __be16 sport,
+					     const __be16 dport)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+
+	return __inet_lookup(dev_net(skb->dst->dev), hashinfo,
+			     iph->saddr, sport,
+			     iph->daddr, dport, inet_iif(skb));
+}
+
 extern int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		struct sock *sk, u32 port_offset,
 		int (*check_established)(struct inet_timewait_death_row *,
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 882c5c4..e3dfdda 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -811,9 +811,8 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			   iph->saddr, dh->dccph_sport,
-			   iph->daddr, dh->dccph_dport, inet_iif(skb));
+	sk = __inet_lookup_skb(&dccp_hashinfo, skb,
+			       dh->dccph_sport, dh->dccph_dport);
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 5e1ee0d..caa7f34 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -805,10 +805,8 @@ static int dccp_v6_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet6_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			    &ipv6_hdr(skb)->saddr, dh->dccph_sport,
-			    &ipv6_hdr(skb)->daddr, ntohs(dh->dccph_dport),
-			    inet6_iif(skb));
+	sk = __inet6_lookup_skb(&dccp_hashinfo, skb,
+			        dh->dccph_sport, dh->dccph_dport);
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1b4fee2..c3caae2 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1567,8 +1567,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags	 = iph->tos;
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
-	sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
-			th->source, iph->daddr, th->dest, inet_iif(skb));
+	sk = __inet_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);
 	if (!sk)
 		goto no_tcp_socket;
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index b585c85..ff5a99c 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1680,11 +1680,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags = ipv6_get_dsfield(ipv6_hdr(skb));
 	TCP_SKB_CB(skb)->sacked = 0;
 
-	sk = __inet6_lookup(net, &tcp_hashinfo,
-			&ipv6_hdr(skb)->saddr, th->source,
-			&ipv6_hdr(skb)->daddr, ntohs(th->dest),
-			inet6_iif(skb));
-
+	sk = __inet6_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);
 	if (!sk)
 		goto no_tcp_socket;
 


* Re: [net-next PATCH 16/16] Add documentation
  2008-10-01 14:24 ` [net-next PATCH 16/16] Add documentation KOVACS Krisztian
  2008-10-01 16:22   ` Randy Dunlap
@ 2008-10-03 14:01   ` Jan Engelhardt
  2008-10-07  7:01     ` KOVACS Krisztian
  2008-10-08  0:32     ` Philip Craig
  1 sibling, 2 replies; 64+ messages in thread
From: Jan Engelhardt @ 2008-10-03 14:01 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, Patrick McHardy, netdev, netfilter-devel


On Wednesday 2008-10-01 10:24, KOVACS Krisztian wrote:

>+Transparent proxy support
>+=========================
>+
>+This feature adds Linux 2.2-like transparent proxy support to current kernels.
>+To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in
>+your kernel config. You will need policy routing too, so be sure to enable that
>+as well.

To use server-side transparent proxying (i.e. using a foreign address
when sending out packets), only tproxy_core is needed.

>+fd = socket(AF_INET, SOCK_STREAM, 0);

You want to be using IPPROTO_TCP here, as I doubt there is a guarantee
that 0 will never choose SCTP.

>+int value = 1;

Const is good:
	static const unsigned int value = 1;

>+setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
>+/* - 8< -*/
>+name.sin_family = AF_INET;
>+name.sin_port = htons(0xCAFE);
>+name.sin_addr.s_addr = htonl(0xDEADBEEF);

Replace last one by
	inet_pton(PF_INET, "192.0.2.37", &name.sin_addr);

(Hacking anything inside sin_addr is, strictly speaking, breaking the
“encapsulation”, as far as that “exists” in C.)

>+bind(fd, &name, sizeof(name));

You will need

	bind(fd, (const void *)&name, sizeof(name));

to avoid a compiler warning ;-)
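
Putting the suggestions above together, a minimal sketch of the bind
sequence could look like the following. The helper name
bind_transparent() is hypothetical, and the fallback value 19 for
IP_TRANSPARENT is the one from <linux/in.h>; it is only used when the
libc headers do not provide the constant.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SOL_IP
#define SOL_IP IPPROTO_IP	/* same numeric level on Linux */
#endif
#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19	/* value from <linux/in.h>, assumed if libc lacks it */
#endif

/*
 * Hypothetical helper: create a TCP socket, set IP_TRANSPARENT on it and
 * bind it to addr:port, which may be a non-local address.
 * Returns the socket fd on success, -1 on any failure.
 */
static int bind_transparent(const char *addr, unsigned short port)
{
	static const int value = 1;
	struct sockaddr_in name;
	int fd;

	memset(&name, 0, sizeof(name));
	name.sin_family = AF_INET;
	name.sin_port = htons(port);
	/* inet_pton() returns 1 only for a valid dotted-quad string */
	if (inet_pton(AF_INET, addr, &name.sin_addr) != 1)
		return -1;

	/* Name the protocol explicitly instead of passing 0 */
	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (fd < 0)
		return -1;

	/* The option must be set before bind() for a non-local address */
	if (setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value)) < 0 ||
	    bind(fd, (const struct sockaddr *)&name, sizeof(name)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Calling bind_transparent("192.0.2.37", 0xCAFE) is expected to fail
without CAP_NET_ADMIN, since setting IP_TRANSPARENT requires that
capability.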

>+2. Redirecting traffic
>+======================
>+
>+Transparent proxying often involves "intercepting" traffic on a router. This is
>+usually done with the iptables REDIRECT target, however, there are serious
>+limitations of that method. One of the major issues is that it actually
>+modifies the packets to change the destination address -- which might not be
>+acceptable in certain situations. (Think of proxying UDP for example: you won't
>+be able to find out the original destination address. Even in case of TCP
>+getting the original destination address is racy.)

IIRC, you _can_ find out, though I agree it's rather a hack (with 
tproxy, you can just use the address as received via recvmsg):

	getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, &sockaddr, &sizeptr);
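
For reference, a small sketch of that call with the types spelled out.
The wrapper name original_dst() is hypothetical, and the fallback value
80 for SO_ORIGINAL_DST is taken from <linux/netfilter_ipv4.h> so that
the snippet builds without the kernel headers. The option is served by
connection tracking, so it is only expected to succeed on a connection
that conntrack knows about.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_IP
#define SOL_IP IPPROTO_IP	/* same numeric level on Linux */
#endif
#ifndef SO_ORIGINAL_DST
#define SO_ORIGINAL_DST 80	/* value from <linux/netfilter_ipv4.h> */
#endif

/*
 * Hypothetical wrapper: ask for the pre-NAT destination of a connected
 * IPv4 socket. Returns 0 and fills *dst on success, -1 with errno set
 * otherwise.
 */
static int original_dst(int fd, struct sockaddr_in *dst)
{
	socklen_t len = sizeof(*dst);

	memset(dst, 0, sizeof(*dst));
	return getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, dst, &len);
}
```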



* Re: [net-next PATCH 14/16] iptables socket match
  2008-10-02  9:26   ` Patrick McHardy
  2008-10-02 10:26     ` KOVACS Krisztian
@ 2008-10-03 14:04     ` Jan Engelhardt
  1 sibling, 0 replies; 64+ messages in thread
From: Jan Engelhardt @ 2008-10-03 14:04 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: KOVACS Krisztian, David Miller, netdev, netfilter-devel


On Thursday 2008-10-02 05:26, Patrick McHardy wrote:

> KOVACS Krisztian wrote:
>> Add iptables 'socket' match, which matches packets for which a TCP/UDP
>> socket lookup succeeds.
>
> It seems sufficiently different from what xt_owner does to justify a separate
> module.

I am with you on that. However, I would have liked (already last year)
to have xt_owner revision 1 called socket, because xt_owner currently deals
with much more than just ownership, and because that would have saved us from
now adding yet another module.
Well, xt_socket at last! ;-)


* Re: [net-next PATCH 16/16] Add documentation
  2008-10-03 14:01   ` [net-next " Jan Engelhardt
@ 2008-10-07  7:01     ` KOVACS Krisztian
  2008-10-07 13:25       ` [patch] Update tproxy documentation Jan Engelhardt
  2008-10-07 19:50       ` [net-next PATCH 16/16] Add documentation David Miller
  2008-10-08  0:32     ` Philip Craig
  1 sibling, 2 replies; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-07  7:01 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: David Miller, Patrick McHardy, netdev, netfilter-devel

Hi,

On Fri, 2008-10-03 at 10:01 -0400, Jan Engelhardt wrote:
> On Wednesday 2008-10-01 10:24, KOVACS Krisztian wrote:
> 
> >+Transparent proxy support
> >+=========================
> >+
> >+This feature adds Linux 2.2-like transparent proxy support to current kernels.
> >+To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in
> >+your kernel config. You will need policy routing too, so be sure to enable that
> >+as well.
> 
> To use server-side transparent proxying (i.e. using a foreign address
> when sending out packets), only tproxy_core is needed.
> 
> >+fd = socket(AF_INET, SOCK_STREAM, 0);
> 
> You want to be using IPPROTO_TCP here, as I doubt there is a guarantee
> that 0 will never choose SCTP.
> 
> >+int value = 1;
> 
> Const is good:
> 	static const unsigned int value = 1;
> 
> >+setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
> >+/* - 8< -*/
> >+name.sin_family = AF_INET;
> >+name.sin_port = htons(0xCAFE);
> >+name.sin_addr.s_addr = htonl(0xDEADBEEF);
> 
> Replace last one by
> 	inet_pton(PF_INET, "192.0.2.37", &name.sin_addr);
> 
> (Hacking anything inside sin_addr is, strictly speaking, breaking the
> “encapsulation”, as far as that “exists” in C.)
> 
> >+bind(fd, &name, sizeof(name));
> 
> You will need
> 
> 	bind(fd, (const void *)&name, sizeof(name));
> 
> to avoid a compiler warning ;-)

Jan, while you're right, I think the point of the example is to show
that you only need to set the IP_TRANSPARENT flag before being able to
bind to a non-local address.

I'm not opposed to the changes, though, so could you please send a patch
on top of Dave's current net-next tree? Thanks.

> 
> >+2. Redirecting traffic
> >+======================
> >+
> >+Transparent proxying often involves "intercepting" traffic on a router. This is
> >+usually done with the iptables REDIRECT target, however, there are serious
> >+limitations of that method. One of the major issues is that it actually
> >+modifies the packets to change the destination address -- which might not be
> >+acceptable in certain situations. (Think of proxying UDP for example: you won't
> >+be able to find out the original destination address. Even in case of TCP
> >+getting the original destination address is racy.)
> 
> IIRC, you _can_ find out, though I agree it's rather a hack (with 
> tproxy, you can just use the address as received via recvmsg):
> 
> 	getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, &sockaddr, &sizeptr);

This is true only if you have connection tracking loaded, whereas the new
tproxy can be used without conntrack.

-- 
KOVACS Krisztian



* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-03 13:47               ` Arnaldo Carvalho de Melo
@ 2008-10-07  7:36                 ` KOVACS Krisztian
  2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
  2008-10-07  7:42                 ` [net-next PATCH] Add udplib_lookup_skb() helpers (was: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
  2008-10-07  7:59                 ` [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb (was: Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
  2 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-07  7:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, David Miller; +Cc: kaber, netdev, netfilter-devel

Hi,

On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> [...] 
> > > Why don't you add it to __inet_lookup, __inet6_lookup and the udp_lib
> > > lookup routines? And please rename it to skb_steal_sock, as it acts on a
> > > skb, not on a sock.
> > 
> > Those functions don't have access to the skb so unless we change the
> > signature they won't be able to steal the reference.
> 
> Indeed, but we should try to have the main TCP code flow clean, ditto for
> DCCP, free of such details, so after this activity settles down I'll
> submit something like the patch below.
> 
> If Dave agrees and you feel like merging it on your current patchset,
> feel free to do it.

Ok, I'll pick this up. Didn't compile because of missing includes in
inet_hashtables.h but I've fixed it.

-- 
KOVACS Krisztian 


inet_hashtables: Add inet_lookup_skb helpers

To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that take the skb as an
argument.

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/inet6_hashtables.h |   11 +++++++++++
 include/net/inet_hashtables.h  |   14 ++++++++++++++
 net/dccp/ipv4.c                |    5 ++---
 net/dccp/ipv6.c                |    6 ++----
 net/ipv4/tcp_ipv4.c            |    3 +--
 net/ipv6/tcp_ipv6.c            |    6 +-----
 6 files changed, 31 insertions(+), 14 deletions(-)


diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index e48989f..995efbb 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -91,6 +91,17 @@ static inline struct sock *__inet6_lookup(struct net *net,
 	return inet6_lookup_listener(net, hashinfo, daddr, hnum, dif);
 }
 
+static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
+					      struct sk_buff *skb,
+					      const __be16 sport,
+					      const __be16 dport)
+{
+	return __inet6_lookup(dev_net(skb->dst->dev), hashinfo,
+			      &ipv6_hdr(skb)->saddr, sport,
+			      &ipv6_hdr(skb)->daddr, ntohs(dport),
+			      inet6_iif(skb));
+}
+
 extern struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
 				 const struct in6_addr *saddr, const __be16 sport,
 				 const struct in6_addr *daddr, const __be16 dport,
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index bb619d8..3522bbc 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -16,6 +16,7 @@
 
 
 #include <linux/interrupt.h>
+#include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/list.h>
 #include <linux/slab.h>
@@ -28,6 +29,7 @@
 #include <net/inet_connection_sock.h>
 #include <net/inet_sock.h>
 #include <net/sock.h>
+#include <net/route.h>
 #include <net/tcp_states.h>
 #include <net/netns/hash.h>
 
@@ -371,6 +373,18 @@ static inline struct sock *inet_lookup(struct net *net,
 	return sk;
 }
 
+static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
+					     struct sk_buff *skb,
+					     const __be16 sport,
+					     const __be16 dport)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+
+	return __inet_lookup(dev_net(skb->dst->dev), hashinfo,
+			     iph->saddr, sport,
+			     iph->daddr, dport, inet_iif(skb));
+}
+
 extern int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		struct sock *sk, u32 port_offset,
 		int (*check_established)(struct inet_timewait_death_row *,
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 882c5c4..e3dfdda 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -811,9 +811,8 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			   iph->saddr, dh->dccph_sport,
-			   iph->daddr, dh->dccph_dport, inet_iif(skb));
+	sk = __inet_lookup_skb(&dccp_hashinfo, skb,
+			       dh->dccph_sport, dh->dccph_dport);
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 5e1ee0d..caa7f34 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -805,10 +805,8 @@ static int dccp_v6_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet6_lookup(dev_net(skb->dst->dev), &dccp_hashinfo,
-			    &ipv6_hdr(skb)->saddr, dh->dccph_sport,
-			    &ipv6_hdr(skb)->daddr, ntohs(dh->dccph_dport),
-			    inet6_iif(skb));
+	sk = __inet6_lookup_skb(&dccp_hashinfo, skb,
+			        dh->dccph_sport, dh->dccph_dport);
 	/*
 	 * Step 2:
 	 *	If no socket ...
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1ac4d05..5215369 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1577,8 +1577,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags	 = iph->tos;
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
-	sk = __inet_lookup(net, &tcp_hashinfo, iph->saddr,
-			th->source, iph->daddr, th->dest, inet_iif(skb));
+	sk = __inet_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);
 	if (!sk)
 		goto no_tcp_socket;
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e85f377..37b189f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1681,11 +1681,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags = ipv6_get_dsfield(ipv6_hdr(skb));
 	TCP_SKB_CB(skb)->sacked = 0;
 
-	sk = __inet6_lookup(net, &tcp_hashinfo,
-			&ipv6_hdr(skb)->saddr, th->source,
-			&ipv6_hdr(skb)->daddr, ntohs(th->dest),
-			inet6_iif(skb));
-
+	sk = __inet6_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);
 	if (!sk)
 		goto no_tcp_socket;
 



* [net-next PATCH] Add udplib_lookup_skb() helpers (was: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb)
  2008-10-03 13:47               ` Arnaldo Carvalho de Melo
  2008-10-07  7:36                 ` KOVACS Krisztian
@ 2008-10-07  7:42                 ` KOVACS Krisztian
  2008-10-07 12:34                   ` Arnaldo Carvalho de Melo
  2008-10-07  7:59                 ` [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb (was: Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
  2 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-07  7:42 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, David Miller; +Cc: kaber, netdev, netfilter-devel

Hi,

On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> > Those functions don't have access to the skb so unless we change the
> > signature they won't be able to steal the reference.
> 
> Indeed, but we should try to have the main TCP code flow clean, ditto for
> DCCP, free of such details, so after this activity settles down I'll
> submit something like the patch below.
> 
> If Dave agrees and you feel like merging it on your current patchset,
> feel free to do it.

And here are the skb lookup helpers for UDP.

-- 
KOVACS Krisztian


Add udplib_lookup_skb() helpers

To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that take the skb as an
argument.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 net/ipv4/udp.c |   14 ++++++++++++--
 net/ipv6/udp.c |   14 ++++++++++++--
 2 files changed, 24 insertions(+), 4 deletions(-)


diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 28c3c31..8369f4d 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -302,6 +302,17 @@ static struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 	return result;
 }
 
+static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
+						 __be16 sport, __be16 dport,
+						 struct hlist_head udptable[])
+{
+	const struct iphdr *iph = ip_hdr(skb);
+
+	return __udp4_lib_lookup(dev_net(skb->dst->dev), iph->saddr, sport,
+				 iph->daddr, dport, inet_iif(skb),
+				 udptable);
+}
+
 struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
 			     __be32 daddr, __be16 dport, int dif)
 {
@@ -1198,8 +1209,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 		return __udp4_lib_mcast_deliver(net, skb, uh,
 				saddr, daddr, udptable);
 
-	sk = __udp4_lib_lookup(net, saddr, uh->source, daddr,
-			uh->dest, inet_iif(skb), udptable);
+	sk = __udp4_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
 
 	if (sk != NULL) {
 		int ret = 0;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index a6aecf7..ce26c41 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -107,6 +107,17 @@ static struct sock *__udp6_lib_lookup(struct net *net,
 	return result;
 }
 
+static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
+					  __be16 sport, __be16 dport,
+					  struct hlist_head udptable[])
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+
+	return __udp6_lib_lookup(dev_net(skb->dst->dev), &iph->saddr, sport,
+				 &iph->daddr, dport, inet6_iif(skb),
+				 udptable);
+}
+
 /*
  * 	This should be easy, if there is something there we
  * 	return it, otherwise we block.
@@ -488,8 +499,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 	 * check socket cache ... must talk to Alan about his plans
 	 * for sock caches... i'll skip this for now.
 	 */
-	sk = __udp6_lib_lookup(net, saddr, uh->source,
-			       daddr, uh->dest, inet6_iif(skb), udptable);
+	sk = __udp6_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
 
 	if (sk == NULL) {
 		if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))




* [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb (was: Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb)
  2008-10-03 13:47               ` Arnaldo Carvalho de Melo
  2008-10-07  7:36                 ` KOVACS Krisztian
  2008-10-07  7:42                 ` [net-next PATCH] Add udplib_lookup_skb() helpers (was: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
@ 2008-10-07  7:59                 ` KOVACS Krisztian
  2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-07  7:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, David Miller; +Cc: kaber, netdev, netfilter-devel

Hi,

On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> > Those functions don't have access to the skb so unless we change the
> > signature they won't be able to steal the reference.
> 
> Indeed, but we should try to have the main TCP code flow clean, ditto for
> DCCP, free of such details, so after this activity settles down I'll
> submit something like the patch below.
> 
> If Dave agrees and you feel like merging it on your current patchset,
> feel free to do it.

And here's the modified use-cached-sk-if-present patch built on top of
the previous two x_lookup_skb() helper patches.
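
The control flow the helpers end up with can be modelled in plain
userspace C, roughly as below. This is only an illustrative sketch: all
model_* names are hypothetical stand-ins, not kernel types, and the
"slow" lookup is a stub in place of the real hash table lookups.

```c
#include <stddef.h>

/* Stand-ins for struct sock and struct sk_buff, for illustration only */
struct model_sock {
	int id;
};

struct model_skb {
	struct model_sock *sk;			/* cached socket, may be NULL */
	void (*destructor)(struct model_skb *skb);
};

/* Take over the cached socket reference, leaving the skb without one */
static struct model_sock *model_steal_sock(struct model_skb *skb)
{
	struct model_sock *sk = skb->sk;

	if (sk) {
		skb->destructor = NULL;
		skb->sk = NULL;
	}
	return sk;
}

/* Stub for the regular hash table lookup */
static struct model_sock model_table_sock = { 2 };

static struct model_sock *model_table_lookup(const struct model_skb *skb)
{
	(void)skb;
	return &model_table_sock;
}

/* Use the cached socket if present, otherwise fall back to the lookup */
static struct model_sock *
model_lookup_skb(struct model_skb *skb,
		 struct model_sock *(*slow)(const struct model_skb *))
{
	struct model_sock *sk = model_steal_sock(skb);

	return sk ? sk : slow(skb);
}
```

Because the steal clears skb->sk, a second lookup on the same skb falls
through to the table lookup.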

-- 
KOVACS Krisztian


Don't lookup the socket if there's a socket attached to the skb

Use the socket cached in the skb if it's present.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
---

 include/net/inet6_hashtables.h |   12 ++++++++----
 include/net/inet_hashtables.h  |    9 ++++++---
 include/net/sock.h             |   12 ++++++++++++
 net/ipv4/udp.c                 |    9 ++++++---
 net/ipv6/udp.c                 |    9 ++++++---
 5 files changed, 38 insertions(+), 13 deletions(-)


diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index 995efbb..f74665d 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -96,10 +96,14 @@ static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
 					      const __be16 sport,
 					      const __be16 dport)
 {
-	return __inet6_lookup(dev_net(skb->dst->dev), hashinfo,
-			      &ipv6_hdr(skb)->saddr, sport,
-			      &ipv6_hdr(skb)->daddr, ntohs(dport),
-			      inet6_iif(skb));
+	struct sock *sk;
+
+	if (unlikely(sk = skb_steal_sock(skb)))
+		return sk;
+	else return __inet6_lookup(dev_net(skb->dst->dev), hashinfo,
+				   &ipv6_hdr(skb)->saddr, sport,
+				   &ipv6_hdr(skb)->daddr, ntohs(dport),
+				   inet6_iif(skb));
 }
 
 extern struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 3522bbc..72b9ba5 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -378,11 +378,14 @@ static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
 					     const __be16 sport,
 					     const __be16 dport)
 {
+	struct sock *sk;
 	const struct iphdr *iph = ip_hdr(skb);
 
-	return __inet_lookup(dev_net(skb->dst->dev), hashinfo,
-			     iph->saddr, sport,
-			     iph->daddr, dport, inet_iif(skb));
+	if (unlikely(sk = skb_steal_sock(skb)))
+		return sk;
+	else return __inet_lookup(dev_net(skb->dst->dev), hashinfo,
+				  iph->saddr, sport,
+				  iph->daddr, dport, inet_iif(skb));
 }
 
 extern int __inet_hash_connect(struct inet_timewait_death_row *death_row,
diff --git a/include/net/sock.h b/include/net/sock.h
index 75a312d..18f9670 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1324,6 +1324,18 @@ static inline void sk_change_net(struct sock *sk, struct net *net)
 	sock_net_set(sk, hold_net(net));
 }
 
+static inline struct sock *skb_steal_sock(struct sk_buff *skb)
+{
+	if (unlikely(skb->sk)) {
+		struct sock *sk = skb->sk;
+
+		skb->destructor = NULL;
+		skb->sk = NULL;
+		return sk;
+	}
+	return NULL;
+}
+
 extern void sock_enable_timestamp(struct sock *sk);
 extern int sock_get_timestamp(struct sock *, struct timeval __user *);
 extern int sock_get_timestampns(struct sock *, struct timespec __user *);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8369f4d..219a4aa 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -306,11 +306,14 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 						 __be16 sport, __be16 dport,
 						 struct hlist_head udptable[])
 {
+	struct sock *sk;
 	const struct iphdr *iph = ip_hdr(skb);
 
-	return __udp4_lib_lookup(dev_net(skb->dst->dev), iph->saddr, sport,
-				 iph->daddr, dport, inet_iif(skb),
-				 udptable);
+	if (unlikely(sk = skb_steal_sock(skb)))
+		return sk;
+	else return __udp4_lib_lookup(dev_net(skb->dst->dev), iph->saddr, sport,
+				      iph->daddr, dport, inet_iif(skb),
+				      udptable);
 }
 
 struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index ce26c41..95a2b56 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -111,11 +111,14 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 					  __be16 sport, __be16 dport,
 					  struct hlist_head udptable[])
 {
+	struct sock *sk;
 	struct ipv6hdr *iph = ipv6_hdr(skb);
 
-	return __udp6_lib_lookup(dev_net(skb->dst->dev), &iph->saddr, sport,
-				 &iph->daddr, dport, inet6_iif(skb),
-				 udptable);
+	if (unlikely(sk = skb_steal_sock(skb)))
+		return sk;
+	else return __udp6_lib_lookup(dev_net(skb->dst->dev), &iph->saddr, sport,
+				      &iph->daddr, dport, inet6_iif(skb),
+				      udptable);
 }
 
 /*




* Re: [net-next PATCH] Add udplib_lookup_skb() helpers (was: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb)
  2008-10-07  7:42                 ` [net-next PATCH] Add udplib_lookup_skb() helpers (was: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
@ 2008-10-07 12:34                   ` Arnaldo Carvalho de Melo
  2008-10-07 19:39                     ` [net-next PATCH] Add udplib_lookup_skb() helpers David Miller
  0 siblings, 1 reply; 64+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-10-07 12:34 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, kaber, netdev, netfilter-devel

On Tue, Oct 07, 2008 at 09:42:12AM +0200, KOVACS Krisztian wrote:
> Hi,
> 
> On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> > > Those functions don't have access to the skb so unless we change the
> > > signature they won't be able to steal the reference.
> > 
> > > Indeed, but we should try to have the main TCP code flow clean, ditto for
> > > DCCP, free of such details, so after this activity settles down I'll
> > > submit something like the patch below.
> > 
> > If Dave agrees and you feel like merging it on your current patchset,
> > feel free to do it.
> 
> And here are the skb lookup helpers for UDP.

Thanks!

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>


* Re: [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb (was: Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb)
  2008-10-07  7:59                 ` [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb (was: Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
@ 2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
  2008-10-07 19:41                     ` [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb David Miller
  0 siblings, 1 reply; 64+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-10-07 12:36 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, kaber, netdev, netfilter-devel

On Tue, Oct 07, 2008 at 09:59:29AM +0200, KOVACS Krisztian wrote:
> Hi,
> 
> On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> > > Those functions don't have access to the skb so unless we change the
> > > signature they won't be able to steal the reference.
> > 
> > Indeed, but we should try to have the main TCP code flow clean, ditto for
> > > DCCP, free of such details, so after this activity settles down I'll
> > submit something like the patch below.
> > 
> > If Dave agrees and you feel like merging it on your current patchset,
> > feel free to do it.
> 
> And here's the modified use-cached-sk-if-present patch built on top of
> the previous two x_lookup_skb() helper patches.
> 
> -- 
> KOVACS Krisztian
> 
> 
> Don't lookup the socket if there's a socket attached to the skb
> 
> Use the socket cached in the skb if it's present.
> 
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
> ---
> 
>  include/net/inet6_hashtables.h |   12 ++++++++----
>  include/net/inet_hashtables.h  |    9 ++++++---
>  include/net/sock.h             |   12 ++++++++++++
>  net/ipv4/udp.c                 |    9 ++++++---
>  net/ipv6/udp.c                 |    9 ++++++---
>  5 files changed, 38 insertions(+), 13 deletions(-)
> 
> 
> diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
> index 995efbb..f74665d 100644
> --- a/include/net/inet6_hashtables.h
> +++ b/include/net/inet6_hashtables.h
> @@ -96,10 +96,14 @@ static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
>  					      const __be16 sport,
>  					      const __be16 dport)
>  {
> -	return __inet6_lookup(dev_net(skb->dst->dev), hashinfo,
> -			      &ipv6_hdr(skb)->saddr, sport,
> -			      &ipv6_hdr(skb)->daddr, ntohs(dport),
> -			      inet6_iif(skb));
> +	struct sock *sk;
> +
> +	if (unlikely(sk = skb_steal_sock(skb)))
> +		return sk;
> +	else return __inet6_lookup(dev_net(skb->dst->dev), hashinfo,
> +				   &ipv6_hdr(skb)->saddr, sport,
> +				   &ipv6_hdr(skb)->daddr, ntohs(dport),
> +				   inet6_iif(skb));
>  }
>  
>  extern struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
> diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
> index 3522bbc..72b9ba5 100644
> --- a/include/net/inet_hashtables.h
> +++ b/include/net/inet_hashtables.h
> @@ -378,11 +378,14 @@ static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
>  					     const __be16 sport,
>  					     const __be16 dport)
>  {
> +	struct sock *sk;
>  	const struct iphdr *iph = ip_hdr(skb);
>  
> -	return __inet_lookup(dev_net(skb->dst->dev), hashinfo,
> -			     iph->saddr, sport,
> -			     iph->daddr, dport, inet_iif(skb));
> +	if (unlikely(sk = skb_steal_sock(skb)))
> +		return sk;
> +	else return __inet_lookup(dev_net(skb->dst->dev), hashinfo,
> +				  iph->saddr, sport,
> +				  iph->daddr, dport, inet_iif(skb));

return on a different line, please:

	else
		return

>  }
>  
>  extern int __inet_hash_connect(struct inet_timewait_death_row *death_row,
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 75a312d..18f9670 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1324,6 +1324,18 @@ static inline void sk_change_net(struct sock *sk, struct net *net)
>  	sock_net_set(sk, hold_net(net));
>  }
>  
> +static inline struct sock *skb_steal_sock(struct sk_buff *skb)
> +{
> +	if (unlikely(skb->sk)) {
> +		struct sock *sk = skb->sk;
> +
> +		skb->destructor = NULL;
> +		skb->sk = NULL;
> +		return sk;
> +	}
> +	return NULL;
> +}
> +
>  extern void sock_enable_timestamp(struct sock *sk);
>  extern int sock_get_timestamp(struct sock *, struct timeval __user *);
>  extern int sock_get_timestampns(struct sock *, struct timespec __user *);
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 8369f4d..219a4aa 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -306,11 +306,14 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
>  						 __be16 sport, __be16 dport,
>  						 struct hlist_head udptable[])
>  {
> +	struct sock *sk;
>  	const struct iphdr *iph = ip_hdr(skb);
>  
> -	return __udp4_lib_lookup(dev_net(skb->dst->dev), iph->saddr, sport,
> -				 iph->daddr, dport, inet_iif(skb),
> -				 udptable);
> +	if (unlikely(sk = skb_steal_sock(skb)))
> +		return sk;
> +	else return __udp4_lib_lookup(dev_net(skb->dst->dev), iph->saddr, sport,
> +				      iph->daddr, dport, inet_iif(skb),
> +				      udptable);

ditto

>  }
>  
>  struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index ce26c41..95a2b56 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -111,11 +111,14 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
>  					  __be16 sport, __be16 dport,
>  					  struct hlist_head udptable[])
>  {
> +	struct sock *sk;
>  	struct ipv6hdr *iph = ipv6_hdr(skb);
>  
> -	return __udp6_lib_lookup(dev_net(skb->dst->dev), &iph->saddr, sport,
> -				 &iph->daddr, dport, inet6_iif(skb),
> -				 udptable);
> +	if (unlikely(sk = skb_steal_sock(skb)))
> +		return sk;
> +	else return __udp6_lib_lookup(dev_net(skb->dst->dev), &iph->saddr, sport,
> +				      &iph->daddr, dport, inet6_iif(skb),
> +				      udptable);

ditto


After you fix this up, please add:

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-07  7:36                 ` KOVACS Krisztian
@ 2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
  2008-10-07 18:42                     ` David Miller
  0 siblings, 1 reply; 64+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-10-07 12:36 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: David Miller, kaber, netdev, netfilter-devel

On Tue, Oct 07, 2008 at 09:36:24AM +0200, KOVACS Krisztian wrote:
> Hi,
> 
> On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> > [...] 
> > > > > Why don't you add it to __inet_lookup, __inet6_lookup and the udp_lib
> > > > lookup routines? And please rename it to skb_steal_sock, as it acts on a
> > > > skb, not on a sock.
> > > 
> > > Those functions don't have access to the skb so unless we change the
> > > signature they won't be able to steal the reference.
> > 
> > Indeed, but we should try to have the main TCP code flow clean, ditto for
> > > DCCP, free of such details, so after this activity settles down I'll
> > submit something like the patch below.
> > 
> > If Dave agrees and you feel like merging it on your current patchset,
> > feel free to do it.
> 
> Ok, I'll pick this up. Didn't compile because of missing includes in
> inet_hashtables.h but I've fixed it.
> 
> -- 
> KOVACS Krisztian 
> 
> 
> inet_hashtables: Add inet_lookup_skb helpers
> 
> To be able to use the cached socket reference in the skb during input
> processing we add a new set of lookup functions that receive the skb on
> their argument list.
> 
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>

Thanks a lot!

- Arnaldo

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch] Update tproxy documentation
  2008-10-07  7:01     ` KOVACS Krisztian
@ 2008-10-07 13:25       ` Jan Engelhardt
  2008-10-07 19:50       ` [net-next PATCH 16/16] Add documentation David Miller
  1 sibling, 0 replies; 64+ messages in thread
From: Jan Engelhardt @ 2008-10-07 13:25 UTC (permalink / raw)
  To: kaber; +Cc: Netfilter Developer Mailing List, KOVACS Krisztian


On Tuesday 2008-10-07 03:01, KOVACS Krisztian wrote:
>
>I'm not opposed to the changes, though, so could you please send a patch
>on top of Dave's current net-next tree? Thanks.
>

This one goes on top of Patrick's net-next, because the file does not exist in
Dave's tree yet :)

commit b18b26b12062a7d1e866e0215e734561c7279259
Author: Jan Engelhardt <jengelh@medozas.de>
Date:   Tue Oct 7 09:20:39 2008 -0400

netfilter: update tproxy documentation

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 Documentation/networking/tproxy.txt |   24 ++++++++++++++----------
 1 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt
index 7b5996d..db7e808 100644
--- a/Documentation/networking/tproxy.txt
+++ b/Documentation/networking/tproxy.txt
@@ -27,15 +27,17 @@ modify your application to allow it to send datagrams _from_ non-local IP
 addresses. All you have to do is enable the (SOL_IP, IP_TRANSPARENT) socket
 option before calling bind:
 
-fd = socket(AF_INET, SOCK_STREAM, 0);
-/* - 8< -*/
-int value = 1;
-setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
-/* - 8< -*/
-name.sin_family = AF_INET;
-name.sin_port = htons(0xCAFE);
-name.sin_addr.s_addr = htonl(0xDEADBEEF);
-bind(fd, &name, sizeof(name));
+	struct sockaddr_in name;
+	int fd;
+	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+	/* - 8< -*/
+	static const unsigned int value = 1;
+	setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
+	/* - 8< -*/
+	name.sin_family = AF_INET;
+	name.sin_port = htons(0xCAFE);
+	inet_pton(PF_INET, "192.0.2.37", &name.sin_addr);
+	bind(fd, (const void *)&name, sizeof(name));
 
 A trivial patch for netcat is available here:
 http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch
@@ -50,7 +52,9 @@ limitations of that method. One of the major issues is that it actually
 modifies the packets to change the destination address -- which might not be
 acceptable in certain situations. (Think of proxying UDP for example: you won't
 be able to find out the original destination address. Even in case of TCP
-getting the original destination address is racy.)
+getting the original destination address is racy. Obtaining the address via
+getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, ...) also requires connection tracking,
+which may not be loaded or desired.)
 
 The 'TPROXY' target provides similar functionality without relying on NAT. Simply
 add rules like this to the iptables ruleset above:

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb
  2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
@ 2008-10-07 18:42                     ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-07 18:42 UTC (permalink / raw)
  To: acme; +Cc: hidden, kaber, netdev, netfilter-devel

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue, 7 Oct 2008 09:36:57 -0300

> On Tue, Oct 07, 2008 at 09:36:24AM +0200, KOVACS Krisztian wrote:
> > inet_hashtables: Add inet_lookup_skb helpers
> > 
> > To be able to use the cached socket reference in the skb during input
> > processing we add a new set of lookup functions that receive the skb on
> > their argument list.
> > 
> > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > 
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
> 
> Thanks a lot!

Applied to net-next-2.6

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH] Add udplib_lookup_skb() helpers
  2008-10-07 12:34                   ` Arnaldo Carvalho de Melo
@ 2008-10-07 19:39                     ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-07 19:39 UTC (permalink / raw)
  To: acme; +Cc: hidden, kaber, netdev, netfilter-devel

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue, 7 Oct 2008 09:34:54 -0300

> On Tue, Oct 07, 2008 at 09:42:12AM +0200, KOVACS Krisztian wrote:
> > Hi,
> > 
> > On Fri, 2008-10-03 at 10:47 -0300, Arnaldo Carvalho de Melo wrote:
> > > > Those functions don't have access to the skb so unless we change the
> > > > signature they won't be able to steal the reference.
> > > 
> > > > Indeed, but we should try to have the main TCP code flow clean, ditto for
> > > > DCCP, free of such details, so after this activity settles down I'll
> > > > submit something like the patch below.
> > > 
> > > If Dave agrees and you feel like merging it on your current patchset,
> > > feel free to do it.
> > 
> > And here are the skb lookup helpers for UDP.
> 
> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Applied to net-next-2.6

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb
  2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
@ 2008-10-07 19:41                     ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-07 19:41 UTC (permalink / raw)
  To: acme; +Cc: hidden, kaber, netdev, netfilter-devel

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue, 7 Oct 2008 09:36:26 -0300

> On Tue, Oct 07, 2008 at 09:59:29AM +0200, KOVACS Krisztian wrote:
> > Don't lookup the socket if there's a socket attached to the skb
> > 
> > Use the socket cached in the skb if it's present.
> > 
> > Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
 ...
> After you fix this up, please add:
> 
> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

I made the requested fixups for him and applied the result
to net-next-2.6, thanks!

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 16/16] Add documentation
  2008-10-07  7:01     ` KOVACS Krisztian
  2008-10-07 13:25       ` [patch] Update tproxy documentation Jan Engelhardt
@ 2008-10-07 19:50       ` David Miller
  2008-10-07 20:02         ` KOVACS Krisztian
  1 sibling, 1 reply; 64+ messages in thread
From: David Miller @ 2008-10-07 19:50 UTC (permalink / raw)
  To: hidden; +Cc: jengelh, kaber, netdev, netfilter-devel


Randy Dunlap asked for some corrections to this documentation patch,
and I also think that Patrick should take this one since it only
makes sense once the netfilter side of this patch set is present.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 16/16] Add documentation
  2008-10-07 19:50       ` [net-next PATCH 16/16] Add documentation David Miller
@ 2008-10-07 20:02         ` KOVACS Krisztian
  2008-10-07 20:47           ` Patrick McHardy
  0 siblings, 1 reply; 64+ messages in thread
From: KOVACS Krisztian @ 2008-10-07 20:02 UTC (permalink / raw)
  To: David Miller; +Cc: hidden, jengelh, kaber, netdev, netfilter-devel

Hi,

On Tue, Oct 07, 2008 at 12:50:51 -0700, David Miller wrote:
> Randy Dunlap asked for some corrections to this documentation patch,
> and I also think that Patrick should take this one since it only
> makes sense once the netfilter side of this patch set is present.

Sure, thanks a lot. The corrections from Randy Dunlap are already in
Patrick's tree.

-- 
KOVACS Krisztian

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 16/16] Add documentation
  2008-10-07 20:02         ` KOVACS Krisztian
@ 2008-10-07 20:47           ` Patrick McHardy
  2008-10-07 20:53             ` David Miller
  0 siblings, 1 reply; 64+ messages in thread
From: Patrick McHardy @ 2008-10-07 20:47 UTC (permalink / raw)
  To: David Miller, hidden, jengelh, kaber, netdev, netfilter-devel

KOVACS Krisztian wrote:
> Hi,
>
> On Tue, Oct 07, 2008 at 12:50:51 -0700, David Miller wrote:
>   
>> Randy Dunlap asked for some corrections to this documentation patch,
>> and I also think that Patrick should take this one since it only
>> makes sense once the netfilter side of this patch set is present.
>>     
>
> Sure, thanks a lot. The corrections from Randy Dunlap are already in
> Patrick's tree.
>   

Just FYI: I hope to finally get all the netfilter patches out
by tomorrow.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 16/16] Add documentation
  2008-10-07 20:47           ` Patrick McHardy
@ 2008-10-07 20:53             ` David Miller
  0 siblings, 0 replies; 64+ messages in thread
From: David Miller @ 2008-10-07 20:53 UTC (permalink / raw)
  To: kaber; +Cc: hidden, jengelh, netdev, netfilter-devel

From: Patrick McHardy <kaber@trash.net>
Date: Tue, 07 Oct 2008 22:47:30 +0200

> KOVACS Krisztian wrote:
> > Hi,
> >
> > On Tue, Oct 07, 2008 at 12:50:51 -0700, David Miller wrote:
> >   
> >> Randy Dunlap asked for some corrections to this documentation patch,
> >> and I also think that Patrick should take this one since it only
> >> makes sense once the netfilter side of this patch set is present.
> >>     
> >
> > Sure, thanks a lot. The corrections from Randy Dunlap are already in
> > Patrick's tree.
> >   
> 
> Just FYI: I hope to finally get all the netfilter patches out
> by tomorrow.

I was just about to ping you about this, thanks :-)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [net-next PATCH 16/16] Add documentation
  2008-10-03 14:01   ` [net-next " Jan Engelhardt
  2008-10-07  7:01     ` KOVACS Krisztian
@ 2008-10-08  0:32     ` Philip Craig
  1 sibling, 0 replies; 64+ messages in thread
From: Philip Craig @ 2008-10-08  0:32 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: KOVACS Krisztian, David Miller, Patrick McHardy, netdev,
	netfilter-devel

Jan Engelhardt wrote:
>> +2. Redirecting traffic
>> +======================
>> +
>> +Transparent proxying often involves "intercepting" traffic on a router. This is
>> +usually done with the iptables REDIRECT target, however, there are serious
>> +limitations of that method. One of the major issues is that it actually
>> +modifies the packets to change the destination address -- which might not be
>> +acceptable in certain situations. (Think of proxying UDP for example: you won't
>> +be able to find out the original destination address. Even in case of TCP
>> +getting the original destination address is racy.)
> 
> IIRC, you _can_ find out, though I agree it's rather a hack (with 
> tproxy, you can just use the address as received via recvmsg):
> 
> 	getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, &sockaddr, &sizeptr);

Yes, but the problem is that SO_ORIGINAL_DST is only implemented for TCP.
And I guess that the race for TCP is that the conntrack may not exist when you
call getsockopt() (not sure that is something you'll hit in practice though).


^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2008-10-08  0:32 UTC | newest]

Thread overview: 64+ messages
2008-10-01 14:24 [net-next PATCH 00/16] Transparent proxying patches, take six KOVACS Krisztian
2008-10-01 14:24 ` [net-next PATCH 07/16] Make Netfilter's ip_route_me_harder() non-local address compatible KOVACS Krisztian
2008-10-01 14:45   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 02/16] Implement IP_TRANSPARENT socket option KOVACS Krisztian
2008-10-01 14:30   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 15/16] iptables TPROXY target KOVACS Krisztian
2008-10-02  9:28   ` Patrick McHardy
2008-10-01 14:24 ` [net-next PATCH 08/16] Port redirection support for TCP KOVACS Krisztian
2008-10-01 14:47   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 09/16] Export UDP socket lookup function KOVACS Krisztian
2008-10-01 14:48   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 13/16] iptables tproxy core KOVACS Krisztian
2008-10-02  9:19   ` Patrick McHardy
2008-10-01 14:24 ` [net-next PATCH 05/16] Conditionally enable transparent flow flag when connecting KOVACS Krisztian
2008-10-01 14:36   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 16/16] Add documentation KOVACS Krisztian
2008-10-01 16:22   ` Randy Dunlap
2008-10-02  9:37     ` [RESEND net-next " KOVACS Krisztian
2008-10-02  9:38       ` Patrick McHardy
2008-10-03 14:01   ` [net-next " Jan Engelhardt
2008-10-07  7:01     ` KOVACS Krisztian
2008-10-07 13:25       ` [patch] Update tproxy documentation Jan Engelhardt
2008-10-07 19:50       ` [net-next PATCH 16/16] Add documentation David Miller
2008-10-07 20:02         ` KOVACS Krisztian
2008-10-07 20:47           ` Patrick McHardy
2008-10-07 20:53             ` David Miller
2008-10-08  0:32     ` Philip Craig
2008-10-01 14:24 ` [net-next PATCH 06/16] Handle TCP SYN+ACK/ACK/RST transparency KOVACS Krisztian
2008-10-01 14:42   ` David Miller
2008-10-01 14:46     ` KOVACS Krisztian
2008-10-01 14:24 ` [net-next PATCH 11/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
2008-10-01 14:24 ` [net-next PATCH 01/16] Loosen source address check on IPv4 output KOVACS Krisztian
2008-10-01 14:28   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 12/16] Split Netfilter IPv4 defragmentation into a separate module KOVACS Krisztian
2008-10-02  9:18   ` Patrick McHardy
2008-10-01 14:24 ` [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb KOVACS Krisztian
2008-10-01 14:50   ` David Miller
2008-10-01 15:38     ` KOVACS Krisztian
2008-10-01 15:51       ` David Miller
2008-10-02 15:43         ` KOVACS Krisztian
2008-10-02 17:09           ` Arnaldo Carvalho de Melo
2008-10-02 19:58             ` David Miller
2008-10-03  8:57             ` KOVACS Krisztian
2008-10-03 13:47               ` Arnaldo Carvalho de Melo
2008-10-07  7:36                 ` KOVACS Krisztian
2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
2008-10-07 18:42                     ` David Miller
2008-10-07  7:42                 ` [net-next PATCH] Add udplib_lookup_skb() helpers (was: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
2008-10-07 12:34                   ` Arnaldo Carvalho de Melo
2008-10-07 19:39                     ` [net-next PATCH] Add udplib_lookup_skb() helpers David Miller
2008-10-07  7:59                 ` [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb (was: Re: [net-next PATCH 10/16] Don't lookup the socket if there's a socket attached to the skb) KOVACS Krisztian
2008-10-07 12:36                   ` Arnaldo Carvalho de Melo
2008-10-07 19:41                     ` [net-next PATCH] Don't lookup the socket if there's a socket attached to the skb David Miller
2008-10-01 14:24 ` [net-next PATCH 14/16] iptables socket match KOVACS Krisztian
2008-10-02  9:26   ` Patrick McHardy
2008-10-02 10:26     ` KOVACS Krisztian
2008-10-02 10:35       ` Patrick McHardy
2008-10-03 14:04     ` Jan Engelhardt
2008-10-01 14:24 ` [net-next PATCH 04/16] Make inet_sock.h independent of route.h KOVACS Krisztian
2008-10-01 14:34   ` David Miller
2008-10-01 14:24 ` [net-next PATCH 03/16] Allow binding to non-local addresses if IP_TRANSPARENT is set KOVACS Krisztian
2008-10-01 14:31   ` David Miller
2008-10-02 13:20 ` [net-next PATCH 00/16] Transparent proxying patches, take six Amos Jeffries
2008-10-02 15:38   ` Patrick McHardy
