* [PATCH 0/3] tproxy fixes for current upstream code
@ 2010-10-20 11:21 KOVACS Krisztian
2010-10-20 11:21 ` [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple KOVACS Krisztian
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: KOVACS Krisztian @ 2010-10-20 11:21 UTC (permalink / raw)
To: netdev, netfilter-devel; +Cc: Patrick McHardy, David Miller
The following series fixes a handful of issues found in the current
upstream IPv4 tproxy code:
* an issue with how port redirection interacts with TCP TIME_WAIT sockets
* UDP socket lookup fixes so that now it prefers connected sockets, etc.
* fix for a bind hash issue which could trigger crashes when port redirection
was used.
---
Balazs Scheidler (2):
tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple
tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4()
KOVACS Krisztian (1):
tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()
include/net/inet_hashtables.h | 2 -
include/net/netfilter/nf_tproxy_core.h | 120 +++++++++++++++++++++++++++++++-
net/dccp/ipv4.c | 10 ++-
net/dccp/ipv6.c | 10 ++-
net/ipv4/inet_hashtables.c | 28 +++++++
net/ipv4/tcp_ipv4.c | 10 ++-
net/ipv6/tcp_ipv6.c | 12 ++-
net/netfilter/nf_tproxy_core.c | 35 ---------
net/netfilter/xt_TPROXY.c | 68 +++++++++++++++++-
net/netfilter/xt_socket.c | 2 -
10 files changed, 238 insertions(+), 59 deletions(-)
--
KOVACS Krisztian
* [PATCH 3/3] tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()
2010-10-20 11:21 [PATCH 0/3] tproxy fixes for current upstream code KOVACS Krisztian
2010-10-20 11:21 ` [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple KOVACS Krisztian
2010-10-20 11:21 ` [PATCH 2/3] tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4() KOVACS Krisztian
@ 2010-10-20 11:21 ` KOVACS Krisztian
2010-10-21 11:08 ` Patrick McHardy
2 siblings, 1 reply; 7+ messages in thread
From: KOVACS Krisztian @ 2010-10-20 11:21 UTC (permalink / raw)
To: netdev, netfilter-devel; +Cc: Patrick McHardy, David Miller
When __inet_inherit_port() is called on a tproxy connection the wrong locks are
held for the inet_bind_bucket it is added to. __inet_inherit_port() made an
implicit assumption that the listener's port number (and thus its bind bucket)
is the same as that of the child socket. Unfortunately, if you're using the
TPROXY target to redirect skbs to a transparent proxy, that assumption is not
true anymore and things break.
This patch adds code to __inet_inherit_port() so that it can handle this case
by looking up or creating a new bind bucket for the child socket and updates
callers of __inet_inherit_port() to gracefully handle __inet_inherit_port()
failing.
Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>.
See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion.
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
include/net/inet_hashtables.h | 2 +-
net/dccp/ipv4.c | 10 +++++++---
net/dccp/ipv6.c | 10 +++++++---
net/ipv4/inet_hashtables.c | 28 ++++++++++++++++++++++++++--
net/ipv4/tcp_ipv4.c | 10 +++++++---
net/ipv6/tcp_ipv6.c | 12 ++++++++----
6 files changed, 56 insertions(+), 16 deletions(-)
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 74358d1..e9c2ed8 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -245,7 +245,7 @@ static inline int inet_sk_listen_hashfn(const struct sock *sk)
}
/* Caller must disable local BH processing. */
-extern void __inet_inherit_port(struct sock *sk, struct sock *child);
+extern int __inet_inherit_port(struct sock *sk, struct sock *child);
extern void inet_put_port(struct sock *sk);
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index d4a166f..3f69ea1 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -392,7 +392,7 @@ struct sock *dccp_v4_request_recv_sock(struct sock *sk, struct sk_buff *skb,
newsk = dccp_create_openreq_child(sk, req, skb);
if (newsk == NULL)
- goto exit;
+ goto exit_nonewsk;
sk_setup_caps(newsk, dst);
@@ -409,16 +409,20 @@ struct sock *dccp_v4_request_recv_sock(struct sock *sk, struct sk_buff *skb,
dccp_sync_mss(newsk, dst_mtu(dst));
+ if (__inet_inherit_port(sk, newsk) < 0) {
+ sock_put(newsk);
+ goto exit;
+ }
__inet_hash_nolisten(newsk, NULL);
- __inet_inherit_port(sk, newsk);
return newsk;
exit_overflow:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+exit_nonewsk:
+ dst_release(dst);
exit:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
- dst_release(dst);
return NULL;
}
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 6e3f325..dca711d 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -564,7 +564,7 @@ static struct sock *dccp_v6_request_recv_sock(struct sock *sk,
newsk = dccp_create_openreq_child(sk, req, skb);
if (newsk == NULL)
- goto out;
+ goto out_nonewsk;
/*
* No need to charge this sock to the relevant IPv6 refcnt debug socks
@@ -632,18 +632,22 @@ static struct sock *dccp_v6_request_recv_sock(struct sock *sk,
newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6;
newinet->inet_rcv_saddr = LOOPBACK4_IPV6;
+ if (__inet_inherit_port(sk, newsk) < 0) {
+ sock_put(newsk);
+ goto out;
+ }
__inet6_hash(newsk, NULL);
- __inet_inherit_port(sk, newsk);
return newsk;
out_overflow:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+out_nonewsk:
+ dst_release(dst);
out:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
if (opt != NULL && opt != np->opt)
sock_kfree_s(sk, opt, opt->tot_len);
- dst_release(dst);
return NULL;
}
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index fb7ad5a..1b344f3 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -101,19 +101,43 @@ void inet_put_port(struct sock *sk)
}
EXPORT_SYMBOL(inet_put_port);
-void __inet_inherit_port(struct sock *sk, struct sock *child)
+int __inet_inherit_port(struct sock *sk, struct sock *child)
{
struct inet_hashinfo *table = sk->sk_prot->h.hashinfo;
- const int bhash = inet_bhashfn(sock_net(sk), inet_sk(child)->inet_num,
+ unsigned short port = inet_sk(child)->inet_num;
+ const int bhash = inet_bhashfn(sock_net(sk), port,
table->bhash_size);
struct inet_bind_hashbucket *head = &table->bhash[bhash];
struct inet_bind_bucket *tb;
spin_lock(&head->lock);
tb = inet_csk(sk)->icsk_bind_hash;
+ if (tb->port != port) {
+ /* NOTE: using tproxy and redirecting skbs to a proxy
+ * on a different listener port breaks the assumption
+ * that the listener socket's icsk_bind_hash is the same
+ * as that of the child socket. We have to look up or
+ * create a new bind bucket for the child here. */
+ struct hlist_node *node;
+ inet_bind_bucket_for_each(tb, node, &head->chain) {
+ if (net_eq(ib_net(tb), sock_net(sk)) &&
+ tb->port == port)
+ break;
+ }
+ if (!node) {
+ tb = inet_bind_bucket_create(table->bind_bucket_cachep,
+ sock_net(sk), head, port);
+ if (!tb) {
+ spin_unlock(&head->lock);
+ return -ENOMEM;
+ }
+ }
+ }
sk_add_bind_node(child, &tb->owners);
inet_csk(child)->icsk_bind_hash = tb;
spin_unlock(&head->lock);
+
+ return 0;
}
EXPORT_SYMBOL_GPL(__inet_inherit_port);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index a0232f3..8f8527d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1422,7 +1422,7 @@ struct sock *tcp_v4_syn_recv_sock(struct sock *sk, struct sk_buff *skb,
newsk = tcp_create_openreq_child(sk, req, skb);
if (!newsk)
- goto exit;
+ goto exit_nonewsk;
newsk->sk_gso_type = SKB_GSO_TCPV4;
sk_setup_caps(newsk, dst);
@@ -1469,16 +1469,20 @@ struct sock *tcp_v4_syn_recv_sock(struct sock *sk, struct sk_buff *skb,
}
#endif
+ if (__inet_inherit_port(sk, newsk) < 0) {
+ sock_put(newsk);
+ goto exit;
+ }
__inet_hash_nolisten(newsk, NULL);
- __inet_inherit_port(sk, newsk);
return newsk;
exit_overflow:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+exit_nonewsk:
+ dst_release(dst);
exit:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
- dst_release(dst);
return NULL;
}
EXPORT_SYMBOL(tcp_v4_syn_recv_sock);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8d93f6d..7e41e2c 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1409,7 +1409,7 @@ static struct sock * tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb,
newsk = tcp_create_openreq_child(sk, req, skb);
if (newsk == NULL)
- goto out;
+ goto out_nonewsk;
/*
* No need to charge this sock to the relevant IPv6 refcnt debug socks
@@ -1497,18 +1497,22 @@ static struct sock * tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb,
}
#endif
+ if (__inet_inherit_port(sk, newsk) < 0) {
+ sock_put(newsk);
+ goto out;
+ }
__inet6_hash(newsk, NULL);
- __inet_inherit_port(sk, newsk);
return newsk;
out_overflow:
NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
-out:
- NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
+out_nonewsk:
if (opt && opt != np->opt)
sock_kfree_s(sk, opt, opt->tot_len);
dst_release(dst);
+out:
+ NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
return NULL;
}
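
For readers following the __inet_inherit_port() change above, the "look up or
create a bind bucket for the child" step can be sketched in plain userspace C.
This is only an illustration under simplifying assumptions, not the kernel
code: the struct and function names (bind_bucket, bind_head,
bucket_find_or_create, inherit_port) are hypothetical, a single chain stands in
for the hashed bhash table, and locking and network namespaces are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct inet_bind_bucket. */
struct bind_bucket {
	unsigned short port;
	int owners;                 /* number of sockets attached to this bucket */
	struct bind_bucket *next;
};

/* Hypothetical stand-in for one inet_bind_hashbucket chain. */
struct bind_head {
	struct bind_bucket *chain;
};

/* Walk the chain for a bucket with this port; create one if none exists. */
struct bind_bucket *bucket_find_or_create(struct bind_head *head,
					  unsigned short port)
{
	struct bind_bucket *tb;

	for (tb = head->chain; tb != NULL; tb = tb->next)
		if (tb->port == port)
			return tb;

	tb = calloc(1, sizeof(*tb));
	if (tb == NULL)
		return NULL;
	tb->port = port;
	tb->next = head->chain;
	head->chain = tb;
	return tb;
}

/* Attach the child to the listener's bucket only when the ports agree;
 * otherwise find or create the bucket matching the child's own port.
 * Returns 0 on success or -ENOMEM if a new bucket cannot be allocated. */
int inherit_port(struct bind_head *head, struct bind_bucket *listener_tb,
		 unsigned short child_port, struct bind_bucket **child_tb)
{
	struct bind_bucket *tb = listener_tb;

	if (tb->port != child_port) {
		tb = bucket_find_or_create(head, child_port);
		if (tb == NULL)
			return -ENOMEM;
	}
	tb->owners++;
	*child_tb = tb;
	return 0;
}
```

With a listener on a redirected port (say 50080) and a child whose local port
is 80, inherit_port() in this sketch hands the child its own port-80 bucket
rather than the listener's. Because the allocation can fail, the function has
to return a status, which is why the callers in the patch now check the result.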
* [PATCH 2/3] tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4()
2010-10-20 11:21 [PATCH 0/3] tproxy fixes for current upstream code KOVACS Krisztian
2010-10-20 11:21 ` [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple KOVACS Krisztian
@ 2010-10-20 11:21 ` KOVACS Krisztian
2010-10-21 10:48 ` Patrick McHardy
2010-10-20 11:21 ` [PATCH 3/3] tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() KOVACS Krisztian
2 siblings, 1 reply; 7+ messages in thread
From: KOVACS Krisztian @ 2010-10-20 11:21 UTC (permalink / raw)
To: netdev, netfilter-devel; +Cc: Patrick McHardy, David Miller
From: Balazs Scheidler <bazsi@balabit.hu>
Also, inline this function as the lookup_type is always a literal
and inlining removes branches performed at runtime.
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
include/net/netfilter/nf_tproxy_core.h | 116 +++++++++++++++++++++++++++++++-
net/netfilter/nf_tproxy_core.c | 48 -------------
2 files changed, 114 insertions(+), 50 deletions(-)
diff --git a/include/net/netfilter/nf_tproxy_core.h b/include/net/netfilter/nf_tproxy_core.h
index b3a8942..1027d7f 100644
--- a/include/net/netfilter/nf_tproxy_core.h
+++ b/include/net/netfilter/nf_tproxy_core.h
@@ -13,11 +13,123 @@
#define NFT_LOOKUP_ESTABLISHED 2
/* look up and get a reference to a matching socket */
-extern struct sock *
+
+
+/* This function is used by the 'TPROXY' target and the 'socket'
+ * match. The following lookups are supported:
+ *
+ * Explicit TProxy target rule
+ * ===========================
+ *
+ * This is used when the user wants to intercept a connection matching
+ * an explicit iptables rule. In this case the sockets are assumed
+ * matching in preference order:
+ *
+ * - match: if there's a fully established connection matching the
+ * _packet_ tuple, it is returned, assuming the redirection
+ * already took place and we process a packet belonging to an
+ * established connection
+ *
+ * - match: if there's a listening socket matching the redirection
+ * (e.g. on-port & on-ip of the connection), it is returned,
+ * regardless if it was bound to 0.0.0.0 or an explicit
+ * address. The reasoning is that if there's an explicit rule, it
+ * does not really matter if the listener is bound to an interface
+ * or to 0. The user already stated that he wants redirection
+ * (since he added the rule).
+ *
+ * "socket" match based redirection (no specific rule)
+ * ===================================================
+ *
+ * There are connections with dynamic endpoints (e.g. FTP data
+ * connection) that the user is unable to add explicit rules
+ * for. These are taken care of by a generic "socket" rule. It is
+ * assumed that the proxy application is trusted to open such
+ * connections without explicit iptables rule (except of course the
+ * generic 'socket' rule). In this case the following sockets are
+ * matched in preference order:
+ *
+ * - match: if there's a fully established connection matching the
+ * _packet_ tuple
+ *
+ * - match: if there's a non-zero bound listener (possibly with a
+ * non-local address) We don't accept zero-bound listeners, since
+ * then local services could intercept traffic going through the
+ * box.
+ *
+ * Please note that there's an overlap between what a TPROXY target
+ * and a socket match will match. Normally if you have both rules the
+ * "socket" match will be the first one, effectively all packets
+ * belonging to established connections going through that one.
+ */
+static inline struct sock *
nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
const __be32 saddr, const __be32 daddr,
const __be16 sport, const __be16 dport,
- const struct net_device *in, int lookup_type);
+ const struct net_device *in, int lookup_type)
+{
+ struct sock *sk;
+
+ /* look up socket */
+ switch (protocol) {
+ case IPPROTO_TCP:
+ switch (lookup_type) {
+ case NFT_LOOKUP_ANY:
+ sk = __inet_lookup(net, &tcp_hashinfo,
+ saddr, sport, daddr, dport,
+ in->ifindex);
+ break;
+ case NFT_LOOKUP_LISTENER:
+ sk = inet_lookup_listener(net, &tcp_hashinfo,
+ daddr, dport,
+ in->ifindex);
+
+ /* NOTE: we return listeners even if bound to
+ * 0.0.0.0, those are filtered out in
+ * xt_socket, since xt_TPROXY needs 0 bound
+ * listeners too */
+
+ break;
+ case NFT_LOOKUP_ESTABLISHED:
+ sk = inet_lookup_established(net, &tcp_hashinfo,
+ saddr, sport, daddr, dport,
+ in->ifindex);
+ break;
+ default:
+ WARN_ON(1);
+ sk = NULL;
+ break;
+ }
+ break;
+ case IPPROTO_UDP:
+ sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
+ in->ifindex);
+ if (sk && lookup_type != NFT_LOOKUP_ANY) {
+ int connected = (sk->sk_state == TCP_ESTABLISHED);
+ int wildcard = (inet_sk(sk)->inet_rcv_saddr == 0);
+
+ /* NOTE: we return listeners even if bound to
+ * 0.0.0.0, those are filtered out in
+ * xt_socket, since xt_TPROXY needs 0 bound
+ * listeners too */
+ if ((lookup_type == NFT_LOOKUP_ESTABLISHED && (!connected || wildcard)) ||
+ (lookup_type == NFT_LOOKUP_LISTENER && connected)) {
+ sock_put(sk);
+ sk = NULL;
+ }
+ }
+ break;
+ default:
+ WARN_ON(1);
+ sk = NULL;
+ }
+
+ pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, lookup type: %d, sock %p\n",
+ protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), lookup_type, sk);
+
+ return sk;
+}
+
static inline void
nf_tproxy_put_sock(struct sock *sk)
diff --git a/net/netfilter/nf_tproxy_core.c b/net/netfilter/nf_tproxy_core.c
index 2ce945c..4d87bef 100644
--- a/net/netfilter/nf_tproxy_core.c
+++ b/net/netfilter/nf_tproxy_core.c
@@ -18,54 +18,6 @@
#include <net/udp.h>
#include <net/netfilter/nf_tproxy_core.h>
-struct sock *
-nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
- const __be32 saddr, const __be32 daddr,
- const __be16 sport, const __be16 dport,
- const struct net_device *in, int lookup_type)
-{
- struct sock *sk;
-
- /* look up socket */
- switch (protocol) {
- case IPPROTO_TCP:
- switch (lookup_type) {
- case NFT_LOOKUP_ANY:
- sk = __inet_lookup(net, &tcp_hashinfo,
- saddr, sport, daddr, dport,
- in->ifindex);
- break;
- case NFT_LOOKUP_LISTENER:
- sk = inet_lookup_listener(net, &tcp_hashinfo,
- daddr, dport,
- in->ifindex);
- break;
- case NFT_LOOKUP_ESTABLISHED:
- sk = inet_lookup_established(net, &tcp_hashinfo,
- saddr, sport, daddr, dport,
- in->ifindex);
- break;
- default:
- WARN_ON(1);
- sk = NULL;
- break;
- }
- break;
- case IPPROTO_UDP:
- sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
- in->ifindex);
- break;
- default:
- WARN_ON(1);
- sk = NULL;
- }
-
- pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, lookup type: %d, sock %p\n",
- protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), lookup_type, sk);
-
- return sk;
-}
-EXPORT_SYMBOL_GPL(nf_tproxy_get_sock_v4);
static void
nf_tproxy_destructor(struct sk_buff *skb)
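
The UDP filtering that this patch adds after udp4_lib_lookup() reduces to a
small predicate over the lookup type and two socket properties. The sketch
below restates that condition as standalone C so it can be checked in
isolation. The enum and the function name udp_lookup_accepts are hypothetical;
the enum values mirror NFT_LOOKUP_ANY, NFT_LOOKUP_LISTENER and
NFT_LOOKUP_ESTABLISHED from patch 1/3.

```c
#include <stdbool.h>

/* Mirrors the NFT_LOOKUP_* constants; the names here are illustrative only. */
enum lookup_type {
	LOOKUP_ANY = 0,
	LOOKUP_LISTENER = 1,
	LOOKUP_ESTABLISHED = 2
};

/* Returns true if a UDP socket found by the generic lookup should be kept.
 *   connected: the socket is in the TCP_ESTABLISHED state (connect()ed)
 *   wildcard:  the socket's receive address is 0.0.0.0
 */
bool udp_lookup_accepts(enum lookup_type type, bool connected, bool wildcard)
{
	if (type == LOOKUP_ANY)
		return true;
	/* An "established" lookup wants a connected, non-wildcard socket. */
	if (type == LOOKUP_ESTABLISHED && (!connected || wildcard))
		return false;
	/* A "listener" lookup wants an unconnected socket, wildcard or not. */
	if (type == LOOKUP_LISTENER && connected)
		return false;
	return true;
}
```

In the patch itself a rejected socket is released with sock_put() and the
lookup returns NULL. The sketch covers only the accept/reject decision.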
* [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple
2010-10-20 11:21 [PATCH 0/3] tproxy fixes for current upstream code KOVACS Krisztian
@ 2010-10-20 11:21 ` KOVACS Krisztian
2010-10-21 10:45 ` Patrick McHardy
2010-10-20 11:21 ` [PATCH 2/3] tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4() KOVACS Krisztian
2010-10-20 11:21 ` [PATCH 3/3] tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() KOVACS Krisztian
2 siblings, 1 reply; 7+ messages in thread
From: KOVACS Krisztian @ 2010-10-20 11:21 UTC (permalink / raw)
To: netdev, netfilter-devel; +Cc: Patrick McHardy, David Miller
From: Balazs Scheidler <bazsi@balabit.hu>
Without tproxy redirections an incoming SYN kicks out conflicting
TIME_WAIT sockets, in order to handle clients that reuse ports
within the TIME_WAIT period.
The same mechanism didn't work in case TProxy is involved in finding
the proper socket, as the time_wait processing code looked up the
listening socket assuming that the listener addr/port matches those
of the established connection.
This is not the case with TProxy as the listener addr/port is possibly
changed with the tproxy rule.
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
include/net/netfilter/nf_tproxy_core.h | 6 ++-
net/netfilter/nf_tproxy_core.c | 29 ++++++++++----
net/netfilter/xt_TPROXY.c | 68 ++++++++++++++++++++++++++++++--
net/netfilter/xt_socket.c | 2 -
4 files changed, 90 insertions(+), 15 deletions(-)
diff --git a/include/net/netfilter/nf_tproxy_core.h b/include/net/netfilter/nf_tproxy_core.h
index 208b46f..b3a8942 100644
--- a/include/net/netfilter/nf_tproxy_core.h
+++ b/include/net/netfilter/nf_tproxy_core.h
@@ -8,12 +8,16 @@
#include <net/inet_sock.h>
#include <net/tcp.h>
+#define NFT_LOOKUP_ANY 0
+#define NFT_LOOKUP_LISTENER 1
+#define NFT_LOOKUP_ESTABLISHED 2
+
/* look up and get a reference to a matching socket */
extern struct sock *
nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
const __be32 saddr, const __be32 daddr,
const __be16 sport, const __be16 dport,
- const struct net_device *in, bool listening);
+ const struct net_device *in, int lookup_type);
static inline void
nf_tproxy_put_sock(struct sock *sk)
diff --git a/net/netfilter/nf_tproxy_core.c b/net/netfilter/nf_tproxy_core.c
index daab8c4..2ce945c 100644
--- a/net/netfilter/nf_tproxy_core.c
+++ b/net/netfilter/nf_tproxy_core.c
@@ -22,21 +22,34 @@ struct sock *
nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
const __be32 saddr, const __be32 daddr,
const __be16 sport, const __be16 dport,
- const struct net_device *in, bool listening_only)
+ const struct net_device *in, int lookup_type)
{
struct sock *sk;
/* look up socket */
switch (protocol) {
case IPPROTO_TCP:
- if (listening_only)
- sk = __inet_lookup_listener(net, &tcp_hashinfo,
- daddr, ntohs(dport),
- in->ifindex);
- else
+ switch (lookup_type) {
+ case NFT_LOOKUP_ANY:
sk = __inet_lookup(net, &tcp_hashinfo,
saddr, sport, daddr, dport,
in->ifindex);
+ break;
+ case NFT_LOOKUP_LISTENER:
+ sk = inet_lookup_listener(net, &tcp_hashinfo,
+ daddr, dport,
+ in->ifindex);
+ break;
+ case NFT_LOOKUP_ESTABLISHED:
+ sk = inet_lookup_established(net, &tcp_hashinfo,
+ saddr, sport, daddr, dport,
+ in->ifindex);
+ break;
+ default:
+ WARN_ON(1);
+ sk = NULL;
+ break;
+ }
break;
case IPPROTO_UDP:
sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
@@ -47,8 +60,8 @@ nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
sk = NULL;
}
- pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, listener only: %d, sock %p\n",
- protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), listening_only, sk);
+ pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, lookup type: %d, sock %p\n",
+ protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), lookup_type, sk);
return sk;
}
diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
index c61294d..67cbed8 100644
--- a/net/netfilter/xt_TPROXY.c
+++ b/net/netfilter/xt_TPROXY.c
@@ -24,6 +24,57 @@
#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
#include <net/netfilter/nf_tproxy_core.h>
+/**
+ * tproxy_handle_time_wait() - handle TCP TIME_WAIT reopen redirections
+ * @skb: The skb being processed.
+ * @par: Iptables target parameters.
+ * @sk: The TIME_WAIT TCP socket found by the lookup.
+ *
+ * We have to handle SYN packets arriving to TIME_WAIT sockets
+ * differently: instead of reopening the connection we should rather
+ * redirect the new connection to the proxy if there's a listener
+ * socket present.
+ *
+ * tproxy_handle_time_wait() consumes the socket reference passed in.
+ *
+ * Returns the listener socket if there's one, the TIME_WAIT socket if
+ * no such listener is found, or NULL if the TCP header is incomplete.
+ */
+static struct sock *
+tproxy_handle_time_wait(struct sk_buff *skb, const struct xt_action_param *par, struct sock *sk)
+{
+ const struct iphdr *iph = ip_hdr(skb);
+ const struct xt_tproxy_target_info *tgi = par->targinfo;
+ struct tcphdr _hdr, *hp;
+
+ hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &_hdr);
+ if (hp == NULL) {
+ inet_twsk_put(inet_twsk(sk));
+ return NULL;
+ }
+
+ if (hp->syn && !hp->rst && !hp->ack && !hp->fin) {
+ /* SYN to a TIME_WAIT socket, we'd rather redirect it
+ * to a listener socket if there's one */
+ struct sock *sk2;
+
+ sk2 = nf_tproxy_get_sock_v4(dev_net(skb->dev), iph->protocol,
+ iph->saddr, tgi->laddr ? tgi->laddr : iph->daddr,
+ hp->source, tgi->lport ? tgi->lport : hp->dest,
+ par->in, NFT_LOOKUP_LISTENER);
+ if (sk2) {
+ /* yeah, there's one, let's kill the TIME_WAIT
+ * socket and redirect to the listener
+ */
+ inet_twsk_deschedule(inet_twsk(sk), &tcp_death_row);
+ inet_twsk_put(inet_twsk(sk));
+ sk = sk2;
+ }
+ }
+
+ return sk;
+}
+
static unsigned int
tproxy_tg(struct sk_buff *skb, const struct xt_action_param *par)
{
@@ -37,11 +88,18 @@ tproxy_tg(struct sk_buff *skb, const struct xt_action_param *par)
return NF_DROP;
sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), iph->protocol,
- iph->saddr,
- tgi->laddr ? tgi->laddr : iph->daddr,
- hp->source,
- tgi->lport ? tgi->lport : hp->dest,
- par->in, true);
+ iph->saddr, iph->daddr,
+ hp->source, hp->dest,
+ par->in, NFT_LOOKUP_ESTABLISHED);
+
+ /* UDP has no TCP_TIME_WAIT state, so we never enter here */
+ if (sk && sk->sk_state == TCP_TIME_WAIT)
+ sk = tproxy_handle_time_wait(skb, par, sk);
+ else if (!sk)
+ sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), iph->protocol,
+ iph->saddr, tgi->laddr ? tgi->laddr : iph->daddr,
+ hp->source, tgi->lport ? tgi->lport : hp->dest,
+ par->in, NFT_LOOKUP_LISTENER);
/* NOTE: assign_sock consumes our sk reference */
if (sk && nf_tproxy_assign_sock(skb, sk)) {
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
index 1ca8990..266faa0 100644
--- a/net/netfilter/xt_socket.c
+++ b/net/netfilter/xt_socket.c
@@ -142,7 +142,7 @@ socket_match(const struct sk_buff *skb, struct xt_action_param *par,
#endif
sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), protocol,
- saddr, daddr, sport, dport, par->in, false);
+ saddr, daddr, sport, dport, par->in, NFT_LOOKUP_ANY);
if (sk != NULL) {
bool wildcard;
bool transparent = true;
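
The socket selection that tproxy_tg() performs after this patch can be
summarised as a pure decision function. The following is a rough userspace
model, assuming the first lookup (by packet tuple) and the second lookup (for a
listener on the redirect address and port) have already been done. The enum
found_sock and the functions is_pure_syn and tproxy_pick_socket are
hypothetical names used for illustration; reference counting is not modelled.

```c
#include <stdbool.h>

/* Result of a socket lookup; illustrative names, not kernel types. */
enum found_sock {
	FOUND_NONE,
	FOUND_ESTABLISHED,
	FOUND_TIME_WAIT,
	FOUND_LISTENER
};

/* Same flag test as in tproxy_handle_time_wait(): SYN set, RST/ACK/FIN clear. */
bool is_pure_syn(bool syn, bool rst, bool ack, bool fin)
{
	return syn && !rst && !ack && !fin;
}

/* by_packet_tuple:  what the NFT_LOOKUP_ESTABLISHED lookup returned
 * listener_present: whether the NFT_LOOKUP_LISTENER lookup on the redirect
 *                   address/port would find a socket
 * pure_syn:         whether the packet is an initial SYN
 */
enum found_sock tproxy_pick_socket(enum found_sock by_packet_tuple,
				   bool listener_present, bool pure_syn)
{
	if (by_packet_tuple == FOUND_TIME_WAIT) {
		/* A new SYN on a TIME_WAIT tuple is handed to the listener,
		 * if there is one; the TIME_WAIT socket is then descheduled. */
		if (pure_syn && listener_present)
			return FOUND_LISTENER;
		return FOUND_TIME_WAIT;
	}
	if (by_packet_tuple == FOUND_NONE)
		return listener_present ? FOUND_LISTENER : FOUND_NONE;
	return by_packet_tuple;
}
```

The case the patch fixes is the first branch: a pure SYN that matches a
TIME_WAIT socket is redirected to the proxy's listener instead of being left
with the TIME_WAIT socket.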
* Re: [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple
2010-10-20 11:21 ` [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple KOVACS Krisztian
@ 2010-10-21 10:45 ` Patrick McHardy
0 siblings, 0 replies; 7+ messages in thread
From: Patrick McHardy @ 2010-10-21 10:45 UTC (permalink / raw)
To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, David Miller
On 20.10.2010 13:21, KOVACS Krisztian wrote:
> Without tproxy redirections an incoming SYN kicks out conflicting
> TIME_WAIT sockets, in order to handle clients that reuse ports
> within the TIME_WAIT period.
>
> The same mechanism didn't work in case TProxy is involved in finding
> the proper socket, as the time_wait processing code looked up the
> listening socket assuming that the listener addr/port matches those
> of the established connection.
>
> This is not the case with TProxy as the listener addr/port is possibly
> changed with the tproxy rule.
Applied, thanks.
* Re: [PATCH 2/3] tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4()
2010-10-20 11:21 ` [PATCH 2/3] tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4() KOVACS Krisztian
@ 2010-10-21 10:48 ` Patrick McHardy
0 siblings, 0 replies; 7+ messages in thread
From: Patrick McHardy @ 2010-10-21 10:48 UTC (permalink / raw)
To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, David Miller
On 20.10.2010 13:21, KOVACS Krisztian wrote:
> Also, inline this function as the lookup_type is always a literal
> and inlining removes branches performed at runtime.
Applied, thanks.
* Re: [PATCH 3/3] tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()
2010-10-20 11:21 ` [PATCH 3/3] tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() KOVACS Krisztian
@ 2010-10-21 11:08 ` Patrick McHardy
0 siblings, 0 replies; 7+ messages in thread
From: Patrick McHardy @ 2010-10-21 11:08 UTC (permalink / raw)
To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, David Miller
On 20.10.2010 13:21, KOVACS Krisztian wrote:
> When __inet_inherit_port() is called on a tproxy connection the wrong locks are
> held for the inet_bind_bucket it is added to. __inet_inherit_port() made an
> implicit assumption that the listener's port number (and thus its bind bucket)
> is the same as that of the child socket.
> Unfortunately, if you're using the TPROXY target to redirect skbs to a
> transparent proxy that assumption is not true anymore and things break.
>
> This patch adds code to __inet_inherit_port() so that it can handle this case
> by looking up or creating a new bind bucket for the child socket and updates
> callers of __inet_inherit_port() to gracefully handle __inet_inherit_port()
> failing.
>
> Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>.
> See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion.
Applied, thanks.