netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: kaber@trash.net
To: davem@davemloft.net
Cc: netfilter-devel@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 08/72] ipvs: make rerouting optional with snat_reroute
Date: Thu, 21 Oct 2010 17:18:55 +0200	[thread overview]
Message-ID: <1287674399-31455-9-git-send-email-kaber@trash.net> (raw)
In-Reply-To: <1287674399-31455-1-git-send-email-kaber@trash.net>

From: Julian Anastasov <ja@ssi.bg>

	Add new sysctl flag "snat_reroute". Recent kernels use
ip_route_me_harder() to route LVS-NAT responses properly by
VIP when there are multiple paths to client. But setups
that do not have alternative default routes can skip this
routing lookup by using snat_reroute=0.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/net/ip_vs.h             |    1 +
 net/netfilter/ipvs/ip_vs_core.c |   37 +++++++++++++++++++++++++++++--------
 net/netfilter/ipvs/ip_vs_ctl.c  |    8 ++++++++
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index e8ec523..3915a4f 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -801,6 +801,7 @@ extern int sysctl_ip_vs_expire_quiescent_template;
 extern int sysctl_ip_vs_sync_threshold[2];
 extern int sysctl_ip_vs_nat_icmp_send;
 extern int sysctl_ip_vs_conntrack;
+extern int sysctl_ip_vs_snat_reroute;
 extern struct ip_vs_stats ip_vs_stats;
 extern const struct ctl_path net_vs_ctl_path[];
 
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 7fbc80d..06c388b 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -929,20 +929,31 @@ handle_response(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
 		ip_send_check(ip_hdr(skb));
 	}
 
+	/*
+	 * nf_iterate does not expect change in the skb->dst->dev.
+	 * It looks like it is not fatal to enable this code for hooks
+	 * where our handlers are at the end of the chain list and
+	 * when all next handlers use skb->dst->dev and not outdev.
+	 * It will definitely route properly the inout NAT traffic
+	 * when multiple paths are used.
+	 */
+
 	/* For policy routing, packets originating from this
 	 * machine itself may be routed differently to packets
 	 * passing through.  We want this packet to be routed as
 	 * if it came from this machine itself.  So re-compute
 	 * the routing information.
 	 */
+	if (sysctl_ip_vs_snat_reroute) {
 #ifdef CONFIG_IP_VS_IPV6
-	if (af == AF_INET6) {
-		if (ip6_route_me_harder(skb) != 0)
-			goto drop;
-	} else
+		if (af == AF_INET6) {
+			if (ip6_route_me_harder(skb) != 0)
+				goto drop;
+		} else
 #endif
-		if (ip_route_me_harder(skb, RTN_LOCAL) != 0)
-			goto drop;
+			if (ip_route_me_harder(skb, RTN_LOCAL) != 0)
+				goto drop;
+	}
 
 	IP_VS_DBG_PKT(10, pp, skb, 0, "After SNAT");
 
@@ -991,8 +1002,13 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb,
 		if (unlikely(iph.protocol == IPPROTO_ICMPV6)) {
 			int related, verdict = ip_vs_out_icmp_v6(skb, &related);
 
-			if (related)
+			if (related) {
+				if (sysctl_ip_vs_snat_reroute &&
+					NF_ACCEPT == verdict &&
+					ip6_route_me_harder(skb))
+					verdict = NF_DROP;
 				return verdict;
+			}
 			ip_vs_fill_iphdr(af, skb_network_header(skb), &iph);
 		}
 	} else
@@ -1000,8 +1016,13 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb,
 		if (unlikely(iph.protocol == IPPROTO_ICMP)) {
 			int related, verdict = ip_vs_out_icmp(skb, &related);
 
-			if (related)
+			if (related) {
+				if (sysctl_ip_vs_snat_reroute &&
+					NF_ACCEPT == verdict &&
+					ip_route_me_harder(skb, RTN_LOCAL))
+					verdict = NF_DROP;
 				return verdict;
+			}
 			ip_vs_fill_iphdr(af, skb_network_header(skb), &iph);
 		}
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index d2d842f..e637cd0 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -91,6 +91,7 @@ int sysctl_ip_vs_nat_icmp_send = 0;
 #ifdef CONFIG_IP_VS_NFCT
 int sysctl_ip_vs_conntrack;
 #endif
+int sysctl_ip_vs_snat_reroute = 1;
 
 
 #ifdef CONFIG_IP_VS_DEBUG
@@ -1599,6 +1600,13 @@ static struct ctl_table vs_vars[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_do_defense_mode,
 	},
+	{
+		.procname	= "snat_reroute",
+		.data		= &sysctl_ip_vs_snat_reroute,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
 #if 0
 	{
 		.procname	= "timeout_established",
-- 
1.7.1


  parent reply	other threads:[~2010-10-21 15:18 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-21 15:18 [PATCH 00/72] netfilter: netfilter update for 2.6.37 kaber
2010-10-21 15:18 ` [PATCH 01/72] netfilter: nf_nat: add nf_nat_csum() kaber
2010-10-21 15:18 ` [PATCH 02/72] netfilter: use NFPROTO_IPV4 instead of AF_INET kaber
2010-10-21 15:18 ` [PATCH 03/72] netfilter: nf_nat_core: don't check if the tuple is used if there is no other choice kaber
2010-10-21 15:18 ` [PATCH 04/72] netfilter: nf_nat: no IP_NAT_RANGE_MAP_IPS flags when alloc_null_binding() kaber
2010-10-21 15:18 ` [PATCH 05/72] netfilter: nf_conntrack: fix the hash random initializing race kaber
2010-10-21 15:18 ` [PATCH 06/72] ipvs: extend connection flags to 32 bits kaber
2010-10-21 15:18 ` [PATCH 07/72] ipvs: netfilter connection tracking changes kaber
2010-10-21 15:18 ` kaber [this message]
2010-10-21 15:18 ` [PATCH 09/72] netfilter: save the hash of the tuple in the original direction for latter use kaber
2010-10-21 15:18 ` [PATCH 10/72] ipvs: changes related to service usecnt kaber
2010-10-21 15:18 ` [PATCH 11/72] netfilter: nf_nat: better error handling of nf_ct_expect_related() in helpers kaber
2010-10-21 15:18 ` [PATCH 12/72] netfilter: ctnetlink: missing validation of CTA_EXPECT_ZONE attribute kaber
2010-10-21 15:19 ` [PATCH 13/72] netfilter: ctnetlink: allow to specify the expectation flags kaber
2010-10-21 15:19 ` [PATCH 14/72] netfilter: ctnetlink: add support for user-space expectation helpers kaber
2010-10-21 15:19 ` [PATCH 15/72] netfilter: nf_conntrack_sip: Allow ct_sip_get_header() to be called with a null ct argument kaber
2010-10-21 15:19 ` [PATCH 16/72] netfilter: nf_conntrack_sip: Add callid parser kaber
2010-10-21 15:19 ` [PATCH 17/72] IPVS: compact ip_vs_sched_persist() kaber
2010-10-21 15:19 ` [PATCH 18/72] IPVS: Add struct ip_vs_conn_param kaber
2010-10-21 15:19 ` [PATCH 19/72] IPVS: Allow null argument to ip_vs_scheduler_put() kaber
2010-10-21 15:19 ` [PATCH 20/72] IPVS: ip_vs_{un,}bind_scheduler NULL arguments kaber
2010-10-21 15:19 ` [PATCH 21/72] IPVS: Add struct ip_vs_pe kaber
2010-10-21 15:19 ` [PATCH 22/72] IPVS: Add persistence engine data to /proc/net/ip_vs_conn kaber
2010-10-21 15:19 ` [PATCH 23/72] IPVS: management of persistence engine modules kaber
2010-10-21 15:19 ` [PATCH 24/72] IPVS: Allow configuration of persistence engines kaber
2010-10-21 15:19 ` [PATCH 25/72] IPVS: Fallback if persistence engine fails kaber
2010-10-21 15:19 ` [PATCH 26/72] IPVS: sip persistence engine kaber
2010-10-21 15:19 ` [PATCH 27/72] netfilter: nf_nat: make find/put static kaber
2010-10-21 15:19 ` [PATCH 28/72] netfilter: ipt_LOG: add bufferisation to call printk() once kaber
2010-10-21 15:19 ` [PATCH 29/72] netfilter: remove duplicated include kaber
2010-10-21 15:19 ` [PATCH 30/72] netfilter: unregister nf hooks, matches and targets in the reverse order kaber
2010-10-21 15:19 ` [PATCH 31/72] netfilter: add missing xt_log.h file kaber
2010-10-21 15:19 ` [PATCH 32/72] netfilter: xtables: resolve indirect macros 1/3 kaber
2010-10-21 15:19 ` [PATCH 33/72] netfilter: xtables: resolve indirect macros 2/3 kaber
2010-10-21 15:19 ` [PATCH 34/72] netfilter: xtables: resolve indirect macros 3/3 kaber
2010-10-21 15:19 ` [PATCH 35/72] netfilter: xtables: unify {ip,ip6,arp}t_error_target kaber
2010-10-21 15:19 ` [PATCH 36/72] netfilter: xtables: remove unused defines kaber
2010-10-21 15:19 ` [PATCH 37/72] IPVS: ip_vs_dbg_callid() is only needed for debugging kaber
2010-10-21 15:19 ` [PATCH 38/72] netfilter: fix kconfig unmet dependency warning kaber
2010-10-21 15:19 ` [PATCH 39/72] netfilter: install missing ebtables headers for userspace kaber
2010-10-21 15:19 ` [PATCH 40/72] netfilter: ctnetlink: add expectation deletion events kaber
2010-10-21 15:19 ` [PATCH 41/72] ipvs: IPv6 tunnel mode kaber
2010-10-21 15:19 ` [PATCH 42/72] Fixed race condition at ip_vs.ko module init kaber
2010-10-21 15:19 ` [PATCH 43/72] ipvs: fix CHECKSUM_PARTIAL for TCP, UDP kaber
2010-10-21 15:19 ` [PATCH 44/72] ipvs: optimize checksums for apps kaber
2010-10-21 15:19 ` [PATCH 45/72] ipvs: switch to notrack mode kaber
2010-10-21 15:19 ` [PATCH 46/72] ipvs: do not schedule conns from real servers kaber
2010-10-21 15:19 ` [PATCH 47/72] ipvs: stop ICMP from FORWARD to local kaber
2010-10-21 15:19 ` [PATCH 48/72] ipvs: fix CHECKSUM_PARTIAL for TUN method kaber
2010-10-21 15:19 ` [PATCH 49/72] ipvs: create ip_vs_defrag_user kaber
2010-10-21 15:19 ` [PATCH 50/72] ipvs: move ip_route_me_harder for ICMP kaber
2010-10-21 15:19 ` [PATCH 51/72] ipvs: changes for local real server kaber
2010-10-21 15:19 ` [PATCH 52/72] ipvs: changes for local client kaber
2010-10-21 15:19 ` [PATCH 53/72] ipvs: inherit forwarding method in backup kaber
2010-10-21 15:19 ` [PATCH 54/72] ipvs: provide address family for debugging kaber
2010-10-21 15:19 ` [PATCH 55/72] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple kaber
2010-10-21 15:19 ` [PATCH 56/72] tproxy: add lookup type checks for UDP in nf_tproxy_get_sock_v4() kaber
2010-10-21 15:19 ` [PATCH 57/72] tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() kaber
2010-10-21 15:19 ` [PATCH 58/72] nf_nat: restrict ICMP translation for embedded header kaber
2010-10-21 15:19 ` [PATCH 59/72] tproxy: split off ipv6 defragmentation to a separate module kaber
2010-10-21 15:19 ` [PATCH 60/72] tproxy: added const specifiers to udp lookup functions kaber
2010-10-21 15:19 ` [PATCH 61/72] tproxy: added udp6_lib_lookup function kaber
2010-10-21 15:19 ` [PATCH 62/72] tproxy: added tproxy sockopt interface in the IPV6 layer kaber
2010-10-21 15:19 ` [PATCH 63/72] tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled kaber
2010-10-21 21:07   ` YOSHIFUJI Hideaki
2010-10-21 15:19 ` [PATCH 64/72] tproxy: added IPv6 socket lookup function to nf_tproxy_core kaber
2010-10-21 15:19 ` [PATCH 65/72] tproxy: added IPv6 support to the TPROXY target kaber
2010-10-21 15:19 ` [PATCH 66/72] tproxy: added IPv6 support to the socket match kaber
2010-10-21 15:19 ` [PATCH 67/72] tproxy: use the interface primary IP address as a default value for --on-ip kaber
2010-10-21 15:19 ` [PATCH 68/72] netfilter: ebtables: remove unused definitions kaber
2010-10-21 15:19 ` [PATCH 69/72] netfilter: xtables: add a missing pair of parentheses kaber
2010-10-21 15:19 ` [PATCH 70/72] netfilter: ebtables: replace EBT_ENTRY_ITERATE macro kaber
2010-10-21 15:19 ` [PATCH 71/72] netfilter: ebtables: replace EBT_MATCH_ITERATE macro kaber
2010-10-21 15:19 ` [PATCH 72/72] netfilter: ebtables: replace EBT_WATCHER_ITERATE macro kaber
2010-10-21 15:40 ` [PATCH 00/72] netfilter: netfilter update for 2.6.37 David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1287674399-31455-9-git-send-email-kaber@trash.net \
    --to=kaber@trash.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).