From: Pablo Neira Ayuso <pablo@netfilter.org>
To: netfilter-devel@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org
Subject: [PATCH 04/18] ipvs: drop first packet to redirect conntrack
Date: Tue, 15 Mar 2016 02:27:48 +0100
Message-ID: <1458005282-24665-5-git-send-email-pablo@netfilter.org>
In-Reply-To: <1458005282-24665-1-git-send-email-pablo@netfilter.org>
From: Julian Anastasov <ja@ssi.bg>
Jiri Bohac reported a problem where an attempt to
reschedule an existing connection to another real server
needs a proper redirect for the conntrack used by the IPVS
connection. For example, when an IPVS connection is created
to a NAT-ed real server, we alter the reply direction of
the conntrack. If we later decide to select a different real
server, we cannot alter the conntrack again. And if we
expire the old connection, the new connection is left
without conntrack.

So, the only way to redirect both the IPVS connection and
Netfilter's conntrack is to drop the SYN packet that
hits the existing connection, wait for the next jiffy
to expire the old connection and its conntrack, and rely
on the client's retransmission to create the new connection
as usual.

Jiri Bohac provided a fix that drops all SYNs on rescheduling;
I extended his patch to do such drops only for connections
that use conntrack. Here is the original report from Jiri Bohac:
Since commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server
is dead"), new connections to dead servers are redistributed
immediately to new servers. The old connection is expired using
ip_vs_conn_expire_now() which sets the connection timer to expire
immediately.
However, before the timer callback, ip_vs_conn_expire(), is run
to clean the connection's conntrack entry, the new redistributed
connection may already be established and its conntrack removed
instead.
Fix this by dropping the first packet of the new connection
instead, like we do when the destination server is not available.
The timer will have deleted the old conntrack entry long before
the first packet of the new connection is retransmitted.
Fixes: dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead")
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
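Note for reviewers: the rescheduling decision added to ip_vs_in() can
be read as a small truth table. The following is a minimal userspace
sketch of that table, not kernel code. The struct, the enum and
resched_decide() are hypothetical names introduced only for
illustration; the inputs stand in for the results of
sysctl_expire_nodest_conn() plus the dest weight check,
is_new_conn_expected(), ip_vs_conn_uses_conntrack() and cp->n_control.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the state that ip_vs_in() consults. */
struct resched_input {
	bool dest_dead;		/* expire_nodest_conn set and dest weight is 0 */
	bool new_conn_expected;	/* is_new_conn_expected() returned true */
	bool uses_ct;		/* ip_vs_conn_uses_conntrack() returned true */
	int n_control;		/* connections controlled by this one */
};

enum resched_verdict {
	VERDICT_KEEP,		/* keep using the existing connection */
	VERDICT_RESCHED,	/* release it and schedule a new one right away */
	VERDICT_DROP,		/* release it and return NF_DROP for this SYN */
};

/*
 * Mirrors the branch structure of the patch. It does not model the
 * call to ip_vs_conn_expire_now(), which the patch only makes when
 * n_control is zero.
 */
static enum resched_verdict resched_decide(const struct resched_input *in)
{
	bool resched = false;

	if (in->dest_dead) {
		resched = true;
	} else if (in->new_conn_expected) {
		/* A controlling connection that uses conntrack is kept
		 * while controlled connections still reference it.
		 */
		resched = in->n_control == 0 || !in->uses_ct;
	}

	if (!resched)
		return VERDICT_KEEP;

	/* With conntrack in use, wait for the client to retransmit. */
	return in->uses_ct ? VERDICT_DROP : VERDICT_RESCHED;
}
```

Under these assumptions, only connections that really use conntrack
pay the cost of a dropped SYN; all others are rescheduled immediately
as before.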
 include/net/ip_vs.h             | 17 +++++++++++++++++
 net/netfilter/ipvs/ip_vs_core.c | 37 ++++++++++++++++++++++++++++---------
 2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 0816c87..a6cc576 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1588,6 +1588,23 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
 }
 #endif /* CONFIG_IP_VS_NFCT */
 
+/* Really using conntrack? */
+static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp,
+					     struct sk_buff *skb)
+{
+#ifdef CONFIG_IP_VS_NFCT
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct;
+
+	if (!(cp->flags & IP_VS_CONN_F_NFCT))
+		return false;
+	ct = nf_ct_get(skb, &ctinfo);
+	if (ct && !nf_ct_is_untracked(ct))
+		return true;
+#endif
+	return false;
+}
+
 static inline int
 ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
 {
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index f57b4dc..4da5600 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1757,15 +1757,34 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int
 	cp = pp->conn_in_get(ipvs, af, skb, &iph);
 
 	conn_reuse_mode = sysctl_conn_reuse_mode(ipvs);
-	if (conn_reuse_mode && !iph.fragoffs &&
-	    is_new_conn(skb, &iph) && cp &&
-	    ((unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
-	      unlikely(!atomic_read(&cp->dest->weight))) ||
-	     unlikely(is_new_conn_expected(cp, conn_reuse_mode)))) {
-		if (!atomic_read(&cp->n_control))
-			ip_vs_conn_expire_now(cp);
-		__ip_vs_conn_put(cp);
-		cp = NULL;
+	if (conn_reuse_mode && !iph.fragoffs && is_new_conn(skb, &iph) && cp) {
+		bool uses_ct = false, resched = false;
+
+		if (unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
+		    unlikely(!atomic_read(&cp->dest->weight))) {
+			resched = true;
+			uses_ct = ip_vs_conn_uses_conntrack(cp, skb);
+		} else if (is_new_conn_expected(cp, conn_reuse_mode)) {
+			uses_ct = ip_vs_conn_uses_conntrack(cp, skb);
+			if (!atomic_read(&cp->n_control)) {
+				resched = true;
+			} else {
+				/* Do not reschedule controlling connection
+				 * that uses conntrack while it is still
+				 * referenced by controlled connection(s).
+				 */
+				resched = !uses_ct;
+			}
+		}
+
+		if (resched) {
+			if (!atomic_read(&cp->n_control))
+				ip_vs_conn_expire_now(cp);
+			__ip_vs_conn_put(cp);
+			if (uses_ct)
+				return NF_DROP;
+			cp = NULL;
+		}
 	}
 
 	if (unlikely(!cp)) {
--
2.1.4