* [GIT PULL nf 0/3] IPVS Fixes for v4.5
From: Simon Horman @ 2016-02-18 0:40 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Simon Horman
Hi Pablo,
please consider these IPVS fixes for v4.5.
* Arnd Bergmann has corrected an error whereby the SIP persistence engine
may incorrectly access protocol fields
* Julian Anastasov has corrected a problem reported by Jiri Bohac with the
connection rescheduling mechanism added in 3.10, in which new SYNs for a
connection to a dead real server can be redirected to another real server.
The following changes since commit 5cc6ce9ff27565949a1001a2889a8dd9fd09e772:
netfilter: nft_counter: fix erroneous return values (2016-02-08 13:05:02 +0100)
are available in the git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git tags/ipvs-fixes-for-v4.5
for you to fetch changes up to 5acaf89f88b97849d550d6fbb10362e3d84b5082:
netfilter: ipvs: handle ip_vs_fill_iph_skb_off failure (2016-02-18 09:31:48 +0900)
----------------------------------------------------------------
Arnd Bergmann (1):
netfilter: ipvs: handle ip_vs_fill_iph_skb_off failure
Julian Anastasov (2):
netfilter: ipvs: drop first packet to redirect conntrack
netfilter: ipvs: allow rescheduling after RST
include/net/ip_vs.h | 17 +++++++++++++++++
net/netfilter/ipvs/ip_vs_core.c | 38 +++++++++++++++++++++++++++++---------
net/netfilter/ipvs/ip_vs_pe_sip.c | 4 ++--
3 files changed, 48 insertions(+), 11 deletions(-)
* [PATCH nf 1/3] netfilter: ipvs: drop first packet to redirect conntrack
From: Simon Horman @ 2016-02-18 0:40 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Jiri Bohac, Simon Horman
From: Julian Anastasov <ja@ssi.bg>
Jiri Bohac reported a problem where an attempt
to reschedule an existing connection to another real server
needs a proper redirect for the conntrack used by the IPVS
connection. For example, when an IPVS connection is created
to a NAT-ed real server, we alter the reply direction of
the conntrack. If we later decide to select a different real
server, we cannot alter the conntrack again. And if we
expire the old connection, the new connection is left
without conntrack.
So, the only way to redirect both the IPVS connection and
Netfilter's conntrack is to drop the SYN packet that
hits the existing connection, wait for the next jiffy
to expire the old connection and its conntrack, and rely
on the client's retransmission to create a new connection as
usual.
Jiri Bohac provided a fix that drops all SYNs on rescheduling;
I extended his patch to do such drops only for connections
that use conntrack. Here is the original report from Jiri Bohac:
Since commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server
is dead"), new connections to dead servers are redistributed
immediately to new servers. The old connection is expired using
ip_vs_conn_expire_now() which sets the connection timer to expire
immediately.
However, before the timer callback, ip_vs_conn_expire(), is run
to clean the connection's conntrack entry, the new redistributed
connection may already be established and its conntrack removed
instead.
Fix this by dropping the first packet of the new connection
instead, like we do when the destination server is not available.
The timer will have deleted the old conntrack entry long before
the first packet of the new connection is retransmitted.
Fixes: dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead")
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
include/net/ip_vs.h | 17 +++++++++++++++++
net/netfilter/ipvs/ip_vs_core.c | 37 ++++++++++++++++++++++++++++---------
2 files changed, 45 insertions(+), 9 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 0816c872b689..a6cc576fd467 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1588,6 +1588,23 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
}
#endif /* CONFIG_IP_VS_NFCT */
+/* Really using conntrack? */
+static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp,
+ struct sk_buff *skb)
+{
+#ifdef CONFIG_IP_VS_NFCT
+ enum ip_conntrack_info ctinfo;
+ struct nf_conn *ct;
+
+ if (!(cp->flags & IP_VS_CONN_F_NFCT))
+ return false;
+ ct = nf_ct_get(skb, &ctinfo);
+ if (ct && !nf_ct_is_untracked(ct))
+ return true;
+#endif
+ return false;
+}
+
static inline int
ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
{
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index f57b4dcdb233..4da560005b0e 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1757,15 +1757,34 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int
cp = pp->conn_in_get(ipvs, af, skb, &iph);
conn_reuse_mode = sysctl_conn_reuse_mode(ipvs);
- if (conn_reuse_mode && !iph.fragoffs &&
- is_new_conn(skb, &iph) && cp &&
- ((unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
- unlikely(!atomic_read(&cp->dest->weight))) ||
- unlikely(is_new_conn_expected(cp, conn_reuse_mode)))) {
- if (!atomic_read(&cp->n_control))
- ip_vs_conn_expire_now(cp);
- __ip_vs_conn_put(cp);
- cp = NULL;
+ if (conn_reuse_mode && !iph.fragoffs && is_new_conn(skb, &iph) && cp) {
+ bool uses_ct = false, resched = false;
+
+ if (unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
+ unlikely(!atomic_read(&cp->dest->weight))) {
+ resched = true;
+ uses_ct = ip_vs_conn_uses_conntrack(cp, skb);
+ } else if (is_new_conn_expected(cp, conn_reuse_mode)) {
+ uses_ct = ip_vs_conn_uses_conntrack(cp, skb);
+ if (!atomic_read(&cp->n_control)) {
+ resched = true;
+ } else {
+ /* Do not reschedule controlling connection
+ * that uses conntrack while it is still
+ * referenced by controlled connection(s).
+ */
+ resched = !uses_ct;
+ }
+ }
+
+ if (resched) {
+ if (!atomic_read(&cp->n_control))
+ ip_vs_conn_expire_now(cp);
+ __ip_vs_conn_put(cp);
+ if (uses_ct)
+ return NF_DROP;
+ cp = NULL;
+ }
}
if (unlikely(!cp)) {
--
2.1.4
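The rescheduling decision that this patch adds to ip_vs_in() can be
modelled in userspace as a small decision function. What follows is only
a sketch under stated assumptions: struct conn_model, enum resched_action
and resched_decide() are hypothetical names that do not exist in the
kernel, and the model assumes the outer condition already holds
(conn_reuse_mode is non-zero, the packet is an unfragmented new SYN, and
an existing connection was found).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum resched_action {
	ACT_KEEP,	/* keep using the existing connection */
	ACT_RESCHEDULE,	/* let go of the old connection, schedule this SYN now */
	ACT_DROP_SYN,	/* let go of the old connection, drop the SYN (NF_DROP) */
};

struct conn_model {
	bool dest_dead;		/* expire_nodest_conn set and dest weight is 0 */
	bool new_conn_expected;	/* models is_new_conn_expected() */
	bool uses_conntrack;	/* models ip_vs_conn_uses_conntrack() */
	int n_control;		/* number of controlled connections */
};

static enum resched_action resched_decide(const struct conn_model *c)
{
	bool resched = false;
	bool uses_ct = false;

	assert(c->n_control >= 0);

	if (c->dest_dead) {
		resched = true;
		uses_ct = c->uses_conntrack;
	} else if (c->new_conn_expected) {
		uses_ct = c->uses_conntrack;
		/* A controlling connection that uses conntrack is kept while
		 * controlled connections still reference it.
		 */
		resched = (c->n_control == 0) || !uses_ct;
	}

	if (!resched)
		return ACT_KEEP;

	/* The kernel calls ip_vs_conn_expire_now() here only when n_control
	 * is 0; this model does not track expiry.
	 */
	return uses_ct ? ACT_DROP_SYN : ACT_RESCHEDULE;
}
```

For example, a connection to a dead destination that uses conntrack
yields ACT_DROP_SYN, so the new connection is created only by the
client's retransmitted SYN. The same connection without conntrack yields
ACT_RESCHEDULE and is redirected immediately.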
* [PATCH nf 2/3] netfilter: ipvs: allow rescheduling after RST
From: Simon Horman @ 2016-02-18 0:41 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Simon Horman
From: Julian Anastasov <ja@ssi.bg>
"RFC 5961, 4.2. Mitigation" describes a mechanism to request
the client to confirm with RST the restart of the TCP connection
before resending its SYN. As a result, IPVS can see SYNs for
an existing connection in CLOSE state. Add a check to allow
rescheduling in this state.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 4da560005b0e..0c1d3fef9a7c 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1089,6 +1089,7 @@ static inline bool is_new_conn_expected(const struct ip_vs_conn *cp,
switch (cp->protocol) {
case IPPROTO_TCP:
return (cp->state == IP_VS_TCP_S_TIME_WAIT) ||
+ cp->state == IP_VS_TCP_S_CLOSE ||
((conn_reuse_mode & 2) &&
(cp->state == IP_VS_TCP_S_FIN_WAIT) &&
(cp->flags & IP_VS_CONN_F_NOOUTPUT));
--
2.1.4
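The TCP branch of is_new_conn_expected() after this change can be
modelled in userspace as below. This is a sketch only: the enum values,
CONN_F_NOOUTPUT_MODEL and new_tcp_conn_expected() are hypothetical
stand-ins for the kernel's IP_VS_TCP_S_* states, IP_VS_CONN_F_NOOUTPUT
and the real function. Every comparison is parenthesised here; because
== binds tighter than ||, the result is the same as in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's IP_VS_TCP_S_* states. */
enum tcp_state_model {
	ST_ESTABLISHED,
	ST_FIN_WAIT,
	ST_TIME_WAIT,
	ST_CLOSE,
};

/* Hypothetical stand-in for IP_VS_CONN_F_NOOUTPUT. */
#define CONN_F_NOOUTPUT_MODEL 0x1u

static bool new_tcp_conn_expected(enum tcp_state_model state,
				  unsigned int flags, int conn_reuse_mode)
{
	return (state == ST_TIME_WAIT) ||
		(state == ST_CLOSE) ||
		((conn_reuse_mode & 2) &&
		 (state == ST_FIN_WAIT) &&
		 (flags & CONN_F_NOOUTPUT_MODEL));
}

/* Usage example: a SYN that hits a connection in CLOSE state is now
 * treated as the start of a new connection, whatever the reuse mode bits.
 */
static void new_tcp_conn_expected_selfcheck(void)
{
	assert(new_tcp_conn_expected(ST_CLOSE, 0, 1));
	assert(!new_tcp_conn_expected(ST_ESTABLISHED, 0, 1));
}
```

In this model the FIN_WAIT case still needs bit 2 of conn_reuse_mode and
the no-output flag, while TIME_WAIT and CLOSE qualify unconditionally.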
* [PATCH nf 3/3] netfilter: ipvs: handle ip_vs_fill_iph_skb_off failure
From: Simon Horman @ 2016-02-18 0:41 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Arnd Bergmann, Simon Horman
From: Arnd Bergmann <arnd@arndb.de>
ip_vs_fill_iph_skb_off() may not find an IP header, and gcc has
determined that ip_vs_sip_fill_param() then incorrectly accesses
the protocol fields:
net/netfilter/ipvs/ip_vs_pe_sip.c: In function 'ip_vs_sip_fill_param':
net/netfilter/ipvs/ip_vs_pe_sip.c:76:5: error: 'iph.protocol' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (iph.protocol != IPPROTO_UDP)
^
net/netfilter/ipvs/ip_vs_pe_sip.c:81:10: error: 'iph.len' may be used uninitialized in this function [-Werror=maybe-uninitialized]
dataoff = iph.len + sizeof(struct udphdr);
^
This adds a check for the ip_vs_fill_iph_skb_off() return code
before looking at the ip header data returned from it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: b0e010c527de ("ipvs: replace ip_vs_fill_ip4hdr with ip_vs_fill_iph_skb_off")
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_pe_sip.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_pe_sip.c b/net/netfilter/ipvs/ip_vs_pe_sip.c
index 1b8d594e493a..c4e9ca016a88 100644
--- a/net/netfilter/ipvs/ip_vs_pe_sip.c
+++ b/net/netfilter/ipvs/ip_vs_pe_sip.c
@@ -70,10 +70,10 @@ ip_vs_sip_fill_param(struct ip_vs_conn_param *p, struct sk_buff *skb)
const char *dptr;
int retc;
- ip_vs_fill_iph_skb(p->af, skb, false, &iph);
+ retc = ip_vs_fill_iph_skb(p->af, skb, false, &iph);
/* Only useful with UDP */
- if (iph.protocol != IPPROTO_UDP)
+ if (!retc || iph.protocol != IPPROTO_UDP)
return -EINVAL;
/* todo: IPv6 fragments:
* I think this only should be done for the first fragment. /HS
--
2.1.4
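The pattern this patch applies, checking a header-filling helper's
return value before reading the struct it was supposed to fill, can be
illustrated in userspace. The sketch below is not the kernel code:
struct hdr_model, fill_hdr() and udp_payload_offset() are hypothetical
names, and the parser handles only a plain IPv4 header in a byte buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_UDP_MODEL 17	/* IANA protocol number for UDP */
#define UDP_HDR_LEN 8		/* size of a UDP header in bytes */

/* Hypothetical parsed-header summary, loosely modelled on struct ip_vs_iphdr. */
struct hdr_model {
	uint8_t protocol;
	size_t len;
};

/* Returns true and fills *h only if buf holds a complete IPv4 header.
 * On failure *h is left untouched, so the caller must not read it.
 */
static bool fill_hdr(const uint8_t *buf, size_t buflen, struct hdr_model *h)
{
	size_t ihl;

	assert(buf && h);

	if (buflen < 20 || (buf[0] >> 4) != 4)
		return false;
	ihl = (size_t)(buf[0] & 0x0f) * 4;
	if (ihl < 20 || ihl > buflen)
		return false;
	h->protocol = buf[9];
	h->len = ihl;
	return true;
}

/* Returns the UDP payload offset, or -1 if this is not a usable UDP packet. */
static long udp_payload_offset(const uint8_t *buf, size_t buflen)
{
	struct hdr_model h;	/* deliberately not initialised */

	/* Test the return value first; h is only valid on success. */
	if (!fill_hdr(buf, buflen, &h) || h.protocol != PROTO_UDP_MODEL)
		return -1;
	return (long)(h.len + UDP_HDR_LEN);
}
```

Because || short-circuits, h.protocol is never read when fill_hdr()
fails, which is the property the compiler warning was asking for.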
* Re: [PATCH nf 2/3] netfilter: ipvs: allow rescheduling after RST
From: Sergei Shtylyov @ 2016-02-18 13:17 UTC (permalink / raw)
To: Simon Horman, Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov
Hello.
On 2/18/2016 3:41 AM, Simon Horman wrote:
> From: Julian Anastasov <ja@ssi.bg>
>
> "RFC 5961, 4.2. Mitigation" describes a mechanism to request
> client to confirm with RST the restart of TCP connection
> before resending its SYN. As result, IPVS can see SYNs for
> existing connection in CLOSE state. Add check to allow
> rescheduling in this state.
>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
> Signed-off-by: Simon Horman <horms@verge.net.au>
> ---
> net/netfilter/ipvs/ip_vs_core.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> index 4da560005b0e..0c1d3fef9a7c 100644
> --- a/net/netfilter/ipvs/ip_vs_core.c
> +++ b/net/netfilter/ipvs/ip_vs_core.c
> @@ -1089,6 +1089,7 @@ static inline bool is_new_conn_expected(const struct ip_vs_conn *cp,
> switch (cp->protocol) {
> case IPPROTO_TCP:
> return (cp->state == IP_VS_TCP_S_TIME_WAIT) ||
> + cp->state == IP_VS_TCP_S_CLOSE ||
I would have been consistent and enclosed this expression into parens as
well. BTW, the indentation is not correct anyway.
> ((conn_reuse_mode & 2) &&
> (cp->state == IP_VS_TCP_S_FIN_WAIT) &&
> (cp->flags & IP_VS_CONN_F_NOOUTPUT));
MBR, Sergei
* Re: [PATCH nf 2/3] netfilter: ipvs: allow rescheduling after RST
From: Julian Anastasov @ 2016-03-05 13:01 UTC (permalink / raw)
To: Sergei Shtylyov
Cc: Simon Horman, Pablo Neira Ayuso, lvs-devel, netdev,
netfilter-devel, Wensong Zhang
Hello,
On Thu, 18 Feb 2016, Sergei Shtylyov wrote:
> On 2/18/2016 3:41 AM, Simon Horman wrote:
>
> > From: Julian Anastasov <ja@ssi.bg>
> >
> > "RFC 5961, 4.2. Mitigation" describes a mechanism to request
> > client to confirm with RST the restart of TCP connection
> > before resending its SYN. As result, IPVS can see SYNs for
> > existing connection in CLOSE state. Add check to allow
> > rescheduling in this state.
> >
> > Signed-off-by: Julian Anastasov <ja@ssi.bg>
> > Signed-off-by: Simon Horman <horms@verge.net.au>
> > ---
> > net/netfilter/ipvs/ip_vs_core.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/net/netfilter/ipvs/ip_vs_core.c
> > b/net/netfilter/ipvs/ip_vs_core.c
> > index 4da560005b0e..0c1d3fef9a7c 100644
> > --- a/net/netfilter/ipvs/ip_vs_core.c
> > +++ b/net/netfilter/ipvs/ip_vs_core.c
> > @@ -1089,6 +1089,7 @@ static inline bool is_new_conn_expected(const struct
> > ip_vs_conn *cp,
> > switch (cp->protocol) {
> > case IPPROTO_TCP:
> > return (cp->state == IP_VS_TCP_S_TIME_WAIT) ||
> > + cp->state == IP_VS_TCP_S_CLOSE ||
>
> I would have been consistent and enclosed this expression into parens as
> well. BTW, the indentation is not correct anyway.
Thanks! I'll send new version...
>
> > ((conn_reuse_mode & 2) &&
> > (cp->state == IP_VS_TCP_S_FIN_WAIT) &&
> > (cp->flags & IP_VS_CONN_F_NOOUTPUT));
>
> MBR, Sergei
Regards
--
Julian Anastasov <ja@ssi.bg>