netdev.vger.kernel.org archive mirror
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: netfilter-devel@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org
Subject: [PATCH 18/20] net: ipvs: sctp: do not recalc sctp csum when ports didn't change
Date: Mon,  4 Nov 2013 22:50:40 +0100
Message-ID: <1383601842-4570-19-git-send-email-pablo@netfilter.org>
In-Reply-To: <1383601842-4570-1-git-send-email-pablo@netfilter.org>

From: Daniel Borkmann <dborkman@redhat.com>

Unlike UDP or TCP, SCTP checksums do not take the
pseudo-header into account. So if the port mapping stays
the same, we do not need to recalculate the whole SCTP
checksum in software, which is very expensive.
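
A minimal userspace sketch of why this holds, assuming a plain
bitwise CRC32c and a hand-built 16-byte packet. The names
crc32c(), sctp_csum_sketch() and sctp_csum_demo() are
hypothetical and are not the kernel helpers. The CRC input is
only the SCTP common header (with a zeroed checksum field) plus
the chunk data, so IP addresses never enter it, while a port
rewrite does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int k = 0; k < 8; k++) {
			uint32_t mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (0x82F63B78u & mask);
		}
	}
	return ~crc;
}

/*
 * Hypothetical helper: CRC over a small SCTP packet. Bytes 8..11
 * hold the checksum field and are zeroed before the calculation.
 * No IP address is part of the input.
 */
static uint32_t sctp_csum_sketch(const uint8_t *pkt, size_t len)
{
	uint8_t tmp[64];

	if (len < 12 || len > sizeof(tmp))
		return 0;
	memcpy(tmp, pkt, len);
	memset(tmp + 8, 0, 4);
	return crc32c(tmp, len);
}

static void sctp_csum_demo(void)
{
	uint8_t pkt[16] = {
		0x1f, 0x90,             /* source port 8080 */
		0x00, 0x50,             /* destination port 80 */
		0xde, 0xad, 0xbe, 0xef, /* verification tag */
		0, 0, 0, 0,             /* checksum field */
		'd', 'a', 't', 'a'      /* stand-in for chunk data */
	};
	uint32_t before = sctp_csum_sketch(pkt, sizeof(pkt));

	/* Ports unchanged by NAT: nothing the CRC covers changed. */
	assert(sctp_csum_sketch(pkt, sizeof(pkt)) == before);

	/* Port rewritten: the CRC input changed, so the CRC changes. */
	pkt[3] = 0x51;
	assert(sctp_csum_sketch(pkt, sizeof(pkt)) != before);
}
```

Calling sctp_csum_demo() from a test program exercises both
cases; the per-byte, per-bit loop also hints at why a full
software recalculation is costly.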

Also, as with TCP, take into account the case where a
private helper mangled the packet. In that case, we need to
recalculate the checksum even if the ports are the same.

Thanks for feedback regarding skb->ip_summed checks from
Julian Anastasov; here's a discussion on these checks for
snat and dnat:

* For snat_handler(), we can see CHECKSUM_PARTIAL from
  virtual devices, and from LOCAL_OUT, otherwise it
  should be CHECKSUM_UNNECESSARY. In general, in snat it
  is more complex. skb contains the original route and
  ip_vs_route_me_harder() can change the route after
  snat_handler. So, for locally generated replies from a
  local server we cannot preserve the CHECKSUM_PARTIAL
  mode. It is a chicken-and-egg dilemma: snat_handler
  needs the device after rerouting (to check for
  NETIF_F_SCTP_CSUM), while ip_route_me_harder() wants
  the snat_handler() to put the new saddr for proper
  rerouting.

* For dnat_handler(), we should not see CHECKSUM_COMPLETE
  for SCTP; in fact, the small set of drivers that support
  SCTP offloading return CHECKSUM_UNNECESSARY on a correctly
  received SCTP csum. We can see CHECKSUM_PARTIAL from the
  local stack or received from virtual drivers. The idea is
  that SCTP decides to avoid csum calculation if hardware
  supports offloading. IPVS can change the device after
  rerouting to real server but we can preserve the
  CHECKSUM_PARTIAL mode if the new device supports
  offloading too. This works because skb dst is changed
  before dnat_handler and we see the new device. So, checks
  in the 'if' part will decide whether it is ok to keep
  CHECKSUM_PARTIAL for the output. If the packet arrived
  with CHECKSUM_NONE, we are dealing with an unknown
  checksum. As we recalculate the sum for the IP header in
  all cases, it should be safe to use CHECKSUM_UNNECESSARY.
  We can forward a wrong checksum in this case (without
  cp->app). In case of CHECKSUM_UNNECESSARY, the csum was
  valid on receive.
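
The two cases above can be sketched as standalone decision
functions. This is only an illustration that mirrors the
conditions in the diff below; the enums and the names
snat_decide() and dnat_decide() are hypothetical stand-ins for
skb->ip_summed and the handler bodies, and dev_sctp_csum stands
for the NETIF_F_SCTP_CSUM feature test on the output device:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the skb->ip_summed states. */
enum csum_mode { CSUM_NONE, CSUM_UNNECESSARY, CSUM_COMPLETE, CSUM_PARTIAL };

/* What the handler ends up doing with the SCTP checksum. */
enum csum_action { RECALC, KEEP_PARTIAL, MARK_UNNECESSARY };

/* snat: the final device is not known yet, so PARTIAL is never kept. */
static enum csum_action snat_decide(uint16_t pkt_port, uint16_t new_port,
				    bool payload_mangled, enum csum_mode mode)
{
	if (pkt_port != new_port || payload_mangled || mode == CSUM_PARTIAL)
		return RECALC;
	return MARK_UNNECESSARY;
}

/* dnat: the new device is visible, so PARTIAL survives if it offloads. */
static enum csum_action dnat_decide(uint16_t pkt_port, uint16_t new_port,
				    bool payload_mangled, enum csum_mode mode,
				    bool dev_sctp_csum)
{
	if (pkt_port != new_port || payload_mangled ||
	    (mode == CSUM_PARTIAL && !dev_sctp_csum))
		return RECALC;
	if (mode != CSUM_PARTIAL)
		return MARK_UNNECESSARY;
	return KEEP_PARTIAL;
}

static void csum_decide_demo(void)
{
	/* Same port, nothing mangled, csum verified on receive. */
	assert(snat_decide(80, 80, false, CSUM_UNNECESSARY) == MARK_UNNECESSARY);
	/* snat cannot preserve CHECKSUM_PARTIAL. */
	assert(snat_decide(80, 80, false, CSUM_PARTIAL) == RECALC);
	/* dnat keeps CHECKSUM_PARTIAL only when the new device offloads. */
	assert(dnat_decide(80, 80, false, CSUM_PARTIAL, true) == KEEP_PARTIAL);
	assert(dnat_decide(80, 80, false, CSUM_PARTIAL, false) == RECALC);
}
```

Note the asymmetry: with unchanged ports and CHECKSUM_PARTIAL,
the snat sketch always recalculates, while the dnat sketch
defers to the device capability.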

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 net/netfilter/ipvs/ip_vs_proto_sctp.c |   39 ++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 9ca7aa0..2f7ea75 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -81,6 +81,7 @@ sctp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
 {
 	sctp_sctphdr_t *sctph;
 	unsigned int sctphoff = iph->len;
+	bool payload_csum = false;
 
 #ifdef CONFIG_IP_VS_IPV6
 	if (cp->af == AF_INET6 && iph->fragoffs)
@@ -92,19 +93,31 @@ sctp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
 		return 0;
 
 	if (unlikely(cp->app != NULL)) {
+		int ret;
+
 		/* Some checks before mangling */
 		if (pp->csum_check && !pp->csum_check(cp->af, skb, pp))
 			return 0;
 
 		/* Call application helper if needed */
-		if (!ip_vs_app_pkt_out(cp, skb))
+		ret = ip_vs_app_pkt_out(cp, skb);
+		if (ret == 0)
 			return 0;
+		/* ret=2: csum update is needed after payload mangling */
+		if (ret == 2)
+			payload_csum = true;
 	}
 
 	sctph = (void *) skb_network_header(skb) + sctphoff;
-	sctph->source = cp->vport;
 
-	sctp_nat_csum(skb, sctph, sctphoff);
+	/* Only update csum if we really have to */
+	if (sctph->source != cp->vport || payload_csum ||
+	    skb->ip_summed == CHECKSUM_PARTIAL) {
+		sctph->source = cp->vport;
+		sctp_nat_csum(skb, sctph, sctphoff);
+	} else {
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+	}
 
 	return 1;
 }
@@ -115,6 +128,7 @@ sctp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
 {
 	sctp_sctphdr_t *sctph;
 	unsigned int sctphoff = iph->len;
+	bool payload_csum = false;
 
 #ifdef CONFIG_IP_VS_IPV6
 	if (cp->af == AF_INET6 && iph->fragoffs)
@@ -126,19 +140,32 @@ sctp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
 		return 0;
 
 	if (unlikely(cp->app != NULL)) {
+		int ret;
+
 		/* Some checks before mangling */
 		if (pp->csum_check && !pp->csum_check(cp->af, skb, pp))
 			return 0;
 
 		/* Call application helper if needed */
-		if (!ip_vs_app_pkt_in(cp, skb))
+		ret = ip_vs_app_pkt_in(cp, skb);
+		if (ret == 0)
 			return 0;
+		/* ret=2: csum update is needed after payload mangling */
+		if (ret == 2)
+			payload_csum = true;
 	}
 
 	sctph = (void *) skb_network_header(skb) + sctphoff;
-	sctph->dest = cp->dport;
 
-	sctp_nat_csum(skb, sctph, sctphoff);
+	/* Only update csum if we really have to */
+	if (sctph->dest != cp->dport || payload_csum ||
+	    (skb->ip_summed == CHECKSUM_PARTIAL &&
+	     !(skb_dst(skb)->dev->features & NETIF_F_SCTP_CSUM))) {
+		sctph->dest = cp->dport;
+		sctp_nat_csum(skb, sctph, sctphoff);
+	} else if (skb->ip_summed != CHECKSUM_PARTIAL) {
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+	}
 
 	return 1;
 }
-- 
1.7.10.4
