netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Fan Du <fan.du@windriver.com>
To: Neil Horman <nhorman@tuxdriver.com>, <vyasevich@gmail.com>,
	<steffen.klassert@secunet.com>, <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>
Subject: [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set
Date: Fri, 11 Oct 2013 15:08:06 +0800	[thread overview]
Message-ID: <5257A3D6.3010002@windriver.com> (raw)
In-Reply-To: <20131010131116.GA16147@hmsreliant.think-freely.org>



On 2013年10月10日 21:11, Neil Horman wrote:
> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
>> up and pack the 16bits result in the checksum field). The result is fail
>> establishment of sctp communication.
>>
> Shouldn't this be fixed in the xfrm code then?  E.g. check the device features
> for SCTP checksum offloading and and skip the checksum during xfrm output if its
> available?
>
> Or am I missing something?
> Neil
>
>


 From cff25810910603ff991f0c56441d6de8dc33a822 Mon Sep 17 00:00:00 2001
From: Fan Du <fan.du@windriver.com>
Date: Fri, 11 Oct 2013 14:31:57 +0800
Subject: [PATCH 2/2] Don't compute checksum value for SCTP skb with
  CHECKSUM_PARTIAL set
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

IPsec is not enabled in this scenario:

SCTP skb set CHECKSUM_PARTIAL to indicate hardware is capable of doing
SCTP checksum(crc32-c) scoping the whole SCTP packet range. However when
such kind of skb is delivered through IPv4/v6 output handler, IPv4/v6 stack
interpret CHECKSUM_PARTIAL by calling skb_checksum_help to compute 16bits
checksum value by summing everything up, the whole SCTP packet in software
manner! After this skb reach NIC, after hardware doing its SCTP checking
business, a crc32-c value will overwrite the value IPv4/v6 stack computed
before.

This patch solves this by introducing skb_is_sctpv4/6 to optimize such case.

Signed-off-by: Fan Du <fan.du@windriver.com>
---
v2:
    Rework this problem by introducing skb_is_scktv4/6

---
  include/linux/ip.h     |    5 +++++
  include/linux/ipv6.h   |    6 ++++++
  include/linux/skbuff.h |    1 -
  net/core/skbuff.c      |    1 +
  net/ipv4/ip_output.c   |    4 +++-
  net/ipv6/ip6_output.c  |    1 +
  net/xfrm/xfrm_output.c |   20 +++++++++++++++-----
  7 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/include/linux/ip.h b/include/linux/ip.h
index 492bc65..f556292 100644
--- a/include/linux/ip.h
+++ b/include/linux/ip.h
@@ -19,6 +19,7 @@

  #include <linux/skbuff.h>
  #include <uapi/linux/ip.h>
+#include <uapi/linux/in.h>

  static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
  {
@@ -34,4 +35,8 @@ static inline struct iphdr *ipip_hdr(const struct sk_buff *skb)
  {
  	return (struct iphdr *)skb_transport_header(skb);
  }
+static inline int skb_is_sctpv4(const struct sk_buff *skb)
+{
+	return ip_hdr(skb)->protocol == IPPROTO_SCTP;
+}
  #endif	/* _LINUX_IP_H */
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 28ea384..6e17c04 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -2,6 +2,7 @@
  #define _IPV6_H

  #include <uapi/linux/ipv6.h>
+#include <uapi/linux/in.h>

  #define ipv6_optlen(p)  (((p)->hdrlen+1) << 3)
  /*
@@ -387,4 +388,9 @@ static inline struct raw6_sock *raw6_sk(const struct sock *sk)
  	  ((__sk)->sk_bound_dev_if == (__dif)))				&& \
  	 net_eq(sock_net(__sk), (__net)))

+static inline int skb_is_sctpv6(const struct sk_buff *skb)
+{
+	return ipv6_hdr(skb)->nexthdr == IPPROTO_SCTP;
+}
+
  #endif /* _IPV6_H */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ddb48d..b36d0cc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2393,7 +2393,6 @@ extern void	       skb_split(struct sk_buff *skb,
  extern int	       skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
  				 int shiftlen);
  extern void	       skb_scrub_packet(struct sk_buff *skb, bool xnet);
-
  extern struct sk_buff *skb_segment(struct sk_buff *skb,
  				   netdev_features_t features);

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d81cff1..54d6172 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3526,3 +3526,4 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
  	nf_reset_trace(skb);
  }
  EXPORT_SYMBOL_GPL(skb_scrub_packet);
+
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index a04d872..8676b2c 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -587,7 +587,9 @@ slow_path_clean:

  slow_path:
  	/* for offloaded checksums cleanup checksum before fragmentation */
-	if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
+	if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
+	    !skb_is_sctpv4(skb) &&
+	    skb_checksum_help(skb))
  		goto fail;
  	iph = ip_hdr(skb);

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3a692d5..9b27d95 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -671,6 +671,7 @@ slow_path_clean:

  slow_path:
  	if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
+	    !skb_is_sctpv6(skb) &&
  	    skb_checksum_help(skb))
  		goto fail;

diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 3bb2cdc..ddef94a 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -180,6 +180,14 @@ static int xfrm_output_gso(struct sk_buff *skb)
  	return 0;
  }

+static int skb_is_sctp(const struct sk_buff *skb)
+{
+	if (skb->protocol == __constant_htons(ETH_P_IP))
+		return skb_is_sctpv4(skb);
+	else
+		return skb_is_sctpv6(skb);
+}
+
  int xfrm_output(struct sk_buff *skb)
  {
  	struct net *net = dev_net(skb_dst(skb)->dev);
@@ -189,11 +197,13 @@ int xfrm_output(struct sk_buff *skb)
  		return xfrm_output_gso(skb);

  	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		err = skb_checksum_help(skb);
-		if (err) {
-			XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
-			kfree_skb(skb);
-			return err;
+		if (!skb_is_sctp(skb)) {
+			err = skb_checksum_help(skb);
+			if (err) {
+				XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
+				kfree_skb(skb);
+				return err;
+			}
  		}
  	}

-- 
1.7.9.5






-- 
浮沉随浪只记今朝笑

--fan

  parent reply	other threads:[~2013-10-11  7:09 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-10  5:51 [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Fan Du
2013-10-10 13:11 ` Neil Horman
2013-10-11  7:02   ` Fan Du
2013-10-11  7:05   ` [PATCHv2 1/2 ] " Fan Du
2013-10-11 14:04     ` Vlad Yasevich
2013-10-11 17:12       ` Vlad Yasevich
2013-10-11  7:08   ` Fan Du [this message]
2013-10-11 14:25     ` [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set Vlad Yasevich
2013-10-12  9:45       ` Fan Du
2013-10-12 13:06         ` Vlad Yasevich
2013-10-14  7:16           ` Fan Du
2013-10-10 14:11 ` [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Vlad Yasevich
2013-10-11  7:02   ` Fan Du

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5257A3D6.3010002@windriver.com \
    --to=fan.du@windriver.com \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    --cc=steffen.klassert@secunet.com \
    --cc=vyasevich@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).