* [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
@ 2013-10-10 5:51 Fan Du
2013-10-10 13:11 ` Neil Horman
2013-10-10 14:11 ` [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Vlad Yasevich
0 siblings, 2 replies; 13+ messages in thread
From: Fan Du @ 2013-10-10 5:51 UTC (permalink / raw)
To: vyasevich, nhorman, steffen.klassert; +Cc: davem, netdev
igb/ixgbe have hardware sctp checksum support, when this feature is enabled
and also IPsec is armed to protect sctp traffic, ugly things happened as
xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
up and pack the 16bits result in the checksum field). The result is fail
establishment of sctp communication.
And I saw another point in this part of code, when IPsec is not armed,
sctp communication is good, however setting setting CHECKSUM_PARTIAL will
make xfrm_output compute dummy checksum values which will be overwritten by
hardware lately.
So this patch try to solve above two issues together.
Signed-off-by: Fan Du <fan.du@windriver.com>
---
note:
igb/ixgbe hardware is not handy on my side, so just build test only.
---
net/sctp/output.c | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 0ac3a65..f0b9cc5 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
atomic_inc(&sk->sk_wmem_alloc);
}
+static int is_xfrm_armed(struct dst_entry *dst)
+{
+#ifdef CONFIG_XFRM
+ /* If dst->xfrm is valid, this skb needs to be transformed */
+ return dst->xfrm != NULL;
+#else
+ return 0;
+#endif
+}
+
/* All packets are sent to the network through this function from
* sctp_outq_tail().
*
@@ -536,20 +546,21 @@ int sctp_packet_transmit(struct sctp_packet *packet)
* by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
*/
if (!sctp_checksum_disable) {
- if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
+ if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
+ is_xfrm_armed(dst)) {
+
__u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
/* 3) Put the resultant value into the checksum field in the
* common header, and leave the rest of the bits unchanged.
*/
sh->checksum = sctp_end_cksum(crc32);
- } else {
- /* no need to seed pseudo checksum for SCTP */
- nskb->ip_summed = CHECKSUM_PARTIAL;
- nskb->csum_start = (skb_transport_header(nskb) -
- nskb->head);
- nskb->csum_offset = offsetof(struct sctphdr, checksum);
- }
+ } else
+ /* Mark skb as CHECKSUM_UNNECESSARY to let hardware compute
+ * the checksum, and also avoid xfrm_output to do unceccessary
+ * checksum.
+ */
+ nskb->ip_summed = CHECKSUM_UNNECESSARY;
}
/* IP layer ECN support
--
1.7.9.5
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-10 5:51 [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Fan Du
@ 2013-10-10 13:11 ` Neil Horman
2013-10-11 7:02 ` Fan Du
` (2 more replies)
2013-10-10 14:11 ` [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Vlad Yasevich
1 sibling, 3 replies; 13+ messages in thread
From: Neil Horman @ 2013-10-10 13:11 UTC (permalink / raw)
To: Fan Du; +Cc: vyasevich, steffen.klassert, davem, netdev
On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
> and also IPsec is armed to protect sctp traffic, ugly things happened as
> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
> up and pack the 16bits result in the checksum field). The result is fail
> establishment of sctp communication.
>
Shouldn't this be fixed in the xfrm code then? E.g. check the device features
for SCTP checksum offloading and and skip the checksum during xfrm output if its
available?
Or am I missing something?
Neil
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-10 5:51 [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Fan Du
2013-10-10 13:11 ` Neil Horman
@ 2013-10-10 14:11 ` Vlad Yasevich
2013-10-11 7:02 ` Fan Du
1 sibling, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2013-10-10 14:11 UTC (permalink / raw)
To: Fan Du, nhorman, steffen.klassert; +Cc: davem, netdev
On 10/10/2013 01:51 AM, Fan Du wrote:
> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
> and also IPsec is armed to protect sctp traffic, ugly things happened as
> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
> up and pack the 16bits result in the checksum field). The result is fail
> establishment of sctp communication.
>
> And I saw another point in this part of code, when IPsec is not armed,
> sctp communication is good, however setting setting CHECKSUM_PARTIAL will
> make xfrm_output compute dummy checksum values which will be overwritten by
> hardware lately.
>
> So this patch try to solve above two issues together.
>
> Signed-off-by: Fan Du <fan.du@windriver.com>
> ---
> note:
> igb/ixgbe hardware is not handy on my side, so just build test only.
>
> ---
> net/sctp/output.c | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 0ac3a65..f0b9cc5 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
> atomic_inc(&sk->sk_wmem_alloc);
> }
>
> +static int is_xfrm_armed(struct dst_entry *dst)
> +{
> +#ifdef CONFIG_XFRM
> + /* If dst->xfrm is valid, this skb needs to be transformed */
> + return dst->xfrm != NULL;
> +#else
> + return 0;
> +#endif
> +}
> +
> /* All packets are sent to the network through this function from
> * sctp_outq_tail().
> *
> @@ -536,20 +546,21 @@ int sctp_packet_transmit(struct sctp_packet *packet)
> * by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
> */
> if (!sctp_checksum_disable) {
> - if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
> + if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
> + is_xfrm_armed(dst)) {
> +
> __u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
>
> /* 3) Put the resultant value into the checksum field in the
> * common header, and leave the rest of the bits unchanged.
> */
> sh->checksum = sctp_end_cksum(crc32);
> - } else {
> - /* no need to seed pseudo checksum for SCTP */
> - nskb->ip_summed = CHECKSUM_PARTIAL;
> - nskb->csum_start = (skb_transport_header(nskb) -
> - nskb->head);
> - nskb->csum_offset = offsetof(struct sctphdr, checksum);
> - }
> + } else
> + /* Mark skb as CHECKSUM_UNNECESSARY to let hardware compute
> + * the checksum, and also avoid xfrm_output to do unceccessary
> + * checksum.
> + */
> + nskb->ip_summed = CHECKSUM_UNNECESSARY;
> }
In addition to what Niel said, the use of CHECKSUM_UNNECESSARY is
incorrect as it will cause the nic to not compute the checksum.
The checksum offload depends on the use of CHECKSUM_PARTIAL.
-vlad
>
> /* IP layer ECN support
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-10 13:11 ` Neil Horman
@ 2013-10-11 7:02 ` Fan Du
2013-10-11 7:05 ` [PATCHv2 1/2 ] " Fan Du
2013-10-11 7:08 ` [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set Fan Du
2 siblings, 0 replies; 13+ messages in thread
From: Fan Du @ 2013-10-11 7:02 UTC (permalink / raw)
To: Neil Horman, steffen.klassert; +Cc: vyasevich, davem, netdev
On 2013年10月10日 21:11, Neil Horman wrote:
> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
>> up and pack the 16bits result in the checksum field). The result is fail
>> establishment of sctp communication.
>>
> Shouldn't this be fixed in the xfrm code then? E.g. check the device features
> for SCTP checksum offloading and and skip the checksum during xfrm output if its
> available?
sigh... same as my first thought.
However why should xfrm_output check whether the skb is a SCTP one as well as whether
the associated dev is able to do hw SCTP crc32-c checksum in the first place, and then
call SCTP crc checksum value API to write the correct checksum *after* this skb has
leaven SCTP layer???
The checksum operation in xfrm_ouput fits TCP/UDP/IP layer checksum semantics, e.g.
calculate 16bits value by sum almost everything up. Make a SCTP specific exception
in this common path sound wired, as the fix can be done in SCTP codes easily with
minimum changes, so not any bit of consequence for non-SCTP traffic.
And If you/Vlad insist to do so as well as no objection from Steffen/David, I'm
happy to send this revised version as your suggested.
Anyway, I should have separated this patch for two which makes the issues more clear
for you and Vlad to review.
> Or am I missing something?
> Neil
>
>
--
浮沉随浪只记今朝笑
--fan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-10 14:11 ` [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Vlad Yasevich
@ 2013-10-11 7:02 ` Fan Du
0 siblings, 0 replies; 13+ messages in thread
From: Fan Du @ 2013-10-11 7:02 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: nhorman, steffen.klassert, davem, netdev
On 2013年10月10日 22:11, Vlad Yasevich wrote:
> On 10/10/2013 01:51 AM, Fan Du wrote:
>> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
>> up and pack the 16bits result in the checksum field). The result is fail
>> establishment of sctp communication.
>>
>> And I saw another point in this part of code, when IPsec is not armed,
>> sctp communication is good, however setting setting CHECKSUM_PARTIAL will
>> make xfrm_output compute dummy checksum values which will be overwritten by
>> hardware lately.
>>
>> So this patch try to solve above two issues together.
>>
>> Signed-off-by: Fan Du <fan.du@windriver.com>
>> ---
>> note:
>> igb/ixgbe hardware is not handy on my side, so just build test only.
>>
>> ---
>> net/sctp/output.c | 27 +++++++++++++++++++--------
>> 1 file changed, 19 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/sctp/output.c b/net/sctp/output.c
>> index 0ac3a65..f0b9cc5 100644
>> --- a/net/sctp/output.c
>> +++ b/net/sctp/output.c
>> @@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
>> atomic_inc(&sk->sk_wmem_alloc);
>> }
>>
>> +static int is_xfrm_armed(struct dst_entry *dst)
>> +{
>> +#ifdef CONFIG_XFRM
>> + /* If dst->xfrm is valid, this skb needs to be transformed */
>> + return dst->xfrm != NULL;
>> +#else
>> + return 0;
>> +#endif
>> +}
>> +
>> /* All packets are sent to the network through this function from
>> * sctp_outq_tail().
>> *
>> @@ -536,20 +546,21 @@ int sctp_packet_transmit(struct sctp_packet *packet)
>> * by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
>> */
>> if (!sctp_checksum_disable) {
>> - if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
>> + if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
>> + is_xfrm_armed(dst)) {
>> +
>> __u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
>>
>> /* 3) Put the resultant value into the checksum field in the
>> * common header, and leave the rest of the bits unchanged.
>> */
>> sh->checksum = sctp_end_cksum(crc32);
>> - } else {
>> - /* no need to seed pseudo checksum for SCTP */
>> - nskb->ip_summed = CHECKSUM_PARTIAL;
>> - nskb->csum_start = (skb_transport_header(nskb) -
>> - nskb->head);
>> - nskb->csum_offset = offsetof(struct sctphdr, checksum);
>> - }
>> + } else
>> + /* Mark skb as CHECKSUM_UNNECESSARY to let hardware compute
>> + * the checksum, and also avoid xfrm_output to do unceccessary
>> + * checksum.
>> + */
>> + nskb->ip_summed = CHECKSUM_UNNECESSARY;
>> }
>
> In addition to what Niel said, the use of CHECKSUM_UNNECESSARY is
> incorrect as it will cause the nic to not compute the checksum.
> The checksum offload depends on the use of CHECKSUM_PARTIAL.
My bad, my head is cramed with IPsec recently :( We should fix this in ipv4/v6 output path.
> -vlad
>
>
>>
>> /* IP layer ECN support
>>
>
>
--
浮沉随浪只记今朝笑
--fan
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCHv2 1/2 ] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-10 13:11 ` Neil Horman
2013-10-11 7:02 ` Fan Du
@ 2013-10-11 7:05 ` Fan Du
2013-10-11 14:04 ` Vlad Yasevich
2013-10-11 7:08 ` [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set Fan Du
2 siblings, 1 reply; 13+ messages in thread
From: Fan Du @ 2013-10-11 7:05 UTC (permalink / raw)
To: Neil Horman, vyasevich, steffen.klassert, davem; +Cc: netdev
On 2013年10月10日 21:11, Neil Horman wrote:
> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
>> up and pack the 16bits result in the checksum field). The result is fail
>> establishment of sctp communication.
>>
> Shouldn't this be fixed in the xfrm code then? E.g. check the device features
> for SCTP checksum offloading and and skip the checksum during xfrm output if its
> available?
>
> Or am I missing something?
> Neil
>
>
From 014276de0877f11d46e1704114a7d91f19221a63 Mon Sep 17 00:00:00 2001
From: Fan Du <fan.du@windriver.com>
Date: Fri, 11 Oct 2013 14:24:33 +0800
Subject: [PATCH 1/2] {xfrm, sctp} Stick to software crc32 even if hardware is
capable of that
igb/ixgbe have hardware sctp checksum support, when this feature is enabled
and also IPsec is armed to protect sctp traffic, ugly things happened as
xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
up and pack the 16bits result in the checksum field). The result is fail
establishment of sctp communication.
Signed-off-by: Fan Du <fan.du@windriver.com>
---
v2:
Leave ip_summed as CHECKSUM_PARTIAL as before, the second patch will fix this.
---
net/sctp/output.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 0ac3a65..6de6402 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
atomic_inc(&sk->sk_wmem_alloc);
}
+static int is_xfrm_armed(struct dst_entry *dst)
+{
+#ifdef CONFIG_XFRM
+ /* If dst->xfrm is valid, this skb needs to be transformed */
+ return dst->xfrm != NULL;
+#else
+ return 0;
+#endif
+}
+
/* All packets are sent to the network through this function from
* sctp_outq_tail().
*
@@ -536,7 +546,9 @@ int sctp_packet_transmit(struct sctp_packet *packet)
* by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
*/
if (!sctp_checksum_disable) {
- if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
+ if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
+ is_xfrm_armed(dst)) {
+
__u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
/* 3) Put the resultant value into the checksum field in the
--
1.7.9.5
--
浮沉随浪只记今朝笑
--fan
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set
2013-10-10 13:11 ` Neil Horman
2013-10-11 7:02 ` Fan Du
2013-10-11 7:05 ` [PATCHv2 1/2 ] " Fan Du
@ 2013-10-11 7:08 ` Fan Du
2013-10-11 14:25 ` Vlad Yasevich
2 siblings, 1 reply; 13+ messages in thread
From: Fan Du @ 2013-10-11 7:08 UTC (permalink / raw)
To: Neil Horman, vyasevich, steffen.klassert, davem; +Cc: netdev
On 2013年10月10日 21:11, Neil Horman wrote:
> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
>> up and pack the 16bits result in the checksum field). The result is fail
>> establishment of sctp communication.
>>
> Shouldn't this be fixed in the xfrm code then? E.g. check the device features
> for SCTP checksum offloading and and skip the checksum during xfrm output if its
> available?
>
> Or am I missing something?
> Neil
>
>
From cff25810910603ff991f0c56441d6de8dc33a822 Mon Sep 17 00:00:00 2001
From: Fan Du <fan.du@windriver.com>
Date: Fri, 11 Oct 2013 14:31:57 +0800
Subject: [PATCH 2/2] Don't compute checksum value for SCTP skb with
CHECKSUM_PARTIAL set
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
IPsec is not enabled in this scenario:
SCTP skb set CHECKSUM_PARTIAL to indicate hardware is capable of doing
SCTP checksum(crc32-c) scoping the whole SCTP packet range. However when
such kind of skb is delivered through IPv4/v6 output handler, IPv4/v6 stack
interpret CHECKSUM_PARTIAL by calling skb_checksum_help to compute 16bits
checksum value by summing everything up, the whole SCTP packet in software
manner! After this skb reach NIC, after hardware doing its SCTP checking
business, a crc32-c value will overwrite the value IPv4/v6 stack computed
before.
This patch solves this by introducing skb_is_sctpv4/6 to optimize such case.
Signed-off-by: Fan Du <fan.du@windriver.com>
---
v2:
Rework this problem by introducing skb_is_scktv4/6
---
include/linux/ip.h | 5 +++++
include/linux/ipv6.h | 6 ++++++
include/linux/skbuff.h | 1 -
net/core/skbuff.c | 1 +
net/ipv4/ip_output.c | 4 +++-
net/ipv6/ip6_output.c | 1 +
net/xfrm/xfrm_output.c | 20 +++++++++++++++-----
7 files changed, 31 insertions(+), 7 deletions(-)
diff --git a/include/linux/ip.h b/include/linux/ip.h
index 492bc65..f556292 100644
--- a/include/linux/ip.h
+++ b/include/linux/ip.h
@@ -19,6 +19,7 @@
#include <linux/skbuff.h>
#include <uapi/linux/ip.h>
+#include <uapi/linux/in.h>
static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
{
@@ -34,4 +35,8 @@ static inline struct iphdr *ipip_hdr(const struct sk_buff *skb)
{
return (struct iphdr *)skb_transport_header(skb);
}
+static inline int skb_is_sctpv4(const struct sk_buff *skb)
+{
+ return ip_hdr(skb)->protocol == IPPROTO_SCTP;
+}
#endif /* _LINUX_IP_H */
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 28ea384..6e17c04 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -2,6 +2,7 @@
#define _IPV6_H
#include <uapi/linux/ipv6.h>
+#include <uapi/linux/in.h>
#define ipv6_optlen(p) (((p)->hdrlen+1) << 3)
/*
@@ -387,4 +388,9 @@ static inline struct raw6_sock *raw6_sk(const struct sock *sk)
((__sk)->sk_bound_dev_if == (__dif))) && \
net_eq(sock_net(__sk), (__net)))
+static inline int skb_is_sctpv6(const struct sk_buff *skb)
+{
+ return ipv6_hdr(skb)->nexthdr == IPPROTO_SCTP;
+}
+
#endif /* _IPV6_H */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ddb48d..b36d0cc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2393,7 +2393,6 @@ extern void skb_split(struct sk_buff *skb,
extern int skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
int shiftlen);
extern void skb_scrub_packet(struct sk_buff *skb, bool xnet);
-
extern struct sk_buff *skb_segment(struct sk_buff *skb,
netdev_features_t features);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d81cff1..54d6172 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3526,3 +3526,4 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
nf_reset_trace(skb);
}
EXPORT_SYMBOL_GPL(skb_scrub_packet);
+
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index a04d872..8676b2c 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -587,7 +587,9 @@ slow_path_clean:
slow_path:
/* for offloaded checksums cleanup checksum before fragmentation */
- if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
+ if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
+ !skb_is_sctpv4(skb) &&
+ skb_checksum_help(skb))
goto fail;
iph = ip_hdr(skb);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3a692d5..9b27d95 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -671,6 +671,7 @@ slow_path_clean:
slow_path:
if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
+ !skb_is_sctpv6(skb) &&
skb_checksum_help(skb))
goto fail;
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 3bb2cdc..ddef94a 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -180,6 +180,14 @@ static int xfrm_output_gso(struct sk_buff *skb)
return 0;
}
+static int skb_is_sctp(const struct sk_buff *skb)
+{
+ if (skb->protocol == __constant_htons(ETH_P_IP))
+ return skb_is_sctpv4(skb);
+ else
+ return skb_is_sctpv6(skb);
+}
+
int xfrm_output(struct sk_buff *skb)
{
struct net *net = dev_net(skb_dst(skb)->dev);
@@ -189,11 +197,13 @@ int xfrm_output(struct sk_buff *skb)
return xfrm_output_gso(skb);
if (skb->ip_summed == CHECKSUM_PARTIAL) {
- err = skb_checksum_help(skb);
- if (err) {
- XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
- kfree_skb(skb);
- return err;
+ if (!skb_is_sctp(skb)) {
+ err = skb_checksum_help(skb);
+ if (err) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
+ kfree_skb(skb);
+ return err;
+ }
}
}
--
1.7.9.5
--
浮沉随浪只记今朝笑
--fan
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCHv2 1/2 ] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-11 7:05 ` [PATCHv2 1/2 ] " Fan Du
@ 2013-10-11 14:04 ` Vlad Yasevich
2013-10-11 17:12 ` Vlad Yasevich
0 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2013-10-11 14:04 UTC (permalink / raw)
To: Fan Du, Neil Horman, steffen.klassert, davem; +Cc: netdev
On 10/11/2013 03:05 AM, Fan Du wrote:
>
>
> On 2013年10月10日 21:11, Neil Horman wrote:
>> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>>> igb/ixgbe have hardware sctp checksum support, when this feature is
>>> enabled
>>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum
>>> every thing
>>> up and pack the 16bits result in the checksum field). The result is fail
>>> establishment of sctp communication.
>>>
>> Shouldn't this be fixed in the xfrm code then? E.g. check the device
>> features
>> for SCTP checksum offloading and and skip the checksum during xfrm
>> output if its
>> available?
>>
>> Or am I missing something?
>> Neil
>>
>>
>
>
> From 014276de0877f11d46e1704114a7d91f19221a63 Mon Sep 17 00:00:00 2001
> From: Fan Du <fan.du@windriver.com>
> Date: Fri, 11 Oct 2013 14:24:33 +0800
> Subject: [PATCH 1/2] {xfrm, sctp} Stick to software crc32 even if
> hardware is
> capable of that
>
> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
> and also IPsec is armed to protect sctp traffic, ugly things happened as
> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every
> thing
> up and pack the 16bits result in the checksum field). The result is fail
> establishment of sctp communication.
>
> Signed-off-by: Fan Du <fan.du@windriver.com>
> ---
> v2:
> Leave ip_summed as CHECKSUM_PARTIAL as before, the second patch will
> fix this.
>
> ---
> net/sctp/output.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 0ac3a65..6de6402 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff
> *skb, struct sock *sk)
> atomic_inc(&sk->sk_wmem_alloc);
> }
>
> +static int is_xfrm_armed(struct dst_entry *dst)
> +{
> +#ifdef CONFIG_XFRM
> + /* If dst->xfrm is valid, this skb needs to be transformed */
> + return dst->xfrm != NULL;
> +#else
> + return 0;
> +#endif
> +}
> +
I would really prefer to have an accessor function to dst->xfrm, but
I see that everyone codes it inside the #ifdef. Gack.
> /* All packets are sent to the network through this function from
> * sctp_outq_tail().
> *
> @@ -536,7 +546,9 @@ int sctp_packet_transmit(struct sctp_packet *packet)
> * by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
> */
> if (!sctp_checksum_disable) {
> - if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
> + if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
> + is_xfrm_armed(dst)) {
> +
> __u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
>
> /* 3) Put the resultant value into the checksum field in the
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set
2013-10-11 7:08 ` [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set Fan Du
@ 2013-10-11 14:25 ` Vlad Yasevich
2013-10-12 9:45 ` Fan Du
0 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2013-10-11 14:25 UTC (permalink / raw)
To: Fan Du, Neil Horman, steffen.klassert, davem; +Cc: netdev
On 10/11/2013 03:08 AM, Fan Du wrote:
>
>
> On 2013年10月10日 21:11, Neil Horman wrote:
>> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>>> igb/ixgbe have hardware sctp checksum support, when this feature is
>>> enabled
>>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum
>>> every thing
>>> up and pack the 16bits result in the checksum field). The result is fail
>>> establishment of sctp communication.
>>>
>> Shouldn't this be fixed in the xfrm code then? E.g. check the device
>> features
>> for SCTP checksum offloading and and skip the checksum during xfrm
>> output if its
>> available?
>>
>> Or am I missing something?
>> Neil
>>
>>
>
>
> From cff25810910603ff991f0c56441d6de8dc33a822 Mon Sep 17 00:00:00 2001
> From: Fan Du <fan.du@windriver.com>
> Date: Fri, 11 Oct 2013 14:31:57 +0800
> Subject: [PATCH 2/2] Don't compute checksum value for SCTP skb with
> CHECKSUM_PARTIAL set
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> IPsec is not enabled in this scenario:
>
> SCTP skb set CHECKSUM_PARTIAL to indicate hardware is capable of doing
> SCTP checksum(crc32-c) scoping the whole SCTP packet range. However when
> such kind of skb is delivered through IPv4/v6 output handler, IPv4/v6 stack
> interpret CHECKSUM_PARTIAL by calling skb_checksum_help to compute 16bits
> checksum value by summing everything up, the whole SCTP packet in software
> manner! After this skb reach NIC, after hardware doing its SCTP checking
> business, a crc32-c value will overwrite the value IPv4/v6 stack computed
> before.
>
> This patch solves this by introducing skb_is_sctpv4/6 to optimize such
> case.
>
> Signed-off-by: Fan Du <fan.du@windriver.com>
> ---
> v2:
> Rework this problem by introducing skb_is_scktv4/6
>
> ---
> include/linux/ip.h | 5 +++++
> include/linux/ipv6.h | 6 ++++++
> include/linux/skbuff.h | 1 -
> net/core/skbuff.c | 1 +
> net/ipv4/ip_output.c | 4 +++-
> net/ipv6/ip6_output.c | 1 +
> net/xfrm/xfrm_output.c | 20 +++++++++++++++-----
> 7 files changed, 31 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/ip.h b/include/linux/ip.h
> index 492bc65..f556292 100644
> --- a/include/linux/ip.h
> +++ b/include/linux/ip.h
> @@ -19,6 +19,7 @@
>
> #include <linux/skbuff.h>
> #include <uapi/linux/ip.h>
> +#include <uapi/linux/in.h>
>
> static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
> {
> @@ -34,4 +35,8 @@ static inline struct iphdr *ipip_hdr(const struct
> sk_buff *skb)
> {
> return (struct iphdr *)skb_transport_header(skb);
> }
> +static inline int skb_is_sctpv4(const struct sk_buff *skb)
> +{
> + return ip_hdr(skb)->protocol == IPPROTO_SCTP;
> +}
> #endif /* _LINUX_IP_H */
> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> index 28ea384..6e17c04 100644
> --- a/include/linux/ipv6.h
> +++ b/include/linux/ipv6.h
> @@ -2,6 +2,7 @@
> #define _IPV6_H
>
> #include <uapi/linux/ipv6.h>
> +#include <uapi/linux/in.h>
>
> #define ipv6_optlen(p) (((p)->hdrlen+1) << 3)
> /*
> @@ -387,4 +388,9 @@ static inline struct raw6_sock *raw6_sk(const struct
> sock *sk)
> ((__sk)->sk_bound_dev_if == (__dif))) && \
> net_eq(sock_net(__sk), (__net)))
>
> +static inline int skb_is_sctpv6(const struct sk_buff *skb)
> +{
> + return ipv6_hdr(skb)->nexthdr == IPPROTO_SCTP;
> +}
> +
> #endif /* _IPV6_H */
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 2ddb48d..b36d0cc 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2393,7 +2393,6 @@ extern void skb_split(struct sk_buff *skb,
> extern int skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
> int shiftlen);
> extern void skb_scrub_packet(struct sk_buff *skb, bool xnet);
> -
> extern struct sk_buff *skb_segment(struct sk_buff *skb,
> netdev_features_t features);
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index d81cff1..54d6172 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3526,3 +3526,4 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
> nf_reset_trace(skb);
> }
> EXPORT_SYMBOL_GPL(skb_scrub_packet);
> +
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index a04d872..8676b2c 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -587,7 +587,9 @@ slow_path_clean:
>
> slow_path:
> /* for offloaded checksums cleanup checksum before fragmentation */
> - if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
> + if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
> + !skb_is_sctpv4(skb) &&
> + skb_checksum_help(skb))
> goto fail;
> iph = ip_hdr(skb);
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 3a692d5..9b27d95 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -671,6 +671,7 @@ slow_path_clean:
>
> slow_path:
> if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
> + !skb_is_sctpv6(skb) &&
> skb_checksum_help(skb))
> goto fail;
>
No, this isn't right. This is a case where IP fragmentation is
required, and the above code will cause SCTP checksum to not be
computed.
Looks like SCTP needs to compute the checksum in the case where
skb will be fragmented.
An alternative, that will also allow us to get rid of patch 1
in the serices is to have a checksum handler offload function
that can be used to compute checksum in this case.
-vlad
> diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
> index 3bb2cdc..ddef94a 100644
> --- a/net/xfrm/xfrm_output.c
> +++ b/net/xfrm/xfrm_output.c
> @@ -180,6 +180,14 @@ static int xfrm_output_gso(struct sk_buff *skb)
> return 0;
> }
>
> +static int skb_is_sctp(const struct sk_buff *skb)
> +{
> + if (skb->protocol == __constant_htons(ETH_P_IP))
> + return skb_is_sctpv4(skb);
> + else
> + return skb_is_sctpv6(skb);
> +}
> +
> int xfrm_output(struct sk_buff *skb)
> {
> struct net *net = dev_net(skb_dst(skb)->dev);
> @@ -189,11 +197,13 @@ int xfrm_output(struct sk_buff *skb)
> return xfrm_output_gso(skb);
>
> if (skb->ip_summed == CHECKSUM_PARTIAL) {
> - err = skb_checksum_help(skb);
> - if (err) {
> - XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> - kfree_skb(skb);
> - return err;
> + if (!skb_is_sctp(skb)) {
> + err = skb_checksum_help(skb);
> + if (err) {
> + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> + kfree_skb(skb);
> + return err;
> + }
> }
> }
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHv2 1/2 ] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
2013-10-11 14:04 ` Vlad Yasevich
@ 2013-10-11 17:12 ` Vlad Yasevich
0 siblings, 0 replies; 13+ messages in thread
From: Vlad Yasevich @ 2013-10-11 17:12 UTC (permalink / raw)
To: Fan Du, Neil Horman, steffen.klassert, davem; +Cc: netdev
On 10/11/2013 10:04 AM, Vlad Yasevich wrote:
> On 10/11/2013 03:05 AM, Fan Du wrote:
>>
>>
>> On 2013年10月10日 21:11, Neil Horman wrote:
>>> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>>>> igb/ixgbe have hardware sctp checksum support, when this feature is
>>>> enabled
>>>> and also IPsec is armed to protect sctp traffic, ugly things
>>>> happened as
>>>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum
>>>> every thing
>>>> up and pack the 16bits result in the checksum field). The result is
>>>> fail
>>>> establishment of sctp communication.
>>>>
>>> Shouldn't this be fixed in the xfrm code then? E.g. check the device
>>> features
>>> for SCTP checksum offloading and and skip the checksum during xfrm
>>> output if its
>>> available?
>>>
>>> Or am I missing something?
>>> Neil
>>>
>>>
>>
>>
>> From 014276de0877f11d46e1704114a7d91f19221a63 Mon Sep 17 00:00:00 2001
>> From: Fan Du <fan.du@windriver.com>
>> Date: Fri, 11 Oct 2013 14:24:33 +0800
>> Subject: [PATCH 1/2] {xfrm, sctp} Stick to software crc32 even if
>> hardware is
>> capable of that
>>
>> igb/ixgbe have hardware sctp checksum support, when this feature is
>> enabled
>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every
>> thing
>> up and pack the 16bits result in the checksum field). The result is fail
>> establishment of sctp communication.
>>
>> Signed-off-by: Fan Du <fan.du@windriver.com>
>> ---
>> v2:
>> Leave ip_summed as CHECKSUM_PARTIAL as before, the second patch will
>> fix this.
>>
>> ---
>> net/sctp/output.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sctp/output.c b/net/sctp/output.c
>> index 0ac3a65..6de6402 100644
>> --- a/net/sctp/output.c
>> +++ b/net/sctp/output.c
>> @@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff
>> *skb, struct sock *sk)
>> atomic_inc(&sk->sk_wmem_alloc);
>> }
>>
>> +static int is_xfrm_armed(struct dst_entry *dst)
>> +{
>> +#ifdef CONFIG_XFRM
>> + /* If dst->xfrm is valid, this skb needs to be transformed */
>> + return dst->xfrm != NULL;
>> +#else
>> + return 0;
>> +#endif
>> +}
>> +
>
> I would really prefer to have an accessor function to dst->xfrm, but
> I see that everyone codes it inside the #ifdef. Gack.
>
>> /* All packets are sent to the network through this function from
>> * sctp_outq_tail().
>> *
>> @@ -536,7 +546,9 @@ int sctp_packet_transmit(struct sctp_packet *packet)
>> * by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
>> */
>> if (!sctp_checksum_disable) {
>> - if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
>> + if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
>> + is_xfrm_armed(dst)) {
>> +
>> __u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
>>
>> /* 3) Put the resultant value into the checksum field in
>> the
>
> Acked-by: Vlad Yasevich <vyasevich@gmail.com>
This patch doesn't seem to apply to net.git.
-vlad
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set
2013-10-11 14:25 ` Vlad Yasevich
@ 2013-10-12 9:45 ` Fan Du
2013-10-12 13:06 ` Vlad Yasevich
0 siblings, 1 reply; 13+ messages in thread
From: Fan Du @ 2013-10-12 9:45 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: Neil Horman, steffen.klassert, davem, netdev
On 2013年10月11日 22:25, Vlad Yasevich wrote:
> On 10/11/2013 03:08 AM, Fan Du wrote:
>>
>>
>> On 2013年10月10日 21:11, Neil Horman wrote:
>>> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>>>> igb/ixgbe have hardware sctp checksum support, when this feature is
>>>> enabled
>>>> and also IPsec is armed to protect sctp traffic, ugly things happened as
>>>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum
>>>> every thing
>>>> up and pack the 16bits result in the checksum field). The result is fail
>>>> establishment of sctp communication.
>>>>
>>> Shouldn't this be fixed in the xfrm code then? E.g. check the device
>>> features
>>> for SCTP checksum offloading and and skip the checksum during xfrm
>>> output if its
>>> available?
>>>
>>> Or am I missing something?
>>> Neil
>>>
>>>
>>
>>
>> From cff25810910603ff991f0c56441d6de8dc33a822 Mon Sep 17 00:00:00 2001
>> From: Fan Du <fan.du@windriver.com>
>> Date: Fri, 11 Oct 2013 14:31:57 +0800
>> Subject: [PATCH 2/2] Don't compute checksum value for SCTP skb with
>> CHECKSUM_PARTIAL set
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> IPsec is not enabled in this scenario:
>>
>> SCTP skb set CHECKSUM_PARTIAL to indicate hardware is capable of doing
>> SCTP checksum(crc32-c) scoping the whole SCTP packet range. However when
>> such kind of skb is delivered through IPv4/v6 output handler, IPv4/v6 stack
>> interpret CHECKSUM_PARTIAL by calling skb_checksum_help to compute 16bits
>> checksum value by summing everything up, the whole SCTP packet in software
>> manner! After this skb reach NIC, after hardware doing its SCTP checking
>> business, a crc32-c value will overwrite the value IPv4/v6 stack computed
>> before.
>>
>> This patch solves this by introducing skb_is_sctpv4/6 to optimize such
>> case.
>>
>> Signed-off-by: Fan Du <fan.du@windriver.com>
>> ---
>> v2:
>> Rework this problem by introducing skb_is_scktv4/6
>>
>> ---
>> include/linux/ip.h | 5 +++++
>> include/linux/ipv6.h | 6 ++++++
>> include/linux/skbuff.h | 1 -
>> net/core/skbuff.c | 1 +
>> net/ipv4/ip_output.c | 4 +++-
>> net/ipv6/ip6_output.c | 1 +
>> net/xfrm/xfrm_output.c | 20 +++++++++++++++-----
>> 7 files changed, 31 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/ip.h b/include/linux/ip.h
>> index 492bc65..f556292 100644
>> --- a/include/linux/ip.h
>> +++ b/include/linux/ip.h
>> @@ -19,6 +19,7 @@
>>
>> #include <linux/skbuff.h>
>> #include <uapi/linux/ip.h>
>> +#include <uapi/linux/in.h>
>>
>> static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
>> {
>> @@ -34,4 +35,8 @@ static inline struct iphdr *ipip_hdr(const struct
>> sk_buff *skb)
>> {
>> return (struct iphdr *)skb_transport_header(skb);
>> }
>> +static inline int skb_is_sctpv4(const struct sk_buff *skb)
>> +{
>> + return ip_hdr(skb)->protocol == IPPROTO_SCTP;
>> +}
>> #endif /* _LINUX_IP_H */
>> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
>> index 28ea384..6e17c04 100644
>> --- a/include/linux/ipv6.h
>> +++ b/include/linux/ipv6.h
>> @@ -2,6 +2,7 @@
>> #define _IPV6_H
>>
>> #include <uapi/linux/ipv6.h>
>> +#include <uapi/linux/in.h>
>>
>> #define ipv6_optlen(p) (((p)->hdrlen+1) << 3)
>> /*
>> @@ -387,4 +388,9 @@ static inline struct raw6_sock *raw6_sk(const struct
>> sock *sk)
>> ((__sk)->sk_bound_dev_if == (__dif))) && \
>> net_eq(sock_net(__sk), (__net)))
>>
>> +static inline int skb_is_sctpv6(const struct sk_buff *skb)
>> +{
>> + return ipv6_hdr(skb)->nexthdr == IPPROTO_SCTP;
>> +}
>> +
>> #endif /* _IPV6_H */
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 2ddb48d..b36d0cc 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -2393,7 +2393,6 @@ extern void skb_split(struct sk_buff *skb,
>> extern int skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
>> int shiftlen);
>> extern void skb_scrub_packet(struct sk_buff *skb, bool xnet);
>> -
>> extern struct sk_buff *skb_segment(struct sk_buff *skb,
>> netdev_features_t features);
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index d81cff1..54d6172 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -3526,3 +3526,4 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
>> nf_reset_trace(skb);
>> }
>> EXPORT_SYMBOL_GPL(skb_scrub_packet);
>> +
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index a04d872..8676b2c 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -587,7 +587,9 @@ slow_path_clean:
>>
>> slow_path:
>> /* for offloaded checksums cleanup checksum before fragmentation */
>> - if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
>> + if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
>> + !skb_is_sctpv4(skb) &&
>> + skb_checksum_help(skb))
>> goto fail;
>> iph = ip_hdr(skb);
>>
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index 3a692d5..9b27d95 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -671,6 +671,7 @@ slow_path_clean:
>>
>> slow_path:
>> if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
>> + !skb_is_sctpv6(skb) &&
>> skb_checksum_help(skb))
>> goto fail;
>>
>
> No, this isn't right. This is a case where IP fragmentation is
> required, and the above code will cause SCTP checksum to not be
> computed.
Ok, I got my ball, ten bucks bet this correct ;)
IPv4:
after skb reach fragmentation part, CHECKSUM_PARTIAL denotes
L4 layer protocol need to be checksummed, then IPv4 checksum is
recomputed for each fragmented IPv4 packet.
IPv6:
Here IPv6 doesn't need checksum for its header, again
CHECKSUM_PARTIAL denotes L4 layer protocol need to be checksummed.
So all in all, this is the right place to distinguish SCTP skb out,
and skip checksum operation as hw does it thereafter.
Q.E.D.
> Looks like SCTP needs to compute the checksum in the case where
> skb will be fragmented.
>
> An alternative, that will also allow us to get rid of patch 1
> in the serices is to have a checksum handler offload function
> that can be used to compute checksum in this case.
>
> -vlad
>
>> diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
>> index 3bb2cdc..ddef94a 100644
>> --- a/net/xfrm/xfrm_output.c
>> +++ b/net/xfrm/xfrm_output.c
>> @@ -180,6 +180,14 @@ static int xfrm_output_gso(struct sk_buff *skb)
>> return 0;
>> }
>>
>> +static int skb_is_sctp(const struct sk_buff *skb)
>> +{
>> + if (skb->protocol == __constant_htons(ETH_P_IP))
>> + return skb_is_sctpv4(skb);
>> + else
>> + return skb_is_sctpv6(skb);
>> +}
>> +
>> int xfrm_output(struct sk_buff *skb)
>> {
>> struct net *net = dev_net(skb_dst(skb)->dev);
>> @@ -189,11 +197,13 @@ int xfrm_output(struct sk_buff *skb)
>> return xfrm_output_gso(skb);
>>
>> if (skb->ip_summed == CHECKSUM_PARTIAL) {
>> - err = skb_checksum_help(skb);
>> - if (err) {
>> - XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
>> - kfree_skb(skb);
>> - return err;
>> + if (!skb_is_sctp(skb)) {
>> + err = skb_checksum_help(skb);
>> + if (err) {
>> + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
>> + kfree_skb(skb);
>> + return err;
>> + }
>> }
>> }
>>
>
>
--
浮沉随浪只记今朝笑
--fan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set
2013-10-12 9:45 ` Fan Du
@ 2013-10-12 13:06 ` Vlad Yasevich
2013-10-14 7:16 ` Fan Du
0 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2013-10-12 13:06 UTC (permalink / raw)
To: Fan Du; +Cc: Neil Horman, steffen.klassert, davem, netdev
Fan Du <fan.du@windriver.com> wrote:
>
>
>On 2013年10月11日 22:25, Vlad Yasevich wrote:
>> On 10/11/2013 03:08 AM, Fan Du wrote:
>>>
>>>
>>> On 2013年10月10日 21:11, Neil Horman wrote:
>>>> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>>>>> igb/ixgbe have hardware sctp checksum support, when this feature
>is
>>>>> enabled
>>>>> and also IPsec is armed to protect sctp traffic, ugly things
>happened as
>>>>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum
>>>>> every thing
>>>>> up and pack the 16bits result in the checksum field). The result
>is fail
>>>>> establishment of sctp communication.
>>>>>
>>>> Shouldn't this be fixed in the xfrm code then? E.g. check the
>device
>>>> features
>>>> for SCTP checksum offloading and and skip the checksum during xfrm
>>>> output if its
>>>> available?
>>>>
>>>> Or am I missing something?
>>>> Neil
>>>>
>>>>
>>>
>>>
>>> From cff25810910603ff991f0c56441d6de8dc33a822 Mon Sep 17 00:00:00
>2001
>>> From: Fan Du <fan.du@windriver.com>
>>> Date: Fri, 11 Oct 2013 14:31:57 +0800
>>> Subject: [PATCH 2/2] Don't compute checksum value for SCTP skb with
>>> CHECKSUM_PARTIAL set
>>> MIME-Version: 1.0
>>> Content-Type: text/plain; charset=UTF-8
>>> Content-Transfer-Encoding: 8bit
>>>
>>> IPsec is not enabled in this scenario:
>>>
>>> SCTP skb set CHECKSUM_PARTIAL to indicate hardware is capable of
>doing
>>> SCTP checksum(crc32-c) scoping the whole SCTP packet range. However
>when
>>> such kind of skb is delivered through IPv4/v6 output handler,
>IPv4/v6 stack
>>> interpret CHECKSUM_PARTIAL by calling skb_checksum_help to compute
>16bits
>>> checksum value by summing everything up, the whole SCTP packet in
>software
>>> manner! After this skb reach NIC, after hardware doing its SCTP
>checking
>>> business, a crc32-c value will overwrite the value IPv4/v6 stack
>computed
>>> before.
>>>
>>> This patch solves this by introducing skb_is_sctpv4/6 to optimize
>such
>>> case.
>>>
>>> Signed-off-by: Fan Du <fan.du@windriver.com>
>>> ---
>>> v2:
>>> Rework this problem by introducing skb_is_scktv4/6
>>>
>>> ---
>>> include/linux/ip.h | 5 +++++
>>> include/linux/ipv6.h | 6 ++++++
>>> include/linux/skbuff.h | 1 -
>>> net/core/skbuff.c | 1 +
>>> net/ipv4/ip_output.c | 4 +++-
>>> net/ipv6/ip6_output.c | 1 +
>>> net/xfrm/xfrm_output.c | 20 +++++++++++++++-----
>>> 7 files changed, 31 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/linux/ip.h b/include/linux/ip.h
>>> index 492bc65..f556292 100644
>>> --- a/include/linux/ip.h
>>> +++ b/include/linux/ip.h
>>> @@ -19,6 +19,7 @@
>>>
>>> #include <linux/skbuff.h>
>>> #include <uapi/linux/ip.h>
>>> +#include <uapi/linux/in.h>
>>>
>>> static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
>>> {
>>> @@ -34,4 +35,8 @@ static inline struct iphdr *ipip_hdr(const struct
>>> sk_buff *skb)
>>> {
>>> return (struct iphdr *)skb_transport_header(skb);
>>> }
>>> +static inline int skb_is_sctpv4(const struct sk_buff *skb)
>>> +{
>>> + return ip_hdr(skb)->protocol == IPPROTO_SCTP;
>>> +}
>>> #endif /* _LINUX_IP_H */
>>> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
>>> index 28ea384..6e17c04 100644
>>> --- a/include/linux/ipv6.h
>>> +++ b/include/linux/ipv6.h
>>> @@ -2,6 +2,7 @@
>>> #define _IPV6_H
>>>
>>> #include <uapi/linux/ipv6.h>
>>> +#include <uapi/linux/in.h>
>>>
>>> #define ipv6_optlen(p) (((p)->hdrlen+1) << 3)
>>> /*
>>> @@ -387,4 +388,9 @@ static inline struct raw6_sock *raw6_sk(const
>struct
>>> sock *sk)
>>> ((__sk)->sk_bound_dev_if == (__dif))) && \
>>> net_eq(sock_net(__sk), (__net)))
>>>
>>> +static inline int skb_is_sctpv6(const struct sk_buff *skb)
>>> +{
>>> + return ipv6_hdr(skb)->nexthdr == IPPROTO_SCTP;
>>> +}
>>> +
>>> #endif /* _IPV6_H */
>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>> index 2ddb48d..b36d0cc 100644
>>> --- a/include/linux/skbuff.h
>>> +++ b/include/linux/skbuff.h
>>> @@ -2393,7 +2393,6 @@ extern void skb_split(struct sk_buff *skb,
>>> extern int skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
>>> int shiftlen);
>>> extern void skb_scrub_packet(struct sk_buff *skb, bool xnet);
>>> -
>>> extern struct sk_buff *skb_segment(struct sk_buff *skb,
>>> netdev_features_t features);
>>>
>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>> index d81cff1..54d6172 100644
>>> --- a/net/core/skbuff.c
>>> +++ b/net/core/skbuff.c
>>> @@ -3526,3 +3526,4 @@ void skb_scrub_packet(struct sk_buff *skb,
>bool xnet)
>>> nf_reset_trace(skb);
>>> }
>>> EXPORT_SYMBOL_GPL(skb_scrub_packet);
>>> +
>>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>>> index a04d872..8676b2c 100644
>>> --- a/net/ipv4/ip_output.c
>>> +++ b/net/ipv4/ip_output.c
>>> @@ -587,7 +587,9 @@ slow_path_clean:
>>>
>>> slow_path:
>>> /* for offloaded checksums cleanup checksum before fragmentation */
>>> - if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
>skb_checksum_help(skb))
>>> + if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
>>> + !skb_is_sctpv4(skb) &&
>>> + skb_checksum_help(skb))
>>> goto fail;
>>> iph = ip_hdr(skb);
>>>
>>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>>> index 3a692d5..9b27d95 100644
>>> --- a/net/ipv6/ip6_output.c
>>> +++ b/net/ipv6/ip6_output.c
>>> @@ -671,6 +671,7 @@ slow_path_clean:
>>>
>>> slow_path:
>>> if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
>>> + !skb_is_sctpv6(skb) &&
>>> skb_checksum_help(skb))
>>> goto fail;
>>>
>>
>> No, this isn't right. This is a case where IP fragmentation is
>> required, and the above code will cause SCTP checksum to not be
>> computed.
>
>Ok, I got my ball, ten bucks bet this correct ;)
>
>IPv4:
>after skb reach fragmentation part, CHECKSUM_PARTIAL denotes
>L4 layer protocol need to be checksummed, then IPv4 checksum is
>recomputed for each fragmented IPv4 packet.
>
>IPv6:
>Here IPv6 doesn't need checksum for its header, again
>CHECKSUM_PARTIAL denotes L4 layer protocol need to be checksummed.
>
>So all in all, this is the right place to distinguish SCTP skb out,
>and skip checksum operation as hw does it thereafter.
>
How does HW compute SCTP checksum when the data is split between skb?
Each skb will be submitted separately to the HW.
I think we need to fall back to SW checksum when packer will be fragmented.
-vlad
>Q.E.D.
>
>
>> Looks like SCTP needs to compute the checksum in the case where
>> skb will be fragmented.
>>
>> An alternative, that will also allow us to get rid of patch 1
>> in the serices is to have a checksum handler offload function
>> that can be used to compute checksum in this case.
>>
>> -vlad
>>
>>> diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
>>> index 3bb2cdc..ddef94a 100644
>>> --- a/net/xfrm/xfrm_output.c
>>> +++ b/net/xfrm/xfrm_output.c
>>> @@ -180,6 +180,14 @@ static int xfrm_output_gso(struct sk_buff *skb)
>>> return 0;
>>> }
>>>
>>> +static int skb_is_sctp(const struct sk_buff *skb)
>>> +{
>>> + if (skb->protocol == __constant_htons(ETH_P_IP))
>>> + return skb_is_sctpv4(skb);
>>> + else
>>> + return skb_is_sctpv6(skb);
>>> +}
>>> +
>>> int xfrm_output(struct sk_buff *skb)
>>> {
>>> struct net *net = dev_net(skb_dst(skb)->dev);
>>> @@ -189,11 +197,13 @@ int xfrm_output(struct sk_buff *skb)
>>> return xfrm_output_gso(skb);
>>>
>>> if (skb->ip_summed == CHECKSUM_PARTIAL) {
>>> - err = skb_checksum_help(skb);
>>> - if (err) {
>>> - XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
>>> - kfree_skb(skb);
>>> - return err;
>>> + if (!skb_is_sctp(skb)) {
>>> + err = skb_checksum_help(skb);
>>> + if (err) {
>>> + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
>>> + kfree_skb(skb);
>>> + return err;
>>> + }
>>> }
>>> }
>>>
>>
>>
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set
2013-10-12 13:06 ` Vlad Yasevich
@ 2013-10-14 7:16 ` Fan Du
0 siblings, 0 replies; 13+ messages in thread
From: Fan Du @ 2013-10-14 7:16 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: Neil Horman, steffen.klassert, davem, netdev
On 2013年10月12日 21:06, Vlad Yasevich wrote:
>
>
> Fan Du<fan.du@windriver.com> wrote:
>
>>
>>
>> On 2013年10月11日 22:25, Vlad Yasevich wrote:
>>> On 10/11/2013 03:08 AM, Fan Du wrote:
>>>>
>>>>
>>>> On 2013年10月10日 21:11, Neil Horman wrote:
>>>>> On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
>>>>>> igb/ixgbe have hardware sctp checksum support, when this feature
>> is
>>>>>> enabled
>>>>>> and also IPsec is armed to protect sctp traffic, ugly things
>> happened as
>>>>>> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum
>>>>>> every thing
>>>>>> up and pack the 16bits result in the checksum field). The result
>> is fail
>>>>>> establishment of sctp communication.
>>>>>>
>>>>> Shouldn't this be fixed in the xfrm code then? E.g. check the
>> device
>>>>> features
>>>>> for SCTP checksum offloading and and skip the checksum during xfrm
>>>>> output if its
>>>>> available?
>>>>>
>>>>> Or am I missing something?
>>>>> Neil
>>>>>
>>>>>
>>>>
>>>>
>>>> From cff25810910603ff991f0c56441d6de8dc33a822 Mon Sep 17 00:00:00
>> 2001
>>>> From: Fan Du<fan.du@windriver.com>
>>>> Date: Fri, 11 Oct 2013 14:31:57 +0800
>>>> Subject: [PATCH 2/2] Don't compute checksum value for SCTP skb with
>>>> CHECKSUM_PARTIAL set
>>>> MIME-Version: 1.0
>>>> Content-Type: text/plain; charset=UTF-8
>>>> Content-Transfer-Encoding: 8bit
>>>>
>>>> IPsec is not enabled in this scenario:
>>>>
>>>> SCTP skb set CHECKSUM_PARTIAL to indicate hardware is capable of
>> doing
>>>> SCTP checksum(crc32-c) scoping the whole SCTP packet range. However
>> when
>>>> such kind of skb is delivered through IPv4/v6 output handler,
>> IPv4/v6 stack
>>>> interpret CHECKSUM_PARTIAL by calling skb_checksum_help to compute
>> 16bits
>>>> checksum value by summing everything up, the whole SCTP packet in
>> software
>>>> manner! After this skb reach NIC, after hardware doing its SCTP
>> checking
>>>> business, a crc32-c value will overwrite the value IPv4/v6 stack
>> computed
>>>> before.
>>>>
>>>> This patch solves this by introducing skb_is_sctpv4/6 to optimize
>> such
>>>> case.
>>>>
>>>> Signed-off-by: Fan Du<fan.du@windriver.com>
>>>> ---
>>>> v2:
>>>> Rework this problem by introducing skb_is_scktv4/6
>>>>
>>>> ---
>>>> include/linux/ip.h | 5 +++++
>>>> include/linux/ipv6.h | 6 ++++++
>>>> include/linux/skbuff.h | 1 -
>>>> net/core/skbuff.c | 1 +
>>>> net/ipv4/ip_output.c | 4 +++-
>>>> net/ipv6/ip6_output.c | 1 +
>>>> net/xfrm/xfrm_output.c | 20 +++++++++++++++-----
>>>> 7 files changed, 31 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/include/linux/ip.h b/include/linux/ip.h
>>>> index 492bc65..f556292 100644
>>>> --- a/include/linux/ip.h
>>>> +++ b/include/linux/ip.h
>>>> @@ -19,6 +19,7 @@
>>>>
>>>> #include<linux/skbuff.h>
>>>> #include<uapi/linux/ip.h>
>>>> +#include<uapi/linux/in.h>
>>>>
>>>> static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
>>>> {
>>>> @@ -34,4 +35,8 @@ static inline struct iphdr *ipip_hdr(const struct
>>>> sk_buff *skb)
>>>> {
>>>> return (struct iphdr *)skb_transport_header(skb);
>>>> }
>>>> +static inline int skb_is_sctpv4(const struct sk_buff *skb)
>>>> +{
>>>> + return ip_hdr(skb)->protocol == IPPROTO_SCTP;
>>>> +}
>>>> #endif /* _LINUX_IP_H */
>>>> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
>>>> index 28ea384..6e17c04 100644
>>>> --- a/include/linux/ipv6.h
>>>> +++ b/include/linux/ipv6.h
>>>> @@ -2,6 +2,7 @@
>>>> #define _IPV6_H
>>>>
>>>> #include<uapi/linux/ipv6.h>
>>>> +#include<uapi/linux/in.h>
>>>>
>>>> #define ipv6_optlen(p) (((p)->hdrlen+1)<< 3)
>>>> /*
>>>> @@ -387,4 +388,9 @@ static inline struct raw6_sock *raw6_sk(const
>> struct
>>>> sock *sk)
>>>> ((__sk)->sk_bound_dev_if == (__dif)))&& \
>>>> net_eq(sock_net(__sk), (__net)))
>>>>
>>>> +static inline int skb_is_sctpv6(const struct sk_buff *skb)
>>>> +{
>>>> + return ipv6_hdr(skb)->nexthdr == IPPROTO_SCTP;
>>>> +}
>>>> +
>>>> #endif /* _IPV6_H */
>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>>> index 2ddb48d..b36d0cc 100644
>>>> --- a/include/linux/skbuff.h
>>>> +++ b/include/linux/skbuff.h
>>>> @@ -2393,7 +2393,6 @@ extern void skb_split(struct sk_buff *skb,
>>>> extern int skb_shift(struct sk_buff *tgt, struct sk_buff *skb,
>>>> int shiftlen);
>>>> extern void skb_scrub_packet(struct sk_buff *skb, bool xnet);
>>>> -
>>>> extern struct sk_buff *skb_segment(struct sk_buff *skb,
>>>> netdev_features_t features);
>>>>
>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>> index d81cff1..54d6172 100644
>>>> --- a/net/core/skbuff.c
>>>> +++ b/net/core/skbuff.c
>>>> @@ -3526,3 +3526,4 @@ void skb_scrub_packet(struct sk_buff *skb,
>> bool xnet)
>>>> nf_reset_trace(skb);
>>>> }
>>>> EXPORT_SYMBOL_GPL(skb_scrub_packet);
>>>> +
>>>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>>>> index a04d872..8676b2c 100644
>>>> --- a/net/ipv4/ip_output.c
>>>> +++ b/net/ipv4/ip_output.c
>>>> @@ -587,7 +587,9 @@ slow_path_clean:
>>>>
>>>> slow_path:
>>>> /* for offloaded checksums cleanup checksum before fragmentation */
>>>> - if ((skb->ip_summed == CHECKSUM_PARTIAL)&&
>> skb_checksum_help(skb))
>>>> + if ((skb->ip_summed == CHECKSUM_PARTIAL)&&
>>>> + !skb_is_sctpv4(skb)&&
>>>> + skb_checksum_help(skb))
>>>> goto fail;
>>>> iph = ip_hdr(skb);
>>>>
>>>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>>>> index 3a692d5..9b27d95 100644
>>>> --- a/net/ipv6/ip6_output.c
>>>> +++ b/net/ipv6/ip6_output.c
>>>> @@ -671,6 +671,7 @@ slow_path_clean:
>>>>
>>>> slow_path:
>>>> if ((skb->ip_summed == CHECKSUM_PARTIAL)&&
>>>> + !skb_is_sctpv6(skb)&&
>>>> skb_checksum_help(skb))
>>>> goto fail;
>>>>
>>>
>>> No, this isn't right. This is a case where IP fragmentation is
>>> required, and the above code will cause SCTP checksum to not be
>>> computed.
>>
>> Ok, I got my ball, ten bucks bet this correct ;)
>>
>> IPv4:
>> after skb reach fragmentation part, CHECKSUM_PARTIAL denotes
>> L4 layer protocol need to be checksummed, then IPv4 checksum is
>> recomputed for each fragmented IPv4 packet.
>>
>> IPv6:
>> Here IPv6 doesn't need checksum for its header, again
>> CHECKSUM_PARTIAL denotes L4 layer protocol need to be checksummed.
>>
>> So all in all, this is the right place to distinguish SCTP skb out,
>> and skip checksum operation as hw does it thereafter.
>>
>
> How does HW compute SCTP checksum when the data is split between skb?
> Each skb will be submitted separately to the HW.
> I think we need to fall back to SW checksum when packer will be fragmented.
Hi, Vlad
I understand your argument now, I finally applied a 82576 NIC with SCTP CHECKSUM
supported to test this. It seems when sending this super-sized packet,
sctp_datamsg_from_user fragments each packet to 1480 size already using pathmtu,
this pathmtu is equal or less than the interface device mtu, which means
ip_fragment/ip6_fragment didn't fragments any SCTP skb.
Host A(with 82576):
sctp_test -h 128.224.162.161 -p 5001 -H 128.224.162.220 -P 500 -x 1 -c 5 -s -T
^^^
set each packet to 32768 bytes
I cannot picture any scenario where a SCTP skb with CHECK_PARTIAL set to reach
ip_fragment/ip6_fragment. FWIW, the fix for xfrm_output part is definitely a
valid one.
> -vlad
>
>> Q.E.D.
>>
>>
>>> Looks like SCTP needs to compute the checksum in the case where
>>> skb will be fragmented.
>>>
>>> An alternative, that will also allow us to get rid of patch 1
>>> in the serices is to have a checksum handler offload function
>>> that can be used to compute checksum in this case.
>>>
>>> -vlad
>>>
>>>> diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
>>>> index 3bb2cdc..ddef94a 100644
>>>> --- a/net/xfrm/xfrm_output.c
>>>> +++ b/net/xfrm/xfrm_output.c
>>>> @@ -180,6 +180,14 @@ static int xfrm_output_gso(struct sk_buff *skb)
>>>> return 0;
>>>> }
>>>>
>>>> +static int skb_is_sctp(const struct sk_buff *skb)
>>>> +{
>>>> + if (skb->protocol == __constant_htons(ETH_P_IP))
>>>> + return skb_is_sctpv4(skb);
>>>> + else
>>>> + return skb_is_sctpv6(skb);
>>>> +}
>>>> +
>>>> int xfrm_output(struct sk_buff *skb)
>>>> {
>>>> struct net *net = dev_net(skb_dst(skb)->dev);
>>>> @@ -189,11 +197,13 @@ int xfrm_output(struct sk_buff *skb)
>>>> return xfrm_output_gso(skb);
>>>>
>>>> if (skb->ip_summed == CHECKSUM_PARTIAL) {
>>>> - err = skb_checksum_help(skb);
>>>> - if (err) {
>>>> - XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
>>>> - kfree_skb(skb);
>>>> - return err;
>>>> + if (!skb_is_sctp(skb)) {
>>>> + err = skb_checksum_help(skb);
>>>> + if (err) {
>>>> + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
>>>> + kfree_skb(skb);
>>>> + return err;
>>>> + }
>>>> }
>>>> }
>>>>
>>>
>>>
>
--
浮沉随浪只记今朝笑
--fan
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-10-14 7:18 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-10 5:51 [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Fan Du
2013-10-10 13:11 ` Neil Horman
2013-10-11 7:02 ` Fan Du
2013-10-11 7:05 ` [PATCHv2 1/2 ] " Fan Du
2013-10-11 14:04 ` Vlad Yasevich
2013-10-11 17:12 ` Vlad Yasevich
2013-10-11 7:08 ` [PATCHv2 2/2] Don't compute checksum value for SCTP skb with, CHECKSUM_PARTIAL set Fan Du
2013-10-11 14:25 ` Vlad Yasevich
2013-10-12 9:45 ` Fan Du
2013-10-12 13:06 ` Vlad Yasevich
2013-10-14 7:16 ` Fan Du
2013-10-10 14:11 ` [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that Vlad Yasevich
2013-10-11 7:02 ` Fan Du
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).