linux-sctp.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net-next 0/5] net: improve support for SCTP checksums
@ 2017-01-23 16:52 Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 1/5] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
                   ` (4 more replies)
  0 siblings, 5 replies; 51+ messages in thread
From: Davide Caratti @ 2017-01-23 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-sctp, Marcelo Ricardo Leitner

When NETIF_F_CSUM_MASK bits are all 0 on netdev features, validate_xmit_skb
uses skb_checksum_help to compute 16-bit, 2-complement checksum on non-GSO
skbs having ip_summed equal to CHECKSUM_PARTIAL.
This results in a systematic corruption of SCTP packets, since they need
to be checksummed with crc32c. Moreover, this is done regardless the value
of NETIF_F_SCTP_CRC, so any chance to offload crc32c computation on the 
NIC is lost. Finally, even when at least one bit in NETIF_F_CSUM_MASK is
set on netdev features, validate_xmit_skb skips checksum computation - but
then most NIC drivers can only call skb_checksum_help if their HW can't
offload the checksum computation. Depending on the driver code, this
results in wrong handling of SCTP, leading to:

- packet being dropped
- packet being transmitted with identically-zero checksum
- packet being transmitted with 2-complement checksum instead of crc32c

This series tries to address the above issue, by providing:
- the possibility to compute crc32c on skbs in Linux net core [patch 1]
- skb_sctp_csum_help, a function sharing common code with the original
  skb_checksum_help, that performs SW checksumming for skbs using crc32c
  [patch 2 and patch 3]
- skb_csum_hwoffload_help, called by validate xmit skb to perform SW
  checksumming using the correct algorithm based on the value of IP
  protocol number and netdev features bitmask [patch 4]
- an update to Linux documentation [patch 5]

Davide Caratti (5):
  skbuff: add stub to help computing crc32c on SCTP packets
  net: split skb_checksum_help
  net: introduce skb_sctp_csum_help
  net: more accurate checksumming in validate_xmit_skb
  Documentation: add description of skb_sctp_csum_help

 Documentation/networking/checksum-offloads.txt |   9 +-
 include/linux/netdevice.h                      |   1 +
 include/linux/skbuff.h                         |   5 +-
 net/core/dev.c                                 | 132 +++++++++++++++++++++----
 net/core/skbuff.c                              |  20 ++++
 net/sctp/offload.c                             |   7 ++
 6 files changed, 151 insertions(+), 23 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [RFC PATCH net-next 1/5] skbuff: add stub to help computing crc32c on SCTP packets
  2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
@ 2017-01-23 16:52 ` Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 2/5] net: split skb_checksum_help Davide Caratti
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-01-23 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-sctp, Marcelo Ricardo Leitner

sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
it can't be used in net core. Like it has been done previously with other
symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
to allow computation of SCTP checksum in net core after sctp.ko (and thus
libcrc32c) has been loaded.

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 20 ++++++++++++++++++++
 net/sctp/offload.c     |  7 +++++++
 3 files changed, 29 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6f63b7e..44fc804 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3119,6 +3119,8 @@ struct skb_checksum_ops {
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
 };
 
+extern const struct skb_checksum_ops *sctp_csum_stub __read_mostly;
+
 __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 		      __wsum csum, const struct skb_checksum_ops *ops);
 __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f8dbe4a..60e9963 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2235,6 +2235,26 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 }
 EXPORT_SYMBOL(skb_copy_and_csum_bits);
 
+static __wsum warn_sctp_csum_update(const void *buff, int len, __wsum sum)
+{
+	net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
+	return 0;
+}
+
+static __wsum warn_sctp_csum_combine(__wsum csum, __wsum csum2,
+					 int offset, int len)
+{
+	net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
+	return 0;
+}
+
+const struct skb_checksum_ops *sctp_csum_stub __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = warn_sctp_csum_update,
+	.combine = warn_sctp_csum_combine,
+};
+EXPORT_SYMBOL(sctp_csum_stub);
+
  /**
  *	skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
  *	@from: source buffer
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 7e869d0..1c9c548 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -98,6 +98,12 @@ static const struct net_offload sctp6_offload = {
 	},
 };
 
+static const struct skb_checksum_ops *sctp_csum_ops __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = sctp_csum_update,
+	.combine = sctp_csum_combine,
+};
+
 int __init sctp_offload_init(void)
 {
 	int ret;
@@ -110,6 +116,7 @@ int __init sctp_offload_init(void)
 	if (ret)
 		goto ipv4;
 
+	sctp_csum_stub = sctp_csum_ops;
 	return ret;
 
 ipv4:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 1/5] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
@ 2017-01-23 16:52 ` Davide Caratti
  2017-01-23 20:59   ` Tom Herbert
  2017-01-23 16:52 ` [RFC PATCH net-next 3/5] net: introduce skb_sctp_csum_help Davide Caratti
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-01-23 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-sctp, Marcelo Ricardo Leitner

skb_checksum_help is designed to compute the Internet Checksum only. To
avoid duplicating code when other checksumming algorithms (e.g. crc32c)
are used, separate common part from RFC1624-specific part.

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/core/dev.c | 51 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index ad5959e..6742160 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2532,13 +2532,36 @@ static void skb_warn_bad_offload(const struct sk_buff *skb)
 	     skb_shinfo(skb)->gso_type, skb->ip_summed);
 }
 
-/*
- * Invalidate hardware checksum when packet is to be mangled, and
+/* compute 16-bit RFC1624 checksum and store it at skb->data + offset */
+static int skb_rfc1624_csum(struct sk_buff *skb, int offset)
+{
+	__wsum csum;
+	int ret = 0;
+
+	csum = skb_checksum(skb, offset, skb->len - offset, 0);
+
+	offset += skb->csum_offset;
+	BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
+
+	if (skb_cloned(skb) &&
+	    !skb_clone_writable(skb, offset + sizeof(__sum16))) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+	*(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
+out:
+	return ret;
+}
+
+/* Invalidate hardware checksum when packet is to be mangled, and
  * complete checksum manually on outgoing path.
+ *    @skb - buffer that needs checksum
+ *    @csum_algo(skb, offset) - function used to compute the checksum
  */
-int skb_checksum_help(struct sk_buff *skb)
+static int __skb_checksum_help(struct sk_buff *skb,
+			       int (*csum_algo)(struct sk_buff *, int))
 {
-	__wsum csum;
 	int ret = 0, offset;
 
 	if (skb->ip_summed = CHECKSUM_COMPLETE)
@@ -2560,24 +2583,20 @@ int skb_checksum_help(struct sk_buff *skb)
 
 	offset = skb_checksum_start_offset(skb);
 	BUG_ON(offset >= skb_headlen(skb));
-	csum = skb_checksum(skb, offset, skb->len - offset, 0);
-
-	offset += skb->csum_offset;
-	BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
-
-	if (skb_cloned(skb) &&
-	    !skb_clone_writable(skb, offset + sizeof(__sum16))) {
-		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
-		if (ret)
-			goto out;
-	}
 
-	*(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
+	ret = csum_algo(skb, offset);
+	if (ret)
+		goto out;
 out_set_summed:
 	skb->ip_summed = CHECKSUM_NONE;
 out:
 	return ret;
 }
+
+int skb_checksum_help(struct sk_buff *skb)
+{
+	return __skb_checksum_help(skb, skb_rfc1624_csum);
+}
 EXPORT_SYMBOL(skb_checksum_help);
 
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH net-next 3/5] net: introduce skb_sctp_csum_help
  2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 1/5] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 2/5] net: split skb_checksum_help Davide Caratti
@ 2017-01-23 16:52 ` Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 4/5] net: more accurate checksumming in validate_xmit_skb Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 5/5] Documentation: add description of skb_sctp_csum_help Davide Caratti
  4 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-01-23 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-sctp, Marcelo Ricardo Leitner

skb_sctp_csum_help is like skb_checksum_help, but it is designed for
checksumming SCTP packets using crc32c (see RFC3309), provided that
sctp.ko has been loaded before. In case sctp.ko is not loaded, invoking
skb_sctp_csum_help() on a skb results in the following printout:

sk_buff: attempt to compute crc32c without sctp.ko

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/netdevice.h |  1 +
 include/linux/skbuff.h    |  3 ++-
 net/core/dev.c            | 29 +++++++++++++++++++++++++++++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3868c32..9d72824 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3902,6 +3902,7 @@ void netdev_rss_key_fill(void *buffer, size_t len);
 
 int dev_get_nest_level(struct net_device *dev);
 int skb_checksum_help(struct sk_buff *skb);
+int skb_sctp_csum_help(struct sk_buff *skb);
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
 struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 44fc804..91b4e22 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -192,7 +192,8 @@
  *     accordingly. Note the there is no indication in the skbuff that the
  *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
  *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers.
+ *     is configured for a packet presumably by inspecting packet headers; in
+ *     case, skb_sctp_csum_help is provided to compute CRC on SCTP packets.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
diff --git a/net/core/dev.c b/net/core/dev.c
index 6742160..45cee84 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2554,6 +2554,29 @@ static int skb_rfc1624_csum(struct sk_buff *skb, int offset)
 	return ret;
 }
 
+/* compute 32-bit RFC3309 checksum and store it at skb->data + offset */
+static int skb_rfc3309_csum(struct sk_buff *skb, int offset)
+{
+	__le32 crc32c_csum;
+	int ret = 0;
+
+	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
+						  skb->len - offset, ~(__u32)0,
+						  sctp_csum_stub));
+	offset += skb->csum_offset;
+	BUG_ON((offset + sizeof(__le32)) > skb_headlen(skb));
+
+	if (skb_cloned(skb) &&
+	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+	*(__le32 *)(skb->data + offset) = crc32c_csum;
+out:
+	return ret;
+}
+
 /* Invalidate hardware checksum when packet is to be mangled, and
  * complete checksum manually on outgoing path.
  *    @skb - buffer that needs checksum
@@ -2599,6 +2622,12 @@ int skb_checksum_help(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
+int skb_sctp_csum_help(struct sk_buff *skb)
+{
+	return __skb_checksum_help(skb, skb_rfc3309_csum);
+}
+EXPORT_SYMBOL(skb_sctp_csum_help);
+
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
 	__be16 type = skb->protocol;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH net-next 4/5] net: more accurate checksumming in validate_xmit_skb
  2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
                   ` (2 preceding siblings ...)
  2017-01-23 16:52 ` [RFC PATCH net-next 3/5] net: introduce skb_sctp_csum_help Davide Caratti
@ 2017-01-23 16:52 ` Davide Caratti
  2017-01-23 16:52 ` [RFC PATCH net-next 5/5] Documentation: add description of skb_sctp_csum_help Davide Caratti
  4 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-01-23 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-sctp, Marcelo Ricardo Leitner

introduce skb_csum_hwoffload_help and use it as a replacement for
skb_checksum_help in validate_xmit_skb, to compute checksum using crc32c or
2-complement Internet Checksum (or leave the packet unchanged and let the
NIC do the checksum), depending on netdev checksum offloading capabilities
and on presence of IPPROTO_SCTP as protocol number in IPv4/IPv6 header.

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/core/dev.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 45cee84..f8cb3ba 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -140,6 +140,7 @@
 #include <linux/hrtimer.h>
 #include <linux/netfilter_ingress.h>
 #include <linux/crash_dump.h>
+#include <linux/sctp.h>
 
 #include "net-sysfs.h"
 
@@ -2960,6 +2961,54 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
 	return skb;
 }
 
+static int skb_csum_hwoffload_help(struct sk_buff *skb,
+				   netdev_features_t features)
+{
+	bool encap = skb->encapsulation;
+	unsigned int offset = 0;
+	__be16 protocol;
+
+	if (likely((features & (NETIF_F_SCTP_CRC | NETIF_F_CSUM_MASK)) =
+	    (NETIF_F_SCTP_CRC | NETIF_F_CSUM_MASK)))
+		return 0;
+
+	if (skb->csum_offset != offsetof(struct sctphdr, checksum))
+		goto inet_csum;
+
+	if (encap) {
+		protocol = skb->inner_protocol;
+		if (skb->inner_protocol_type = ENCAP_TYPE_IPPROTO)
+			switch (protocol) {
+			case IPPROTO_IPV6:
+				protocol = ntohs(ETH_P_IPV6);
+				break;
+			case IPPROTO_IP:
+				protocol = ntohs(ETH_P_IP);
+				break;
+			default:
+				goto inet_csum;
+			}
+	} else {
+		protocol = vlan_get_protocol(skb);
+	}
+	switch (protocol) {
+	case ntohs(ETH_P_IP):
+		if ((encap ? inner_ip_hdr(skb) : ip_hdr(skb))->protocol =
+		    IPPROTO_SCTP)
+			goto sctp_csum;
+		break;
+	case ntohs(ETH_P_IPV6):
+		if (ipv6_find_hdr(skb, &offset, IPPROTO_SCTP, NULL, NULL) =
+		    IPPROTO_SCTP)
+			goto sctp_csum;
+		break;
+	}
+inet_csum:
+	return !(features & NETIF_F_CSUM_MASK) ? skb_checksum_help(skb) : 0;
+sctp_csum:
+	return !(features & NETIF_F_SCTP_CRC) ? skb_sctp_csum_help(skb) : 0;
+}
+
 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
 {
 	netdev_features_t features;
@@ -2995,8 +3044,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 			else
 				skb_set_transport_header(skb,
 							 skb_checksum_start_offset(skb));
-			if (!(features & NETIF_F_CSUM_MASK) &&
-			    skb_checksum_help(skb))
+			if (skb_csum_hwoffload_help(skb, features))
 				goto out_kfree_skb;
 		}
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH net-next 5/5] Documentation: add description of skb_sctp_csum_help
  2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
                   ` (3 preceding siblings ...)
  2017-01-23 16:52 ` [RFC PATCH net-next 4/5] net: more accurate checksumming in validate_xmit_skb Davide Caratti
@ 2017-01-23 16:52 ` Davide Caratti
  4 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-01-23 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-sctp, Marcelo Ricardo Leitner

Add description of skb_sctp_csum_help in networking/checksum-offload.txt;
while at it, remove reference to skb_csum_off_chk* functions, since they
are not present anymore in Linux since commit cf53b1da73bd ('Revert "net:
Add driver helper functions to determine checksum"').

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 Documentation/networking/checksum-offloads.txt | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
index 56e3686..cb7a7e5 100644
--- a/Documentation/networking/checksum-offloads.txt
+++ b/Documentation/networking/checksum-offloads.txt
@@ -49,9 +49,9 @@ A driver declares its offload capabilities in netdev->hw_features; see
  and csum_offset given in the SKB; if it tries to deduce these itself in
  hardware (as some NICs do) the driver should check that the values in the
  SKB match those which the hardware will deduce, and if not, fall back to
- checksumming in software instead (with skb_checksum_help or one of the
- skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h).  This
- is a pain, but that's what you get when hardware tries to be clever.
+ checksumming in software instead (with skb_checksum_help or
+ skb_sctp_csum_help functions as mentioned in include/linux/skbuff.h).
+ This is a pain, but that's what you get when hardware tries to be clever.
 
 The stack should, for the most part, assume that checksum offload is
  supported by the underlying device.  The only place that should check is
@@ -60,7 +60,8 @@ The stack should, for the most part, assume that checksum offload is
  may include other offloads besides TX Checksum Offload) and, if they are
  not supported or enabled on the device (determined by netdev->features),
  performs the corresponding offload in software.  In the case of TX
- Checksum Offload, that means calling skb_checksum_help(skb).
+ Checksum Offload, that means calling skb_sctp_csum_help(skb) for SCTP
+ packets, and skb_checksum_help(skb) for other packets.
 
 
 LCO: Local Checksum Offload
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-01-23 16:52 ` [RFC PATCH net-next 2/5] net: split skb_checksum_help Davide Caratti
@ 2017-01-23 20:59   ` Tom Herbert
  2017-01-24 16:35     ` David Laight
  0 siblings, 1 reply; 51+ messages in thread
From: Tom Herbert @ 2017-01-23 20:59 UTC (permalink / raw)
  To: Davide Caratti
  Cc: David S. Miller, Linux Kernel Network Developers, linux-sctp,
	Marcelo Ricardo Leitner

On Mon, Jan 23, 2017 at 8:52 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> skb_checksum_help is designed to compute the Internet Checksum only. To
> avoid duplicating code when other checksumming algorithms (e.g. crc32c)
> are used, separate common part from RFC1624-specific part.
>
> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  net/core/dev.c | 51 +++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 35 insertions(+), 16 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ad5959e..6742160 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2532,13 +2532,36 @@ static void skb_warn_bad_offload(const struct sk_buff *skb)
>              skb_shinfo(skb)->gso_type, skb->ip_summed);
>  }
>
> -/*
> - * Invalidate hardware checksum when packet is to be mangled, and
> +/* compute 16-bit RFC1624 checksum and store it at skb->data + offset */
> +static int skb_rfc1624_csum(struct sk_buff *skb, int offset)
> +{
> +       __wsum csum;
> +       int ret = 0;
> +
> +       csum = skb_checksum(skb, offset, skb->len - offset, 0);
> +
> +       offset += skb->csum_offset;
> +       BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
> +
> +       if (skb_cloned(skb) &&
> +           !skb_clone_writable(skb, offset + sizeof(__sum16))) {
> +               ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
> +               if (ret)
> +                       goto out;
> +       }
> +       *(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
> +out:
> +       return ret;
> +}
> +
> +/* Invalidate hardware checksum when packet is to be mangled, and
>   * complete checksum manually on outgoing path.
> + *    @skb - buffer that needs checksum
> + *    @csum_algo(skb, offset) - function used to compute the checksum
>   */
> -int skb_checksum_help(struct sk_buff *skb)
> +static int __skb_checksum_help(struct sk_buff *skb,
> +                              int (*csum_algo)(struct sk_buff *, int))
>  {
> -       __wsum csum;
>         int ret = 0, offset;
>
>         if (skb->ip_summed = CHECKSUM_COMPLETE)

skb_checksum_help is specific to the Internet checksum. For instance,
CHECKSUM_COMPLETE can _only_ refer to Internet checksum calculation
nothing else will work. Checksums and CRCs are very different things
with very different processing. They are not interchangeable, have
very different properties, and hence it is a mistake to try to shoe
horn things so that they use a common infrastructure.

It might make sense to create some CRC helper functions, but last time
I checked there are so few users of CRC in skbufs I'm not even sure
that would make sense.

Tom

> @@ -2560,24 +2583,20 @@ int skb_checksum_help(struct sk_buff *skb)
>
>         offset = skb_checksum_start_offset(skb);
>         BUG_ON(offset >= skb_headlen(skb));
> -       csum = skb_checksum(skb, offset, skb->len - offset, 0);
> -
> -       offset += skb->csum_offset;
> -       BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
> -
> -       if (skb_cloned(skb) &&
> -           !skb_clone_writable(skb, offset + sizeof(__sum16))) {
> -               ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
> -               if (ret)
> -                       goto out;
> -       }
>
> -       *(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
> +       ret = csum_algo(skb, offset);
> +       if (ret)
> +               goto out;
>  out_set_summed:
>         skb->ip_summed = CHECKSUM_NONE;
>  out:
>         return ret;
>  }
> +
> +int skb_checksum_help(struct sk_buff *skb)
> +{
> +       return __skb_checksum_help(skb, skb_rfc1624_csum);
> +}
>  EXPORT_SYMBOL(skb_checksum_help);
>
>  __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-01-23 20:59   ` Tom Herbert
@ 2017-01-24 16:35     ` David Laight
  2017-02-02 15:07       ` Davide Caratti
  0 siblings, 1 reply; 51+ messages in thread
From: David Laight @ 2017-01-24 16:35 UTC (permalink / raw)
  To: 'Tom Herbert', Davide Caratti
  Cc: David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

RnJvbTogVG9tIEhlcmJlcnQNCj4gU2VudDogMjMgSmFudWFyeSAyMDE3IDIxOjAwDQouLg0KPiBz
a2JfY2hlY2tzdW1faGVscCBpcyBzcGVjaWZpYyB0byB0aGUgSW50ZXJuZXQgY2hlY2tzdW0uIEZv
ciBpbnN0YW5jZSwNCj4gQ0hFQ0tTVU1fQ09NUExFVEUgY2FuIF9vbmx5XyByZWZlciB0byBJbnRl
cm5ldCBjaGVja3N1bSBjYWxjdWxhdGlvbg0KPiBub3RoaW5nIGVsc2Ugd2lsbCB3b3JrLiBDaGVj
a3N1bXMgYW5kIENSQ3MgYXJlIHZlcnkgZGlmZmVyZW50IHRoaW5ncw0KPiB3aXRoIHZlcnkgZGlm
ZmVyZW50IHByb2Nlc3NpbmcuIFRoZXkgYXJlIG5vdCBpbnRlcmNoYW5nZWFibGUsIGhhdmUNCj4g
dmVyeSBkaWZmZXJlbnQgcHJvcGVydGllcywgYW5kIGhlbmNlIGl0IGlzIGEgbWlzdGFrZSB0byB0
cnkgdG8gc2hvZQ0KPiBob3JuIHRoaW5ncyBzbyB0aGF0IHRoZXkgdXNlIGEgY29tbW9uIGluZnJh
c3RydWN0dXJlLg0KPiANCj4gSXQgbWlnaHQgbWFrZSBzZW5zZSB0byBjcmVhdGUgc29tZSBDUkMg
aGVscGVyIGZ1bmN0aW9ucywgYnV0IGxhc3QgdGltZQ0KPiBJIGNoZWNrZWQgdGhlcmUgYXJlIHNv
IGZldyB1c2VycyBvZiBDUkMgaW4gc2tidWZzIEknbSBub3QgZXZlbiBzdXJlDQo+IHRoYXQgd291
bGQgbWFrZSBzZW5zZS4NCg0KSSBjYW4gaW1hZ2luZSBob3JyaWQgdGhpbmdzIGhhcHBlbmluZyBp
ZiBzb21lb25lIHRyaWVzIHRvIGVuY2Fwc3VsYXRlDQpTQ1RQL0lQIGluIFVEUCAob3Igd29yc2Ug
VURQL0lQIGluIFNDVFApLg0KDQpGb3IgVURQIGluIFVEUCBJIHN1c3BlY3QgdGhhdCBDSEVDS1NV
TV9DT01QTEVURSBvbiBhbiBpbm5lciBVRFAgcGFja2V0DQphbGxvd3MgdGhlIG91dGVyIGNoZWNr
c3VtIGJlIGNhbGN1bGF0ZWQgYnkgaWdub3JpbmcgdGhlIGlubmVyIHBhY2tldA0KKHNpbmNlIGl0
IHN1bXMgdG8gemVybykuDQpUaGlzIGp1c3QgaXNuJ3QgdHJ1ZSBpZiBTQ1RQIGlzIGludm9sdmVk
Lg0KVGhlcmUgYXJlIHRyaWNrcyB0byBnZW5lcmF0ZSBhIGNyYyBvZiBhIGxvbmdlciBwYWNrZXQs
IGJ1dCB0aGV5J2Qgb25seQ0Kd29yayBmb3IgU0NUUCBpbiBTQ1RQLg0KDQpGb3Igbm9uLWVuY2Fw
c3VsYXRlZCBwYWNrZXRzIGl0IGlzIGEgZGlmZmVyZW50IG1hdHRlci4NCg0KCURhdmlkDQoNCg=

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-01-24 16:35     ` David Laight
@ 2017-02-02 15:07       ` Davide Caratti
  2017-02-02 16:55         ` David Laight
  2017-02-02 18:08         ` Tom Herbert
  0 siblings, 2 replies; 51+ messages in thread
From: Davide Caratti @ 2017-02-02 15:07 UTC (permalink / raw)
  To: David Laight, 'Tom Herbert'
  Cc: David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

hello Tom and David,

thank you for the attention.

> From: Tom Herbert
> > 
> > Sent: 23 January 2017 21:00
> ..
> > 
> > skb_checksum_help is specific to the Internet checksum. For instance,
> > CHECKSUM_COMPLETE can _only_ refer to Internet checksum calculation
> > nothing else will work. Checksums and CRCs are very different things
> > with very different processing. They are not interchangeable, have
> > very different properties, and hence it is a mistake to try to shoe
> > horn things so that they use a common infrastructure.
> > 

true, we don't need to test CHECKSUM_COMPLETE on skbs carrying SCTP.
So maybe we can simply replace patches 2/5 and 3/5 with the smaller one at
the bottom of this message.

> > It might make sense to create some CRC helper functions, but last time
> > I checked there are so few users of CRC in skbufs I'm not even sure
> > that would make sense.

This is exactly the cause of issues I see with SCTP. These packets can be
wrongly checksummed using skb_checksum_help, or simply not checksummed at
all; and in both cases, the packet goes out from the NIC with wrong L4
checksum.

For example: there are scenarios, even the trivial one below, where skb
carrying SCTP packets are wrongly checksummed, because the originating
socket read NETIF_F_SCTP_CRC bit in the underlying device features.
Then, after the kernel forwards the skb, the final transmission
happens on another device where CRC offload is not available: this
typically leads to bad checksums on transmitted SCTP packets.



namespace 1 |                   namespace 2
            |
            |                      br0
            |         +------- Linux bridge -------+
            |         |                            |
            |         V                            V
vethA <-----------> vethB                        eth0
            |
            |

when a socket bound to vethA in namespace 1 generates an INIT packet,
it's not checksummed since veth devices have NETIF_F_SCTP_CRC set [1].
Then, after vethB receives the packet in namespace 2, linux bridge
forwards it to eth0, and (depending on eth0 driver code), it will be
transmitted with wrong CRC32c or simply dropped.

On Tue, 2017-01-24 at 16:35 +0000, David Laight wrote:
> 
> I can imagine horrid things happening if someone tries to encapsulate
> SCTP/IP in UDP (or worse UDP/IP in SCTP).
> 
> For UDP in UDP I suspect that CHECKSUM_COMPLETE on an inner UDP packet
> allows the outer checksum be calculated by ignoring the inner packet
> (since it sums to zero).
> This just isn't true if SCTP is involved.
> There are tricks to generate a crc of a longer packet, but they'd only
> work for SCTP in SCTP.
> 
> For non-encapsulated packets it is a different matter.

If we limit the scope to skbs having ip_summed equal to CHECKSUM_PARTIAL,
like it's done in patch 4, we only need checksumming the packet starting
from csum_start to its end, and copy the computed value to csum_offset.
The difficult thing is discriminating skbs that need CRC32c, namely SCTP,
from the rest of the traffic (that will likely be checksummed by
skb_checksum_help).

Currently, the only way to fix wrong CRCs in the scenario above is to
configure tc filter with "csum" action on eth0 egress, to compensate the
missing capability of eth0 driver to deal with SCTP packets having
ip_summed equal to CHECKSUM_PARTIAL [2].

Patch 4 in the series is an attempt to solve the issue, both for
encapsulated and non-encapsulated skbs, calling skb_csum_hwoffload_help()
inside validate_xmit_skb. In order to look for unchecksummed SCTP packets,
I took inspiration from a Linux-4.4 commit (6ae23ad36253 "net: Add driver
helper functions ...) to implement skb_csum_hwoffload_help, then I called
it in validate_xmit_skb() to fix situations that can't be recovered by the
NIC driver (it's the case where NETIF_F_CSUM_MASK bits are all zero).

Today most NICs can provide at least HW offload for Internet Checksum:
that's why I'm a bit doubtful if it's ok to spend extra CPU cycles in
validate_xmit_skb() to ensure correct CRC in some scenarios. 
Since this issue affects some (not all) NICs, maybe it's better to drop
patch 4, or part of it, and provide a fix for individual drivers that
don't currently handle non-checksummed SCTP packets. But to do that, we
need at least patch 1 and the small code below.

------------------- 8< --------------------------
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -200,7 +200,8 @@
  *     accordingly. Note the there is no indication in the skbuff that the
  *     CHECKSUM_PARTIAL refers to an FCOE checksum, a driver that supports
  *     both IP checksum offload and FCOE CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers.
+ *     is configured for a packet presumably by inspecting packet headers; in
+ *     case, skb_sctp_csum_help is provided to compute CRC on SCTP packets.
  *
  * E. Checksumming on output with GSO.
  *
diff --git a/net/core/dev.c b/net/core/dev.c
index ad5959e..fa9be6d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2580,6 +2580,42 @@ int skb_checksum_help(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
+int skb_sctp_csum_help(struct sk_buff *skb)
+{
+	__le32 crc32c_csum;
+	int ret = 0, offset;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		goto out;
+	if (unlikely(skb_is_gso(skb)))
+		goto out;
+	if (skb_has_shared_frag(skb)) {
+		ret = __skb_linearize(skb);
+		if (ret)
+			goto out;
+	}
+
+	offset = skb_checksum_start_offset(skb);
+	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
+				  skb->len - offset, ~(__u32)0,
+				  sctp_csum_stub));
+
+	offset += offsetof(struct sctphdr, checksum);
+	BUG_ON(offset >= skb_headlen(skb));
+
+	if (skb_cloned(skb) &&
+	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+	*(__le32 *)(skb->data + offset) = crc32c_csum;
+	skb->ip_summed = CHECKSUM_NONE;
+out:
+	return ret;
+}
+EXPORT_SYMBOL(skb_sctp_csum_help);
+
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
 	__be16 type = skb->protocol;
-- 
2.7.4
------------------- >8 --------------------------

Thank you again for paying attention to this, and I would appreciate if
you share your opinion.

Notes:

[1] see commit c80fafbbb59e ("veth: sctp: add NETIF_F_SCTP_CRC to device
features")
[2] see commit c008b33f3ef ("net/sched: act_csum: compute crc32c on SCTP
packets").  We could also turn off NETIF_F_SCTP_CRC bit from vethA, but
this would generate useless crc32c calculations if the SCTP server is not
outside the physical node (e.g. it is bound to br0), leading to a
throughput degradation.


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* RE: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-02-02 15:07       ` Davide Caratti
@ 2017-02-02 16:55         ` David Laight
  2017-02-02 18:08         ` Tom Herbert
  1 sibling, 0 replies; 51+ messages in thread
From: David Laight @ 2017-02-02 16:55 UTC (permalink / raw)
  To: 'Davide Caratti', 'Tom Herbert'
  Cc: David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

RnJvbTogRGF2aWRlIENhcmF0dGkNCj4gU2VudDogMDIgRmVicnVhcnkgMjAxNyAxNTowNw0KPiA+
IEZyb206IFRvbSBIZXJiZXJ0DQo+ID4gPg0KPiA+ID4gU2VudDogMjMgSmFudWFyeSAyMDE3IDIx
OjAwDQo+ID4gLi4NCj4gPiA+DQo+ID4gPiBza2JfY2hlY2tzdW1faGVscCBpcyBzcGVjaWZpYyB0
byB0aGUgSW50ZXJuZXQgY2hlY2tzdW0uIEZvciBpbnN0YW5jZSwNCj4gPiA+IENIRUNLU1VNX0NP
TVBMRVRFIGNhbiBfb25seV8gcmVmZXIgdG8gSW50ZXJuZXQgY2hlY2tzdW0gY2FsY3VsYXRpb24N
Cj4gPiA+IG5vdGhpbmcgZWxzZSB3aWxsIHdvcmsuIENoZWNrc3VtcyBhbmQgQ1JDcyBhcmUgdmVy
eSBkaWZmZXJlbnQgdGhpbmdzDQo+ID4gPiB3aXRoIHZlcnkgZGlmZmVyZW50IHByb2Nlc3Npbmcu
IFRoZXkgYXJlIG5vdCBpbnRlcmNoYW5nZWFibGUsIGhhdmUNCj4gPiA+IHZlcnkgZGlmZmVyZW50
IHByb3BlcnRpZXMsIGFuZCBoZW5jZSBpdCBpcyBhIG1pc3Rha2UgdG8gdHJ5IHRvIHNob2UNCj4g
PiA+IGhvcm4gdGhpbmdzIHNvIHRoYXQgdGhleSB1c2UgYSBjb21tb24gaW5mcmFzdHJ1Y3R1cmUu
DQo+ID4gPg0KPiANCj4gdHJ1ZSwgd2UgZG9uJ3QgbmVlZCB0byB0ZXN0IENIRUNLU1VNX0NPTVBM
RVRFIG9uIHNrYnMgY2FycnlpbmcgU0NUUC4NCj4gU28gbWF5YmUgd2UgY2FuIHNpbXBseSByZXBs
YWNlIHBhdGNoZXMgMi81IGFuZCAzLzUgd2l0aCB0aGUgc21hbGxlciBvbmUgYXQNCj4gdGhlIGJv
dHRvbSBvZiB0aGlzIG1lc3NhZ2UuDQoNCkkgaGF2ZSB0byBhZG1pdCB0byBub3Qga25vd2luZyBl
eGFjdGx5IHdoYXQgdGhlIENIRUNLU1VNX3h4eCBmbGFncyBhY3R1YWxseSBtZWFuLg0KSSBoYXZl
IGEgZ29vZCBpZGVhIGFib3V0IHdoYXQgdGhlIGludGVudGlvbiBpcyB0aG91Z2guDQoNCi4uLg0K
PiBPbiBUdWUsIDIwMTctMDEtMjQgYXQgMTY6MzUgKzAwMDAsIERhdmlkIExhaWdodCB3cm90ZToN
Cj4gPg0KPiA+IEkgY2FuIGltYWdpbmUgaG9ycmlkIHRoaW5ncyBoYXBwZW5pbmcgaWYgc29tZW9u
ZSB0cmllcyB0byBlbmNhcHN1bGF0ZQ0KPiA+IFNDVFAvSVAgaW4gVURQIChvciB3b3JzZSBVRFAv
SVAgaW4gU0NUUCkuDQo+ID4NCj4gPiBGb3IgVURQIGluIFVEUCBJIHN1c3BlY3QgdGhhdCBDSEVD
S1NVTV9DT01QTEVURSBvbiBhbiBpbm5lciBVRFAgcGFja2V0DQo+ID4gYWxsb3dzIHRoZSBvdXRl
ciBjaGVja3N1bSBiZSBjYWxjdWxhdGVkIGJ5IGlnbm9yaW5nIHRoZSBpbm5lciBwYWNrZXQNCj4g
PiAoc2luY2UgaXQgc3VtcyB0byB6ZXJvKS4NCj4gPiBUaGlzIGp1c3QgaXNuJ3QgdHJ1ZSBpZiBT
Q1RQIGlzIGludm9sdmVkLg0KPiA+IFRoZXJlIGFyZSB0cmlja3MgdG8gZ2VuZXJhdGUgYSBjcmMg
b2YgYSBsb25nZXIgcGFja2V0LCBidXQgdGhleSdkIG9ubHkNCj4gPiB3b3JrIGZvciBTQ1RQIGlu
IFNDVFAuDQo+ID4NCj4gPiBGb3Igbm9uLWVuY2Fwc3VsYXRlZCBwYWNrZXRzIGl0IGlzIGEgZGlm
ZmVyZW50IG1hdHRlci4NCj4gDQo+IElmIHdlIGxpbWl0IHRoZSBzY29wZSB0byBza2JzIGhhdmlu
ZyBpcF9zdW1tZWQgZXF1YWwgdG8gQ0hFQ0tTVU1fUEFSVElBTCwNCj4gbGlrZSBpdCdzIGRvbmUg
aW4gcGF0Y2ggNCwgd2Ugb25seSBuZWVkIGNoZWNrc3VtbWluZyB0aGUgcGFja2V0IHN0YXJ0aW5n
DQo+IGZyb20gY3N1bV9zdGFydCB0byBpdHMgZW5kLCBhbmQgY29weSB0aGUgY29tcHV0ZWQgdmFs
dWUgdG8gY3N1bV9vZmZzZXQuDQo+IFRoZSBkaWZmaWN1bHQgdGhpbmcgaXMgZGlzY3JpbWluYXRp
bmcgc2ticyB0aGF0IG5lZWQgQ1JDMzJjLCBuYW1lbHkgU0NUUCwNCj4gZnJvbSB0aGUgcmVzdCBv
ZiB0aGUgdHJhZmZpYyAodGhhdCB3aWxsIGxpa2VseSBiZSBjaGVja3N1bW1lZCBieQ0KPiBza2Jf
Y2hlY2tzdW1faGVscCkuDQouLi4NCg0KSSdtIGd1ZXNzaW5nIHRoYXQgdGhlIFNDVFAgY29kZSBv
bmx5IHNldHMgQ0hFQ0tTVU1fUEFSVElBTCAoYW5kIGRvZXNuJ3QNCnBlcmZvcm0gdGhlIGNoZWNr
c3VtKSBpZiBpdCBzb21laG93IGtub3dzIHRoYXQgdGhlIHRhcmdldCBpbnRlcmZhY2UNCnN1cHBv
cnRzIENSQzMyYyBjaGVja3N1bXMuDQoNCkknZCBwdXQgdGhlIG9udXMgb24gYW55IHN1Y2ggaW50
ZXJmYWNlIHRvIHBlcmZvcm0gdGhlIGNoZWNrc3VtIChhbmQNCnNldCBDSEVDS1NVTV9DT01QTEVU
RSAob3IgaXMgaXQgVU5ORUNFU1NBUlk/KSBiZWZvcmUgcGFzc2luZyB0aGUgDQptZXNzYWdlIG9u
dG8gYW4gaW50ZXJmYWNlIHRoYXQgZG9lc24ndCBhZHZlcnRpc2UgQ1JDMzIgc3VwcG9ydC4NCg0K
WW91IGNlcnRhaW5seSBkb24ndCB3YW50IHRvIGhhdmUgdG8gZ28gdGhyb3VnaCBhbGwgdGhlIGV0
aGVybmV0IGRyaXZlcnMhDQoNCj4gDQo+IC0tLS0tLS0tLS0tLS0tLS0tLS0gODwgLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0NCj4gLS0tIGEvaW5jbHVkZS9saW51eC9za2J1ZmYuaA0KPiArKysg
Yi9pbmNsdWRlL2xpbnV4L3NrYnVmZi5oDQo+IEBAIC0yMDAsNyArMjAwLDggQEANCj4gICphY2Nv
cmRpbmdseS4gTm90ZSB0aGUgdGhlcmUgaXMgbm8gaW5kaWNhdGlvbiBpbiB0aGUgc2tidWZmIHRo
YXQgdGhlDQo+ICAqQ0hFQ0tTVU1fUEFSVElBTCByZWZlcnMgdG8gYW4gRkNPRSBjaGVja3N1bSwg
YSBkcml2ZXIgdGhhdCBzdXBwb3J0cw0KPiAgKmJvdGggSVAgY2hlY2tzdW0gb2ZmbG9hZCBhbmQg
RkNPRSBDUkMgb2ZmbG9hZCBtdXN0IHZlcmlmeSB3aGljaCBvZmZsb2FkDQo+IC0gKmlzIGNvbmZp
Z3VyZWQgZm9yIGEgcGFja2V0IHByZXN1bWFibHkgYnkgaW5zcGVjdGluZyBwYWNrZXQgaGVhZGVy
cy4NCj4gKyAqaXMgY29uZmlndXJlZCBmb3IgYSBwYWNrZXQgcHJlc3VtYWJseSBieSBpbnNwZWN0
aW5nIHBhY2tldCBoZWFkZXJzOyBpbg0KPiArICpjYXNlLCBza2Jfc2N0cF9jc3VtX2hlbHAgaXMg
cHJvdmlkZWQgdG8gY29tcHV0ZSBDUkMgb24gU0NUUCBwYWNrZXRzLg0KPiAgKg0KPiAgKiBFLiBD
aGVja3N1bW1pbmcgb24gb3V0cHV0IHdpdGggR1NPLg0KPiAgKg0KPiBkaWZmIC0tZ2l0IGEvbmV0
L2NvcmUvZGV2LmMgYi9uZXQvY29yZS9kZXYuYw0KPiBpbmRleCBhZDU5NTllLi5mYTliZTZkIDEw
MDY0NA0KPiAtLS0gYS9uZXQvY29yZS9kZXYuYw0KPiArKysgYi9uZXQvY29yZS9kZXYuYw0KPiBA
QCAtMjU4MCw2ICsyNTgwLDQyIEBAIGludCBza2JfY2hlY2tzdW1faGVscChzdHJ1Y3Qgc2tfYnVm
ZiAqc2tiKQ0KPiB9DQo+IEVYUE9SVF9TWU1CT0woc2tiX2NoZWNrc3VtX2hlbHApOw0KPiANCj4g
K2ludCBza2Jfc2N0cF9jc3VtX2hlbHAoc3RydWN0IHNrX2J1ZmYgKnNrYikNCj4gK3sNCj4gKwlf
X2xlMzIgY3JjMzJjX2NzdW07DQo+ICsJaW50IHJldCA9IDAsIG9mZnNldDsNCj4gKw0KPiArCWlm
IChza2ItPmlwX3N1bW1lZCAhPSBDSEVDS1NVTV9QQVJUSUFMKQ0KPiArCQlnb3RvIG91dDsNCj4g
KwlpZiAodW5saWtlbHkoc2tiX2lzX2dzbyhza2IpKSkNCj4gKwkJZ290byBvdXQ7DQo+ICsJaWYg
KHNrYl9oYXNfc2hhcmVkX2ZyYWcoc2tiKSkgew0KPiArCQlyZXQgPSBfX3NrYl9saW5lYXJpemUo
c2tiKTsNCg0KSSBkb24ndCB0aGluayB5b3UgcmVhbGx5IHdhbnQgdG8gbGluZWFyaXplIHRoZSBw
YWNrZXQuDQouLi4NCg0KCURhdmlkDQoNCg=

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-02-02 15:07       ` Davide Caratti
  2017-02-02 16:55         ` David Laight
@ 2017-02-02 18:08         ` Tom Herbert
  2017-02-27 13:39           ` Davide Caratti
  1 sibling, 1 reply; 51+ messages in thread
From: Tom Herbert @ 2017-02-02 18:08 UTC (permalink / raw)
  To: Davide Caratti
  Cc: David Laight, David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

On Thu, Feb 2, 2017 at 7:07 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> hello Tom and David,
>
> thank you for the attention.
>
>> From: Tom Herbert
>> >
>> > Sent: 23 January 2017 21:00
>> ..
>> >
>> > skb_checksum_help is specific to the Internet checksum. For instance,
>> > CHECKSUM_COMPLETE can _only_ refer to Internet checksum calculation
>> > nothing else will work. Checksums and CRCs are very different things
>> > with very different processing. They are not interchangeable, have
>> > very different properties, and hence it is a mistake to try to shoe
>> > horn things so that they use a common infrastructure.
>> >
>
> true, we don't need to test CHECKSUM_COMPLETE on skbs carrying SCTP.
> So maybe we can simply replace patches 2/5 and 3/5 with the smaller one at
> the bottom of this message.
>
>> > It might make sense to create some CRC helper functions, but last time
>> > I checked there are so few users of CRC in skbufs I'm not even sure
>> > that would make sense.
>
> This is exactly the cause of issues I see with SCTP. These packets can be
> wrongly checksummed using skb_checksum_help, or simply not checksummed at
> all; and in both cases, the packet goes out from the NIC with wrong L4
> checksum.
>
Okay, makes sense. Please consider doing the following:

- Add a bit to skbuf called something like "csum_not_inet". When
ip_summed = CHECKSUM_PARTIAL and this bit is set that means we are
dealing with something other than an Internet checksum.
- At the top of skb_checksum_help (or maybe before the point where the
inet specific checksum start begins do something like:

   if (unlikely(skb->csum_not_inet))
       return skb_checksum_help_not_inet(...);

   The rest of skb_checksum_help should remained unchanged.

- Add a description of the new bit and how skb_checksum_help can work
to the comments for CHECKSUM_PARTIAL in skbuff.h
- Add FCOE to the list of protocol that can set CHECKSUM_UNNECESSARY
for a CRC/csum
- Add a note to CHECKSUM_COMPLETE section that it can only refer to an
Internet checksum

Thanks,
Tom

> For example: there are scenarios, even the trivial one below, where skb
> carrying SCTP packets are wrongly checksummed, because the originating
> socket read NETIF_F_SCTP_CRC bit in the underlying device features.
> Then, after the kernel forwards the skb, the final transmission
> happens on another device where CRC offload is not available: this
> typically leads to bad checksums on transmitted SCTP packets.
>
>
>
> namespace 1 |                   namespace 2
>             |
>             |                      br0
>             |         +------- Linux bridge -------+
>             |         |                            |
>             |         V                            V
> vethA <-----------> vethB                        eth0
>             |
>             |
>
> when a socket bound to vethA in namespace 1 generates an INIT packet,
> it's not checksummed since veth devices have NETIF_F_SCTP_CRC set [1].
> Then, after vethB receives the packet in namespace 2, linux bridge
> forwards it to eth0, and (depending on eth0 driver code), it will be
> transmitted with wrong CRC32c or simply dropped.
>
> On Tue, 2017-01-24 at 16:35 +0000, David Laight wrote:
>>
>> I can imagine horrid things happening if someone tries to encapsulate
>> SCTP/IP in UDP (or worse UDP/IP in SCTP).
>>
>> For UDP in UDP I suspect that CHECKSUM_COMPLETE on an inner UDP packet
>> allows the outer checksum be calculated by ignoring the inner packet
>> (since it sums to zero).
>> This just isn't true if SCTP is involved.
>> There are tricks to generate a crc of a longer packet, but they'd only
>> work for SCTP in SCTP.
>>
>> For non-encapsulated packets it is a different matter.
>
> If we limit the scope to skbs having ip_summed equal to CHECKSUM_PARTIAL,
> like it's done in patch 4, we only need checksumming the packet starting
> from csum_start to its end, and copy the computed value to csum_offset.
> The difficult thing is discriminating skbs that need CRC32c, namely SCTP,
> from the rest of the traffic (that will likely be checksummed by
> skb_checksum_help).
>
> Currently, the only way to fix wrong CRCs in the scenario above is to
> configure tc filter with "csum" action on eth0 egress, to compensate the
> missing capability of eth0 driver to deal with SCTP packets having
> ip_summed equal to CHECKSUM_PARTIAL [2].
>
> Patch 4 in the series is an attempt to solve the issue, both for
> encapsulated and non-encapsulated skbs, calling skb_csum_hwoffload_help()
> inside validate_xmit_skb. In order to look for unchecksummed SCTP packets,
> I took inspiration from a Linux-4.4 commit (6ae23ad36253 "net: Add driver
> helper functions ...) to implement skb_csum_hwoffload_help, then I called
> it in validate_xmit_skb() to fix situations that can't be recovered by the
> NIC driver (it's the case where NETIF_F_CSUM_MASK bits are all zero).
>
> Today most NICs can provide at least HW offload for Internet Checksum:
> that's why I'm a bit doubtful if it's ok to spend extra CPU cycles in
> validate_xmit_skb() to ensure correct CRC in some scenarios.
> Since this issue affects some (not all) NICs, maybe it's better to drop
> patch 4, or part of it, and provide a fix for individual drivers that
> don't currently handle non-checksummed SCTP packets. But to do that, we
> need at least patch 1 and the small code below.
>
> ------------------- 8< --------------------------
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -200,7 +200,8 @@
>   *     accordingly. Note the there is no indication in the skbuff that the
>   *     CHECKSUM_PARTIAL refers to an FCOE checksum, a driver that supports
>   *     both IP checksum offload and FCOE CRC offload must verify which offload
> - *     is configured for a packet presumably by inspecting packet headers.
> + *     is configured for a packet presumably by inspecting packet headers; in
> + *     case, skb_sctp_csum_help is provided to compute CRC on SCTP packets.
>   *
>   * E. Checksumming on output with GSO.
>   *
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ad5959e..fa9be6d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2580,6 +2580,42 @@ int skb_checksum_help(struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(skb_checksum_help);
>
> +int skb_sctp_csum_help(struct sk_buff *skb)
> +{
> +       __le32 crc32c_csum;
> +       int ret = 0, offset;
> +
> +       if (skb->ip_summed != CHECKSUM_PARTIAL)
> +               goto out;
> +       if (unlikely(skb_is_gso(skb)))
> +               goto out;
> +       if (skb_has_shared_frag(skb)) {
> +               ret = __skb_linearize(skb);
> +               if (ret)
> +                       goto out;
> +       }
> +
> +       offset = skb_checksum_start_offset(skb);
> +       crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
> +                                 skb->len - offset, ~(__u32)0,
> +                                 sctp_csum_stub));
> +
> +       offset += offsetof(struct sctphdr, checksum);
> +       BUG_ON(offset >= skb_headlen(skb));
> +
> +       if (skb_cloned(skb) &&
> +           !skb_clone_writable(skb, offset + sizeof(__le32))) {
> +               ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
> +               if (ret)
> +                       goto out;
> +       }
> +       *(__le32 *)(skb->data + offset) = crc32c_csum;
> +       skb->ip_summed = CHECKSUM_NONE;
> +out:
> +       return ret;
> +}
> +EXPORT_SYMBOL(skb_sctp_csum_help);
> +
>  __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
>  {
>         __be16 type = skb->protocol;
> --
> 2.7.4
> ------------------- >8 --------------------------
>
> Thank you again for paying attention to this, and I would appreciate if
> you share your opinion.
>
> Notes:
>
> [1] see commit c80fafbbb59e ("veth: sctp: add NETIF_F_SCTP_CRC to device
> features")
> [2] see commit c008b33f3ef ("net/sched: act_csum: compute crc32c on SCTP
> packets").  We could also turn off NETIF_F_SCTP_CRC bit from vethA, but
> this would generate useless crc32c calculations if the SCTP server is not
> outside the physical node (e.g. it is bound to br0), leading to a
> throughput degradation.
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-02-02 18:08         ` Tom Herbert
@ 2017-02-27 13:39           ` Davide Caratti
  2017-02-27 15:11             ` Tom Herbert
                               ` (4 more replies)
  0 siblings, 5 replies; 51+ messages in thread
From: Davide Caratti @ 2017-02-27 13:39 UTC (permalink / raw)
  To: David Laight, 'Tom Herbert'
  Cc: David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

On Mon, 2017-01-23 at 12:59 -0800, Tom Herbert wrote:
> > > > It might make sense to create some CRC helper functions, but last time
> > > > I checked there are so few users of CRC in skbufs I'm not even sure
> > > > that would make sense.

hello Tom and David,

after some (thinking + testing) time, I'm going to re-post this RFC as v2 with
some feedbacks. Thank you in advance for looking at it!

On Thu, 2017-02-02 at 10:08 -0800, Tom Herbert wrote:
> On Thu, 2017-02-02 at 16:07 +0100, Davide Caratti wrote:
> > This is exactly the cause of issues I see with SCTP. These packets can be
> > wrongly checksummed using skb_checksum_help, or simply not checksummed at
> > all; and in both cases, the packet goes out from the NIC with wrong L4
> > checksum.
> > 
> Okay, makes sense. Please consider doing the following:
> 
> - Add a bit to skbuf called something like "csum_not_inet". When
> ip_summed = CHECKSUM_PARTIAL and this bit is set that means we are
> dealing with something other than an Internet checksum.

Ok, done. Another solution would be to extend possible values of
skb->ip_summed, and define a new value suitable for identifying
not-yet-checksummed SCTP packets (something like CRC32C_PARTIAL). Since
skb->ip_summed is 2-bit wide, the overall effect on skb metadata is the
same as adding skb->csum_not_inet [1].

> - At the top of skb_checksum_help (or maybe before the point where the
> inet specific checksum start begins do something like:
> 
>    if (unlikely(skb->csum_not_inet))
>        return skb_checksum_help_not_inet(...);
> 
>    The rest of skb_checksum_help should remained unchanged.

According to documentation [2], validate_xmit_skb() is a good place where
the if() statement above can be done, to preserve the possibility of having
the CRC32c computation offloaded by the NIC hardware:

if (unlikely(skb->csum_not_inet && !(features & NETIF_F_SCTP_CRC))
	       return skb_checksum_help_not_inet(...);

On Thu, 2017-02-02 at 16:55 +0000, David Laight wrote:
> 
> I'd put the onus on any such interface to perform the checksum (and
> set CHECKSUM_COMPLETE (or is it UNNECESSARY?) before passing the 
> message onto an interface that doesn't advertise CRC32 support.
> 
> You certainly don't want to have to go through all the ethernet drivers!

Ideally, a driver not able to offload checksum computation should call
skb_checksum_help() or skb_sctp_csum_help() to resolve CHECKSUM_PARTIAL
and turn it to CHECKSUM_NONE.
But this wouldn't solve all possible setups: there can be scenarios
where the NIC is configured with NETIF_F_SCTP_CRC set and NETIF_F_CSUM_HW
cleared (it's evil, but possible). In this situation, non-GSO SCTP packets
having CHECKSUM_PARTIAL will be systematically corrupted when they are
processed by validate_xmit_skb().

On Thu, 2017-02-02 at 10:08 -0800, Tom Herbert wrote:

> 
> - Add a description of the new bit and how skb_checksum_help can work
> to the comments for CHECKSUM_PARTIAL in skbuff.h

Done.

> 
> - Add FCOE to the list of protocol that can set CHECKSUM_UNNECESSARY
> for a CRC/csum

Done.

> 
> - Add a note to CHECKSUM_COMPLETE section that it can only refer to an
> Internet checksum

Done.

/* references + notes */

[1] ... this recalls to latest comment from David Laight:
On Thu, 2017-02-02 at 16:55 +0000, David Laight wrote:
> 
> I have to admit to not knowing exactly what the CHECKSUM_xxx flags
> actually mean. I have a good idea about what the intention is though.

According to domumentation, CHECKSUM_COMPLETE and CHECKSUM_UNNECESSARY are
not used for SCTP (nor in the TX path at all); nevertheless, IPVS snat/dnat
actually set CHECKSUM_UNNECESSARY on SCTP packets after the checksum is
updated (see 97203abe6bc4 "net: ipvs: sctp: do not recalc...).

I'm not sure if setting CHECKSUM_UNNECESSARY fits my case, because this would
implicitly skip RX validation when using devices like veth or loopback.

[2] Documentation/networking/checksum_offloads.txt

regards,
--
davide

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-02-27 13:39           ` Davide Caratti
@ 2017-02-27 15:11             ` Tom Herbert
  2017-02-28 10:31               ` Davide Caratti
  2017-02-28 10:32             ` [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
                               ` (3 subsequent siblings)
  4 siblings, 1 reply; 51+ messages in thread
From: Tom Herbert @ 2017-02-27 15:11 UTC (permalink / raw)
  To: Davide Caratti
  Cc: David Laight, David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

On Mon, Feb 27, 2017 at 5:39 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> On Mon, 2017-01-23 at 12:59 -0800, Tom Herbert wrote:
>> > > > It might make sense to create some CRC helper functions, but last time
>> > > > I checked there are so few users of CRC in skbufs I'm not even sure
>> > > > that would make sense.
>
> hello Tom and David,
>
> after some (thinking + testing) time, I'm going to re-post this RFC as v2 with
> some feedbacks. Thank you in advance for looking at it!
>
> On Thu, 2017-02-02 at 10:08 -0800, Tom Herbert wrote:
>> On Thu, 2017-02-02 at 16:07 +0100, Davide Caratti wrote:
>> > This is exactly the cause of issues I see with SCTP. These packets can be
>> > wrongly checksummed using skb_checksum_help, or simply not checksummed at
>> > all; and in both cases, the packet goes out from the NIC with wrong L4
>> > checksum.
>> >
>> Okay, makes sense. Please consider doing the following:
>>
>> - Add a bit to skbuf called something like "csum_not_inet". When
>> ip_summed = CHECKSUM_PARTIAL and this bit is set that means we are
>> dealing with something other than an Internet checksum.
>
> Ok, done. Another solution would be to extend possible values of
> skb->ip_summed, and define a new value suitable for identifying
> not-yet-checksummed SCTP packets (something like CRC32C_PARTIAL). Since
> skb->ip_summed is 2-bit wide, the overall effect on skb metadata is the
> same as adding skb->csum_not_inet [1].
>
>> - At the top of skb_checksum_help (or maybe before the point where the
>> inet specific checksum start begins do something like:
>>
>>    if (unlikely(skb->csum_not_inet))
>>        return skb_checksum_help_not_inet(...);
>>
>>    The rest of skb_checksum_help should remained unchanged.
>
> According to documentation [2], validate_xmit_skb() is a good place where
> the if() statement above can be done, to preserve the possibility of having
> the CRC32c computation offloaded by the NIC hardware:
>
> if (unlikely(skb->csum_not_inet && !(features & NETIF_F_SCTP_CRC))
>                return skb_checksum_help_not_inet(...);
>
> On Thu, 2017-02-02 at 16:55 +0000, David Laight wrote:
>>
>> I'd put the onus on any such interface to perform the checksum (and
>> set CHECKSUM_COMPLETE (or is it UNNECESSARY?) before passing the
>> message onto an interface that doesn't advertise CRC32 support.
>>
>> You certainly don't want to have to go through all the ethernet drivers!
>
> Ideally, a driver not able to offload checksum computation should call
> skb_checksum_help() or skb_sctp_csum_help() to resolve CHECKSUM_PARTIAL
> and turn it to CHECKSUM_NONE.
> But this wouldn't solve all possible setups: there can be scenarios
> where the NIC is configured with NETIF_F_SCTP_CRC set and NETIF_F_CSUM_HW
> cleared (it's evil, but possible). In this situation, non-GSO SCTP packets
> having CHECKSUM_PARTIAL will be systematically corrupted when they are
> processed by validate_xmit_skb().
>
> On Thu, 2017-02-02 at 10:08 -0800, Tom Herbert wrote:
>
>>
>> - Add a description of the new bit and how skb_checksum_help can work
>> to the comments for CHECKSUM_PARTIAL in skbuff.h
>
> Done.
>
>>
>> - Add FCOE to the list of protocol that can set CHECKSUM_UNNECESSARY
>> for a CRC/csum
>
> Done.
>
>>
>> - Add a note to CHECKSUM_COMPLETE section that it can only refer to an
>> Internet checksum
>
> Done.
>
> /* references + notes */
>
> [1] ... this recalls to latest comment from David Laight:
> On Thu, 2017-02-02 at 16:55 +0000, David Laight wrote:
>>
>> I have to admit to not knowing exactly what the CHECKSUM_xxx flags
>> actually mean. I have a good idea about what the intention is though.
>
> According to domumentation, CHECKSUM_COMPLETE and CHECKSUM_UNNECESSARY are
> not used for SCTP (nor in the TX path at all); nevertheless, IPVS snat/dnat
> actually set CHECKSUM_UNNECESSARY on SCTP packets after the checksum is
> updated (see 97203abe6bc4 "net: ipvs: sctp: do not recalc...).
>
CHECKSUM_PARTIAL is the preferred mechanism on the transmit path this
defers defers the checksum computation as long as possible.
Unfortunately, if SCTP is encapsulated in UDP we will probably need to
run the SCTP CRC on the host which will be done with your changes to
skb_checksum_help.

> I'm not sure if setting CHECKSUM_UNNECESSARY fits my case, because this would
> implicitly skip RX validation when using devices like veth or loopback.
>
CHECKSUM_UNNECESSARY can be used in the transmit path (really the
forwarding path), however this I think this must imply that the
checksum in the packet must be correct. Please see my post about
drivers that are mistakingly using CHECKSUM_UNNECESSARY with LRO since
the checksum in the packet sent into the stack is not correct.

Tom

> [2] Documentation/networking/checksum_offloads.txt
>
> regards,
> --
> davide

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH net-next 2/5] net: split skb_checksum_help
  2017-02-27 15:11             ` Tom Herbert
@ 2017-02-28 10:31               ` Davide Caratti
  0 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-02-28 10:31 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David Laight, David S. Miller, Linux Kernel Network Developers,
	linux-sctp@vger.kernel.org, Marcelo Ricardo Leitner

On Mon, 2017-02-27 at 07:11 -0800, Tom Herbert wrote:
> CHECKSUM_PARTIAL is the preferred mechanism on the transmit path this
> defers defers the checksum computation as long as possible.
> Unfortunately, if SCTP is encapsulated in UDP we will probably need to
> run the SCTP CRC on the host which will be done with your changes to
> skb_checksum_help.

right. Tunnel devices have NETIF_F_SCTP_CRC bit cleared and
NETIF_F_HW_CSUM bit set: so, in this case csum_not_inet can help
recovering non-GSO SCTP packets having ip_summed equal to
CHECKSUM_PARTIAL.

> > I'm not sure if setting CHECKSUM_UNNECESSARY fits my case, because this would
> > implicitly skip RX validation when using devices like veth or loopback.
> >
> CHECKSUM_UNNECESSARY can be used in the transmit path (really the
> forwarding path), however this I think this must imply that the
> checksum in the packet must be correct. Please see my post about
> drivers that are mistakingly using CHECKSUM_UNNECESSARY with LRO since
> the checksum in the packet sent into the stack is not correct.

Ok, now I'm more convinced to use CHECKSUM_NONE :-)

thank you for the attention!
regards
--
davide
 



^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-02-27 13:39           ` Davide Caratti
  2017-02-27 15:11             ` Tom Herbert
@ 2017-02-28 10:32             ` Davide Caratti
  2017-02-28 22:46               ` Alexander Duyck
  2017-02-28 10:32             ` [PATCH RFC net-next v2 2/4] net: introduce skb_sctp_csum_help Davide Caratti
                               ` (2 subsequent siblings)
  4 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-02-28 10:32 UTC (permalink / raw)
  To: David Laight, Tom Herbert
  Cc: David S . Miller, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
it can't be used in net core. Like it has been done previously with other
symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
to allow computation of SCTP checksum in net core after sctp.ko (and thus
libcrc32c) has been loaded.

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 20 ++++++++++++++++++++
 net/sctp/offload.c     |  7 +++++++
 3 files changed, 29 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 69ccd26..cab9a32 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3125,6 +3125,8 @@ struct skb_checksum_ops {
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
 };
 
+extern const struct skb_checksum_ops *sctp_csum_stub __read_mostly;
+
 __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 		      __wsum csum, const struct skb_checksum_ops *ops);
 __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f355795..64fd8fd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2242,6 +2242,26 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 }
 EXPORT_SYMBOL(skb_copy_and_csum_bits);
 
+static __wsum warn_sctp_csum_update(const void *buff, int len, __wsum sum)
+{
+	net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
+	return 0;
+}
+
+static __wsum warn_sctp_csum_combine(__wsum csum, __wsum csum2,
+				     int offset, int len)
+{
+	net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
+	return 0;
+}
+
+const struct skb_checksum_ops *sctp_csum_stub __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = warn_sctp_csum_update,
+	.combine = warn_sctp_csum_combine,
+};
+EXPORT_SYMBOL(sctp_csum_stub);
+
  /**
  *	skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
  *	@from: source buffer
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 4f5a2b5..e9c3db0 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -98,6 +98,12 @@ static const struct net_offload sctp6_offload = {
 	},
 };
 
+static const struct skb_checksum_ops *sctp_csum_ops __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = sctp_csum_update,
+	.combine = sctp_csum_combine,
+};
+
 int __init sctp_offload_init(void)
 {
 	int ret;
@@ -110,6 +116,7 @@ int __init sctp_offload_init(void)
 	if (ret)
 		goto ipv4;
 
+	sctp_csum_stub = sctp_csum_ops;
 	return ret;
 
 ipv4:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v2 2/4] net: introduce skb_sctp_csum_help
  2017-02-27 13:39           ` Davide Caratti
  2017-02-27 15:11             ` Tom Herbert
  2017-02-28 10:32             ` [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
@ 2017-02-28 10:32             ` Davide Caratti
  2017-02-28 10:32             ` [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb Davide Caratti
  2017-02-28 10:32             ` [PATCH RFC net-next v2 4/4] Documentation: update notes on checksum offloading Davide Caratti
  4 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-02-28 10:32 UTC (permalink / raw)
  To: David Laight, Tom Herbert
  Cc: David S . Miller, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

skb_sctp_csum_help is like skb_checksum_help, but it is designed for
checksumming SCTP packets using crc32c (see RFC3309), provided that
sctp.ko has been loaded before. In case sctp.ko is not loaded, invoking
skb_sctp_csum_help on a skb results in the following printout:

sk_buff: attempt to compute crc32c without sctp.ko

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/netdevice.h |  1 +
 include/linux/skbuff.h    |  3 ++-
 net/core/dev.c            | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f40f0ab..8c34735 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3918,6 +3918,7 @@ void netdev_rss_key_fill(void *buffer, size_t len);
 
 int dev_get_nest_level(struct net_device *dev);
 int skb_checksum_help(struct sk_buff *skb);
+int skb_sctp_csum_help(struct sk_buff *skb);
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
 struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index cab9a32..0671131 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -192,7 +192,8 @@
  *     accordingly. Note the there is no indication in the skbuff that the
  *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
  *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers.
+ *     is configured for a packet presumably by inspecting packet headers; in
+ *     case, skb_sctp_csum_help is provided to compute CRC on SCTP packets.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
diff --git a/net/core/dev.c b/net/core/dev.c
index 05d19c6..b9fb843 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -140,6 +140,7 @@
 #include <linux/hrtimer.h>
 #include <linux/netfilter_ingress.h>
 #include <linux/crash_dump.h>
+#include <linux/sctp.h>
 
 #include "net-sysfs.h"
 
@@ -2578,6 +2579,45 @@ int skb_checksum_help(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
+int skb_sctp_csum_help(struct sk_buff *skb)
+{
+	__le32 crc32c_csum;
+	int ret = 0, offset;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		goto out;
+
+	if (unlikely(skb_is_gso(skb)))
+		goto out;
+
+	/* Before computing a checksum, we should make sure no frag could
+	 * be modified by an external entity : checksum could be wrong.
+	 */
+	if (unlikely(skb_has_shared_frag(skb))) {
+		ret = __skb_linearize(skb);
+		if (ret)
+			goto out;
+	}
+
+	offset = skb_checksum_start_offset(skb);
+	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
+						  skb->len - offset, ~(__u32)0,
+						  sctp_csum_stub));
+	offset += offsetof(struct sctphdr, checksum);
+	BUG_ON(offset >= skb_headlen(skb));
+
+	if (skb_cloned(skb) &&
+	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+	*(__le32 *)(skb->data + offset) = crc32c_csum;
+	skb->ip_summed = CHECKSUM_NONE;
+out:
+	return ret;
+}
+
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
 	__be16 type = skb->protocol;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb
  2017-02-27 13:39           ` Davide Caratti
                               ` (2 preceding siblings ...)
  2017-02-28 10:32             ` [PATCH RFC net-next v2 2/4] net: introduce skb_sctp_csum_help Davide Caratti
@ 2017-02-28 10:32             ` Davide Caratti
  2017-02-28 19:50               ` Tom Herbert
  2017-02-28 10:32             ` [PATCH RFC net-next v2 4/4] Documentation: update notes on checksum offloading Davide Caratti
  4 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-02-28 10:32 UTC (permalink / raw)
  To: David Laight, Tom Herbert
  Cc: David S . Miller, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

Introduce skb->csum_not_inet to identify not-yet-checksummed SCTP packets.
Use this bit in combination with netdev feature bit in validate_xmit_skb,
to discriminate whether skb needs crc32c or 2-complement Internet Checksum
(or none of the two, when the underlying device can do checksum offload).

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h                |  1 +
 net/core/dev.c                        | 14 ++++++++++++--
 net/netfilter/ipvs/ip_vs_proto_sctp.c |  1 +
 net/netfilter/nf_nat_proto_sctp.c     |  1 +
 net/sched/act_csum.c                  |  1 +
 net/sctp/output.c                     |  1 +
 6 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0671131..236b7d9 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -759,6 +759,7 @@ struct sk_buff {
 	__u8			tc_redirected:1;
 	__u8			tc_from_ingress:1;
 #endif
+	__u8			csum_not_inet:1;
 
 #ifdef CONFIG_NET_SCHED
 	__u16			tc_index;	/* traffic control index */
diff --git a/net/core/dev.c b/net/core/dev.c
index b9fb843..fae3217 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2614,6 +2614,7 @@ int skb_sctp_csum_help(struct sk_buff *skb)
 	}
 	*(__le32 *)(skb->data + offset) = crc32c_csum;
 	skb->ip_summed = CHECKSUM_NONE;
+	skb->csum_not_inet = 0;
 out:
 	return ret;
 }
@@ -2960,6 +2961,16 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
 	return skb;
 }
 
+static int skb_csum_hwoffload_help(struct sk_buff *skb,
+				   netdev_features_t features)
+{
+	if (unlikely(skb->csum_not_inet))
+		return !(features & NETIF_F_SCTP_CRC) ?
+				skb_sctp_csum_help(skb) : 0;
+
+	return !(features & NETIF_F_CSUM_MASK) ? skb_checksum_help(skb) : 0;
+}
+
 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
 {
 	netdev_features_t features;
@@ -2995,8 +3006,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 			else
 				skb_set_transport_header(skb,
 							 skb_checksum_start_offset(skb));
-			if (!(features & NETIF_F_CSUM_MASK) &&
-			    skb_checksum_help(skb))
+			if (skb_csum_hwoffload_help(skb, features))
 				goto out_kfree_skb;
 		}
 	}
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index d952d67..4972a60 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -81,6 +81,7 @@ static void sctp_nat_csum(struct sk_buff *skb, sctp_sctphdr_t *sctph,
 			  unsigned int sctphoff)
 {
 	sctph->checksum = sctp_compute_cksum(skb, sctphoff);
+	skb->csum_not_inet = 0;
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
 }
 
diff --git a/net/netfilter/nf_nat_proto_sctp.c b/net/netfilter/nf_nat_proto_sctp.c
index 31d3586..9459b88 100644
--- a/net/netfilter/nf_nat_proto_sctp.c
+++ b/net/netfilter/nf_nat_proto_sctp.c
@@ -49,6 +49,7 @@ sctp_manip_pkt(struct sk_buff *skb,
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
 		hdr->checksum = sctp_compute_cksum(skb, hdroff);
+		skb->csum_not_inet = 0;
 		skb->ip_summed = CHECKSUM_NONE;
 	}
 
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index e978ccd4..85cb150 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -337,6 +337,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
 
 	sctph->checksum = sctp_compute_cksum(skb,
 					     skb_network_offset(skb) + ihl);
+	skb->csum_not_inet = 0;
 	skb->ip_summed = CHECKSUM_NONE;
 
 	return 1;
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 814eac0..0dc227b 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -528,6 +528,7 @@ static int sctp_packet_pack(struct sctp_packet *packet,
 	} else {
 chksum:
 		head->ip_summed = CHECKSUM_PARTIAL;
+		head->csum_not_inet = 1;
 		head->csum_start = skb_transport_header(head) - head->head;
 		head->csum_offset = offsetof(struct sctphdr, checksum);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v2 4/4] Documentation: update notes on checksum offloading
  2017-02-27 13:39           ` Davide Caratti
                               ` (3 preceding siblings ...)
  2017-02-28 10:32             ` [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb Davide Caratti
@ 2017-02-28 10:32             ` Davide Caratti
  4 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-02-28 10:32 UTC (permalink / raw)
  To: David Laight, Tom Herbert
  Cc: David S . Miller, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

Add description of skb_sctp_csum_help in networking/checksum-offload.txt,
and document its usage in combination with skb->csum_not_inet. While at
it, remove reference to skb_csum_off_chk* functions, since they have been
removed from Linux source tree since commit cf53b1da73bd ("Revert "net:
Add driver helper functions to determine checksum""), and add missing
explaination of CHECKSUM_UNNECESSARY for FCOE protocol.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 Documentation/networking/checksum-offloads.txt |  7 ++++---
 include/linux/skbuff.h                         | 25 ++++++++++++-------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
index 56e3686..81534e9 100644
--- a/Documentation/networking/checksum-offloads.txt
+++ b/Documentation/networking/checksum-offloads.txt
@@ -49,8 +49,8 @@ A driver declares its offload capabilities in netdev->hw_features; see
  and csum_offset given in the SKB; if it tries to deduce these itself in
  hardware (as some NICs do) the driver should check that the values in the
  SKB match those which the hardware will deduce, and if not, fall back to
- checksumming in software instead (with skb_checksum_help or one of the
- skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h).  This
+ checksumming in software instead (with skb_checksum_help or
+ skb_sctp_csum_help functions as mentioned in include/linux/skbuff.h). This
  is a pain, but that's what you get when hardware tries to be clever.
 
 The stack should, for the most part, assume that checksum offload is
@@ -60,7 +60,8 @@ The stack should, for the most part, assume that checksum offload is
  may include other offloads besides TX Checksum Offload) and, if they are
  not supported or enabled on the device (determined by netdev->features),
  performs the corresponding offload in software.  In the case of TX
- Checksum Offload, that means calling skb_checksum_help(skb).
+ Checksum Offload, that means calling skb_sctp_csum_help(skb) for SCTP
+ packets, and skb_checksum_help(skb) for other packets.
 
 
 LCO: Local Checksum Offload
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 236b7d9..12d3625 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -108,6 +108,7 @@
  *       may perform further validation in this case.
  *     GRE: only if the checksum is present in the header.
  *     SCTP: indicates the CRC in SCTP header has been validated.
+ *     FCOE: indicates the CRC in FC frame has been validated.
  *
  *   skb->csum_level indicates the number of consecutive checksums found in
  *   the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
@@ -161,14 +162,13 @@
  *
  *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
  *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
- *   checksum offload capability. If a	device has limited checksum capabilities
- *   (for instance can only perform NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM as
- *   described above) a helper function can be called to resolve
- *   CHECKSUM_PARTIAL. The helper functions are skb_csum_off_chk*. The helper
- *   function takes a spec argument that describes the protocol layer that is
- *   supported for checksum offload and can be called for each packet. If a
- *   packet does not match the specification for offload, skb_checksum_help
- *   is called to resolve the checksum.
+ *   checksum offload capability. If a device has limited checksum capabilities
+ *   (for instance it can't perform NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM as
+ *   described above) a helper function (namely skb_csum_hwoffload_help) can
+ *   be called to resolve CHECKSUM_PARTIAL. This function uses netdev_features_t
+ *   to have the Internet Checksum computed by HW, in case any feature belonging
+ *   to NETIF_F_CSUM_MASK is set, or by software using skb_checksum_help().
+ *   See also Section D.
  *
  * CHECKSUM_NONE:
  *
@@ -189,11 +189,10 @@
  *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
  *     offloading the SCTP CRC in a packet. To perform this offload the stack
  *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
- *     accordingly. Note the there is no indication in the skbuff that the
- *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
- *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers; in
- *     case, skb_sctp_csum_help is provided to compute CRC on SCTP packets.
+ *     accordingly. skb->csum_not_inet is an indication in the skbuff that the
+ *     CHECKSUM_PARTIAL refers to an SCTP checksum: a driver can use it to
+ *     decide whether skb_checksum_help() or skb_sctp_csum_help() have to be
+ *     called on a sk_buff having ip_summed set to CHECKSUM_PARTIAL.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb
  2017-02-28 10:32             ` [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb Davide Caratti
@ 2017-02-28 19:50               ` Tom Herbert
  0 siblings, 0 replies; 51+ messages in thread
From: Tom Herbert @ 2017-02-28 19:50 UTC (permalink / raw)
  To: Davide Caratti
  Cc: David Laight, David S . Miller, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

On Tue, Feb 28, 2017 at 2:32 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> Introduce skb->csum_not_inet to identify not-yet-checksummed SCTP packets.
> Use this bit in combination with netdev feature bit in validate_xmit_skb,
> to discriminate whether skb needs crc32c or 2-complement Internet Checksum
> (or none of the two, when the underlying device can do checksum offload).
>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  include/linux/skbuff.h                |  1 +
>  net/core/dev.c                        | 14 ++++++++++++--
>  net/netfilter/ipvs/ip_vs_proto_sctp.c |  1 +
>  net/netfilter/nf_nat_proto_sctp.c     |  1 +
>  net/sched/act_csum.c                  |  1 +
>  net/sctp/output.c                     |  1 +
>  6 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 0671131..236b7d9 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -759,6 +759,7 @@ struct sk_buff {
>         __u8                    tc_redirected:1;
>         __u8                    tc_from_ingress:1;
>  #endif
> +       __u8                    csum_not_inet:1;
>
Unfortunately this potentially pushes the skbuf flags over 32 bits if
I count correctly. I suggest that you rename csum_bad to
csum_not_inet. Looks like csum_bad is only set by a grand total of one
driver and I don't believe that is enough to justify its existence.
It's probably a good time to remove it.

>  #ifdef CONFIG_NET_SCHED
>         __u16                   tc_index;       /* traffic control index */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b9fb843..fae3217 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2614,6 +2614,7 @@ int skb_sctp_csum_help(struct sk_buff *skb)
>         }
>         *(__le32 *)(skb->data + offset) = crc32c_csum;
>         skb->ip_summed = CHECKSUM_NONE;
> +       skb->csum_not_inet = 0;
>  out:
>         return ret;
>  }
> @@ -2960,6 +2961,16 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
>         return skb;
>  }
>
> +static int skb_csum_hwoffload_help(struct sk_buff *skb,
> +                                  netdev_features_t features)
> +{
> +       if (unlikely(skb->csum_not_inet))
> +               return !(features & NETIF_F_SCTP_CRC) ?
> +                               skb_sctp_csum_help(skb) : 0;
> +
Return value looks complex. Maybe we should just change
skb_csum_*_help to return bool, true of checksum was handled false if
not.

> +       return !(features & NETIF_F_CSUM_MASK) ? skb_checksum_help(skb) : 0;
> +}
> +
>  static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
>  {
>         netdev_features_t features;
> @@ -2995,8 +3006,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
>                         else
>                                 skb_set_transport_header(skb,
>                                                          skb_checksum_start_offset(skb));
> -                       if (!(features & NETIF_F_CSUM_MASK) &&
> -                           skb_checksum_help(skb))
> +                       if (skb_csum_hwoffload_help(skb, features))
>                                 goto out_kfree_skb;
>                 }
>         }
> diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> index d952d67..4972a60 100644
> --- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
> +++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> @@ -81,6 +81,7 @@ static void sctp_nat_csum(struct sk_buff *skb, sctp_sctphdr_t *sctph,
>                           unsigned int sctphoff)
>  {
>         sctph->checksum = sctp_compute_cksum(skb, sctphoff);
> +       skb->csum_not_inet = 0;
>         skb->ip_summed = CHECKSUM_UNNECESSARY;
>  }
>
> diff --git a/net/netfilter/nf_nat_proto_sctp.c b/net/netfilter/nf_nat_proto_sctp.c
> index 31d3586..9459b88 100644
> --- a/net/netfilter/nf_nat_proto_sctp.c
> +++ b/net/netfilter/nf_nat_proto_sctp.c
> @@ -49,6 +49,7 @@ sctp_manip_pkt(struct sk_buff *skb,
>
>         if (skb->ip_summed != CHECKSUM_PARTIAL) {
>                 hdr->checksum = sctp_compute_cksum(skb, hdroff);
> +               skb->csum_not_inet = 0;
>                 skb->ip_summed = CHECKSUM_NONE;
>         }
>
> diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
> index e978ccd4..85cb150 100644
> --- a/net/sched/act_csum.c
> +++ b/net/sched/act_csum.c
> @@ -337,6 +337,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
>
>         sctph->checksum = sctp_compute_cksum(skb,
>                                              skb_network_offset(skb) + ihl);
> +       skb->csum_not_inet = 0;
>         skb->ip_summed = CHECKSUM_NONE;
>
>         return 1;
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 814eac0..0dc227b 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -528,6 +528,7 @@ static int sctp_packet_pack(struct sctp_packet *packet,
>         } else {
>  chksum:
>                 head->ip_summed = CHECKSUM_PARTIAL;
> +               head->csum_not_inet = 1;
>                 head->csum_start = skb_transport_header(head) - head->head;
>                 head->csum_offset = offsetof(struct sctphdr, checksum);
>         }
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-02-28 10:32             ` [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
@ 2017-02-28 22:46               ` Alexander Duyck
  2017-03-01  3:17                 ` Tom Herbert
                                   ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Alexander Duyck @ 2017-02-28 22:46 UTC (permalink / raw)
  To: Davide Caratti
  Cc: David Laight, Tom Herbert, David S . Miller,
	Linux Kernel Network Developers, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner

On Tue, Feb 28, 2017 at 2:32 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
> it can't be used in net core. Like it has been done previously with other
> symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
> to allow computation of SCTP checksum in net core after sctp.ko (and thus
> libcrc32c) has been loaded.

At a minimum the name really needs to change.  SCTP does not do
checksums.  It does a CRC, and a CRC is a very different thing.  The
fact that somebody decided that offloading a CRC could use the same
framework is very unfortunate, and your patch descriptions in this
whole set are calling out a CRC as checksums which it is not.

I don't want to see anything "checksum" or "csum" related in the
naming when it comes to dealing with SCTP unless we absolutely have to
have it.  So any function names or structures with sctp in the name
should call out "crc32" or "crc", please don't use checksum.

> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  include/linux/skbuff.h |  2 ++
>  net/core/skbuff.c      | 20 ++++++++++++++++++++
>  net/sctp/offload.c     |  7 +++++++
>  3 files changed, 29 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 69ccd26..cab9a32 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -3125,6 +3125,8 @@ struct skb_checksum_ops {
>         __wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
>  };
>
> +extern const struct skb_checksum_ops *sctp_csum_stub __read_mostly;
> +
>  __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
>                       __wsum csum, const struct skb_checksum_ops *ops);
>  __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index f355795..64fd8fd 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2242,6 +2242,26 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
>  }
>  EXPORT_SYMBOL(skb_copy_and_csum_bits);
>
> +static __wsum warn_sctp_csum_update(const void *buff, int len, __wsum sum)
> +{
> +       net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
> +       return 0;
> +}
> +
> +static __wsum warn_sctp_csum_combine(__wsum csum, __wsum csum2,
> +                                    int offset, int len)
> +{
> +       net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
> +       return 0;
> +}
> +
> +const struct skb_checksum_ops *sctp_csum_stub __read_mostly > +       &(struct skb_checksum_ops) {
> +       .update  = warn_sctp_csum_update,
> +       .combine = warn_sctp_csum_combine,
> +};
> +EXPORT_SYMBOL(sctp_csum_stub);
> +
>   /**
>   *     skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
>   *     @from: source buffer
> diff --git a/net/sctp/offload.c b/net/sctp/offload.c
> index 4f5a2b5..e9c3db0 100644
> --- a/net/sctp/offload.c
> +++ b/net/sctp/offload.c
> @@ -98,6 +98,12 @@ static const struct net_offload sctp6_offload = {
>         },
>  };
>
> +static const struct skb_checksum_ops *sctp_csum_ops __read_mostly > +       &(struct skb_checksum_ops) {
> +       .update  = sctp_csum_update,
> +       .combine = sctp_csum_combine,
> +};
> +
>  int __init sctp_offload_init(void)
>  {
>         int ret;
> @@ -110,6 +116,7 @@ int __init sctp_offload_init(void)
>         if (ret)
>                 goto ipv4;
>
> +       sctp_csum_stub = sctp_csum_ops;
>         return ret;
>
>  ipv4:
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-02-28 22:46               ` Alexander Duyck
@ 2017-03-01  3:17                 ` Tom Herbert
  2017-03-01 10:53                 ` David Laight
  2017-03-06 21:51                 ` Davide Caratti
  2 siblings, 0 replies; 51+ messages in thread
From: Tom Herbert @ 2017-03-01  3:17 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Davide Caratti, David Laight, David S . Miller,
	Linux Kernel Network Developers, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner

On Tue, Feb 28, 2017 at 2:46 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Tue, Feb 28, 2017 at 2:32 AM, Davide Caratti <dcaratti@redhat.com> wrote:
>> sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
>> it can't be used in net core. Like it has been done previously with other
>> symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
>> to allow computation of SCTP checksum in net core after sctp.ko (and thus
>> libcrc32c) has been loaded.
>
> At a minimum the name really needs to change.  SCTP does not do
> checksums.  It does a CRC, and a CRC is a very different thing.  The
> fact that somebody decided that offloading a CRC could use the same
> framework is very unfortunate, and your patch descriptions in this
> whole set are calling out a CRC as checksums which it is not.
>
> I don't want to see anything "checksum" or "csum" related in the
> naming when it comes to dealing with SCTP unless we absolutely have to
> have it.  So any function names or structures with sctp in the name
> should call out "crc32" or "crc", please don't use checksum.
>
Alexander,

I agree that internal functions to sctp should not refer to checksum,
but I think we need to take care to be consistent with any external
API (even if somebody made a mistake defining it this way :-) ). As
you know the checksum interface must be very precisely defined, there
is no leeway for ambiguity. Many places in the stack use csum and
CHECKSUM_* to refer to the API not the actual algorithm, others don't
(e.g. CHECKSUM_UNNECESSARY can apply to SCTP checksum,
CHECKSUM_COMPLETE must be an Internet checksum).

For instance, in that light skb_sctp_csum_help is appropriately named
I think because this is being called from skb_csum_help and refers to
the interface to resolve a checksum.

Tom

>> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
>> ---
>>  include/linux/skbuff.h |  2 ++
>>  net/core/skbuff.c      | 20 ++++++++++++++++++++
>>  net/sctp/offload.c     |  7 +++++++
>>  3 files changed, 29 insertions(+)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 69ccd26..cab9a32 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -3125,6 +3125,8 @@ struct skb_checksum_ops {
>>         __wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
>>  };
>>
>> +extern const struct skb_checksum_ops *sctp_csum_stub __read_mostly;
>> +
>>  __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
>>                       __wsum csum, const struct skb_checksum_ops *ops);
>>  __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index f355795..64fd8fd 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -2242,6 +2242,26 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
>>  }
>>  EXPORT_SYMBOL(skb_copy_and_csum_bits);
>>
>> +static __wsum warn_sctp_csum_update(const void *buff, int len, __wsum sum)
>> +{
>> +       net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
>> +       return 0;
>> +}
>> +
>> +static __wsum warn_sctp_csum_combine(__wsum csum, __wsum csum2,
>> +                                    int offset, int len)
>> +{
>> +       net_warn_ratelimited("attempt to compute crc32c without sctp.ko\n");
>> +       return 0;
>> +}
>> +
>> +const struct skb_checksum_ops *sctp_csum_stub __read_mostly >> +       &(struct skb_checksum_ops) {
>> +       .update  = warn_sctp_csum_update,
>> +       .combine = warn_sctp_csum_combine,
>> +};
>> +EXPORT_SYMBOL(sctp_csum_stub);
>> +
>>   /**
>>   *     skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
>>   *     @from: source buffer
>> diff --git a/net/sctp/offload.c b/net/sctp/offload.c
>> index 4f5a2b5..e9c3db0 100644
>> --- a/net/sctp/offload.c
>> +++ b/net/sctp/offload.c
>> @@ -98,6 +98,12 @@ static const struct net_offload sctp6_offload = {
>>         },
>>  };
>>
>> +static const struct skb_checksum_ops *sctp_csum_ops __read_mostly >> +       &(struct skb_checksum_ops) {
>> +       .update  = sctp_csum_update,
>> +       .combine = sctp_csum_combine,
>> +};
>> +
>>  int __init sctp_offload_init(void)
>>  {
>>         int ret;
>> @@ -110,6 +116,7 @@ int __init sctp_offload_init(void)
>>         if (ret)
>>                 goto ipv4;
>>
>> +       sctp_csum_stub = sctp_csum_ops;
>>         return ret;
>>
>>  ipv4:
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-02-28 22:46               ` Alexander Duyck
  2017-03-01  3:17                 ` Tom Herbert
@ 2017-03-01 10:53                 ` David Laight
  2017-03-06 21:51                 ` Davide Caratti
  2 siblings, 0 replies; 51+ messages in thread
From: David Laight @ 2017-03-01 10:53 UTC (permalink / raw)
  To: 'Alexander Duyck', Davide Caratti
  Cc: Tom Herbert, David S . Miller, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

RnJvbTogQWxleGFuZGVyIER1eWNrDQo+IFNlbnQ6IDI4IEZlYnJ1YXJ5IDIwMTcgMjI6NDYNCi4u
Lg0KPiBJIGRvbid0IHdhbnQgdG8gc2VlIGFueXRoaW5nICJjaGVja3N1bSIgb3IgImNzdW0iIHJl
bGF0ZWQgaW4gdGhlDQo+IG5hbWluZyB3aGVuIGl0IGNvbWVzIHRvIGRlYWxpbmcgd2l0aCBTQ1RQ
IHVubGVzcyB3ZSBhYnNvbHV0ZWx5IGhhdmUgdG8NCj4gaGF2ZSBpdC4gIFNvIGFueSBmdW5jdGlv
biBuYW1lcyBvciBzdHJ1Y3R1cmVzIHdpdGggc2N0cCBpbiB0aGUgbmFtZQ0KPiBzaG91bGQgY2Fs
bCBvdXQgImNyYzMyIiBvciAiY3JjIiwgcGxlYXNlIGRvbid0IHVzZSBjaGVja3N1bS4NCg0KVGhl
biBhbHNvIGNoYW5nZSBhbGwgdGhlIHBsYWNlcyB0aGF0IHJlZmVyIHRoZSBJUCAxJ3MgY29tcGxp
bWVudA0KY2hlY2tzdW0gdG8gaXBjaGVja3N1bS4NCg0KCURhdmlkDQoNCg=

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-02-28 22:46               ` Alexander Duyck
  2017-03-01  3:17                 ` Tom Herbert
  2017-03-01 10:53                 ` David Laight
@ 2017-03-06 21:51                 ` Davide Caratti
  2017-03-07 18:06                   ` Alexander Duyck
  2 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-03-06 21:51 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Laight, Tom Herbert, David S . Miller,
	Linux Kernel Network Developers, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner

On Tue, 2017-02-28 at 14:46 -0800, Alexander Duyck wrote:
> On Tue, Feb 28, 2017 at 2:32 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> > 
> > sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
> > it can't be used in net core. Like it has been done previously with other
> > symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
> > to allow computation of SCTP checksum in net core after sctp.ko (and thus
> > libcrc32c) has been loaded.
> 
> At a minimum the name really needs to change.  SCTP does not do
> checksums.  It does a CRC, and a CRC is a very different thing.  The
> fact that somebody decided that offloading a CRC could use the same
> framework is very unfortunate, and your patch descriptions in this
> whole set are calling out a CRC as checksums which it is not.

hello Alexander,

thank you for contributing to this topic. I see there has been a similar
discussion some months ago
(https://www.mail-archive.com/netdev@vger.kernel.org/msg94955.html).

> I don't want to see anything "checksum" or "csum" related in the
> naming when it comes to dealing with SCTP unless we absolutely have
> to have it.  So any function names or structures with sctp in the name
> should call out "crc32" or "crc", please don't use checksum.

On Wed, 2017-03-01 at 10:53 +0000, David Laight wrote:
> Then also change all the places that refer the IP 1's compliment
> checksum to ipchecksum.

(but crc32 uses a different polynomial than crc32c! :-) ) I understand 
your concerns, nevertheless we are writing to a member of struct sctphdr
whose name is 'checksum' since the earliest introduction of SCTP; moreover,
similar terminology ('crc32c checksum') is used throughout all RFC4960.
That's why I don't think anybody will be confused by usage of 'csum' or
'checksum' words.

On Tue, 2017-02-28 at 19:17 -0800, Tom Herbert wrote:
> I agree that internal functions to sctp should not refer to checksum,
> but I think we need to take care to be consistent with any external
> API (even if somebody made a mistake defining it this way :-) ). As
> you know the checksum interface must be very precisely defined, there
> is no leeway for ambiguity.

We can make the new symbols more generic removing 'sctp' from the
symbol name, and writing explicitly that skb needs crc32c (rather than
skb does not need internet checksum).

Proposal:
we use crc32c, possibly combined with 'csum' or 'checksum', just like
it has been done in RFC4960.  So, symbol names can be replaced as follows:

RFC v2 name              | RFC v3 name
-------------------------+-----------------------------
warn_sctp_csum_update    | warn_crc32c_csum_update
warn_sctp_csum_combine   | warn_crc32c_csum_combine
sctp_csum_stub           | crc32c_csum_stub
sctp_csum_ops            | crc32c_csum_ops
skb_sctp_csum_help       | skb_crc32c_csum_help
skb->csum_not_inet       | skb->crc32c_csum

please let me know if the proposal can be acceptable from your point of view.

On Tue, 2017-02-28 at 11:50 -0800, Tom Herbert wrote:
> Unfortunately this potentially pushes the skbuf flags over 32 bits if
> I count correctly. I suggest that you rename csum_bad to
> csum_not_inet. Looks like csum_bad is only set by a grand total of one
> driver and I don't believe that is enough to justify its existence.
> It's probably a good time to remove it.

you are right: find below the current layout obtained with 'allyesconfig':

short unsigned int         queue_mapping;                   /*   140     2 */
unsigned char              __cloned_offset[0];              /*   142     0 */
unsigned char              cloned:1;                        /*   142: 7  1 */
unsigned char              nohdr:1;                         /*   142: 6  1 */
unsigned char              fclone:2;                        /*   142: 4  1 */
unsigned char              peeked:1;                        /*   142: 3  1 */
unsigned char              head_frag:1;                     /*   142: 2  1 */
unsigned char              xmit_more:1;                     /*   142: 1  1 */
unsigned char              __unused:1;                      /*   142: 0  1 */

/* XXX 1 byte hole, try to pack */
unsigned int               headers_start[0];                /*   144     0 */
unsigned char              __pkt_type_offset[0];            /*   144     0 */
unsigned char              pkt_type:3;                      /*   144: 5  1 */

<...>

unsigned char              ipvs_property:1;                 /*   147: 7  1 */
unsigned char              inner_protocol_type:1;           /*   147: 6  1 */
unsigned char              remcsum_offload:1;               /*   147: 5  1 */
unsigned char              offload_fwd_mark:1;              /*   147: 4  1 */
unsigned char              tc_skip_classify:1;              /*   147: 3  1 */
unsigned char              tc_at_ingress:1;                 /*   147: 2  1 */
unsigned char              tc_redirected:1;                 /*   147: 1  1 */
unsigned char              tc_from_ingress:1;               /*   147: 0  1 */
short unsigned int         tc_index;                        /*   148     2 */

/* XXX 2 bytes hole, try to pack */
union {
                unsigned int       csum;                    /*           4 */
                struct {
                        short unsigned int csum_start;      /*   152     2 */
                       short unsigned int csum_offset;      /*   154     2 */
        };                                                  /*           4 */
}                                                           /*   152     4 */

skb->tc_from_ingress is the last element of the 32 bits starting at
skb->pkt_type. There are 16 bits free before skb->csum, and 9 free bits
before skb->pkt_type. I don't think I can easily make room by removing
'csum_bad' as per your suggestion, because it is used by GRO and
netfilter code also (see users of __skb_mark_checksum_bad()). So, either
I place 'csum_not_inet' in one of the two above intervals (i.e replacing
__unused with csum_not_inet AKA crc32c_csum), or I have to give up the
(good) idea of using a bit in sk_buff.

BTW: unlike what I see with other NICs, using ixgbe driver I don't see
corrupted L4 packets, even when SCTP CRC offload is turned off. Looking
at the code, I see ixgbe_tx_csum does a simple test to identify SCTP in
packets with CHECKSUM_PARTIAL and have their checksum resolved by the 
hardware:

switch (skb->csum_offset) {
	case offsetof(struct tcphdr, check):
		/* it's TCP */
		/* fall-through */
	case offsetof(struct udphdr, check)
		/* it's UDP */
		break;
	case offsetof(struct scphdr, checksum):
		if (/* an ipv4 or ipv6 header with protocol equal to
		     * IPPOROTO_SCTP is found
		     */)
		    /* it's SCTP */
			break;
		}
		/* fall through */
	default:
		skb_checksum_help(skb);
}

The above code is functionally similar to what I did in patch 4/5 of the
initial series (http://www.spinics.net/lists/linux-sctp/msg05608.html).
Should we consider it again for fixing wrong CRC32c issues in case using
a bit in struct sk_buff is not viable?

On Tue, 2017-02-28 at 11:50 -0800, Tom Herbert wrote:
> Return value looks complex. Maybe we should just change
> skb_csum_*_help to return bool, true of checksum was handled false if
> not.

These functions can return -EINVAL if skb is a GSO packet, or -ENOMEM if
skb_linearize(skb) or pskb_expand_head(skb) fail, or 0. I would preserve the
return value of skb_checksum_help() and provide similar range of return values
for skb_sctp_csum_help() (also known as skb_crc32c_csum_help()): this can
help eventual future attempts to remove skb_warn_bad_offload(). It makes
sense to make boolean the return value of skb_csum_hwoffload_help(),
since we are using it only for non-GSO packets.  
	
Thank you in advance for the feedbacks,

regards,
--
davide



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-03-06 21:51                 ` Davide Caratti
@ 2017-03-07 18:06                   ` Alexander Duyck
  2017-03-18 13:17                     ` Davide Caratti
  0 siblings, 1 reply; 51+ messages in thread
From: Alexander Duyck @ 2017-03-07 18:06 UTC (permalink / raw)
  To: Davide Caratti
  Cc: David Laight, Tom Herbert, David S . Miller,
	Linux Kernel Network Developers, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner

On Mon, Mar 6, 2017 at 1:51 PM, Davide Caratti <dcaratti@redhat.com> wrote:
> On Tue, 2017-02-28 at 14:46 -0800, Alexander Duyck wrote:
>> On Tue, Feb 28, 2017 at 2:32 AM, Davide Caratti <dcaratti@redhat.com> wrote:
>> >
>> > sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
>> > it can't be used in net core. Like it has been done previously with other
>> > symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
>> > to allow computation of SCTP checksum in net core after sctp.ko (and thus
>> > libcrc32c) has been loaded.
>>
>> At a minimum the name really needs to change.  SCTP does not do
>> checksums.  It does a CRC, and a CRC is a very different thing.  The
>> fact that somebody decided that offloading a CRC could use the same
>> framework is very unfortunate, and your patch descriptions in this
>> whole set are calling out a CRC as checksums which it is not.
>
> hello Alexander,
>
> thank you for contributing to this topic. I see there has been a similar
> discussion some months ago
> (https://www.mail-archive.com/netdev@vger.kernel.org/msg94955.html).
>
>> I don't want to see anything "checksum" or "csum" related in the
>> naming when it comes to dealing with SCTP unless we absolutely have
>> to have it.  So any function names or structures with sctp in the name
>> should call out "crc32" or "crc", please don't use checksum.
>
> On Wed, 2017-03-01 at 10:53 +0000, David Laight wrote:
>> Then also change all the places that refer the IP 1's compliment
>> checksum to ipchecksum.
>
> (but crc32 uses a different polynomial than crc32c! :-) ) I understand
> your concerns, nevertheless we are writing to a member of struct sctphdr
> whose name is 'checksum' since the earliest introduction of SCTP; moreover,
> similar terminology ('crc32c checksum') is used throughout all RFC4960.
> That's why I don't think anybody will be confused by usage of 'csum' or
> 'checksum' words.
>
> On Tue, 2017-02-28 at 19:17 -0800, Tom Herbert wrote:
>> I agree that internal functions to sctp should not refer to checksum,
>> but I think we need to take care to be consistent with any external
>> API (even if somebody made a mistake defining it this way :-) ). As
>> you know the checksum interface must be very precisely defined, there
>> is no leeway for ambiguity.
>
> We can make the new symbols more generic removing 'sctp' from the
> symbol name, and writing explicitly that skb needs crc32c (rather than
> skb does not need internet checksum).
>
> Proposal:
> we use crc32c, possibly combined with 'csum' or 'checksum', just like
> it has been done in RFC4960.  So, symbol names can be replaced as follows:
>
> RFC v2 name              | RFC v3 name
> -------------------------+-----------------------------
> warn_sctp_csum_update    | warn_crc32c_csum_update
> warn_sctp_csum_combine   | warn_crc32c_csum_combine
> sctp_csum_stub           | crc32c_csum_stub
> sctp_csum_ops            | crc32c_csum_ops
> skb_sctp_csum_help       | skb_crc32c_csum_help
> skb->csum_not_inet       | skb->crc32c_csum
>
> please let me know if the proposal can be acceptable from your point of view.

I do like this approach better.  You might even take this one step
further.  You could convert crc32_csum into a 1 bit enum for now.
Basically you would use 0 for 1's compliement csum, and 1 to represent
a crc32c csum.  Then if we end up having to add another bit for
something like FCoE in the future it would give us 4 possible checksum
types instead of just giving us 1 with a bit mask.

> On Tue, 2017-02-28 at 11:50 -0800, Tom Herbert wrote:
>> Unfortunately this potentially pushes the skbuf flags over 32 bits if
>> I count correctly. I suggest that you rename csum_bad to
>> csum_not_inet. Looks like csum_bad is only set by a grand total of one
>> driver and I don't believe that is enough to justify its existence.
>> It's probably a good time to remove it.
>
> you are right: find below the current layout obtained with 'allyesconfig':
>
> short unsigned int         queue_mapping;                   /*   140     2 */
> unsigned char              __cloned_offset[0];              /*   142     0 */
> unsigned char              cloned:1;                        /*   142: 7  1 */
> unsigned char              nohdr:1;                         /*   142: 6  1 */
> unsigned char              fclone:2;                        /*   142: 4  1 */
> unsigned char              peeked:1;                        /*   142: 3  1 */
> unsigned char              head_frag:1;                     /*   142: 2  1 */
> unsigned char              xmit_more:1;                     /*   142: 1  1 */
> unsigned char              __unused:1;                      /*   142: 0  1 */
>
> /* XXX 1 byte hole, try to pack */
> unsigned int               headers_start[0];                /*   144     0 */
> unsigned char              __pkt_type_offset[0];            /*   144     0 */
> unsigned char              pkt_type:3;                      /*   144: 5  1 */
>
> <...>
>
> unsigned char              ipvs_property:1;                 /*   147: 7  1 */
> unsigned char              inner_protocol_type:1;           /*   147: 6  1 */
> unsigned char              remcsum_offload:1;               /*   147: 5  1 */
> unsigned char              offload_fwd_mark:1;              /*   147: 4  1 */
> unsigned char              tc_skip_classify:1;              /*   147: 3  1 */
> unsigned char              tc_at_ingress:1;                 /*   147: 2  1 */
> unsigned char              tc_redirected:1;                 /*   147: 1  1 */
> unsigned char              tc_from_ingress:1;               /*   147: 0  1 */
> short unsigned int         tc_index;                        /*   148     2 */
>
> /* XXX 2 bytes hole, try to pack */
> union {
>                 unsigned int       csum;                    /*           4 */
>                 struct {
>                         short unsigned int csum_start;      /*   152     2 */
>                        short unsigned int csum_offset;      /*   154     2 */
>         };                                                  /*           4 */
> }                                                           /*   152     4 */
>
> skb->tc_from_ingress is the last element of the 32 bits starting at
> skb->pkt_type. There are 16 bits free before skb->csum, and 9 free bits
> before skb->pkt_type. I don't think I can easily make room by removing
> 'csum_bad' as per your suggestion, because it is used by GRO and
> netfilter code also (see users of __skb_mark_checksum_bad()). So, either
> I place 'csum_not_inet' in one of the two above intervals (i.e replacing
> __unused with csum_not_inet AKA crc32c_csum), or I have to give up the
> (good) idea of using a bit in sk_buff.
>
> BTW: unlike what I see with other NICs, using ixgbe driver I don't see
> corrupted L4 packets, even when SCTP CRC offload is turned off. Looking
> at the code, I see ixgbe_tx_csum does a simple test to identify SCTP in
> packets with CHECKSUM_PARTIAL and have their checksum resolved by the
> hardware:
>
> switch (skb->csum_offset) {
>         case offsetof(struct tcphdr, check):
>                 /* it's TCP */
>                 /* fall-through */
>         case offsetof(struct udphdr, check)
>                 /* it's UDP */
>                 break;
>         case offsetof(struct scphdr, checksum):
>                 if (/* an ipv4 or ipv6 header with protocol equal to
>                      * IPPOROTO_SCTP is found
>                      */)
>                     /* it's SCTP */
>                         break;
>                 }
>                 /* fall through */
>         default:
>                 skb_checksum_help(skb);
> }
>
> The above code is functionally similar to what I did in patch 4/5 of the
> initial series (http://www.spinics.net/lists/linux-sctp/msg05608.html).
> Should we consider it again for fixing wrong CRC32c issues in case using
> a bit in struct sk_buff is not viable?

I would say if you can't use an extra bit to indicate the checksum
type you probably don't have too much other choice.

As far as the patch you provided I would say it is a good start, but
was a bit to aggressive in a few spots.  For now we don't have support
for offloading crc32c when encapsulating a frame so you don't need to
worry about that too much for now.  Also as far as the features test
you should only need to find that one of the feature bits is set in
the list you were testing.  What might make sense would be to look
into updating can_checksum_protocol to possibly factor in csum_offset
when determining if we can offload it or not.

> On Tue, 2017-02-28 at 11:50 -0800, Tom Herbert wrote:
>> Return value looks complex. Maybe we should just change
>> skb_csum_*_help to return bool, true of checksum was handled false if
>> not.
>
> These functions can return -EINVAL if skb is a GSO packet, or -ENOMEM if
> skb_linearize(skb) or pskb_expand_head(skb) fail, or 0. I would preserve the
> return value of skb_checksum_help() and provide similar range of return values
> for skb_sctp_csum_help() (also known as skb_crc32c_csum_help()): this can
> help eventual future attempts to remove skb_warn_bad_offload(). It makes
> sense to make boolean the return value of skb_csum_hwoffload_help(),
> since we are using it only for non-GSO packets.
>
> Thank you in advance for the feedbacks,
>
> regards,
> --
> davide
>
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-03-07 18:06                   ` Alexander Duyck
@ 2017-03-18 13:17                     ` Davide Caratti
  2017-03-18 22:35                       ` Tom Herbert
  0 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-03-18 13:17 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Laight, Tom Herbert, David S . Miller,
	Linux Kernel Network Developers, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner

hello Alexander and Tom,

On Tue, 2017-03-07 at 10:06 -0800, Alexander Duyck wrote:
> 
> You might even take this one step
> further.  You could convert crc32_csum into a 1 bit enum for now.
> Basically you would use 0 for 1's compliement csum, and 1 to represent
> a crc32c csum.  Then if we end up having to add another bit for
> something like FCoE in the future it would give us 4 possible checksum
> types instead of just giving us 1 with a bit mask.
<...>
> I would say if you can't use an extra bit to indicate the checksum type
> you probably don't have too much other choice.

Unluckily, there are no free bits in struct sk_buff (i.e. there is 1 + 8 
bits after skb->xmit_more, but its content would be be lost after
__copy_skb_header() _ so simply we can't use them).
As soon as two bits in sk_buff are freed, we will be able to rely on the
skb metadata, instead of inspecting the packet headers, to understand
what algorithm is used to ensure data integrity in the packet.

> As far as the patch you provided I would say it is a good start, but
> was a bit to aggressive in a few spots.  For now we don't have support
> for offloading crc32c when encapsulating a frame so you don't need to
> worry about that too much for now.  

Ok _ so, skb_csum_hwoffload_help(skb, features) will assume that skb needs
crc32c if all the following conditions are met:
- feature bitmask does not have NETIF_F_SCTP_CRC bit set
- skb->csum_offset is equal to 8 (i.e. offsetof(struct sctphdr,checksum)).
- skb is carrying an (outer, non encapsulated) IPv4/IPv6 header with
protocol number equal to 132 (i.e. IPPROTO_SCTP)

In any other case, we will compute the internet checksum or do nothing _
just what it's happening right now for non-GSO packets reaching
validate_xmit_skb(). I think this implementation can be extended to the
FCoE case if needed.

> Also as far as the features test
> you should only need to find that one of the feature bits is set in
> the list you were testing.  What might make sense would be to look
> into updating can_checksum_protocol to possibly factor in csum_offset
> when determining if we can offload it or not.

Looking again at the code, I noticed that the number of test on 'features'
bits can be reduced: see below.

can_checksum_protocol() takes an ethertype as parameter, so we would need
to invent a non-standardized valure for SCTP. Moreover, it is used in
skb_segment() for GSO: so, adding extra CPU cycles would affect
performance on a path where the kernel is already showing the right
behavior (GSO SCTP packets have their CRC32 computed correctly when
sctp_gso_segment() is called).     


hello Tom,
> > On Tue, 2017-02-28 at 11:50 -0800, Tom Herbert wrote:
> > > 
> > > Return value looks complex. Maybe we should just change
> > > skb_csum_*_help to return bool, true of checksum was handled false if
> > > not.
> > 
> > These functions can return -EINVAL if skb is a GSO packet, or -ENOMEM if
> > skb_linearize(skb) or pskb_expand_head(skb) fail, or 0. I would preserve the
> > return value of skb_checksum_help() and provide similar range of return values
> > for skb_sctp_csum_help() (also known as skb_crc32c_csum_help()): this can
> > help eventual future attempts to remove skb_warn_bad_offload(). It makes
> > sense to make boolean the return value of skb_csum_hwoffload_help(),
> > since we are using it only for non-GSO packets.

the above statement is still valid after the body of the function changed. A
very small thing: according to the kernel coding style, I should find a
'predicative' name for this function. Something like

skb_can_resolve_partial_csum(),

(which is terrible, I know)

or similar / better.

Please let me know if you think the code below is ok for you.
Thank you in advance!

regards,

--
davide


--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2987,6 +2987,38 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
 	return skb;
 }
 
+static bool skb_csum_hwoffload_help(struct sk_buff *skb,
+				    netdev_features_t features)
+{
+	bool crc32c_csum_hwoff = !!(features & NETIF_F_SCTP_CRC);
+	bool inet_csum_hwoff = !!(features & NETIF_F_CSUM_MASK);
+	unsigned int offset = 0;
+
+	if (crc32c_csum_hwoff && inet_csum_hwoff)
+		return true;
+
+	if (skb->encapsulation ||
+	    skb->csum_offset != offsetof(struct sctphdr, checksum))
+		goto inet_csum;
+
+	switch (vlan_get_protocol(skb)) {
+	case ntohs(ETH_P_IP):
+		if (ip_hdr(skb)->protocol = IPPROTO_SCTP)
+			goto crc32c_csum;
+		break;
+	case ntohs(ETH_P_IPV6):
+		if (ipv6_find_hdr(skb, &offset, IPPROTO_SCTP, NULL, NULL) =
+		    IPPROTO_SCTP)
+			goto crc32c_csum;
+		break;
+	}
+inet_csum:
+	return inet_csum_hwoff ? true : !skb_checksum_help(skb);
+
+crc32c_csum:
+	return crc32c_csum_hwoff ? true : !skb_crc32c_csum_help(skb);
+}
+
 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
 {
 	netdev_features_t features;
@@ -3022,8 +3054,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 			else
 				skb_set_transport_header(skb,
 							 skb_checksum_start_offset(skb));
-			if (!(features & NETIF_F_CSUM_MASK) &&
-			    skb_checksum_help(skb))
+			if (skb_csum_hwoffload_help(skb, features) = false)
 				goto out_kfree_skb;
 		}
 	}




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets
  2017-03-18 13:17                     ` Davide Caratti
@ 2017-03-18 22:35                       ` Tom Herbert
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 0/7] improve CRC32c in the forwarding path Davide Caratti
                                           ` (7 more replies)
  0 siblings, 8 replies; 51+ messages in thread
From: Tom Herbert @ 2017-03-18 22:35 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Linux Kernel Network Developers, linux-sctp @ vger . kernel . org,
	Marcelo Ricardo Leitner

On Sat, Mar 18, 2017 at 6:17 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> hello Alexander and Tom,
>
> On Tue, 2017-03-07 at 10:06 -0800, Alexander Duyck wrote:
>>
>> You might even take this one step
>> further.  You could convert crc32_csum into a 1 bit enum for now.
>> Basically you would use 0 for 1's compliement csum, and 1 to represent
>> a crc32c csum.  Then if we end up having to add another bit for
>> something like FCoE in the future it would give us 4 possible checksum
>> types instead of just giving us 1 with a bit mask.
> <...>
>> I would say if you can't use an extra bit to indicate the checksum type
>> you probably don't have too much other choice.
>
> Unluckily, there are no free bits in struct sk_buff (i.e. there is 1 + 8
> bits after skb->xmit_more, but its content would be be lost after
> __copy_skb_header() _ so simply we can't use them).
> As soon as two bits in sk_buff are freed, we will be able to rely on the
> skb metadata, instead of inspecting the packet headers, to understand
> what algorithm is used to ensure data integrity in the packet.
>
>> As far as the patch you provided I would say it is a good start, but
>> was a bit to aggressive in a few spots.  For now we don't have support
>> for offloading crc32c when encapsulating a frame so you don't need to
>> worry about that too much for now.
>
> Ok _ so, skb_csum_hwoffload_help(skb, features) will assume that skb needs
> crc32c if all the following conditions are met:
> - feature bitmask does not have NETIF_F_SCTP_CRC bit set
> - skb->csum_offset is equal to 8 (i.e. offsetof(struct sctphdr,checksum)).
> - skb is carrying an (outer, non encapsulated) IPv4/IPv6 header with
> protocol number equal to 132 (i.e. IPPROTO_SCTP)
>
That's too complicated. Just create a non_ip_csum bit in skbuff.
csum_bad can replaced with this I think. If the bit is set then more
work can be done to differentiate between alternative checksums.

Tom

> In any other case, we will compute the internet checksum or do nothing _
> just what it's happening right now for non-GSO packets reaching
> validate_xmit_skb(). I think this implementation can be extended to the
> FCoE case if needed.
>
>> Also as far as the features test
>> you should only need to find that one of the feature bits is set in
>> the list you were testing.  What might make sense would be to look
>> into updating can_checksum_protocol to possibly factor in csum_offset
>> when determining if we can offload it or not.
>
> Looking again at the code, I noticed that the number of test on 'features'
> bits can be reduced: see below.
>
> can_checksum_protocol() takes an ethertype as parameter, so we would need
> to invent a non-standardized valure for SCTP. Moreover, it is used in
> skb_segment() for GSO: so, adding extra CPU cycles would affect
> performance on a path where the kernel is already showing the right
> behavior (GSO SCTP packets have their CRC32 computed correctly when
> sctp_gso_segment() is called).
>
>
> hello Tom,
>> > On Tue, 2017-02-28 at 11:50 -0800, Tom Herbert wrote:
>> > >
>> > > Return value looks complex. Maybe we should just change
>> > > skb_csum_*_help to return bool, true of checksum was handled false if
>> > > not.
>> >
>> > These functions can return -EINVAL if skb is a GSO packet, or -ENOMEM if
>> > skb_linearize(skb) or pskb_expand_head(skb) fail, or 0. I would preserve the
>> > return value of skb_checksum_help() and provide similar range of return values
>> > for skb_sctp_csum_help() (also known as skb_crc32c_csum_help()): this can
>> > help eventual future attempts to remove skb_warn_bad_offload(). It makes
>> > sense to make boolean the return value of skb_csum_hwoffload_help(),
>> > since we are using it only for non-GSO packets.
>
> the above statement is still valid after the body of the function changed. A
> very small thing: according to the kernel coding style, I should find a
> 'predicative' name for this function. Something like
>
> skb_can_resolve_partial_csum(),
>
> (which is terrible, I know)
>
> or similar / better.
>
> Please let me know if you think the code below is ok for you.
> Thank you in advance!
>
> regards,
>
> --
> davide
>
>
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2987,6 +2987,38 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
>         return skb;
>  }
>
> +static bool skb_csum_hwoffload_help(struct sk_buff *skb,
> +                                   netdev_features_t features)
> +{
> +       bool crc32c_csum_hwoff = !!(features & NETIF_F_SCTP_CRC);
> +       bool inet_csum_hwoff = !!(features & NETIF_F_CSUM_MASK);
> +       unsigned int offset = 0;
> +
> +       if (crc32c_csum_hwoff && inet_csum_hwoff)
> +               return true;
> +
> +       if (skb->encapsulation ||
> +           skb->csum_offset != offsetof(struct sctphdr, checksum))
> +               goto inet_csum;
> +
> +       switch (vlan_get_protocol(skb)) {
> +       case ntohs(ETH_P_IP):
> +               if (ip_hdr(skb)->protocol = IPPROTO_SCTP)
> +                       goto crc32c_csum;
> +               break;
> +       case ntohs(ETH_P_IPV6):
> +               if (ipv6_find_hdr(skb, &offset, IPPROTO_SCTP, NULL, NULL) =
> +                   IPPROTO_SCTP)
> +                       goto crc32c_csum;
> +               break;
> +       }
> +inet_csum:
> +       return inet_csum_hwoff ? true : !skb_checksum_help(skb);
> +
> +crc32c_csum:
> +       return crc32c_csum_hwoff ? true : !skb_crc32c_csum_help(skb);
> +}
> +
>  static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
>  {
>         netdev_features_t features;
> @@ -3022,8 +3054,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
>                         else
>                                 skb_set_transport_header(skb,
>                                                          skb_checksum_start_offset(skb));
> -                       if (!(features & NETIF_F_CSUM_MASK) &&
> -                           skb_checksum_help(skb))
> +                       if (skb_csum_hwoffload_help(skb, features) = false)
>                                 goto out_kfree_skb;
>                 }
>         }
>
>
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 0/7] improve CRC32c in the forwarding path
  2017-03-18 22:35                       ` Tom Herbert
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
                                           ` (6 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

On Tue, 2017-03-07 at 10:06 -0800, Alexander Duyck wrote:
> You might even take this one step
> further.  You could convert crc32_csum into a 1 bit enum for now.
> Basically you would use 0 for 1's compliement csum, and 1 to represent
> a crc32c csum.  Then if we end up having to add another bit for
> something like FCoE in the future it would give us 4 possible checksum
> types instead of just giving us 1 with a bit mask.


On Sat, 2017-03-18 at 15:35 -0700, Tom Herbert wrote:
> Just create a non_ip_csum bit in skbuff.
> csum_bad can replaced with this I think. If the bit is set then more
> work can be done to differentiate between alternative checksums.

hello Alexander and Tom,

I refreshed the series including your suggestions.
Some followups are still possible:

* drivers that parse the packet header to correctly resolve CHECKSUM_PARTIAL
(e.g. ixgbe_tx_csum()) can benefit from skb->csum_algo savng some CPU cycles
(e.g. avoiding calling ip_hdr(skb)->protocol or ixgbe_ipv6_csum_is_sctp(skb)).

* drivers that call skb_checksum_help() to resolve CHECKSUM_PARTIAL can
call skb_crc32c_csum_help (or skb_csum_hwoffload_help(skb, 0)) to avoid
wrong CRC on SCTP packets.

thank you in advance for looking at this!
regards,
--
davide

v3:
 * review documentation
 * include a fix for corrupted SCTP packets in openvswitch datapath
 * skb->crc32c_csum renamed in skb->csum_algo and converted to enum
 * deprecate skb->csum_bad to free one bit in struct sk_buff
 * use 'CRC32c csum' terminology everywhere


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 1/7] skbuff: add stub to help computing crc32c on SCTP packets
  2017-03-18 22:35                       ` Tom Herbert
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 0/7] improve CRC32c in the forwarding path Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
                                           ` (5 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
it can't be used in net core. Like it has been done previously with other
symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
to allow computation of crc32c checksum in net core after sctp.ko (and thus
libcrc32c) has been loaded.

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 24 ++++++++++++++++++++++++
 net/sctp/offload.c     |  7 +++++++
 3 files changed, 33 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c776abd..8e9dd82 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3126,6 +3126,8 @@ struct skb_checksum_ops {
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
 };
 
+extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
+
 __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 		      __wsum csum, const struct skb_checksum_ops *ops);
 __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9f78109..1a142aa 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2242,6 +2242,30 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 }
 EXPORT_SYMBOL(skb_copy_and_csum_bits);
 
+static __wsum warn_crc32c_csum_update(const void *buff, int len, __wsum sum)
+{
+	net_warn_ratelimited(
+		"%s: attempt to compute crc32c without libcrc32c.ko\n",
+		__func__);
+	return 0;
+}
+
+static __wsum warn_crc32c_csum_combine(__wsum csum, __wsum csum2,
+				       int offset, int len)
+{
+	net_warn_ratelimited(
+		"%s: attempt to compute crc32c without libcrc32c.ko\n",
+		__func__);
+	return 0;
+}
+
+const struct skb_checksum_ops *crc32c_csum_stub __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = warn_crc32c_csum_update,
+	.combine = warn_crc32c_csum_combine,
+};
+EXPORT_SYMBOL(crc32c_csum_stub);
+
  /**
  *	skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
  *	@from: source buffer
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 4f5a2b5..378f462 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -98,6 +98,12 @@ static const struct net_offload sctp6_offload = {
 	},
 };
 
+static const struct skb_checksum_ops *crc32c_csum_ops __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = sctp_csum_update,
+	.combine = sctp_csum_combine,
+};
+
 int __init sctp_offload_init(void)
 {
 	int ret;
@@ -110,6 +116,7 @@ int __init sctp_offload_init(void)
 	if (ret)
 		goto ipv4;
 
+	crc32c_csum_stub = crc32c_csum_ops;
 	return ret;
 
 ipv4:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 2/7] net: introduce skb_crc32c_csum_help
  2017-03-18 22:35                       ` Tom Herbert
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 0/7] improve CRC32c in the forwarding path Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
                                           ` (4 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

skb_crc32c_csum_help is like skb_checksum_help, but it is designed for
checksumming SCTP packets using crc32c (see RFC3309), provided that
libcrc32c.ko has been loaded before. In case libcrc32c is not loaded,
invoking skb_crc32c_csum_help on a skb results in one the following
printouts:

warn_crc32c_csum_update: attempt to compute crc32c without libcrc32c.ko
warn_crc32c_csum_combine: attempt to compute crc32c without libcrc32c.ko

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/netdevice.h |  1 +
 include/linux/skbuff.h    |  3 ++-
 net/core/dev.c            | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cc07c3b..e86b50e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3899,6 +3899,7 @@ void netdev_rss_key_fill(void *buffer, size_t len);
 
 int dev_get_nest_level(struct net_device *dev);
 int skb_checksum_help(struct sk_buff *skb);
+int skb_crc32c_csum_help(struct sk_buff *skb);
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
 struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8e9dd82..d18b31d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -193,7 +193,8 @@
  *     accordingly. Note the there is no indication in the skbuff that the
  *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
  *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers.
+ *     is configured for a packet presumably by inspecting packet headers; in
+ *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
diff --git a/net/core/dev.c b/net/core/dev.c
index ef9fe60e..7d59cba 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -140,6 +140,7 @@
 #include <linux/hrtimer.h>
 #include <linux/netfilter_ingress.h>
 #include <linux/crash_dump.h>
+#include <linux/sctp.h>
 
 #include "net-sysfs.h"
 
@@ -2606,6 +2607,45 @@ int skb_checksum_help(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
+int skb_crc32c_csum_help(struct sk_buff *skb)
+{
+	__le32 crc32c_csum;
+	int ret = 0, offset;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		goto out;
+
+	if (unlikely(skb_is_gso(skb)))
+		goto out;
+
+	/* Before computing a checksum, we should make sure no frag could
+	 * be modified by an external entity : checksum could be wrong.
+	 */
+	if (unlikely(skb_has_shared_frag(skb))) {
+		ret = __skb_linearize(skb);
+		if (ret)
+			goto out;
+	}
+
+	offset = skb_checksum_start_offset(skb);
+	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
+						  skb->len - offset, ~(__u32)0,
+						  crc32c_csum_stub));
+	offset += offsetof(struct sctphdr, checksum);
+	BUG_ON(offset >= skb_headlen(skb));
+
+	if (skb_cloned(skb) &&
+	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+	*(__le32 *)(skb->data + offset) = crc32c_csum;
+	skb->ip_summed = CHECKSUM_NONE;
+out:
+	return ret;
+}
+
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
 	__be16 type = skb->protocol;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 3/7] sk_buff: remove support for csum_bad in sk_buff
  2017-03-18 22:35                       ` Tom Herbert
                                           ` (2 preceding siblings ...)
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c Davide Caratti
                                           ` (3 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

This bit was introduced with 5a21232983aa ("net: Support for csum_bad in
skbuff") to reduce the stack workload when processing RX packets carrying
a wrong Internet Checksum. Up to now, only one driver (besides GRO core)
are setting it.
The test on NAPI_GRO_CB(skb)->flush in dev_gro_receive() is now done
before the test on same_flow, to preserve behavior in case of wrong
checksum.

Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_ring.c |  2 +-
 include/linux/netdevice.h                        |  4 +---
 include/linux/skbuff.h                           | 23 ++---------------------
 net/bridge/netfilter/nft_reject_bridge.c         |  5 +----
 net/core/dev.c                                   |  8 +++-----
 net/ipv4/netfilter/nf_reject_ipv4.c              |  2 +-
 net/ipv6/netfilter/nf_reject_ipv6.c              |  3 ---
 7 files changed, 9 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index 0358e607..ec5579f 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -222,7 +222,7 @@ int aq_ring_rx_clean(struct aq_ring_s *self, int *work_done, int budget)
 		skb->protocol = eth_type_trans(skb, ndev);
 		if (unlikely(buff->is_cso_err)) {
 			++self->stats.rx.errors;
-			__skb_mark_checksum_bad(skb);
+			skb->ip_summed = CHECKSUM_NONE;
 		} else {
 			if (buff->is_ip_cso) {
 				__skb_incr_checksum_unnecessary(skb);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e86b50e..960f6ab 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2547,9 +2547,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
 	if (__skb_gro_checksum_validate_needed(skb, zero_okay, check))	\
 		__ret = __skb_gro_checksum_validate_complete(skb,	\
 				compute_pseudo(skb, proto));		\
-	if (__ret)							\
-		__skb_mark_checksum_bad(skb);				\
-	else								\
+	if (!__ret)							\
 		skb_gro_incr_csum_unnecessary(skb);			\
 	__ret;								\
 })
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d18b31d..aaf1072 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -742,7 +742,7 @@ struct sk_buff {
 	__u8			csum_valid:1;
 	__u8			csum_complete_sw:1;
 	__u8			csum_level:2;
-	__u8			csum_bad:1;
+	__u8			__unused:1; /* one bit hole */
 
 	__u8			dst_pending_confirm:1;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
@@ -3386,21 +3386,6 @@ static inline void __skb_incr_checksum_unnecessary(struct sk_buff *skb)
 	}
 }
 
-static inline void __skb_mark_checksum_bad(struct sk_buff *skb)
-{
-	/* Mark current checksum as bad (typically called from GRO
-	 * path). In the case that ip_summed is CHECKSUM_NONE
-	 * this must be the first checksum encountered in the packet.
-	 * When ip_summed is CHECKSUM_UNNECESSARY, this is the first
-	 * checksum after the last one validated. For UDP, a zero
-	 * checksum can not be marked as bad.
-	 */
-
-	if (skb->ip_summed = CHECKSUM_NONE ||
-	    skb->ip_summed = CHECKSUM_UNNECESSARY)
-		skb->csum_bad = 1;
-}
-
 /* Check if we need to perform checksum complete validation.
  *
  * Returns true if checksum complete is needed, false otherwise
@@ -3454,9 +3439,6 @@ static inline __sum16 __skb_checksum_validate_complete(struct sk_buff *skb,
 			skb->csum_valid = 1;
 			return 0;
 		}
-	} else if (skb->csum_bad) {
-		/* ip_summed = CHECKSUM_NONE in this case */
-		return (__force __sum16)1;
 	}
 
 	skb->csum = psum;
@@ -3516,8 +3498,7 @@ static inline __wsum null_compute_pseudo(struct sk_buff *skb, int proto)
 
 static inline bool __skb_checksum_convert_check(struct sk_buff *skb)
 {
-	return (skb->ip_summed = CHECKSUM_NONE &&
-		skb->csum_valid && !skb->csum_bad);
+	return (skb->ip_summed = CHECKSUM_NONE && skb->csum_valid);
 }
 
 static inline void __skb_checksum_convert(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 346ef6b..c16dd3a 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -111,7 +111,7 @@ static void nft_reject_br_send_v4_unreach(struct net *net,
 	__wsum csum;
 	u8 proto;
 
-	if (oldskb->csum_bad || !nft_bridge_iphdr_validate(oldskb))
+	if (!nft_bridge_iphdr_validate(oldskb))
 		return;
 
 	/* IP header checks: fragment. */
@@ -226,9 +226,6 @@ static bool reject6_br_csum_ok(struct sk_buff *skb, int hook)
 	__be16 fo;
 	u8 proto = ip6h->nexthdr;
 
-	if (skb->csum_bad)
-		return false;
-
 	if (skb_csum_unnecessary(skb))
 		return true;
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 7d59cba..91ba01a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4533,9 +4533,6 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (!(skb->dev->features & NETIF_F_GRO))
 		goto normal;
 
-	if (skb->csum_bad)
-		goto normal;
-
 	gro_list_prepare(napi, skb);
 
 	rcu_read_lock();
@@ -4595,11 +4592,12 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 		napi->gro_count--;
 	}
 
+	if (NAPI_GRO_CB(skb)->flush)
+		goto normal;
+
 	if (same_flow)
 		goto ok;
 
-	if (NAPI_GRO_CB(skb)->flush)
-		goto normal;
 
 	if (unlikely(napi->gro_count >= MAX_GRO_SKBS)) {
 		struct sk_buff *nskb = napi->gro_list;
diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index 7cd8d0d..6f8d9e5 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -172,7 +172,7 @@ void nf_send_unreach(struct sk_buff *skb_in, int code, int hook)
 	struct iphdr *iph = ip_hdr(skb_in);
 	u8 proto;
 
-	if (skb_in->csum_bad || iph->frag_off & htons(IP_OFFSET))
+	if (iph->frag_off & htons(IP_OFFSET))
 		return;
 
 	if (skb_csum_unnecessary(skb_in)) {
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
index eedee5d..f63b18e 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -220,9 +220,6 @@ static bool reject6_csum_ok(struct sk_buff *skb, int hook)
 	__be16 fo;
 	u8 proto;
 
-	if (skb->csum_bad)
-		return false;
-
 	if (skb_csum_unnecessary(skb))
 		return true;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
  2017-03-18 22:35                       ` Tom Herbert
                                           ` (3 preceding siblings ...)
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 15:43                           ` Tom Herbert
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
                                           ` (2 subsequent siblings)
  7 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

skb->csum_algo carries the indication on which algorithm is needed to
compute checksum on skb in the transmit path, when skb->ip_summed is
equal to CHECKSUM_PARTIAL. If skb carries a SCTP packet and crc32c
hasn't been yet written in L4 header, skb->csum_algo is assigned to
CRC32C_CHECKSUM. In any other case, skb->csum_algo is set to
INTERNET_CHECKSUM.

Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h                | 28 ++++++++++++++++++++--------
 net/core/dev.c                        |  2 +-
 net/netfilter/ipvs/ip_vs_proto_sctp.c |  2 +-
 net/netfilter/nf_nat_proto_sctp.c     |  2 +-
 net/sched/act_csum.c                  |  2 +-
 net/sctp/offload.c                    |  2 +-
 net/sctp/output.c                     |  3 ++-
 7 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index aaf1072..527be47 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -189,12 +189,13 @@
  *
  *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
  *     offloading the SCTP CRC in a packet. To perform this offload the stack
- *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
- *     accordingly. Note the there is no indication in the skbuff that the
- *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
- *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers; in
- *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
+ *     will set set csum_start and csum_offset accordingly, set ip_summed to
+ *     CHECKSUM_PARTIAL and set csum_algo to CRC32C_CHECKSUM, to provide an
+ *     indication in the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
+ *     A driver that supports both IP checksum offload and SCTP CRC32c offload
+ *     must verify which offload is configured for a packet by testing the
+ *     value of skb->csum_algo; skb_crc32c_csum_help is provided to resolve
+ *     CHECKSUM_PARTIAL on skbs where csum_algo is CRC32C_CHECKSUM.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
@@ -614,6 +615,7 @@ static inline bool skb_mstamp_after(const struct skb_mstamp *t1,
  *	@wifi_acked_valid: wifi_acked was set
  *	@wifi_acked: whether frame was acked on wifi or not
  *	@no_fcs:  Request NIC to treat last 4 bytes as Ethernet FCS
+ *	@csum_algo: algorithm used to compute checksum
  *	@dst_pending_confirm: need to confirm neighbour
   *	@napi_id: id of the NAPI struct this skb came from
  *	@secmark: security marking
@@ -742,8 +744,10 @@ struct sk_buff {
 	__u8			csum_valid:1;
 	__u8			csum_complete_sw:1;
 	__u8			csum_level:2;
-	__u8			__unused:1; /* one bit hole */
-
+	enum {
+		INTERNET_CHECKSUM = 0,
+		CRC32C_CHECKSUM,
+	}			csum_algo:1;
 	__u8			dst_pending_confirm:1;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
 	__u8			ndisc_nodetype:2;
@@ -3129,6 +3133,14 @@ struct skb_checksum_ops {
 
 extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
 
+static inline void skb_set_crc32c_ipsummed(struct sk_buff *skb,
+					   const u8 ip_summed)
+{
+	skb->csum_algo = ip_summed = CHECKSUM_PARTIAL ? CRC32C_CHECKSUM :
+		INTERNET_CHECKSUM;
+	skb->ip_summed = ip_summed;
+}
+
 __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 		      __wsum csum, const struct skb_checksum_ops *ops);
 __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
diff --git a/net/core/dev.c b/net/core/dev.c
index 91ba01a..c6a4281 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2641,7 +2641,7 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
 			goto out;
 	}
 	*(__le32 *)(skb->data + offset) = crc32c_csum;
-	skb->ip_summed = CHECKSUM_NONE;
+	skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
 out:
 	return ret;
 }
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 56f8e4b..8800bf7 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -81,7 +81,7 @@ static void sctp_nat_csum(struct sk_buff *skb, sctp_sctphdr_t *sctph,
 			  unsigned int sctphoff)
 {
 	sctph->checksum = sctp_compute_cksum(skb, sctphoff);
-	skb->ip_summed = CHECKSUM_UNNECESSARY;
+	skb_set_crc32c_ipsummed(skb, CHECKSUM_UNNECESSARY);
 }
 
 static int
diff --git a/net/netfilter/nf_nat_proto_sctp.c b/net/netfilter/nf_nat_proto_sctp.c
index 804e8a0..82a7c4c 100644
--- a/net/netfilter/nf_nat_proto_sctp.c
+++ b/net/netfilter/nf_nat_proto_sctp.c
@@ -60,7 +60,7 @@ sctp_manip_pkt(struct sk_buff *skb,
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
 		hdr->checksum = sctp_compute_cksum(skb, hdroff);
-		skb->ip_summed = CHECKSUM_NONE;
+		skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
 	}
 
 	return true;
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index 6c319a4..6e7e862 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -349,7 +349,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
 
 	sctph->checksum = sctp_compute_cksum(skb,
 					     skb_network_offset(skb) + ihl);
-	skb->ip_summed = CHECKSUM_NONE;
+	skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
 
 	return 1;
 }
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 378f462..4b98339 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -34,7 +34,7 @@
 
 static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
 {
-	skb->ip_summed = CHECKSUM_NONE;
+	skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
 	return sctp_compute_cksum(skb, skb_transport_offset(skb));
 }
 
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 1224421..386cbd8 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -524,10 +524,11 @@ static int sctp_packet_pack(struct sctp_packet *packet,
 		struct sctphdr *sh  			(struct sctphdr *)skb_transport_header(head);
 
+		skb_set_crc32c_ipsummed(head, CHECKSUM_NONE);
 		sh->checksum = sctp_compute_cksum(head, 0);
 	} else {
 chksum:
-		head->ip_summed = CHECKSUM_PARTIAL;
+		skb_set_crc32c_ipsummed(head, CHECKSUM_PARTIAL);
 		head->csum_start = skb_transport_header(head) - head->head;
 		head->csum_offset = offsetof(struct sctphdr, checksum);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 5/7] net: more accurate checksumming in validate_xmit_skb()
  2017-03-18 22:35                       ` Tom Herbert
                                           ` (4 preceding siblings ...)
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

skb_csum_hwoffload_help() uses netdev features and skb->csum_algo to
determine if skb needs software computation of Internet Checksum or crc32c
(or nothing, if this computation can be done by the hardware). Use it in
place of skb_checksum_help() in validate_xmit_skb() to avoid corruption
of non-GSO SCTP packets having skb->ip_summed equal to CHECKSUM_PARTIAL.

While at it, remove references to skb_csum_off_chk* functions, since they
are not present anymore in Linux since commit cf53b1da73bd ('Revert "net:
Add driver helper functions to determine checksum"').

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 Documentation/networking/checksum-offloads.txt | 12 ++++++++----
 include/linux/netdevice.h                      |  3 +++
 include/linux/skbuff.h                         | 11 ++++-------
 net/core/dev.c                                 | 14 ++++++++++++--
 4 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
index 56e3686..95a49aa 100644
--- a/Documentation/networking/checksum-offloads.txt
+++ b/Documentation/networking/checksum-offloads.txt
@@ -35,6 +35,10 @@ This interface only allows a single checksum to be offloaded.  Where
  encapsulation is used, the packet may have multiple checksum fields in
  different header layers, and the rest will have to be handled by another
  mechanism such as LCO or RCO.
+CRC can also be offloaded using this interface, by means of filling
+ skb->csum_start and skb->csum_offset as described above, and setting
+ skb->csum_algo to values different than INTERNET_CHECKSUM: see skbuff.h
+ comment (section 'D') for more details.
 No offloading of the IP header checksum is performed; it is always done in
  software.  This is OK because when we build the IP header, we obviously
  have it in cache, so summing it isn't expensive.  It's also rather short.
@@ -49,9 +53,9 @@ A driver declares its offload capabilities in netdev->hw_features; see
  and csum_offset given in the SKB; if it tries to deduce these itself in
  hardware (as some NICs do) the driver should check that the values in the
  SKB match those which the hardware will deduce, and if not, fall back to
- checksumming in software instead (with skb_checksum_help or one of the
- skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h).  This
- is a pain, but that's what you get when hardware tries to be clever.
+ checksumming in software instead (with skb_csum_hwoffload_help() or one of
+ the skb_checksum_help() / skb_crc32c_csum_help functions, as mentioned in
+ include/linux/skbuff.h).
 
 The stack should, for the most part, assume that checksum offload is
  supported by the underlying device.  The only place that should check is
@@ -60,7 +64,7 @@ The stack should, for the most part, assume that checksum offload is
  may include other offloads besides TX Checksum Offload) and, if they are
  not supported or enabled on the device (determined by netdev->features),
  performs the corresponding offload in software.  In the case of TX
- Checksum Offload, that means calling skb_checksum_help(skb).
+ Checksum Offload, that means calling skb_csum_hwoffload_help(skb, features).
 
 
 LCO: Local Checksum Offload
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 960f6ab..e4ceb36 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3898,6 +3898,9 @@ void netdev_rss_key_fill(void *buffer, size_t len);
 int dev_get_nest_level(struct net_device *dev);
 int skb_checksum_help(struct sk_buff *skb);
 int skb_crc32c_csum_help(struct sk_buff *skb);
+int skb_csum_hwoffload_help(struct sk_buff *skb,
+			    const netdev_features_t features);
+
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
 struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 527be47..4d2a6ec 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -162,13 +162,10 @@
  *
  *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
  *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
- *   checksum offload capability. If a	device has limited checksum capabilities
- *   (for instance can only perform NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM as
- *   described above) a helper function can be called to resolve
- *   CHECKSUM_PARTIAL. The helper functions are skb_csum_off_chk*. The helper
- *   function takes a spec argument that describes the protocol layer that is
- *   supported for checksum offload and can be called for each packet. If a
- *   packet does not match the specification for offload, skb_checksum_help
+ *   checksum offload capability.
+ *   skb_csum_hwoffload_help() can be called to resolve CHECKSUM_PARTIAL based
+ *   on network device checksumming capabilities: if a packet does not match
+ *   them, skb_checksum_help/skb_crc32c_help (based on csum_algo, see item D.)
  *   is called to resolve the checksum.
  *
  * CHECKSUM_NONE:
diff --git a/net/core/dev.c b/net/core/dev.c
index c6a4281..223aa16 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2988,6 +2988,17 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
 	return skb;
 }
 
+int skb_csum_hwoffload_help(struct sk_buff *skb,
+			    const netdev_features_t features)
+{
+	if (skb->csum_algo = CRC32C_CHECKSUM)
+		return !!(features & NETIF_F_SCTP_CRC) ? 0 :
+			skb_crc32c_csum_help(skb);
+
+	return !!(features & NETIF_F_CSUM_MASK) ? 0 : skb_checksum_help(skb);
+}
+EXPORT_SYMBOL(skb_csum_hwoffload_help);
+
 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
 {
 	netdev_features_t features;
@@ -3023,8 +3034,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 			else
 				skb_set_transport_header(skb,
 							 skb_checksum_start_offset(skb));
-			if (!(features & NETIF_F_CSUM_MASK) &&
-			    skb_checksum_help(skb))
+			if (skb_csum_hwoffload_help(skb, features))
 				goto out_kfree_skb;
 		}
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 6/7] openvswitch: more accurate checksumming in queue_userspace_packet()
  2017-03-18 22:35                       ` Tom Herbert
                                           ` (5 preceding siblings ...)
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

if skb carries an SCTP packet and ip_summed is CHECKSUM_PARTIAL, it needs
CRC32c in place of Internet Checksum: use skb_csum_hwoffload_help to avoid
corrupting such packets while queueing them towards userspace.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/openvswitch/datapath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 9c62b63..457f40d 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -453,7 +453,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
 
 	/* Complete checksum if needed */
 	if (skb->ip_summed = CHECKSUM_PARTIAL &&
-	    (err = skb_checksum_help(skb)))
+	    (err = skb_csum_hwoffload_help(skb, 0)))
 		goto out;
 
 	/* Older versions of OVS user space enforce alignment of the last
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v3 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}
  2017-03-18 22:35                       ` Tom Herbert
                                           ` (6 preceding siblings ...)
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
@ 2017-04-07 14:16                         ` Davide Caratti
  7 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 14:16 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: Linux Kernel Network Developers, linux-sctp

Add FCoE to the list of protocols that can set CHECKSUM_UNNECESSARY; add a
note to CHECKSUM_COMPLETE section to specify that it does not apply to SCTP
and FCoE protocols.

Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4d2a6ec..1a639e8 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -109,6 +109,7 @@
  *       may perform further validation in this case.
  *     GRE: only if the checksum is present in the header.
  *     SCTP: indicates the CRC in SCTP header has been validated.
+ *     FCOE: indicates the CRC in FC frame has been validated.
  *
  *   skb->csum_level indicates the number of consecutive checksums found in
  *   the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
@@ -126,8 +127,10 @@
  *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
  *   hardware doesn't need to parse L3/L4 headers to implement this.
  *
- *   Note: Even if device supports only some protocols, but is able to produce
- *   skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
+ *   Notes:
+ *   - Even if device supports only some protocols, but is able to produce
+ *     skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
+ *   - CHECKSUM_COMPLETE is not applicable to SCTP and FCoE protocols.
  *
  * CHECKSUM_PARTIAL:
  *
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
  2017-04-07 14:16                         ` [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c Davide Caratti
@ 2017-04-07 15:43                           ` Tom Herbert
  2017-04-07 17:29                             ` Davide Caratti
  0 siblings, 1 reply; 51+ messages in thread
From: Tom Herbert @ 2017-04-07 15:43 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org

On Fri, Apr 7, 2017 at 7:16 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> skb->csum_algo carries the indication on which algorithm is needed to
> compute checksum on skb in the transmit path, when skb->ip_summed is
> equal to CHECKSUM_PARTIAL. If skb carries a SCTP packet and crc32c
> hasn't been yet written in L4 header, skb->csum_algo is assigned to
> CRC32C_CHECKSUM. In any other case, skb->csum_algo is set to
> INTERNET_CHECKSUM.
>
> Suggested-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  include/linux/skbuff.h                | 28 ++++++++++++++++++++--------
>  net/core/dev.c                        |  2 +-
>  net/netfilter/ipvs/ip_vs_proto_sctp.c |  2 +-
>  net/netfilter/nf_nat_proto_sctp.c     |  2 +-
>  net/sched/act_csum.c                  |  2 +-
>  net/sctp/offload.c                    |  2 +-
>  net/sctp/output.c                     |  3 ++-
>  7 files changed, 27 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index aaf1072..527be47 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -189,12 +189,13 @@
>   *
>   *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
>   *     offloading the SCTP CRC in a packet. To perform this offload the stack
> - *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
> - *     accordingly. Note the there is no indication in the skbuff that the
> - *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
> - *     both IP checksum offload and SCTP CRC offload must verify which offload
> - *     is configured for a packet presumably by inspecting packet headers; in
> - *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
> + *     will set set csum_start and csum_offset accordingly, set ip_summed to
> + *     CHECKSUM_PARTIAL and set csum_algo to CRC32C_CHECKSUM, to provide an
> + *     indication in the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
> + *     A driver that supports both IP checksum offload and SCTP CRC32c offload
> + *     must verify which offload is configured for a packet by testing the
> + *     value of skb->csum_algo; skb_crc32c_csum_help is provided to resolve
> + *     CHECKSUM_PARTIAL on skbs where csum_algo is CRC32C_CHECKSUM.
>   *
>   *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
>   *     offloading the FCOE CRC in a packet. To perform this offload the stack
> @@ -614,6 +615,7 @@ static inline bool skb_mstamp_after(const struct skb_mstamp *t1,
>   *     @wifi_acked_valid: wifi_acked was set
>   *     @wifi_acked: whether frame was acked on wifi or not
>   *     @no_fcs:  Request NIC to treat last 4 bytes as Ethernet FCS
> + *     @csum_algo: algorithm used to compute checksum
>   *     @dst_pending_confirm: need to confirm neighbour
>    *    @napi_id: id of the NAPI struct this skb came from
>   *     @secmark: security marking
> @@ -742,8 +744,10 @@ struct sk_buff {
>         __u8                    csum_valid:1;
>         __u8                    csum_complete_sw:1;
>         __u8                    csum_level:2;
> -       __u8                    __unused:1; /* one bit hole */
> -
> +       enum {
> +               INTERNET_CHECKSUM = 0,
> +               CRC32C_CHECKSUM,
> +       }                       csum_algo:1;

I am worried this opens the door to a new open ended functionality
that will be rarely used in practice. Checksum offload is pervasive,
CRC offload is still a very narrow use case. Adding yet more
CRC/checksum variants will need more bits. It may be sufficient for
now just to make this a single bit which indicates "ones' checksum" or
indicates "other". In this case of "other" we need some analysis so
determine which checksum it is, this might be something that flow
dissector could support.

>         __u8                    dst_pending_confirm:1;
>  #ifdef CONFIG_IPV6_NDISC_NODETYPE
>         __u8                    ndisc_nodetype:2;
> @@ -3129,6 +3133,14 @@ struct skb_checksum_ops {
>
>  extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
>
> +static inline void skb_set_crc32c_ipsummed(struct sk_buff *skb,
> +                                          const u8 ip_summed)
> +{
> +       skb->csum_algo = ip_summed = CHECKSUM_PARTIAL ? CRC32C_CHECKSUM :
> +               INTERNET_CHECKSUM;
> +       skb->ip_summed = ip_summed;

This seems odd to me. skb->csum_algo and skb->ip_summed always end up
having the same value.

> +}
> +
>  __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
>                       __wsum csum, const struct skb_checksum_ops *ops);
>  __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 91ba01a..c6a4281 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2641,7 +2641,7 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
>                         goto out;
>         }
>         *(__le32 *)(skb->data + offset) = crc32c_csum;
> -       skb->ip_summed = CHECKSUM_NONE;
> +       skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
>  out:
>         return ret;
>  }
> diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> index 56f8e4b..8800bf7 100644
> --- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
> +++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> @@ -81,7 +81,7 @@ static void sctp_nat_csum(struct sk_buff *skb, sctp_sctphdr_t *sctph,
>                           unsigned int sctphoff)
>  {
>         sctph->checksum = sctp_compute_cksum(skb, sctphoff);
> -       skb->ip_summed = CHECKSUM_UNNECESSARY;
> +       skb_set_crc32c_ipsummed(skb, CHECKSUM_UNNECESSARY);

The old code is better. CHECKSUM_UNNECESSARY already applies to non IP
checksums. There is nothing special about crc32 in this regard and
skb->csum_algo should only be valid when skb->ip_summed =
CHECKSUM_PARTIAL so no need to set it here. This point should also be
in documentation.

>  }
>
>  static int
> diff --git a/net/netfilter/nf_nat_proto_sctp.c b/net/netfilter/nf_nat_proto_sctp.c
> index 804e8a0..82a7c4c 100644
> --- a/net/netfilter/nf_nat_proto_sctp.c
> +++ b/net/netfilter/nf_nat_proto_sctp.c
> @@ -60,7 +60,7 @@ sctp_manip_pkt(struct sk_buff *skb,
>
>         if (skb->ip_summed != CHECKSUM_PARTIAL) {
>                 hdr->checksum = sctp_compute_cksum(skb, hdroff);
> -               skb->ip_summed = CHECKSUM_NONE;
> +               skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
>         }
>
>         return true;
> diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
> index 6c319a4..6e7e862 100644
> --- a/net/sched/act_csum.c
> +++ b/net/sched/act_csum.c
> @@ -349,7 +349,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
>
>         sctph->checksum = sctp_compute_cksum(skb,
>                                              skb_network_offset(skb) + ihl);
> -       skb->ip_summed = CHECKSUM_NONE;
> +       skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
>
>         return 1;
>  }
> diff --git a/net/sctp/offload.c b/net/sctp/offload.c
> index 378f462..4b98339 100644
> --- a/net/sctp/offload.c
> +++ b/net/sctp/offload.c
> @@ -34,7 +34,7 @@
>
>  static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
>  {
> -       skb->ip_summed = CHECKSUM_NONE;
> +       skb_set_crc32c_ipsummed(skb, CHECKSUM_NONE);
>         return sctp_compute_cksum(skb, skb_transport_offset(skb));
>  }
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 1224421..386cbd8 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -524,10 +524,11 @@ static int sctp_packet_pack(struct sctp_packet *packet,
>                 struct sctphdr *sh >                         (struct sctphdr *)skb_transport_header(head);
>
> +               skb_set_crc32c_ipsummed(head, CHECKSUM_NONE);
>                 sh->checksum = sctp_compute_cksum(head, 0);
>         } else {
>  chksum:
> -               head->ip_summed = CHECKSUM_PARTIAL;
> +               skb_set_crc32c_ipsummed(head, CHECKSUM_PARTIAL);
>                 head->csum_start = skb_transport_header(head) - head->head;
>                 head->csum_offset = offsetof(struct sctphdr, checksum);
>         }
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
  2017-04-07 15:43                           ` Tom Herbert
@ 2017-04-07 17:29                             ` Davide Caratti
  2017-04-07 18:11                               ` Tom Herbert
  0 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-07 17:29 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org

hello Tom,

On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> On Fri, Apr 7, 2017 at 7:16 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> > @@ -742,8 +744,10 @@ struct sk_buff {
> >         __u8                    csum_valid:1;
> >         __u8                    csum_complete_sw:1;
> >         __u8                    csum_level:2;
> > -       __u8                    __unused:1; /* one bit hole */
> > -
> > +       enum {
> > +               INTERNET_CHECKSUM = 0,
> > +               CRC32C_CHECKSUM,
> > +       }                       csum_algo:1;
> 
> I am worried this opens the door to a new open ended functionality
> that will be rarely used in practice. Checksum offload is pervasive,
> CRC offload is still a very narrow use case.

thank you for the prompt response. I thought there was a silent
agreement on that - Alexander proposed usage of an enum bitfield to be
ready for FCoE (and I'm not against it, unless I have to find a second
free bit in struct sk_buff :-) ). But maybe I'm misunderstanding your
concern: is it the  name of the variable, (csum_algo instead of
crc32c_csum), or the usage of enum bitfield (or both?) ?

On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> Adding yet more
> CRC/checksum variants will need more bits. It may be sufficient for
> now just to make this a single bit which indicates "ones' checksum" or
> indicates "other". In this case of "other" we need some analysis so
> determine which checksum it is, this might be something that flow
> dissector could support.

... which is my intent: by the way, from my perspective, we don't need more than 1 bit
to extend the functionality. While reviewing my code, I was also considering
extending the witdth of skb->ip_summed from 2 to 3 bit, so that it was possible
to

#define

CRC32C_PARTIAL <- for SCTP
CRC_PARTIAL <- for FCoE
CHECKSUM_PARTIAL <- for everything else

It's conceptually the same thing, and the free bit is used more
efficiently. But then I would need to check all places where
CHECKSUM_PARTIAL is used in assignments and test: so, I told myself it's
not worth doing it until somebody requests to extend this functionality to
FCoE.

> > @@ -3129,6 +3133,14 @@ struct skb_checksum_ops {
> > 
> >  extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
> > 
> > +static inline void skb_set_crc32c_ipsummed(struct sk_buff *skb,
> > +                                          const u8 ip_summed)
> > +{
> > +       skb->csum_algo = ip_summed = CHECKSUM_PARTIAL ? CRC32C_CHECKSUM :
> > +               INTERNET_CHECKSUM;
> > +       skb->ip_summed = ip_summed;
> 
> This seems odd to me. skb->csum_algo and skb->ip_summed always end up
> having the same value.

this is accidentally true for CHEKSUM_NONE and CHECKSUM_PARTIAL, and only
if skb carries a SCTP packet. This was my intent:

ip_summed  (2 bit)                     | csum_algo     (1 bit)
---------------------------------------+-------------------
CHEKSUM_NONE = 0                       | INTERNET_CHECKSUM = 0
CHECKSUM_PARTIAL = 1                   | CRC32C_CHECKSUM = 1
CHECKSUM_COMPLETE = 2 (not applicable) | INTERNET_CHECKSUM = 0 (don't care)
CHECKSUM_UNNECESSARY = 3               | INTERNET_CHECKSUM = 0

I can do this in a more explicit way, changing the prototype to

static inline void skb_set_crc32c_ipsummed(struct sk_buff *skb,
                                           const u8 ip_summed,
                                           const u8 csum_algo)

(with the advantage of saving a test on the value of ip_summed).
Find in the comment below the reason why I'm clearing csum_algo every time
the SCTP CRC32c is computed.

> > --- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
> > +++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> > @@ -81,7 +81,7 @@ static void sctp_nat_csum(struct sk_buff *skb, sctp_sctphdr_t *sctph,
> >                           unsigned int sctphoff)
> >  {
> >         sctph->checksum = sctp_compute_cksum(skb, sctphoff);
> > -       skb->ip_summed = CHECKSUM_UNNECESSARY;
> > +       skb_set_crc32c_ipsummed(skb, CHECKSUM_UNNECESSARY);
> 
> The old code is better. CHECKSUM_UNNECESSARY already applies to non IP
> checksums. There is nothing special about crc32 in this regard and
> skb->csum_algo should only be valid when skb->ip_summed =
> CHECKSUM_PARTIAL so no need to set it here. This point should also be
> in documentation.

In my understanding, csum_algo needs to be set to INTERNET_CHECKSUM after the
CRC32c is computed. Otherwise, after subsequent operation on the skb (e.g. it
is encapsulated in a UDP frame), there is the possibility for skb->ip_summed
to become CHECKSUM_PARTIAL again. So, to ensure that skb_checksum_help() and
not skb_crc32c_help() will be called, csum_algo must be 0.

To minimize the impact of the patch, I substituted all assignments of skb->ip_summed,
done by SCTP-related code, with calls to skb_set_crc32c_ipsummed(). The alternative is
to explicitly set csum_algo to 0 (INTERNET_CHECKSUM) in SCTP-related code. Do you agree?

thank you in advance,
regards
--
davide


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
  2017-04-07 17:29                             ` Davide Caratti
@ 2017-04-07 18:11                               ` Tom Herbert
  2017-04-13 10:36                                 ` Davide Caratti
                                                   ` (8 more replies)
  0 siblings, 9 replies; 51+ messages in thread
From: Tom Herbert @ 2017-04-07 18:11 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org

On Fri, Apr 7, 2017 at 10:29 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> hello Tom,
>
> On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
>> On Fri, Apr 7, 2017 at 7:16 AM, Davide Caratti <dcaratti@redhat.com> wrote:
>> > @@ -742,8 +744,10 @@ struct sk_buff {
>> >         __u8                    csum_valid:1;
>> >         __u8                    csum_complete_sw:1;
>> >         __u8                    csum_level:2;
>> > -       __u8                    __unused:1; /* one bit hole */
>> > -
>> > +       enum {
>> > +               INTERNET_CHECKSUM = 0,
>> > +               CRC32C_CHECKSUM,
>> > +       }                       csum_algo:1;
>>
>> I am worried this opens the door to a new open ended functionality
>> that will be rarely used in practice. Checksum offload is pervasive,
>> CRC offload is still a very narrow use case.
>
> thank you for the prompt response. I thought there was a silent
> agreement on that - Alexander proposed usage of an enum bitfield to be
> ready for FCoE (and I'm not against it, unless I have to find a second
> free bit in struct sk_buff :-) ). But maybe I'm misunderstanding your
> concern: is it the  name of the variable, (csum_algo instead of
> crc32c_csum), or the usage of enum bitfield (or both?) ?
>
> On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
>> Adding yet more
>> CRC/checksum variants will need more bits. It may be sufficient for
>> now just to make this a single bit which indicates "ones' checksum" or
>> indicates "other". In this case of "other" we need some analysis so
>> determine which checksum it is, this might be something that flow
>> dissector could support.
>
> ... which is my intent: by the way, from my perspective, we don't need more than 1 bit
> to extend the functionality. While reviewing my code, I was also considering
> extending the witdth of skb->ip_summed from 2 to 3 bit, so that it was possible
> to
>
Maybe just call it csum_not_ip then. Then just do "if
(unlikely(skb->csum_not_ip)) ..."

> #define
>
> CRC32C_PARTIAL <- for SCTP
> CRC_PARTIAL <- for FCoE
> CHECKSUM_PARTIAL <- for everything else
>
> It's conceptually the same thing, and the free bit is used more
> efficiently. But then I would need to check all places where
> CHECKSUM_PARTIAL is used in assignments and test: so, I told myself it's
> not worth doing it until somebody requests to extend this functionality to
> FCoE.
>
I've thought about extending ip_summed before with something like
csum_invalid. I think it opens up a can of worms since ip_summed is
being used in so many places already and the semantics of each value
have to be extremely well defined for the whole system (this is one
place where we can't tolerate any ambiguity at all and it everything
needs to be clearly documented).

>> > @@ -3129,6 +3133,14 @@ struct skb_checksum_ops {
>> >
>> >  extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
>> >
>> > +static inline void skb_set_crc32c_ipsummed(struct sk_buff *skb,
>> > +                                          const u8 ip_summed)
>> > +{
>> > +       skb->csum_algo = ip_summed = CHECKSUM_PARTIAL ? CRC32C_CHECKSUM :
>> > +               INTERNET_CHECKSUM;
>> > +       skb->ip_summed = ip_summed;
>>
>> This seems odd to me. skb->csum_algo and skb->ip_summed always end up
>> having the same value.
>
> this is accidentally true for CHEKSUM_NONE and CHECKSUM_PARTIAL, and only
> if skb carries a SCTP packet. This was my intent:
>
> ip_summed  (2 bit)                     | csum_algo     (1 bit)
> ---------------------------------------+-------------------
> CHEKSUM_NONE = 0                       | INTERNET_CHECKSUM = 0
> CHECKSUM_PARTIAL = 1                   | CRC32C_CHECKSUM = 1
> CHECKSUM_COMPLETE = 2 (not applicable) | INTERNET_CHECKSUM = 0 (don't care)
> CHECKSUM_UNNECESSARY = 3               | INTERNET_CHECKSUM = 0
>
> I can do this in a more explicit way, changing the prototype to
>
> static inline void skb_set_crc32c_ipsummed(struct sk_buff *skb,
>                                            const u8 ip_summed,
>                                            const u8 csum_algo)
>
> (with the advantage of saving a test on the value of ip_summed).
> Find in the comment below the reason why I'm clearing csum_algo every time
> the SCTP CRC32c is computed.
>
>> > --- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
>> > +++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
>> > @@ -81,7 +81,7 @@ static void sctp_nat_csum(struct sk_buff *skb, sctp_sctphdr_t *sctph,
>> >                           unsigned int sctphoff)
>> >  {
>> >         sctph->checksum = sctp_compute_cksum(skb, sctphoff);
>> > -       skb->ip_summed = CHECKSUM_UNNECESSARY;
>> > +       skb_set_crc32c_ipsummed(skb, CHECKSUM_UNNECESSARY);
>>
>> The old code is better. CHECKSUM_UNNECESSARY already applies to non IP
>> checksums. There is nothing special about crc32 in this regard and
>> skb->csum_algo should only be valid when skb->ip_summed =
>> CHECKSUM_PARTIAL so no need to set it here. This point should also be
>> in documentation.
>
> In my understanding, csum_algo needs to be set to INTERNET_CHECKSUM after the
> CRC32c is computed. Otherwise, after subsequent operation on the skb (e.g. it
> is encapsulated in a UDP frame), there is the possibility for skb->ip_summed
> to become CHECKSUM_PARTIAL again. So, to ensure that skb_checksum_help() and
> not skb_crc32c_help() will be called, csum_algo must be 0.
>
ip_summed should no longer be CHECKSUM_PARTIAL with CRC32c is computed.

> To minimize the impact of the patch, I substituted all assignments of skb->ip_summed,
> done by SCTP-related code, with calls to skb_set_crc32c_ipsummed(). The alternative is
> to explicitly set csum_algo to 0 (INTERNET_CHECKSUM) in SCTP-related code. Do you agree?
>
No, like I said the only case where this new bit is relevant is when
CHECKSUM_PARTIAL for a CRC is being done. When it's set for offloading
sctp crc it must be set. When CRC is resolved, in the helper for
instance, it must be cleared. If these rules are properly followed
then the bit will be zero in all other cases without needing any
additional work or conditionals.

Tom

> thank you in advance,
> regards
> --
> davide
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
  2017-04-07 18:11                               ` Tom Herbert
@ 2017-04-13 10:36                                 ` Davide Caratti
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums Davide Caratti
                                                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-13 10:36 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner
  Cc: netdev, linux-sctp

thank you,

On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> Maybe just call it csum_not_ip then. Then just do "if
> (unlikely(skb->csum_not_ip)) ..."

OK, I will rename the bit, avoid the enum and use the 'unlikely'. Up to now,
this series uses the bit for SCTP only and leaves unmodified behavior of
offloaded FCoE frames: please let me know if you disagree on that.

On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> On Fri, Apr 7, 2017 at 10:29 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> > In my understanding, csum_algo needs to be set to INTERNET_CHECKSUM after the
> > CRC32c is computed. Otherwise, after subsequent operation on the skb (e.g. it
> > is encapsulated in a UDP frame), there is the possibility for skb->ip_summed
> > to become CHECKSUM_PARTIAL again. So, to ensure that skb_checksum_help() and
> > not skb_crc32c_help() will be called, csum_algo must be 0.

> ip_summed should no longer be CHECKSUM_PARTIAL with CRC32c is computed.

Even though it's uncommon, skb->ip_summed can become CHECKSUM_PARTIAL again
after the CRC32c is computed and CHECKSUM_NONE is set: for example, when a
veth and a vxlan with UDP checksums are enslaved to the same bridge, and the
NIC below vxlan has no checksumming capabilities. Here, validate_xmit_skb is
called three times on the same skb (see perf output at the bottom): 

* before transmission on the veth: here ip_summed is CHECKSUM_PARTIAL, but
the device supports CRC32c offload so the skb is (correctly) untouched.

* before vxlan encapsulation: here ip_summed is CHECKSUM_PARTIAL,
skb->csum_not_inet is 1 and NETIF_F_SCTP_CRC is not set. Here,
skb_csum_hwoffload_help() correctly fills the CRC32c and assigns ip_summed
to CHECKSUM_NONE.

* before transmission on the NIC: ip_summed is CHECKSUM_PARTIAL again (because
udp_set_csum changed csum_start and csum_offset to point to the tunnel
UDP header). No bit in NETIF_F_HW_CSUM is set: if skb->csum_not_inet is still 1,
the helper (wrongly) computes CRC32c again, thus corrupting the outer UDP
transport header. On the contrary, if skb->csum_not_inet is 0, skb_checksum_help()
correctly resolves CHECKSUM_PARTIAL.

To avoid this problem, skb->csum_not_inet must be assigned to 0 every time
the CHECKSUM_PARTIAL is resolved on skb carrying SCTP packets.

> > To minimize the impact of the patch, I substituted all assignments of skb->ip_summed,
> > done by SCTP-related code, with calls to skb_set_crc32c_ipsummed(). The alternative is
> > to explicitly set csum_algo to 0 (INTERNET_CHECKSUM) in SCTP-related code. Do you agree?

> No, like I said the only case where this new bit is relevant is when
> CHECKSUM_PARTIAL for a CRC is being done. When it's set for offloading
> sctp crc it must be set. When CRC is resolved, in the helper for
> instance, it must be cleared. If these rules are properly followed
> then the bit will be zero in all other cases without needing any
> additional work or conditionals.

At a minimum, this csum_not_inet bit needs to be cleared in three places:
1) in skb_crc32c_csum_help, to fix scenarios like veth->bridge->vxlan->NIC above.
2) in sctp_gso_make_checksum, a SCTP GSO packet is segmented and CRC32c is written
on each segment. skb->ip_summed transitions from CHECKSUM_PARTIAL to CHECKSUM_NONE.
3) in act_csum, because TC action mangling the packet are called before 
validate_xmit_skb().

It is not necessary to do it in netfilter NAT (even it is harmless), because
SCTP packets having CHECKSUM_PARTIAL are not resolved (since commit 3189a290f98d
"netfilter: nat: skip checksum on offload SCTP packets"). And it should be not
needed in IPVS code, because ip_summed is set to CHECKSUM_UNNECESSARY, so skb
is not going to be checksummed anymore.

thank you in advance for the feedback!
regards,
--
davide 


scenario:

vethA --> vethB --> bridge --> vxlan encaps --> NIC

# ./perf probe -k vmlinux  --add \
 "skb_csum_hwoffload_help csum_offset=skb->csum_offset ip_summed=skb->ip_summed:b2@7/32 skb=skb"
# ./perf record -e probe:skb_csum_hwoffload_help -aR -- ./scenario-vxlan.sh 

# ./perf script
<....>
ncat  7577 [000] 22056.548907: probe:skb_csum_hwoffload_help: (ffffffff8162a6f0) csum_offset=8 ip_summed=1 skb=0xffff880106f6bd00
ncat  7577 [000] 22056.548915: probe:skb_csum_hwoffload_help: (ffffffff8162a6f0) csum_offset=8 ip_summed=1 skb=0xffff880106f6bd00
ncat  7577 [000] 22056.548917: probe:skb_csum_hwoffload_help: (ffffffff8162a6f0) csum_offset=6 ip_summed=1 skb=0xffff880106f6bd00
<....>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums
  2017-04-07 18:11                               ` Tom Herbert
  2017-04-13 10:36                                 ` Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-27 12:41                                   ` Marcelo Ricardo Leitner
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
                                                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

hello Tom,

On Fri, 2017-04-07 at 11:11 -0700, Tom Herbert wrote:
> maybe just call it csum_not_ip then. Then just do "if
> (unlikely(skb->csum_not_ip)) ..."

Ok, done. V4 uses this bit for SCTP only and leaves unmodified behavior
when offloaded FCoE frames are processed. Further work is still possible
to extend this fix for FCoE, if needed, either by using additional sk_buff
bits, or using skb->csum_not_ip and use other data (e.g. skb->csum_offset)
to distinguish SCTP from FCoE.

> the only case where this new bit is relevant is when
> CHECKSUM_PARTIAL for a CRC is being done. When it's set for offloading
> sctp crc it must be set. When CRC is resolved, in the helper for
> instance, it must be cleared.

in V4 the bit is set when SCTP packets with offloaded checksum are
generated; the bit is cleared when CRC32c is resolved for such packets
(i.e. skb->ip_summed transitions from CHECKSUM_PARTIAL to CHECKSUM_NONE).

Any feedbacks are appreciated!
thank you in advance,
--
davide


Davide Caratti (7):
  skbuff: add stub to help computing crc32c on SCTP packets
  net: introduce skb_crc32c_csum_help
  sk_buff: remove support for csum_bad in sk_buff
  net: use skb->csum_not_inet to identify packets needing crc32c
  net: more accurate checksumming in validate_xmit_skb()
  openvswitch: more accurate checksumming in queue_userspace_packet()
  sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}

 Documentation/networking/checksum-offloads.txt   | 11 +++--
 drivers/net/ethernet/aquantia/atlantic/aq_ring.c |  2 +-
 include/linux/netdevice.h                        |  8 +--
 include/linux/skbuff.h                           | 58 +++++++++-------------
 net/bridge/netfilter/nft_reject_bridge.c         |  5 +-
 net/core/dev.c                                   | 63 +++++++++++++++++++++---
 net/core/skbuff.c                                | 24 +++++++++
 net/ipv4/netfilter/nf_reject_ipv4.c              |  2 +-
 net/ipv6/netfilter/nf_reject_ipv6.c              |  3 --
 net/openvswitch/datapath.c                       |  2 +-
 net/sched/act_csum.c                             |  1 +
 net/sctp/offload.c                               |  8 +++
 net/sctp/output.c                                |  1 +
 13 files changed, 128 insertions(+), 60 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 1/7] skbuff: add stub to help computing crc32c on SCTP packets
  2017-04-07 18:11                               ` Tom Herbert
  2017-04-13 10:36                                 ` Davide Caratti
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
                                                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
it can't be used in net core. Like it has been done previously with other
symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
to allow computation of crc32c checksum in net core after sctp.ko (and thus
libcrc32c) has been loaded.

Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 24 ++++++++++++++++++++++++
 net/sctp/offload.c     |  7 +++++++
 3 files changed, 33 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 741d75c..ba3ae21 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3127,6 +3127,8 @@ struct skb_checksum_ops {
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
 };
 
+extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
+
 __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 		      __wsum csum, const struct skb_checksum_ops *ops);
 __wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ad2af56..182608b 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2242,6 +2242,30 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 }
 EXPORT_SYMBOL(skb_copy_and_csum_bits);
 
+static __wsum warn_crc32c_csum_update(const void *buff, int len, __wsum sum)
+{
+	net_warn_ratelimited(
+		"%s: attempt to compute crc32c without libcrc32c.ko\n",
+		__func__);
+	return 0;
+}
+
+static __wsum warn_crc32c_csum_combine(__wsum csum, __wsum csum2,
+				       int offset, int len)
+{
+	net_warn_ratelimited(
+		"%s: attempt to compute crc32c without libcrc32c.ko\n",
+		__func__);
+	return 0;
+}
+
+const struct skb_checksum_ops *crc32c_csum_stub __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = warn_crc32c_csum_update,
+	.combine = warn_crc32c_csum_combine,
+};
+EXPORT_SYMBOL(crc32c_csum_stub);
+
  /**
  *	skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
  *	@from: source buffer
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 4f5a2b5..378f462 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -98,6 +98,12 @@ static const struct net_offload sctp6_offload = {
 	},
 };
 
+static const struct skb_checksum_ops *crc32c_csum_ops __read_mostly +	&(struct skb_checksum_ops) {
+	.update  = sctp_csum_update,
+	.combine = sctp_csum_combine,
+};
+
 int __init sctp_offload_init(void)
 {
 	int ret;
@@ -110,6 +116,7 @@ int __init sctp_offload_init(void)
 	if (ret)
 		goto ipv4;
 
+	crc32c_csum_stub = crc32c_csum_ops;
 	return ret;
 
 ipv4:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help
  2017-04-07 18:11                               ` Tom Herbert
                                                   ` (2 preceding siblings ...)
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-27 12:29                                   ` Marcelo Ricardo Leitner
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
                                                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

skb_crc32c_csum_help is like skb_checksum_help, but it is designed for
checksumming SCTP packets using crc32c (see RFC3309), provided that
libcrc32c.ko has been loaded before. In case libcrc32c is not loaded,
invoking skb_crc32c_csum_help on a skb results in one the following
printouts:

warn_crc32c_csum_update: attempt to compute crc32c without libcrc32c.ko
warn_crc32c_csum_combine: attempt to compute crc32c without libcrc32c.ko

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/netdevice.h |  1 +
 include/linux/skbuff.h    |  3 ++-
 net/core/dev.c            | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b0aa089..bf84a67 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3898,6 +3898,7 @@ void netdev_rss_key_fill(void *buffer, size_t len);
 
 int dev_get_nest_level(struct net_device *dev);
 int skb_checksum_help(struct sk_buff *skb);
+int skb_crc32c_csum_help(struct sk_buff *skb);
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
 struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ba3ae21..ec4551b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -193,7 +193,8 @@
  *     accordingly. Note the there is no indication in the skbuff that the
  *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
  *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers.
+ *     is configured for a packet presumably by inspecting packet headers; in
+ *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
diff --git a/net/core/dev.c b/net/core/dev.c
index 5d33e2b..c7aec95 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -140,6 +140,7 @@
 #include <linux/hrtimer.h>
 #include <linux/netfilter_ingress.h>
 #include <linux/crash_dump.h>
+#include <linux/sctp.h>
 
 #include "net-sysfs.h"
 
@@ -2606,6 +2607,45 @@ int skb_checksum_help(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
+int skb_crc32c_csum_help(struct sk_buff *skb)
+{
+	__le32 crc32c_csum;
+	int ret = 0, offset;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		goto out;
+
+	if (unlikely(skb_is_gso(skb)))
+		goto out;
+
+	/* Before computing a checksum, we should make sure no frag could
+	 * be modified by an external entity : checksum could be wrong.
+	 */
+	if (unlikely(skb_has_shared_frag(skb))) {
+		ret = __skb_linearize(skb);
+		if (ret)
+			goto out;
+	}
+
+	offset = skb_checksum_start_offset(skb);
+	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
+						  skb->len - offset, ~(__u32)0,
+						  crc32c_csum_stub));
+	offset += offsetof(struct sctphdr, checksum);
+	BUG_ON(offset >= skb_headlen(skb));
+
+	if (skb_cloned(skb) &&
+	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+	*(__le32 *)(skb->data + offset) = crc32c_csum;
+	skb->ip_summed = CHECKSUM_NONE;
+out:
+	return ret;
+}
+
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
 	__be16 type = skb->protocol;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff
  2017-04-07 18:11                               ` Tom Herbert
                                                   ` (3 preceding siblings ...)
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-29 20:21                                   ` Tom Herbert
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c Davide Caratti
                                                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

This bit was introduced with 5a21232983aa ("net: Support for csum_bad in
skbuff") to reduce the stack workload when processing RX packets carrying
a wrong Internet Checksum. Up to now, only one driver (besides GRO core)
are setting it.
The test on NAPI_GRO_CB(skb)->flush in dev_gro_receive() is now done
before the test on same_flow, to preserve behavior in case of wrong
checksum.

Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_ring.c |  2 +-
 include/linux/netdevice.h                        |  4 +---
 include/linux/skbuff.h                           | 23 ++---------------------
 net/bridge/netfilter/nft_reject_bridge.c         |  5 +----
 net/core/dev.c                                   |  8 +++-----
 net/ipv4/netfilter/nf_reject_ipv4.c              |  2 +-
 net/ipv6/netfilter/nf_reject_ipv6.c              |  3 ---
 7 files changed, 9 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index 3a8a4aa..9a08179 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -223,7 +223,7 @@ int aq_ring_rx_clean(struct aq_ring_s *self, int *work_done, int budget)
 		skb->protocol = eth_type_trans(skb, ndev);
 		if (unlikely(buff->is_cso_err)) {
 			++self->stats.rx.errors;
-			__skb_mark_checksum_bad(skb);
+			skb->ip_summed = CHECKSUM_NONE;
 		} else {
 			if (buff->is_ip_cso) {
 				__skb_incr_checksum_unnecessary(skb);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index bf84a67..ab9e3dc 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2546,9 +2546,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
 	if (__skb_gro_checksum_validate_needed(skb, zero_okay, check))	\
 		__ret = __skb_gro_checksum_validate_complete(skb,	\
 				compute_pseudo(skb, proto));		\
-	if (__ret)							\
-		__skb_mark_checksum_bad(skb);				\
-	else								\
+	if (!__ret)							\
 		skb_gro_incr_csum_unnecessary(skb);			\
 	__ret;								\
 })
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ec4551b..927309e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -743,7 +743,7 @@ struct sk_buff {
 	__u8			csum_valid:1;
 	__u8			csum_complete_sw:1;
 	__u8			csum_level:2;
-	__u8			csum_bad:1;
+	__u8			__csum_bad_unused:1; /* one bit hole */
 
 	__u8			dst_pending_confirm:1;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
@@ -3387,21 +3387,6 @@ static inline void __skb_incr_checksum_unnecessary(struct sk_buff *skb)
 	}
 }
 
-static inline void __skb_mark_checksum_bad(struct sk_buff *skb)
-{
-	/* Mark current checksum as bad (typically called from GRO
-	 * path). In the case that ip_summed is CHECKSUM_NONE
-	 * this must be the first checksum encountered in the packet.
-	 * When ip_summed is CHECKSUM_UNNECESSARY, this is the first
-	 * checksum after the last one validated. For UDP, a zero
-	 * checksum can not be marked as bad.
-	 */
-
-	if (skb->ip_summed = CHECKSUM_NONE ||
-	    skb->ip_summed = CHECKSUM_UNNECESSARY)
-		skb->csum_bad = 1;
-}
-
 /* Check if we need to perform checksum complete validation.
  *
  * Returns true if checksum complete is needed, false otherwise
@@ -3455,9 +3440,6 @@ static inline __sum16 __skb_checksum_validate_complete(struct sk_buff *skb,
 			skb->csum_valid = 1;
 			return 0;
 		}
-	} else if (skb->csum_bad) {
-		/* ip_summed = CHECKSUM_NONE in this case */
-		return (__force __sum16)1;
 	}
 
 	skb->csum = psum;
@@ -3517,8 +3499,7 @@ static inline __wsum null_compute_pseudo(struct sk_buff *skb, int proto)
 
 static inline bool __skb_checksum_convert_check(struct sk_buff *skb)
 {
-	return (skb->ip_summed = CHECKSUM_NONE &&
-		skb->csum_valid && !skb->csum_bad);
+	return (skb->ip_summed = CHECKSUM_NONE && skb->csum_valid);
 }
 
 static inline void __skb_checksum_convert(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 346ef6b..c16dd3a 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -111,7 +111,7 @@ static void nft_reject_br_send_v4_unreach(struct net *net,
 	__wsum csum;
 	u8 proto;
 
-	if (oldskb->csum_bad || !nft_bridge_iphdr_validate(oldskb))
+	if (!nft_bridge_iphdr_validate(oldskb))
 		return;
 
 	/* IP header checks: fragment. */
@@ -226,9 +226,6 @@ static bool reject6_br_csum_ok(struct sk_buff *skb, int hook)
 	__be16 fo;
 	u8 proto = ip6h->nexthdr;
 
-	if (skb->csum_bad)
-		return false;
-
 	if (skb_csum_unnecessary(skb))
 		return true;
 
diff --git a/net/core/dev.c b/net/core/dev.c
index c7aec95..77a2d73 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4533,9 +4533,6 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (!(skb->dev->features & NETIF_F_GRO))
 		goto normal;
 
-	if (skb->csum_bad)
-		goto normal;
-
 	gro_list_prepare(napi, skb);
 
 	rcu_read_lock();
@@ -4595,11 +4592,12 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 		napi->gro_count--;
 	}
 
+	if (NAPI_GRO_CB(skb)->flush)
+		goto normal;
+
 	if (same_flow)
 		goto ok;
 
-	if (NAPI_GRO_CB(skb)->flush)
-		goto normal;
 
 	if (unlikely(napi->gro_count >= MAX_GRO_SKBS)) {
 		struct sk_buff *nskb = napi->gro_list;
diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index 7cd8d0d..6f8d9e5 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -172,7 +172,7 @@ void nf_send_unreach(struct sk_buff *skb_in, int code, int hook)
 	struct iphdr *iph = ip_hdr(skb_in);
 	u8 proto;
 
-	if (skb_in->csum_bad || iph->frag_off & htons(IP_OFFSET))
+	if (iph->frag_off & htons(IP_OFFSET))
 		return;
 
 	if (skb_csum_unnecessary(skb_in)) {
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
index eedee5d..f63b18e 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -220,9 +220,6 @@ static bool reject6_csum_ok(struct sk_buff *skb, int hook)
 	__be16 fo;
 	u8 proto;
 
-	if (skb->csum_bad)
-		return false;
-
 	if (skb_csum_unnecessary(skb))
 		return true;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c
  2017-04-07 18:11                               ` Tom Herbert
                                                   ` (4 preceding siblings ...)
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-29 20:18                                   ` Tom Herbert
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
                                                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

skb->csum_not_inet carries the indication on which algorithm is needed to
compute checksum on skb in the transmit path, when skb->ip_summed is equal
to CHECKSUM_PARTIAL. If skb carries a SCTP packet and crc32c hasn't been
yet written in L4 header, skb->csum_not_inet is assigned to 1; otherwise,
assume Internet Checksum is needed and thus set skb->csum_not_inet to 0.

Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h | 16 +++++++++-------
 net/core/dev.c         |  1 +
 net/sched/act_csum.c   |  1 +
 net/sctp/offload.c     |  1 +
 net/sctp/output.c      |  1 +
 5 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 927309e..419f4c8 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -189,12 +189,13 @@
  *
  *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
  *     offloading the SCTP CRC in a packet. To perform this offload the stack
- *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
- *     accordingly. Note the there is no indication in the skbuff that the
- *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
- *     both IP checksum offload and SCTP CRC offload must verify which offload
- *     is configured for a packet presumably by inspecting packet headers; in
- *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
+ *     will set set csum_start and csum_offset accordingly, set ip_summed to
+ *     CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication in
+ *     the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
+ *     A driver that supports both IP checksum offload and SCTP CRC32c offload
+ *     must verify which offload is configured for a packet by testing the
+ *     value of skb->csum_not_inet; skb_crc32c_csum_help is provided to resolve
+ *     CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
  *
  *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
  *     offloading the FCOE CRC in a packet. To perform this offload the stack
@@ -615,6 +616,7 @@ static inline bool skb_mstamp_after(const struct skb_mstamp *t1,
  *	@wifi_acked_valid: wifi_acked was set
  *	@wifi_acked: whether frame was acked on wifi or not
  *	@no_fcs:  Request NIC to treat last 4 bytes as Ethernet FCS
+ *	@csum_not_inet: use CRC32c to resolve CHECKSUM_PARTIAL
  *	@dst_pending_confirm: need to confirm neighbour
   *	@napi_id: id of the NAPI struct this skb came from
  *	@secmark: security marking
@@ -743,7 +745,7 @@ struct sk_buff {
 	__u8			csum_valid:1;
 	__u8			csum_complete_sw:1;
 	__u8			csum_level:2;
-	__u8			__csum_bad_unused:1; /* one bit hole */
+	__u8			csum_not_inet:1;
 
 	__u8			dst_pending_confirm:1;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
diff --git a/net/core/dev.c b/net/core/dev.c
index 77a2d73..9f56f87 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2642,6 +2642,7 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
 	}
 	*(__le32 *)(skb->data + offset) = crc32c_csum;
 	skb->ip_summed = CHECKSUM_NONE;
+	skb->csum_not_inet = 0;
 out:
 	return ret;
 }
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index ab6fdbd..3317a2f 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -350,6 +350,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
 	sctph->checksum = sctp_compute_cksum(skb,
 					     skb_network_offset(skb) + ihl);
 	skb->ip_summed = CHECKSUM_NONE;
+	skb->csum_not_inet = 0;
 
 	return 1;
 }
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 378f462..ef156ac 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -35,6 +35,7 @@
 static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
 {
 	skb->ip_summed = CHECKSUM_NONE;
+	skb->csum_not_inet = 0;
 	return sctp_compute_cksum(skb, skb_transport_offset(skb));
 }
 
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 1409a87..e2edf2e 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -538,6 +538,7 @@ static int sctp_packet_pack(struct sctp_packet *packet,
 	} else {
 chksum:
 		head->ip_summed = CHECKSUM_PARTIAL;
+		head->csum_not_inet = 1;
 		head->csum_start = skb_transport_header(head) - head->head;
 		head->csum_offset = offsetof(struct sctphdr, checksum);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 5/7] net: more accurate checksumming in validate_xmit_skb()
  2017-04-07 18:11                               ` Tom Herbert
                                                   ` (5 preceding siblings ...)
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
  8 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

skb_csum_hwoffload_help() uses netdev features and skb->csum_not_inet to
determine if skb needs software computation of Internet Checksum or crc32c
(or nothing, if this computation can be done by the hardware). Use it in
place of skb_checksum_help() in validate_xmit_skb() to avoid corruption
of non-GSO SCTP packets having skb->ip_summed equal to CHECKSUM_PARTIAL.

While at it, remove references to skb_csum_off_chk* functions, since they
are not present anymore in Linux since commit cf53b1da73bd ('Revert "net:
Add driver helper functions to determine checksum"').

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 Documentation/networking/checksum-offloads.txt | 11 +++++++----
 include/linux/netdevice.h                      |  3 +++
 include/linux/skbuff.h                         | 13 +++++--------
 net/core/dev.c                                 | 14 ++++++++++++--
 4 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
index 56e3686..d52d191 100644
--- a/Documentation/networking/checksum-offloads.txt
+++ b/Documentation/networking/checksum-offloads.txt
@@ -35,6 +35,9 @@ This interface only allows a single checksum to be offloaded.  Where
  encapsulation is used, the packet may have multiple checksum fields in
  different header layers, and the rest will have to be handled by another
  mechanism such as LCO or RCO.
+CRC32c can also be offloaded using this interface, by means of filling
+ skb->csum_start and skb->csum_offset as described above, and setting
+ skb->csum_not_inet: see skbuff.h comment (section 'D') for more details.
 No offloading of the IP header checksum is performed; it is always done in
  software.  This is OK because when we build the IP header, we obviously
  have it in cache, so summing it isn't expensive.  It's also rather short.
@@ -49,9 +52,9 @@ A driver declares its offload capabilities in netdev->hw_features; see
  and csum_offset given in the SKB; if it tries to deduce these itself in
  hardware (as some NICs do) the driver should check that the values in the
  SKB match those which the hardware will deduce, and if not, fall back to
- checksumming in software instead (with skb_checksum_help or one of the
- skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h).  This
- is a pain, but that's what you get when hardware tries to be clever.
+ checksumming in software instead (with skb_csum_hwoffload_help() or one of
+ the skb_checksum_help() / skb_crc32c_csum_help functions, as mentioned in
+ include/linux/skbuff.h).
 
 The stack should, for the most part, assume that checksum offload is
  supported by the underlying device.  The only place that should check is
@@ -60,7 +63,7 @@ The stack should, for the most part, assume that checksum offload is
  may include other offloads besides TX Checksum Offload) and, if they are
  not supported or enabled on the device (determined by netdev->features),
  performs the corresponding offload in software.  In the case of TX
- Checksum Offload, that means calling skb_checksum_help(skb).
+ Checksum Offload, that means calling skb_csum_hwoffload_help(skb, features).
 
 
 LCO: Local Checksum Offload
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ab9e3dc..45e8958 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3897,6 +3897,9 @@ void netdev_rss_key_fill(void *buffer, size_t len);
 int dev_get_nest_level(struct net_device *dev);
 int skb_checksum_help(struct sk_buff *skb);
 int skb_crc32c_csum_help(struct sk_buff *skb);
+int skb_csum_hwoffload_help(struct sk_buff *skb,
+			    const netdev_features_t features);
+
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
 struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 419f4c8..4002c11 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -162,14 +162,11 @@
  *
  *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
  *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
- *   checksum offload capability. If a	device has limited checksum capabilities
- *   (for instance can only perform NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM as
- *   described above) a helper function can be called to resolve
- *   CHECKSUM_PARTIAL. The helper functions are skb_csum_off_chk*. The helper
- *   function takes a spec argument that describes the protocol layer that is
- *   supported for checksum offload and can be called for each packet. If a
- *   packet does not match the specification for offload, skb_checksum_help
- *   is called to resolve the checksum.
+ *   checksum offload capability.
+ *   skb_csum_hwoffload_help() can be called to resolve CHECKSUM_PARTIAL based
+ *   on network device checksumming capabilities: if a packet does not match
+ *   them, skb_checksum_help or skb_crc32c_help (depending on the value of
+ *   csum_not_inet, see item D.) is called to resolve the checksum.
  *
  * CHECKSUM_NONE:
  *
diff --git a/net/core/dev.c b/net/core/dev.c
index 9f56f87..440ace0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2989,6 +2989,17 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
 	return skb;
 }
 
+int skb_csum_hwoffload_help(struct sk_buff *skb,
+			    const netdev_features_t features)
+{
+	if (unlikely(skb->csum_not_inet))
+		return !!(features & NETIF_F_SCTP_CRC) ? 0 :
+			skb_crc32c_csum_help(skb);
+
+	return !!(features & NETIF_F_CSUM_MASK) ? 0 : skb_checksum_help(skb);
+}
+EXPORT_SYMBOL(skb_csum_hwoffload_help);
+
 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
 {
 	netdev_features_t features;
@@ -3024,8 +3035,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 			else
 				skb_set_transport_header(skb,
 							 skb_checksum_start_offset(skb));
-			if (!(features & NETIF_F_CSUM_MASK) &&
-			    skb_checksum_help(skb))
+			if (skb_csum_hwoffload_help(skb, features))
 				goto out_kfree_skb;
 		}
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 6/7] openvswitch: more accurate checksumming in queue_userspace_packet()
  2017-04-07 18:11                               ` Tom Herbert
                                                   ` (6 preceding siblings ...)
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
  8 siblings, 0 replies; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

if skb carries an SCTP packet and ip_summed is CHECKSUM_PARTIAL, it needs
CRC32c in place of Internet Checksum: use skb_csum_hwoffload_help to avoid
corrupting such packets while queueing them towards userspace.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/openvswitch/datapath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 7b17da9..9ddc9f8 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -453,7 +453,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
 
 	/* Complete checksum if needed */
 	if (skb->ip_summed = CHECKSUM_PARTIAL &&
-	    (err = skb_checksum_help(skb)))
+	    (err = skb_csum_hwoffload_help(skb, 0)))
 		goto out;
 
 	/* Older versions of OVS user space enforce alignment of the last
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}
  2017-04-07 18:11                               ` Tom Herbert
                                                   ` (7 preceding siblings ...)
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
@ 2017-04-20 13:38                                 ` Davide Caratti
  2017-04-29 20:20                                   ` Tom Herbert
  8 siblings, 1 reply; 51+ messages in thread
From: Davide Caratti @ 2017-04-20 13:38 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, David Laight
  Cc: David S . Miller, Marcelo Ricardo Leitner, netdev, linux-sctp

Add FCoE to the list of protocols that can set CHECKSUM_UNNECESSARY; add a
note to CHECKSUM_COMPLETE section to specify that it does not apply to SCTP
and FCoE protocols.

Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 include/linux/skbuff.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4002c11..c902b77 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -109,6 +109,7 @@
  *       may perform further validation in this case.
  *     GRE: only if the checksum is present in the header.
  *     SCTP: indicates the CRC in SCTP header has been validated.
+ *     FCOE: indicates the CRC in FC frame has been validated.
  *
  *   skb->csum_level indicates the number of consecutive checksums found in
  *   the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
@@ -126,8 +127,10 @@
  *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
  *   hardware doesn't need to parse L3/L4 headers to implement this.
  *
- *   Note: Even if device supports only some protocols, but is able to produce
- *   skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
+ *   Notes:
+ *   - Even if device supports only some protocols, but is able to produce
+ *     skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
+ *   - CHECKSUM_COMPLETE is not applicable to SCTP and FCoE protocols.
  *
  * CHECKSUM_PARTIAL:
  *
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
@ 2017-04-27 12:29                                   ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 51+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-04-27 12:29 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	netdev, linux-sctp

On Thu, Apr 20, 2017 at 03:38:08PM +0200, Davide Caratti wrote:
> skb_crc32c_csum_help is like skb_checksum_help, but it is designed for
> checksumming SCTP packets using crc32c (see RFC3309), provided that
> libcrc32c.ko has been loaded before. In case libcrc32c is not loaded,
> invoking skb_crc32c_csum_help on a skb results in one the following
> printouts:
> 
> warn_crc32c_csum_update: attempt to compute crc32c without libcrc32c.ko
> warn_crc32c_csum_combine: attempt to compute crc32c without libcrc32c.ko
> 
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  include/linux/netdevice.h |  1 +
>  include/linux/skbuff.h    |  3 ++-
>  net/core/dev.c            | 40 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index b0aa089..bf84a67 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3898,6 +3898,7 @@ void netdev_rss_key_fill(void *buffer, size_t len);
>  
>  int dev_get_nest_level(struct net_device *dev);
>  int skb_checksum_help(struct sk_buff *skb);
> +int skb_crc32c_csum_help(struct sk_buff *skb);
>  struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
>  				  netdev_features_t features, bool tx_path);
>  struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index ba3ae21..ec4551b 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -193,7 +193,8 @@
>   *     accordingly. Note the there is no indication in the skbuff that the
>   *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
>   *     both IP checksum offload and SCTP CRC offload must verify which offload
> - *     is configured for a packet presumably by inspecting packet headers.
> + *     is configured for a packet presumably by inspecting packet headers; in
> + *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
>   *
>   *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
>   *     offloading the FCOE CRC in a packet. To perform this offload the stack
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5d33e2b..c7aec95 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -140,6 +140,7 @@
>  #include <linux/hrtimer.h>
>  #include <linux/netfilter_ingress.h>
>  #include <linux/crash_dump.h>
> +#include <linux/sctp.h>
>  
>  #include "net-sysfs.h"
>  
> @@ -2606,6 +2607,45 @@ int skb_checksum_help(struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(skb_checksum_help);
>  
> +int skb_crc32c_csum_help(struct sk_buff *skb)
> +{
> +	__le32 crc32c_csum;
> +	int ret = 0, offset;
> +
> +	if (skb->ip_summed != CHECKSUM_PARTIAL)
> +		goto out;
> +
> +	if (unlikely(skb_is_gso(skb)))
> +		goto out;
> +
> +	/* Before computing a checksum, we should make sure no frag could
> +	 * be modified by an external entity : checksum could be wrong.
> +	 */
> +	if (unlikely(skb_has_shared_frag(skb))) {
> +		ret = __skb_linearize(skb);
> +		if (ret)
> +			goto out;
> +	}
> +
> +	offset = skb_checksum_start_offset(skb);
> +	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, offset,
> +						  skb->len - offset, ~(__u32)0,
> +						  crc32c_csum_stub));
> +	offset += offsetof(struct sctphdr, checksum);
> +	BUG_ON(offset >= skb_headlen(skb));

I suggest using WARN_ON_ONCE() here and returning an error instead. Will
still allow debugging and won't disrupt the system.

> +
> +	if (skb_cloned(skb) &&
> +	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
> +		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
> +		if (ret)
> +			goto out;
> +	}

We could do this check (including the BUG_ON/WARN check above) before
the actual crc32 calc. This can fail, and if it does, we will have
calculated it in vain. Note how offset doesn't really depend on the
checksum result.

I know skb_checksum_help also does it this way, maybe it was because of
some cache optimization on the offset += checksum offset  operation?

> +	*(__le32 *)(skb->data + offset) = crc32c_csum;
> +	skb->ip_summed = CHECKSUM_NONE;
> +out:
> +	return ret;
> +}
> +
>  __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
>  {
>  	__be16 type = skb->protocol;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums Davide Caratti
@ 2017-04-27 12:41                                   ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 51+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-04-27 12:41 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Tom Herbert, Alexander Duyck, David Laight, David S . Miller,
	netdev, linux-sctp

On Thu, Apr 20, 2017 at 03:38:06PM +0200, Davide Caratti wrote:
> hello Tom,
> 
> On Fri, 2017-04-07 at 11:11 -0700, Tom Herbert wrote:
> > maybe just call it csum_not_ip then. Then just do "if
> > (unlikely(skb->csum_not_ip)) ..."
> 
> Ok, done. V4 uses this bit for SCTP only and leaves unmodified behavior
> when offloaded FCoE frames are processed. Further work is still possible
> to extend this fix for FCoE, if needed, either by using additional sk_buff
> bits, or using skb->csum_not_ip and use other data (e.g. skb->csum_offset)
> to distinguish SCTP from FCoE.
> 
> > the only case where this new bit is relevant is when
> > CHECKSUM_PARTIAL for a CRC is being done. When it's set for offloading
> > sctp crc it must be set. When CRC is resolved, in the helper for
> > instance, it must be cleared.
> 
> in V4 the bit is set when SCTP packets with offloaded checksum are
> generated; the bit is cleared when CRC32c is resolved for such packets
> (i.e. skb->ip_summed transitions from CHECKSUM_PARTIAL to CHECKSUM_NONE).
> 
> Any feedbacks are appreciated!
> thank you in advance,
> --
> davide
> 
> 
> Davide Caratti (7):
>   skbuff: add stub to help computing crc32c on SCTP packets
>   net: introduce skb_crc32c_csum_help
>   sk_buff: remove support for csum_bad in sk_buff
>   net: use skb->csum_not_inet to identify packets needing crc32c
>   net: more accurate checksumming in validate_xmit_skb()
>   openvswitch: more accurate checksumming in queue_userspace_packet()
>   sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}

Other than the comments I did on patch 2, this series LGTM.


> 
>  Documentation/networking/checksum-offloads.txt   | 11 +++--
>  drivers/net/ethernet/aquantia/atlantic/aq_ring.c |  2 +-
>  include/linux/netdevice.h                        |  8 +--
>  include/linux/skbuff.h                           | 58 +++++++++-------------
>  net/bridge/netfilter/nft_reject_bridge.c         |  5 +-
>  net/core/dev.c                                   | 63 +++++++++++++++++++++---
>  net/core/skbuff.c                                | 24 +++++++++
>  net/ipv4/netfilter/nf_reject_ipv4.c              |  2 +-
>  net/ipv6/netfilter/nf_reject_ipv6.c              |  3 --
>  net/openvswitch/datapath.c                       |  2 +-
>  net/sched/act_csum.c                             |  1 +
>  net/sctp/offload.c                               |  8 +++
>  net/sctp/output.c                                |  1 +
>  13 files changed, 128 insertions(+), 60 deletions(-)
> 
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c Davide Caratti
@ 2017-04-29 20:18                                   ` Tom Herbert
  0 siblings, 0 replies; 51+ messages in thread
From: Tom Herbert @ 2017-04-29 20:18 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org

On Thu, Apr 20, 2017 at 6:38 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> skb->csum_not_inet carries the indication on which algorithm is needed to
> compute checksum on skb in the transmit path, when skb->ip_summed is equal
> to CHECKSUM_PARTIAL. If skb carries a SCTP packet and crc32c hasn't been
> yet written in L4 header, skb->csum_not_inet is assigned to 1; otherwise,
> assume Internet Checksum is needed and thus set skb->csum_not_inet to 0.
>
> Suggested-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  include/linux/skbuff.h | 16 +++++++++-------
>  net/core/dev.c         |  1 +
>  net/sched/act_csum.c   |  1 +
>  net/sctp/offload.c     |  1 +
>  net/sctp/output.c      |  1 +
>  5 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 927309e..419f4c8 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -189,12 +189,13 @@
>   *
>   *   NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
>   *     offloading the SCTP CRC in a packet. To perform this offload the stack
> - *     will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
> - *     accordingly. Note the there is no indication in the skbuff that the
> - *     CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
> - *     both IP checksum offload and SCTP CRC offload must verify which offload
> - *     is configured for a packet presumably by inspecting packet headers; in
> - *     case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
> + *     will set set csum_start and csum_offset accordingly, set ip_summed to
> + *     CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication in
> + *     the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
> + *     A driver that supports both IP checksum offload and SCTP CRC32c offload
> + *     must verify which offload is configured for a packet by testing the
> + *     value of skb->csum_not_inet; skb_crc32c_csum_help is provided to resolve
> + *     CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
>   *
>   *   NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
>   *     offloading the FCOE CRC in a packet. To perform this offload the stack
> @@ -615,6 +616,7 @@ static inline bool skb_mstamp_after(const struct skb_mstamp *t1,
>   *     @wifi_acked_valid: wifi_acked was set
>   *     @wifi_acked: whether frame was acked on wifi or not
>   *     @no_fcs:  Request NIC to treat last 4 bytes as Ethernet FCS
> + *     @csum_not_inet: use CRC32c to resolve CHECKSUM_PARTIAL
>   *     @dst_pending_confirm: need to confirm neighbour
>    *    @napi_id: id of the NAPI struct this skb came from
>   *     @secmark: security marking
> @@ -743,7 +745,7 @@ struct sk_buff {
>         __u8                    csum_valid:1;
>         __u8                    csum_complete_sw:1;
>         __u8                    csum_level:2;
> -       __u8                    __csum_bad_unused:1; /* one bit hole */
> +       __u8                    csum_not_inet:1;
>
>         __u8                    dst_pending_confirm:1;
>  #ifdef CONFIG_IPV6_NDISC_NODETYPE
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 77a2d73..9f56f87 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2642,6 +2642,7 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
>         }
>         *(__le32 *)(skb->data + offset) = crc32c_csum;
>         skb->ip_summed = CHECKSUM_NONE;
> +       skb->csum_not_inet = 0;
>  out:
>         return ret;
>  }
> diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
> index ab6fdbd..3317a2f 100644
> --- a/net/sched/act_csum.c
> +++ b/net/sched/act_csum.c
> @@ -350,6 +350,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
>         sctph->checksum = sctp_compute_cksum(skb,
>                                              skb_network_offset(skb) + ihl);
>         skb->ip_summed = CHECKSUM_NONE;
> +       skb->csum_not_inet = 0;
>
>         return 1;
>  }
> diff --git a/net/sctp/offload.c b/net/sctp/offload.c
> index 378f462..ef156ac 100644
> --- a/net/sctp/offload.c
> +++ b/net/sctp/offload.c
> @@ -35,6 +35,7 @@
>  static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
>  {
>         skb->ip_summed = CHECKSUM_NONE;
> +       skb->csum_not_inet = 0;
>         return sctp_compute_cksum(skb, skb_transport_offset(skb));
>  }
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 1409a87..e2edf2e 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -538,6 +538,7 @@ static int sctp_packet_pack(struct sctp_packet *packet,
>         } else {
>  chksum:
>                 head->ip_summed = CHECKSUM_PARTIAL;
> +               head->csum_not_inet = 1;
>                 head->csum_start = skb_transport_header(head) - head->head;
>                 head->csum_offset = offsetof(struct sctphdr, checksum);
>         }
> --
> 2.7.4
>
Looks great!

Acked-by: Tom Herbert <tom@herbertland.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
@ 2017-04-29 20:20                                   ` Tom Herbert
  0 siblings, 0 replies; 51+ messages in thread
From: Tom Herbert @ 2017-04-29 20:20 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org

On Thu, Apr 20, 2017 at 6:38 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> Add FCoE to the list of protocols that can set CHECKSUM_UNNECESSARY; add a
> note to CHECKSUM_COMPLETE section to specify that it does not apply to SCTP
> and FCoE protocols.
>
> Suggested-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  include/linux/skbuff.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4002c11..c902b77 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -109,6 +109,7 @@
>   *       may perform further validation in this case.
>   *     GRE: only if the checksum is present in the header.
>   *     SCTP: indicates the CRC in SCTP header has been validated.
> + *     FCOE: indicates the CRC in FC frame has been validated.
>   *
>   *   skb->csum_level indicates the number of consecutive checksums found in
>   *   the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
> @@ -126,8 +127,10 @@
>   *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
>   *   hardware doesn't need to parse L3/L4 headers to implement this.
>   *
> - *   Note: Even if device supports only some protocols, but is able to produce
> - *   skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
> + *   Notes:
> + *   - Even if device supports only some protocols, but is able to produce
> + *     skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
> + *   - CHECKSUM_COMPLETE is not applicable to SCTP and FCoE protocols.
>   *
>   * CHECKSUM_PARTIAL:
>   *
> --
> 2.7.4
>

Acked-by: Tom Herbert <tom@herbertland.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff
  2017-04-20 13:38                                 ` [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
@ 2017-04-29 20:21                                   ` Tom Herbert
  0 siblings, 0 replies; 51+ messages in thread
From: Tom Herbert @ 2017-04-29 20:21 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Alexander Duyck, David Laight, David S . Miller,
	Marcelo Ricardo Leitner, Linux Kernel Network Developers,
	linux-sctp @ vger . kernel . org

On Thu, Apr 20, 2017 at 6:38 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> This bit was introduced with 5a21232983aa ("net: Support for csum_bad in
> skbuff") to reduce the stack workload when processing RX packets carrying
> a wrong Internet Checksum. Up to now, only one driver (besides GRO core)
> are setting it.
> The test on NAPI_GRO_CB(skb)->flush in dev_gro_receive() is now done
> before the test on same_flow, to preserve behavior in case of wrong
> checksum.
>
> Suggested-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  drivers/net/ethernet/aquantia/atlantic/aq_ring.c |  2 +-
>  include/linux/netdevice.h                        |  4 +---
>  include/linux/skbuff.h                           | 23 ++---------------------
>  net/bridge/netfilter/nft_reject_bridge.c         |  5 +----
>  net/core/dev.c                                   |  8 +++-----
>  net/ipv4/netfilter/nf_reject_ipv4.c              |  2 +-
>  net/ipv6/netfilter/nf_reject_ipv6.c              |  3 ---
>  7 files changed, 9 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> index 3a8a4aa..9a08179 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> @@ -223,7 +223,7 @@ int aq_ring_rx_clean(struct aq_ring_s *self, int *work_done, int budget)
>                 skb->protocol = eth_type_trans(skb, ndev);
>                 if (unlikely(buff->is_cso_err)) {
>                         ++self->stats.rx.errors;
> -                       __skb_mark_checksum_bad(skb);
> +                       skb->ip_summed = CHECKSUM_NONE;
>                 } else {
>                         if (buff->is_ip_cso) {
>                                 __skb_incr_checksum_unnecessary(skb);
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index bf84a67..ab9e3dc 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2546,9 +2546,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>         if (__skb_gro_checksum_validate_needed(skb, zero_okay, check))  \
>                 __ret = __skb_gro_checksum_validate_complete(skb,       \
>                                 compute_pseudo(skb, proto));            \
> -       if (__ret)                                                      \
> -               __skb_mark_checksum_bad(skb);                           \
> -       else                                                            \
> +       if (!__ret)                                                     \
>                 skb_gro_incr_csum_unnecessary(skb);                     \
>         __ret;                                                          \
>  })
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index ec4551b..927309e 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -743,7 +743,7 @@ struct sk_buff {
>         __u8                    csum_valid:1;
>         __u8                    csum_complete_sw:1;
>         __u8                    csum_level:2;
> -       __u8                    csum_bad:1;
> +       __u8                    __csum_bad_unused:1; /* one bit hole */
>
>         __u8                    dst_pending_confirm:1;
>  #ifdef CONFIG_IPV6_NDISC_NODETYPE
> @@ -3387,21 +3387,6 @@ static inline void __skb_incr_checksum_unnecessary(struct sk_buff *skb)
>         }
>  }
>
> -static inline void __skb_mark_checksum_bad(struct sk_buff *skb)
> -{
> -       /* Mark current checksum as bad (typically called from GRO
> -        * path). In the case that ip_summed is CHECKSUM_NONE
> -        * this must be the first checksum encountered in the packet.
> -        * When ip_summed is CHECKSUM_UNNECESSARY, this is the first
> -        * checksum after the last one validated. For UDP, a zero
> -        * checksum can not be marked as bad.
> -        */
> -
> -       if (skb->ip_summed = CHECKSUM_NONE ||
> -           skb->ip_summed = CHECKSUM_UNNECESSARY)
> -               skb->csum_bad = 1;
> -}
> -
>  /* Check if we need to perform checksum complete validation.
>   *
>   * Returns true if checksum complete is needed, false otherwise
> @@ -3455,9 +3440,6 @@ static inline __sum16 __skb_checksum_validate_complete(struct sk_buff *skb,
>                         skb->csum_valid = 1;
>                         return 0;
>                 }
> -       } else if (skb->csum_bad) {
> -               /* ip_summed = CHECKSUM_NONE in this case */
> -               return (__force __sum16)1;
>         }
>
>         skb->csum = psum;
> @@ -3517,8 +3499,7 @@ static inline __wsum null_compute_pseudo(struct sk_buff *skb, int proto)
>
>  static inline bool __skb_checksum_convert_check(struct sk_buff *skb)
>  {
> -       return (skb->ip_summed = CHECKSUM_NONE &&
> -               skb->csum_valid && !skb->csum_bad);
> +       return (skb->ip_summed = CHECKSUM_NONE && skb->csum_valid);
>  }
>
>  static inline void __skb_checksum_convert(struct sk_buff *skb,
> diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
> index 346ef6b..c16dd3a 100644
> --- a/net/bridge/netfilter/nft_reject_bridge.c
> +++ b/net/bridge/netfilter/nft_reject_bridge.c
> @@ -111,7 +111,7 @@ static void nft_reject_br_send_v4_unreach(struct net *net,
>         __wsum csum;
>         u8 proto;
>
> -       if (oldskb->csum_bad || !nft_bridge_iphdr_validate(oldskb))
> +       if (!nft_bridge_iphdr_validate(oldskb))
>                 return;
>
>         /* IP header checks: fragment. */
> @@ -226,9 +226,6 @@ static bool reject6_br_csum_ok(struct sk_buff *skb, int hook)
>         __be16 fo;
>         u8 proto = ip6h->nexthdr;
>
> -       if (skb->csum_bad)
> -               return false;
> -
>         if (skb_csum_unnecessary(skb))
>                 return true;
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index c7aec95..77a2d73 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4533,9 +4533,6 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
>         if (!(skb->dev->features & NETIF_F_GRO))
>                 goto normal;
>
> -       if (skb->csum_bad)
> -               goto normal;
> -
>         gro_list_prepare(napi, skb);
>
>         rcu_read_lock();
> @@ -4595,11 +4592,12 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
>                 napi->gro_count--;
>         }
>
> +       if (NAPI_GRO_CB(skb)->flush)
> +               goto normal;
> +
>         if (same_flow)
>                 goto ok;
>
> -       if (NAPI_GRO_CB(skb)->flush)
> -               goto normal;
>
>         if (unlikely(napi->gro_count >= MAX_GRO_SKBS)) {
>                 struct sk_buff *nskb = napi->gro_list;
> diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
> index 7cd8d0d..6f8d9e5 100644
> --- a/net/ipv4/netfilter/nf_reject_ipv4.c
> +++ b/net/ipv4/netfilter/nf_reject_ipv4.c
> @@ -172,7 +172,7 @@ void nf_send_unreach(struct sk_buff *skb_in, int code, int hook)
>         struct iphdr *iph = ip_hdr(skb_in);
>         u8 proto;
>
> -       if (skb_in->csum_bad || iph->frag_off & htons(IP_OFFSET))
> +       if (iph->frag_off & htons(IP_OFFSET))
>                 return;
>
>         if (skb_csum_unnecessary(skb_in)) {
> diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
> index eedee5d..f63b18e 100644
> --- a/net/ipv6/netfilter/nf_reject_ipv6.c
> +++ b/net/ipv6/netfilter/nf_reject_ipv6.c
> @@ -220,9 +220,6 @@ static bool reject6_csum_ok(struct sk_buff *skb, int hook)
>         __be16 fo;
>         u8 proto;
>
> -       if (skb->csum_bad)
> -               return false;
> -
>         if (skb_csum_unnecessary(skb))
>                 return true;
>
> --
> 2.7.4
>

Acked-by: Tom Herbert <tom@herbertland.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2017-04-29 20:21 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 1/5] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 2/5] net: split skb_checksum_help Davide Caratti
2017-01-23 20:59   ` Tom Herbert
2017-01-24 16:35     ` David Laight
2017-02-02 15:07       ` Davide Caratti
2017-02-02 16:55         ` David Laight
2017-02-02 18:08         ` Tom Herbert
2017-02-27 13:39           ` Davide Caratti
2017-02-27 15:11             ` Tom Herbert
2017-02-28 10:31               ` Davide Caratti
2017-02-28 10:32             ` [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-02-28 22:46               ` Alexander Duyck
2017-03-01  3:17                 ` Tom Herbert
2017-03-01 10:53                 ` David Laight
2017-03-06 21:51                 ` Davide Caratti
2017-03-07 18:06                   ` Alexander Duyck
2017-03-18 13:17                     ` Davide Caratti
2017-03-18 22:35                       ` Tom Herbert
2017-04-07 14:16                         ` [PATCH RFC net-next v3 0/7] improve CRC32c in the forwarding path Davide Caratti
2017-04-07 14:16                         ` [PATCH RFC net-next v3 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-04-07 14:16                         ` [PATCH RFC net-next v3 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
2017-04-07 14:16                         ` [PATCH RFC net-next v3 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
2017-04-07 14:16                         ` [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c Davide Caratti
2017-04-07 15:43                           ` Tom Herbert
2017-04-07 17:29                             ` Davide Caratti
2017-04-07 18:11                               ` Tom Herbert
2017-04-13 10:36                                 ` Davide Caratti
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums Davide Caratti
2017-04-27 12:41                                   ` Marcelo Ricardo Leitner
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
2017-04-27 12:29                                   ` Marcelo Ricardo Leitner
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
2017-04-29 20:21                                   ` Tom Herbert
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c Davide Caratti
2017-04-29 20:18                                   ` Tom Herbert
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
2017-04-20 13:38                                 ` [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
2017-04-29 20:20                                   ` Tom Herbert
2017-04-07 14:16                         ` [PATCH RFC net-next v3 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
2017-04-07 14:16                         ` [PATCH RFC net-next v3 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
2017-04-07 14:16                         ` [PATCH RFC net-next v3 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
2017-02-28 10:32             ` [PATCH RFC net-next v2 2/4] net: introduce skb_sctp_csum_help Davide Caratti
2017-02-28 10:32             ` [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb Davide Caratti
2017-02-28 19:50               ` Tom Herbert
2017-02-28 10:32             ` [PATCH RFC net-next v2 4/4] Documentation: update notes on checksum offloading Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 3/5] net: introduce skb_sctp_csum_help Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 4/5] net: more accurate checksumming in validate_xmit_skb Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 5/5] Documentation: add description of skb_sctp_csum_help Davide Caratti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).