From: David Woodhouse <dwmw2@infradead.org>
To: netdev@vger.kernel.org
Subject: [RFC PATCH] Fix false positives in can_checksum_protocol()
Date: Fri, 25 Sep 2015 13:55:58 +0100 [thread overview]
Message-ID: <1443185758.4674.44.camel@infradead.org> (raw)
[-- Attachment #1: Type: text/plain, Size: 4540 bytes --]
The check in harmonize_features() is supposed to match the skb against
the features of the device in question, and prevent us from handing a
skb to a device which can't handle it.
It doesn't work correctly. A device with NETIF_F_IP_CSUM or
NETIF_F_IPV6_CSUM capabilities is only required to checksum TCP or UDP,
on Legacy IP and IPv6 respectively. But the existing check will allow
us to pass it *any* ETH_P_IP/ETH_P_IPV6 packets for hardware checksum
offload.
Depending on the driver in use, this leads to a BUG, a WARNING, or just
silent data corruption.
This is one approach to fixing that, and my test program at
http://bombadil.infradead.org/~dwmw2/raw.c can no longer trivially
reproduce the problem.
The test does now have false *negatives*, but those shouldn't happen
for locally-generated packets; only packets injected from af_packet,
tun, virtio_net and other places that allow us to inject
CHECKSUM_PARTIAL packets in order to make use of hardware offload
features. And false negatives aren't anywhere near as much of a problem
as false positives are — we just finish the checksum in software and
send the packet anyway.
It would be possible to fix those false negatives, if we really wanted
to — perhaps by adding an additional bit in the skbuff which indicates
that it *is* a TCP or UDP packet, rather than using ->sk->sk_protocol.
Then that bit could be set if appropriate in skb_partial_csum_set(), as
well as the places where we locally generate such packets. And the
check in can_checksum_protocol() would just check for that bit.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
Since can_checksum_protocol is inline, the compiler ought to know that
it doesn't even need to dereference skb->sk in the case where the
device has the NETIF_F_GEN_CSUM feature. So the additional check should
not slow down the (hopefully) common case in the fast path.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2d15e38..76c8330 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3628,15 +3628,23 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, netdev_features_t features)
__be16 skb_network_protocol(struct sk_buff *skb, int *depth);
static inline bool can_checksum_protocol(netdev_features_t features,
- __be16 protocol)
-{
- return ((features & NETIF_F_GEN_CSUM) ||
- ((features & NETIF_F_V4_CSUM) &&
- protocol == htons(ETH_P_IP)) ||
- ((features & NETIF_F_V6_CSUM) &&
- protocol == htons(ETH_P_IPV6)) ||
- ((features & NETIF_F_FCOE_CRC) &&
- protocol == htons(ETH_P_FCOE)));
+ __be16 protocol, u8 sk_protocol)
+{
+ if ((features & NETIF_F_GEN_CSUM) ||
+ ((features & NETIF_F_FCOE_CRC) && protocol == htons(ETH_P_FCOE)))
+ return 1;
+
+ /* NETIF_F_V[46]_CSUM are defined to work only on TCP and UDP.
+ * That is, when it needs to start checksumming at the transport
+ * header, and place the result at an offset of either 6 (for UDP)
+ * or 16 (for TCP).
+ */
+ if ((((features & NETIF_F_V4_CSUM) && protocol == htons(ETH_P_IP)) ||
+ ((features & NETIF_F_V6_CSUM) && protocol == htons(ETH_P_IPV6))) &&
+ (sk_protocol == IPPROTO_TCP || sk_protocol == IPPROTO_UDP))
+ return 1;
+
+ return 0;
}
#ifdef CONFIG_BUG
diff --git a/net/core/dev.c b/net/core/dev.c
index 6bb6470..3c40957 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2613,7 +2613,8 @@ static netdev_features_t harmonize_features(struct sk_buff *skb,
features = net_mpls_features(skb, features, type);
if (skb->ip_summed != CHECKSUM_NONE &&
- !can_checksum_protocol(features, type)) {
+ !can_checksum_protocol(features, type,
+ skb->sk ? skb->sk->sk_protocol : 0)) {
features &= ~NETIF_F_ALL_CSUM;
} else if (illegal_highdma(skb->dev, skb)) {
features &= ~NETIF_F_SG;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index dad4dd3..9126c6f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3004,7 +3004,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
return ERR_PTR(-EINVAL);
csum = !head_skb->encap_hdr_csum &&
- !!can_checksum_protocol(features, proto);
+ !!can_checksum_protocol(features, proto,
+ head_skb->sk ? head_skb->sk->sk_protocol : 0);
headroom = skb_headroom(head_skb);
pos = skb_headlen(head_skb);
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5691 bytes --]
next reply other threads:[~2015-09-25 12:56 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-25 12:55 David Woodhouse [this message]
2015-09-28 17:03 ` [RFC PATCH] Fix false positives in can_checksum_protocol() Tom Herbert
2015-09-28 18:27 ` David Woodhouse
2015-09-28 19:13 ` Tom Herbert
2015-09-28 19:26 ` David Woodhouse
2015-09-28 19:37 ` Tom Herbert
2015-09-29 1:38 ` Jesse Brandeburg
2015-09-29 3:04 ` Tom Herbert
2015-09-29 7:12 ` David Woodhouse
2015-09-29 22:52 ` Tom Herbert
2015-10-05 11:16 ` David Woodhouse
2015-10-05 16:23 ` Tom Herbert
2015-10-05 18:28 ` Rustad, Mark D
2015-10-05 20:22 ` David Woodhouse
2015-09-29 7:08 ` David Woodhouse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1443185758.4674.44.camel@infradead.org \
--to=dwmw2@infradead.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).