All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Eric Dumazet <edumazet@google.com>,
	Matthieu Baerts <matthieu.baerts@tessares.net>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.14 14/38] virtio_net: Do not pull payload in skb->head
Date: Mon,  2 Aug 2021 15:44:36 +0200	[thread overview]
Message-ID: <20210802134335.285560278@linuxfoundation.org> (raw)
In-Reply-To: <20210802134334.835358048@linuxfoundation.org>

From: Eric Dumazet <edumazet@google.com>

commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/virtio_net.c   |   10 +++++++---
 include/linux/virtio_net.h |   14 +++++++++-----
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -339,9 +339,13 @@ static struct sk_buff *page_to_skb(struc
 	offset += hdr_padded_len;
 	p += hdr_padded_len;
 
-	copy = len;
-	if (copy > skb_tailroom(skb))
-		copy = skb_tailroom(skb);
+	/* Copy all frame if it fits skb->head, otherwise
+	 * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+	 */
+	if (len <= skb_tailroom(skb))
+		copy = len;
+	else
+		copy = ETH_HLEN;
 	skb_put_data(skb, p, copy);
 
 	len -= copy;
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(
 	skb_reset_mac_header(skb);
 
 	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-		u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+		u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+		if (!pskb_may_pull(skb, needed))
+			return -EINVAL;
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
 
 		p_off = skb_transport_offset(skb) + thlen;
-		if (p_off > skb_headlen(skb))
+		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;
 	} else {
 		/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -100,14 +104,14 @@ retry:
 			}
 
 			p_off = keys.control.thoff + thlen;
-			if (p_off > skb_headlen(skb) ||
+			if (!pskb_may_pull(skb, p_off) ||
 			    keys.basic.ip_proto != ip_proto)
 				return -EINVAL;
 
 			skb_set_transport_header(skb, keys.control.thoff);
 		} else if (gso_type) {
 			p_off = thlen;
-			if (p_off > skb_headlen(skb))
+			if (!pskb_may_pull(skb, p_off))
 				return -EINVAL;
 		}
 	}


_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

WARNING: multiple messages have this Message-ID (diff)
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Eric Dumazet <edumazet@google.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	virtualization@lists.linux-foundation.org,
	"David S. Miller" <davem@davemloft.net>,
	Matthieu Baerts <matthieu.baerts@tessares.net>
Subject: [PATCH 4.14 14/38] virtio_net: Do not pull payload in skb->head
Date: Mon,  2 Aug 2021 15:44:36 +0200	[thread overview]
Message-ID: <20210802134335.285560278@linuxfoundation.org> (raw)
In-Reply-To: <20210802134334.835358048@linuxfoundation.org>

From: Eric Dumazet <edumazet@google.com>

commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/virtio_net.c   |   10 +++++++---
 include/linux/virtio_net.h |   14 +++++++++-----
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -339,9 +339,13 @@ static struct sk_buff *page_to_skb(struc
 	offset += hdr_padded_len;
 	p += hdr_padded_len;
 
-	copy = len;
-	if (copy > skb_tailroom(skb))
-		copy = skb_tailroom(skb);
+	/* Copy all frame if it fits skb->head, otherwise
+	 * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+	 */
+	if (len <= skb_tailroom(skb))
+		copy = len;
+	else
+		copy = ETH_HLEN;
 	skb_put_data(skb, p, copy);
 
 	len -= copy;
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(
 	skb_reset_mac_header(skb);
 
 	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-		u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+		u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+		if (!pskb_may_pull(skb, needed))
+			return -EINVAL;
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
 
 		p_off = skb_transport_offset(skb) + thlen;
-		if (p_off > skb_headlen(skb))
+		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;
 	} else {
 		/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -100,14 +104,14 @@ retry:
 			}
 
 			p_off = keys.control.thoff + thlen;
-			if (p_off > skb_headlen(skb) ||
+			if (!pskb_may_pull(skb, p_off) ||
 			    keys.basic.ip_proto != ip_proto)
 				return -EINVAL;
 
 			skb_set_transport_header(skb, keys.control.thoff);
 		} else if (gso_type) {
 			p_off = thlen;
-			if (p_off > skb_headlen(skb))
+			if (!pskb_may_pull(skb, p_off))
 				return -EINVAL;
 		}
 	}



  parent reply	other threads:[~2021-08-02 13:48 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-02 13:44 [PATCH 4.14 00/38] 4.14.242-rc1 review Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 01/38] selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 02/38] KVM: x86: determine if an exception has an error code only when injecting it Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 03/38] net: split out functions related to registering inflight socket files Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 04/38] af_unix: fix garbage collect vs MSG_PEEK Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 05/38] workqueue: fix UAF in pwq_unbound_release_workfn() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 06/38] net/802/mrp: fix memleak in mrp_request_join() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 07/38] net/802/garp: fix memleak in garp_request_join() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 08/38] net: annotate data race around sk_ll_usec Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 09/38] sctp: move 198 addresses from unusable to private scope Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 10/38] hfs: add missing clean-up in hfs_fill_super Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 11/38] hfs: fix high memory mapping in hfs_bnode_read Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 12/38] hfs: add lock nesting notation to hfs_find_init Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 13/38] ARM: dts: versatile: Fix up interrupt controller node names Greg Kroah-Hartman
2021-08-02 13:44 ` Greg Kroah-Hartman [this message]
2021-08-02 13:44   ` [PATCH 4.14 14/38] virtio_net: Do not pull payload in skb->head Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 15/38] gro: ensure frag0 meets IP header alignment Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 16/38] x86/kvm: fix vcpu-id indexed array sizes Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 17/38] ocfs2: fix zero out valid data Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 18/38] ocfs2: issue zeroout to EOF blocks Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 19/38] can: raw: raw_setsockopt(): fix raw_rcv panic for sock UAF Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 20/38] can: mcba_usb_start(): add missing urb->transfer_dma initialization Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 21/38] can: usb_8dev: fix memory leak Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 22/38] can: ems_usb: " Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 23/38] can: esd_usb2: " Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 24/38] NIU: fix incorrect error return, missed in previous revert Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 25/38] nfc: nfcsim: fix use after free during module unload Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 26/38] x86/asm: Ensure asm/proto.h can be included stand-alone Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 27/38] cfg80211: Fix possible memory leak in function cfg80211_bss_update Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 28/38] netfilter: conntrack: adjust stop timestamp to real expiry value Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 29/38] netfilter: nft_nat: allow to specify layer 4 protocol NAT only Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 30/38] tipc: fix sleeping in tipc accept routine Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 31/38] mlx4: Fix missing error code in mlx4_load_one() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 32/38] net: llc: fix skb_over_panic Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 33/38] net/mlx5: Fix flow table chaining Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 34/38] sctp: fix return value check in __sctp_rcv_asconf_lookup Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 35/38] tulip: windbond-840: Fix missing pci_disable_device() in probe and remove Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 36/38] sis900: " Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 37/38] can: hi311x: fix a signedness bug in hi3110_cmd() Greg Kroah-Hartman
2021-08-02 13:45 ` [PATCH 4.14 38/38] Revert "perf map: Fix dso->nsinfo refcounting" Greg Kroah-Hartman
2021-08-03 13:17 ` [PATCH 4.14 00/38] 4.14.242-rc1 review Naresh Kamboju
2021-08-03 14:50 ` Jon Hunter
2021-08-03 19:15 ` Guenter Roeck
2021-08-04  2:56 ` Samuel Zou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210802134335.285560278@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthieu.baerts@tessares.net \
    --cc=mst@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.