From: Eric Dumazet <eric.dumazet@gmail.com>
To: "David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
netdev <netdev@vger.kernel.org>,
virtualization@lists.linux-foundation.org,
Eric Dumazet <edumazet@google.com>
Subject: [PATCH net] virtio_net: Do not pull payload in skb->head
Date: Fri, 2 Apr 2021 06:26:02 -0700 [thread overview]
Message-ID: <20210402132602.3659282-1-eric.dumazet@gmail.com> (raw)
From: Eric Dumazet <edumazet@google.com>
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
---
drivers/net/virtio_net.c | 10 +++++++---
include/linux/virtio_net.h | 14 +++++++++-----
2 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 82e520d2cb1229a0c7b9fd0def3e4a7135536478..0824e6999e49957f7aaf7c990f6259792d42f32b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
- copy = len;
- if (copy > skb_tailroom(skb))
- copy = skb_tailroom(skb);
+ /* Copy all frame if it fits skb->head, otherwise
+ * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+ */
+ if (len <= skb_tailroom(skb))
+ copy = len;
+ else
+ copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa69632e2c2da30b581a666f7fbb94b64..b465f8f3e554f27ced45c35f54f113cf6dce1f07 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
- u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
- u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+ u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+ u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+ u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+ if (!pskb_may_pull(skb, needed))
+ return -EINVAL;
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
p_off = skb_transport_offset(skb) + thlen;
- if (p_off > skb_headlen(skb))
+ if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
p_off = keys.control.thoff + thlen;
- if (p_off > skb_headlen(skb) ||
+ if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
- if (p_off > skb_headlen(skb))
+ if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
--
2.31.0.208.g409f899ff0-goog
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
WARNING: multiple messages have this Message-ID (diff)
From: Eric Dumazet <eric.dumazet@gmail.com>
To: "David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>
Cc: netdev <netdev@vger.kernel.org>,
Eric Dumazet <edumazet@google.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
virtualization@lists.linux-foundation.org
Subject: [PATCH net] virtio_net: Do not pull payload in skb->head
Date: Fri, 2 Apr 2021 06:26:02 -0700 [thread overview]
Message-ID: <20210402132602.3659282-1-eric.dumazet@gmail.com> (raw)
From: Eric Dumazet <edumazet@google.com>
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
---
drivers/net/virtio_net.c | 10 +++++++---
include/linux/virtio_net.h | 14 +++++++++-----
2 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 82e520d2cb1229a0c7b9fd0def3e4a7135536478..0824e6999e49957f7aaf7c990f6259792d42f32b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
- copy = len;
- if (copy > skb_tailroom(skb))
- copy = skb_tailroom(skb);
+ /* Copy all frame if it fits skb->head, otherwise
+ * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+ */
+ if (len <= skb_tailroom(skb))
+ copy = len;
+ else
+ copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa69632e2c2da30b581a666f7fbb94b64..b465f8f3e554f27ced45c35f54f113cf6dce1f07 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
- u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
- u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+ u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+ u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+ u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+ if (!pskb_may_pull(skb, needed))
+ return -EINVAL;
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
p_off = skb_transport_offset(skb) + thlen;
- if (p_off > skb_headlen(skb))
+ if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
p_off = keys.control.thoff + thlen;
- if (p_off > skb_headlen(skb) ||
+ if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
- if (p_off > skb_headlen(skb))
+ if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
--
2.31.0.208.g409f899ff0-goog
next reply other threads:[~2021-04-02 13:26 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-02 13:26 Eric Dumazet [this message]
2021-04-02 13:26 ` [PATCH net] virtio_net: Do not pull payload in skb->head Eric Dumazet
2021-04-06 2:03 ` Jason Wang
2021-04-06 2:03 ` Jason Wang
2021-04-07 18:09 ` Michael S. Tsirkin
2021-04-07 18:09 ` Michael S. Tsirkin
2021-04-11 13:43 ` Guenter Roeck
2021-04-11 13:43 ` Guenter Roeck
2021-04-11 15:06 ` Eric Dumazet
2021-04-11 20:37 ` Guenter Roeck
2021-04-11 20:37 ` Guenter Roeck
2021-04-11 21:23 ` Eric Dumazet
2021-04-11 21:32 ` Guenter Roeck
2021-04-11 21:32 ` Guenter Roeck
2021-04-11 21:43 ` Eric Dumazet
2021-04-11 22:07 ` Guenter Roeck
2021-04-11 22:07 ` Guenter Roeck
2021-04-12 5:48 ` Eric Dumazet
2021-04-12 5:53 ` Eric Dumazet
2021-04-12 6:23 ` Guenter Roeck
2021-04-12 6:23 ` Guenter Roeck
2021-04-12 6:37 ` Eric Dumazet
2021-04-11 22:20 ` Guenter Roeck
2021-04-11 22:20 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210402132602.3659282-1-eric.dumazet@gmail.com \
--to=eric.dumazet@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=virtualization@lists.linux-foundation.org \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.