* [PATCH net-next v2 0/3] virtio-net: avoid conflicts between XDP and GUEST_CSUM
@ 2023-06-24 12:26 Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP Heng Qi
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Heng Qi @ 2023-06-24 12:26 UTC (permalink / raw)
To: netdev, bpf
Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
virtio-net needs to clear the VIRTIO_NET_F_GUEST_CSUM feature when
loading XDP. The main reason for doing this is because
VIRTIO_NET_F_GUEST_CSUM allows to receive packets marked as
VIRTIO_NET_HDR_F_NEEDS_CSUM. Such packets are not compatible with
XDP programs, because we cannot guarantee that the csum_{start, offset}
fields are correct after XDP modifies the packets.
There is also an existing problem, in the same host vm-vm (eg
[vm]<->[ovs vhost-user]<->[vm]) scenario, loading XDP will cause packet loss.
To solve the above problems, we have discussed in the [1] proposal, and
now try to solve it through the method of reprobing fields suggested
by Jason.
[1] https://lists.oasis-open.org/archives/virtio-dev/202305/msg00318.html
---
v1->v2:
- Squash v1's patch [1/4] and patch [2/4] into v2's patch [1/3]. @Michael S. Tsirkin
- Some minor modifications.
Heng Qi (3):
virtio-net: reprobe csum related fields for skb passed by XDP
virtio-net: support coexistence of XDP and GUEST_CSUM
virtio-net: remove GUEST_CSUM check for XDP loading
drivers/net/virtio_net.c | 181 +++++++++++++++++++++++++++++++++++----
1 file changed, 166 insertions(+), 15 deletions(-)
--
2.19.1.6.gb485710b
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-24 12:26 [PATCH net-next v2 0/3] virtio-net: avoid conflicts between XDP and GUEST_CSUM Heng Qi
@ 2023-06-24 12:26 ` Heng Qi
2023-06-24 19:28 ` Simon Horman
2023-06-25 6:44 ` Jason Wang
2023-06-24 12:26 ` [PATCH net-next v2 2/3] virtio-net: support coexistence of XDP and GUEST_CSUM Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 3/3] virtio-net: remove GUEST_CSUM check for XDP Heng Qi
2 siblings, 2 replies; 11+ messages in thread
From: Heng Qi @ 2023-06-24 12:26 UTC (permalink / raw)
To: netdev, bpf
Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
for netdev) feature of the virtio-net driver conflicts with the loading
of XDP, which is caused by the problem described in [1][2], that is,
XDP may cause errors in partial csumed-related fields which can lead
to packet dropping.
In addition, when communicating between vm and vm on the same host, the
receiving side vm will receive packets marked as
VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
be cleared, causing the packet dropping.
This patch introduces a helper:
1. It will try to solve the above problems in the subsequent patch.
2. It parses UDP/TCP and calculates the pseudo-header checksum
for virtio-net. virtio-net currently does not resolve VLANs nor
SCTP CRC checksum offloading.
[1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
[2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
1 file changed, 136 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5a7f7a76b920..83ab9257043a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -22,6 +22,7 @@
#include <net/route.h>
#include <net/xdp.h>
#include <net/net_failover.h>
+#include <net/ip6_checksum.h>
static int napi_weight = NAPI_POLL_WEIGHT;
module_param(napi_weight, int, 0444);
@@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
}
+static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
+{
+ struct net_device *dev = vi->dev;
+ struct flow_keys_basic keys;
+ struct udphdr *uh;
+ struct tcphdr *th;
+ int len, offset;
+
+ /* The flow dissector needs this information. */
+ skb->dev = dev;
+ skb_reset_mac_header(skb);
+ skb->protocol = dev_parse_header_protocol(skb);
+ /* virtio-net does not need to resolve VLAN. */
+ skb_set_network_header(skb, ETH_HLEN);
+ if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
+ NULL, 0, 0, 0, 0))
+ return -EINVAL;
+
+ /* 1. Pseudo-header checksum calculation requires:
+ * (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
+ * 2. We don't parse SCTP because virtio-net currently doesn't
+ * support CRC checksum offloading for SCTP.
+ */
+ if (keys.basic.n_proto == htons(ETH_P_IP)) {
+ struct iphdr *iph;
+
+ /* Flow dissector has verified that there is an IP header.
+ * So we do not need a pskb_may_pull().
+ */
+ iph = ip_hdr(skb);
+ if (iph->version != 4)
+ return -EINVAL;
+
+ skb->transport_header = skb->mac_header + keys.control.thoff;
+ offset = skb_transport_offset(skb);
+ len = skb->len - offset;
+ if (keys.basic.ip_proto == IPPROTO_UDP) {
+ if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
+ return -EINVAL;
+
+ uh = udp_hdr(skb);
+ skb->csum_offset = offsetof(struct udphdr, check);
+ /* Although uh->len is already the 3rd parameter for the calculation
+ * of the pseudo-header checksum, we have already calculated the
+ * length of the transport layer, so use 'len' here directly.
+ */
+ uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
+ IPPROTO_UDP, 0);
+ } else if (keys.basic.ip_proto == IPPROTO_TCP) {
+ if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
+ return -EINVAL;
+
+ th = tcp_hdr(skb);
+ skb->csum_offset = offsetof(struct tcphdr, check);
+ th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
+ IPPROTO_TCP, 0);
+ } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
+ } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
+ struct ipv6hdr *ip6h;
+
+ ip6h = ipv6_hdr(skb);
+ if (ip6h->version != 6)
+ return -EINVAL;
+
+ /* We have skipped the possible extension headers for IPv6.
+ * If there is a Routing Header, the tx's check value is calculated by
+ * final_dst, and that value is the rx's daddr.
+ */
+ skb->transport_header = skb->mac_header + keys.control.thoff;
+ offset = skb_transport_offset(skb);
+ len = skb->len - offset;
+ if (keys.basic.ip_proto == IPPROTO_UDP) {
+ if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
+ return -EINVAL;
+
+ uh = udp_hdr(skb);
+ skb->csum_offset = offsetof(struct udphdr, check);
+ uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
+ (const struct in6_addr *)&ip6h->daddr,
+ len, IPPROTO_UDP, 0);
+ } else if (keys.basic.ip_proto == IPPROTO_TCP) {
+ if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
+ return -EINVAL;
+
+ th = tcp_hdr(skb);
+ skb->csum_offset = offsetof(struct tcphdr, check);
+ th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
+ (const struct in6_addr *)&ip6h->daddr,
+ len, IPPROTO_TCP, 0);
+ }
+ }
+
+ skb->csum_start = skb->transport_header;
+
+ return 0;
+}
+
+static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
+ struct sk_buff *skb,
+ __u8 flags)
+{
+ int err;
+
+ /* When XDP program is loaded, for example, the vm-vm scenario
+ * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
+ * will travel. Although these packets are safe from the point of
+ * view of the vm, to avoid modification by XDP and successful
+ * forwarding in the upper layer, we re-probe the necessary checksum
+ * related information: skb->csum_{start, offset}, pseudo-header csum.
+ *
+ * This benefits us:
+ * 1. XDP can be loaded when there's _F_GUEST_CSUM.
+ * 2. The device verifies the checksum of packets , especially
+ * benefiting for large packets.
+ * 3. In the same-host vm-vm scenario, packets marked as
+ * VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
+ * processed by XDP.
+ */
+ if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+ err = virtnet_flow_dissect_udp_tcp(vi, skb);
+ if (err < 0)
+ return -EINVAL;
+
+ skb->ip_summed = CHECKSUM_PARTIAL;
+ } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
+ /* We want to benefit from this: XDP guarantees that packets marked
+ * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
+ * are processed.
+ */
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ }
+
+ return 0;
+}
+
static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
void *buf, unsigned int len, void **ctx,
unsigned int *xdp_xmit,
--
2.19.1.6.gb485710b
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net-next v2 2/3] virtio-net: support coexistence of XDP and GUEST_CSUM
2023-06-24 12:26 [PATCH net-next v2 0/3] virtio-net: avoid conflicts between XDP and GUEST_CSUM Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP Heng Qi
@ 2023-06-24 12:26 ` Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 3/3] virtio-net: remove GUEST_CSUM check for XDP Heng Qi
2 siblings, 0 replies; 11+ messages in thread
From: Heng Qi @ 2023-06-24 12:26 UTC (permalink / raw)
To: netdev, bpf
Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
We are now re-probing the csum related fields and trying
to have XDP and RX hw checksum capabilities coexist on the
XDP path. For the benefit of:
1. RX hw checksum capability can be used if XDP is loaded.
2. Avoid packet loss when loading XDP in the vm-vm scenario.
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
drivers/net/virtio_net.c | 41 ++++++++++++++++++++++++++++------------
1 file changed, 29 insertions(+), 12 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 83ab9257043a..7643a188ec37 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1712,6 +1712,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
struct net_device *dev = vi->dev;
struct sk_buff *skb;
struct virtio_net_hdr_mrg_rxbuf *hdr;
+ __u8 flags;
if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
pr_debug("%s: short packet %i\n", dev->name, len);
@@ -1720,6 +1721,13 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
return;
}
+ /* Save the flags of the hdr before XDP processes the data.
+ * It is ok to use this for both mergeable and small modes.
+ * Because that's what we do now.
+ */
+ if (unlikely(vi->xdp_enabled))
+ flags = ((struct virtio_net_hdr_mrg_rxbuf *)buf)->hdr.flags;
+
if (vi->mergeable_rx_bufs)
skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
stats);
@@ -1731,19 +1739,28 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
if (unlikely(!skb))
return;
- hdr = skb_vnet_hdr(skb);
- if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
- virtio_skb_set_hash((const struct virtio_net_hdr_v1_hash *)hdr, skb);
-
- if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
- skb->ip_summed = CHECKSUM_UNNECESSARY;
+ if (unlikely(vi->xdp_enabled)) {
+ if (virtnet_set_csum_after_xdp(vi, skb, flags) < 0) {
+ pr_debug("%s: errors occurred in flow dissector setting csum",
+ dev->name);
+ goto frame_err;
+ }
- if (virtio_net_hdr_to_skb(skb, &hdr->hdr,
- virtio_is_little_endian(vi->vdev))) {
- net_warn_ratelimited("%s: bad gso: type: %u, size: %u\n",
- dev->name, hdr->hdr.gso_type,
- hdr->hdr.gso_size);
- goto frame_err;
+ } else {
+ hdr = skb_vnet_hdr(skb);
+ if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
+ virtio_skb_set_hash((const struct virtio_net_hdr_v1_hash *)hdr, skb);
+
+ if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+ if (virtio_net_hdr_to_skb(skb, &hdr->hdr,
+ virtio_is_little_endian(vi->vdev))) {
+ net_warn_ratelimited("%s: bad gso: type: %u, size: %u\n",
+ dev->name, hdr->hdr.gso_type,
+ hdr->hdr.gso_size);
+ goto frame_err;
+ }
}
skb_record_rx_queue(skb, vq2rxq(rq->vq));
--
2.19.1.6.gb485710b
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net-next v2 3/3] virtio-net: remove GUEST_CSUM check for XDP
2023-06-24 12:26 [PATCH net-next v2 0/3] virtio-net: avoid conflicts between XDP and GUEST_CSUM Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 2/3] virtio-net: support coexistence of XDP and GUEST_CSUM Heng Qi
@ 2023-06-24 12:26 ` Heng Qi
2 siblings, 0 replies; 11+ messages in thread
From: Heng Qi @ 2023-06-24 12:26 UTC (permalink / raw)
To: netdev, bpf
Cc: Michael S . Tsirkin, Jason Wang, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
XDP and GUEST_CSUM no longer conflict now, so we removed the
check for GUEST_CSUM for XDP loading/unloading.
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v1->v2:
- Rewrite the commit log.
drivers/net/virtio_net.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7643a188ec37..df7cca4d950f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -61,7 +61,6 @@ static const unsigned long guest_offloads[] = {
VIRTIO_NET_F_GUEST_TSO6,
VIRTIO_NET_F_GUEST_ECN,
VIRTIO_NET_F_GUEST_UFO,
- VIRTIO_NET_F_GUEST_CSUM,
VIRTIO_NET_F_GUEST_USO4,
VIRTIO_NET_F_GUEST_USO6,
VIRTIO_NET_F_GUEST_HDRLEN
@@ -3530,10 +3529,9 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO) ||
- virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_USO4) ||
virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_USO6))) {
- NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing GRO_HW/CSUM, disable GRO_HW/CSUM first");
+ NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing GRO_HW, disable GRO_HW first");
return -EOPNOTSUPP;
}
--
2.19.1.6.gb485710b
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-24 12:26 ` [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP Heng Qi
@ 2023-06-24 19:28 ` Simon Horman
2023-06-25 2:17 ` Heng Qi
2023-06-25 6:44 ` Jason Wang
1 sibling, 1 reply; 11+ messages in thread
From: Simon Horman @ 2023-06-24 19:28 UTC (permalink / raw)
To: Heng Qi
Cc: netdev, bpf, Michael S . Tsirkin, Jason Wang, Xuan Zhuo,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend
On Sat, Jun 24, 2023 at 08:26:02PM +0800, Heng Qi wrote:
...
> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> + struct sk_buff *skb,
> + __u8 flags)
Hi Heng Qi,
Unfortunately this appears to break an x86_64 allmodconfig build
with GCC 12.3.0:
drivers/net/virtio_net.c:1671:12: error: 'virtnet_set_csum_after_xdp' defined but not used [-Werror=unused-function]
1671 | static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
--
pw-bot: changes-requested
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-24 19:28 ` Simon Horman
@ 2023-06-25 2:17 ` Heng Qi
2023-06-25 21:53 ` Michael S. Tsirkin
0 siblings, 1 reply; 11+ messages in thread
From: Heng Qi @ 2023-06-25 2:17 UTC (permalink / raw)
To: Simon Horman
Cc: netdev, bpf, Michael S . Tsirkin, Jason Wang, Xuan Zhuo,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend
在 2023/6/25 上午3:28, Simon Horman 写道:
> On Sat, Jun 24, 2023 at 08:26:02PM +0800, Heng Qi wrote:
>
> ...
>
>> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
>> + struct sk_buff *skb,
>> + __u8 flags)
> Hi Heng Qi,
>
> Unfortunately this appears to break an x86_64 allmodconfig build
> with GCC 12.3.0:
>
> drivers/net/virtio_net.c:1671:12: error: 'virtnet_set_csum_after_xdp' defined but not used [-Werror=unused-function]
I admit that this is a patch set, you can cherry-pick patches [1/3] and
[2/3] together
to make it compile without 'defined but not used' warning. If people
don't want to
separate [1/3] and [2/3], I can squash them into one. :)
Thanks.
> 1671 | static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~
>
> --
> pw-bot: changes-requested
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-24 12:26 ` [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP Heng Qi
2023-06-24 19:28 ` Simon Horman
@ 2023-06-25 6:44 ` Jason Wang
2023-06-25 7:27 ` Heng Qi
1 sibling, 1 reply; 11+ messages in thread
From: Jason Wang @ 2023-06-25 6:44 UTC (permalink / raw)
To: Heng Qi
Cc: netdev, bpf, Michael S . Tsirkin, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
> for netdev) feature of the virtio-net driver conflicts with the loading
> of XDP, which is caused by the problem described in [1][2], that is,
> XDP may cause errors in partial csumed-related fields which can lead
> to packet dropping.
>
> In addition, when communicating between vm and vm on the same host, the
> receiving side vm will receive packets marked as
> VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
> XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
> be cleared, causing the packet dropping.
>
> This patch introduces a helper:
> 1. It will try to solve the above problems in the subsequent patch.
> 2. It parses UDP/TCP and calculates the pseudo-header checksum
> for virtio-net. virtio-net currently does not resolve VLANs nor
> SCTP CRC checksum offloading.
Do we need to bother? Can we simply use skb_probe_transport_header()
and skb_checksum_help() which can simplify a lot of things?
Thanks
>
> [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
> [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 136 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5a7f7a76b920..83ab9257043a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -22,6 +22,7 @@
> #include <net/route.h>
> #include <net/xdp.h>
> #include <net/net_failover.h>
> +#include <net/ip6_checksum.h>
>
> static int napi_weight = NAPI_POLL_WEIGHT;
> module_param(napi_weight, int, 0444);
> @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
> skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
> }
>
> +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
> +{
> + struct net_device *dev = vi->dev;
> + struct flow_keys_basic keys;
> + struct udphdr *uh;
> + struct tcphdr *th;
> + int len, offset;
> +
> + /* The flow dissector needs this information. */
> + skb->dev = dev;
> + skb_reset_mac_header(skb);
> + skb->protocol = dev_parse_header_protocol(skb);
> + /* virtio-net does not need to resolve VLAN. */
> + skb_set_network_header(skb, ETH_HLEN);
> + if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> + NULL, 0, 0, 0, 0))
> + return -EINVAL;
> +
> + /* 1. Pseudo-header checksum calculation requires:
> + * (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
> + * 2. We don't parse SCTP because virtio-net currently doesn't
> + * support CRC checksum offloading for SCTP.
> + */
> + if (keys.basic.n_proto == htons(ETH_P_IP)) {
> + struct iphdr *iph;
> +
> + /* Flow dissector has verified that there is an IP header.
> + * So we do not need a pskb_may_pull().
> + */
> + iph = ip_hdr(skb);
> + if (iph->version != 4)
> + return -EINVAL;
> +
> + skb->transport_header = skb->mac_header + keys.control.thoff;
> + offset = skb_transport_offset(skb);
> + len = skb->len - offset;
> + if (keys.basic.ip_proto == IPPROTO_UDP) {
> + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> + return -EINVAL;
> +
> + uh = udp_hdr(skb);
> + skb->csum_offset = offsetof(struct udphdr, check);
> + /* Although uh->len is already the 3rd parameter for the calculation
> + * of the pseudo-header checksum, we have already calculated the
> + * length of the transport layer, so use 'len' here directly.
> + */
> + uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> + IPPROTO_UDP, 0);
> + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> + return -EINVAL;
> +
> + th = tcp_hdr(skb);
> + skb->csum_offset = offsetof(struct tcphdr, check);
> + th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> + IPPROTO_TCP, 0);
> + } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
> + } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
> + struct ipv6hdr *ip6h;
> +
> + ip6h = ipv6_hdr(skb);
> + if (ip6h->version != 6)
> + return -EINVAL;
> +
> + /* We have skipped the possible extension headers for IPv6.
> + * If there is a Routing Header, the tx's check value is calculated by
> + * final_dst, and that value is the rx's daddr.
> + */
> + skb->transport_header = skb->mac_header + keys.control.thoff;
> + offset = skb_transport_offset(skb);
> + len = skb->len - offset;
> + if (keys.basic.ip_proto == IPPROTO_UDP) {
> + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> + return -EINVAL;
> +
> + uh = udp_hdr(skb);
> + skb->csum_offset = offsetof(struct udphdr, check);
> + uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> + (const struct in6_addr *)&ip6h->daddr,
> + len, IPPROTO_UDP, 0);
> + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> + return -EINVAL;
> +
> + th = tcp_hdr(skb);
> + skb->csum_offset = offsetof(struct tcphdr, check);
> + th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> + (const struct in6_addr *)&ip6h->daddr,
> + len, IPPROTO_TCP, 0);
> + }
> + }
> +
> + skb->csum_start = skb->transport_header;
> +
> + return 0;
> +}
> +
> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> + struct sk_buff *skb,
> + __u8 flags)
> +{
> + int err;
> +
> + /* When XDP program is loaded, for example, the vm-vm scenario
> + * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
> + * will travel. Although these packets are safe from the point of
> + * view of the vm, to avoid modification by XDP and successful
> + * forwarding in the upper layer, we re-probe the necessary checksum
> + * related information: skb->csum_{start, offset}, pseudo-header csum.
> + *
> + * This benefits us:
> + * 1. XDP can be loaded when there's _F_GUEST_CSUM.
> + * 2. The device verifies the checksum of packets , especially
> + * benefiting for large packets.
> + * 3. In the same-host vm-vm scenario, packets marked as
> + * VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
> + * processed by XDP.
> + */
> + if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> + err = virtnet_flow_dissect_udp_tcp(vi, skb);
> + if (err < 0)
> + return -EINVAL;
> +
> + skb->ip_summed = CHECKSUM_PARTIAL;
> + } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
> + /* We want to benefit from this: XDP guarantees that packets marked
> + * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
> + * are processed.
> + */
> + skb->ip_summed = CHECKSUM_UNNECESSARY;
> + }
> +
> + return 0;
> +}
> +
> static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> void *buf, unsigned int len, void **ctx,
> unsigned int *xdp_xmit,
> --
> 2.19.1.6.gb485710b
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-25 6:44 ` Jason Wang
@ 2023-06-25 7:27 ` Heng Qi
2023-06-26 2:09 ` Jason Wang
0 siblings, 1 reply; 11+ messages in thread
From: Heng Qi @ 2023-06-25 7:27 UTC (permalink / raw)
To: Jason Wang
Cc: netdev, bpf, Michael S . Tsirkin, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
On Sun, Jun 25, 2023 at 02:44:07PM +0800, Jason Wang wrote:
> On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> > Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
> > for netdev) feature of the virtio-net driver conflicts with the loading
> > of XDP, which is caused by the problem described in [1][2], that is,
> > XDP may cause errors in partial csumed-related fields which can lead
> > to packet dropping.
> >
> > In addition, when communicating between vm and vm on the same host, the
> > receiving side vm will receive packets marked as
> > VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
> > XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
> > be cleared, causing the packet dropping.
> >
> > This patch introduces a helper:
> > 1. It will try to solve the above problems in the subsequent patch.
> > 2. It parses UDP/TCP and calculates the pseudo-header checksum
> > for virtio-net. virtio-net currently does not resolve VLANs nor
> > SCTP CRC checksum offloading.
>
> Do we need to bother? Can we simply use skb_probe_transport_header()
> and skb_checksum_help() which can simplify a lot of things?
We need to. We only compute partial checksums (not complete checksums) for
packets marked as NEEDS_CSUM. Please see skb_csum_unnecessary(), which will
consider packets with the partial checksum to be valid by the
protocol stack (Because it believes that the communication between the
VMs of the same host is reliable.).
In order to calculate the pseudo-header checksum, we need the IP address and the
\field{check} position of the transport layer header.
skb_probe_transport_header() only finds out the location of the
transport header and does not provide us with other information we
need. skb_checksum_help() is to calculate the complete checksum on the
tx path, which is not our purpose.
Thanks!
>
> Thanks
>
> >
> > [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
> > [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
> >
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> > drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 136 insertions(+)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 5a7f7a76b920..83ab9257043a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -22,6 +22,7 @@
> > #include <net/route.h>
> > #include <net/xdp.h>
> > #include <net/net_failover.h>
> > +#include <net/ip6_checksum.h>
> >
> > static int napi_weight = NAPI_POLL_WEIGHT;
> > module_param(napi_weight, int, 0444);
> > @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
> > skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
> > }
> >
> > +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
> > +{
> > + struct net_device *dev = vi->dev;
> > + struct flow_keys_basic keys;
> > + struct udphdr *uh;
> > + struct tcphdr *th;
> > + int len, offset;
> > +
> > + /* The flow dissector needs this information. */
> > + skb->dev = dev;
> > + skb_reset_mac_header(skb);
> > + skb->protocol = dev_parse_header_protocol(skb);
> > + /* virtio-net does not need to resolve VLAN. */
> > + skb_set_network_header(skb, ETH_HLEN);
> > + if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> > + NULL, 0, 0, 0, 0))
> > + return -EINVAL;
> > +
> > + /* 1. Pseudo-header checksum calculation requires:
> > + * (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
> > + * 2. We don't parse SCTP because virtio-net currently doesn't
> > + * support CRC checksum offloading for SCTP.
> > + */
> > + if (keys.basic.n_proto == htons(ETH_P_IP)) {
> > + struct iphdr *iph;
> > +
> > + /* Flow dissector has verified that there is an IP header.
> > + * So we do not need a pskb_may_pull().
> > + */
> > + iph = ip_hdr(skb);
> > + if (iph->version != 4)
> > + return -EINVAL;
> > +
> > + skb->transport_header = skb->mac_header + keys.control.thoff;
> > + offset = skb_transport_offset(skb);
> > + len = skb->len - offset;
> > + if (keys.basic.ip_proto == IPPROTO_UDP) {
> > + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > + return -EINVAL;
> > +
> > + uh = udp_hdr(skb);
> > + skb->csum_offset = offsetof(struct udphdr, check);
> > + /* Although uh->len is already the 3rd parameter for the calculation
> > + * of the pseudo-header checksum, we have already calculated the
> > + * length of the transport layer, so use 'len' here directly.
> > + */
> > + uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > + IPPROTO_UDP, 0);
> > + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > + return -EINVAL;
> > +
> > + th = tcp_hdr(skb);
> > + skb->csum_offset = offsetof(struct tcphdr, check);
> > + th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > + IPPROTO_TCP, 0);
> > + } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
> > + } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
> > + struct ipv6hdr *ip6h;
> > +
> > + ip6h = ipv6_hdr(skb);
> > + if (ip6h->version != 6)
> > + return -EINVAL;
> > +
> > + /* We have skipped the possible extension headers for IPv6.
> > + * If there is a Routing Header, the tx's check value is calculated by
> > + * final_dst, and that value is the rx's daddr.
> > + */
> > + skb->transport_header = skb->mac_header + keys.control.thoff;
> > + offset = skb_transport_offset(skb);
> > + len = skb->len - offset;
> > + if (keys.basic.ip_proto == IPPROTO_UDP) {
> > + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > + return -EINVAL;
> > +
> > + uh = udp_hdr(skb);
> > + skb->csum_offset = offsetof(struct udphdr, check);
> > + uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > + (const struct in6_addr *)&ip6h->daddr,
> > + len, IPPROTO_UDP, 0);
> > + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > + return -EINVAL;
> > +
> > + th = tcp_hdr(skb);
> > + skb->csum_offset = offsetof(struct tcphdr, check);
> > + th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > + (const struct in6_addr *)&ip6h->daddr,
> > + len, IPPROTO_TCP, 0);
> > + }
> > + }
> > +
> > + skb->csum_start = skb->transport_header;
> > +
> > + return 0;
> > +}
> > +
> > +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > + struct sk_buff *skb,
> > + __u8 flags)
> > +{
> > + int err;
> > +
> > + /* When XDP program is loaded, for example, the vm-vm scenario
> > + * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
> > + * will travel. Although these packets are safe from the point of
> > + * view of the vm, to avoid modification by XDP and successful
> > + * forwarding in the upper layer, we re-probe the necessary checksum
> > + * related information: skb->csum_{start, offset}, pseudo-header csum.
> > + *
> > + * This benefits us:
> > + * 1. XDP can be loaded when there's _F_GUEST_CSUM.
> > + * 2. The device verifies the checksum of packets , especially
> > + * benefiting for large packets.
> > + * 3. In the same-host vm-vm scenario, packets marked as
> > + * VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
> > + * processed by XDP.
> > + */
> > + if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > + err = virtnet_flow_dissect_udp_tcp(vi, skb);
> > + if (err < 0)
> > + return -EINVAL;
> > +
> > + skb->ip_summed = CHECKSUM_PARTIAL;
> > + } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
> > + /* We want to benefit from this: XDP guarantees that packets marked
> > + * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
> > + * are processed.
> > + */
> > + skb->ip_summed = CHECKSUM_UNNECESSARY;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > void *buf, unsigned int len, void **ctx,
> > unsigned int *xdp_xmit,
> > --
> > 2.19.1.6.gb485710b
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-25 2:17 ` Heng Qi
@ 2023-06-25 21:53 ` Michael S. Tsirkin
0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2023-06-25 21:53 UTC (permalink / raw)
To: Heng Qi
Cc: Simon Horman, netdev, bpf, Jason Wang, Xuan Zhuo,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend
On Sun, Jun 25, 2023 at 10:17:15AM +0800, Heng Qi wrote:
>
>
> 在 2023/6/25 上午3:28, Simon Horman 写道:
> > On Sat, Jun 24, 2023 at 08:26:02PM +0800, Heng Qi wrote:
> >
> > ...
> >
> > > +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > > + struct sk_buff *skb,
> > > + __u8 flags)
> > Hi Heng Qi,
> >
> > Unfortunately this appears to break an x86_64 allmodconfig build
> > with GCC 12.3.0:
> >
> > drivers/net/virtio_net.c:1671:12: error: 'virtnet_set_csum_after_xdp' defined but not used [-Werror=unused-function]
>
> I admit that this is a patch set, you can cherry-pick patches [1/3] and
> [2/3] together
> to make it compile without 'defined but not used' warning. If people don't
> want to
> separate [1/3] and [2/3], I can squash them into one. :)
>
> Thanks.
the answer is yes, do not break the build.
> > 1671 | static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > | ^~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > --
> > pw-bot: changes-requested
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-25 7:27 ` Heng Qi
@ 2023-06-26 2:09 ` Jason Wang
2023-06-26 3:34 ` Heng Qi
0 siblings, 1 reply; 11+ messages in thread
From: Jason Wang @ 2023-06-26 2:09 UTC (permalink / raw)
To: Heng Qi
Cc: netdev, bpf, Michael S . Tsirkin, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
On Sun, Jun 25, 2023 at 3:28 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> On Sun, Jun 25, 2023 at 02:44:07PM +0800, Jason Wang wrote:
> > On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > >
> > > Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
> > > for netdev) feature of the virtio-net driver conflicts with the loading
> > > of XDP, which is caused by the problem described in [1][2], that is,
> > > XDP may cause errors in partial csumed-related fields which can lead
> > > to packet dropping.
> > >
> > > In addition, when communicating between vm and vm on the same host, the
> > > receiving side vm will receive packets marked as
> > > VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
> > > XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
> > > be cleared, causing the packet dropping.
> > >
> > > This patch introduces a helper:
> > > 1. It will try to solve the above problems in the subsequent patch.
> > > 2. It parses UDP/TCP and calculates the pseudo-header checksum
> > > for virtio-net. virtio-net currently does not resolve VLANs nor
> > > SCTP CRC checksum offloading.
> >
> > Do we need to bother? Can we simply use skb_probe_transport_header()
> > and skb_checksum_help() which can simplify a lot of things?
>
> We need to. We only compute partial checksums (not complete checksums) for
> packets marked as NEEDS_CSUM. Please see skb_csum_unnecessary(), which will
> consider packets with the partial checksum to be valid by the
> protocol stack (Because it believes that the communication between the
> VMs of the same host is reliable.).
>
> In order to calculate the pseudo-header checksum, we need the IP address and the
> \field{check} position of the transport layer header.
> skb_probe_transport_header() only finds out the location of the
> transport header and does not provide us with other information we
> need. skb_checksum_help() is to calculate the complete checksum on the
> tx path, which is not our purpose.
A typo in my previous reply, actually, I meant skb_checksum_setup().
I think the point is to have a general purpose helper in the core
instead of duplicating codes in any specific driver.
Thanks
>
> Thanks!
>
> >
> > Thanks
> >
> > >
> > > [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
> > > [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
> > >
> > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > > drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 136 insertions(+)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 5a7f7a76b920..83ab9257043a 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -22,6 +22,7 @@
> > > #include <net/route.h>
> > > #include <net/xdp.h>
> > > #include <net/net_failover.h>
> > > +#include <net/ip6_checksum.h>
> > >
> > > static int napi_weight = NAPI_POLL_WEIGHT;
> > > module_param(napi_weight, int, 0444);
> > > @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
> > > skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
> > > }
> > >
> > > +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
> > > +{
> > > + struct net_device *dev = vi->dev;
> > > + struct flow_keys_basic keys;
> > > + struct udphdr *uh;
> > > + struct tcphdr *th;
> > > + int len, offset;
> > > +
> > > + /* The flow dissector needs this information. */
> > > + skb->dev = dev;
> > > + skb_reset_mac_header(skb);
> > > + skb->protocol = dev_parse_header_protocol(skb);
> > > + /* virtio-net does not need to resolve VLAN. */
> > > + skb_set_network_header(skb, ETH_HLEN);
> > > + if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> > > + NULL, 0, 0, 0, 0))
> > > + return -EINVAL;
> > > +
> > > + /* 1. Pseudo-header checksum calculation requires:
> > > + * (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
> > > + * 2. We don't parse SCTP because virtio-net currently doesn't
> > > + * support CRC checksum offloading for SCTP.
> > > + */
> > > + if (keys.basic.n_proto == htons(ETH_P_IP)) {
> > > + struct iphdr *iph;
> > > +
> > > + /* Flow dissector has verified that there is an IP header.
> > > + * So we do not need a pskb_may_pull().
> > > + */
> > > + iph = ip_hdr(skb);
> > > + if (iph->version != 4)
> > > + return -EINVAL;
> > > +
> > > + skb->transport_header = skb->mac_header + keys.control.thoff;
> > > + offset = skb_transport_offset(skb);
> > > + len = skb->len - offset;
> > > + if (keys.basic.ip_proto == IPPROTO_UDP) {
> > > + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > > + return -EINVAL;
> > > +
> > > + uh = udp_hdr(skb);
> > > + skb->csum_offset = offsetof(struct udphdr, check);
> > > + /* Although uh->len is already the 3rd parameter for the calculation
> > > + * of the pseudo-header checksum, we have already calculated the
> > > + * length of the transport layer, so use 'len' here directly.
> > > + */
> > > + uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > > + IPPROTO_UDP, 0);
> > > + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > > + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > > + return -EINVAL;
> > > +
> > > + th = tcp_hdr(skb);
> > > + skb->csum_offset = offsetof(struct tcphdr, check);
> > > + th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
> > > + IPPROTO_TCP, 0);
> > > + } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
> > > + } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
> > > + struct ipv6hdr *ip6h;
> > > +
> > > + ip6h = ipv6_hdr(skb);
> > > + if (ip6h->version != 6)
> > > + return -EINVAL;
> > > +
> > > + /* We have skipped the possible extension headers for IPv6.
> > > + * If there is a Routing Header, the tx's check value is calculated by
> > > + * final_dst, and that value is the rx's daddr.
> > > + */
> > > + skb->transport_header = skb->mac_header + keys.control.thoff;
> > > + offset = skb_transport_offset(skb);
> > > + len = skb->len - offset;
> > > + if (keys.basic.ip_proto == IPPROTO_UDP) {
> > > + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
> > > + return -EINVAL;
> > > +
> > > + uh = udp_hdr(skb);
> > > + skb->csum_offset = offsetof(struct udphdr, check);
> > > + uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > > + (const struct in6_addr *)&ip6h->daddr,
> > > + len, IPPROTO_UDP, 0);
> > > + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
> > > + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
> > > + return -EINVAL;
> > > +
> > > + th = tcp_hdr(skb);
> > > + skb->csum_offset = offsetof(struct tcphdr, check);
> > > + th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
> > > + (const struct in6_addr *)&ip6h->daddr,
> > > + len, IPPROTO_TCP, 0);
> > > + }
> > > + }
> > > +
> > > + skb->csum_start = skb->transport_header;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
> > > + struct sk_buff *skb,
> > > + __u8 flags)
> > > +{
> > > + int err;
> > > +
> > > + /* When XDP program is loaded, for example, the vm-vm scenario
> > > + * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
> > > + * will travel. Although these packets are safe from the point of
> > > + * view of the vm, to avoid modification by XDP and successful
> > > + * forwarding in the upper layer, we re-probe the necessary checksum
> > > + * related information: skb->csum_{start, offset}, pseudo-header csum.
> > > + *
> > > + * This benefits us:
> > > + * 1. XDP can be loaded when there's _F_GUEST_CSUM.
> > > + * 2. The device verifies the checksum of packets , especially
> > > + * benefiting for large packets.
> > > + * 3. In the same-host vm-vm scenario, packets marked as
> > > + * VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
> > > + * processed by XDP.
> > > + */
> > > + if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > > + err = virtnet_flow_dissect_udp_tcp(vi, skb);
> > > + if (err < 0)
> > > + return -EINVAL;
> > > +
> > > + skb->ip_summed = CHECKSUM_PARTIAL;
> > > + } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
> > > + /* We want to benefit from this: XDP guarantees that packets marked
> > > + * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
> > > + * are processed.
> > > + */
> > > + skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > > void *buf, unsigned int len, void **ctx,
> > > unsigned int *xdp_xmit,
> > > --
> > > 2.19.1.6.gb485710b
> > >
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP
2023-06-26 2:09 ` Jason Wang
@ 2023-06-26 3:34 ` Heng Qi
0 siblings, 0 replies; 11+ messages in thread
From: Heng Qi @ 2023-06-26 3:34 UTC (permalink / raw)
To: Jason Wang
Cc: netdev, bpf, Michael S . Tsirkin, Xuan Zhuo, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
在 2023/6/26 上午10:09, Jason Wang 写道:
> On Sun, Jun 25, 2023 at 3:28 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>> On Sun, Jun 25, 2023 at 02:44:07PM +0800, Jason Wang wrote:
>>> On Sat, Jun 24, 2023 at 8:26 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>> Currently, the VIRTIO_NET_F_GUEST_CSUM (corresponds to NETIF_F_RXCSUM
>>>> for netdev) feature of the virtio-net driver conflicts with the loading
>>>> of XDP, which is caused by the problem described in [1][2], that is,
>>>> XDP may cause errors in partial csumed-related fields which can lead
>>>> to packet dropping.
>>>>
>>>> In addition, when communicating between vm and vm on the same host, the
>>>> receiving side vm will receive packets marked as
>>>> VIRTIO_NET_HDR_F_NEEDS_CSUM, but after these packets are processed by
>>>> XDP, the VIRTIO_NET_HDR_F_NEEDS_CSUM and skb CHECKSUM_PARTIAL flags will
>>>> be cleared, causing the packet dropping.
>>>>
>>>> This patch introduces a helper:
>>>> 1. It will try to solve the above problems in the subsequent patch.
>>>> 2. It parses UDP/TCP and calculates the pseudo-header checksum
>>>> for virtio-net. virtio-net currently does not resolve VLANs nor
>>>> SCTP CRC checksum offloading.
>>> Do we need to bother? Can we simply use skb_probe_transport_header()
>>> and skb_checksum_help() which can simplify a lot of things?
>> We need to. We only compute partial checksums (not complete checksums) for
>> packets marked as NEEDS_CSUM. Please see skb_csum_unnecessary(), which will
>> consider packets with the partial checksum to be valid by the
>> protocol stack (Because it believes that the communication between the
>> VMs of the same host is reliable.).
>>
>> In order to calculate the pseudo-header checksum, we need the IP address and the
>> \field{check} position of the transport layer header.
>> skb_probe_transport_header() only finds out the location of the
>> transport header and does not provide us with other information we
>> need. skb_checksum_help() is to calculate the complete checksum on the
>> tx path, which is not our purpose.
> A typo in my previous reply, actually, I meant skb_checksum_setup().
I think this could work for us! I'm going to modify some of the code to
make it work for us. ^^
>
> I think the point is to have a general purpose helper in the core
> instead of duplicating codes in any specific driver.
Absolutely.
Thanks!
>
> Thanks
>
>> Thanks!
>>
>>> Thanks
>>>
>>>> [1] commit 18ba58e1c234 ("virtio-net: fail XDP set if guest csum is negotiated")
>>>> [2] commit e59ff2c49ae1 ("virtio-net: disable guest csum during XDP set")
>>>>
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>> ---
>>>> drivers/net/virtio_net.c | 136 +++++++++++++++++++++++++++++++++++++++
>>>> 1 file changed, 136 insertions(+)
>>>>
>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>> index 5a7f7a76b920..83ab9257043a 100644
>>>> --- a/drivers/net/virtio_net.c
>>>> +++ b/drivers/net/virtio_net.c
>>>> @@ -22,6 +22,7 @@
>>>> #include <net/route.h>
>>>> #include <net/xdp.h>
>>>> #include <net/net_failover.h>
>>>> +#include <net/ip6_checksum.h>
>>>>
>>>> static int napi_weight = NAPI_POLL_WEIGHT;
>>>> module_param(napi_weight, int, 0444);
>>>> @@ -1568,6 +1569,141 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
>>>> skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
>>>> }
>>>>
>>>> +static int virtnet_flow_dissect_udp_tcp(struct virtnet_info *vi, struct sk_buff *skb)
>>>> +{
>>>> + struct net_device *dev = vi->dev;
>>>> + struct flow_keys_basic keys;
>>>> + struct udphdr *uh;
>>>> + struct tcphdr *th;
>>>> + int len, offset;
>>>> +
>>>> + /* The flow dissector needs this information. */
>>>> + skb->dev = dev;
>>>> + skb_reset_mac_header(skb);
>>>> + skb->protocol = dev_parse_header_protocol(skb);
>>>> + /* virtio-net does not need to resolve VLAN. */
>>>> + skb_set_network_header(skb, ETH_HLEN);
>>>> + if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
>>>> + NULL, 0, 0, 0, 0))
>>>> + return -EINVAL;
>>>> +
>>>> + /* 1. Pseudo-header checksum calculation requires:
>>>> + * (1) saddr/daddr (2) IP_PROTO (3) length of transport payload
>>>> + * 2. We don't parse SCTP because virtio-net currently doesn't
>>>> + * support CRC checksum offloading for SCTP.
>>>> + */
>>>> + if (keys.basic.n_proto == htons(ETH_P_IP)) {
>>>> + struct iphdr *iph;
>>>> +
>>>> + /* Flow dissector has verified that there is an IP header.
>>>> + * So we do not need a pskb_may_pull().
>>>> + */
>>>> + iph = ip_hdr(skb);
>>>> + if (iph->version != 4)
>>>> + return -EINVAL;
>>>> +
>>>> + skb->transport_header = skb->mac_header + keys.control.thoff;
>>>> + offset = skb_transport_offset(skb);
>>>> + len = skb->len - offset;
>>>> + if (keys.basic.ip_proto == IPPROTO_UDP) {
>>>> + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
>>>> + return -EINVAL;
>>>> +
>>>> + uh = udp_hdr(skb);
>>>> + skb->csum_offset = offsetof(struct udphdr, check);
>>>> + /* Although uh->len is already the 3rd parameter for the calculation
>>>> + * of the pseudo-header checksum, we have already calculated the
>>>> + * length of the transport layer, so use 'len' here directly.
>>>> + */
>>>> + uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
>>>> + IPPROTO_UDP, 0);
>>>> + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
>>>> + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
>>>> + return -EINVAL;
>>>> +
>>>> + th = tcp_hdr(skb);
>>>> + skb->csum_offset = offsetof(struct tcphdr, check);
>>>> + th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, len,
>>>> + IPPROTO_TCP, 0);
>>>> + } /* virtio-net doesn't support checksums for SCTP crc hw offloading.*/
>>>> + } else if (keys.basic.n_proto == htons(ETH_P_IPV6)) {
>>>> + struct ipv6hdr *ip6h;
>>>> +
>>>> + ip6h = ipv6_hdr(skb);
>>>> + if (ip6h->version != 6)
>>>> + return -EINVAL;
>>>> +
>>>> + /* We have skipped the possible extension headers for IPv6.
>>>> + * If there is a Routing Header, the tx's check value is calculated by
>>>> + * final_dst, and that value is the rx's daddr.
>>>> + */
>>>> + skb->transport_header = skb->mac_header + keys.control.thoff;
>>>> + offset = skb_transport_offset(skb);
>>>> + len = skb->len - offset;
>>>> + if (keys.basic.ip_proto == IPPROTO_UDP) {
>>>> + if (!pskb_may_pull(skb, offset + sizeof(struct udphdr)))
>>>> + return -EINVAL;
>>>> +
>>>> + uh = udp_hdr(skb);
>>>> + skb->csum_offset = offsetof(struct udphdr, check);
>>>> + uh->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
>>>> + (const struct in6_addr *)&ip6h->daddr,
>>>> + len, IPPROTO_UDP, 0);
>>>> + } else if (keys.basic.ip_proto == IPPROTO_TCP) {
>>>> + if (!pskb_may_pull(skb, offset + sizeof(struct tcphdr)))
>>>> + return -EINVAL;
>>>> +
>>>> + th = tcp_hdr(skb);
>>>> + skb->csum_offset = offsetof(struct tcphdr, check);
>>>> + th->check = ~csum_ipv6_magic((const struct in6_addr *)&ip6h->saddr,
>>>> + (const struct in6_addr *)&ip6h->daddr,
>>>> + len, IPPROTO_TCP, 0);
>>>> + }
>>>> + }
>>>> +
>>>> + skb->csum_start = skb->transport_header;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
>>>> + struct sk_buff *skb,
>>>> + __u8 flags)
>>>> +{
>>>> + int err;
>>>> +
>>>> + /* When XDP program is loaded, for example, the vm-vm scenario
>>>> + * on the same host, packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM
>>>> + * will travel. Although these packets are safe from the point of
>>>> + * view of the vm, to avoid modification by XDP and successful
>>>> + * forwarding in the upper layer, we re-probe the necessary checksum
>>>> + * related information: skb->csum_{start, offset}, pseudo-header csum.
>>>> + *
>>>> + * This benefits us:
>>>> + * 1. XDP can be loaded when there's _F_GUEST_CSUM.
>>>> + * 2. The device verifies the checksum of packets , especially
>>>> + * benefiting for large packets.
>>>> + * 3. In the same-host vm-vm scenario, packets marked as
>>>> + * VIRTIO_NET_HDR_F_NEEDS_CSUM are no longer dropped after being
>>>> + * processed by XDP.
>>>> + */
>>>> + if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
>>>> + err = virtnet_flow_dissect_udp_tcp(vi, skb);
>>>> + if (err < 0)
>>>> + return -EINVAL;
>>>> +
>>>> + skb->ip_summed = CHECKSUM_PARTIAL;
>>>> + } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
>>>> + /* We want to benefit from this: XDP guarantees that packets marked
>>>> + * as VIRTIO_NET_HDR_F_DATA_VALID still have correct csum after they
>>>> + * are processed.
>>>> + */
>>>> + skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>>>> void *buf, unsigned int len, void **ctx,
>>>> unsigned int *xdp_xmit,
>>>> --
>>>> 2.19.1.6.gb485710b
>>>>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-06-26 3:34 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-24 12:26 [PATCH net-next v2 0/3] virtio-net: avoid conflicts between XDP and GUEST_CSUM Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 1/3] virtio-net: reprobe csum related fields for skb passed by XDP Heng Qi
2023-06-24 19:28 ` Simon Horman
2023-06-25 2:17 ` Heng Qi
2023-06-25 21:53 ` Michael S. Tsirkin
2023-06-25 6:44 ` Jason Wang
2023-06-25 7:27 ` Heng Qi
2023-06-26 2:09 ` Jason Wang
2023-06-26 3:34 ` Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 2/3] virtio-net: support coexistence of XDP and GUEST_CSUM Heng Qi
2023-06-24 12:26 ` [PATCH net-next v2 3/3] virtio-net: remove GUEST_CSUM check for XDP Heng Qi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).