From: Heng Qi <hengqi@linux.alibaba.com>
To: Jason Wang <jasowang@redhat.com>
Cc: netdev@vger.kernel.org, virtualization@lists.linux.dev,
"Thomas Huth" <thuth@linux.vnet.ibm.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Eric Dumazet" <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>
Subject: Re: [PATCH 2/2] virtio_net: fixing XDP for fully checksummed packets handling
Date: Thu, 20 Jun 2024 17:28:48 +0800 [thread overview]
Message-ID: <1718875728.9338605-7-hengqi@linux.alibaba.com> (raw)
In-Reply-To: <CACGkMEsa3AsPkweqS0-BEjSw5sKW_XM669HVSN_eX7-8KVG8tQ@mail.gmail.com>
On Thu, 20 Jun 2024 16:33:35 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Tue, Jun 18, 2024 at 11:17 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> > On Tue, 18 Jun 2024 11:10:26 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Mon, Jun 17, 2024 at 9:15 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > > >
> > > > The XDP program can't correctly handle partially checksummed
> > > > packets, but works fine with fully checksummed packets.
> > >
> > > Not sure this is ture, if I was not wrong, XDP can try to calculate checksum.
> >
> > XDP's interface serves a full checksum,
>
> What do you mean by "serve" here? I mean, XDP can calculate the
> checksum and fill it in the packet by itself.
>
Yes, XDP can parse and calculate checksums for all packets.
However, the bpf_csum_diff and bpf_l4_csum_replace APIs provided by XDP assume
that the packets being processed are fully checksumed packets. That is,
after the XDP program modified the packets, the incremental checksum can be
calculated (for example, samples/bpf/tcbpf1_kern.c, samples/bpf/test_lwt_bpf.c).
Therefore, partially checksummed packets cannot be processed normally in these
examples and need to be discarded.
> > and this is why we disabled the
> > offloading of VIRTIO_NET_F_GUEST_CSUM when loading XDP.
>
> If we trust the device to disable VIRTIO_NET_F_GUEST_CSUM, any reason
> to check VIRTIO_NET_HDR_F_NEEDS_CSUM again in the receive path?
There doesn't seem to be a mandatory constraint in the spec that devices that
haven't negotiated VIRTIO_NET_F_GUEST_CSUM cannot set NEEDS_CSUM bit, so I check this.
Thanks.
>
> >
> > Thanks.
>
> Thanks
>
> >
> > >
> > > Thanks
> > >
> > > > If the
> > > > device has already validated fully checksummed packets, then
> > > > the driver doesn't need to re-validate them, saving CPU resources.
> > > >
> > > > Additionally, the driver does not drop all partially checksummed
> > > > packets when VIRTIO_NET_F_GUEST_CSUM is not negotiated. This is
> > > > not a bug, as the driver has always done this.
> > > >
> > > > Fixes: 436c9453a1ac ("virtio-net: keep vnet header zeroed after processing XDP")
> > > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > > ---
> > > > drivers/net/virtio_net.c | 20 +++++++++++++++++++-
> > > > 1 file changed, 19 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index aa70a7ed8072..ea10db9a09fa 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -1360,6 +1360,10 @@ static struct sk_buff *receive_small_xdp(struct net_device *dev,
> > > > if (unlikely(hdr->hdr.gso_type))
> > > > goto err_xdp;
> > > >
> > > > + /* Partially checksummed packets must be dropped. */
> > > > + if (unlikely(hdr->hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM))
> > > > + goto err_xdp;
> > > > +
> > > > buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
> > > > SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > > >
> > > > @@ -1677,6 +1681,10 @@ static void *mergeable_xdp_get_buf(struct virtnet_info *vi,
> > > > if (unlikely(hdr->hdr.gso_type))
> > > > return NULL;
> > > >
> > > > + /* Partially checksummed packets must be dropped. */
> > > > + if (unlikely(hdr->hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM))
> > > > + return NULL;
> > > > +
> > > > /* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> > > > * with headroom may add hole in truesize, which
> > > > * make their length exceed PAGE_SIZE. So we disabled the
> > > > @@ -1943,6 +1951,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > > > struct net_device *dev = vi->dev;
> > > > struct sk_buff *skb;
> > > > struct virtio_net_common_hdr *hdr;
> > > > + u8 flags;
> > > >
> > > > if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
> > > > pr_debug("%s: short packet %i\n", dev->name, len);
> > > > @@ -1951,6 +1960,15 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > > > return;
> > > > }
> > > >
> > > > + /* 1. Save the flags early, as the XDP program might overwrite them.
> > > > + * These flags ensure packets marked as VIRTIO_NET_HDR_F_DATA_VALID
> > > > + * stay valid after XDP processing.
> > > > + * 2. XDP doesn't work with partially checksummed packets (refer to
> > > > + * virtnet_xdp_set()), so packets marked as
> > > > + * VIRTIO_NET_HDR_F_NEEDS_CSUM get dropped during XDP processing.
> > > > + */
> > > > + flags = ((struct virtio_net_common_hdr *)buf)->hdr.flags;
> > > > +
> > > > if (vi->mergeable_rx_bufs)
> > > > skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
> > > > stats);
> > > > @@ -1966,7 +1984,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > > > if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
> > > > virtio_skb_set_hash(&hdr->hash_v1_hdr, skb);
> > > >
> > > > - if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
> > > > + if (flags & VIRTIO_NET_HDR_F_DATA_VALID)
> > > > skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > >
> > > > if (virtio_net_hdr_to_skb(skb, &hdr->hdr,
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > > >
> > >
> >
>
next prev parent reply other threads:[~2024-06-20 9:33 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-17 13:15 [PATCH 0/2] virtio_net: fixes for checksum offloading and XDP handling Heng Qi
2024-06-17 13:15 ` [PATCH 1/2] virtio_net: checksum offloading handling fix Heng Qi
2024-06-17 13:34 ` Jiri Pirko
2024-06-18 3:01 ` Jason Wang
2024-06-18 3:09 ` Heng Qi
2024-06-19 1:15 ` Jakub Kicinski
2024-06-19 2:02 ` Heng Qi
2024-06-19 15:08 ` Jakub Kicinski
2024-06-19 15:44 ` Heng Qi
2024-06-20 8:29 ` Jason Wang
2024-06-17 13:15 ` [PATCH 2/2] virtio_net: fixing XDP for fully checksummed packets handling Heng Qi
2024-06-18 3:10 ` Jason Wang
2024-06-18 3:15 ` Heng Qi
2024-06-20 8:33 ` Jason Wang
2024-06-20 9:28 ` Heng Qi [this message]
2024-06-20 10:19 ` Michael S. Tsirkin
2024-06-20 10:27 ` Heng Qi
2024-06-20 10:38 ` Heng Qi
2024-06-20 10:45 ` Michael S. Tsirkin
2024-06-19 10:00 ` [PATCH 0/2] virtio_net: fixes for checksum offloading and XDP handling patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1718875728.9338605-7-hengqi@linux.alibaba.com \
--to=hengqi@linux.alibaba.com \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eperezma@redhat.com \
--cc=hawk@kernel.org \
--cc=jasowang@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=thuth@linux.vnet.ibm.com \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).