From: Jason Wang
Date: Mon, 8 Jul 2024 16:08:44 +0800
Subject: Re: [PATCH net-next v7 09/10] virtio_net: xsk: rx: support recv small mode
To: Xuan Zhuo
Cc: netdev@vger.kernel.org, "Michael S. Tsirkin", Eugenio Pérez,
 "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
 John Fastabend, virtualization@lists.linux.dev, bpf@vger.kernel.org
In-Reply-To: <1720424536.972943-1-xuanzhuo@linux.alibaba.com>
References: <20240705073734.93905-1-xuanzhuo@linux.alibaba.com>
 <20240705073734.93905-10-xuanzhuo@linux.alibaba.com>
 <1720424536.972943-1-xuanzhuo@linux.alibaba.com>

On Mon, Jul 8, 2024 at 3:47 PM Xuan Zhuo wrote:
>
> On Mon, 8 Jul 2024 15:00:50 +0800, Jason Wang wrote:
> > On Fri, Jul 5, 2024 at 3:38 PM Xuan Zhuo wrote:
> > >
> > > In the process:
> > > 1. We may need to copy data to create skb for XDP_PASS.
> > > 2. We may need to call xsk_buff_free() to release the buffer.
> > > 3. The handle for xdp_buff is different from the buffer.
> > >
> > > If we pushed this logic into the existing receive handlers (merge and
> > > small), we would have to maintain code scattered inside merge and small
> > > (and big). So I think it is a good choice for us to put the xsk code
> > > into an independent function.
> > >
> > > Signed-off-by: Xuan Zhuo
> > > ---
> > >
> > > v7:
> > >     1. rename xdp_construct_skb to xsk_construct_skb
> > >     2. refactor virtnet_receive()
> > >
> > >  drivers/net/virtio_net.c | 176 +++++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 168 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 2b27f5ada64a..64d8cd481890 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -498,6 +498,12 @@ struct virtio_net_common_hdr {
> > >  };
> > >
> > >  static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf);
> > > +static int virtnet_xdp_handler(struct bpf_prog *xdp_prog, struct xdp_buff *xdp,
> > > +                              struct net_device *dev,
> > > +                              unsigned int *xdp_xmit,
> > > +                              struct virtnet_rq_stats *stats);
> > > +static void virtnet_receive_done(struct virtnet_info *vi, struct receive_queue *rq,
> > > +                                struct sk_buff *skb, u8 flags);
> > >
> > >  static bool is_xdp_frame(void *ptr)
> > >  {
> > > @@ -1062,6 +1068,124 @@ static void sg_fill_dma(struct scatterlist *sg, dma_addr_t addr, u32 len)
> > >         sg->length = len;
> > >  }
> > >
> > > +static struct xdp_buff *buf_to_xdp(struct virtnet_info *vi,
> > > +                                  struct receive_queue *rq, void *buf, u32 len)
> > > +{
> > > +       struct xdp_buff *xdp;
> > > +       u32 bufsize;
> > > +
> > > +       xdp = (struct xdp_buff *)buf;
> > > +
> > > +       bufsize = xsk_pool_get_rx_frame_size(rq->xsk_pool) + vi->hdr_len;
> > > +
> > > +       if (unlikely(len > bufsize)) {
> > > +               pr_debug("%s: rx error: len %u exceeds truesize %u\n",
> > > +                        vi->dev->name, len, bufsize);
> > > +               DEV_STATS_INC(vi->dev, rx_length_errors);
> > > +               xsk_buff_free(xdp);
> > > +               return NULL;
> > > +       }
> > > +
> > > +       xsk_buff_set_size(xdp, len);
> > > +       xsk_buff_dma_sync_for_cpu(xdp);
> > > +
> > > +       return xdp;
> > > +}
> > > +
> > > +static struct sk_buff *xsk_construct_skb(struct receive_queue *rq,
> > > +                                        struct xdp_buff *xdp)
> > > +{
> > > +       unsigned int metasize = xdp->data - xdp->data_meta;
> > > +       struct sk_buff *skb;
> > > +       unsigned int size;
> > > +
> > > +       size = xdp->data_end - xdp->data_hard_start;
> > > +       skb = napi_alloc_skb(&rq->napi, size);
> > > +       if (unlikely(!skb)) {
> > > +               xsk_buff_free(xdp);
> > > +               return NULL;
> > > +       }
> > > +
> > > +       skb_reserve(skb, xdp->data_meta - xdp->data_hard_start);
> > > +
> > > +       size = xdp->data_end - xdp->data_meta;
> > > +       memcpy(__skb_put(skb, size), xdp->data_meta, size);
> > > +
> > > +       if (metasize) {
> > > +               __skb_pull(skb, metasize);
> > > +               skb_metadata_set(skb, metasize);
> > > +       }
> > > +
> > > +       xsk_buff_free(xdp);
> > > +
> > > +       return skb;
> > > +}
> > > +
> > > +static struct sk_buff *virtnet_receive_xsk_small(struct net_device *dev, struct virtnet_info *vi,
> > > +                                                struct receive_queue *rq, struct xdp_buff *xdp,
> > > +                                                unsigned int *xdp_xmit,
> > > +                                                struct virtnet_rq_stats *stats)
> > > +{
> > > +       struct bpf_prog *prog;
> > > +       u32 ret;
> > > +
> > > +       ret = XDP_PASS;
> > > +       rcu_read_lock();
> > > +       prog = rcu_dereference(rq->xdp_prog);
> > > +       if (prog)
> > > +               ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
> > > +       rcu_read_unlock();
> > > +
> > > +       switch (ret) {
> > > +       case XDP_PASS:
> > > +               return xsk_construct_skb(rq, xdp);
> > > +
> > > +       case XDP_TX:
> > > +       case XDP_REDIRECT:
> > > +               return NULL;
> > > +
> > > +       default:
> > > +               /* drop packet */
> > > +               xsk_buff_free(xdp);
> > > +               u64_stats_inc(&stats->drops);
> > > +               return NULL;
> > > +       }
> > > +}
> > > +
> > > +static void virtnet_receive_xsk_buf(struct virtnet_info *vi, struct receive_queue *rq,
> > > +                                   void *buf, u32 len,
> > > +                                   unsigned int *xdp_xmit,
> > > +                                   struct virtnet_rq_stats *stats)
> > > +{
> > > +       struct net_device *dev = vi->dev;
> > > +       struct sk_buff *skb = NULL;
> > > +       struct xdp_buff *xdp;
> > > +       u8 flags;
> > > +
> > > +       len -= vi->hdr_len;
> > > +
> > > +       u64_stats_add(&stats->bytes, len);
> > > +
> > > +       xdp = buf_to_xdp(vi, rq, buf, len);
> > > +       if (!xdp)
> > > +               return;
> > > +
> > > +       if (unlikely(len < ETH_HLEN)) {
> > > +               pr_debug("%s: short packet %i\n", dev->name, len);
> > > +               DEV_STATS_INC(dev, rx_length_errors);
> > > +               xsk_buff_free(xdp);
> > > +               return;
> > > +       }
> > > +
> > > +       flags = ((struct virtio_net_common_hdr *)(xdp->data - vi->hdr_len))->hdr.flags;
> > > +
> > > +       if (!vi->mergeable_rx_bufs)
> > > +               skb = virtnet_receive_xsk_small(dev, vi, rq, xdp, xdp_xmit, stats);
> >
> > I wonder, since we add the mergeable support in the next patch, would it
> > be better to re-order the patches? For example, the xsk binding needs to
> > be moved to the last patch, otherwise we break xsk with mergeable
> > buffers here.
>
> If you are worried that users could hit this with just this commit, I want
> to say you need not worry.
>
> Because the flag NETDEV_XDP_ACT_XSK_ZEROCOPY is not added yet. I plan to
> add it after the tx side is completed.

Ok, this is something I missed; it would be better to mention it
somewhere (or maybe it is already there and I missed it).

>
> I did test by adding this flag locally.
>
> Thanks.

Acked-by: Jason Wang

Thanks

> >
> > Or anything I missed here?
> >
> > Thanks
> >
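A note on the NETDEV_XDP_ACT_XSK_ZEROCOPY point above: a zero-copy AF_XDP
bind is refused by the core before it ever reaches the driver unless the
driver advertises that feature bit, which is why the not-yet-complete
mergeable path cannot be reached from userspace. Below is a rough sketch of
both sides of that gate; it is simplified and partly hypothetical, not
verbatim kernel source (the core check lives in xp_assign_dev() in
net/xdp/xsk_buff_pool.c, and the virtio_net hunk only illustrates what a
later opt-in could look like):

    /* Core side (simplified sketch): the bind path bails out before
     * calling into the driver when zero copy is not advertised.
     */
    int xp_assign_dev(struct xsk_buff_pool *pool,
                      struct net_device *netdev, u16 queue_id, u16 flags)
    {
            /* ... */
            if (!(netdev->xdp_features & NETDEV_XDP_ACT_XSK_ZEROCOPY))
                    return -EOPNOTSUPP;
            /* ... hand the pool to the driver via ndo_bpf ... */
    }

    /* Driver side (hypothetical follow-up once both rx and tx are
     * wired up): advertise zero-copy support at probe time.
     */
    xdp_set_features_flag(dev, NETDEV_XDP_ACT_BASIC |
                               NETDEV_XDP_ACT_REDIRECT |
                               NETDEV_XDP_ACT_XSK_ZEROCOPY);

Until that last hunk (or its equivalent) lands, xsk_bind() with zero copy
fails with -EOPNOTSUPP for virtio_net, matching the explanation in the
thread that users cannot exercise the partial rx support in this series.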