From: Jason Wang
Date: Tue, 11 Jul 2023 10:58:51 +0800
Subject: Re: [PATCH vhost v11 10/10] virtio_net: merge dma operation for one page
To: Xuan Zhuo
Cc: "Michael S. Tsirkin", virtualization@lists.linux-foundation.org,
 "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
 John Fastabend, netdev@vger.kernel.org, bpf@vger.kernel.org,
 Christoph Hellwig
X-Mailing-List: netdev@vger.kernel.org
In-Reply-To: <1689043238.4362252-1-xuanzhuo@linux.alibaba.com>
References: <20230710034237.12391-1-xuanzhuo@linux.alibaba.com>
 <20230710034237.12391-11-xuanzhuo@linux.alibaba.com>
 <20230710051818-mutt-send-email-mst@kernel.org>
 <1688984310.480753-2-xuanzhuo@linux.alibaba.com>
 <20230710075534-mutt-send-email-mst@kernel.org>
 <1688992712.1534917-3-xuanzhuo@linux.alibaba.com>
 <1689043238.4362252-1-xuanzhuo@linux.alibaba.com>

On Tue, Jul 11, 2023 at 10:42 AM Xuan Zhuo wrote:
>
> On Tue, 11 Jul 2023 10:36:17 +0800, Jason Wang wrote:
> > On Mon, Jul 10, 2023 at 8:41 PM Xuan Zhuo wrote:
> > >
> > > On Mon, 10 Jul 2023 07:59:03 -0400, "Michael S. Tsirkin" wrote:
> > > > On Mon, Jul 10, 2023 at 06:18:30PM +0800, Xuan Zhuo wrote:
> > > > > On Mon, 10 Jul 2023 05:40:21 -0400, "Michael S. Tsirkin" wrote:
> > > > > > On Mon, Jul 10, 2023 at 11:42:37AM +0800, Xuan Zhuo wrote:
> > > > > > > Currently, the virtio core will perform a dma operation for each
> > > > > > > operation. Although, the same page may be operated multiple times.
> > > > > > >
> > > > > > > The driver does the dma operation and manages the dma address based the
> > > > > > > feature premapped of virtio core.
> > > > > > >
> > > > > > > This way, we can perform only one dma operation for the same page. In
> > > > > > > the case of mtu 1500, this can reduce a lot of dma operations.
> > > > > > >
> > > > > > > Tested on Aliyun g7.4large machine, in the case of a cpu 100%, pps
> > > > > > > increased from 1893766 to 1901105. An increase of 0.4%.
> > > > > >
> > > > > > what kind of dma was there? an IOMMU? which vendors? in which mode
> > > > > > of operation?
> > > > >
> > > > > Do you mean this:
> > > > >
> > > > > [ 0.470816] iommu: Default domain type: Passthrough
> > > > >
> > > >
> > > > With passthrough, dma API is just some indirect function calls, they do
> > > > not affect the performance a lot.
> > >
> > >
> > > Yes, this benefit is worthless. I seem to have done a meaningless thing. The
> > > overhead of DMA I observed is indeed not too high.
> >
> > Have you measured with iommu=strict?
>
> I have not tested this way, our environment is pt, I wonder if strict is a
> common scenario. I can test it.

It's not a common setup, but it's a way to stress DMA layer to see the overhead.

Thanks

>
> Thanks.
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > > >
> > > > Try e.g. bounce buffer. Which is where you will see a problem: your
> > > > patches won't work.
> > > >
> > > > > > >
> > > > > > > Signed-off-by: Xuan Zhuo
> > > > > >
> > > > > > This kind of difference is likely in the noise.
> > > > >
> > > > > It's really not high, but this is because the proportion of DMA under perf top
> > > > > is not high. Probably that much.
> > > >
> > > > So maybe not worth the complexity.
> > > >
> > > > > > > ---
> > > > > > >  drivers/net/virtio_net.c | 283 ++++++++++++++++++++++++++++++++++++---
> > > > > > >  1 file changed, 267 insertions(+), 16 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > > index 486b5849033d..4de845d35bed 100644
> > > > > > > --- a/drivers/net/virtio_net.c
> > > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > > @@ -126,6 +126,27 @@ static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = {
> > > > > > >  #define VIRTNET_SQ_STATS_LEN	ARRAY_SIZE(virtnet_sq_stats_desc)
> > > > > > >  #define VIRTNET_RQ_STATS_LEN	ARRAY_SIZE(virtnet_rq_stats_desc)
> > > > > > >
> > > > > > > +/* The bufs on the same page may share this struct. */
> > > > > > > +struct virtnet_rq_dma {
> > > > > > > +	struct virtnet_rq_dma *next;
> > > > > > > +
> > > > > > > +	dma_addr_t addr;
> > > > > > > +
> > > > > > > +	void *buf;
> > > > > > > +	u32 len;
> > > > > > > +
> > > > > > > +	u32 ref;
> > > > > > > +};
> > > > > > > +
> > > > > > > +/* Record the dma and buf. */
> > > > > >
> > > > > > I guess I see that. But why?
> > > > > > And these two comments are the extent of the available
> > > > > > documentation, that's not enough I feel.
> > > > > >
> > > > > >
> > > > > > > +struct virtnet_rq_data {
> > > > > > > +	struct virtnet_rq_data *next;
> > > > > >
> > > > > > Is manually reimplementing a linked list the best
> > > > > > we can do?
> > > > >
> > > > > Yes, we can use llist.
> > > > >
> > > > > >
> > > > > > > +
> > > > > > > +	void *buf;
> > > > > > > +
> > > > > > > +	struct virtnet_rq_dma *dma;
> > > > > > > +};
> > > > > > > +
> > > > > > >  /* Internal representation of a send virtqueue */
> > > > > > >  struct send_queue {
> > > > > > >  	/* Virtqueue associated with this send _queue */
> > > > > > > @@ -175,6 +196,13 @@ struct receive_queue {
> > > > > > >  	char name[16];
> > > > > > >
> > > > > > >  	struct xdp_rxq_info xdp_rxq;
> > > > > > > +
> > > > > > > +	struct virtnet_rq_data *data_array;
> > > > > > > +	struct virtnet_rq_data *data_free;
> > > > > > > +
> > > > > > > +	struct virtnet_rq_dma *dma_array;
> > > > > > > +	struct virtnet_rq_dma *dma_free;
> > > > > > > +	struct virtnet_rq_dma *last_dma;
> > > > > > >  };
> > > > > > >
> > > > > > >  /* This structure can contain rss message with maximum settings for indirection table and keysize
> > > > > > > @@ -549,6 +577,176 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
> > > > > > >  	return skb;
> > > > > > >  }
> > > > > > >
> > > > > > > +static void virtnet_rq_unmap(struct receive_queue *rq, struct virtnet_rq_dma *dma)
> > > > > > > +{
> > > > > > > +	struct device *dev;
> > > > > > > +
> > > > > > > +	--dma->ref;
> > > > > > > +
> > > > > > > +	if (dma->ref)
> > > > > > > +		return;
> > > > > > > +
> > > > > >
> > > > > > If you don't unmap there is no guarantee valid data will be
> > > > > > there in the buffer.
> > > > > >
> > > > > > > +	dev = virtqueue_dma_dev(rq->vq);
> > > > > > > +
> > > > > > > +	dma_unmap_page(dev, dma->addr, dma->len, DMA_FROM_DEVICE);
> > > > > >
> > > > > > > +
> > > > > > > +	dma->next = rq->dma_free;
> > > > > > > +	rq->dma_free = dma;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void *virtnet_rq_recycle_data(struct receive_queue *rq,
> > > > > > > +				     struct virtnet_rq_data *data)
> > > > > > > +{
> > > > > > > +	void *buf;
> > > > > > > +
> > > > > > > +	buf = data->buf;
> > > > > > > +
> > > > > > > +	data->next = rq->data_free;
> > > > > > > +	rq->data_free = data;
> > > > > > > +
> > > > > > > +	return buf;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static struct virtnet_rq_data *virtnet_rq_get_data(struct receive_queue *rq,
> > > > > > > +						   void *buf,
> > > > > > > +						   struct virtnet_rq_dma *dma)
> > > > > > > +{
> > > > > > > +	struct virtnet_rq_data *data;
> > > > > > > +
> > > > > > > +	data = rq->data_free;
> > > > > > > +	rq->data_free = data->next;
> > > > > > > +
> > > > > > > +	data->buf = buf;
> > > > > > > +	data->dma = dma;
> > > > > > > +
> > > > > > > +	return data;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx)
> > > > > > > +{
> > > > > > > +	struct virtnet_rq_data *data;
> > > > > > > +	void *buf;
> > > > > > > +
> > > > > > > +	buf = virtqueue_get_buf_ctx(rq->vq, len, ctx);
> > > > > > > +	if (!buf || !rq->data_array)
> > > > > > > +		return buf;
> > > > > > > +
> > > > > > > +	data = buf;
> > > > > > > +
> > > > > > > +	virtnet_rq_unmap(rq, data->dma);
> > > > > > > +
> > > > > > > +	return virtnet_rq_recycle_data(rq, data);
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void *virtnet_rq_detach_unused_buf(struct receive_queue *rq)
> > > > > > > +{
> > > > > > > +	struct virtnet_rq_data *data;
> > > > > > > +	void *buf;
> > > > > > > +
> > > > > > > +	buf = virtqueue_detach_unused_buf(rq->vq);
> > > > > > > +	if (!buf || !rq->data_array)
> > > > > > > +		return buf;
> > > > > > > +
> > > > > > > +	data = buf;
> > > > > > > +
> > > > > > > +	virtnet_rq_unmap(rq, data->dma);
> > > > > > > +
> > > > > > > +	return virtnet_rq_recycle_data(rq, data);
> > > > > > > +}
> > > > > > > +
> > > > > > > +static int virtnet_rq_map_sg(struct receive_queue *rq, void *buf, u32 len)
> > > > > > > +{
> > > > > > > +	struct virtnet_rq_dma *dma = rq->last_dma;
> > > > > > > +	struct device *dev;
> > > > > > > +	u32 off, map_len;
> > > > > > > +	dma_addr_t addr;
> > > > > > > +	void *end;
> > > > > > > +
> > > > > > > +	if (likely(dma) && buf >= dma->buf && (buf + len <= dma->buf + dma->len)) {
> > > > > > > +		++dma->ref;
> > > > > > > +		addr = dma->addr + (buf - dma->buf);
> > > > > > > +		goto ok;
> > > > > > > +	}
> > > > > >
> > > > > > So this is the meat of the proposed optimization. I guess that
> > > > > > if the last buffer we allocated happens to be in the same page
> > > > > > as this one then they can both be mapped for DMA together.
> > > > >
> > > > > Since we use page_frag, the buffers we allocated are all continuous.
> > > > >
> > > > > > Why last one specifically? Whether next one happens to
> > > > > > be close depends on luck. If you want to try optimizing this
> > > > > > the right thing to do is likely by using a page pool.
> > > > > > There's actually work upstream on page pool, look it up.
> > > > >
> > > > > As we discussed in another thread, the page pool is first used for xdp. Let's
> > > > > transform it step by step.
> > > > >
> > > > > Thanks.
> > > >
> > > > ok so this should wait then?
> > > >
> > > > > >
> > > > > > > +
> > > > > > > +	end = buf + len - 1;
> > > > > > > +	off = offset_in_page(end);
> > > > > > > +	map_len = len + PAGE_SIZE - off;
> > > > > > > +
> > > > > > > +	dev = virtqueue_dma_dev(rq->vq);
> > > > > > > +
> > > > > > > +	addr = dma_map_page_attrs(dev, virt_to_page(buf), offset_in_page(buf),
> > > > > > > +				  map_len, DMA_FROM_DEVICE, 0);
> > > > > > > +	if (addr == DMA_MAPPING_ERROR)
> > > > > > > +		return -ENOMEM;
> > > > > > > +
> > > > > > > +	dma = rq->dma_free;
> > > > > > > +	rq->dma_free = dma->next;
> > > > > > > +
> > > > > > > +	dma->ref = 1;
> > > > > > > +	dma->buf = buf;
> > > > > > > +	dma->addr = addr;
> > > > > > > +	dma->len = map_len;
> > > > > > > +
> > > > > > > +	rq->last_dma = dma;
> > > > > > > +
> > > > > > > +ok:
> > > > > > > +	sg_init_table(rq->sg, 1);
> > > > > > > +	rq->sg[0].dma_address = addr;
> > > > > > > +	rq->sg[0].length = len;
> > > > > > > +
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static int virtnet_rq_merge_map_init(struct virtnet_info *vi)
> > > > > > > +{
> > > > > > > +	struct receive_queue *rq;
> > > > > > > +	int i, err, j, num;
> > > > > > > +
> > > > > > > +	/* disable for big mode */
> > > > > > > +	if (!vi->mergeable_rx_bufs && vi->big_packets)
> > > > > > > +		return 0;
> > > > > > > +
> > > > > > > +	for (i = 0; i < vi->max_queue_pairs; i++) {
> > > > > > > +		err = virtqueue_set_premapped(vi->rq[i].vq);
> > > > > > > +		if (err)
> > > > > > > +			continue;
> > > > > > > +
> > > > > > > +		rq = &vi->rq[i];
> > > > > > > +
> > > > > > > +		num = virtqueue_get_vring_size(rq->vq);
> > > > > > > +
> > > > > > > +		rq->data_array = kmalloc_array(num, sizeof(*rq->data_array), GFP_KERNEL);
> > > > > > > +		if (!rq->data_array)
> > > > > > > +			goto err;
> > > > > > > +
> > > > > > > +		rq->dma_array = kmalloc_array(num, sizeof(*rq->dma_array), GFP_KERNEL);
> > > > > > > +		if (!rq->dma_array)
> > > > > > > +			goto err;
> > > > > > > +
> > > > > > > +		for (j = 0; j < num; ++j) {
> > > > > > > +			rq->data_array[j].next = rq->data_free;
> > > > > > > +			rq->data_free = &rq->data_array[j];
> > > > > > > +
> > > > > > > +			rq->dma_array[j].next = rq->dma_free;
> > > > > > > +			rq->dma_free = &rq->dma_array[j];
> > > > > > > +		}
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	return 0;
> > > > > > > +
> > > > > > > +err:
> > > > > > > +	for (i = 0; i < vi->max_queue_pairs; i++) {
> > > > > > > +		struct receive_queue *rq;
> > > > > > > +
> > > > > > > +		rq = &vi->rq[i];
> > > > > > > +
> > > > > > > +		kfree(rq->dma_array);
> > > > > > > +		kfree(rq->data_array);
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	return -ENOMEM;
> > > > > > > +}
> > > > > > > +
> > > > > > >  static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
> > > > > > >  {
> > > > > > >  	unsigned int len;
> > > > > > > @@ -835,7 +1033,7 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
> > > > > > >  		void *buf;
> > > > > > >  		int off;
> > > > > > >
> > > > > > > -		buf = virtqueue_get_buf(rq->vq, &buflen);
> > > > > > > +		buf = virtnet_rq_get_buf(rq, &buflen, NULL);
> > > > > > >  		if (unlikely(!buf))
> > > > > > >  			goto err_buf;
> > > > > > >
> > > > > > > @@ -1126,7 +1324,7 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
> > > > > > >  		return -EINVAL;
> > > > > > >
> > > > > > >  	while (--*num_buf > 0) {
> > > > > > > -		buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
> > > > > > > +		buf = virtnet_rq_get_buf(rq, &len, &ctx);
> > > > > > >  		if (unlikely(!buf)) {
> > > > > > >  			pr_debug("%s: rx error: %d buffers out of %d missing\n",
> > > > > > >  				 dev->name, *num_buf,
> > > > > > > @@ -1351,7 +1549,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > > > > >  	while (--num_buf) {
> > > > > > >  		int num_skb_frags;
> > > > > > >
> > > > > > > -		buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
> > > > > > > +		buf = virtnet_rq_get_buf(rq, &len, &ctx);
> > > > > > >  		if (unlikely(!buf)) {
> > > > > > >  			pr_debug("%s: rx error: %d buffers out of %d missing\n",
> > > > > > >  				 dev->name, num_buf,
> > > > > > > @@ -1414,7 +1612,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > > > > >  err_skb:
> > > > > > >  	put_page(page);
> > > > > > >  	while (num_buf-- > 1) {
> > > > > > > -		buf = virtqueue_get_buf(rq->vq, &len);
> > > > > > > +		buf = virtnet_rq_get_buf(rq, &len, NULL);
> > > > > > >  		if (unlikely(!buf)) {
> > > > > > >  			pr_debug("%s: rx error: %d buffers missing\n",
> > > > > > >  				 dev->name, num_buf);
> > > > > > > @@ -1529,6 +1727,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
> > > > > > >  	unsigned int xdp_headroom = virtnet_get_headroom(vi);
> > > > > > >  	void *ctx = (void *)(unsigned long)xdp_headroom;
> > > > > > >  	int len = vi->hdr_len + VIRTNET_RX_PAD + GOOD_PACKET_LEN + xdp_headroom;
> > > > > > > +	struct virtnet_rq_data *data;
> > > > > > >  	int err;
> > > > > > >
> > > > > > >  	len = SKB_DATA_ALIGN(len) +
> > > > > > > @@ -1539,11 +1738,34 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
> > > > > > >  	buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset;
> > > > > > >  	get_page(alloc_frag->page);
> > > > > > >  	alloc_frag->offset += len;
> > > > > > > -	sg_init_one(rq->sg, buf + VIRTNET_RX_PAD + xdp_headroom,
> > > > > > > -		    vi->hdr_len + GOOD_PACKET_LEN);
> > > > > > > -	err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
> > > > > > > +
> > > > > > > +	if (rq->data_array) {
> > > > > > > +		err = virtnet_rq_map_sg(rq, buf + VIRTNET_RX_PAD + xdp_headroom,
> > > > > > > +					vi->hdr_len + GOOD_PACKET_LEN);
> > > > > > > +		if (err)
> > > > > > > +			goto map_err;
> > > > > > > +
> > > > > > > +		data = virtnet_rq_get_data(rq, buf, rq->last_dma);
> > > > > > > +	} else {
> > > > > > > +		sg_init_one(rq->sg, buf + VIRTNET_RX_PAD + xdp_headroom,
> > > > > > > +			    vi->hdr_len + GOOD_PACKET_LEN);
> > > > > > > +		data = (void *)buf;
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, data, ctx, gfp);
> > > > > > >  	if (err < 0)
> > > > > > > -		put_page(virt_to_head_page(buf));
> > > > > > > +		goto add_err;
> > > > > > > +
> > > > > > > +	return err;
> > > > > > > +
> > > > > > > +add_err:
> > > > > > > +	if (rq->data_array) {
> > > > > > > +		virtnet_rq_unmap(rq, data->dma);
> > > > > > > +		virtnet_rq_recycle_data(rq, data);
> > > > > > > +	}
> > > > > > > +
> > > > > > > +map_err:
> > > > > > > +	put_page(virt_to_head_page(buf));
> > > > > > >  	return err;
> > > > > > >  }
> > > > > > >
> > > > > > > @@ -1620,6 +1842,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
> > > > > > >  	unsigned int headroom = virtnet_get_headroom(vi);
> > > > > > >  	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
> > > > > > >  	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
> > > > > > > +	struct virtnet_rq_data *data;
> > > > > > >  	char *buf;
> > > > > > >  	void *ctx;
> > > > > > >  	int err;
> > > > > > > @@ -1650,12 +1873,32 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
> > > > > > >  		alloc_frag->offset += hole;
> > > > > > >  	}
> > > > > > >
> > > > > > > -	sg_init_one(rq->sg, buf, len);
> > > > > > > +	if (rq->data_array) {
> > > > > > > +		err = virtnet_rq_map_sg(rq, buf, len);
> > > > > > > +		if (err)
> > > > > > > +			goto map_err;
> > > > > > > +
> > > > > > > +		data = virtnet_rq_get_data(rq, buf, rq->last_dma);
> > > > > > > +	} else {
> > > > > > > +		sg_init_one(rq->sg, buf, len);
> > > > > > > +		data = (void *)buf;
> > > > > > > +	}
> > > > > > > +
> > > > > > >  	ctx = mergeable_len_to_ctx(len + room, headroom);
> > > > > > > -	err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
> > > > > > > +	err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, data, ctx, gfp);
> > > > > > >  	if (err < 0)
> > > > > > > -		put_page(virt_to_head_page(buf));
> > > > > > > +		goto add_err;
> > > > > > > +
> > > > > > > +	return 0;
> > > > > > > +
> > > > > > > +add_err:
> > > > > > > +	if (rq->data_array) {
> > > > > > > +		virtnet_rq_unmap(rq, data->dma);
> > > > > > > +		virtnet_rq_recycle_data(rq, data);
> > > > > > > +	}
> > > > > > >
> > > > > > > +map_err:
> > > > > > > +	put_page(virt_to_head_page(buf));
> > > > > > >  	return err;
> > > > > > >  }
> > > > > > >
> > > > > > > @@ -1775,13 +2018,13 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
> > > > > > >  		void *ctx;
> > > > > > >
> > > > > > >  		while (stats.packets < budget &&
> > > > > > > -		       (buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx))) {
> > > > > > > +		       (buf = virtnet_rq_get_buf(rq, &len, &ctx))) {
> > > > > > >  			receive_buf(vi, rq, buf, len, ctx, xdp_xmit, &stats);
> > > > > > >  			stats.packets++;
> > > > > > >  		}
> > > > > > >  	} else {
> > > > > > >  		while (stats.packets < budget &&
> > > > > > > -		       (buf = virtqueue_get_buf(rq->vq, &len)) != NULL) {
> > > > > > > +		       (buf = virtnet_rq_get_buf(rq, &len, NULL)) != NULL) {
> > > > > > >  			receive_buf(vi, rq, buf, len, NULL, xdp_xmit, &stats);
> > > > > > >  			stats.packets++;
> > > > > > >  		}
> > > > > > > @@ -3514,6 +3757,9 @@ static void virtnet_free_queues(struct virtnet_info *vi)
> > > > > > >  	for (i = 0; i < vi->max_queue_pairs; i++) {
> > > > > > >  		__netif_napi_del(&vi->rq[i].napi);
> > > > > > >  		__netif_napi_del(&vi->sq[i].napi);
> > > > > > > +
> > > > > > > +		kfree(vi->rq[i].data_array);
> > > > > > > +		kfree(vi->rq[i].dma_array);
> > > > > > >  	}
> > > > > > >
> > > > > > >  	/* We called __netif_napi_del(),
> > > > > > > @@ -3591,9 +3837,10 @@ static void free_unused_bufs(struct virtnet_info *vi)
> > > > > > >  	}
> > > > > > >
> > > > > > >  	for (i = 0; i < vi->max_queue_pairs; i++) {
> > > > > > > -		struct virtqueue *vq = vi->rq[i].vq;
> > > > > > > -		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> > > > > > > -			virtnet_rq_free_unused_buf(vq, buf);
> > > > > > > +		struct receive_queue *rq = &vi->rq[i];
> > > > > > > +
> > > > > > > +		while ((buf = virtnet_rq_detach_unused_buf(rq)) != NULL)
> > > > > > > +			virtnet_rq_free_unused_buf(rq->vq, buf);
> > > > > > >  		cond_resched();
> > > > > > >  	}
> > > > > > >  }
> > > > > > > @@ -3767,6 +4014,10 @@ static int init_vqs(struct virtnet_info *vi)
> > > > > > >  	if (ret)
> > > > > > >  		goto err_free;
> > > > > > >
> > > > > > > +	ret = virtnet_rq_merge_map_init(vi);
> > > > > > > +	if (ret)
> > > > > > > +		goto err_free;
> > > > > > > +
> > > > > > >  	cpus_read_lock();
> > > > > > >  	virtnet_set_affinity(vi);
> > > > > > >  	cpus_read_unlock();
> > > > > > > --
> > > > > > > 2.32.0.3.g01195cf9f
> > > > > > >
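
For readers following the thread, the reuse check that the review calls "the meat of the proposed optimization" can be modelled outside the kernel. What follows is only a minimal user-space sketch under stated assumptions, not driver code: every sketch_* name is hypothetical, the device address is a fake value supplied by the caller, and no real DMA API is invoked. It mirrors the condition in the quoted virtnet_rq_map_sg(): if the new buffer lies entirely inside the range covered by the most recent mapping, take a reference and derive the address by offset; otherwise record a new mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA mapping; not the kernel's struct. */
struct sketch_mapping {
	uintptr_t buf;      /* CPU address where the mapped range starts */
	size_t len;         /* number of bytes covered by the mapping */
	uint64_t addr;      /* pretend device address corresponding to buf */
	unsigned int ref;   /* buffers currently relying on this mapping */
};

/* Hypothetical per-queue state: the most recent mapping plus a counter. */
struct sketch_rq {
	struct sketch_mapping *last;
	unsigned int map_calls;     /* how many new mappings were recorded */
};

/*
 * Return a device address for [buf, buf + len). If the range fits inside
 * the most recent mapping, reuse it and bump its reference count.
 * Otherwise fill in 'fresh' as a new mapping of map_len bytes starting at
 * buf, using the caller-supplied fake_addr as its device address.
 */
uint64_t sketch_map(struct sketch_rq *rq, struct sketch_mapping *fresh,
		    const void *buf, size_t len, size_t map_len,
		    uint64_t fake_addr)
{
	struct sketch_mapping *last = rq->last;
	uintptr_t start = (uintptr_t)buf;

	assert(len <= map_len);

	if (last && start >= last->buf &&
	    start + len <= last->buf + last->len) {
		last->ref++;
		return last->addr + (start - last->buf);
	}

	fresh->buf = start;
	fresh->len = map_len;
	fresh->addr = fake_addr;
	fresh->ref = 1;

	rq->last = fresh;
	rq->map_calls++;
	return fake_addr;
}

/* Drop one reference; returns 1 when the last user is gone. */
int sketch_unmap(struct sketch_mapping *m)
{
	assert(m->ref > 0);
	m->ref--;
	return m->ref == 0;
}
```

With two 1536-byte buffers carved from one 4096-byte region, the second sketch_map() call reuses the first mapping, so only one mapping is recorded, and the model's "unmap" only triggers once both references have been dropped. Whether that saving is measurable on real hardware is exactly what the thread is debating.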