netdev.vger.kernel.org archive mirror
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH vhost v13 05/12] virtio_ring: introduce virtqueue_dma_dev()
Date: Wed, 16 Aug 2023 11:22:27 +0800
Message-ID: <1692156147.7470396-3-xuanzhuo@linux.alibaba.com>
In-Reply-To: <CACGkMEvnVy+p8+Nro6v7Yr-m_N07200skcqwz-pCr5==sn68BQ@mail.gmail.com>

On Wed, 16 Aug 2023 10:33:34 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Wed, Aug 16, 2023 at 10:24 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Wed, 16 Aug 2023 10:19:34 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Wed, Aug 16, 2023 at 10:16 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Wed, 16 Aug 2023 09:13:48 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > On Tue, Aug 15, 2023 at 5:40 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > >
> > > > > > On Tue, 15 Aug 2023 15:50:23 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > On Tue, Aug 15, 2023 at 2:32 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > Hi, Jason
> > > > > > > >
> > > > > > > > Could you skip this patch?
> > > > > > >
> > > > > > > I'm fine with either merging or dropping this.
> > > > > > >
> > > > > > > >
> > > > > > > > Could we review the other patches first?
> > > > > > >
> > > > > > > I will be on vacation soon, and won't have time to do this until next week.
> > > > > >
> > > > > > Have a happy vacation.
> > > > > >
> > > > > > >
> > > > > > > But I spot two possible "issues":
> > > > > > >
> > > > > > > 1) the DMA metadata is stored in the headroom of the page, which
> > > > > > > breaks frags coalescing; we need to benchmark its impact
> > > > > >
> > > > > > Not every page, just the first page of the COMP pages.
> > > > > >
> > > > > > So I think there is no impact.
> > > > >
> > > > > Nope, see this:
> > > > >
> > > > >         if (SKB_FRAG_PAGE_ORDER &&
> > > > >             !static_branch_unlikely(&net_high_order_alloc_disable_key)) {
> > > > >                 /* Avoid direct reclaim but allow kswapd to wake */
> > > > >                 pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
> > > > >                                           __GFP_COMP | __GFP_NOWARN |
> > > > >                                           __GFP_NORETRY,
> > > > >                                           SKB_FRAG_PAGE_ORDER);
> > > > >                 if (likely(pfrag->page)) {
> > > > >                         pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
> > > > >                         return true;
> > > > >                 }
> > > > >         }
> > > > >
> > > > > The comp page might be disabled due to the SKB_FRAG_PAGE_ORDER and
> > > > > net_high_order_alloc_disable_key.
> > > >
> > > >
> > > > YES.
> > > >
> > > > But if comp pages are disabled, we only get one page at a time. The pages are
> > > > not contiguous, so there is no frags coalescing.
> > > >
> > > > If you mean that two pages returned by alloc_page may happen to be contiguous,
> > > > then coalescing could be broken. That is possible, but I think the impact will be small.
> > >
> > > Let's have a simple benchmark and see?
> >
> >
> > That is ok.
> >
> > I think you want the performance numbers under heavy traffic with comp pages
> > disabled.
>
> Yes.


Hi,

Host:
	for ((i=0; i < 10; ++i)) do sockperf tp -i 192.168.122.100 -t 1000  -m 64000& done
Guest:
	03:23:12 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
	03:23:13 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
	03:23:13 AM      ens4  61848.00      1.00 3868036.73      0.58      0.00      0.00      0.00      0.00

	tcpdump:
		03:25:01.741563 IP 192.168.122.1.29693 > 192.168.122.100.11111: UDP, length 64000
		03:25:01.741580 IP 192.168.122.1.22239 > 192.168.122.100.11111: UDP, length 64000
		03:25:01.741623 IP 192.168.122.1.22396 > 192.168.122.100.11111: UDP, length 64000

Guest CPU utilization is low and every packet is 64000 bytes, but the host vhost
process is at 100% CPU. So we cannot judge the impact from the guest's traffic or
CPU usage.

So I used a kernel without my patches (0635819decaf9d60e6cacfecfebfabe3cbdddafb).

I wanted to count how many frags coalescing operations happen when comp pages are disabled.

	$ sh -x test.sh
	+ sysctl -w net.core.high_order_alloc_disable=1
	net.core.high_order_alloc_disable = 1
	+ sysctl net.core.high_order_alloc_disable
	net.core.high_order_alloc_disable = 1
	+ sleep 5
	+ timeout 5 bpftrace -e 'kprobe: skb_coalesce_rx_frag{@[nsecs/1000/1000/1000]=count()}'
	Attaching 1 probe...



	+ sysctl -w net.core.high_order_alloc_disable=0
	net.core.high_order_alloc_disable = 0
	+ sysctl net.core.high_order_alloc_disable
	net.core.high_order_alloc_disable = 0
	+ sleep 5
	+ timeout 5 bpftrace -e 'kprobe: skb_coalesce_rx_frag{@[nsecs/1000/1000/1000]=count()}'
	Attaching 1 probe...


	@[356]: 167020
	@[361]: 673653
	@[359]: 900844
	@[360]: 912657
	@[358]: 915853
	@[357]: 932245
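
As a rough way to read these numbers, the sketch below totals the buckets and
averages the full seconds. The array and helper names are made up for
illustration, and treating the first and last buckets (356 and 361) as partial
seconds is an assumption based on a 5 second timeout spanning six keys.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Per-second call counts copied from the bpftrace map above, ordered by
 * key (356..361). The key is nsecs / 10^9, i.e. a one-second bucket.
 */
static const uint64_t coalesce_calls[] = {
	167020, 932245, 915853, 900844, 912657, 673653,
};

#define COALESCE_BUCKETS (sizeof(coalesce_calls) / sizeof(coalesce_calls[0]))

/* Sum of all buckets in c[0..n-1]. */
static uint64_t sum_counts(const uint64_t *c, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += c[i];
	return total;
}

/*
 * Integer mean over the interior buckets only. The first and last buckets
 * are assumed to cover partial seconds, so they are left out.
 */
static uint64_t mean_full_seconds(const uint64_t *c, size_t n)
{
	if (n < 3)
		return 0;
	return sum_counts(c + 1, n - 2) / (n - 2);
}
```

With the values above, sum_counts() gives 4502272 calls over the whole run and
mean_full_seconds() gives 915399 calls per full second, against zero calls in
the run with high_order_alloc_disable=1.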


We can see that skb_coalesce_rx_frag is never called when comp pages are disabled.
When comp pages are enabled, there is a lot of frags coalescing.

So I think my change will have no impact.
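
For reference, here is a minimal userspace sketch of the condition that gates
coalescing. It models only the core test of skb_can_coalesce() (same page, and
the new buffer starts exactly where the previous frag ends). The struct, the
helper names, the 1536-byte buffer size and the 64-byte headroom are all
hypothetical and chosen only for illustration; this is not the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one rx buffer / skb frag. */
struct model_frag {
	const void *page;	/* identity of the (head) page holding the buffer */
	size_t off;		/* offset of the buffer inside that page */
	size_t size;		/* buffer length in bytes */
};

/*
 * Core of the coalescing test: the new buffer must live in the same page
 * as the previous frag and begin right at its end.
 */
static bool model_can_coalesce(const struct model_frag *prev,
			       const void *page, size_t off)
{
	return prev && page == prev->page && off == prev->off + prev->size;
}

/* Count how many of bufs[1..n-1] get merged into the frag before them. */
static size_t model_count_coalesced(const struct model_frag *bufs, size_t n)
{
	struct model_frag last;
	size_t merged = 0;

	if (n == 0)
		return 0;

	last = bufs[0];
	for (size_t i = 1; i < n; i++) {
		if (model_can_coalesce(&last, bufs[i].page, bufs[i].off)) {
			last.size += bufs[i].size;	/* extend previous frag */
			merged++;
		} else {
			last = bufs[i];			/* start a new frag */
		}
	}
	return merged;
}
```

In this model, buffers carved back to back from one comp page coalesce even if
the first one starts after a fixed headroom, while buffers that come from
different order-0 pages never do, which matches the zero count seen with
high_order_alloc_disable=1.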

Thanks.




>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 2) pre mapped DMA addresses were not reused in the case of XDP_TX/XDP_REDIRECT
> > > > > >
> > > > > > Because tx is not in premapped mode.
> > > > >
> > > > > Yes, we can optimize this on top.
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > >
> > > > > > > I see Michael has merged this series so I'm fine to let it go first.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

