qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Cc: liuyongan@huawei.com, qinchuanyu@huawei.com,
	Jason Wang <jasowang@redhat.com>,
	akong@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [QA-virtio]:Why vring size is limited to 1024?
Date: Wed, 8 Oct 2014 12:13:45 +0300
Message-ID: <20141008091345.GA3872@redhat.com>
In-Reply-To: <5434F0D3.8080504@huawei.com>

On Wed, Oct 08, 2014 at 04:07:47PM +0800, Zhangjie (HZ) wrote:
> MST, Thanks very much, I get it.
> 
> On 2014/10/8 15:37, Michael S. Tsirkin wrote:
> > On Wed, Oct 08, 2014 at 03:17:56PM +0800, Zhangjie (HZ) wrote:
> >> Thanks for your patient answer! :-)
> >>
> >> On 2014/9/30 17:33, Michael S. Tsirkin wrote:
> >>> On Tue, Sep 30, 2014 at 04:36:00PM +0800, Zhangjie (HZ) wrote:
> >>>> Hi,
> >>>> There is packet loss when we do packet forwarding in a VM,
> >>>> especially when we use dpdk to do the forwarding. Enlarging the
> >>>> vring can alleviate the problem.
> >>>
> >>> I think this has to do with the fact that dpdk disables
> >>> checksum offloading, which has the side effect of disabling
> >>> segmentation offloading.
> >>>
> >>> Please fix dpdk to support checksum offloading, and
> >>> I think the problem will go away.
> >> In some application scenarios, loss of UDP packets is not allowed,
> >> and the UDP packets are always shorter than the MTU.
> >> So we need to support high-pps (e.g. 0.3M packets/s) forwarding, and
> >> offloading cannot fix it.
> > 
> > That's the point. With UFO you get larger than MTU UDP packets:
> > http://www.linuxfoundation.org/collaborate/workgroups/networking/ufo
> But then the VM only does forwarding; it does not create new packets itself.
> Since we cannot GRO normal UDP packets, UFO cannot work when UDP packets
> come from the host NIC.

This is something I've been thinking about for a while now.
We really should add a GRO-like path for UDP; this isn't
too different from TCP.

LRO can often work with UDP too. Linux discards too much
info on LRO, but if you are writing drivers in userspace
you might be able to support this.

> > 
> > Additionally, checksum offloading reduces CPU utilization
> > and reduces the number of data copies, allowing higher pps
> > with smaller buffers.
> > 
> > > It might look like queue depth helps performance for netperf, but in
> > > real-life workloads the latency under load will suffer. With more
> > > protocols implementing tunnelling on top of UDP, such extreme
> > > bufferbloat will not be tolerated.
> > 
> >>>
> >>>
> >>>> But currently the vring size is limited to 1024, as follows:
> >>>> VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
> >>>>                             void (*handle_output)(VirtIODevice *, VirtQueue *))
> >>>> {
> >>>>     ...
> >>>>     if (i == VIRTIO_PCI_QUEUE_MAX || queue_size > VIRTQUEUE_MAX_SIZE)
> >>>>         abort();
> >>>> }
> >>>> PS: #define VIRTQUEUE_MAX_SIZE 1024
> >>>> I deleted this check and set the vring size to 2048;
> >>>> the VM started successfully and the network works too.
> >>>> So why is the vring size limited to 1024, and what does the limit affect?
> >>>>
> >>>> Thanks!
> >>>
> >>> There are several reasons for this limit.
> >>> First, the guest has to allocate a descriptor buffer of 16 * vring size bytes.
> >>> With a 1K ring that is already 16K, which might be tricky to
> >>> allocate contiguously if memory is fragmented when the device is
> >>> added by hotplug.
> >> That is very
> >>> The second issue is that we want to be able to implement
> >>> the device on top of the Linux kernel, and
> >>> a single descriptor might use all of
> >>> the virtqueue. In this case we won't be able to pass the
> >>> descriptor directly to Linux as a single iov, since
> >>> that is limited to 1K entries.
> >> For the second issue, I wonder if it is OK to set the vring size of virtio-net to larger than 1024.
> >> For networking there are at most 18 pages per skb, so it will not exceed the iov limit.
> >>>
> >>>> -- 
> >>>> Best Wishes!
> >>>> Zhang Jie
> >>> .
> >>>
> >>
> >> -- 
> >> Best Wishes!
> >> Zhang Jie
> > .
> > 
> 
> -- 
> Best Wishes!
> Zhang Jie
