From: "Michael S. Tsirkin" <mst@redhat.com>
To: Sasha Levin <levinsasha928@gmail.com>
Cc: markmc@redhat.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] virtio-ring: Use threshold for switching to indirect descriptors
Date: Tue, 29 Nov 2011 16:54:51 +0200
Message-ID: <20111129145451.GD30966@redhat.com>
In-Reply-To: <1322576464.7003.6.camel@lappy>

On Tue, Nov 29, 2011 at 04:21:04PM +0200, Sasha Levin wrote:
> > > > Need to verify the effect on block too, and do some more
> > > > benchmarks. In particular we are making the ring
> > > > in effect smaller, how will this affect small packet perf
> > > > with multiple streams?
> > > 
> > > I couldn't get good block benchmarks on my hardware. They were all over
> > > the place even when I was trying to get the baseline. I'm guessing my
> > > disk is about to kick the bucket.
> > 
> > Try using memory as a backing store.
> 
> Here are the results from fio doing random reads:
> 
> With indirect buffers:
> Run status group 0 (all jobs):
>    READ: io=2419.7MB, aggrb=126001KB/s, minb=12887KB/s, maxb=13684KB/s, mint=18461msec, maxt=19664msec
> 
> Disk stats (read/write):
>   vda: ios=612107/0, merge=0/0, ticks=37559/0, in_queue=32723, util=76.70%
> 
> Indirect buffers disabled in the host:
> Run status group 0 (all jobs):
>    READ: io=2419.7MB, aggrb=127106KB/s, minb=12811KB/s, maxb=14557KB/s, mint=17486msec, maxt=19493msec
> 
> Disk stats (read/write):
>   vda: ios=617315/0, merge=1/0, ticks=166751/0, in_queue=162807, util=88.19%

I don't know much about this; the only difference I see is that
in_queue (total time requests spent queued) is way higher.

> 
> Which is actually strange; weren't indirect buffers introduced to make
> the performance *better*? From what I see it's pretty much the same or
> worse for virtio-blk.

I know they were introduced to allow adding very large bufs.
See commit 9fa29b9df32ba4db055f3977933cd0c1b8fe67cd.
Mark, you wrote the patch; could you tell us which workloads
benefit the most from indirect bufs?

> Here's my fio test file:
> [random-read]
> rw=randread
> size=256m
> filename=/dev/vda
> ioengine=libaio
> iodepth=8
> direct=1
> invalidate=1
> numjobs=10
> > 
> > > This threshold should be dynamic and based on the number of
> > > available descriptors over time, so, for example, if the vring is
> > > 90% full over time the threshold will go up, allowing for more
> > > indirect buffers. Currently it's static, but it's a first step
> > > towards making it dynamic :)
> > > 
> > > I'll do a benchmark with small packets.
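
For illustration, the dynamic threshold described above could track ring
occupancy with a moving average. This is purely a hypothetical sketch:
the avg_used field and the update_indirect_threshold() helper are
invented here and do not exist in the driver.

/* Sample ring occupancy on each add_buf and keep an exponentially
 * weighted moving average; raise the threshold when the ring runs
 * ~90% full on average, lower it again when it runs mostly empty. */
static void update_indirect_threshold(struct vring_virtqueue *vq)
{
	unsigned int used = vq->vring.num - vq->num_free;

	vq->avg_used = (7 * vq->avg_used + used) / 8;

	if (vq->avg_used > vq->vring.num * 9 / 10)
		vq->indirect_threshold = min(vq->indirect_threshold * 2,
					     vq->vring.num);
	else if (vq->avg_used < vq->vring.num / 4)
		vq->indirect_threshold = max(vq->indirect_threshold / 2, 16U);
}
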
> > > 
> > > > A very simple test is to disable indirect buffers altogether.
> > > > qemu-kvm has a flag for this.
> > > > Is this an equivalent test?
> > > > If yes I'll try that.
> > > 
> > > Yes, it should be equivalent to qemu without that flag.
> > > 
> > > > 
> > > > 
> > > > > Cc: Rusty Russell <rusty@rustcorp.com.au>
> > > > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > > > Cc: virtualization@lists.linux-foundation.org
> > > > > Cc: kvm@vger.kernel.org
> > > > > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> > > > > ---
> > > > >  drivers/virtio/virtio_ring.c |   12 ++++++++++--
> > > > >  1 files changed, 10 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index c7a2c20..5e0ce15 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -82,6 +82,7 @@ struct vring_virtqueue
> > > > >  
> > > > >  	/* Host supports indirect buffers */
> > > > >  	bool indirect;
> > > > 
> > > > We can get rid of bool indirect now and just set
> > > > indirect_threshold to 0, right?
> > > 
> > > Yup.
> > > 
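
(A note on why that works: num_free is unsigned, so with
indirect_threshold set to 0 the patch's condition

if (vq->indirect && (out + in) > 1 &&
   (vq->num_free < vq->indirect_threshold))

can never be true, and the bool check adds nothing; the feature can be
disabled purely through the threshold.)
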
> > > > 
> > > > > +	unsigned int indirect_threshold;
> > > > 
> > > > Please add a comment. It should be something like
> > > > 'Minimum amount of free space in the ring required to trigger
> > > > direct descriptor use'.
> > > 
> > > Will do.
> > > 
> > > > 
> > > > >  
> > > > >  	/* Host publishes avail event idx */
> > > > >  	bool event;
> > > > > @@ -176,8 +177,9 @@ int virtqueue_add_buf_gfp(struct virtqueue *_vq,
> > > > >  	BUG_ON(data == NULL);
> > > > >  
> > > > >  	/* If the host supports indirect descriptor tables, and we have multiple
> > > > > -	 * buffers, then go indirect. FIXME: tune this threshold */
> > > > > -	if (vq->indirect && (out + in) > 1 && vq->num_free) {
> > > > > +	 * buffers, then go indirect. */
> > > > > +	if (vq->indirect && (out + in) > 1 &&
> > > > > +	   (vq->num_free < vq->indirect_threshold)) {
> > > > 
> > > > If num_free is 0, this will still allocate the buffer, which is
> > > > not a good idea.
> > > > 
> > > > I think there's a regression here: with a small vq, we could
> > > > previously put in an indirect descriptor, but with your patch
> > > > add_buf will fail. I think this is a real problem for block,
> > > > which was the original reason indirect bufs were introduced.
> > > 
> > > I defined the threshold so that at least 16 descriptors will be
> > > used as indirect buffers, so even with a small vq there's still a
> > > solid minimum of indirect descriptors it can use.
> > 
> > Yes, but the request size might be > 16.
> > 
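
One way to address both points, sketched here as a guess rather than
the actual fix: only go indirect when at least one slot is free, and
also whenever the request is too large to fit as direct descriptors:

/* Hypothetical variant of the condition in virtqueue_add_buf_gfp():
 * never call vring_add_indirect() on a full ring, but still go
 * indirect for requests that cannot fit directly, however small
 * the vq is. */
if (vq->indirect && (out + in) > 1 && vq->num_free &&
    (vq->num_free < vq->indirect_threshold ||
     out + in > vq->num_free)) {
	head = vring_add_indirect(vq, sg, out, in, gfp);
	if (likely(head >= 0))
		goto add_head;
}

This would keep the small-ring/large-request case working while
avoiding the allocation when the ring is completely full.
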
> > > > 
> > > > 
> > > > >  		head = vring_add_indirect(vq, sg, out, in, gfp);
> > > > >  		if (likely(head >= 0))
> > > > >  			goto add_head;
> > > > > @@ -485,6 +487,12 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
> > > > >  #endif
> > > > >  
> > > > >  	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
> > > > > +	/*
> > > > > +	 * Use indirect descriptors only when fewer than 12.5% of the
> > > > > +	 * ring's descriptors, or 16, whichever is greater, are available.
> > > > > +	 */
> > > > > +	if (vq->indirect)
> > > > > +		vq->indirect_threshold = max(16U, num >> 3);
> > > > 
> > > > Let's add some defines at top of the file please, maybe even
> > > > a module parameter.
> > > > 
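
A hypothetical sketch of such defines and a module parameter follows;
the names are invented here, not taken from the patch:

/* tunables for the indirect-descriptor threshold (hypothetical names) */
#define VRING_INDIRECT_MIN_THRESHOLD	16U
#define VRING_INDIRECT_THRESHOLD_SHIFT	3	/* num >> 3 == 12.5% */

static unsigned int indirect_min_threshold = VRING_INDIRECT_MIN_THRESHOLD;
module_param(indirect_min_threshold, uint, 0444);
MODULE_PARM_DESC(indirect_min_threshold,
	"Free-descriptor count below which indirect descriptors are used");

/* then in vring_new_virtqueue(): */
	if (vq->indirect)
		vq->indirect_threshold = max(indirect_min_threshold,
					     num >> VRING_INDIRECT_THRESHOLD_SHIFT);
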
> > > > >  	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> > > > >  
> > > > >  	/* No callback?  Tell other side not to bother us. */
> > > > > -- 
> > > > > 1.7.8.rc3
> > > 
> > > -- 
> > > 
> > > Sasha.
> 
> -- 
> 
> Sasha.
