From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>, Dor Laor <dlaor@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Vadim Rozenfeld <vrozenfe@redhat.com>
Subject: Re: [Qemu-devel] Re: [RFC][PATCH] performance improvement for windows guests, running on top of virtio block device
Date: Mon, 11 Jan 2010 09:13:23 -0600
Message-ID: <4B4B4013.9030706@codemonkey.ws>
In-Reply-To: <4B4B39D4.8060405@redhat.com>

On 01/11/2010 08:46 AM, Avi Kivity wrote:
> On 01/11/2010 04:37 PM, Anthony Liguori wrote:
>>> That has the downside of bouncing a cache line on unrelated exits.
>>
>>
>> The read and write sides of the ring are widely separated in physical 
>> memory specifically to avoid cache line bouncing.
>
> I meant, exits on random vcpus will cause the cacheline containing the 
> notification disable flag to bounce around.  As it is, we read it on 
> the vcpu that owns the queue and write it on that vcpu or the I/O thread.

Bottom halves are always run from the IO thread.

>>>   It probably doesn't matter with qemu as it is now, since it will 
>>> bounce qemu_mutex, but it will hurt with large guests (especially if 
>>> they have many rings).
>>>
>>> IMO we should get things to work well without riding on unrelated 
>>> exits, especially as we're trying to reduce those exits.
>>
>> A block I/O request can potentially be very, very long lived.  By 
>> serializing requests like this, there's a high likelihood that it's 
>> going to kill performance with anything capable of processing 
>> multiple requests.
>
> Right, that's why I suggested having a queue depth at which disabling 
> notification kicks in.  The patch hardcodes this depth to 1, unpatched 
> qemu is infinite, a good value is probably spindle count + VAT.

That means we would need a user-visible option, which is quite unfortunate.

Also, that logic only really makes sense with cache=off.  With 
cache=writethrough, you can get pathological cases where you have an 
uncached access followed by cached accesses.  In fact, with read-ahead, 
this is probably not an uncommon scenario.
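
For illustration, the queue-depth idea quoted above can be sketched roughly 
as follows.  This is a minimal model under stated assumptions, not QEMU's 
virtio code: struct ring_model, depth_limit and the helper functions are 
hypothetical names invented for the example.

```c
#include <stdbool.h>

/* Hypothetical model of one virtio ring; none of these names come from QEMU. */
struct ring_model {
    unsigned in_flight;    /* requests submitted but not yet completed */
    unsigned depth_limit;  /* depth at which guest notifications are suppressed */
    bool notify_disabled;  /* stands in for the ring's notification-disable flag */
};

/* Suppress notifications once the number of in-flight requests reaches the limit. */
static void ring_update_notify(struct ring_model *r)
{
    r->notify_disabled = r->in_flight >= r->depth_limit;
}

/* Called when the guest queues a request. */
static void ring_submit(struct ring_model *r)
{
    r->in_flight++;
    ring_update_notify(r);
}

/* Called when a request completes; may re-enable notifications. */
static void ring_complete(struct ring_model *r)
{
    if (r->in_flight > 0) {
        r->in_flight--;
    }
    ring_update_notify(r);
}
```

In this model a depth_limit of 1 corresponds to the behaviour the patch 
hardcodes, and a limit that is never reached corresponds to unpatched qemu.  
The model counts requests only, so it treats a cached and an uncached access 
as equally expensive.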

>> OTOH, if we aggressively poll the ring when we have an opportunity 
>> to, there's very little down side to that and it addresses the 
>> serialization problem.
>
> But we can't guarantee that we'll get those opportunities, so it 
> doesn't address the problem in a general way.  A guest that doesn't 
> use hpet and only has a single virtio-blk device will not have any 
> reason to exit to qemu.

We can mitigate this with a timer but honestly, we need to do perf 
measurements to see.  My feeling is that we will need some more 
aggressive form of polling than just waiting for IO completion.  I don't 
think queue depth is enough because it assumes that all requests are 
equal.  When dealing with cache=off or even just storage with its own 
cache, that's simply not the case.
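
As a rough sketch of polling from more than one entry point (a hypothetical 
model, not QEMU's implementation; struct poll_model and the function names 
are assumptions made for the example), the same poll routine could be 
driven both by IO completion and by a timer:

```c
/* Hypothetical model: the guest advances avail_idx, the host consumes up to it. */
struct poll_model {
    unsigned avail_idx;      /* advanced by the guest when it queues a request */
    unsigned last_seen_idx;  /* how far the host has consumed the ring */
};

/* Pick up everything queued since the last poll; returns the number found.
 * Unsigned subtraction keeps the count correct across index wraparound. */
static unsigned poll_ring(struct poll_model *m)
{
    unsigned found = m->avail_idx - m->last_seen_idx;
    m->last_seen_idx = m->avail_idx;
    return found;
}

/* Opportunistic poll when a request completes. */
static unsigned on_io_complete(struct poll_model *m)
{
    return poll_ring(m);
}

/* Timer-driven poll; bounds the wait when no completion or exit occurs. */
static unsigned on_timer_tick(struct poll_model *m)
{
    return poll_ring(m);
}
```

The timer interval is left out on purpose, since picking it would need the 
perf measurements mentioned above.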

Regards,

Anthony Liguori
