From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Vadim Rozenfeld <vrozenfe@redhat.com>,
Dor Laor <dlaor@redhat.com>, Christoph Hellwig <hch@lst.de>,
Avi Kivity <avi@redhat.com>, qemu-devel <qemu-devel@nongnu.org>
Subject: [Qemu-devel] Re: Re: [RFC][PATCH] performance improvement for windows guests, running on top of virtio block device
Date: Mon, 11 Jan 2010 20:22:14 +0200
Message-ID: <20100111182214.GB12121@redhat.com>
In-Reply-To: <4B4B4013.9030706@codemonkey.ws>
On Mon, Jan 11, 2010 at 09:13:23AM -0600, Anthony Liguori wrote:
> On 01/11/2010 08:46 AM, Avi Kivity wrote:
>> On 01/11/2010 04:37 PM, Anthony Liguori wrote:
>>>> That has the downside of bouncing a cache line on unrelated exits.
>>>
>>>
>>> The read and write sides of the ring are widely separated in physical
>>> memory specifically to avoid cache line bouncing.
>>
>> I meant, exits on random vcpus will cause the cacheline containing the
>> notification disable flag to bounce around. As it is, we read it on
>> the vcpu that owns the queue and write it on that vcpu or the I/O
>> thread.
>
> Bottom halves are always run from the IO thread.
>
>>>> It probably doesn't matter with qemu as it is now, since it will
>>>> bounce qemu_mutex, but it will hurt with large guests (especially
>>>> if they have many rings).
>>>>
>>>> IMO we should get things to work well without riding on unrelated
>>>> exits, especially as we're trying to reduce those exits.
>>>
>>> A block I/O request can potentially be very, very long lived. By
>>> serializing requests like this, there's a high likelihood that it's
>>> going to kill performance with anything capable of processing
>>> multiple requests.
>>
>> Right, that's why I suggested having a queue depth at which disabling
>> notification kicks in. The patch hardcodes this depth to 1, unpatched
>> qemu is infinite, a good value is probably spindle count + VAT.
>
> That means we would need a user-visible option, which is quite unfortunate.
>
> Also, that logic only really makes sense with cache=off. With
> cache=writethrough, you can get pathological cases where you have an
> uncached access followed by cached accesses. In fact, with read-ahead,
> this is probably not an uncommon scenario.
>
>>> OTOH, if we aggressively poll the ring when we have an opportunity
>>> to, there's very little down side to that and it addresses the
>>> serialization problem.
>>
>> But we can't guarantee that we'll get those opportunities, so it
>> doesn't address the problem in a general way. A guest that doesn't
>> use hpet and only has a single virtio-blk device will not have any
>> reason to exit to qemu.
>
> We can mitigate this with a timer but honestly, we need to do perf
> measurements to see. My feeling is that we will need some more
> aggressive form of polling than just waiting for IO completion. I don't
> think queue depth is enough because it assumes that all requests are
> equal. When dealing with cache=off or even just storage with its own
> cache, that's simply not the case.
>
> Regards,
>
> Anthony Liguori
>
BTW this is why vhost net uses queue depth in bytes.
--
MST
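
To make the count-versus-bytes point concrete, here is a minimal sketch of a notification-suppression threshold that can be charged either per request or per byte. It is an illustration only: `struct inflight_budget` and the `budget_*` helpers are hypothetical names, not taken from QEMU, virtio-blk, or vhost-net, and the limits in the usage note are arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue accounting; not a real QEMU or vhost structure. */
struct inflight_budget {
    size_t limit;    /* threshold: requests or bytes, depending on policy */
    size_t inflight; /* amount currently outstanding, in the same unit */
};

/* Charge a newly submitted request: cost is 1 per request, or its length in bytes. */
static void budget_submit(struct inflight_budget *b, size_t cost)
{
    b->inflight += cost;
}

/* Release the charge when the request completes (clamped at zero). */
static void budget_complete(struct inflight_budget *b, size_t cost)
{
    b->inflight -= (cost <= b->inflight) ? cost : b->inflight;
}

/* Suppress guest notifications once enough work is already outstanding. */
static bool budget_suppress_notify(const struct inflight_budget *b)
{
    return b->inflight >= b->limit;
}
```

With a count limit of 4, a 1 MiB request and a 512-byte request together charge only 2, so notifications stay enabled no matter how much data is pending. With a byte limit of, say, 256 KiB, the 1 MiB request alone crosses the threshold. The byte-based charge is one way to avoid treating all requests as equal, though it still cannot tell a cached access from an uncached one.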