qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH +stable] block: don't attempt to merge overlapping requests
@ 2010-05-18 17:18 Avi Kivity
  2010-05-18 19:22 ` [Qemu-devel] " Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Avi Kivity @ 2010-05-18 17:18 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, kvm

The block multiwrite code pretends to be able to merge overlapping requests,
but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
of a large virtio disk).

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 block.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 48305b7..0e44e26 100644
--- a/block.c
+++ b/block.c
@@ -1956,8 +1956,8 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
         int64_t oldreq_last = reqs[outidx].sector + reqs[outidx].nb_sectors;
 
         // This handles the cases that are valid for all block drivers, namely
-        // exactly sequential writes and overlapping writes.
-        if (reqs[i].sector <= oldreq_last) {
+        // exactly sequential writes
+        if (reqs[i].sector == oldreq_last) {
             merge = 1;
         }
 
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-18 17:18 [Qemu-devel] [PATCH +stable] block: don't attempt to merge overlapping requests Avi Kivity
@ 2010-05-18 19:22 ` Stefan Hajnoczi
  2010-05-18 19:37   ` Stefan Hajnoczi
  2010-05-19  8:09   ` Avi Kivity
  0 siblings, 2 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-05-18 19:22 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Kevin Wolf, qemu-devel, kvm, Christoph Hellwig

On Tue, May 18, 2010 at 6:18 PM, Avi Kivity <avi@redhat.com> wrote:
> The block multiwrite code pretends to be able to merge overlapping requests,
> but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
> of a large virtio disk).

Are overlapping write requests correct guest behavior?  I thought the
ordering semantics require a flush between overlapping writes to
ensure A is written before B.

What cache= mode are you running?

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-18 19:22 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-05-18 19:37   ` Stefan Hajnoczi
  2010-05-18 19:41     ` Michael Tokarev
  2010-05-19  8:09   ` Avi Kivity
  1 sibling, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-05-18 19:37 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Kevin Wolf, qemu-devel, kvm, Christoph Hellwig

I just caught up on mails and saw you had already mentioned that
overlapping writes from the guest look fishy in the "the >1Tb block
issue".  Cache mode might still be interesting because it affects how
guest virtio-blk chooses queue ordering mode.

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-18 19:37   ` Stefan Hajnoczi
@ 2010-05-18 19:41     ` Michael Tokarev
  2010-05-18 20:19       ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Tokarev @ 2010-05-18 19:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Christoph Hellwig, Avi Kivity, kvm, qemu-devel

18.05.2010 23:37, Stefan Hajnoczi wrote:
> I just caught up on mails and saw you had already mentioned that
> overlapping writes from the guest look fishy in the "the>1Tb block
> issue".  Cache mode might still be interesting because it affects how
> guest virtio-blk chooses queue ordering mode.

What I can say for sure is that the issue mentioned (>1Tb block)
occurs regardless of the cache mode.  I tried all 3, with exactly
the same results (well, not entirely exactly, since the prob
depends on timing too, and timing is different depending on the
cache mode), and performed all further tests with cache=writeback
since it's the mode which lets the guest to finish mkfs'ing the
1.5Tb "disk" in a reasonable time (or else it takes hours).

But actually I don't quite see where that dependency is: guest
does not know how its data is cached on host...

/mjt

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-18 19:41     ` Michael Tokarev
@ 2010-05-18 20:19       ` Stefan Hajnoczi
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-05-18 20:19 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Kevin Wolf, Christoph Hellwig, Avi Kivity, kvm, qemu-devel

On Tue, May 18, 2010 at 8:41 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> But actually I don't quite see where that dependency is: guest
> does not know how its data is cached on host...

I was thinking of this bit in linux-2.6:drivers/block/virtio_blk.c:

/* If barriers are supported, tell block layer that queue is ordered */
if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH))
    blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH,
                                  virtblk_prepare_flush);
else if (virtio_has_feature(vdev, VIRTIO_BLK_F_BARRIER))
    blk_queue_ordered(q, QUEUE_ORDERED_TAG, NULL);

I was wondering whether we have something wrong, causing the guest to
perform overlapping write requests without draining the queue or
flushing.

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-18 19:22 ` [Qemu-devel] " Stefan Hajnoczi
  2010-05-18 19:37   ` Stefan Hajnoczi
@ 2010-05-19  8:09   ` Avi Kivity
  2010-05-19  9:01     ` Stefan Hajnoczi
  1 sibling, 1 reply; 10+ messages in thread
From: Avi Kivity @ 2010-05-19  8:09 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, qemu-devel, kvm, Christoph Hellwig

On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
> On Tue, May 18, 2010 at 6:18 PM, Avi Kivity<avi@redhat.com>  wrote:
>    
>> The block multiwrite code pretends to be able to merge overlapping requests,
>> but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
>> of a large virtio disk).
>>      
> Are overlapping write requests correct guest behavior?  I thought the
> ordering semantics require a flush between overlapping writes to
> ensure A is written before B.
>
> What cache= mode are you running?
>    

writeback.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-19  8:09   ` Avi Kivity
@ 2010-05-19  9:01     ` Stefan Hajnoczi
  2010-05-19  9:06       ` Avi Kivity
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-05-19  9:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Kevin Wolf, qemu-devel, kvm, Christoph Hellwig

On Wed, May 19, 2010 at 9:09 AM, Avi Kivity <avi@redhat.com> wrote:
> On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
>> What cache= mode are you running?
>
> writeback.

In the cache=writeback case the virtio-blk guest driver does:

blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-19  9:01     ` Stefan Hajnoczi
@ 2010-05-19  9:06       ` Avi Kivity
  2010-05-19  9:23         ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Avi Kivity @ 2010-05-19  9:06 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, qemu-devel, kvm, Christoph Hellwig

On 05/19/2010 12:01 PM, Stefan Hajnoczi wrote:
> On Wed, May 19, 2010 at 9:09 AM, Avi Kivity<avi@redhat.com>  wrote:
>    
>> On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
>>      
>>> What cache= mode are you running?
>>>        
>> writeback.
>>      
> In the cache=writeback case the virtio-blk guest driver does:
>
> blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
>    

I don't follow.  What's the implication?

btw I really dislike how the cache attribute (which I see as a pure host 
choice) is exposed to the guest.  It means we can't change caching mode 
on the fly (for example after live migration), or that changing caching 
mode during a restart may expose a previously hidden guest bug.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-19  9:06       ` Avi Kivity
@ 2010-05-19  9:23         ` Stefan Hajnoczi
  2010-05-19  9:31           ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-05-19  9:23 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Kevin Wolf, qemu-devel, kvm, Christoph Hellwig

On Wed, May 19, 2010 at 10:06 AM, Avi Kivity <avi@redhat.com> wrote:
>> In the cache=writeback case the virtio-blk guest driver does:
>>
>> blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
>>
>
> I don't follow.  What's the implication?

I was wondering whether the queue is incorrectly set to a mode where
overlapping write requests aren't being ordered.  Anyway, Christoph
says overlapping write requests can be issued so my theory is broken.

> btw I really dislike how the cache attribute (which I see as a pure host
> choice) is exposed to the guest.  It means we can't change caching mode on
> the fly (for example after live migration), or that changing caching mode
> during a restart may expose a previously hidden guest bug.

Christoph has mentioned wanting to make the write cache switchable
from inside the guest.

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: [PATCH +stable] block: don't attempt to merge overlapping requests
  2010-05-19  9:23         ` Stefan Hajnoczi
@ 2010-05-19  9:31           ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2010-05-19  9:31 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Christoph Hellwig, Avi Kivity, kvm, qemu-devel

On Wed, May 19, 2010 at 10:23:44AM +0100, Stefan Hajnoczi wrote:
> On Wed, May 19, 2010 at 10:06 AM, Avi Kivity <avi@redhat.com> wrote:
> >> In the cache=writeback case the virtio-blk guest driver does:
> >>
> >> blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
> >>
> >
> > I don't follow. ?What's the implication?
> 
> I was wondering whether the queue is incorrectly set to a mode where
> overlapping write requests aren't being ordered.  Anyway, Christoph
> says overlapping write requests can be issued so my theory is broken.

They can happen, but won't during usual pagecache based writeback.  So
this should not happen for the pagecache based mke2fs workload.  It
could happen for a workload with mkfs that uses O_DIRECT.  And we
need to handle it either way.

And btw, our barrier handling for devices using multiwrite (aka virtio)
is utterly busted right now, patch will follow ASAP.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-05-19  9:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-18 17:18 [Qemu-devel] [PATCH +stable] block: don't attempt to merge overlapping requests Avi Kivity
2010-05-18 19:22 ` [Qemu-devel] " Stefan Hajnoczi
2010-05-18 19:37   ` Stefan Hajnoczi
2010-05-18 19:41     ` Michael Tokarev
2010-05-18 20:19       ` Stefan Hajnoczi
2010-05-19  8:09   ` Avi Kivity
2010-05-19  9:01     ` Stefan Hajnoczi
2010-05-19  9:06       ` Avi Kivity
2010-05-19  9:23         ` Stefan Hajnoczi
2010-05-19  9:31           ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).