* [Qemu-devel] [PATCH 0/4] data integrity fixes
From: Christoph Hellwig @ 2009-08-31 20:16 UTC
To: qemu-devel

This series contains various patches to make sure we actually get our data to
disk when the guest believes it is.
* [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 20:16 [Qemu-devel] [PATCH 0/4] data integrity fixes Christoph Hellwig @ 2009-08-31 20:16 ` Christoph Hellwig 2009-08-31 22:09 ` Jamie Lokier 2009-08-31 20:17 ` [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync Christoph Hellwig ` (2 subsequent siblings) 3 siblings, 1 reply; 33+ messages in thread From: Christoph Hellwig @ 2009-08-31 20:16 UTC (permalink / raw) To: qemu-devel Add a enable_write_cache flag in the block driver state, and use it to decide if we claim to have a volatile write cache that needs controlled flushing from the guest. Currently we only claim to have it when cache=none is specified. While that might seem wrong it actually is the case as we still have outstanding block allocations and host drive caches to flush. We do not need to claim a write cache when we use cache=writethrough because O_SYNC writes are guaranteed to have data on stable storage. We would have to claim one for data=writeback to be safe, but for I will follow Avi's opinion that it is a useless mode and should be our dedicated unsafe mode. If anyone disagrees please start the flame thrower now and I will change it. Otherwise a documentation patch will follow to explicitly document data=writeback as unsafe. Both scsi-disk and ide now use the new flage, changing from their defaults of always off (ide) or always on (scsi-disk). Signed-off-by: Christoph Hellwig <hch@lst.de> Index: qemu-kvm/hw/scsi-disk.c =================================================================== --- qemu-kvm.orig/hw/scsi-disk.c +++ qemu-kvm/hw/scsi-disk.c @@ -710,7 +710,9 @@ static int32_t scsi_send_command(SCSIDev memset(p,0,20); p[0] = 8; p[1] = 0x12; - p[2] = 4; /* WCE */ + if (bdrv_enable_write_cache(s->bdrv)) { + p[2] = 4; /* WCE */ + } p += 20; } if ((page == 0x3f || page == 0x2a) Index: qemu-kvm/block.c =================================================================== --- qemu-kvm.orig/block.c +++ qemu-kvm/block.c @@ -408,6 +408,16 @@ int bdrv_open2(BlockDriverState *bs, con } bs->drv = drv; bs->opaque = qemu_mallocz(drv->instance_size); + + /* + * Yes, BDRV_O_NOCACHE aka O_DIRECT means we have to present a + * write cache to the guest. We do need the fdatasync to flush + * out transactions for block allocations, and we maybe have a + * volatile write cache in our backing device to deal with. + */ + if (flags & BDRV_O_NOCACHE) + bs->enable_write_cache = 1; + /* Note: for compatibility, we open disk image files as RDWR, and RDONLY as fallback */ if (!(flags & BDRV_O_FILE)) @@ -918,6 +928,11 @@ int bdrv_is_sg(BlockDriverState *bs) return bs->sg; } +int bdrv_enable_write_cache(BlockDriverState *bs) +{ + return bs->enable_write_cache; +} + /* XXX: no longer used */ void bdrv_set_change_cb(BlockDriverState *bs, void (*change_cb)(void *opaque), void *opaque) Index: qemu-kvm/block_int.h =================================================================== --- qemu-kvm.orig/block_int.h +++ qemu-kvm/block_int.h @@ -152,6 +152,9 @@ struct BlockDriverState { /* the memory alignment required for the buffers handled by this driver */ int buffer_alignment; + /* do we need to tell the quest if we have a volatile write cache? */ + int enable_write_cache; + /* NOTE: the following infos are only hints for real hardware drivers. 
They are not used by the block driver */ int cyls, heads, secs, translation; Index: qemu-kvm/block.h =================================================================== --- qemu-kvm.orig/block.h +++ qemu-kvm/block.h @@ -120,6 +120,7 @@ int bdrv_get_translation_hint(BlockDrive int bdrv_is_removable(BlockDriverState *bs); int bdrv_is_read_only(BlockDriverState *bs); int bdrv_is_sg(BlockDriverState *bs); +int bdrv_enable_write_cache(BlockDriverState *bs); int bdrv_is_inserted(BlockDriverState *bs); int bdrv_media_changed(BlockDriverState *bs); int bdrv_is_locked(BlockDriverState *bs); Index: qemu-kvm/hw/ide/core.c =================================================================== --- qemu-kvm.orig/hw/ide/core.c +++ qemu-kvm/hw/ide/core.c @@ -148,8 +148,11 @@ static void ide_identify(IDEState *s) put_le16(p + 83, (1 << 14) | (1 << 13) | (1 <<12) | (1 << 10)); /* 14=set to 1, 1=SMART self test, 0=SMART error logging */ put_le16(p + 84, (1 << 14) | 0); - /* 14 = NOP supported, 0=SMART feature set enabled */ - put_le16(p + 85, (1 << 14) | 1); + /* 14 = NOP supported, 5=WCACHE enabled, 0=SMART feature set enabled */ + if (bdrv_enable_write_cache(s->bs)) + put_le16(p + 85, (1 << 14) | (1 << 5) | 1); + else + put_le16(p + 85, (1 << 14) | 1); /* 13=flush_cache_ext,12=flush_cache,10=lba48 */ put_le16(p + 86, (1 << 14) | (1 << 13) | (1 <<12) | (1 << 10)); /* 14=set to 1, 1=smart self test, 0=smart error logging */ ^ permalink raw reply [flat|nested] 33+ messages in thread
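The guest-visible effect of the ide_identify() change above is that bit 5 of IDENTIFY word 85 now reports whether a write cache is enabled. A minimal sketch of how a guest or test harness could decode that bit from a 256-word IDENTIFY DEVICE buffer follows; it is illustrative only and not part of the patch.

#include <stdint.h>

/* word 85, bit 5: volatile write cache currently enabled, matching the
 * put_le16(p + 85, (1 << 14) | (1 << 5) | 1) in ide_identify() above */
static int ide_write_cache_enabled(const uint16_t *id /* 256 words */)
{
    return (id[85] >> 5) & 1;
}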
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Jamie Lokier @ 2009-08-31 22:09 UTC
To: Christoph Hellwig; +Cc: qemu-devel

Christoph Hellwig wrote:
> We would have to claim one for data=writeback to be safe, but for I will
> follow Avi's opinion that it is a useless mode and should be our
> dedicated unsafe mode.  If anyone disagrees please start the flame
> thrower now and I will change it.  Otherwise a documentation patch will
> follow to explicitly document data=writeback as unsafe.

*Opening flame thrower!*

Unsafe, useless?  It's the most useful mode when you're starting and
stopping VMs regularly, or if you can't use O_DIRECT.  It's safe if
fdatasync is called - in other words not advertising a write cache is
silly.

I haven't measured but I'd expect it to be much faster than O_SYNC on
some host hardware, for the same reason that barriers + volatile write
cache are much faster on some host hardware than disabling the write
cache.

Right now, on a Linux host O_SYNC is unsafe with hardware that has a
volatile write cache.  That might not be changed, but if it is then
performance with cache=writethrough will plummet (due to issuing a
CACHE FLUSH to the hardware after every write), while performance with
cache=writeback will be reasonable.

If an unsafe mode is desired (I think it is, for those throwaway
testing VMs, or during OS installs), I suggest adding cache=volatile:

    cache=none
        O_DIRECT, fdatasync, advertise volatile write cache

    cache=writethrough
        O_SYNC, do not advertise

    cache=writeback
        fdatasync, advertise volatile write cache

    cache=volatile
        nothing (perhaps fdatasync on QEMU blockdev close)

When using guest OSes which issue CACHE FLUSH commands (that's a guest
config issue), why would you ever use cache=writethrough?
cache=writeback should be faster and equally safe - provided you do
actually advertise the write cache!  So please do!  Not doing so is
silly.

-- Jamie
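For illustration, the proposed table could translate into something like the following inside the block layer. This is only a sketch of the mapping, not the actual qemu option parsing code; parse_cache_mode() and struct cache_policy are invented names.

#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <string.h>

struct cache_policy {
    int open_flags;             /* extra flags for open(2) on the image  */
    int flush_backs_wcache;     /* fdatasync() when the guest flushes    */
    int advertise_wcache;       /* tell the guest there is a write cache */
};

static int parse_cache_mode(const char *mode, struct cache_policy *p)
{
    if (!strcmp(mode, "none")) {
        p->open_flags = O_DIRECT;
        p->flush_backs_wcache = 1;
        p->advertise_wcache = 1;
    } else if (!strcmp(mode, "writethrough")) {
        p->open_flags = O_SYNC;
        p->flush_backs_wcache = 0;
        p->advertise_wcache = 0;
    } else if (!strcmp(mode, "writeback")) {
        p->open_flags = 0;
        p->flush_backs_wcache = 1;
        p->advertise_wcache = 1;
    } else if (!strcmp(mode, "volatile")) {
        p->open_flags = 0;          /* no O_DIRECT/O_SYNC ...           */
        p->flush_backs_wcache = 0;  /* ... and guest flushes are no-ops */
        p->advertise_wcache = 0;
    } else {
        return -1;
    }
    return 0;
}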
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Christoph Hellwig @ 2009-08-31 22:16 UTC
To: Jamie Lokier; +Cc: Christoph Hellwig, qemu-devel

On Mon, Aug 31, 2009 at 11:09:50PM +0100, Jamie Lokier wrote:
> Right now, on a Linux host O_SYNC is unsafe with hardware that has a
> volatile write cache.  That might not be changed, but if it is then
> performance with cache=writethrough will plummet (due to issuing a
> CACHE FLUSH to the hardware after every write), while performance with
> cache=writeback will be reasonable.

Currently all modes are more or less unsafe with volatile write caches,
at least when using ext3 or raw block device accesses.  XFS is safe,
two thirds due to doing the right thing and one third due to sheer luck.

> If an unsafe mode is desired (I think it is, for those throwaway
> testing VMs, or during OS installs), I suggest adding cache=volatile:
>
>     cache=none
>         O_DIRECT, fdatasync, advertise volatile write cache
>
>     cache=writethrough
>         O_SYNC, do not advertise
>
>     cache=writeback
>         fdatasync, advertise volatile write cache
>
>     cache=volatile
>         nothing (perhaps fdatasync on QEMU blockdev close)

Fine with me, let the flame war begin :)

> When using guest OSes which issue CACHE FLUSH commands (that's a
> guest config issue), why would you ever use cache=writethrough?
> cache=writeback should be faster and equally safe - provided you do
> actually advertise the write cache!

And provided the guest OS actually issues cache flushes when it should,
something that at least Linux historically utterly failed at, and some
other operating systems haven't even tried.

cache=writethrough is the equivalent of turning off the volatile write
cache of the real disk.  It might be slower (which isn't even always
the case for real disks), but it is much safer.  E.g. if you want to
move your old SCO Unix box into a VM it's the only safe option.
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Jamie Lokier @ 2009-08-31 22:46 UTC
To: Christoph Hellwig; +Cc: qemu-devel

Christoph Hellwig wrote:
> On Mon, Aug 31, 2009 at 11:09:50PM +0100, Jamie Lokier wrote:
> > Right now, on a Linux host O_SYNC is unsafe with hardware that has a
> > volatile write cache.  That might not be changed, but if it is then
> > performance with cache=writethrough will plummet (due to issuing a
> > CACHE FLUSH to the hardware after every write), while performance with
> > cache=writeback will be reasonable.
>
> Currently all modes are more or less unsafe with volatile write caches,
> at least when using ext3 or raw block device accesses.  XFS is safe,
> two thirds due to doing the right thing and one third due to sheer luck.

Right, but now you've made it worse.  By not calling fdatasync at all,
you've reduced the integrity.  Previously it would reach the drive's
cache, and take whatever (short) time it took to reach the platter.
Now you're leaving data in the host cache which can stay for much
longer, and is vulnerable to host kernel crashes.

Oh, and QEMU could call whatever "hdparm -F" does when using raw block
devices ;-)

> > nothing (perhaps fdatasync on QEMU blockdev close)
>
> Fine with me, let the flame war begin :)

Well I'd like to start by pointing out your patch introduces a
regression in the combination cache=writeback with emulated SCSI,
because it effectively removes the fdatasync calls in that case :-)
So please amend the patch before it gets applied lest such silliness
be propagated.  Thanks :-)

I've actually been using cache=writeback with emulated IDE on deployed
server VMs, assuming that worked with KVM.  It's been an eye opener to
find that it was broken all along because the driver failed to set the
"has write cache" bit.  Thank you for the detective work.

It goes to show no matter how hard we try, data integrity is a
slippery thing where getting it wrong does not show up under normal
circumstances, only during catastrophic system failures.

Ironically, with emulated SCSI, I used cache=writethrough, thinking
guests would not issue CACHE FLUSH commands over SCSI because
historically performance has been reached by having overlapping writes
instead.

> > When using guest OSes which issue CACHE FLUSH commands (that's a
> > guest config issue), why would you ever use cache=writethrough?
> > cache=writeback should be faster and equally safe - provided you do
> > actually advertise the write cache!
>
> And provided the guest OS actually issues cache flushes when it should,
> something that at least Linux historically utterly failed at, and some
> other operating systems haven't even tried.

For the hosts, yes - fsync/fdatasync/O_SYNC/O_DIRECT all utterly fail.
(Afaict, Windows hosts do it in some combinations.)  But I'd like to
think we're about to fix Linux hosts soon, thanks to your good work on
that elsewhere.

For guests, Linux has been good at issuing the necessary flushes for
ordinary journalling (in ext3's case, provided barrier=1 is given in
mount options), which is quite important.  It failed with fsync, which
is also important to applications, but filesystem integrity is the
most important thing and it's been good at that for many years.

> cache=writethrough is the equivalent of turning off the volatile
> write cache of the real disk.  It might be slower (which isn't even
> always the case for real disks), but it is much safer.

When O_SYNC is made to flush the hardware cache on Linux hosts, it will
be excruciatingly slow: it'll have to seek twice for every write.  Once
for the data, once for the inode update.  That's another reason O_DSYNC
is important.

> E.g. if you want to move your old SCO Unix box into a VM it's the
> only safe option.

I agree, and for that reason, cache=writethrough or cache=none are the
only reasonable defaults.

By the way, all this has led me to another idea...  We may find that
O_SYNC is slower than batching several writes followed by one fdatasync
whose completion allows several writes to report as completed, when
emulating SCSI or virtio-blk (anything which allows overlapping write
commands from the guest).

-- Jamie
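The batching idea in the last paragraph could look roughly like this: guest writes are acknowledged only after the next fdatasync() returns, so one flush retires a whole batch of overlapping requests. All names here are invented for illustration; this is not code from the series.

#include <errno.h>
#include <unistd.h>

struct pending_write {
    void (*complete)(void *opaque, int ret);
    void *opaque;
    struct pending_write *next;
};

static struct pending_write *batch;     /* writes waiting for the next flush */

static void queue_for_flush(struct pending_write *w)
{
    w->next = batch;
    batch = w;
}

static void flush_batch(int fd)
{
    int ret = fdatasync(fd) ? -errno : 0;

    while (batch) {
        struct pending_write *w = batch;

        batch = w->next;
        w->complete(w->opaque, ret);    /* data is on stable storage now */
    }
}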
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Christoph Hellwig @ 2009-08-31 23:06 UTC
To: Jamie Lokier; +Cc: Christoph Hellwig, qemu-devel

On Mon, Aug 31, 2009 at 11:46:45PM +0100, Jamie Lokier wrote:
> > On Mon, Aug 31, 2009 at 11:09:50PM +0100, Jamie Lokier wrote:
> > > Right now, on a Linux host O_SYNC is unsafe with hardware that has a
> > > volatile write cache.  That might not be changed, but if it is then
> > > performance with cache=writethrough will plummet (due to issuing a
> > > CACHE FLUSH to the hardware after every write), while performance with
> > > cache=writeback will be reasonable.
> >
> > Currently all modes are more or less unsafe with volatile write caches,
> > at least when using ext3 or raw block device accesses.  XFS is safe,
> > two thirds due to doing the right thing and one third due to sheer luck.
>
> Right, but now you've made it worse.  By not calling fdatasync at all,
> you've reduced the integrity.  Previously it would reach the drive's
> cache, and take whatever (short) time it took to reach the platter.
> Now you're leaving data in the host cache which can stay for much
> longer, and is vulnerable to host kernel crashes.

Your last comment is for data=writeback, which in Avi's proposal that I
implemented would indeed lose any guarantees and be for all practical
matters unsafe.  It's not true for any of the other options.

> Oh, and QEMU could call whatever "hdparm -F" does when using raw block
> devices ;-)

Actually for ide/scsi implementing cache control is on my todo list.
Not sure about virtio yet.

> Well I'd like to start by pointing out your patch introduces a
> regression in the combination cache=writeback with emulated SCSI,
> because it effectively removes the fdatasync calls in that case :-)

Yes, you already pointed this out above.

> It goes to show no matter how hard we try, data integrity is a
> slippery thing where getting it wrong does not show up under normal
> circumstances, only during catastrophic system failures.

Honestly, it should not.  Digging through all this was a bit of work,
but I was extremely surprised how carelessly most people that touched
it before were.  It's not rocket science and can be tested quite easily
using various tools - qemu being the easiest nowadays, but scsi_debug
or an instrumented iscsi target would do the same thing.

> It failed with fsync, which
> is also important to applications, but filesystem integrity is the
> most important thing and it's been good at that for many years.

Users might disagree with that.  With my user hat on I couldn't care
less what state the internal metadata is in as long as I get back my
data, which the OS has guaranteed me has reached the disk after a
successful fsync/fdatasync/O_SYNC write.

> > E.g. if you want to move your old SCO Unix box into a VM it's the
> > only safe option.
>
> I agree, and for that reason, cache=writethrough or cache=none are the
> only reasonable defaults.

Despite the extremely misleading name, cache=none is _NOT_ an
alternative, unless we make it open the image using O_DIRECT|O_SYNC.
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 23:06 ` Christoph Hellwig @ 2009-09-01 10:38 ` Jamie Lokier 0 siblings, 0 replies; 33+ messages in thread From: Jamie Lokier @ 2009-09-01 10:38 UTC (permalink / raw) To: Christoph Hellwig; +Cc: qemu-devel Christoph Hellwig wrote: > > Oh, and QEMU could call whatever "hdparm -F" does when using raw block > > devices ;-) > > Actually for ide/scsi implementing cache control is on my todo list. > Not sure about virtio yet. I think hdparm -f -F does for some block devices what fdatasync ideally does for files. What I was getting at was until we have perfect fdatasync on block devices for Linux, QEMU could use the blockdev ioctls to accomplish the same thing on older kernels. > > It goes to show no matter how hard we try, data integrity is a > > slippery thing where getting it wrong does not show up under normal > > circumstances, only during catastrophic system failures. > > Honestly, it should not. Digging through all this was a bit of work, > but I was extremly how carelessly most people that touched it before > were. It's not rocket sciense and can be tested quite easily using > various tools - qemu beeing the easiest nowdays but scsi_debug or > an instrumented iscsi target would do the same thing. Oh I agree - we have increasingly good debugging tools. What's missing is a dirty script^H^H^H^H^H^H a good validation test which stresses the various combinations of ways to sync data on block devices and various filesystems, and various types of emulated hardware with/without caches enabled, and various mount options, and checks the I/O does what is desired in every case. > > It failed with fsync, which is also important to applications, but > > filesystem integrity is the most important thing and it's been > > good at that for many years. > > Users might disagree with that. With my user hat on I couldn't care > less on what state the internal metadata is as long as I get back at > my data which the OS has guaranteed me to reach the disk after a > successfull fsync/fdatasync/O_SYNC write. I guess it depends what you're doing. I've observed more instances of filesystem corruption due to lack of barriers, resulting in an inability to find files, than I've ever noticed missing data inside files, but then I hardly ever keep large amounts of data in databases. And I get so much mail I wouldn't notice if a few got lost ;-) > > > E.g. if you want to move your old SCO Unix box into a VM it's the > > > only safe option. > > > > I agree, and for that reason, cache=writethrough or cache=none are the > > only reasonable defaults. > > despite the extremly misleading name cache=none is _NOT_ an alternative, > unless we make it open the image using O_DIRECT|O_SYNC. Good point about the misleading name, and good point about O_DIRECT being insufficient too. For a safe emulation default with reasonable performance, I wonder if it would work to emulate drive cache _off_ at the beginning, but with the capability for the guest to enable it? The theory is that old guests don't know about drive caches and will leave it off and be safe (getting O_DSYNC or O_DIRECT|O_DSYNC)[*], and newer guests will turn it on if they also implement barriers (getting nothing or O_DIRECT, and fdatasync when they issue barriers). Do you think that would work with typical guests we know about? 
[*] - O_DSYNC as opposed to O_SYNC strikes me as important once proper cache flushes are implemented, as it may behave very similar to real hardware when doing data overwrites, whereas O_SYNC should seek back and forth between the data and inode areas for every write, if it's updating it's nanosecond timestamps correctly. -- Jamie ^ permalink raw reply [flat|nested] 33+ messages in thread
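For reference, "hdparm -F" boils down to issuing an ATA FLUSH CACHE command through the HDIO_DRIVE_CMD ioctl, which is the blockdev fallback hinted at above for hosts without a working fdatasync on block devices. The sketch below assumes a Linux host and a raw IDE/SATA device node; it is not part of the patch series.

#include <sys/ioctl.h>
#include <linux/hdreg.h>

static int flush_drive_cache(int fd)
{
    /* args[0] is the ATA command; the remaining registers stay zero */
    unsigned char args[4] = { WIN_FLUSH_CACHE, 0, 0, 0 };

    return ioctl(fd, HDIO_DRIVE_CMD, args);
}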
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 22:16 ` Christoph Hellwig 2009-08-31 22:46 ` Jamie Lokier @ 2009-08-31 22:53 ` Anthony Liguori 2009-08-31 22:55 ` Jamie Lokier ` (3 more replies) 1 sibling, 4 replies; 33+ messages in thread From: Anthony Liguori @ 2009-08-31 22:53 UTC (permalink / raw) To: Christoph Hellwig; +Cc: qemu-devel Christoph Hellwig wrote: >> If an unsafe mode is desired (I think it is, for those throwaway >> testing VMs, or during OS installs), I suggest adding cache=volatile: >> >> cache=none >> O_DIRECT, fdatasync, advertise volatile write cache >> >> cache=writethrough >> O_SYNC, do not advertise >> >> cache=writeback >> fdatasync, advertise volatile write cache >> >> cache=volatile >> nothing (perhaps fdatasync on QEMU blockdev close) >> > > Fine withe me, let the flame war begin :) > I think we should pity our poor users and avoid adding yet another obscure option that is likely to be misunderstood. Can someone do some benchmarking with cache=writeback and fdatasync first and quantify what the real performance impact is? I think the two reasonable options are 1) make cache=writeback safe, avoid a massive perf decrease in the process 2) keep cache=writeback as a no-guarantees option. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 22:53 ` Anthony Liguori @ 2009-08-31 22:55 ` Jamie Lokier 2009-08-31 22:58 ` Christoph Hellwig ` (2 subsequent siblings) 3 siblings, 0 replies; 33+ messages in thread From: Jamie Lokier @ 2009-08-31 22:55 UTC (permalink / raw) To: Anthony Liguori; +Cc: Christoph Hellwig, qemu-devel Anthony Liguori wrote: > Christoph Hellwig wrote: > >>If an unsafe mode is desired (I think it is, for those throwaway > >>testing VMs, or during OS installs), I suggest adding cache=volatile: > >> > >> cache=none > >> O_DIRECT, fdatasync, advertise volatile write cache > >> > >> cache=writethrough > >> O_SYNC, do not advertise > >> > >> cache=writeback > >> fdatasync, advertise volatile write cache > >> > >> cache=volatile > >> nothing (perhaps fdatasync on QEMU blockdev close) > >> > > > >Fine withe me, let the flame war begin :) > > > > I think we should pity our poor users and avoid adding yet another > obscure option that is likely to be misunderstood. > > Can someone do some benchmarking with cache=writeback and fdatasync > first and quantify what the real performance impact is? > > I think the two reasonable options are 1) make cache=writeback safe, > avoid a massive perf decrease in the process 2) keep cache=writeback as > a no-guarantees option. Right now, cache=writeback does set the bit for SCSI emulation, which makes it safe for guests which understand that. Removing that is a regression in safety, not merely a lack of change. -- Jamie ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 22:53 ` Anthony Liguori 2009-08-31 22:55 ` Jamie Lokier @ 2009-08-31 22:58 ` Christoph Hellwig 2009-08-31 22:59 ` Jamie Lokier 2009-09-02 3:53 ` Christoph Hellwig 3 siblings, 0 replies; 33+ messages in thread From: Christoph Hellwig @ 2009-08-31 22:58 UTC (permalink / raw) To: Anthony Liguori; +Cc: Christoph Hellwig, qemu-devel On Mon, Aug 31, 2009 at 05:53:23PM -0500, Anthony Liguori wrote: > I think we should pity our poor users and avoid adding yet another > obscure option that is likely to be misunderstood. > > Can someone do some benchmarking with cache=writeback and fdatasync > first and quantify what the real performance impact is? I can run benchmarks. Any workloads that you are particularly looking for? ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 22:53 ` Anthony Liguori 2009-08-31 22:55 ` Jamie Lokier 2009-08-31 22:58 ` Christoph Hellwig @ 2009-08-31 22:59 ` Jamie Lokier 2009-08-31 23:06 ` Christoph Hellwig 2009-09-02 3:53 ` Christoph Hellwig 3 siblings, 1 reply; 33+ messages in thread From: Jamie Lokier @ 2009-08-31 22:59 UTC (permalink / raw) To: Anthony Liguori; +Cc: Christoph Hellwig, qemu-devel Anthony Liguori wrote: > Can someone do some benchmarking with cache=writeback and fdatasync > first and quantify what the real performance impact is? Unfortunately we can't yet quantify the impact on the hardware I care about (ordinary consumer PCs with non-NCQ SATA disks), because Linux hosts don't *yet* implement O_SYNC or fdatasync properly. I would expect the performance difference to be much more significant after those are implemented on the host. -- Jamie ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 22:59 ` Jamie Lokier @ 2009-08-31 23:06 ` Christoph Hellwig 2009-08-31 23:09 ` Christoph Hellwig 0 siblings, 1 reply; 33+ messages in thread From: Christoph Hellwig @ 2009-08-31 23:06 UTC (permalink / raw) To: Jamie Lokier; +Cc: Christoph Hellwig, qemu-devel On Mon, Aug 31, 2009 at 11:59:25PM +0100, Jamie Lokier wrote: > Anthony Liguori wrote: > > Can someone do some benchmarking with cache=writeback and fdatasync > > first and quantify what the real performance impact is? > > Unfortunately we can't yet quantify the impact on the hardware I care > about (ordinary consumer PCs with non-NCQ SATA disks), because Linux > hosts don't *yet* implement O_SYNC or fdatasync properly. They do if you use XFS. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag 2009-08-31 23:06 ` Christoph Hellwig @ 2009-08-31 23:09 ` Christoph Hellwig 0 siblings, 0 replies; 33+ messages in thread From: Christoph Hellwig @ 2009-08-31 23:09 UTC (permalink / raw) To: Jamie Lokier; +Cc: Christoph Hellwig, qemu-devel On Tue, Sep 01, 2009 at 01:06:46AM +0200, Christoph Hellwig wrote: > On Mon, Aug 31, 2009 at 11:59:25PM +0100, Jamie Lokier wrote: > > Anthony Liguori wrote: > > > Can someone do some benchmarking with cache=writeback and fdatasync > > > first and quantify what the real performance impact is? > > > > Unfortunately we can't yet quantify the impact on the hardware I care > > about (ordinary consumer PCs with non-NCQ SATA disks), because Linux > > hosts don't *yet* implement O_SYNC or fdatasync properly. > > They do if you use XFS. And data-writeback + fdatasync also when using reiserfs. For ext3/4 you need the patches I sent out today, for O_SYNC on everythig but XFS also Jan Kara's patch series. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Christoph Hellwig @ 2009-09-02 3:53 UTC
To: Anthony Liguori; +Cc: Christoph Hellwig, qemu-devel

On Mon, Aug 31, 2009 at 05:53:23PM -0500, Anthony Liguori wrote:
> I think we should pity our poor users and avoid adding yet another
> obscure option that is likely to be misunderstood.
>
> Can someone do some benchmarking with cache=writeback and fdatasync
> first and quantify what the real performance impact is?

Some preliminary numbers, because they are very interesting.  Note that
this is on a raid controller, not cheap ide disks.  To make up for that
I used an image file on ext3, which due to its horrible fsync
performance should be kind of a worst case.  All these runs are with
Linux 2.6.31-rc8 + my various barrier fixes on guest and host, using
ext3 with barrier=1 on both.

A kernel defconfig compile takes between 9m40s and 9m42s with
data=writeback and barriers disabled, and with fdatasync barriers
enabled it actually is minimally faster, between 9m38s and 9m39s
(given that I've only done three runs each this might fall under
the boundary for measurement tolerances).

For comparison, the raw block device node with cache=none (just one
run) is 9m36.759s, which is not far apart.  A completely native run is
7m39.326s, btw - and I fear much of the slowdown in KVM isn't I/O
related.
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Anthony Liguori @ 2009-09-02 13:13 UTC
To: Christoph Hellwig; +Cc: qemu-devel

Christoph Hellwig wrote:
> On Mon, Aug 31, 2009 at 05:53:23PM -0500, Anthony Liguori wrote:
>> I think we should pity our poor users and avoid adding yet another
>> obscure option that is likely to be misunderstood.
>>
>> Can someone do some benchmarking with cache=writeback and fdatasync
>> first and quantify what the real performance impact is?
>
> Some preliminary numbers, because they are very interesting.  Note that
> this is on a raid controller, not cheap ide disks.  To make up for that
> I used an image file on ext3, which due to its horrible fsync
> performance should be kind of a worst case.  All these runs are with
> Linux 2.6.31-rc8 + my various barrier fixes on guest and host, using
> ext3 with barrier=1 on both.

Does barrier=0 make a performance difference?  IOW, would the typical
default ext3 deployment show worse behavior?

> A kernel defconfig compile takes between 9m40s and 9m42s with
> data=writeback and barriers disabled, and with fdatasync barriers
> enabled it actually is minimally faster,

Is fdatasync different from fsync on ext3?  Does it result in a full
metadata commit?

If we think these numbers make sense, then I'd vote for enabling
fdatasync in master and we'll see if there are any corner cases.

> between 9m38s and 9m39s
> (given that I've only done three runs each this might fall under
> the boundary for measurement tolerances).
>
> For comparison, the raw block device node with cache=none (just one
> run) is 9m36.759s, which is not far apart.  A completely native run is
> 7m39.326s, btw - and I fear much of the slowdown in KVM isn't I/O
> related.

If you're on pre-NHM or BCN then the slowdown from shadow paging would
be expected.

Regards,

Anthony Liguori
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Christoph Hellwig @ 2009-09-02 14:14 UTC
To: Anthony Liguori; +Cc: Christoph Hellwig, qemu-devel

On Wed, Sep 02, 2009 at 08:13:52AM -0500, Anthony Liguori wrote:
> Does barrier=0 make a performance difference?  IOW, would the typical
> default ext3 deployment show worse behavior?

I'll give it a spin.

> > A kernel defconfig compile takes between 9m40s and 9m42s with
> > data=writeback and barriers disabled, and with fdatasync barriers
> > enabled it actually is minimally faster,
>
> Is fdatasync different from fsync on ext3?  Does it result in a full
> metadata commit?

ext3 honors the fdatasync flag and only does its horrible job if the
metadata we care about is dirty, that is only if some non-timestamp
metadata is dirty.  Which means for a non-sparse image file it does
well, while for a sparse image file needing allocations it will cause
trouble.

And now that you mention it, I've only tested the non-sparse case,
which is now the default for the management tools at least on Fedora.
* Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Christoph Hellwig @ 2009-09-02 19:49 UTC
To: Anthony Liguori; +Cc: Christoph Hellwig, qemu-devel

On Wed, Sep 02, 2009 at 08:13:52AM -0500, Anthony Liguori wrote:
> > performance should be kind of a worst case.  All these runs are with
> > Linux 2.6.31-rc8 + my various barrier fixes on guest and host, using
> > ext3 with barrier=1 on both.
>
> Does barrier=0 make a performance difference?  IOW, would the typical
> default ext3 deployment show worse behavior?

Note for this typical ext3 deployment the barrier patches are kinda
useless, because we still don't have any data integrity guarantees at
all.  Anyway, here are the numbers with barrier=0 on host and guest:

data=writeback, no write cache advertised:
    9m37.890s, 9m38.303s, 9m38.423s, 9m38.861s, 9m39.599s

data=writeback, write cache advertised (and backed by fdatasync):
    9m39.649s, 9m39.772s, 9m40.149s, 9m41.737s, 9m41.996s
* [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-08-31 20:16 [Qemu-devel] [PATCH 0/4] data integrity fixes Christoph Hellwig 2009-08-31 20:16 ` [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag Christoph Hellwig @ 2009-08-31 20:17 ` Christoph Hellwig 2009-08-31 21:51 ` Jamie Lokier 2009-09-01 15:59 ` Blue Swirl 2009-08-31 20:17 ` [Qemu-devel] [PATCH 3/4] block: add bdrv_aio_flush operation Christoph Hellwig 2009-08-31 20:18 ` [Qemu-devel] [PATCH 4/4] virtio-blk: add volatile writecache feature Christoph Hellwig 3 siblings, 2 replies; 33+ messages in thread From: Christoph Hellwig @ 2009-08-31 20:17 UTC (permalink / raw) To: qemu-devel If we are flushing the caches for our image files we only care about the data (including the metadata required for accessing it) but not things like timestamp updates. So use fdatasync instead of fsync to implement the flush operations. Signed-off-by: Christoph Hellwig <hch@lst.de> Index: qemu/block/cow.c =================================================================== --- qemu.orig/block/cow.c 2009-06-07 16:18:21.942938914 -0300 +++ qemu/block/cow.c 2009-08-31 16:49:55.509043084 -0300 @@ -258,7 +258,7 @@ static int cow_create(const char *filena static void cow_flush(BlockDriverState *bs) { BDRVCowState *s = bs->opaque; - fsync(s->fd); + fdatasync(s->fd); } static QEMUOptionParameter cow_create_options[] = { Index: qemu/block/raw-posix.c =================================================================== --- qemu.orig/block/raw-posix.c 2009-08-29 14:17:23.039370199 -0300 +++ qemu/block/raw-posix.c 2009-08-31 16:49:55.513071598 -0300 @@ -723,7 +723,7 @@ static int raw_create(const char *filena static void raw_flush(BlockDriverState *bs) { BDRVRawState *s = bs->opaque; - fsync(s->fd); + fdatasync(s->fd); } ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-08-31 20:17 ` [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync Christoph Hellwig @ 2009-08-31 21:51 ` Jamie Lokier 2009-08-31 21:55 ` Christoph Hellwig 2009-09-01 15:59 ` Blue Swirl 1 sibling, 1 reply; 33+ messages in thread From: Jamie Lokier @ 2009-08-31 21:51 UTC (permalink / raw) To: Christoph Hellwig; +Cc: qemu-devel Christoph Hellwig wrote: > > If we are flushing the caches for our image files we only care about the > data (including the metadata required for accessing it) but not things > like timestamp updates. So use fdatasync instead of fsync to implement > the flush operations. > - fsync(s->fd); > + fdatasync(s->fd); I believe fsync was used because of uncertainty about whether fdatasync reliably flushes the necessary metadata to access the data on all hosts, after things like writing to holes and extending a file. I'm still not sure if fdatasync provides that guarantee on Linux. -- Jamie ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync
From: Christoph Hellwig @ 2009-08-31 21:55 UTC
To: Jamie Lokier; +Cc: qemu-devel

On Mon, Aug 31, 2009 at 10:51:38PM +0100, Jamie Lokier wrote:
> Christoph Hellwig wrote:
> >
> > If we are flushing the caches for our image files we only care about the
> > data (including the metadata required for accessing it) but not things
> > like timestamp updates.  So use fdatasync instead of fsync to implement
> > the flush operations.
>
> > -    fsync(s->fd);
> > +    fdatasync(s->fd);
>
> I believe fsync was used because of uncertainty about whether
> fdatasync reliably flushes the necessary metadata to access the data
> on all hosts, after things like writing to holes and extending a file.
>
> I'm still not sure if fdatasync provides that guarantee on Linux.

fdatasync is defined to provide that guarantee, and modulo the whole
clusterfuck around volatile write caches it does the right thing on
common Linux filesystems.
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-08-31 21:55 ` Christoph Hellwig @ 2009-08-31 22:48 ` Jamie Lokier 2009-08-31 22:57 ` Christoph Hellwig 0 siblings, 1 reply; 33+ messages in thread From: Jamie Lokier @ 2009-08-31 22:48 UTC (permalink / raw) To: Christoph Hellwig; +Cc: qemu-devel Christoph Hellwig wrote: > On Mon, Aug 31, 2009 at 10:51:38PM +0100, Jamie Lokier wrote: > > Christoph Hellwig wrote: > > > > > > If we are flushing the caches for our image files we only care about the > > > data (including the metadata required for accessing it) but not things > > > like timestamp updates. So use fdatasync instead of fsync to implement > > > the flush operations. > > > > > - fsync(s->fd); > > > + fdatasync(s->fd); > > > > I believe fsync was used because of uncertainty about whether > > fdatasync reliably flushes the necessary metadata to access the data > > on all hosts, after things like writing to holes and extending a file. > > > > I'm still not sure if fdatasync provides that guarantee on Linux. > > fdatasync is defined to provide that guarantee, and modulo the whole > clusterfuck around volatile write caches it does the right thing > thing on common Linux filesystems. That's good to know. I'm not sure about other hosts though. It's specified to update the metadata, but I think I've encountered people saying it doesn't - specifically when extending a file, that it doesn't guarantee the file size is extended on some OSes. But that might be an urban myth. -- Jamie ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync
From: Christoph Hellwig @ 2009-08-31 22:57 UTC
To: Jamie Lokier; +Cc: Christoph Hellwig, qemu-devel

On Mon, Aug 31, 2009 at 11:48:19PM +0100, Jamie Lokier wrote:
> > fdatasync is defined to provide that guarantee, and modulo the whole
> > clusterfuck around volatile write caches it does the right thing on
> > common Linux filesystems.
>
> That's good to know.  I'm not sure about other hosts though.  It's
> specified to update the metadata, but I think I've encountered people
> saying it doesn't - specifically when extending a file, that it
> doesn't guarantee the file size is extended on some OSes.  But that
> might be an urban myth.

Those would be typical bugs for really naive implementations.  I'm
pretty sure those existed for a short time in a lot of places, but I
would be extremely surprised if they continued to exist for a longer
time in any major operating system.
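The guarantee under discussion is easy to state as a test: after appending past the old end of file, fdatasync() must also commit the new file size, otherwise the appended block is unreachable after a crash. A minimal, purely illustrative sketch:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char block[4096];
    int fd = open("testimage", O_WRONLY | O_CREAT | O_APPEND, 0644);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    memset(block, 0xab, sizeof(block));
    if (write(fd, block, sizeof(block)) != sizeof(block) ||
        fdatasync(fd) < 0) {    /* must persist the data AND the new size */
        perror("append");
        return 1;
    }
    return close(fd) ? 1 : 0;
}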
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-08-31 20:17 ` [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync Christoph Hellwig 2009-08-31 21:51 ` Jamie Lokier @ 2009-09-01 15:59 ` Blue Swirl 2009-09-01 16:04 ` Christoph Hellwig 1 sibling, 1 reply; 33+ messages in thread From: Blue Swirl @ 2009-09-01 15:59 UTC (permalink / raw) To: Christoph Hellwig; +Cc: qemu-devel On Mon, Aug 31, 2009 at 11:17 PM, Christoph Hellwig<hch@lst.de> wrote: > > If we are flushing the caches for our image files we only care about the > data (including the metadata required for accessing it) but not things > like timestamp updates. So use fdatasync instead of fsync to implement > the flush operations. > - fsync(s->fd); > + fdatasync(s->fd); There is no fdatasync outside of Linux, not in OpenBSD and mingw32 at least. But you could add a probe for it to configure. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync
From: Christoph Hellwig @ 2009-09-01 16:04 UTC
To: Blue Swirl; +Cc: Christoph Hellwig, qemu-devel

On Tue, Sep 01, 2009 at 06:59:46PM +0300, Blue Swirl wrote:
> On Mon, Aug 31, 2009 at 11:17 PM, Christoph Hellwig<hch@lst.de> wrote:
> >
> > If we are flushing the caches for our image files we only care about the
> > data (including the metadata required for accessing it) but not things
> > like timestamp updates.  So use fdatasync instead of fsync to implement
> > the flush operations.
>
> > -    fsync(s->fd);
> > +    fdatasync(s->fd);
>
> There is no fdatasync outside of Linux, not in OpenBSD and mingw32 at
> least.  But you could add a probe for it to configure.

Oh, okay.  Kinda sad that people still don't implement useful
interfaces that have been in POSIX for a long time.
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync
From: Jamie Lokier @ 2009-09-02 0:34 UTC
To: Christoph Hellwig; +Cc: Blue Swirl, qemu-devel

Christoph Hellwig wrote:
> On Tue, Sep 01, 2009 at 06:59:46PM +0300, Blue Swirl wrote:
> > On Mon, Aug 31, 2009 at 11:17 PM, Christoph Hellwig<hch@lst.de> wrote:
> > >
> > > If we are flushing the caches for our image files we only care about the
> > > data (including the metadata required for accessing it) but not things
> > > like timestamp updates.  So use fdatasync instead of fsync to implement
> > > the flush operations.
> >
> > > -    fsync(s->fd);
> > > +    fdatasync(s->fd);
> >
> > There is no fdatasync outside of Linux, not in OpenBSD and mingw32 at
> > least.  But you could add a probe for it to configure.
>
> Oh, okay.  Kinda sad that people still don't implement useful
> interfaces that have been in POSIX for a long time.

fdatasync is common among the big commercial unixes which were still
being actively developed a few years ago, but fsync is universal -
it's very old.

Neither OpenBSD nor FreeBSD have fdatasync.

mingw32 is a thin wrapper around Windows.  Windows as far as I can
tell doesn't have an equivalent of fdatasync, although it does have an
equivalent of O_DIRECT|O_SYNC (but documentation contradicts itself
regarding whether the combination flushes metadata on each write).

-- Jamie
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-09-02 0:34 ` Jamie Lokier @ 2009-09-02 0:37 ` Christoph Hellwig 2009-09-02 1:18 ` Jamie Lokier 2009-09-02 14:02 ` Blue Swirl 0 siblings, 2 replies; 33+ messages in thread From: Christoph Hellwig @ 2009-09-02 0:37 UTC (permalink / raw) To: Jamie Lokier; +Cc: Blue Swirl, Christoph Hellwig, qemu-devel On Wed, Sep 02, 2009 at 01:34:54AM +0100, Jamie Lokier wrote: > fdatasync is common among the big commercial unixes which were still > being actively developed a few years ago, but fsync is universal - > it's very old. > > Neither OpenBSD nor FreeBSD have fdatasync. > > mingw32 is a thin wrapper around Windows. Windows as far as I can > tell doesn't have an equivalent of fdatasync, although it does have an > equivalent of O_DIRECT|O_SYNC (but documentation contradicts itself > regarding whether the combination flushes metadata on each write). Yeah, I'll just add a #ifndef _POSIX_SYNCHRONIZED_IO # define fdatasync(fd) fsync(fd) #endif Now I'm pretty sure we'll find various fuckups where headers define fdatasync but not _POSIX_SYNCHRONIZED_IO or vice versa, but I can outsource adding workarounds for that to people with those broken setups. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-09-02 0:37 ` Christoph Hellwig @ 2009-09-02 1:18 ` Jamie Lokier 2009-09-02 14:02 ` Blue Swirl 1 sibling, 0 replies; 33+ messages in thread From: Jamie Lokier @ 2009-09-02 1:18 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Blue Swirl, qemu-devel Christoph Hellwig wrote: > On Wed, Sep 02, 2009 at 01:34:54AM +0100, Jamie Lokier wrote: > > fdatasync is common among the big commercial unixes which were still > > being actively developed a few years ago, but fsync is universal - > > it's very old. > > > > Neither OpenBSD nor FreeBSD have fdatasync. > > > > mingw32 is a thin wrapper around Windows. Windows as far as I can > > tell doesn't have an equivalent of fdatasync, although it does have an > > equivalent of O_DIRECT|O_SYNC (but documentation contradicts itself > > regarding whether the combination flushes metadata on each write). > > Yeah, I'll just add a > > #ifndef _POSIX_SYNCHRONIZED_IO > # define fdatasync(fd) fsync(fd) > #endif > > Now I'm pretty sure we'll find various fuckups where headers define > fdatasync but not _POSIX_SYNCHRONIZED_IO or vice versa, but I can > outsource adding workarounds for that to people with those broken > setups. I've now learned that Windows has equivalents of O_DSYNC and fsync, but not O_SYNC and fdatasync, and that the O_DSYNC equivalent doesn't flush metadata needed to access newly allocated data; it only commits overwrites, and that the fsync equivalent does try to flush the drive's volatile write cache after writing metadata, but only on some versions with some settings and some drivers and possibly some types of drive. Not that you could tell any of that from the documentation. Why do I have the feeling some unixes are the same. No wonder the SQLite folks say, paraphrasing, "we call fsync or something else; we can't help you beyond that point and many OSes have rumoured bugs in this area or don't fulfil what these functions promise". -- Jamie ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-09-02 0:37 ` Christoph Hellwig 2009-09-02 1:18 ` Jamie Lokier @ 2009-09-02 14:02 ` Blue Swirl 2009-09-02 14:15 ` Christoph Hellwig 1 sibling, 1 reply; 33+ messages in thread From: Blue Swirl @ 2009-09-02 14:02 UTC (permalink / raw) To: Christoph Hellwig; +Cc: qemu-devel On Wed, Sep 2, 2009 at 3:37 AM, Christoph Hellwig<hch@lst.de> wrote: > On Wed, Sep 02, 2009 at 01:34:54AM +0100, Jamie Lokier wrote: >> fdatasync is common among the big commercial unixes which were still >> being actively developed a few years ago, but fsync is universal - >> it's very old. >> >> Neither OpenBSD nor FreeBSD have fdatasync. >> >> mingw32 is a thin wrapper around Windows. Windows as far as I can >> tell doesn't have an equivalent of fdatasync, although it does have an >> equivalent of O_DIRECT|O_SYNC (but documentation contradicts itself >> regarding whether the combination flushes metadata on each write). > > Yeah, I'll just add a > > #ifndef _POSIX_SYNCHRONIZED_IO > # define fdatasync(fd) fsync(fd) > #endif > > Now I'm pretty sure we'll find various fuckups where headers define > fdatasync but not _POSIX_SYNCHRONIZED_IO or vice versa, but I can > outsource adding workarounds for that to people with those broken > setups. I'd use a real function, like qemu_fdatasync(). Especially if the Windows equivalent will be complex. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync 2009-09-02 14:02 ` Blue Swirl @ 2009-09-02 14:15 ` Christoph Hellwig 0 siblings, 0 replies; 33+ messages in thread From: Christoph Hellwig @ 2009-09-02 14:15 UTC (permalink / raw) To: Blue Swirl; +Cc: Christoph Hellwig, qemu-devel On Wed, Sep 02, 2009 at 05:02:32PM +0300, Blue Swirl wrote: > > Now I'm pretty sure we'll find various fuckups where headers define > > fdatasync but not _POSIX_SYNCHRONIZED_IO or vice versa, but I can > > outsource adding workarounds for that to people with those broken > > setups. > > I'd use a real function, like qemu_fdatasync(). Especially if the > Windows equivalent will be complex. Ok. ^ permalink raw reply [flat|nested] 33+ messages in thread
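One possible shape for the qemu_fdatasync() helper agreed on here, assuming a configure-time probe that defines something like CONFIG_FDATASYNC (the macro name is a guess, not the final implementation):

#include <unistd.h>

int qemu_fdatasync(int fd)
{
#ifdef CONFIG_FDATASYNC
    return fdatasync(fd);
#else
    return fsync(fd);       /* fallback for hosts without fdatasync */
#endif
}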
* [Qemu-devel] [PATCH 3/4] block: add bdrv_aio_flush operation 2009-08-31 20:16 [Qemu-devel] [PATCH 0/4] data integrity fixes Christoph Hellwig 2009-08-31 20:16 ` [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag Christoph Hellwig 2009-08-31 20:17 ` [Qemu-devel] [PATCH 2/4] block: use fdatasync instead of fsync Christoph Hellwig @ 2009-08-31 20:17 ` Christoph Hellwig 2009-09-01 10:24 ` Avi Kivity 2009-08-31 20:18 ` [Qemu-devel] [PATCH 4/4] virtio-blk: add volatile writecache feature Christoph Hellwig 3 siblings, 1 reply; 33+ messages in thread From: Christoph Hellwig @ 2009-08-31 20:17 UTC (permalink / raw) To: qemu-devel Instead stalling the VCPU while serving a cache flush try to do it asynchronously. Use our good old helper thread pool to issue an asynchronous fdatasync for raw-posix. Note that while Linux AIO implements a fdatasync operation it is not useful for us because it isn't actually implement in asynchronous fashion. For now only use it in IDE because virtio-blk doesn't implement cache flusing yet (will be fixed in patch 4/4) and the interface between the HBA emulation and scsi-disk will need some changes to accomodate it for scsi (will be a separate patch series). Signed-off-by: Christoph Hellwig <hch@lst.de> Index: qemu/block.c =================================================================== --- qemu.orig/block.c 2009-08-31 16:49:54.508542113 -0300 +++ qemu/block.c 2009-08-31 16:49:59.593042021 -0300 @@ -54,6 +54,8 @@ static BlockDriverAIOCB *bdrv_aio_readv_ static BlockDriverAIOCB *bdrv_aio_writev_em(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, BlockDriverCompletionFunc *cb, void *opaque); +static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, + BlockDriverCompletionFunc *cb, void *opaque); static int bdrv_read_em(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); static int bdrv_write_em(BlockDriverState *bs, int64_t sector_num, @@ -138,6 +140,10 @@ void bdrv_register(BlockDriver *bdrv) bdrv->bdrv_read = bdrv_read_em; bdrv->bdrv_write = bdrv_write_em; } + + if (!bdrv->bdrv_aio_flush) + bdrv->bdrv_aio_flush = bdrv_aio_flush_em; + bdrv->next = first_drv; first_drv = bdrv; } @@ -1369,6 +1375,21 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockD return ret; } +BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, + BlockDriverCompletionFunc *cb, void *opaque) +{ + BlockDriver *drv = bs->drv; + + if (!drv) + return NULL; + + /* + * Note that unlike bdrv_flush the driver is reponsible for flushing a + * backing image if it exists. 
+ */
+    return drv->bdrv_aio_flush(bs, cb, opaque);
+}
+
 void bdrv_aio_cancel(BlockDriverAIOCB *acb)
 {
     acb->pool->cancel(acb);
@@ -1459,6 +1480,25 @@ static BlockDriverAIOCB *bdrv_aio_writev
     return bdrv_aio_rw_vector(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
 }
 
+static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+    BlockDriverAIOCBSync *acb;
+
+    acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque);
+    acb->is_write = 1; /* don't bounce in the completion handler */
+    acb->qiov = NULL;
+    acb->bounce = NULL;
+    acb->ret = 0;
+
+    if (!acb->bh)
+        acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb);
+
+    bdrv_flush(bs);
+    qemu_bh_schedule(acb->bh);
+    return &acb->common;
+}
+
 /**************************************************************/
 /* sync block device emulation */

Index: qemu/block.h
===================================================================
--- qemu.orig/block.h	2009-08-31 16:49:54.516577491 -0300
+++ qemu/block.h	2009-08-31 16:49:59.593042021 -0300
@@ -85,6 +85,8 @@ BlockDriverAIOCB *bdrv_aio_readv(BlockDr
 BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                   QEMUIOVector *iov, int nb_sectors,
                                   BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
+                                 BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
 /* sg packet commands */
Index: qemu/block_int.h
===================================================================
--- qemu.orig/block_int.h	2009-08-31 16:49:54.512583129 -0300
+++ qemu/block_int.h	2009-08-31 16:49:59.597095469 -0300
@@ -69,6 +69,8 @@ struct BlockDriver {
     BlockDriverAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque);
+    BlockDriverAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
+        BlockDriverCompletionFunc *cb, void *opaque);
 
     const char *protocol_name;
     int (*bdrv_truncate)(BlockDriverState *bs, int64_t offset);
Index: qemu/hw/ide/core.c
===================================================================
--- qemu.orig/hw/ide/core.c	2009-08-31 16:49:54.516577491 -0300
+++ qemu/hw/ide/core.c	2009-08-31 16:49:59.601041920 -0300
@@ -771,6 +771,16 @@ static void ide_atapi_cmd_check_status(I
     ide_set_irq(s);
 }
 
+static void ide_flush_cb(void *opaque, int ret)
+{
+    IDEState *s = opaque;
+
+    /* XXX: how do we signal I/O errors here? */
+
+    s->status = READY_STAT | SEEK_STAT;
+    ide_set_irq(s);
+}
+
 static inline void cpu_to_ube16(uint8_t *buf, int val)
 {
     buf[0] = val >> 8;
@@ -1969,9 +1979,9 @@ void ide_ioport_write(void *opaque, uint
         case WIN_FLUSH_CACHE:
         case WIN_FLUSH_CACHE_EXT:
             if (s->bs)
-                bdrv_flush(s->bs);
-            s->status = READY_STAT | SEEK_STAT;
-            ide_set_irq(s);
+                bdrv_aio_flush(s->bs, ide_flush_cb, s);
+            else
+                ide_flush_cb(s, 0);
             break;
         case WIN_STANDBY:
         case WIN_STANDBY2:
Index: qemu/block/raw-posix-aio.h
===================================================================
--- qemu.orig/block/raw-posix-aio.h	2009-08-27 23:50:52.510770924 -0300
+++ qemu/block/raw-posix-aio.h	2009-08-31 16:49:59.605095368 -0300
@@ -17,8 +17,9 @@
 #define QEMU_AIO_READ         0x0001
 #define QEMU_AIO_WRITE        0x0002
 #define QEMU_AIO_IOCTL        0x0004
+#define QEMU_AIO_FLUSH        0x0008
 #define QEMU_AIO_TYPE_MASK \
-        (QEMU_AIO_READ|QEMU_AIO_WRITE|QEMU_AIO_IOCTL)
+        (QEMU_AIO_READ|QEMU_AIO_WRITE|QEMU_AIO_IOCTL|QEMU_AIO_FLUSH)
 
 /* AIO flags */
 #define QEMU_AIO_MISALIGNED   0x1000
Index: qemu/block/raw-posix.c
===================================================================
--- qemu.orig/block/raw-posix.c	2009-08-31 16:49:55.513071598 -0300
+++ qemu/block/raw-posix.c	2009-08-31 16:49:59.613070264 -0300
@@ -574,6 +574,18 @@ static BlockDriverAIOCB *raw_aio_writev(
         cb, opaque, QEMU_AIO_WRITE);
 }
 
+static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (fd_open(bs) < 0)
+        return NULL;
+
+    return paio_submit(bs, s->aio_ctx, s->fd, 0, NULL, 0,
+                       cb, opaque, QEMU_AIO_FLUSH);
+}
+
 static void raw_close(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
@@ -749,6 +761,7 @@ static BlockDriver bdrv_raw = {
 
     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
+    .bdrv_aio_flush = raw_aio_flush,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -1002,6 +1015,7 @@ static BlockDriver bdrv_host_device = {
 
     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
+    .bdrv_aio_flush = raw_aio_flush,
 
     .bdrv_read = raw_read,
     .bdrv_write = raw_write,
@@ -1096,6 +1110,7 @@ static BlockDriver bdrv_host_floppy = {
 
     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
+    .bdrv_aio_flush = raw_aio_flush,
 
     .bdrv_read = raw_read,
     .bdrv_write = raw_write,
@@ -1176,6 +1191,7 @@ static BlockDriver bdrv_host_cdrom = {
 
     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
+    .bdrv_aio_flush = raw_aio_flush,
 
     .bdrv_read = raw_read,
     .bdrv_write = raw_write,
@@ -1295,6 +1311,7 @@ static BlockDriver bdrv_host_cdrom = {
 
     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
+    .bdrv_aio_flush = raw_aio_flush,
 
     .bdrv_read = raw_read,
     .bdrv_write = raw_write,
Index: qemu/posix-aio-compat.c
===================================================================
--- qemu.orig/posix-aio-compat.c	2009-08-27 23:50:52.654237211 -0300
+++ qemu/posix-aio-compat.c	2009-08-31 16:49:59.621095866 -0300
@@ -134,6 +134,16 @@ static size_t handle_aiocb_ioctl(struct
     return aiocb->aio_nbytes;
 }
 
+static size_t handle_aiocb_flush(struct qemu_paiocb *aiocb)
+{
+    int ret;
+
+    ret = fdatasync(aiocb->aio_fildes);
+    if (ret == -1)
+        return -errno;
+    return 0;
+}
+
 #ifdef CONFIG_PREADV
 
 static ssize_t
@@ -330,6 +340,9 @@ static void *aio_thread(void *unused)
         case QEMU_AIO_WRITE:
            ret = handle_aiocb_rw(aiocb);
            break;
+        case QEMU_AIO_FLUSH:
+           ret = handle_aiocb_flush(aiocb);
+           break;
         case QEMU_AIO_IOCTL:
            ret = handle_aiocb_ioctl(aiocb);
            break;
@@ -530,8 +543,10 @@ BlockDriverAIOCB *paio_submit(BlockDrive
     acb->aio_type = type;
     acb->aio_fildes = fd;
     acb->ev_signo = SIGUSR2;
-    acb->aio_iov = qiov->iov;
-    acb->aio_niov = qiov->niov;
+    if (qiov) {
+        acb->aio_iov = qiov->iov;
+        acb->aio_niov = qiov->niov;
+    }
     acb->aio_nbytes = nb_sectors * 512;
     acb->aio_offset = sector_num * 512;

^ permalink raw reply	[flat|nested] 33+ messages in thread
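The core of the raw-posix change is that the flush, like reads and writes in the compat path, is handed to a helper thread that calls fdatasync() and reports completion back to the submitter, so the VCPU thread never blocks on the disk. Below is a minimal, self-contained sketch of that pattern using plain pthreads; it is not QEMU code and the names (flush_req, flush_worker, flush_done) are invented for illustration — QEMU additionally bounces the completion back to the main loop via a signal/bottom half rather than calling it from the worker.

/* Minimal illustration: hand a flush off to a worker thread so the
 * submitting thread is not blocked on fdatasync().  Plain pthreads;
 * not QEMU code, names are made up for the example. */
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct flush_req {
    int fd;                              /* file descriptor to flush */
    void (*cb)(void *opaque, int ret);   /* completion callback */
    void *opaque;
};

static void *flush_worker(void *arg)
{
    struct flush_req *req = arg;
    int ret = 0;

    if (fdatasync(req->fd) == -1)        /* flush data + required metadata */
        ret = -errno;

    /* report completion; a real AIO layer would signal the main loop
     * here instead of invoking the callback from the worker thread */
    req->cb(req->opaque, ret);
    free(req);
    return NULL;
}

static void flush_done(void *opaque, int ret)
{
    printf("flush of %s completed: %d\n", (const char *)opaque, ret);
}

int main(void)
{
    const char *path = "testfile.img";
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct flush_req *req = malloc(sizeof(*req));
    req->fd = fd;
    req->cb = flush_done;
    req->opaque = (void *)path;

    pthread_t tid;
    pthread_create(&tid, NULL, flush_worker, req);

    /* the submitter is free to do other work here */
    pthread_join(tid, NULL);
    close(fd);
    return 0;
}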
* Re: [Qemu-devel] [PATCH 3/4] block: add bdrv_aio_flush operation
  2009-08-31 20:17 ` [Qemu-devel] [PATCH 3/4] block: add bdrv_aio_flush operation Christoph Hellwig
@ 2009-09-01 10:24   ` Avi Kivity
  2009-09-01 14:25     ` Christoph Hellwig
  0 siblings, 1 reply; 33+ messages in thread
From: Avi Kivity @ 2009-09-01 10:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: qemu-devel

On 08/31/2009 11:17 PM, Christoph Hellwig wrote:
> Instead of stalling the VCPU while serving a cache flush, try to do it
> asynchronously.  Use our good old helper thread pool to issue an
> asynchronous fdatasync for raw-posix.  Note that while Linux AIO
> implements an fdatasync operation, it is not useful for us because
> it isn't actually implemented in an asynchronous fashion.
>

Doesn't aio_flush() need to wait for all previously issued requests to
complete?

We can require callers to do that, but it seems quite a burden.

> For now only use it in IDE because virtio-blk doesn't implement
> cache flushing yet (will be fixed in patch 4/4) and the interface between
> the HBA emulation and scsi-disk will need some changes to accommodate
> it for scsi (will be a separate patch series).
>

IDE integration should be in a separate patch.

Patch has some tabs.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 33+ messages in thread
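The ordering concern can be made concrete: the flush only gives the guest the guarantee it expects if every write completed before the flush command was issued has already gone through the block layer. If callers were not trusted to ensure that, a device model could track it itself. The following is a hypothetical sketch, not part of the posted series — FlushTracker and its helpers are invented names, it assumes the bdrv_aio_flush() interface added by this patch (declared in block.h), handles only one pending flush at a time, and omits error handling of a NULL acb for brevity.

/* Hypothetical helper, not part of the posted patches: defer a flush
 * until all previously submitted writes have completed. */
#include "block.h"

typedef struct FlushTracker {
    int inflight_writes;                 /* writes submitted, not yet done */
    int flush_pending;                   /* a flush is waiting for the drain */
    BlockDriverState *bs;
    BlockDriverCompletionFunc *flush_cb;
    void *flush_opaque;
} FlushTracker;

/* call this from every write completion callback */
static void tracker_write_done(FlushTracker *t)
{
    if (--t->inflight_writes == 0 && t->flush_pending) {
        t->flush_pending = 0;
        bdrv_aio_flush(t->bs, t->flush_cb, t->flush_opaque);
    }
}

static void tracker_flush(FlushTracker *t,
                          BlockDriverCompletionFunc *cb, void *opaque)
{
    if (t->inflight_writes == 0) {
        bdrv_aio_flush(t->bs, cb, opaque);   /* nothing to wait for */
    } else {
        t->flush_pending = 1;                /* issued from the last write
                                                completion instead */
        t->flush_cb = cb;
        t->flush_opaque = opaque;
    }
}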
* Re: [Qemu-devel] [PATCH 3/4] block: add bdrv_aio_flush operation
  2009-09-01 10:24 ` Avi Kivity
@ 2009-09-01 14:25   ` Christoph Hellwig
  0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2009-09-01 14:25 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Christoph Hellwig, qemu-devel

On Tue, Sep 01, 2009 at 01:24:02PM +0300, Avi Kivity wrote:
> Doesn't aio_flush() need to wait for all previously issued requests to
> complete?
>
> We can require callers to do that, but it seems quite a burden.

The guest does it for us when using ide/scsi/virtio, and I don't think
we want to grow other users.  It's worth documenting, though.

> Patch has some tabs.

sorry, of course ;-)  Writing code in the weird qemu style just takes me
three times as long as readable code, sorry.  I'll fix it.

^ permalink raw reply	[flat|nested] 33+ messages in thread
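A possible doc comment over the new entry point, capturing the contract agreed on above; this is suggested wording only and is not part of the applied patch.

/*
 * Flush the host's volatile write cache for this device.
 *
 * Note: bdrv_aio_flush() does not wait for in-flight writes.  The caller
 * (in practice the guest's IDE/SCSI/virtio-blk driver, which already
 * orders its cache-flush commands behind the writes it wants flushed)
 * must ensure that all writes it cares about have completed before
 * issuing the flush.
 */
BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
                                 BlockDriverCompletionFunc *cb, void *opaque);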
* [Qemu-devel] [PATCH 4/4] virtio-blk: add volatile writecache feature
  2009-08-31 20:16 [Qemu-devel] [PATCH 0/4] data integrity fixes Christoph Hellwig
                   ` (2 preceding siblings ...)
  2009-08-31 20:17 ` [Qemu-devel] [PATCH 3/4] block: add bdrv_aio_flush operation Christoph Hellwig
@ 2009-08-31 20:18 ` Christoph Hellwig
  3 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2009-08-31 20:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: rusty

Add a new VIRTIO_BLK_F_WCACHE feature to virtio-blk to indicate that we
have a volatile write cache that needs controlled flushing.  Implement
a VIRTIO_BLK_T_FLUSH operation to flush it.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: qemu-kvm/hw/virtio-blk.c
===================================================================
--- qemu-kvm.orig/hw/virtio-blk.c
+++ qemu-kvm/hw/virtio-blk.c
@@ -129,6 +129,13 @@ static void virtio_blk_rw_complete(void
     virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
 }
 
+static void virtio_blk_flush_complete(void *opaque, int ret)
+{
+    VirtIOBlockReq *req = opaque;
+
+    virtio_blk_req_complete(req, ret ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK);
+}
+
 static VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
 {
     VirtIOBlockReq *req = qemu_mallocz(sizeof(*req));
@@ -252,6 +259,16 @@ static void virtio_blk_handle_scsi(VirtI
 }
 #endif /* __linux__ */
 
+static void virtio_blk_handle_flush(VirtIOBlockReq *req)
+{
+    BlockDriverAIOCB *acb;
+
+    acb = bdrv_aio_flush(req->dev->bs, virtio_blk_flush_complete, req);
+    if (!acb) {
+        virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
+    }
+}
+
 static void virtio_blk_handle_write(VirtIOBlockReq *req)
 {
     BlockDriverAIOCB *acb;
@@ -294,7 +311,9 @@ static void virtio_blk_handle_output(Vir
     req->out = (void *)req->elem.out_sg[0].iov_base;
     req->in = (void *)req->elem.in_sg[req->elem.in_num - 1].iov_base;
 
-    if (req->out->type & VIRTIO_BLK_T_SCSI_CMD) {
+    if (req->out->type & VIRTIO_BLK_T_FLUSH) {
+        virtio_blk_handle_flush(req);
+    } else if (req->out->type & VIRTIO_BLK_T_SCSI_CMD) {
         virtio_blk_handle_scsi(req);
     } else if (req->out->type & VIRTIO_BLK_T_OUT) {
         qemu_iovec_init_external(&req->qiov, &req->elem.out_sg[1],
@@ -382,6 +401,9 @@ static uint32_t virtio_blk_get_features(
 
     features |= (1 << VIRTIO_BLK_F_SEG_MAX);
     features |= (1 << VIRTIO_BLK_F_GEOMETRY);
+
+    if (bdrv_enable_write_cache(s->bs))
+        features |= (1 << VIRTIO_BLK_F_WCACHE);
 #ifdef __linux__
     features |= (1 << VIRTIO_BLK_F_SCSI);
 #endif
Index: qemu-kvm/hw/virtio-blk.h
===================================================================
--- qemu-kvm.orig/hw/virtio-blk.h
+++ qemu-kvm/hw/virtio-blk.h
@@ -31,6 +31,7 @@
 #define VIRTIO_BLK_F_BLK_SIZE   6       /* Block size of disk is available*/
 #define VIRTIO_BLK_F_SCSI       7       /* Supports scsi command passthru */
 #define VIRTIO_BLK_F_IDENTIFY   8       /* ATA IDENTIFY supported */
+#define VIRTIO_BLK_F_WCACHE     9       /* write cache enabled */
 
 #define VIRTIO_BLK_ID_LEN       256     /* length of identify u16 array */
 #define VIRTIO_BLK_ID_SN        10      /* start of char * serial# */
@@ -55,6 +56,9 @@ struct virtio_blk_config
 /* This bit says it's a scsi command, not an actual read or write. */
 #define VIRTIO_BLK_T_SCSI_CMD   2
 
+/* Flush the volatile write cache */
+#define VIRTIO_BLK_T_FLUSH      4
+
 /* Barrier before this op. */
 #define VIRTIO_BLK_T_BARRIER    0x80000000

^ permalink raw reply	[flat|nested] 33+ messages in thread
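For context, this is roughly what the matching flush looks like from the guest side once VIRTIO_BLK_F_WCACHE has been negotiated: a request whose out-header type is VIRTIO_BLK_T_FLUSH, no data descriptors, and a one-byte status written back by the device. The sketch below is illustrative only — the virtio_blk_outhdr layout matches what the device code above reads via req->out, but queue_request() is a stand-in for the guest driver's real virtqueue submission path, not an actual API; here it simply pretends the device completed successfully.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same layout the device dereferences as req->out */
struct virtio_blk_outhdr {
    uint32_t type;      /* VIRTIO_BLK_T_* */
    uint32_t ioprio;
    uint64_t sector;    /* unused for a flush */
};

#define VIRTIO_BLK_T_FLUSH 4
#define VIRTIO_BLK_S_OK    0

/* Stand-in for the guest driver's virtqueue path: a real driver would add
 * the header as a driver->device buffer and the status byte as a
 * device->driver buffer, kick the device and wait for the used
 * notification.  For the example we just report success. */
static void queue_request(const void *out, size_t out_len, uint8_t *status_in)
{
    (void)out;
    (void)out_len;
    *status_in = VIRTIO_BLK_S_OK;
}

static int issue_flush(void)
{
    struct virtio_blk_outhdr hdr;
    uint8_t status = 0xff;

    memset(&hdr, 0, sizeof(hdr));
    hdr.type = VIRTIO_BLK_T_FLUSH;   /* no data: header out, status byte in */

    queue_request(&hdr, sizeof(hdr), &status);

    /* after the device marks the request used, status holds
     * VIRTIO_BLK_S_OK or VIRTIO_BLK_S_IOERR */
    return status == VIRTIO_BLK_S_OK ? 0 : -1;
}

int main(void)
{
    printf("flush %s\n", issue_flush() == 0 ? "ok" : "failed");
    return 0;
}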