qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] bs->enable_write_cache and the guest ABI
@ 2010-03-07 14:42 Avi Kivity
  2010-03-08  9:39 ` Christoph Hellwig
  2010-03-08 10:29 ` [Qemu-devel] " Juan Quintela
  0 siblings, 2 replies; 12+ messages in thread
From: Avi Kivity @ 2010-03-07 14:42 UTC (permalink / raw)
  To: qemu-devel, Christoph Hellwig

block.c says:

>     /*
>      * Yes, BDRV_O_NOCACHE aka O_DIRECT means we have to present a
>      * write cache to the guest.  We do need the fdatasync to flush
>      * out transactions for block allocations, and we maybe have a
>      * volatile write cache in our backing device to deal with.
>      */
>     if (flags & (BDRV_O_CACHE_WB|BDRV_O_NOCACHE))
>         bs->enable_write_cache = 1;

This means that if I start a guest with cache=writethrough and then 
restart (or live migrate) it with cache=none, then the guest will see a 
change, even though the user only changed the drive's backing, not 
something guest visible.  In the case of live migration, the guest will 
not even notice the change and we may be at risk of data loss.

For 0.13 I propose setting enable_write_cache to true unconditionally.  
For 0.12 the question is more difficult, since we'll be changing the 
guest ABI.  Given that guests are unlikely not to be able to cope with 
write caches, and that the alternative is data loss, I believe that's 
also the right solution there.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] bs->enable_write_cache and the guest ABI
  2010-03-07 14:42 [Qemu-devel] bs->enable_write_cache and the guest ABI Avi Kivity
@ 2010-03-08  9:39 ` Christoph Hellwig
  2010-03-08  9:45   ` Avi Kivity
  2010-03-08  9:47   ` Jamie Lokier
  2010-03-08 10:29 ` [Qemu-devel] " Juan Quintela
  1 sibling, 2 replies; 12+ messages in thread
From: Christoph Hellwig @ 2010-03-08  9:39 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, Christoph Hellwig

On Sun, Mar 07, 2010 at 04:42:18PM +0200, Avi Kivity wrote:
> This means that if I start a guest with cache=writethrough and then 
> restart (or live migrate) it with cache=none, then the guest will see a 
> change, even though the user only changed the drive's backing, not 
> something guest visible.  In the case of live migration, the guest will 
> not even notice the change and we may be at risk of data loss.
> 
> For 0.13 I propose setting enable_write_cache to true unconditionally.  
> For 0.12 the question is more difficult, since we'll be changing the 
> guest ABI.  Given that guests are unlikely not to be able to cope with 
> write caches, and that the alternative is data loss, I believe that's 
> also the right solution there.

Setting it to true unconditionally will cause performance degradation
for cache=writethrough devices, as we now have to drain the queue in
the guest for no reason at all.

I think the better option would be to move the cache setting to qdev
property on the block device at it's a device visible setting.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] bs->enable_write_cache and the guest ABI
  2010-03-08  9:39 ` Christoph Hellwig
@ 2010-03-08  9:45   ` Avi Kivity
  2010-03-08  9:47   ` Jamie Lokier
  1 sibling, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2010-03-08  9:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: qemu-devel

On 03/08/2010 11:39 AM, Christoph Hellwig wrote:
> On Sun, Mar 07, 2010 at 04:42:18PM +0200, Avi Kivity wrote:
>    
>> This means that if I start a guest with cache=writethrough and then
>> restart (or live migrate) it with cache=none, then the guest will see a
>> change, even though the user only changed the drive's backing, not
>> something guest visible.  In the case of live migration, the guest will
>> not even notice the change and we may be at risk of data loss.
>>
>> For 0.13 I propose setting enable_write_cache to true unconditionally.
>> For 0.12 the question is more difficult, since we'll be changing the
>> guest ABI.  Given that guests are unlikely not to be able to cope with
>> write caches, and that the alternative is data loss, I believe that's
>> also the right solution there.
>>      
> Setting it to true unconditionally will cause performance degradation
> for cache=writethrough devices, as we now have to drain the queue in
> the guest for no reason at all.
>    

True.

> I think the better option would be to move the cache setting to qdev
> property on the block device at it's a device visible setting.
>    

Reasonable.  Should default to write cache enabled since that's 
compatible across all host caching policies; if the user knows they will 
keep the host caching policy consistent, they can disable it and reap 
the gain.

Or keep the default as it is now, and make it the user's responsibility 
to keep track.  In this case we need to add a paragraph to the qemu 
management guide.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] bs->enable_write_cache and the guest ABI
  2010-03-08  9:39 ` Christoph Hellwig
  2010-03-08  9:45   ` Avi Kivity
@ 2010-03-08  9:47   ` Jamie Lokier
  2010-03-08 17:00     ` Christoph Hellwig
  1 sibling, 1 reply; 12+ messages in thread
From: Jamie Lokier @ 2010-03-08  9:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Avi Kivity, qemu-devel

Christoph Hellwig wrote:
> On Sun, Mar 07, 2010 at 04:42:18PM +0200, Avi Kivity wrote:
> > This means that if I start a guest with cache=writethrough and then 
> > restart (or live migrate) it with cache=none, then the guest will see a 
> > change, even though the user only changed the drive's backing, not 
> > something guest visible.  In the case of live migration, the guest will 
> > not even notice the change and we may be at risk of data loss.
> > 
> > For 0.13 I propose setting enable_write_cache to true unconditionally.  
> > For 0.12 the question is more difficult, since we'll be changing the 
> > guest ABI.  Given that guests are unlikely not to be able to cope with 
> > write caches, and that the alternative is data loss, I believe that's 
> > also the right solution there.
> 
> Setting it to true unconditionally will cause performance degradation
> for cache=writethrough devices, as we now have to drain the queue in
> the guest for no reason at all.
> 
> I think the better option would be to move the cache setting to qdev
> property on the block device at it's a device visible setting.

Assuming the outcome is that it becomes a qdev property, and stays
preserved across migrations, even if the backing device access
changes, then I think the right thing is to dynamically decide to set
O_DSYNC and/or call fdatasync before completing writes from qemu when
the guest thinks enable_write_cache=0 (or sets it to 0).  With
cache=none, that would set O_DSYNC|O_DIRECT if the two flags do work
properly together on our favourite hosts.

Thus enable_write_cache won't always have the default value for the
different backing device access type, but it will match the guest's
expectations and be actually safe.  Moreover more, by responding to
the guest changing that, it's closer to behaving like real harware.

-- Jamie

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] Re: bs->enable_write_cache and the guest ABI
  2010-03-07 14:42 [Qemu-devel] bs->enable_write_cache and the guest ABI Avi Kivity
  2010-03-08  9:39 ` Christoph Hellwig
@ 2010-03-08 10:29 ` Juan Quintela
  2010-03-08 10:48   ` Avi Kivity
  1 sibling, 1 reply; 12+ messages in thread
From: Juan Quintela @ 2010-03-08 10:29 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, Christoph Hellwig

Avi Kivity <avi@redhat.com> wrote:
> block.c says:
>
>>     /*
>>      * Yes, BDRV_O_NOCACHE aka O_DIRECT means we have to present a
>>      * write cache to the guest.  We do need the fdatasync to flush
>>      * out transactions for block allocations, and we maybe have a
>>      * volatile write cache in our backing device to deal with.
>>      */
>>     if (flags & (BDRV_O_CACHE_WB|BDRV_O_NOCACHE))
>>         bs->enable_write_cache = 1;
>
> This means that if I start a guest with cache=writethrough and then
> restart (or live migrate) it with cache=none, then the guest will see
> a change, even though the user only changed the drive's backing, not
> something guest visible.  In the case of live migration, the guest
> will not even notice the change and we may be at risk of data loss.
>
> For 0.13 I propose setting enable_write_cache to true unconditionally.
> For 0.12 the question is more difficult, since we'll be changing the
> guest ABI.  Given that guests are unlikely not to be able to cope with
> write caches, and that the alternative is data loss, I believe that's
> also the right solution there.

For RHEL I setted with adding enable_write_cache to the migration
state.  As you state, that value is guest visible.  I can update that
patches to qemu.  When I migrated from an old version, I just set that
value to 0.

What do you think?

Later, Juan.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] Re: bs->enable_write_cache and the guest ABI
  2010-03-08 10:29 ` [Qemu-devel] " Juan Quintela
@ 2010-03-08 10:48   ` Avi Kivity
  2010-03-08 12:46     ` Juan Quintela
  2010-03-08 23:00     ` Jamie Lokier
  0 siblings, 2 replies; 12+ messages in thread
From: Avi Kivity @ 2010-03-08 10:48 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, Christoph Hellwig

On 03/08/2010 12:29 PM, Juan Quintela wrote:
> Avi Kivity<avi@redhat.com>  wrote:
>    
>> block.c says:
>>
>>      
>>>      /*
>>>       * Yes, BDRV_O_NOCACHE aka O_DIRECT means we have to present a
>>>       * write cache to the guest.  We do need the fdatasync to flush
>>>       * out transactions for block allocations, and we maybe have a
>>>       * volatile write cache in our backing device to deal with.
>>>       */
>>>      if (flags&  (BDRV_O_CACHE_WB|BDRV_O_NOCACHE))
>>>          bs->enable_write_cache = 1;
>>>        
>> This means that if I start a guest with cache=writethrough and then
>> restart (or live migrate) it with cache=none, then the guest will see
>> a change, even though the user only changed the drive's backing, not
>> something guest visible.  In the case of live migration, the guest
>> will not even notice the change and we may be at risk of data loss.
>>
>> For 0.13 I propose setting enable_write_cache to true unconditionally.
>> For 0.12 the question is more difficult, since we'll be changing the
>> guest ABI.  Given that guests are unlikely not to be able to cope with
>> write caches, and that the alternative is data loss, I believe that's
>> also the right solution there.
>>      
> For RHEL I setted with adding enable_write_cache to the migration
> state.  As you state, that value is guest visible.  I can update that
> patches to qemu.  When I migrated from an old version, I just set that
> value to 0.
>
> What do you think?
>    

I think we have to go with a qdev property as Christoph suggests.  Then 
it becomes the management's responsibility to set it right.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] Re: bs->enable_write_cache and the guest ABI
  2010-03-08 10:48   ` Avi Kivity
@ 2010-03-08 12:46     ` Juan Quintela
  2010-03-08 23:00     ` Jamie Lokier
  1 sibling, 0 replies; 12+ messages in thread
From: Juan Quintela @ 2010-03-08 12:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, Christoph Hellwig

Avi Kivity <avi@redhat.com> wrote:
> On 03/08/2010 12:29 PM, Juan Quintela wrote:
>> Avi Kivity<avi@redhat.com>  wrote:
>> For RHEL I setted with adding enable_write_cache to the migration
>> state.  As you state, that value is guest visible.  I can update that
>> patches to qemu.  When I migrated from an old version, I just set that
>> value to 0.
>>
>> What do you think?
>>    
>
> I think we have to go with a qdev property as Christoph suggests.
> Then it becomes the management's responsibility to set it right.

Also works for me.

Later, Juan.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] bs->enable_write_cache and the guest ABI
  2010-03-08  9:47   ` Jamie Lokier
@ 2010-03-08 17:00     ` Christoph Hellwig
  2010-03-08 22:55       ` Jamie Lokier
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2010-03-08 17:00 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: qemu-devel, Christoph Hellwig, Avi Kivity

On Mon, Mar 08, 2010 at 09:47:32AM +0000, Jamie Lokier wrote:
> Assuming the outcome is that it becomes a qdev property, and stays
> preserved across migrations, even if the backing device access
> changes, then I think the right thing is to dynamically decide to set
> O_DSYNC and/or call fdatasync before completing writes from qemu when
> the guest thinks enable_write_cache=0 (or sets it to 0).  With
> cache=none, that would set O_DSYNC|O_DIRECT if the two flags do work
> properly together on our favourite hosts.
> 
> Thus enable_write_cache won't always have the default value for the
> different backing device access type, but it will match the guest's
> expectations and be actually safe.  Moreover more, by responding to
> the guest changing that, it's closer to behaving like real harware.

I'd have to look at other uses of qdev, but I would assume that we
always look at the qdev properties and only use the existing drive
suboptions as compatiblity if we do not have the qdev properties set.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] bs->enable_write_cache and the guest ABI
  2010-03-08 17:00     ` Christoph Hellwig
@ 2010-03-08 22:55       ` Jamie Lokier
  0 siblings, 0 replies; 12+ messages in thread
From: Jamie Lokier @ 2010-03-08 22:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Avi Kivity, qemu-devel

Christoph Hellwig wrote:
> On Mon, Mar 08, 2010 at 09:47:32AM +0000, Jamie Lokier wrote:
> > Assuming the outcome is that it becomes a qdev property, and stays
> > preserved across migrations, even if the backing device access
> > changes, then I think the right thing is to dynamically decide to set
> > O_DSYNC and/or call fdatasync before completing writes from qemu when
> > the guest thinks enable_write_cache=0 (or sets it to 0).  With
> > cache=none, that would set O_DSYNC|O_DIRECT if the two flags do work
> > properly together on our favourite hosts.
> > 
> > Thus enable_write_cache won't always have the default value for the
> > different backing device access type, but it will match the guest's
> > expectations and be actually safe.  Moreover more, by responding to
> > the guest changing that, it's closer to behaving like real harware.
> 
> I'd have to look at other uses of qdev, but I would assume that we
> always look at the qdev properties and only use the existing drive
> suboptions as compatiblity if we do not have the qdev properties set.

I'm not sure if I was clear or not: I mean if the guest runs "hdparm
-W $value" or Windows equivalent, and cache!=writethrough, it should
have the effect of turning on/off the host's O_DSYNC flag (or qemu
calling fdatasync() before signalling every write completion), because
that's really just another way for the guest say it wants to flush
write caches after every write.  So that's what the emulation should
do, either explicitly (fdatasync) or a subtler way (O_DSYNC).

That guest-visible flag, which is part of the emulated drive state,
should be preserved across migration.

-- Jamie

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] Re: bs->enable_write_cache and the guest ABI
  2010-03-08 10:48   ` Avi Kivity
  2010-03-08 12:46     ` Juan Quintela
@ 2010-03-08 23:00     ` Jamie Lokier
  2010-03-09  9:56       ` Avi Kivity
  1 sibling, 1 reply; 12+ messages in thread
From: Jamie Lokier @ 2010-03-08 23:00 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Christoph Hellwig, qemu-devel, Juan Quintela

Avi Kivity wrote:
> I think we have to go with a qdev property as Christoph suggests.  Then 
> it becomes the management's responsibility to set it right.

How can the management be expected to know or follow dynamically
changing guest state?  There guests which disable a drive's write
cache in some application scripts, and others which enable it in boot
scripts.  That's not something management should be knowing about.

-- Jamie

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] Re: bs->enable_write_cache and the guest ABI
  2010-03-08 23:00     ` Jamie Lokier
@ 2010-03-09  9:56       ` Avi Kivity
  2010-03-09 19:26         ` Jamie Lokier
  0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2010-03-09  9:56 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Christoph Hellwig, qemu-devel, Juan Quintela

On 03/09/2010 01:00 AM, Jamie Lokier wrote:
> Avi Kivity wrote:
>    
>> I think we have to go with a qdev property as Christoph suggests.  Then
>> it becomes the management's responsibility to set it right.
>>      
> How can the management be expected to know or follow dynamically
> changing guest state?  There guests which disable a drive's write
> cache in some application scripts, and others which enable it in boot
> scripts.  That's not something management should be knowing about.
>    

The flag indicates whether the drive has a write cache or not.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] Re: bs->enable_write_cache and the guest ABI
  2010-03-09  9:56       ` Avi Kivity
@ 2010-03-09 19:26         ` Jamie Lokier
  0 siblings, 0 replies; 12+ messages in thread
From: Jamie Lokier @ 2010-03-09 19:26 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Christoph Hellwig, qemu-devel, Juan Quintela

Avi Kivity wrote:
> On 03/09/2010 01:00 AM, Jamie Lokier wrote:
> >Avi Kivity wrote:
> >   
> >>I think we have to go with a qdev property as Christoph suggests.  Then
> >>it becomes the management's responsibility to set it right.
> >>     
> >How can the management be expected to know or follow dynamically
> >changing guest state?  There guests which disable a drive's write
> >cache in some application scripts, and others which enable it in boot
> >scripts.  That's not something management should be knowing about.
> >   
> 
> The flag indicates whether the drive has a write cache or not.

Ok, thanks, I was confused by the name saying "enable" not "has".

I presume the dynamic actually-enabled-write-cache guest device state
is dealt with properly, then?

-- Jamie

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-03-09 19:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-07 14:42 [Qemu-devel] bs->enable_write_cache and the guest ABI Avi Kivity
2010-03-08  9:39 ` Christoph Hellwig
2010-03-08  9:45   ` Avi Kivity
2010-03-08  9:47   ` Jamie Lokier
2010-03-08 17:00     ` Christoph Hellwig
2010-03-08 22:55       ` Jamie Lokier
2010-03-08 10:29 ` [Qemu-devel] " Juan Quintela
2010-03-08 10:48   ` Avi Kivity
2010-03-08 12:46     ` Juan Quintela
2010-03-08 23:00     ` Jamie Lokier
2010-03-09  9:56       ` Avi Kivity
2010-03-09 19:26         ` Jamie Lokier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).