From: Christoph Hellwig <hch@infradead.org>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Chris Webb <chris@arachsys.com>, Avi Kivity <avi@redhat.com>,
	balbir@linux.vnet.ibm.com,
	KVM development list <kvm@vger.kernel.org>,
	Rik van Riel <riel@surriel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH][RF C/T/D] Unmapped page cache control - via boot parameter
Date: Mon, 15 Mar 2010 20:43:07 -0400
Message-ID: <20100316004307.GA19470@infradead.org>
In-Reply-To: <4B9EC60A.2070101@codemonkey.ws>

On Mon, Mar 15, 2010 at 06:43:06PM -0500, Anthony Liguori wrote:
> I knew someone would do this...
>
> This really gets down to your definition of "safe" behaviour.  As it  
> stands, if you suffer a power outage, it may lead to guest corruption.
>
> While we are correct in advertising a write-cache, write-caches are  
> volatile and should a drive lose power, it could lead to data  
> corruption.  Enterprise disks tend to have battery backed write caches  
> to prevent this.
>
> In the set up you're emulating, the host is acting as a giant write  
> cache.  Should your host fail, you can get data corruption.
>
> cache=writethrough provides a much stronger data guarantee.  Even in the  
> event of a host failure, data integrity will be preserved.

Actually cache=writeback is as safe as any normal host is with a
volatile disk cache, except that in this case the disk cache is
actually a lot larger.  With a properly implemented filesystem this
will never cause corruption.  You will lose recent updates after
the last sync/fsync/etc up to the size of the cache, but filesystem
metadata should never be corrupted, and data that has been forced to
disk using fsync/O_SYNC should never be lost either.  If it is, that's
a bug somewhere in the stack, but in my powerfail testing we never lost
such data using xfs or ext3/4 after I fixed up the fsync code in the
latter two.
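
As an illustration of the guarantee described above, here is a minimal
sketch of what "forced to disk using fsync/O_SYNC" looks like from an
application inside the guest.  It is written in Python for brevity; the
function names and file paths are hypothetical and do not come from this
thread.  It assumes a POSIX system where os.O_SYNC is available, and it
only shows the application side: whether the flush actually reaches
stable storage still depends on every layer below honouring it.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> None:
    """Write payload, returning only after fsync() has succeeded.

    Hypothetical helper: anything written before the fsync() call returns
    is what the application may treat as committed.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to flush file data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_write_osync(path: str, payload: bytes) -> None:
    """Same idea using O_SYNC: each write() returns after the data is flushed."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(payload)
        while view:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    durable_write(os.path.join(workdir, "a.bin"), b"committed record")
    durable_write_osync(os.path.join(workdir, "b.bin"), b"committed record")
```

Data written without either call may sit in a volatile cache and is the
"recent updates" that can be lost on power failure.  A newly created file
also needs an fsync() on its parent directory before the name itself is
durable; that step is omitted here to keep the sketch short.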


