From: Anthony Liguori <anthony@codemonkey.ws>
To: Chris Webb <chris@arachsys.com>
Cc: Avi Kivity <avi@redhat.com>,
balbir@linux.vnet.ibm.com,
KVM development list <kvm@vger.kernel.org>,
Rik van Riel <riel@surriel.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH][RF C/T/D] Unmapped page cache control - via boot parameter
Date: Wed, 17 Mar 2010 10:55:47 -0500
Message-ID: <4BA0FB83.1010502@codemonkey.ws>
In-Reply-To: <20100317151409.GY31148@arachsys.com>
On 03/17/2010 10:14 AM, Chris Webb wrote:
> Anthony Liguori<anthony@codemonkey.ws> writes:
>
>
>> This really gets down to your definition of "safe" behaviour. As it
>> stands, if you suffer a power outage, it may lead to guest
>> corruption.
>>
>> While we are correct in advertising a write-cache, write-caches are
>> volatile and should a drive lose power, it could lead to data
>> corruption. Enterprise disks tend to have battery backed write
>> caches to prevent this.
>>
>> In the set up you're emulating, the host is acting as a giant write
>> cache. Should your host fail, you can get data corruption.
>>
> Hi Anthony. I suspected my post might spark an interesting discussion!
>
> Before considering anything like this, we did quite a bit of testing with
> OSes in qemu-kvm guests running filesystem-intensive work, using an ipmitool
> power off to kill the host. I didn't manage to corrupt any ext3, ext4 or
> NTFS filesystems despite these efforts.
>
> Is your claim here that:-
>
> (a) qemu doesn't emulate a disk write cache correctly; or
>
> (b) operating systems are inherently unsafe running on top of a disk with
> a write-cache; or
>
> (c) installations that are already broken and lose data with a physical
> drive with a write-cache can lose much more in this case because the
> write cache is much bigger?
>
(c) is the closest to accurate.

It basically boils down to this: most enterprises use disks with
battery-backed write caches. Having the host act as a giant write cache
means that you can lose data.

I agree that a well-behaved file system will not become corrupt, but my
contention is that for many types of applications, data loss ==
corruption, and not all file systems are well behaved. It's
certainly valid to argue about whether common filesystems are "broken",
but from a purely pragmatic perspective, this is going to be the case.

Regards,

Anthony Liguori