From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=41304 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Oxihm-0002c3-Cn
	for qemu-devel@nongnu.org; Mon, 20 Sep 2010 11:55:04 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <kwolf@redhat.com>) id 1Oxihl-0008NR-4t
	for qemu-devel@nongnu.org; Mon, 20 Sep 2010 11:54:58 -0400
Received: from mx1.redhat.com ([209.132.183.28]:10693)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <kwolf@redhat.com>) id 1Oxihk-0008NB-SE
	for qemu-devel@nongnu.org; Mon, 20 Sep 2010 11:54:57 -0400
Message-ID: <4C9783E7.5080905@redhat.com>
Date: Mon, 20 Sep 2010 17:55:19 +0200
From: Kevin Wolf <kwolf@redhat.com>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RFC] block-queue: Delay and batch metadata writes
References: <1284991010-10951-1-git-send-email-kwolf@redhat.com>
	<4C977028.3050602@codemonkey.ws> <4C9778EC.9060704@redhat.com>
	<4C978071.2010209@codemonkey.ws>
In-Reply-To: <4C978071.2010209@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: qemu-devel@nongnu.org

On 20.09.2010 17:40, Anthony Liguori wrote:
> On 09/20/2010 10:08 AM, Kevin Wolf wrote:
>>> If you're comfortable with a writeback cache for metadata, then you
>>> should also be comfortable with a writeback cache for data in which
>>> case, cache=writeback is the answer.
>>>      
>> Well, there is a difference: We don't pollute the host page cache with
>> guest data and we don't get a virtual "disk cache" as big as the host
>> RAM, but only a very limited queue of metadata.
>>
>> Basically, in qemu we have three different types of caching:
>>
>> 1. O_DSYNC, everything is always synced without any explicit request.
>>     This is cache=writethrough.
>>    
> 
> I actually think O_DSYNC is the wrong implementation of 
> cache=writethrough.  cache=writethrough should behave just like 
> cache=none except that data goes through the page cache.

Then you have cache=writeback, basically.

>> 2. Nothing is ever synced. This is cache=unsafe.
>>
>> 3. We present a writeback disk cache to the guest and the guest needs
>>     to explicitly flush to get its data safely on disk. This is
>>     cache=writeback and cache=none.
>>    
> 
> We shouldn't tie the virtual disk cache to which cache= option is used 
> in the host.  cache=none means that all requests go directly to the 
> disk.  cache=writeback means the host acts as a writeback cache.

No, that's not the meaning of cache=none if you take the disk cache into
consideration. It might be what you think should be the meaning of
cache=none, but it's not what it means in any qemu release.

> If your disk is in writethrough mode, exposing cache=none as a writeback 
> disk cache is not correct.

The host's disk is writethrough? In that case, presenting a writeback
cache to the guest is more conservative than needed, yes.

>> We're still lacking modes for O_DSYNC | O_DIRECT and unsafe | O_DIRECT,
>> but they are entirely possible, because it's two different dimensions.
>> (And I think Christoph was planning to actually make it two independent
>> options)
> 
> I don't really think O_DSYNC | O_DIRECT makes much sense.

Maybe, maybe not. It's just a missing entry in the matrix.
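
To make the "matrix" concrete, here is a rough sketch of the two
dimensions as I described them above: whether the host page cache is
bypassed (O_DIRECT), and when data gets synced. The names Sync,
NAMED_MODES and missing_modes are made up for illustration and are not
anything in the qemu code; the table only restates the modes listed in
this mail.

```python
from enum import Enum
from itertools import product


class Sync(Enum):
    """Second dimension: when data reaches stable storage (illustrative names)."""
    ALWAYS = "O_DSYNC"        # everything is synced without an explicit request
    ON_FLUSH = "writeback"    # the guest has to flush explicitly
    NEVER = "unsafe"          # nothing is ever synced


# Key: (uses O_DIRECT, i.e. bypasses the host page cache; sync policy)
# Value: the existing cache= option name, as listed in this thread.
NAMED_MODES = {
    (False, Sync.ALWAYS): "writethrough",
    (False, Sync.ON_FLUSH): "writeback",
    (True, Sync.ON_FLUSH): "none",
    (False, Sync.NEVER): "unsafe",
}


def missing_modes():
    """Return the combinations of the two dimensions that have no cache= name."""
    return [
        (direct, sync)
        for direct, sync in product((False, True), Sync)
        if (direct, sync) not in NAMED_MODES
    ]


if __name__ == "__main__":
    for direct, sync in missing_modes():
        print(("O_DIRECT | " if direct else "") + sync.value)
```

The two combinations it reports are the O_DSYNC | O_DIRECT and
unsafe | O_DIRECT cases mentioned above. Whether either of them is
useful in practice is a separate question.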

Kevin