Message-ID: <50EFDC1E.40102@redhat.com>
Date: Fri, 11 Jan 2013 10:32:14 +0100
From: Kevin Wolf
In-Reply-To: <87k3rkxijx.wl%morita.kazutaka@lab.ntt.co.jp>
Subject: Re: [Qemu-devel] [PATCH] sheepdog: implement direct write semantics
To: MORITA Kazutaka
Cc: Stefan Hajnoczi, Liu Yuan, qemu-devel@nongnu.org, Paolo Bonzini

On 11.01.2013 08:52, MORITA Kazutaka wrote:
> At Thu, 10 Jan 2013 13:38:16 +0800,
> Liu Yuan wrote:
>>
>> On 01/09/2013 11:10 PM, Paolo Bonzini wrote:
>>> On 09/01/2013 14:04, Liu Yuan wrote:
>>>>>> 2 The upper layer software which relies on the 'cache=xxx' to choose
>>>>>> cache mode will fail its assumption against new QEMU.
>>>>>
>>>>> Which assumptions do you mean? As far as I can say the behaviour hasn't
>>>>> changed, except possibly for the performance.
>>>>
>>>> When users set 'cache=writethrough' to export only a writethrough cache
>>>> to Guest, but with new QEMU, it will actually get a writeback cache as
>>>> default.
>>>
>>> They get a writeback cache implementation-wise, but they get a
>>> writethrough cache safety-wise. How the cache is implemented doesn't
>>> matter, as long as it "looks like" a writethrough cache.
>>>
>>> In fact, consider a local disk that doesn't support FUA. In old QEMU,
>>> images used to be opened with O_DSYNC and that splits each write into
>>> WRITE+FLUSH, just like new QEMU. All that changes is _where_ the
>>> flushes are created. Old QEMU changes it in the kernel, new QEMU
>>> changes it in userspace.
>>>
>>>> We don't need to communicate to the guest. I think 'cache=xxx' means
>>>> what kind of cache the users *expect* to export to Guest OS. So if
>>>> cache=writethrough set, Guest OS couldn't turn it to writeback cache
>>>> magically. This is like I bought a disk with 'writethrough' cache
>>>> built-in, I didn't expect that it turned to be a disk with writeback
>>>> cache under the hood which could possibly lose data when power outage
>>>> happened.
>>>
>>> It's not by magic. It's by explicitly requesting the disk to do this.
>>>
>>> Perhaps it's a bug that the cache mode is not reset when the machine is
>>> reset. I haven't checked that, but it would be a valid complaint.
>>>
>>
>> Ah I didn't get the current implementation right.
>> I tried the 3.7 kernel
>> and it works as expected (cache=writethrough results in a 'writethrough'
>> cache in the guest).
>>
>> It looks fine to me to emulate writethrough as writeback + flush, since
>> the performance drop isn't big, though sheepdog itself supports true
>> writethrough cache (no flush).
>
> Can we drop the SD_FLAG_CMD_CACHE flag from sheepdog write requests
> when bdrv_enable_write_cache() is false? Then the requests behave
> like FUA writes and we can safely omit succeeding flush requests.

First we would need to make sure that on a writeback -> writethrough
switch a flush happens. But once this is implemented, I think your
suggestion would work well.

Kevin