From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions
Date: Tue, 15 Nov 2011 07:56:28 -0600
Message-ID: <4EC26F8C.10102@codemonkey.ws>
References: <4EBAAA68.10801@redhat.com> <4EBAACAF.4080407@codemonkey.ws>
 <4EBAB236.2060409@redhat.com> <4EBAB9FA.3070601@codemonkey.ws>
 <4EBB919B.7040605@redhat.com> <4EBC1792.3030004@codemonkey.ws>
 <4EBC4260.1090405@codemonkey.ws> <4EBCF5DA.1000605@redhat.com>
 <4EBE499E.4030100@redhat.com> <20111114101610.GA32392@redhat.com>
 <4EC1238B.2030906@codemonkey.ws> <874ny5iktv.fsf@trasno.mitica>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Kevin Wolf, Lucas Meneghel Rodrigues, KVM mailing list,
 "Michael S. Tsirkin", "libvir-list@redhat.com", Marcelo Tosatti,
 QEMU devel, Avi Kivity
To: quintela@redhat.com
Received: from mail-iy0-f174.google.com ([209.85.210.174]:44901 "EHLO
 mail-iy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753894Ab1KON4c (ORCPT ); Tue, 15 Nov 2011 08:56:32 -0500
Received: by iage36 with SMTP id e36so8452310iag.19 for ;
 Tue, 15 Nov 2011 05:56:32 -0800 (PST)
In-Reply-To: <874ny5iktv.fsf@trasno.mitica>
Sender: kvm-owner@vger.kernel.org

On 11/15/2011 07:20 AM, Juan Quintela wrote:
>> Again, I think defaulting DAS to cache=none|directsync is what makes
>> the most sense here.
>
> I think it is the only sane solution.  Otherwise, we need to write the
> equivalent of a lock manager, to know _who_ has the storage, and
> distributed lock managers are a mess :-(
>
>> We can even add a migration blocker for DAS with cache=on.  If we can
>> do dynamic toggling of the cache setting, then that's pretty friendly
>> at the end of the day.
>
> That could fix the problem also.  At the moment that we start migration,
> we do an fsync() + switch to O_DIRECT for all filesystems.
>
> As you said, time for implementing fcntl(O_DIRECT).

Yeah, I think this ends up being a very elegant solution.

We always open block devices O_DIRECT to start with.  That ensures reads
go directly to disk if it's DAS, or result in NFS protocol reads.  As long
as we fsync on the source (and we do), then we're okay.

For cache=write{back,through}, we would then just fcntl() away O_DIRECT
as soon as we start the guest.  Then we can start doing reads through the
page cache.

Regards,

Anthony Liguori

> Later, Juan.
>
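
To make the fcntl() idea above concrete, here is a minimal sketch of
toggling O_DIRECT on an already-open file descriptor.  It assumes Linux,
where fcntl(2) documents O_DIRECT as one of the flags F_SETFL may change.
The helper names (set_o_direct, has_o_direct, flush_and_go_direct) are
hypothetical and are not taken from QEMU's block layer.  Setting the flag
can fail with EINVAL on filesystems that do not support direct I/O, so
callers have to check the return value.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn O_DIRECT on (enable != 0) or off (enable == 0)
 * for an open descriptor.  Returns 0 on success, -1 on error (errno set). */
static int set_o_direct(int fd, int enable)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFL);        /* read current file status flags */
    if (flags == -1)
        return -1;

    if (enable)
        flags |= O_DIRECT;                 /* bypass the host page cache */
    else
        flags &= ~O_DIRECT;                /* "fcntl() away" O_DIRECT */

    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/* Hypothetical helper: 1 if O_DIRECT is set, 0 if clear, -1 on error. */
static int has_o_direct(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_DIRECT) != 0;
}

/* Hypothetical helper for the source side at migration start: flush dirty
 * data to storage first, then switch the descriptor to O_DIRECT. */
static int flush_and_go_direct(int fd)
{
    if (fsync(fd) == -1)
        return -1;
    return set_o_direct(fd, 1);
}
```

Under these assumptions, the destination would open the image with
O_DIRECT and call set_o_direct(fd, 0) once the guest starts for
cache=write{back,through}, while the source would call
flush_and_go_direct(fd) when migration begins.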