From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:49512) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ROsCt-0002hG-7a for qemu-devel@nongnu.org; Fri, 11 Nov 2011 09:35:52 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ROsCr-00015z-VI for qemu-devel@nongnu.org; Fri, 11 Nov 2011 09:35:51 -0500
Received: from mail-yw0-f45.google.com ([209.85.213.45]:51513) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ROsCr-00015t-NH for qemu-devel@nongnu.org; Fri, 11 Nov 2011 09:35:49 -0500
Received: by ywa17 with SMTP id 17so450975ywa.4 for ; Fri, 11 Nov 2011 06:35:49 -0800 (PST)
Message-ID: <4EBD32C1.6000805@codemonkey.ws>
Date: Fri, 11 Nov 2011 08:35:45 -0600
From: Anthony Liguori
MIME-Version: 1.0
References: <4EBAAA68.10801@redhat.com> <4EBAACAF.4080407@codemonkey.ws> <4EBAB236.2060409@redhat.com> <4EBAB9FA.3070601@codemonkey.ws> <4EBB919B.7040605@redhat.com> <4EBC1792.3030004@codemonkey.ws> <4EBC4260.1090405@codemonkey.ws> <4EBCF5DA.1000605@redhat.com> <4EBD2B4F.5040409@codemonkey.ws> <4EBD3145.3050409@redhat.com>
In-Reply-To: <4EBD3145.3050409@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions
To: Kevin Wolf
Cc: Lucas Meneghel Rodrigues , KVM mailing list , "Michael S. Tsirkin" , "libvir-list@redhat.com" , Marcelo Tosatti , QEMU devel , Juan Jose Quintela Carreira , Avi Kivity

On 11/11/2011 08:29 AM, Kevin Wolf wrote:
> On 11.11.2011 15:03, Anthony Liguori wrote:
>> On 11/11/2011 04:15 AM, Kevin Wolf wrote:
>>> On 10.11.2011 22:30, Anthony Liguori wrote:
>>>> Live migration with qcow2 or any other image format is just not going to work
>>>> right now even with proper clustered storage.
>>>> I think doing a block level flush
>>>> cache interface and letting block devices decide how to do it is the best approach.
>>>
>>> I would really prefer reusing the existing open/close code. It means
>>> less (duplicated) code, is existing code that is well tested and doesn't
>>> make migration much of a special case.
>>
>> Just to be clear, reopen only addresses image format migration. It does not
>> address NFS migration since it doesn't guarantee close-to-open semantics.
>
> Yes. But image formats are the only thing that is really completely
> broken today. For NFS etc. we can tell users to use
> cache=none/directsync and they will be good. There is no such option
> that makes image formats safe.
>
>> The problem I have with the reopen patches are that they introduce regressions
>> and change at semantics for a management tool. If you look at the libvirt
>> workflow with encrypted disks, it would break with the reopen patches.
>
> Yes, this is nasty. But on the other hand: Today migration is broken for
> all qcow2 images, with the reopen it's only broken for encrypted ones.
> Certainly an improvement, even though there's still a bug left.

This sounds like a good thing to work through in the next release.

>
>>> If you want to avoid reopening the file on the OS level, we can reopen
>>> only the topmost layer (i.e. the format, but not the protocol) for now
>>> and in 1.1 we can use bdrv_reopen().
>>
>> I don't view not supporting migration with image formats as a regression as it's
>> never been a feature we've supported. While there might be confusion about
>> support around NFS, I think it's always been clear that image formats cannot be
>> used.
>>
>> Given that, I don't think this is a candidate for 1.0.
>
> Nobody says it's a regression, but it's a bad bug and you're blocking a
> solution for it for over a year now because the solution isn't perfect
> enough in your eyes. :-(

This patch was posted a year ago.
Feedback was provided and there was never any follow-up[1]. I've never
Nack'd this approach. I can't see how I was blocking this since I never
even responded in the thread. If this came in before soft freeze, I
wouldn't have objected if you wanted to go in this direction.

This is not a bug fix, this is a new feature. We're long past feature
freeze. It's not a simple and obvious fix either. It only partially fixes
the problem and introduces other problems. It's not a good candidate for
making an exception at this stage in the release.

[1] http://mid.gmane.org/cover.1294150511.git.quintela@redhat.com

Regards,

Anthony Liguori

>
> Kevin
>