From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4EBAED97.2000100@codemonkey.ws>
Date: Wed, 09 Nov 2011 15:16:07 -0600
From: Anthony Liguori
MIME-Version: 1.0
References: <087d3dc42c667ea146edc73492b0f4afdd3a911d.1320865627.git.quintela@redhat.com> <4EBADBC0.8000201@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/2] Reopen files after migration
To: quintela@redhat.com
Cc: qemu-devel@nongnu.org

On 11/09/2011 03:10 PM, Juan Quintela wrote:
> Anthony Liguori wrote:
>> On 11/09/2011 01:16 PM, Juan Quintela wrote:
>>> We need to invalidate the Read Cache on the destination, otherwise we
>>> have corruption. Easy way to reproduce it is:
>>>
>>> - create a qcow2 image
>>> - start qemu on destination of migration (qemu .... -incoming tcp:...)
>>> - start qemu on source of migration and do one install.
>>> - migrate at the end of install (when a lot of disk IO has happened).
>>>
>>> Destination of migration has a local copy of the L1/L2 tables that existed
>>> at the beginning, before the install started. We have disk corruption at
>>> this point. The solution (for NFS) is to just re-open the file.
>>> Operations have to happen in this order:
>>>
>>> - source of migration: flush()
>>> - destination: close(file);
>>> - destination: open(file)
>>>
>>> It is not necessary that the source of migration close the file.
>>>
>>> Signed-off-by: Juan Quintela
>>
>> Couple thoughts:
>>
>> 1) Pretty sure this would break -snapshot. I do test migration with
>> -snapshot so please don't break it.
>
> Can you give me one example? I don't know how to use -snapshot with migration.

This is totally unsafe but has always worked for me. On the same box:

$ qemu -hda foo.img -snapshot
$ qemu -hda foo.img -snapshot -incoming tcp:localhost:1025

This is not the *only* way I test migration but it's very convenient for sniff
testing.

The problem with your patch is that it assumes that once you've opened a file,
the name still exists. But that is not universally true. It needs to degrade
in a useful way.

I think just deferring open is probably the best strategy.

>
>> 2) I don't think this is going to work very well with encrypted drives.
>
> To be honest, no clue.

Deferring open addresses this in a nice way, I think.

>> Perhaps we could do something like:
>>
>> http://mid.gmane.org/1284213896-12705-2-git-send-email-aliguori@us.ibm.com
>
> That is the kind of thing I wanted to know.
>
>> And do reopen as a default implementation. That way we don't have to
>> do reopen for formats that don't need it (raw)
>
> Kevin told me that now that we allow online resize, we should also
> update that for raw, but I haven't tested to be sure one way or another.
>
>> or can flush caches without reopening the file (qed).
>
> qcow2 could be told to flush caches, it is just that the code is not there.
> It shouldn't be _that_ difficult. But I am not able to understand
> the block_open<-> block_file_open relationship anymore.
>
>> It doesn't fix NFS close-to-open, but I think the right way to do that
>> is to defer the open, not to reopen.
>
> Fully agree here, that would be another way to fix it.
> See that in my other answer I showed that Markus already has problems with
> ide + cmos, so I think that we should have:

I've posted patches that delay the geometry guess until the device model is
initialized. That avoids this particular problem.

Regards,

Anthony Liguori

>
> - initialization done before we open files/block/
> - open files/block/...
> - late initialization that uses that (almost nothing needs to be here
>   and should be easy to audit).
>
> About NFS, iSCSI, FC, my understanding is that if you use anything
> different than cache=none you are playing with fire, and will get burned
> sooner or later (it took quite a bit for Christoph to make me understand
> that, but now I fully agree with him).
>
> Later, Juan.
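
Illustrative note: the ordering discussed in this thread (source flush, then
destination close and open) can be shown with a toy model. This is only a
sketch under stated assumptions, not QEMU code. The names CachedImage, demo,
disk.img and the "table-v1"/"table-v2" payloads are hypothetical. It models
only an application-level cache of metadata read at open time, similar in
spirit to the qcow2 L1/L2 tables. It runs on a local POSIX filesystem, so it
does not reproduce NFS close-to-open behaviour.

```python
import os
import tempfile


class CachedImage:
    """Toy stand-in for a block driver that caches metadata at open time."""

    def __init__(self, path):
        self.path = path
        self.fd = None
        self.table = None
        self.open()

    def open(self):
        self.fd = os.open(self.path, os.O_RDONLY)
        # Read the "table" once and keep it, as a driver might do with
        # its lookup tables when the image is first opened.
        self.table = os.pread(self.fd, 16, 0)

    def close(self):
        os.close(self.fd)
        self.fd = None
        self.table = None

    def reopen(self):
        # Destination side of the hand-off: close(file), then open(file).
        self.close()
        self.open()


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "disk.img")

        # Source creates the image with an initial table.
        src = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.pwrite(src, b"table-v1", 0)
        os.fsync(src)

        # Destination starts early (as with -incoming) and caches the table.
        dst = CachedImage(path)

        # Source keeps running and rewrites the table.
        os.pwrite(src, b"table-v2", 0)
        stale = dst.table  # still the copy taken at open time

        # Hand-off: source flushes first, then destination reopens.
        os.fsync(src)
        dst.reopen()
        fresh = dst.table

        dst.close()
        os.close(src)
        return stale, fresh


if __name__ == "__main__":
    stale, fresh = demo()
    print("before reopen:", stale)
    print("after reopen: ", fresh)
```

Deferring the destination's open until after the source has flushed would give
the same fresh view without a second open, which is the alternative preferred
in the reply above.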