From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1LYWcK-00015L-EQ
	for qemu-devel@nongnu.org; Sat, 14 Feb 2009 21:20:24 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1LYWcJ-00014v-EI
	for qemu-devel@nongnu.org; Sat, 14 Feb 2009 21:20:23 -0500
Received: from [199.232.76.173] (port=35564 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1LYWcJ-00014o-8s
	for qemu-devel@nongnu.org; Sat, 14 Feb 2009 21:20:23 -0500
Received: from mail2.shareable.org ([80.68.89.115]:58160)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from )
	id 1LYWcI-0007Mj-Qm
	for qemu-devel@nongnu.org; Sat, 14 Feb 2009 21:20:23 -0500
Date: Sun, 15 Feb 2009 02:20:20 +0000
From: Jamie Lokier
Subject: Re: [Qemu-devel] [PATCH] Revert block-qcow2.c to kvm-72 version due to corruption reports
Message-ID: <20090215022020.GB9281@shareable.org>
References: <4988AD96.6090308@codemonkey.ws> <20090213084023.GA1020@kos.to> <20090213163043.GJ18471@shareable.org> <4995A723.9010208@codemonkey.ws> <20090213190419.GB20328@shareable.org> <49974466.8060204@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <49974466.8060204@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: dlaor@redhat.com, qemu-devel@nongnu.org

Dor Laor wrote:
> The solution is to find the real cause to the corruption.

I agree: if someone is able to do that, great. But if not, practical
reality leaves these choices:

  1. Ship the current code, which results in corruption on Windows
     2000 and 2003 guests (and who knows what else). That corruption,
     by the way, is unlikely to have anything to do with device
     emulation.

  2. Revert to (nearly) kvm-72 code, which appears to fix the majority
     of those corruption cases, although there is still something
     rare, which may be a different bug.

Which is the better choice?

From a QA POV, I would revert the known bug until someone has a fix,
then reinstate everything after it which is thought to be good.

> Jamie Lokier wrote:
> > Anthony Liguori wrote:
> > > > Simply reverting the qcow2 code appears to fix those problems,
> > > > so it needn't hold up cutting a release. That's what I
> > > > recommend.
> > > Send some patches.
> > I did already.
> >
> > Here it is again. This should fix my bug and Marc's bug according to
> > his report that reverting qcow2.c fixes it.
> Going back to kvm-72 is not good also.
> First, there were qcow2 corruptions before it, they were very rare but still
> exist.

That's true. But they were noticeably rarer, to the point that people
clearly are using kvm-72 with qcow2 and not reporting many problems.

Ubuntu 8.10 shipped kvm-72, and that coincided with their announcement
that they're supporting KVM as their official virtualisation solution.
I imagine kvm-72 is getting a fair bit of usage because of that.

Of course they could be having rare problems and think it's a bug in
the guest or its applications :-)

> Not long ago we did not know even that qcow2 is the faulty.

Worrying, isn't it?

Does qcow2 get any rigorous testing? Should that be added - a
blockdev test suite?

There hasn't been a complete lack of bug reports about qcow2, but
maybe they aren't getting to the right places, and maybe they're too
difficult to reproduce and too easy to work around ("my guest
occasionally shows random corruption", "don't use KVM for that guest",
"I switch to raw and it went away").

I very luckily discovered it prevented one of my VMs from booting, as
soon as I upgraded from kvm-72 (shipped with Ubuntu) to something
newer. If it hadn't prevented it from booting, and had only caused
occasional rare corruption, I might not have realised it was qcow2 at
all.
Guest corruption can occur for many reasons, and -win2k-hack implies
that the IDE emulation is not quite right in some way.

-- Jamie