From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:57802)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1TimZX-0004JW-Bd
	for qemu-devel@nongnu.org; Wed, 12 Dec 2012 08:42:10 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1TimZR-0002HY-CP
	for qemu-devel@nongnu.org; Wed, 12 Dec 2012 08:42:03 -0500
Received: from mx1.redhat.com ([209.132.183.28]:42160)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1TimZR-0002HJ-4Z
	for qemu-devel@nongnu.org; Wed, 12 Dec 2012 08:41:57 -0500
Message-ID: <50C8899D.2050308@redhat.com>
Date: Wed, 12 Dec 2012 14:41:49 +0100
From: Kevin Wolf <kwolf@redhat.com>
MIME-Version: 1.0
References: <1339767219-24297-1-git-send-email-kwolf@redhat.com>
	<1339767219-24297-29-git-send-email-kwolf@redhat.com>
	<201212121425.41850.hahn@univention.de>
In-Reply-To: <201212121425.41850.hahn@univention.de>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix
 avail_sectors in cluster allocation code
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Philipp Hahn <hahn@univention.de>
Cc: Michael Tokarev <mjt@tls.msk.ru>, qemu-devel@nongnu.org

Hi Philipp,

On 12.12.2012 14:25, Philipp Hahn wrote:
> Hello Kevin, hello Michael, hello *,
> 
> we noticed a data corruption bug in qemu-1.1.2, which will be shipped by 
> Debian and our own Debian-based distribution.
> The corruption mostly manifests while installing large Debian package files 
> and seems to be related to memory pressure: As long as the file is still in 
> the page cache, everything looks fine, but when the file is re-read from the 
> virtual hard disk using a qcow2 file backed by another qcow2 file, the file 
> is corrupted: dpkg complains that the .tar.gz file inside the Debian archive 
> file is corrupted and the md5sum no longer matches.
> 
> I tracked this down using "git bisect" to your patch attached below, which 
> fixed this bug, so everything is fine with qemu-kvm-1.2.0.
> From my reading this seems to explain our problems, since during my own 
> testing during development I never used backing chains and the problem only 
> showed up when my colleagues started using qemu-kvm-1.1.2 with their VMs using 
> backing chains.
> 
> @Kevin: Do you think that's a valid explanation and your patch should fix 
> that problem?
> I'd like to get your expertise before filing a bug with Debian and asking 
> Michael to include that patch with his next stable update for 1.1.

As you can see in the commit message of that patch, I was convinced that
no bug existed in practice and that this was only dangerous with respect
to future changes. Therefore my first question is whether you're using
an unmodified upstream qemu or whether some backported patches are
applied to it. If it's indeed unmodified, we should probably review the
code once again to understand why it makes a difference.

In any case, this is the cluster allocation code. It's probably not
related to rereading things from disk, but rather to the writeout of the
page cache.

Kevin