Message-ID: <53E53D0F.3030808@redhat.com>
Date: Fri, 08 Aug 2014 23:11:43 +0200
From: Max Reitz
References: <1407444475-19516-1-git-send-email-mreitz@redhat.com> <1407444475-19516-4-git-send-email-mreitz@redhat.com> <20140808091527.GE4118@noname.redhat.com>
In-Reply-To: <20140808091527.GE4118@noname.redhat.com>
Subject: Re: [Qemu-devel] [PATCH 3/3] block: Catch !bs->drv in bdrv_check()
To: Kevin Wolf
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi

On 08.08.2014 11:15, Kevin Wolf wrote:
> On 07.08.2014 at 22:47, Max Reitz wrote:
>> qemu-img check calls bdrv_check() twice if the first run repaired some
>> inconsistencies. If the first run however again triggered corruption
>> prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
>> NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.
>>
>> Signed-off-by: Max Reitz
> I suppose there was a real case of this happening?
> I think bdrv_check() triggering corruption prevention is a rather bad
> sign. The most important point for image repair should be that it
> doesn't make the situation any worse. Smells like a follow-up patch to
> the qcow2 code.

Yes, as I wrote in the cover letter, taking the image provided in
https://bugs.launchpad.net/qemu/+bug/1353456 and setting its refblock
offset (the reftable entry) to 0 results in a segmentation fault.

A simple way to trigger corruption during bdrv_check() is to create an
image, set the first (and only) reftable entry to 0 and run
"qemu-img check -r all". bdrv_check() will try to allocate a refblock,
but since the first clusters now appear unallocated, it will allocate
the refblock there, which would obviously overwrite the image header
and/or L1 table and/or reftable.

The only way I can imagine to fix this is to completely disregard the
on-disk refcount information during bdrv_check() and instead use only
the calculated refcounts. This would require dedicated allocation
functions, which would probably be rather simple, but in any case we'd
need to write them.

I think I should have some time, so I'll have a look into it.

Max
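
To make the idea of allocating from the calculated refcounts a bit more
concrete, here is a rough sketch. It is not QEMU code: the function name
alloc_clusters_calculated() and the flat in-memory uint16_t table are
assumptions made purely for illustration. It does a first-fit search
over the calculated refcounts and never looks at the on-disk ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing QEMU function): first-fit search
 * over an in-memory table of *calculated* refcounts.
 *
 * Finds the first run of `count` consecutive clusters whose refcount is
 * 0, marks them as used (refcount 1) in the table and returns the index
 * of the first cluster. Returns -1 if no such run exists.
 */
static int64_t alloc_clusters_calculated(uint16_t *refcounts,
                                         size_t nb_clusters, size_t count)
{
    size_t run = 0; /* length of the current run of free clusters */

    assert(refcounts != NULL);
    if (count == 0) {
        return -1;
    }

    for (size_t i = 0; i < nb_clusters; i++) {
        run = (refcounts[i] == 0) ? run + 1 : 0;
        if (run == count) {
            size_t first = i + 1 - count;
            for (size_t j = first; j <= i; j++) {
                refcounts[j] = 1; /* claim the cluster */
            }
            return (int64_t)first;
        }
    }
    return -1;
}
```

Fed an all-zero table, which is what the on-disk view looks like once
the reftable entry is 0, the helper hands out cluster 0, i.e. the image
header. Fed a table in which the metadata clusters are already counted
(say clusters 0 to 2, an assumed layout), the first allocation lands on
cluster 3 instead.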