From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40948) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YDZBO-0006p1-TI for qemu-devel@nongnu.org; Tue, 20 Jan 2015 08:49:28 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YDZBL-0007yj-MC for qemu-devel@nongnu.org; Tue, 20 Jan 2015 08:49:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:38321) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YDZBL-0007yc-FH for qemu-devel@nongnu.org; Tue, 20 Jan 2015 08:49:23 -0500 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t0KDnLhR025120 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL) for ; Tue, 20 Jan 2015 08:49:22 -0500 Message-ID: <54BE5CE0.4080207@redhat.com> Date: Tue, 20 Jan 2015 08:49:20 -0500 From: Max Reitz MIME-Version: 1.0 References: <1421700544-24635-1-git-send-email-mreitz@redhat.com> <54BD7173.7060901@redhat.com> <54BD727D.5010608@redhat.com> <20150120100903.GA4318@noname.redhat.com> In-Reply-To: <20150120100903.GA4318@noname.redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH 0/2] qcow2: Add two more unalignment checks List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: qemu-devel@nongnu.org, Stefan Hajnoczi On 2015-01-20 at 05:09, Kevin Wolf wrote: > Am 19.01.2015 um 22:09 hat Max Reitz geschrieben: >> On 2015-01-19 at 16:04, Eric Blake wrote: >>> On 01/19/2015 01:49 PM, Max Reitz wrote: >>>> With the series adding unalignment checks and the series reworking t= he >>>> zero cluster expansion code overlapping, the unalignment checks have= not >>>> been implemented in the latter code. >>>> >>>> This series fixes it. >>>> >>>> There are other places which would require unalignment checks, like = the >>>> offsets of L1 tables, especially for snapshots; but because it would= be >>>> best to add these checks in the function which reads the snapshot ta= ble, >>>> this would make images with broken snapshots completely unusable, wh= ich >>>> is something I opted to avoid for now. > That's something I noticed, too: For qemu-img check, our qcow2_open() > checks may be to strict. With a corrupted snapshot table, it won't even > run. Perhaps we should be laxer with some checks with BDRV_O_CHECK, and > for example just set s->nb_snapshots =3D 0 if loading the table fails. > >>>> Ideally, we need to make the qcow2 repair function repair such cases= , >>>> but until that is done there is not much we can do about them. >>> What's the best repair? >> That's what I was asking myself... >> >>> Read the data from the unaligned location, and >>> write a fresh copy into a new aligned allocation? >> Maybe. Maybe there is no way of repairing them and the only way is >> asking the user whether it's fine to delete the snapshot (qemu-img >> check -r make_consistent_whatever_it_takes). >> >> Also, the question remains for every unaligned data structure: L2 >> tables, data clusters=E2=80=A6 The refcount structures are the only on= es >> that can be easily recovered. For data clusters, reading from the >> unaligned location would probably be best; for L2 tables=E2=80=A6 mayb= e the >> same, then run a consistency check on the data and if it seems off=E2=84= =A2, >> throw the whole L2 table away. > Reading from the unaligned location doesn't help. I have never seen an > image whose L2 entries contained valid entries, except for the least > significant few bits. If your table is corrupted, it's usually garbage > from start to end. Probably so. > I think the only sane option is to drop the entries. The big question > is what should be used to replace them. /dev/random of course. There is a chance we are correct=E2=80=A6 Seriously speaking, well, unmap them? Zero clusters don't feel right to m= e. Max