From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1H6uaW-0006Kj-5T for qemu-devel@nongnu.org; Tue, 16 Jan 2007 15:07:20 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1H6uaV-0006Jy-EN for qemu-devel@nongnu.org; Tue, 16 Jan 2007 15:07:19 -0500
Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1H6uaV-0006Jk-9O for qemu-devel@nongnu.org; Tue, 16 Jan 2007 15:07:19 -0500
Received: from [84.96.92.61] (helo=sMtp.neuf.fr) by monty-python.gnu.org with esmtp (Exim 4.52) id 1H6uaU-0005RI-Rq for qemu-devel@nongnu.org; Tue, 16 Jan 2007 15:07:19 -0500
Received: from [84.102.211.89] by sp604002mt.gpm.neuf.ld (Sun Java System Messaging Server 6.2-5.05 (built Feb 16 2006)) with ESMTP id <0JBZ00HUE7QTCT90@sp604002mt.gpm.neuf.ld> for qemu-devel@nongnu.org; Tue, 16 Jan 2007 20:35:17 +0100 (CET)
Date: Tue, 16 Jan 2007 20:35:26 +0100
From: Fabrice Bellard
Subject: Re: [Qemu-devel] Race condition in VMDK (QCOW*) formats.
In-reply-to: <64F9B87B6B770947A9F8391472E0321609F7A82B@ehost011-8.exch011.intermedia.net>
Message-id: <45AD28FE.9040705@bellard.org>
MIME-version: 1.0
Content-type: text/plain; charset=UTF-8; format=flowed
References: <64F9B87B6B770947A9F8391472E0321609F7A82B@ehost011-8.exch011.intermedia.net>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Post:
List-Help:
To: qemu-devel@nongnu.org

Well, it was never said that the QCOW* code was safe if you interrupted
QEMU at some point.

But I agree that it could be safer to write the sector first and update
the links after. It could be interesting to analyze the QCOW2 snapshot
handling too (what if QEMU is stopped during the creation of a snapshot?).

Regards,

Fabrice.
Igor Lvovsky wrote:
>
> Hi all,
>
> I have a doubt about a race condition during the *write operation on
> snapshot*.
>
> I think the problem exists in the VMDK and QCOW* formats (I didn't check
> the others).
>
> The example is from block_vmdk.c:
>
> static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
>                       const uint8_t *buf, int nb_sectors)
> {
>     BDRVVmdkState *s = bs->opaque;
>     int ret, index_in_cluster, n;
>     uint64_t cluster_offset;
>
>     while (nb_sectors > 0) {
>         index_in_cluster = sector_num & (s->cluster_sectors - 1);
>         n = s->cluster_sectors - index_in_cluster;
>         if (n > nb_sectors)
>             n = nb_sectors;
>         cluster_offset = get_cluster_offset(bs, sector_num << 9, 1);
>         if (!cluster_offset)
>             return -1;
>         lseek(s->fd, cluster_offset + index_in_cluster * 512, SEEK_SET);
>         ret = write(s->fd, buf, n * 512);
>         if (ret != n * 512)
>             return -1;
>         nb_sectors -= n;
>         sector_num += n;
>         buf += n * 512;
>     }
>     return 0;
> }
>
> The get_cluster_offset(...) routine updates the L2 table of the metadata
> and returns the cluster_offset.
>
> After that, the vmdk_write(...) routine actually writes the grain at the
> right place.
>
> So we have a timing hole here.
>
> Assume the VM performing the write operation is destroyed at this moment.
> We then have a corrupted image (with an updated L2 table, but without the
> grain itself).
>
> Regards,
>
> Igor Lvovsky
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Qemu-devel mailing list
> Qemu-devel@nongnu.org
> http://lists.nongnu.org/mailman/listinfo/qemu-devel