From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5438FBF4.7070504@redhat.com>
Date: Sat, 11 Oct 2014 11:44:20 +0200
From: Max Reitz
References: <1408215258-12545-1-git-send-email-mreitz@redhat.com> <1408215258-12545-2-git-send-email-mreitz@redhat.com> <20141010115011.GB10091@irqsave.net>
In-Reply-To: <20141010115011.GB10091@irqsave.net>
Subject: Re: [Qemu-devel] [PATCH 1/3] block: Ignore allocation size in underlying file
To: Benoît Canet
Cc: Kevin Wolf, qemu-devel@nongnu.org, Stefan Hajnoczi

On 10.10.2014 at 13:50, Benoît Canet wrote:
> On Saturday 16 Aug 2014 at 20:54:16 (+0200), Max Reitz wrote:
>> When falling through to the underlying file in
>> bdrv_co_get_block_status(), do not let the number of sectors for which
>> information could be obtained be overwritten.
>>
>> Signed-off-by: Max Reitz
>> ---
>>   block.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 3e252a2..c922664 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3991,9 +3991,11 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>>       if (bs->file &&
>>           (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
>>           (ret & BDRV_BLOCK_OFFSET_VALID)) {
>> +        int backing_pnum;
>> +
>>           ret2 = bdrv_co_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
>> -                                        *pnum, pnum);
>> -        if (ret2 >= 0) {
>> +                                        *pnum, &backing_pnum);
>> +        if (ret2 >= 0 && backing_pnum >= *pnum) {
> About backing_pnum >= *pnum.
>
> The documentation of bdrv_co_get_block_status says:
>
>   * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
>   * beyond the end of the disk image it will be clamped.
>   */
> static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>                                                      int64_t sector_num,
>                                                      int nb_sectors, int *pnum)
>
> So clearly after the bdrv_co_get_block_status *pnum >= backing_pnum.
>
> This means that backing_pnum > *pnum will never happen.
>
> I think either this test is wrong or the doc is wrong.

Thank you for confusing me, I had to think quite a while about this. *g*

The condition is not for error checking. If it was, it would be the
wrong order (the condition should be true on success, that's why it's
"ret2 >= 0" and not "ret2 < 0", so it should then be "backing_pnum <=
*pnum"). So what this is testing is whether all sectors in the
underlying file in the queried range are read as zero. But if
"backing_pnum < *pnum" that is not the case, some clusters are not zero.
So we may not set the zero flag if backing_pnum < *pnum; or as it reads
in the code, we may only set it if backing_pnum >= *pnum.
This is not
about whether *pnum > backing_pnum, but more about whether backing_pnum
== *pnum (but >= would be fine, too, if bdrv_co_get_block_status()
supported it, so that's why I wrote it that way).

However, I'm starting to wonder whether it would be better, for the
backing_pnum < *pnum case, to simply set *pnum = backing_pnum instead of
merely leaving the zero flag unset. And this in turn would be pretty
much equivalent to just omitting this patch, because:

If we get to the point where we query the underlying file and it
reports that a certain number of sectors are zero, then we want to
set *pnum = backing_pnum (both if backing_pnum < *pnum and if
backing_pnum == *pnum; backing_pnum > *pnum cannot happen, as you
pointed out). On the other hand, if the sectors are not reported to be
zero, but backing_pnum < *pnum, we want to shorten *pnum accordingly as
well, because this may indicate that after another backing_pnum sectors
we arrive at a hole in the file.

There is only one case I can imagine where it makes sense not to let
backing_pnum overwrite *pnum: And that's if bdrv_co_get_block_status()
reported BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID with an offset beyond
the EOF. I think this might actually happen with qcow2, if one cluster
simply lies beyond the EOF (which is perfectly valid). So I conclude
that this patch has its use after all but needs to be modified so that
backing_pnum always overwrites *pnum, except when backing_pnum is
zero (which should only happen at or after the EOF), in which case the
zero flag should be set and *pnum should be left as it was.

And now in all honesty: Thanks for confusing me, I guess I can think
better when I'm confused. :-)

Max

> Best regards
>
> Benoît
>
>
>>               /* Ignore errors.  This is just providing extra information, it
>>                * is useful but not necessary.
>>                */
>> -- 
>> 2.0.4
>>
>>
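
For illustration, the modification proposed in the reply above could be
sketched roughly as follows. This is only a standalone sketch, not the
actual follow-up patch: merge_file_status() and the SKETCH_* flag values
are hypothetical stand-ins (the real BDRV_BLOCK_* definitions live in
QEMU's headers), and carrying over the zero flag from ret2 is assumed to
match what the existing code does on success.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag values for this sketch only (assumption, not QEMU's). */
#define SKETCH_BLOCK_DATA         0x01
#define SKETCH_BLOCK_ZERO         0x02
#define SKETCH_BLOCK_OFFSET_VALID 0x04

/*
 * Hypothetical helper: fold the status of the underlying file
 * (ret2, backing_pnum) into the status of the format layer (ret, *pnum).
 */
static int64_t merge_file_status(int64_t ret, int64_t ret2,
                                 int backing_pnum, int *pnum)
{
    if (ret2 < 0) {
        /* Ignore errors: the extra information is useful, not necessary. */
        return ret;
    }

    /* The callee clamps its result to the number of sectors requested. */
    assert(backing_pnum <= *pnum);

    if (backing_pnum == 0) {
        /* At or after EOF: reads as zero, so set the flag and keep *pnum. */
        return ret | SKETCH_BLOCK_ZERO;
    }

    /* Otherwise the (possibly shorter) range of the underlying file wins. */
    *pnum = backing_pnum;
    return ret | (ret2 & SKETCH_BLOCK_ZERO);
}
```

With *pnum == 8, a result of backing_pnum == 3 shortens *pnum to 3,
whereas backing_pnum == 0 leaves *pnum at 8 and only adds the zero flag.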