From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60123) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XnxTn-0007VI-AD for qemu-devel@nongnu.org; Mon, 10 Nov 2014 17:30:40 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XnxTi-0001MV-9o for qemu-devel@nongnu.org; Mon, 10 Nov 2014 17:30:35 -0500 Received: from mx1.redhat.com ([209.132.183.28]:35298) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XnxTi-0001MO-0x for qemu-devel@nongnu.org; Mon, 10 Nov 2014 17:30:30 -0500 Message-ID: <54613C7E.4020707@redhat.com> Date: Mon, 10 Nov 2014 15:30:22 -0700 From: Eric Blake MIME-Version: 1.0 References: <1415627159-15941-1-git-send-email-mreitz@redhat.com> <1415627159-15941-7-git-send-email-mreitz@redhat.com> In-Reply-To: <1415627159-15941-7-git-send-email-mreitz@redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="inuP4nMm0lsrPI8ntGMARrkXpwPDcfbBB" Subject: Re: [Qemu-devel] [PATCH 06/21] qcow2: Helper function for refcount modification List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Max Reitz , qemu-devel@nongnu.org Cc: Kevin Wolf , Peter Lieven , Stefan Hajnoczi This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --inuP4nMm0lsrPI8ntGMARrkXpwPDcfbBB Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 11/10/2014 06:45 AM, Max Reitz wrote: > Since refcounts do not always have to be a uint16_t, all refcount block= s > and arrays in memory should not have a specific type (thus they become > pointers to void) and for accessing them, two helper functions are used= > (a getter and a setter). Those functions are called indirectly through > function pointers in the BDRVQcowState so they may later be exchanged > for different refcount orders. >=20 > At the same time, replace all sizeof(**refcount_table) etc. in the qcow= 2 > check code by s->refcount_bits / 8. Note that this might lead to wrong > values due to truncating division, but currently s->refcount_bits is > always 16, and before the upcoming patch which removes this limitation > another patch will make the division round up correctly. Thanks for pointing out that this transition is still in progress, and needs more patches. I agree that for this patch, the division is safe. >=20 > Signed-off-by: Max Reitz > --- > block/qcow2-refcount.c | 152 +++++++++++++++++++++++++++++------------= -------- > block/qcow2.h | 8 +++ > 2 files changed, 98 insertions(+), 62 deletions(-) >=20 > @@ -116,20 +137,24 @@ int64_t qcow2_get_refcount(BlockDriverState *bs, = int64_t cluster_index) > } > =20 > ret =3D qcow2_cache_get(bs, s->refcount_block_cache, refcount_bloc= k_offset, > - (void**) &refcount_block); > + &refcount_block); > if (ret < 0) { > return ret; > } > =20 > block_index =3D cluster_index & (s->refcount_block_size - 1); > - refcount =3D be16_to_cpu(refcount_block[block_index]); > + refcount =3D s->get_refcount(refcount_block, block_index); > =20 > - ret =3D qcow2_cache_put(bs, s->refcount_block_cache, > - (void**) &refcount_block); > + ret =3D qcow2_cache_put(bs, s->refcount_block_cache, &refcount_blo= ck); > if (ret < 0) { > return ret; > } > =20 > + if (refcount < 0) { > + /* overflow */ > + return -ERANGE; > + } Should you be checking for overflow prior to calling qcow2_cache_put? > @@ -362,7 +387,7 @@ static int alloc_refcount_block(BlockDriverState *b= s, > s->cluster_size; > uint64_t table_offset =3D meta_offset + blocks_clusters * s->clust= er_size; > uint64_t *new_table =3D g_try_new0(uint64_t, table_size); > - uint16_t *new_blocks =3D g_try_malloc0(blocks_clusters * s->cluste= r_size); > + void *new_blocks =3D g_try_malloc0(blocks_clusters * s->cluster_si= ze); Can this multiplication ever overflow? Would we be better off with a g_try_new0 approach? > @@ -1118,12 +1142,13 @@ fail: > */ > static int inc_refcounts(BlockDriverState *bs, > BdrvCheckResult *res, > - uint16_t **refcount_table, > + void **refcount_table, > int64_t *refcount_table_size, > int64_t offset, int64_t size) > { > BDRVQcowState *s =3D bs->opaque; > - uint64_t start, last, cluster_offset, k; > + uint64_t start, last, cluster_offset, k, refcount; Why uint64_t, when you limit to int64_t in other patches? > + int64_t i; > =20 > if (size <=3D 0) { > return 0; > @@ -1136,12 +1161,12 @@ static int inc_refcounts(BlockDriverState *bs, > k =3D cluster_offset >> s->cluster_bits; > if (k >=3D *refcount_table_size) { > int64_t old_refcount_table_size =3D *refcount_table_size; > - uint16_t *new_refcount_table; > + void *new_refcount_table; > =20 > *refcount_table_size =3D k + 1; > new_refcount_table =3D g_try_realloc(*refcount_table, > *refcount_table_size * > - sizeof(**refcount_table= )); > + s->refcount_bits / 8); This multiplies before dividing. Can it ever overflow, where writing *refcount_table_size * (s->refcount_bits / 8) would be safer? Also, is it better to use a malloc variant that checks for overflow (I think it is g_try_renew?) instead of open-coding the multiply? > if (!new_refcount_table) { > *refcount_table_size =3D old_refcount_table_size; > res->check_errors++; > @@ -1149,16 +1174,19 @@ static int inc_refcounts(BlockDriverState *bs, > } > *refcount_table =3D new_refcount_table; > =20 > - memset(*refcount_table + old_refcount_table_size, 0, > - (*refcount_table_size - old_refcount_table_size) * > - sizeof(**refcount_table)); > + for (i =3D old_refcount_table_size; i < *refcount_table_si= ze; i++) { > + s->set_refcount(*refcount_table, i, 0); > + } This feels slower than memset. Any chance we can add an optimization that brings back the speed of memset (may require an additional callback in addition to the getter and setter)? > @@ -1178,7 +1206,7 @@ enum { > * error occurred. > */ > static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *r= es, > - uint16_t **refcount_table, int64_t *refcount_table_size, int64_t l= 2_offset, > + void **refcount_table, int64_t *refcount_table_size, int64_t l2_of= fset, > int flags) I noticed you cleaned up indentation in a lot of the patch, but not here. Any reason? > @@ -1541,7 +1569,7 @@ static int check_refblocks(BlockDriverState *bs, = BdrvCheckResult *res, > =20 > new_refcount_table =3D g_try_realloc(*refcount_table, > *nb_clusters * > - sizeof(**refcount_t= able)); > + s->refcount_bits / = 8); Another possible overflow or g_try_renew site? > if (!new_refcount_table) { > *nb_clusters =3D old_nb_clusters; > res->check_errors++; > @@ -1549,9 +1577,9 @@ static int check_refblocks(BlockDriverState *bs, = BdrvCheckResult *res, > } > *refcount_table =3D new_refcount_table; > =20 > - memset(*refcount_table + old_nb_clusters, 0, > - (*nb_clusters - old_nb_clusters) * > - sizeof(**refcount_table)); > + for (j =3D old_nb_clusters; j < *nb_clusters; j++) { > + s->set_refcount(*refcount_table, j, 0); > + } Another memset pessimation. Maybe even having a callback to expand the table, and factor out more of the common code of reallocating the table and clearing all new entries. > @@ -1611,7 +1640,7 @@ static int calculate_refcounts(BlockDriverState *= bs, BdrvCheckResult *res, > int ret; > =20 > if (!*refcount_table) { > - *refcount_table =3D g_try_new0(uint16_t, *nb_clusters); > + *refcount_table =3D g_try_malloc0(*nb_clusters * s->refcount_b= its / 8); Feels like a step backwards in overflow detection? > @@ -1787,22 +1816,22 @@ static int64_t alloc_clusters_imrt(BlockDriverS= tate *bs, > *imrt_nb_clusters =3D cluster + cluster_count - contiguous_fre= e_clusters; > new_refcount_table =3D g_try_realloc(*refcount_table, > *imrt_nb_clusters * > - sizeof(**refcount_table)); > + s->refcount_bits / 8); Another possible overflow > if (!new_refcount_table) { > *imrt_nb_clusters =3D old_imrt_nb_clusters; > return -ENOMEM; > } > *refcount_table =3D new_refcount_table; > =20 > - memset(*refcount_table + old_imrt_nb_clusters, 0, > - (*imrt_nb_clusters - old_imrt_nb_clusters) * > - sizeof(**refcount_table)); > + for (i =3D old_imrt_nb_clusters; i < *imrt_nb_clusters; i++) {= > + s->set_refcount(*refcount_table, i, 0); > + } > } and another resize where we pessimize memset > @@ -1911,12 +1940,11 @@ write_refblocks: > } > =20 > on_disk_refblock =3D qemu_blockalign0(bs->file, s->cluster_siz= e); > - for (i =3D 0; i < s->refcount_block_size && > - refblock_start + i < *nb_clusters; i++) > - { > - on_disk_refblock[i] =3D > - cpu_to_be16((*refcount_table)[refblock_start + i]); > - } > + > + memcpy(on_disk_refblock, (void *)((uintptr_t)*refcount_table += > + (refblock_index << s->refcount_block_= bits)), > + MIN(s->refcount_block_size, *nb_clusters - refblock_sta= rt) > + * s->refcount_bits / 8); > =20 This one's different in that you move TO a memcpy instead of open-coded loop. But I still worry if multiply before /8 could be a problem. > @@ -2064,7 +2092,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, B= drvCheckResult *res, > /* Because the old reftable has been exchanged for a new one t= he > * references have to be recalculated */ > rebuild =3D false; > - memset(refcount_table, 0, nb_clusters * sizeof(uint16_t)); > + memset(refcount_table, 0, nb_clusters * s->refcount_bits / 8);= Another /8 possible overflow. > ret =3D calculate_refcounts(bs, res, 0, &rebuild, &refcount_ta= ble, > &nb_clusters); > if (ret < 0) { > diff --git a/block/qcow2.h b/block/qcow2.h > index 0f8eb15..1c63221 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -213,6 +213,11 @@ typedef struct Qcow2DiscardRegion { > QTAILQ_ENTRY(Qcow2DiscardRegion) next; > } Qcow2DiscardRegion; > =20 > +typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array, > + uint64_t index); > +typedef void Qcow2SetRefcountFunc(void *refcount_array, > + uint64_t index, uint64_t value); Do you want int64_t for any of the types here, to make it obvious that you can't exceed 2^63? Looks like you are on track to a sane conversion, but I'm worried enough about the math that it probably needs a respin (either comments stating why we know we don't overflow, or else safer constructs). --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --inuP4nMm0lsrPI8ntGMARrkXpwPDcfbBB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAEBCAAGBQJUYTx+AAoJEKeha0olJ0NqkxAIAJyytB2eb/nuTT/w8MTV2Xq9 MQHQvi1hMmmPOHN7Mwiv59kX8rJnbJgXIQK7ITE8k08knrRGQzYizAC28QX7Dx3p zQf4TMQfQD2J9V13VG0C+eFAgVO5eDKBTlGfGoyL+GHRHyoK6g/ID15eUFND5Oap aAeyQ9UdxPoXBopHzaLXmY5xsPwkoM7d324ojMi97+GKQZmxuITHvcyzXxUVs8yW UUCTlZ87NWGR77Q73f5GcB135WYOcyfqYE7c8MXlUH0zVu9mCfknV1VX508TJAvX AZVBja5mD4W4QTINmIiqaDfWd8gvcOSkVRxcTEp1NJECTdcayCFdcN9POPU2nZw= =NKmZ -----END PGP SIGNATURE----- --inuP4nMm0lsrPI8ntGMARrkXpwPDcfbBB--