From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42534) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ad0c7-00079F-ID for qemu-devel@nongnu.org; Mon, 07 Mar 2016 14:14:45 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ad0c5-0001LD-Uv for qemu-devel@nongnu.org; Mon, 07 Mar 2016 14:14:43 -0500
References: <1457373855-8072-1-git-send-email-ndevos@redhat.com>
From: Eric Blake
Message-ID: <56DDD31A.8060707@redhat.com>
Date: Mon, 7 Mar 2016 12:14:34 -0700
MIME-Version: 1.0
In-Reply-To: <1457373855-8072-1-git-send-email-ndevos@redhat.com>
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="bMJTFHiRpOVInbR3IA7JIpo8n9ocH616v"
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] block/gluster: add support for SEEK_DATA/SEEK_HOLE
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Niels de Vos , qemu-block@nongnu.org
Cc: "qemu-devel@nongnu.org" , Prasanna Kumar Kalever

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--bMJTFHiRpOVInbR3IA7JIpo8n9ocH616v
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

[adding qemu-devel; ALL patches must cc qemu-devel even when sent to
another list]

On 03/07/2016 11:04 AM, Niels de Vos wrote:
> GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
> it possible to detect sparse areas in files.
>
> Signed-off-by: Niels de Vos
>
> --
> Tested by compiling and running "qemu-img map gluster://..." with a
> build of the current master branch of glusterfs. Using a Fedora
> cloud image (in raw format) shows many SEEK procedure calls going back
> and forth over the network. The output of "qemu map" matches the output
> when run against the image on the local filesystem.
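
For anyone following the thread who has not used this interface, the sparse-area detection described above can be sketched against a local file with plain lseek(). This is only an illustration under stated assumptions, not the code from the patch: it uses lseek() rather than glfs_lseek(), the helper names make_demo_file() and probe_extent() are hypothetical, and the /tmp path and the 4096-byte data size are arbitrary choices.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an unlinked temporary file that holds
 * 4096 bytes of data followed by a hole extending to @total_len.
 * Returns an open file descriptor, or -1 on error.
 */
int make_demo_file(off_t total_len)
{
    char path[] = "/tmp/seek-demo-XXXXXX";   /* assumed scratch location */
    char buf[4096];
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    unlink(path);                             /* file vanishes on close */
    memset(buf, 'x', sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf) ||
        ftruncate(fd, total_len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: classify the byte at @start.
 * Returns 1 if @start is in data (*end = start of the next hole),
 *         0 if @start is in a hole (*end = start of the next data),
 *         -ENXIO if @start is in a trailing hole or beyond EOF,
 *         -EBUSY if the layout changed between the two seeks,
 *         another negative errno if nothing could be learned.
 */
int probe_extent(int fd, off_t start, off_t *end)
{
    off_t offs = lseek(fd, start, SEEK_DATA);

    if (offs < 0) {
        return -errno;
    }
    assert(offs >= start);
    if (offs > start) {
        *end = offs;                          /* in a hole, data at offs */
        return 0;
    }

    offs = lseek(fd, start, SEEK_HOLE);
    if (offs < 0) {
        return -errno;
    }
    if (offs == start) {
        return -EBUSY;                        /* hole dug behind our back */
    }
    *end = offs;                              /* in data, hole at offs */
    return 1;
}
```

With a file from make_demo_file(), probe_extent(fd, 0, &end) should report data at offset 0, and an offset past EOF should yield -ENXIO. On filesystems without real hole support the kernel treats the whole file as data with a single hole at EOF, so the exact value of end depends on where the file lives.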
> ---
>  block/gluster.c | 159 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  configure       |  25 +++++++++
>  2 files changed, 184 insertions(+)
>
> diff --git a/block/gluster.c b/block/gluster.c
> index 65077a0..1430010 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -677,6 +677,153 @@ static int qemu_gluster_has_zero_init(BlockDriverState *bs)
>      return 0;
>  }
>
> +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> +/*
> + * Find allocation range in @bs around offset @start.
> + * May change underlying file descriptor's file offset.
> + * If @start is not in a hole, store @start in @data, and the
> + * beginning of the next hole in @hole, and return 0.
> + * If @start is in a non-trailing hole, store @start in @hole and the
> + * beginning of the next non-hole in @data, and return 0.
> + * If @start is in a trailing hole or beyond EOF, return -ENXIO.
> + * If we can't find out, return a negative errno other than -ENXIO.
> + *
> + * (Shamefully copied from raw-posix.c, only miniscule adaptions.)
> + */
> +static int find_allocation(BlockDriverState *bs, off_t start,
> +                           off_t *data, off_t *hole)
> +{
> +    BDRVGlusterState *s = bs->opaque;
> +    off_t offs;
> +
> +    /*
> +     * SEEK_DATA cases:
> +     * D1. offs == start: start is in data
> +     * D2. offs > start: start is in a hole, next data at offs
> +     * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
> +     *     or start is beyond EOF
> +     *     If the latter happens, the file has been truncated behind
> +     *     our back since we opened it. All bets are off then.
> +     *     Treating like a trailing hole is simplest.
> +     * D4. offs < 0, errno != ENXIO: we learned nothing
> +     */
> +    offs = glfs_lseek(s->fd, start, SEEK_DATA);
> +    if (offs < 0) {
> +        return -errno;          /* D3 or D4 */
> +    }
> +    assert(offs >= start);
> +
> +    if (offs > start) {
> +        /* D2: in hole, next data at offs */
> +        *hole = start;
> +        *data = offs;
> +        return 0;
> +    }
> +
> +    /* D1: in data, end not yet known */
> +
> +    /*
> +     * SEEK_HOLE cases:
> +     * H1. offs == start: start is in a hole
> +     *     If this happens here, a hole has been dug behind our back
> +     *     since the previous lseek().
> +     * H2. offs > start: either start is in data, next hole at offs,
> +     *     or start is in trailing hole, EOF at offs
> +     *     Linux treats trailing holes like any other hole: offs ==
> +     *     start. Solaris seeks to EOF instead: offs > start (blech).
> +     *     If that happens here, a hole has been dug behind our back
> +     *     since the previous lseek().
> +     * H3. offs < 0, errno = ENXIO: start is beyond EOF
> +     *     If this happens, the file has been truncated behind our
> +     *     back since we opened it. Treat it like a trailing hole.
> +     * H4. offs < 0, errno != ENXIO: we learned nothing
> +     *     Pretend we know nothing at all, i.e. "forget" about D1.
> +     */
> +    offs = glfs_lseek(s->fd, start, SEEK_HOLE);
> +    if (offs < 0) {
> +        return -errno;          /* D1 and (H3 or H4) */
> +    }
> +    assert(offs >= start);
> +
> +    if (offs > start) {
> +        /*
> +         * D1 and H2: either in data, next hole at offs, or it was in
> +         * data but is now in a trailing hole. In the latter case,
> +         * all bets are off. Treating it as if it there was data all
> +         * the way to EOF is safe, so simply do that.
> +         */
> +        *data = start;
> +        *hole = offs;
> +        return 0;
> +    }
> +
> +    /* D1 and H1 */
> +    return -EBUSY;
> +}
> +
> +/*
> + * Returns the allocation status of the specified sectors.
> + *
> + * If 'sector_num' is beyond the end of the disk image the return value is 0
> + * and 'pnum' is set to 0.
> + *
> + * 'pnum' is set to the number of sectors (including and immediately following
> + * the specified sector) that are known to be in the same
> + * allocated/unallocated state.
> + *
> + * 'nb_sectors' is the max value 'pnum' should be set to. If nb_sectors goes
> + * beyond the end of the disk image it will be clamped.
> + *
> + * (Based on raw_co_get_block_status() from raw-posix.c.)
> + */
> +static int64_t coroutine_fn qemu_gluster_co_get_block_status(
> +        BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum)
> +{
> +    BDRVGlusterState *s = bs->opaque;
> +    off_t start, data = 0, hole = 0;
> +    int64_t total_size;
> +    int ret = -EINVAL;
> +
> +    if (!s->fd) {
> +        return ret;
> +    }
> +
> +    start = sector_num * BDRV_SECTOR_SIZE;
> +    total_size = bdrv_getlength(bs);
> +    if (total_size < 0) {
> +        return total_size;
> +    } else if (start >= total_size) {
> +        *pnum = 0;
> +        return 0;
> +    } else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
> +        nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
> +    }
> +
> +    ret = find_allocation(bs, start, &data, &hole);
> +    if (ret == -ENXIO) {
> +        /* Trailing hole */
> +        *pnum = nb_sectors;
> +        ret = BDRV_BLOCK_ZERO;
> +    } else if (ret < 0) {
> +        /* No info available, so pretend there are no holes */
> +        *pnum = nb_sectors;
> +        ret = BDRV_BLOCK_DATA;
> +    } else if (data == start) {
> +        /* On a data extent, compute sectors to the end of the extent,
> +         * possibly including a partial sector at EOF. */
> +        *pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, BDRV_SECTOR_SIZE));
> +        ret = BDRV_BLOCK_DATA;
> +    } else {
> +        /* On a hole, compute sectors to the beginning of the next extent. */
> +        assert(hole == start);
> +        *pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
> +        ret = BDRV_BLOCK_ZERO;
> +    }
> +    return ret | BDRV_BLOCK_OFFSET_VALID | start;
> +}
> +#endif /* CONFIG_GLUSTERFS_SEEK_DATA */
> +
> +
>  static QemuOptsList qemu_gluster_create_opts = {
>      .name = "qemu-gluster-create-opts",
>      .head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
> @@ -719,6 +866,9 @@ static BlockDriver bdrv_gluster = {
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
>  #endif
> +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> +    .bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
> +#endif
>      .create_opts = &qemu_gluster_create_opts,
>  };
>
> @@ -746,6 +896,9 @@ static BlockDriver bdrv_gluster_tcp = {
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
>  #endif
> +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> +    .bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
> +#endif
>      .create_opts = &qemu_gluster_create_opts,
>  };
>
> @@ -773,6 +926,9 @@ static BlockDriver bdrv_gluster_unix = {
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
>  #endif
> +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> +    .bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
> +#endif
>      .create_opts = &qemu_gluster_create_opts,
>  };
>
> @@ -800,6 +956,9 @@ static BlockDriver bdrv_gluster_rdma = {
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>      .bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
>  #endif
> +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> +    .bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
> +#endif
>      .create_opts = &qemu_gluster_create_opts,
>  };
>
> diff --git a/configure b/configure
> index 0c0472a..ca3821c 100755
> --- a/configure
> +++ b/configure
> @@ -3351,6 +3351,9 @@ if test "$glusterfs" != "no" ; then
>      if $pkg_config --atleast-version=6 glusterfs-api; then
>        glusterfs_zerofill="yes"
>      fi
> +    if $pkg_config --atleast-version=7.3.8 glusterfs-api; then
> +      glusterfs_seek_data="yes"
> +    fi
>    else
>      if test "$glusterfs" = "yes" ; then
>        feature_not_found "GlusterFS backend support" \
> @@ -3660,6 +3663,24 @@ if compile_prog "" "" ; then
>    fiemap=yes
>  fi
>
> +# check for SEEK_DATA and SEEK_HOLE
> +seek_data=no
> +cat > $TMPC << EOF
> +#define _GNU_SOURCE
> +#include
> +#include
> +
> +int main(void)
> +{
> +  lseek(0, 0, SEEK_DATA);
> +  lseek(0, 0, SEEK_HOLE);
> +  return 0;
> +}
> +EOF
> +if compile_prog "" "" ; then
> +  seek_data=yes
> +fi
> +
>  # check for dup3
>  dup3=no
>  cat > $TMPC << EOF
> @@ -5278,6 +5299,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
>    echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
>  fi
>
> +if test "$glusterfs_seek_data" = "yes" && test "$seek_data" = "yes" ; then
> +  echo "CONFIG_GLUSTERFS_SEEK_DATA=y" >> $config_host_mak
> +fi
> +
>  if test "$archipelago" = "yes" ; then
>    echo "CONFIG_ARCHIPELAGO=m" >> $config_host_mak
>    echo "ARCHIPELAGO_LIBS=$archipelago_libs" >> $config_host_mak
>

-- 
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--bMJTFHiRpOVInbR3IA7JIpo8n9ocH616v
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBCAAGBQJW3dMaAAoJEKeha0olJ0NqbrQH/Rc1BYzsII+VmTxFRNUgu8Dd
wSnxskBtdHNz2TqctH3e/FvD/7ODxs616om88KITr3lmyKcMRQuEJ5XIZuAwf8GM
H0fYBfMnIZ9pEUisjdWEeKISoBVuh3bYssEfQW7b9JCp+DdrDqjxt707HHpe0qFw
Jm1BXzYoLrQU4Tb9cpPMrjMezkEQcu93enRppZu37VP2ROlOYeRz7iZOjC8kQhch
S6fhY7oIShcbhwHWqao2d8NYX08aNMivorLfI6U2HDGba2ZeEPCcIuqu1Vke3Z1A
P5MHFp5ifQ5i1fP2jp7IsDY0D05HUcUINpJN221l887kvbBKQ+lc8a2qDigSVRw=
=2Rai
-----END PGP SIGNATURE-----

--bMJTFHiRpOVInbR3IA7JIpo8n9ocH616v--