Message-ID: <5464D291.4000303@redhat.com>
Date: Thu, 13 Nov 2014 08:47:29 -0700
From: Eric Blake
References: <1415873823-13844-1-git-send-email-armbru@redhat.com> <1415873823-13844-4-git-send-email-armbru@redhat.com> <20141113130327.GD3933@noname.redhat.com> <5464C59A.4000602@redhat.com> <5464CE42.4000809@redhat.com>
In-Reply-To: <5464CE42.4000809@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 3/4] raw-posix: Fix try_seek_hole()'s handling of SEEK_DATA failure
To: Kevin Wolf, Markus Armbruster
Cc: famz@redhat.com, qemu-devel@nongnu.org, tony@bakeyournoodle.com, mreitz@redhat.com, stefanha@redhat.com, pbonzini@redhat.com

On 11/13/2014 08:29 AM, Eric Blake wrote:
>>> lseek() with SEEK_DATA starting in a hole when there is no data until
>>> EOF is actually the part that isn't documented in the man page, but
>>> ENXIO is what I'm seeing here on RHEL 7.
>>
>> Here's the (proposed) POSIX wording:
>>
>> http://austingroupbugs.net/view.php?id=415
>>
>> And ENXIO is indeed the expected error for SEEK_DATA on a trailing hole,
>> so maybe we should special case it.
>>
>
> Uggh.  Historical practice on Solaris (and therefore the POSIX wording)
> says that SEEK_HOLE in a trailing hole is allowed (but not required) to
> seek to EOF instead of reporting the offset requested.  I have no clue
> why this was done, but it is VERY annoying - it means that if you
> provide an offset within a tail hole of a file, you cannot reliably tell
> if the file ends in a hole or with data, without ALSO trying SEEK_DATA.
> For applications that are reading a file sequentially but skipping over
> holes, this behavior is fine (it short-circuits the hole/data search
> points and might shave an iteration off a loop).  But for OUR purposes,
> where we are merely trying to ascertain whether we are in a hole, we
> have an inaccurate response - since SEEK_HOLE does NOT return the offset
> we passed in, we are prone to treat the offset as belonging to data,
> which is a pessimization (you never get wrong results by treating a hole
> as data and reading it, but it is definitely slower).
>
> I think you HAVE to call lseek() twice, both with SEEK_HOLE and with
> SEEK_DATA, if you want to accurately determine whether an offset happens
> to live within a trailing hole.

Here's a table of possible situations, based solely on POSIX wording
(and not on actual tests on Solaris or Linux, although it shouldn't be
too hard to confirm behavior):

0-length file:
lseek(fd, 0, SEEK_HOLE) => -1 ENXIO
lseek(fd, 0, SEEK_DATA) => -1 ENXIO
conclusion: 0 is at EOF

file of any size:
lseek(fd, size_or_larger, SEEK_HOLE) => -1 ENXIO
lseek(fd, size_or_larger, SEEK_DATA) => -1 ENXIO
conclusion: size_or_larger is at or beyond EOF

file where offset is in a hole, but data appears later:
lseek(fd, offset, SEEK_HOLE) => offset
lseek(fd, offset, SEEK_DATA) => end_of_hole
conclusion: offset through end_of_hole is in a hole

file where offset is data, whether or not a hole appears later:
lseek(fd, offset, SEEK_HOLE) => end_of_data
lseek(fd, offset, SEEK_DATA) => offset
conclusion: offset through end_of_data is in data

file where offset is in a tail hole, option 1:
lseek(fd, offset, SEEK_HOLE) => offset
lseek(fd, offset, SEEK_DATA) => -1 ENXIO
conclusion: offset through EOF is in hole, but another seek needed to
learn EOF

file where offset is in a tail hole, option 2:
lseek(fd, offset, SEEK_HOLE) => EOF
lseek(fd, offset, SEEK_DATA) => -1 ENXIO
conclusion: offset through EOF is in hole, no additional seek needed

The two calls are both necessary, in order to learn which extent type
offset belongs to, and to tell where that extent ends; and the behaviors
are distinguishable (if both lseek() succeed, we have both numbers we
want; if both fail with ENXIO, we know the offset is at or beyond EOF;
and if only SEEK_DATA fails with ENXIO, we know we have a trailing hole);
and we can tell at runtime what to do about a trailing hole (if SEEK_HOLE
returned offset, we need one more lseek(fd, 0, SEEK_END) to find EOF; if
it returned a value larger than offset, we have EOF for free).
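
To make the table concrete, here is a rough sketch of that decision
procedure in C.  It follows the POSIX wording only, not what
raw-posix.c actually does; the names extent_kind, classify_offset() and
report_offset() are made up for illustration, and it assumes nobody
modifies the file between the two lseek() calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* glibc only exposes SEEK_DATA/SEEK_HOLE with this */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#if defined(__linux__) && !defined(SEEK_DATA)
#define SEEK_DATA 3     /* Linux values, in case _GNU_SOURCE came too late */
#define SEEK_HOLE 4
#endif

/* Hypothetical names for this sketch, not taken from QEMU. */
enum extent_kind { EXTENT_DATA, EXTENT_HOLE, EXTENT_EOF, EXTENT_ERROR };

/*
 * Classify the extent containing offset and store where it ends in *end.
 * *end is written only for EXTENT_DATA and EXTENT_HOLE.
 * Side effect: the file position of fd is moved.
 */
static enum extent_kind classify_offset(int fd, off_t offset, off_t *end)
{
    off_t hole, data, eof;

    assert(end != NULL);

    /* In every row of the table where SEEK_HOLE fails, offset is at or
     * beyond EOF, so ENXIO here settles the question. */
    hole = lseek(fd, offset, SEEK_HOLE);
    if (hole < 0) {
        return errno == ENXIO ? EXTENT_EOF : EXTENT_ERROR;
    }

    data = lseek(fd, offset, SEEK_DATA);
    if (data < 0) {
        if (errno != ENXIO) {
            return EXTENT_ERROR;
        }
        /* Only SEEK_DATA failed: offset is in a trailing hole. */
        if (hole > offset) {
            *end = hole;                /* option 2: EOF for free */
            return EXTENT_HOLE;
        }
        eof = lseek(fd, 0, SEEK_END);   /* option 1: one more seek */
        if (eof < 0) {
            return EXTENT_ERROR;
        }
        *end = eof;
        return EXTENT_HOLE;
    }

    if (data > offset) {
        *end = data;                    /* hole, data appears later */
        return EXTENT_HOLE;
    }
    *end = hole;                        /* data, up to the next hole or EOF */
    return EXTENT_DATA;
}

/* Print a one-line description of the extent containing offset. */
static void report_offset(int fd, off_t offset)
{
    static const char *const names[] = { "data", "hole", "eof", "error" };
    off_t end = -1;
    enum extent_kind kind = classify_offset(fd, offset, &end);

    if (kind == EXTENT_DATA || kind == EXTENT_HOLE) {
        printf("%lld: %s until %lld\n",
               (long long)offset, names[kind], (long long)end);
    } else {
        printf("%lld: %s\n", (long long)offset, names[kind]);
    }
}
```

On a filesystem without real hole support the whole file is normally
reported as data followed by an implicit hole at EOF, so this degrades
to "everything is data", which is the safe (if slower) answer.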

You can optimize by calling SEEK_HOLE first (if it fails with ENXIO,
there is no need to try SEEK_DATA); but SEEK_HOLE in isolation is
insufficient to give you all the information you need.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org