qemu-devel.nongnu.org archive mirror
From: Eric Blake <eblake@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	famz@redhat.com, tony@bakeyournoodle.com, qemu-devel@nongnu.org,
	mreitz@redhat.com, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 3/4] raw-posix: Fix try_seek_hole()'s handling of SEEK_DATA failure
Date: Fri, 14 Nov 2014 17:47:33 -0700
Message-ID: <5466A2A5.8070703@redhat.com>
In-Reply-To: <871tp62aff.fsf@blackfin.pond.sub.org>


On 11/14/2014 06:12 AM, Markus Armbruster wrote:
>> 0-length file:
>> lseek(fd, 0, SEEK_HOLE) => -1 ENXIO
>> lseek(fd, 0, SEEK_DATA) => -1 ENXIO
>> conclusion: 0 is at EOF
> 
> Isn't this a special case of the next one?
> 
>> file of any size:
>> lseek(fd, size_or_larger, SEEK_HOLE) => -1 ENXIO
>> lseek(fd, size_or_larger, SEEK_DATA) => -1 ENXIO
>> conclusion: size_or_larger is at or beyond EOF

Yes.
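
For illustration only, the "at or beyond EOF" rule quoted above could be
probed with something like the sketch below.  The helper names
at_or_beyond_eof() and report_eof_probe() are made up for this example
and are not part of the patch; the sketch assumes a Linux/glibc system
where _GNU_SOURCE exposes SEEK_HOLE and SEEK_DATA.

```c
#define _GNU_SOURCE             /* assumption: glibc, exposes SEEK_HOLE/SEEK_DATA */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch.
 * Return 1 if both SEEK_HOLE and SEEK_DATA fail with ENXIO at @offset
 * (so @offset is at or beyond EOF), 0 if at least one seek succeeds,
 * and -1 if a seek fails with any other errno.
 * May change the file offset of @fd.
 */
static int at_or_beyond_eof(int fd, off_t offset)
{
    int enxio_count = 0;
    off_t pos;

    pos = lseek(fd, offset, SEEK_HOLE);
    if (pos < 0) {
        if (errno != ENXIO) {
            return -1;          /* we learned nothing */
        }
        enxio_count++;
    } else {
        assert(pos >= offset);
    }

    pos = lseek(fd, offset, SEEK_DATA);
    if (pos < 0) {
        if (errno != ENXIO) {
            return -1;          /* we learned nothing */
        }
        enxio_count++;
    } else {
        assert(pos >= offset);
    }

    return enxio_count == 2;
}

/* Hypothetical convenience wrapper that prints the verdict. */
static void report_eof_probe(int fd, off_t offset)
{
    printf("offset %lld: at_or_beyond_eof = %d\n",
           (long long)offset, at_or_beyond_eof(fd, offset));
}
```

The 0-length file is then just the case where offset 0 already
satisfies "size_or_larger".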

>>
>> The two calls are both necessary, in order to learn which extent type
>> offset belongs to, and to tell where that extent ends; and the behaviors
>> are distinguishable (if both lseek() succeed, we have both numbers we
>> want; if both fail with ENXIO, we know the offset is at or beyond EOF;
>> and if only SEEK_HOLE fails with ENXIO, we know we have a trailing
>> hole); and we can tell at runtime what to do about a trailing hole (if
>> the return value is offset, we need one more lseek(fd, 0, SEEK_END) to
>> find EOF; if the return value is larger than offset, we have EOF for
>> free).  You can optimize by calling SEEK_HOLE first (if it fails with
>> ENXIO, there is no need to try SEEK_DATA); but SEEK_HOLE in isolation is
>> insufficient to give you all the information you need.
> 
> Not discussed: how to handle failures other than ENXIO.
> 
> The appended code still avoids a second seek in one case.  Useful mostly
> because it saves us from handling a second seek's contradictory
> information.

Slick - I focused on SEEK_HOLE first, but you focused on SEEK_DATA
first.  Your comments make all the difference.
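
For comparison, here is a rough sketch of the SEEK_HOLE-first ordering
I described in the quoted paragraph.  It is not the patch: the names
classify_offset(), print_extent() and the enum are invented for this
example, it assumes Linux/glibc with _GNU_SOURCE, and it reports
contradictory answers (the file changing between the two seeks) as a
plain error rather than trying to interpret them.

```c
#define _GNU_SOURCE             /* assumption: glibc, exposes SEEK_HOLE/SEEK_DATA */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
enum extent_kind {
    EXT_DATA,           /* offset is in data, *end is the next hole */
    EXT_HOLE,           /* offset is in a non-trailing hole, *end is the next data */
    EXT_TRAILING_HOLE,  /* offset is in a trailing hole, *end is EOF */
    EXT_EOF,            /* offset is at or beyond EOF, *end untouched */
    EXT_ERROR           /* a seek failed with an errno other than ENXIO,
                           or the two seeks contradicted each other */
};

/* Classify @offset, calling SEEK_HOLE first.  May change the file offset. */
static enum extent_kind classify_offset(int fd, off_t offset, off_t *end)
{
    off_t hole, data, eof;

    hole = lseek(fd, offset, SEEK_HOLE);
    if (hole < 0) {
        /* ENXIO here means at or beyond EOF; no need to try SEEK_DATA */
        return errno == ENXIO ? EXT_EOF : EXT_ERROR;
    }

    data = lseek(fd, offset, SEEK_DATA);
    if (data < 0) {
        if (errno != ENXIO) {
            return EXT_ERROR;
        }
        /* Only SEEK_DATA failed with ENXIO: trailing hole */
        if (hole > offset) {
            *end = hole;        /* SEEK_HOLE already gave us EOF for free */
            return EXT_TRAILING_HOLE;
        }
        eof = lseek(fd, 0, SEEK_END);   /* one more seek to find EOF */
        if (eof < 0) {
            return EXT_ERROR;
        }
        *end = eof;
        return EXT_TRAILING_HOLE;
    }

    if (data == offset && hole > offset) {
        *end = hole;
        return EXT_DATA;
    }
    if (data > offset) {
        *end = data;
        return EXT_HOLE;
    }
    return EXT_ERROR;           /* data == hole == offset: contradictory */
}

/* Hypothetical convenience wrapper that prints the classification. */
static void print_extent(int fd, off_t offset)
{
    off_t end = -1;
    enum extent_kind kind = classify_offset(fd, offset, &end);

    printf("offset %lld: kind %d, end %lld\n",
           (long long)offset, (int)kind, (long long)end);
}
```

Compared with this, your SEEK_DATA-first version needs fewer branches,
because a trailing hole and "beyond EOF" collapse into the single
-ENXIO return.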

> 
> 
> /*
>  * Find allocation range in @bs around offset @start.
>  * May change underlying file descriptor's file offset.
>  * If @start is not in a hole, store @start in @data, and the
>  * beginning of the next hole in @hole, and return 0.
>  * If @start is in a non-trailing hole, store @start in @hole and the
>  * beginning of the next non-hole in @data, and return 0.
>  * If @start is in a trailing hole or beyond EOF, return -ENXIO.

And caller can blindly and safely treat that as a trailing hole, as needed.

>  * If we can't find out, return a negative errno other than -ENXIO.
>  */
> static int find_allocation(BlockDriverState *bs, off_t start,
>                            off_t *data, off_t *hole)
> {
> #if defined SEEK_HOLE && defined SEEK_DATA

I seriously doubt you'd find a system with one but not both of these
constants defined.  But it doesn't hurt to check both.

>     BDRVRawState *s = bs->opaque;
>     off_t offs;
> 
>     /*
>      * SEEK_DATA cases:
>      * D1. offs == start: start is in data
>      * D2. offs > start: start is in a hole, next data at offs
>      * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
>      *                              or start is beyond EOF
>      *     If the latter happens, the file has been truncated behind
>      *     our back since we opened it.  Best we can do is treat like
>      *     a trailing hole.
>      * D4. offs < 0, errno != ENXIO: we learned nothing
>      */

Correct.

>     offs = lseek(s->fd, start, SEEK_DATA);
>     if (offs < 0) {
>         return -errno;          /* D3 or D4 */
>     }
>     assert(offs >= start);
> 
>     if (offs > start) {
>         /* D2: in hole, next data at offs */
>         *hole = start;
>         *data = offs;
>         return 0;
>     }
> 
>     /* D1: in data, end not yet known */
> 
>     /*
>      * SEEK_HOLE cases:
>      * H1. offs == start: start is in a hole
>      *     If this happens here, a hole has been dug behind our back
>      *     since the previous lseek().
>      * H2. offs > start: either start is in data, next hole at offs,
>      *                   or start is in trailing hole, EOF at offs
>      *     Linux treats trailing holes like any other hole: offs ==
>      *     start.  Solaris seeks to EOF instead: offs > start (blech).

Correct in isolation.  Coupled with the additional knowledge that we are
in state D1 (and already treated D3 as a trailing hole with early exit),...

>      *     If that happens here, a hole has been dug behind our back
>      *     since the previous lseek().

...this is further true for this function.

>      * H3. offs < 0, errno = ENXIO: start is beyond EOF
>      *     If this happens, the file has been truncated behind our
>      *     back since we opened it.  Treat it like a trailing hole.
>      * H4. offs < 0, errno != ENXIO: we learned nothing
>      *     Pretend we know nothing at all, i.e. "forget" about D1.
>      */
>     offs = lseek(s->fd, start, SEEK_HOLE);
>     if (offs < 0) {
>         return -errno;          /* D1 and (H3 or H4) */
>     }
>     assert(offs >= start);
> 
>     if (offs > start) {
>         /*
>          * D1 and H2: either in data, next hole at offs, or it was in
>          * data but is now in a trailing hole.  Treating the latter as
>          * if there were data extending to EOF is safe, so simply do
>          * that.
>          */
>         *data = start;
>         *hole = offs;
>         return 0;
>     }

Reasonable.

> 
>     /* D1 and H1 */
>     return -EBUSY;
> #else
>     return -ENOTSUP;
> #endif
> }

I like it.  Maybe we could do better than -ENOTSUP (by treating the
entire file as data, with the only hole at EOF), but if the caller handles
ENOTSUP differently from ENXIO, you don't necessarily need to do it here.
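
To make the caller side concrete, here is one way a caller might
interpret the results of a function with find_allocation()'s contract.
This is only a sketch under my own assumptions: struct range_status and
interpret_allocation() are hypothetical names, it works in bytes rather
than sectors, and it is not what the eventual patch has to look like.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical result type: what the first part of a range looks like. */
struct range_status {
    bool is_data;       /* true: treat as data, false: treat as hole */
    off_t length;       /* number of bytes the verdict applies to */
};

/*
 * Hypothetical caller-side helper.
 * @ret, @data and @hole are what a find_allocation()-style function
 * returned and stored for @start; @length is how many bytes the
 * caller asked about.
 */
static struct range_status interpret_allocation(int ret, off_t start,
                                                off_t length,
                                                off_t data, off_t hole)
{
    struct range_status st = { .is_data = true, .length = length };
    off_t run;

    if (ret == -ENXIO) {
        /* Trailing hole or beyond EOF: the whole range is a hole */
        st.is_data = false;
    } else if (ret < 0) {
        /* No information (e.g. -ENOTSUP): pretend there are no holes */
    } else if (data == start) {
        /* In data, which ends at the next hole */
        run = hole - start;
        if (run < length) {
            st.length = run;
        }
    } else {
        /* In a non-trailing hole, which ends at the next data */
        assert(hole == start);
        st.is_data = false;
        run = data - start;
        if (run < length) {
            st.length = run;
        }
    }
    return st;
}
```

With a caller shaped like this, -ENOTSUP and every other unexpected
errno already degrade to "all data", so the function itself can keep
returning the error.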

Looking forward to this in an actual v3 patch.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



