linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Sunil Mushran <sunil.mushran@oracle.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>,
	Christoph Hellwig <hch@infradead.org>,
	Josef Bacik <josef@toxicpanda.com>,
	linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/2] fs: add SEEK_HOLE and SEEK_DATA flags
Date: Fri, 22 Apr 2011 10:40:15 -0600	[thread overview]
Message-ID: <4DB1AF6F.4040706@redhat.com> (raw)
In-Reply-To: <4DB1AC9D.3010706@oracle.com>

[-- Attachment #1: Type: text/plain, Size: 3050 bytes --]

On 04/22/2011 10:28 AM, Sunil Mushran wrote:
> On 04/22/2011 04:50 AM, Eric Blake wrote:
>> That blog also mentioned the useful idea of adding FIND_HOLE and
>> FIND_DATA, not implemented in Solaris, but which could easily be
>> provided as additional lseek constants in Linux to locate the start of
>> the next chunk without repositioning and which could ease application
>> programmer's life a bit.  After all, cp wants to know where data ends
>> without repositioning (FIND_HOLE), read() that much data which
>> repositions in the process, then skip to the next chunk of data
>> (SEEK_DATA) - two lseek() calls per iteration if we have 4 constants,
>> but 3 per iteration if we only have SEEK_HOLE and have to manually
>> rewind.
> 
> while(1) {
>     read(block);
>     if (block_all_zeroes)
>         lseek(SEEK_DATA);
> }
> 
> What's wrong with the above? If this is the case, even SEEK_HOLE
> is not needed but should be added as it is already in Solaris.

Because you don't know if the block is the same size as the minimum
hole, and because some systems require rather large holes (my Solaris
testing on a zfs system didn't have holes until 128k), that's a rather
large amount of reading just to prove that the block has all zeros to
know that it is even worth trying the lseek(SEEK_DATA).  My gut feel is
that doing the lseek(SEEK_HOLE) up front coupled with seeking back to
the same position is more efficient than manually checking for a run of
zeros (less cache pollution, works with 4k read buffers without having
to know filesystem hole size).

> 
> My problem with FIND_* is that we are messing with the well understood
> semantics of lseek().

You'll notice I didn't propose any FIND_* constants for POSIX.

> And if generic_file_llseek_unlocked() treats SEEK_DATA as SEEK_CUR and

You meant SEEK_SET not SEEK_CUR, but...

> SEEK_HOLE as SEEK_END (both with zero offset) then we don't even
> have to bother with the finding the "correct" error code.

...that's still not compatible with Solaris.  On a file with size of 0
bytes, lseek(fd, 1, SEEK_SET) and lseek(fd, 0, SEEK_END) will both
succeed, but lseek(fd, 1, SEEK_DATA) and lseek(fd, 0, SEEK_HOLE) must
fail with ENXIO (the offset was at or beyond the size of the file).

For a file with no holes, Solaris semantics behave as if:

off_t lseek(int fildes, off_t offset, int whence)
{
    off_t cur, end;
    switch (whence)
    {
    case SEEK_HOLE:
    case SEEK_DATA:
        cur = lseek(fildes, 0, SEEK_CUR);
        if (cur < 0)
            return cur;
        end = lseek(fildes, 0, SEEK_END);
        if (end < 0)
            return end;
        if (offset < end)
            return whence == SEEK_HOLE ? end : lseek(fildes, offset,
SEEK_SET);
        lseek(fildes, cur, SEEK_SET);
        errno = ENXIO;
        return -1;
    default:
        ... /* Existing implementation */
    }
}

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

  reply	other threads:[~2011-04-22 16:40 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-21 19:42 [PATCH 1/2] fs: add SEEK_HOLE and SEEK_DATA flags Josef Bacik
2011-04-21 19:42 ` [PATCH 2/2] Btrfs: implement our own ->llseek Josef Bacik
2011-04-21 20:45 ` [PATCH 1/2] fs: add SEEK_HOLE and SEEK_DATA flags Theodore Tso
2011-04-21 21:29   ` Sunil Mushran
2011-04-22  3:23   ` Matthew Wilcox
2011-04-22  4:47 ` Christoph Hellwig
     [not found] ` <loom.20110422T001650-760@post.gmane.org>
     [not found]   ` <BANLkTiknb+hzFAjpwESwMcqMVtkFc0HFQQ@mail.gmail.com>
2011-04-22  4:50     ` Christoph Hellwig
2011-04-22 11:28       ` Markus Trippelsdorf
2011-04-22 11:50         ` Eric Blake
2011-04-22 16:28           ` Sunil Mushran
2011-04-22 16:40             ` Eric Blake [this message]
2011-04-22 16:57               ` Sunil Mushran
2011-04-22 17:03                 ` Eric Blake
2011-04-22 17:08                   ` Sunil Mushran
2011-04-22 18:06                     ` Andreas Dilger
2011-04-22 23:33                       ` Sunil Mushran
2011-04-24 17:49             ` Jamie Lokier
2011-04-25 12:37               ` Eric Blake
2011-04-25 14:15                 ` Jamie Lokier
     [not found]         ` <20110422112852.GB1627-tLCgZGx+iJ+kxVt8IV0GqQ@public.gmane.org>
2011-04-22 13:06           ` Eric Blake
2011-04-25 15:02             ` Nick Bowler
     [not found]               ` <20110425150227.GA10653-7BP4RkwGw0uXmMXjJBpWqg@public.gmane.org>
2011-04-25 15:48                 ` Eric Blake
2011-04-22 20:10 ` Jonathan Nieder
2011-04-22 20:49   ` Sunil Mushran
2011-04-25  3:11 ` Valdis.Kletnieks

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DB1AF6F.4040706@redhat.com \
    --to=eblake@redhat.com \
    --cc=hch@infradead.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=markus@trippelsdorf.de \
    --cc=sunil.mushran@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).