All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Liu <jeff.liu@oracle.com>
To: Josef Bacik <jbacik@fb.com>, Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, coreutils@gnu.org
Subject: Re: Usefulness of SEEK_HOLE / SEEK_DATA in generic_file_llseek()
Date: Wed, 25 Dec 2013 22:34:23 +0800	[thread overview]
Message-ID: <52BAECEF.1050501@oracle.com> (raw)
In-Reply-To: <52B8D68A.9050902@fb.com>

Hi,

On 12/24 2013 08:34 AM, Josef Bacik wrote:
> On 12/23/2013 06:12 PM, Jan Kara wrote:
>>    Hello,
>>
>>    so I've now hit a xfstests failure for UDF which is caused by the
>> implementation of SEEK_HOLE / SEEK_DATA in generic_file_llseek(). UDF
>> uses
>> that function as its .llseek method but it supports holes as any other
>> unix
>> filesystem (e.g. ext2). The test in xfstests assumes that when it
>> creates a
>> file by pwrite(fd, buf, bufsz, off), then SEEK_DATA on offset 0 should
>> return 'off' (off is reasonably rounded) but that's not true for the
>> implementation in generic_file_llseek().
>>
>> Now I'm not so much interested in that test itself - that can be
>> tweaked to
>> detect that case. But I rather wanted to ask - how useful is it to
>> implement SEEK_HOLE / SEEK_DATA the way it is in generic_file_llseek()?
>> Because it seems to me that any serious user will have to detect whether
>> SEEK_HOLE / SEEK_DATA works reasonably and if not, fall back to some
>> heuristic anyway. So why bother inventing bogus values in
>> generic_file_llseek and thus making detection of working implementation
>> harder?
> I'm writing this from my in-laws so I'm going to make some assumptions
> about how the code works based on my memory, so sorry in advanced if
> this is completely wrong ;).
> 
> IIRC with the generic implementation we treat everything <= i_size as
> data and i_size as the first hole.  The way the spec works is that if we
> are currently at data and do seek_data then we just return our current
> offset, same for a hole.  In order to not be a jackass and have
> -EOPNOTSUPP for anybody who didn't implement seek_hole/seek_data I just
> did it this way where the only hole is the one that starts at i_size, so
> seek_data before that is going to return the value.
> 
> As far as detecting an optimized handling of seek_hole/seek_data I'm not
> sure what the best answer for that is.  I suppose seek_hole/seek_data is
> new enough that people will have checks for -EOPNOTSUPP anyway so we
> could just switch it back to that, but that seems like a regression of
> sorts to me.  I'm not married to the implementation as it is so I'm open
> to suggestions.  Thanks,
Maybe we could consider this semantics from user space tools point of view,
i.e, Coreutils cp, sha1susm, GUN tar, AST(cp, mv, pax), etc...

I once tried to introduce SEEK_HOLE/DATA to Coreutils to improve cp(1) about
two years ago, that is intended to replace my old implementation via fiemap
ioctl(2),
http://lists.gnu.org/archive/html/bug-coreutils/2011-08/msg00102.html

Currently cp(1) handle sparse files via fiemap ioctl(2) although it have some
issues that have been discussed previously, but maybe we still have to support
it as a back up given that it performs efficiently for dealing with sparse files
on file systems without SEEK_HOLE/DATA support, therefore we can fall back to
the old fiemap calls.  Hence, I personally would like to see if we can return
EOPNOTSUPP in this case.

BTW, star do pre-checkups if the file system support SEEK_HOLE/DATA by checking
EOPNOTSUPP, maybe this is in terms of the Solaris interface I'm not yet verified.

Thanks,
-Jeff

 





  reply	other threads:[~2013-12-25 14:34 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-23 23:12 Usefulness of SEEK_HOLE / SEEK_DATA in generic_file_llseek() Jan Kara
2013-12-24  0:34 ` Josef Bacik
2013-12-25 14:34   ` Jeff Liu [this message]
2013-12-30 21:18   ` Jan Kara
2013-12-31  6:00     ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52BAECEF.1050501@oracle.com \
    --to=jeff.liu@oracle.com \
    --cc=coreutils@gnu.org \
    --cc=jack@suse.cz \
    --cc=jbacik@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.