From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jan Kara <jack@suse.cz>
Subject: Re: Usefulness of SEEK_HOLE / SEEK_DATA in generic_file_llseek()
Date: Mon, 30 Dec 2013 22:18:31 +0100
Message-ID: <20131230211831.GE5457@quack.suse.cz>
References: <20131223231253.GA8376@quack.suse.cz>
 <52B8D68A.9050902@fb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jan Kara <jack@suse.cz>, linux-fsdevel@vger.kernel.org
To: Josef Bacik <jbacik@fb.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from cantor2.suse.de ([195.135.220.15]:44476 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932232Ab3L3VSe (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Mon, 30 Dec 2013 16:18:34 -0500
Content-Disposition: inline
In-Reply-To: <52B8D68A.9050902@fb.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Mon 23-12-13 19:34:18, Josef Bacik wrote:
> On 12/23/2013 06:12 PM, Jan Kara wrote:
> >   Hello,
> >
> >   so I've now hit a xfstests failure for UDF which is caused by the
> >implementation of SEEK_HOLE / SEEK_DATA in generic_file_llseek(). UDF uses
> >that function as its .llseek method but it supports holes as any other unix
> >filesystem (e.g. ext2). The test in xfstests assumes that when it creates a
> >file by pwrite(fd, buf, bufsz, off), then SEEK_DATA on offset 0 should
> >return 'off' (off is reasonably rounded) but that's not true for the
> >implementation in generic_file_llseek().
> >
> >Now I'm not so much interested in that test itself - that can be tweaked to
> >detect that case. But I rather wanted to ask - how useful is it to
> >implement SEEK_HOLE / SEEK_DATA the way it is in generic_file_llseek()?
> >Because it seems to me that any serious user will have to detect whether
> >SEEK_HOLE / SEEK_DATA works reasonably and if not, fall back to some
> >heuristic anyway. So why bother inventing bogus values in
> >generic_file_llseek and thus making detection of working implementation
> >harder?
> I'm writing this from my in-laws so I'm going to make some
> assumptions about how the code works based on my memory, so sorry in
> advanced if this is completely wrong ;).
> 
> IIRC with the generic implementation we treat everything <= i_size
> as data and i_size as the first hole.  The way the spec works is
> that if we are currently at data and do seek_data then we just
> return our current offset, same for a hole.  In order to not be a
> jackass and have -EOPNOTSUPP for anybody who didn't implement
> seek_hole/seek_data I just did it this way where the only hole is
> the one that starts at i_size, so seek_data before that is going to
> return the value.
  Correct. My point is that I actually don't see anything 'jackass' in
returning -EOPNOTSUPP. I agree the way you've implemented
SEEK_HOLE/SEEK_DATA is a correct implementation of the spec but is it a
useful one? I mean e.g. cp(1) will rather want to fall back to its old
heuristic if SEEK_DATA just returns the current file offset. And if the
most of reasonable usecases of SEEK_DATA/SEEK_HOLE will need to detect the
case of dumb SEEK_DATA/SEEK_HOLE implementation, then what's the point?

> As far as detecting an optimized handling of seek_hole/seek_data I'm
> not sure what the best answer for that is.  I suppose
> seek_hole/seek_data is new enough that people will have checks for
> -EOPNOTSUPP anyway so we could just switch it back to that, but that
> seems like a regression of sorts to me.  I'm not married to the
> implementation as it is so I'm open to suggestions.  Thanks,

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR