public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Liu <jeff.liu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	xfs@oss.sgi.com, Chris Mason <chris.mason@oracle.com>,
	aelder@sgi.com
Subject: Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE support to XFS V1
Date: Sun, 20 Nov 2011 21:59:31 +0800	[thread overview]
Message-ID: <4EC907C3.7020901@oracle.com> (raw)
In-Reply-To: <20111120003031.GM7046@dastard>

Hi Dave,

Thanks for your further comments!

On 11/20/2011 08:30 AM, Dave Chinner wrote:

> On Sat, Nov 19, 2011 at 04:37:13PM +0800, Jeff Liu wrote:
>>
>> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
> 
> Christoph has commented on the code-related aspects of the patch, so
> I won't repeat that. I'll comment on structural/design issues
> instead.
> 
> Firstly, the patch splits the functionality arbitrarily over three
> different files, and I don't think that is necessary. There really
> is no reason at all for xfs_bmap.c to know anything aout
> SEEK_HOLE/SEEK_DATA semantics - that file is for extent manipulation
> and search functions. SEEK semantics should be entirely encoded into
> the function that deals with the seeking.

Yes, I should merge and isolate those functions in xfs_file.c.

> 
> Secondly, xfs_find_desired_extent() and xfs_seek_hole_data() should
> be one function, and named something like xfs_file_seek_extent().

Definitely!  and sorry for my poor english, xfs_file_seek_extent() is distinct in this case.

> 
> Finally, don't think using low level extent search functions like
> xfs_bmap_search_extents() is the right level to implement this
> functionality (see my comments about SEEK_HOLE/SEEK_DATA semantics
> in xfs-bmap.c above), especially as we already have functions for
> looking up holes and extents at given offsets.
> 
> That is, to find the next hole at or after after a given offset, we
> already have xfs_bmap_first_unused(). Indeed, this function already
> has the exact semantics that SEEK_HOLE requires. Simply put:
> 
> 	case SEEK_HOLE:
> 		fsb = XFS_B_TO_FSBT(mp, start_offset);
> 		error = xfs_bmap_first_unused(NULL, ip, 1, &fsb,
> 						XFS_DATA_FORK);
> 		if (error)
> 			return -error;
> 
> 		if (fsb <= XFS_B_TO_FSBT(mp, start_offset))
> 			return start_offset;
> 		return XFS_FSB_TO_B(mp, fsb);

Thanks for pointing it out, I even don't know XFS has this convenient routine at that time. :(

> 
> 
> As to the data extent search, I'd prefer you to use xfs_bmapi_read()
> rather than xfs_bmap_search_extents() directly. I'd prefer that we
> return unwritten extents as holes rather than data from the initial
> implementation, and using the low level xfs_bmap_search_extents()
> makes that quite complex. However, we already have a function for
> handling that complexity: xfs_bmapi_read().
> 
> That is, xfs_bmapi_read() needs to be passed an array of two maps,
> one for the current offset, and one for the next extent type. This
> makes one call sufficient for most transitions. Done in a simple
> loop it will handle all conditions of hole->unwritten->hole....
> until it finds a data extent of EOF.
> 
> 
> 	start_fsbno = XFS_B_TO_FSBT(mp, start_offset);
> 	while (1) {
> 		struct xfs_bmbt_irec	maps[2];
> 		int			nmaps = 2;
> 
> 		count_fsb = XFS_B_TO_FSB(mp, XFS_MAXIOFFSET(mp));
> 		error = xfs_bmapi_read(ip, fsbno, count_fsb,
> 					&maps, &nmaps, XFS_BMAPI_ENTIRE);
> 
> 		if (error)
> 			return -error;
> 		if (!nmaps) {
> 			/* no extents at given offset, must be beyond EOF */
> 			return -ENXIO;
> 		}
> 
> 		switch (map[0].br_startblock) {
> 		case DELAYSTARTBLOCK:
> 			/* landed in an in-memory data extent */
> 			return map[0].br_startoff;
> 
> 		default:
> 			/* landed in an allocated extent */
> 			if (map[0].br_state == XFS_EXT_NORM) {
> 				/* a real data extent */
> 				return map[0].br_startoff;
> 			}
> 			/* Fall through to hole handling for unwritten extents */
> 
> 		case HOLESTARTBLOCK:
> 			/*
> 			 * landed in a hole. If the next extent is a data
> 			 * extent, then return the start of it, otherwise
> 			 * we need to move the start offset and map more
> 			 * blocks.
> 			 */
> 			if (map[1].br_startblock == DELAYSTARTBLOCK ||
> 			   ((map[1].br_startblock != HOLESTARTBLOCK &&
> 			     map[1].br_state == XFS_EXT_NORM)))
> 				return map[1].br_startoff;
> 
> 			start_fsbno = map[1].br_startoff + map[1].br_blockcount;
> 			break;
> 		}
> 
> 		if (XFS_FSB_TO_B(mp, start_fsbno) > ip->i_size) {
> 			/* Beyond EOF now */
> 			return -ENXIO;
> 		}
> 	}
> 
> This can pretty much all be done in
> fs/xfs/xfs_file.c::xfs_file_seek_extent() because all the functions
> used by the above code are exported from xfs_bmap.c for external use
> - that solves the scattering problem and uses interfaces that we
> already know work in the intended manner.... ;)

Thanks again for the detailed info!
At first, I have spent a few hours to think it over using xfs_bmap_search_extents() or xfs_bmapi_read().
Indeed, it will be more complex to deal with the unwritten extents later if using xfs_bmap_search_extents().
This change will reflected in my next post.

> 
> BTW:
> 
>> +xfs_file_llseek(
>> +	struct file *file,
>> +	loff_t offset,
>> +	int origin)
>> +{
>> +	struct inode *inode = file->f_mapping->host;
>> +	int ret;
>> +
>> +	if (origin != SEEK_DATA && origin != SEEK_HOLE)
>> +		return generic_file_llseek(file, offset, origin);
>> +
>> +	mutex_lock(&inode->i_mutex);
>> +	switch (origin) {
> 
> We don't need the i_mutex to be held here. We only need to hold the
> ilock in shared mode for this operation to protect against extent
> list modifications (like unwritten extent conversion and
> truncation).

looks we only need to hold the ilock in shared mode in xfs_file_seek_extent().

> 
>> +int
>> +xfs_find_desired_extent(
>> +	struct inode		*inode,
>> +	loff_t			*start,
>> +	u32			type)
>> +{
>> +	xfs_inode_t		*ip = XFS_I(inode);
>> +	xfs_mount_t		*mp = ip->i_mount;
>> +	struct xfs_ifork	*ifp;
>> +	int			lock;
>> +	int			error;
>> +
>> +	if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
>> +	    ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
>> +	    ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
>> +		return XFS_ERROR(EINVAL);
>> +
>> +	xfs_ilock(ip, XFS_IOLOCK_SHARED);
>> +
>> +	/*
>> +	 * Flush the delay alloc blocks. Even after flushing the inode,
>> +	 * there can still be delalloc blocks on the inode beyond EOF
>> +	 * due to speculative preallocation. These are not removed until
>> +	 * the release function is called or the inode is inactivated.
>> +	 * Hence we cannot assert here that ip->i_delayed_blks == 0.
>> +	 */
>> +	if (ip->i_delayed_blks || ip->i_size > ip->i_d.di_size) {
>> +		error = xfs_flush_pages(ip, 0, -1, 0, FI_REMAPF);
>> +		if (error)
>> +			goto out_unlock_iolock;
>> +	}
> 
> i.e. this IOLOCK and flush is completely unnecessary because we'll
> find delayed allocation extents in the extent lookup and can handle
> them just like real allocated extents....
> 
>> +	lock = xfs_ilock_map_shared(ip);
> 
> i.e. this is the only lock we need to take.

Yes, if we get rid of the the flush pages here, xfs_ilock() can be removed safely.

Thanks,
-Jeff

> 
> Cheers,
> 
> Dave.


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2011-11-20 13:59 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-02 15:04 SEEK_DATA/SEEK_HOLE support Jeff Liu
2011-10-02 15:42 ` Christoph Hellwig
2011-10-02 16:06   ` Jeff Liu
2011-10-02 17:59     ` Christoph Hellwig
2011-10-02 19:11       ` Andi Kleen
2011-10-03  4:04       ` Jeff Liu
2011-10-03 23:43         ` Dave Chinner
2011-10-04 13:02           ` Christoph Hellwig
2011-10-05  4:36             ` Dave Chinner
2011-10-05  5:32               ` Jeff Liu
2011-10-05  9:23                 ` Dave Chinner
2011-10-05 13:56                   ` Jeff Liu
2011-10-05  7:34               ` Michael Monnerie
2011-10-05  9:36                 ` Dave Chinner
2011-10-05 18:22                   ` Michael Monnerie
2011-10-06  0:32                     ` Dave Chinner
2011-11-14 10:24 ` Christoph Hellwig
2011-11-14 12:47   ` Jeff Liu
2011-11-14 12:50     ` Christoph Hellwig
2011-11-19  8:29       ` XFS SEEK_DATA/SEEK_HOLE support V1 Jeff Liu
2011-11-19  8:34         ` Jeff Liu
2011-11-19  8:37         ` [PATCH] Introduce SEEK_DATA/SEEK_HOLE support to XFS V1 Jeff Liu
2011-11-19 19:11           ` Christoph Hellwig
2011-11-20 13:15             ` Jeff Liu
2011-11-20  0:30           ` Dave Chinner
2011-11-20 13:59             ` Jeff Liu [this message]
2011-11-20 15:30               ` Christoph Hellwig
2011-11-20 22:34                 ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EC907C3.7020901@oracle.com \
    --to=jeff.liu@oracle.com \
    --cc=aelder@sgi.com \
    --cc=chris.mason@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox