From: Jeff Liu <jeff.liu@oracle.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Chris Mason <chris.mason@oracle.com>, xfs@oss.sgi.com
Subject: Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
Date: Wed, 23 Nov 2011 22:00:28 +0800 [thread overview]
Message-ID: <4ECCFC7C.7010207@oracle.com> (raw)
In-Reply-To: <20111123094049.GA5465@infradead.org>
Hi Christoph,
On 11/23/2011 05:40 PM, Christoph Hellwig wrote:
> On Tue, Nov 22, 2011 at 04:19:45PM +0800, Jeff Liu wrote:
>> Hello,
>>
>> This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
>
> Thanks Jeff, this looks pretty good for "simple" implementation,
> I only have a few rather cosmetic comments.
Thanks for your comments.
>
> Do you plan on updating Josef's old xfstests support patch for
> SEEK_DATA/SEEK_HOLE?
Sure!
Additionally, how about if I write two test cases, only to update
Josef's patch, another to perform a copy tests. i.e, create a sparse
file with dozens of holes, copy it via read/write, and then verify the
contents through cmp(1) for further checking?
> Also it would be nice to do the pagecache
> probing for dirty unwritten extents next to get a better quality
> of implementation.
Ok, I'll try to implement it in next post.
>
>> +STATIC int
>> +xfs_seek_data(
>> + struct xfs_inode *ip,
>> + loff_t *start)
>> +{
>
> In the XFS code we generally tab-aling the paramter names, just like
> you already did for the local variables:
>
> STATIC int
> xfs_seek_data(
> struct xfs_inode *ip,
> loff_t *start)
>
> (that also applies for a few other functions)
Sigh, made a stupid mistake again. :(
>
>> + /*
>> + * Hole handling for unwritten extents landed in a hole.
>> + * If the next extent is a data extent, then return the
>> + * start of it, otherwise we need to move the start offset
>> + * and map more blocks.
>> + */
>
> I don't think this comment is quite correct. We don't just end up here
> for unwritten extents. I'd recommend something like:
>
> /*
> * We landed in a hole. Skip to the next extent.
> */
>
>> + if (map[0].br_startblock == HOLESTARTBLOCK) {
>> + if (map[1].br_startblock == HOLESTARTBLOCK) {
>> + fsbno = map[1].br_startoff +
>> + map[1].br_blockcount;
>
> I don't think this code is reachable - xfs_bmapi will never produce
> multiple consecutive HOLESTARTBLOCK extents. If you want to ensure
> that feel free to add an assert, e.g.
Ah! I know why I failed to triggered this code with many test cases.
I'd like to add the assert in this stage.
>
> if (map[0].br_startblock == HOLESTARTBLOCK) {
> ASSERT(map[1].br_startblock != HOLESTARTBLOCK);
>
> *start = max_t(loff_t, seekoff,
> XFS_FSB_TO_B(mp, map[1].br_startoff));
> break;
> }
>
> This also means that we never have to loop here until we add dirty
> unwritten probing - if the second extent doesn't contain data there
> won't be any other data extent in this file beyound our offset.
>
>> +
>> + /*
>> + * Landed in an in-memory data extent or in an allocated
>> + * extent.
>> + */
>
>> + if (map[0].br_startoff == DELAYSTARTBLOCK ||
>> + map[0].br_state == XFS_EXT_NORM) {
>
> I think just checking for br_state == XFS_EXT_NORM should be fine here,
> as unwritten extents can be delayed allocated. But until we add probing
> for dirty unwritten extents is added we have to treat unwritten extents
> as data anyway to avoid data loss.
>
> Note that once unwrittent extent probing also needs to cover the hole
> case above and not just this case.
Thanks for pointing those out, I'll try to resolve them with page cache
probing for unwritten extents then.
>
>> +STATIC int
>> +xfs_seek_extent(
>> + struct inode *inode,
>> + loff_t *start,
>> + u32 type)
>> +{
>> + struct xfs_inode *ip = XFS_I(inode);
>> + struct xfs_mount *mp = ip->i_mount;
>> + struct xfs_ifork *ifp;
>> + int lock;
>> + int error = 0;
>> +
>> + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
>> + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
>> + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
>> + return XFS_ERROR(EINVAL);
>
> I'd recommend moving this check into xfs_file_llseek and even do it
> for the normal lseek requests - it's another sanity check for corrupted
> filesystems which makes sense everywhere. I also think the return value
> should be EFSCORRUPTED.
>
> Also XFS_DINODE_FMT_LOCAL isn't valid for regular files (yet) so it
> shouldn't be tested for.
>
>> +
>> + lock = xfs_ilock_map_shared(ip);
>> +
>> + if (XFS_FORCED_SHUTDOWN(mp)) {
>> + error = EIO;
>> + goto out_lock;
>> + }
>
> The shutdown check probably should go into the common lseek code and
> done for all cases.
>
>> +
>> + XFS_STATS_INC(xs_blk_mapr);
>
> I don't think this counter should be incremented here.
I will take of above issues.
>
>> + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
>> +
>> + ASSERT(ifp->if_ext_max ==
>> + XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
>
> Please just drop this assert. if_ext_max is pretty useless, and I have
> a patch to remove it pending. No adding another use of it in your patch
> will make merging a bit easier.
This change will reflect in next post too.
>
>> + if (!(ifp->if_flags & XFS_IFEXTENTS)) {
>> + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
>> + if (error)
>> + goto out_lock;
>> + }
>> +
>> + if (type == SEEK_HOLE)
>> + error = xfs_seek_hole(ip, start);
>> + else
>> + error = xfs_seek_data(ip, start);
>
> Now that just the locking and the xfs_iread_extents call is left in
> this function I'd suggest to remove it and instead add duplicates
> of the locking and xfs_iread_extents into xfs_seek_hole and
> xfs_seek_data.
So per my understood, we need to isolate the pre-checking code to
xfs_file_llseek(), and duplicate locking and xfs_iread_extents() to
seek_data/hole. In this way, we could reduce the coupling in terms of
those routines functionality?
>
>> + switch (origin) {
>> + case SEEK_END:
>> + case SEEK_CUR:
>> + offset = generic_file_llseek(file, offset, origin);
>> + goto out;
>
> instead of the goto out just return the generic_file_llseek return
> value directly here.
Definitely!
>
>> + case SEEK_DATA:
>> + case SEEK_HOLE:
>> + if (offset >= i_size_read(inode)) {
>> + ret = -ENXIO;
>> + goto error;
>> + }
>> +
>> + ret = xfs_seek_extent(inode, &offset, origin);
>> + if (ret)
>> + goto error;
>> + }
>> +
>> + if (offset != file->f_pos)
>> + file->f_pos = offset;
>
> doing the offset update outside the case scrope doesn't make much sense.
>
> I'd probably just move the offset check and offset update into the
> low-level xfs_seek_data/xfs_seek_hole helpers - it's a tiny bit of
> duplication, but it keeps the functions self-contained and the
> top-level llseek method just dispatcher into the different routines.
Sorry, those codes are just copied from other file systems, I need to
consolidate them.
Thanks,
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2011-11-23 14:00 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-22 8:19 [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2 Jeff Liu
2011-11-22 8:30 ` Jeff Liu
2011-11-23 9:40 ` Christoph Hellwig
2011-11-23 14:00 ` Jeff Liu [this message]
2011-11-24 3:23 ` Dave Chinner
2011-11-24 9:02 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ECCFC7C.7010207@oracle.com \
--to=jeff.liu@oracle.com \
--cc=chris.mason@oracle.com \
--cc=hch@infradead.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.