From: Jeff Liu <jeff.liu@oracle.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Chris Mason <chris.mason@oracle.com>, xfs@oss.sgi.com
Subject: Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
Date: Wed, 23 Nov 2011 22:00:28 +0800 [thread overview]
Message-ID: <4ECCFC7C.7010207@oracle.com> (raw)
In-Reply-To: <20111123094049.GA5465@infradead.org>
Hi Christoph,
On 11/23/2011 05:40 PM, Christoph Hellwig wrote:
> On Tue, Nov 22, 2011 at 04:19:45PM +0800, Jeff Liu wrote:
>> Hello,
>>
>> This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
>
> Thanks Jeff, this looks pretty good for "simple" implementation,
> I only have a few rather cosmetic comments.
Thanks for your comments.
>
> Do you plan on updating Josef's old xfstests support patch for
> SEEK_DATA/SEEK_HOLE?
Sure!
Additionally, how about if I write two test cases, only to update
Josef's patch, another to perform a copy tests. i.e, create a sparse
file with dozens of holes, copy it via read/write, and then verify the
contents through cmp(1) for further checking?
> Also it would be nice to do the pagecache
> probing for dirty unwritten extents next to get a better quality
> of implementation.
Ok, I'll try to implement it in next post.
>
>> +STATIC int
>> +xfs_seek_data(
>> + struct xfs_inode *ip,
>> + loff_t *start)
>> +{
>
> In the XFS code we generally tab-aling the paramter names, just like
> you already did for the local variables:
>
> STATIC int
> xfs_seek_data(
> struct xfs_inode *ip,
> loff_t *start)
>
> (that also applies for a few other functions)
Sigh, made a stupid mistake again. :(
>
>> + /*
>> + * Hole handling for unwritten extents landed in a hole.
>> + * If the next extent is a data extent, then return the
>> + * start of it, otherwise we need to move the start offset
>> + * and map more blocks.
>> + */
>
> I don't think this comment is quite correct. We don't just end up here
> for unwritten extents. I'd recommend something like:
>
> /*
> * We landed in a hole. Skip to the next extent.
> */
>
>> + if (map[0].br_startblock == HOLESTARTBLOCK) {
>> + if (map[1].br_startblock == HOLESTARTBLOCK) {
>> + fsbno = map[1].br_startoff +
>> + map[1].br_blockcount;
>
> I don't think this code is reachable - xfs_bmapi will never produce
> multiple consecutive HOLESTARTBLOCK extents. If you want to ensure
> that feel free to add an assert, e.g.
Ah! I know why I failed to triggered this code with many test cases.
I'd like to add the assert in this stage.
>
> if (map[0].br_startblock == HOLESTARTBLOCK) {
> ASSERT(map[1].br_startblock != HOLESTARTBLOCK);
>
> *start = max_t(loff_t, seekoff,
> XFS_FSB_TO_B(mp, map[1].br_startoff));
> break;
> }
>
> This also means that we never have to loop here until we add dirty
> unwritten probing - if the second extent doesn't contain data there
> won't be any other data extent in this file beyound our offset.
>
>> +
>> + /*
>> + * Landed in an in-memory data extent or in an allocated
>> + * extent.
>> + */
>
>> + if (map[0].br_startoff == DELAYSTARTBLOCK ||
>> + map[0].br_state == XFS_EXT_NORM) {
>
> I think just checking for br_state == XFS_EXT_NORM should be fine here,
> as unwritten extents can be delayed allocated. But until we add probing
> for dirty unwritten extents is added we have to treat unwritten extents
> as data anyway to avoid data loss.
>
> Note that once unwrittent extent probing also needs to cover the hole
> case above and not just this case.
Thanks for pointing those out, I'll try to resolve them with page cache
probing for unwritten extents then.
>
>> +STATIC int
>> +xfs_seek_extent(
>> + struct inode *inode,
>> + loff_t *start,
>> + u32 type)
>> +{
>> + struct xfs_inode *ip = XFS_I(inode);
>> + struct xfs_mount *mp = ip->i_mount;
>> + struct xfs_ifork *ifp;
>> + int lock;
>> + int error = 0;
>> +
>> + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
>> + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
>> + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
>> + return XFS_ERROR(EINVAL);
>
> I'd recommend moving this check into xfs_file_llseek and even do it
> for the normal lseek requests - it's another sanity check for corrupted
> filesystems which makes sense everywhere. I also think the return value
> should be EFSCORRUPTED.
>
> Also XFS_DINODE_FMT_LOCAL isn't valid for regular files (yet) so it
> shouldn't be tested for.
>
>> +
>> + lock = xfs_ilock_map_shared(ip);
>> +
>> + if (XFS_FORCED_SHUTDOWN(mp)) {
>> + error = EIO;
>> + goto out_lock;
>> + }
>
> The shutdown check probably should go into the common lseek code and
> done for all cases.
>
>> +
>> + XFS_STATS_INC(xs_blk_mapr);
>
> I don't think this counter should be incremented here.
I will take of above issues.
>
>> + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
>> +
>> + ASSERT(ifp->if_ext_max ==
>> + XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
>
> Please just drop this assert. if_ext_max is pretty useless, and I have
> a patch to remove it pending. No adding another use of it in your patch
> will make merging a bit easier.
This change will reflect in next post too.
>
>> + if (!(ifp->if_flags & XFS_IFEXTENTS)) {
>> + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
>> + if (error)
>> + goto out_lock;
>> + }
>> +
>> + if (type == SEEK_HOLE)
>> + error = xfs_seek_hole(ip, start);
>> + else
>> + error = xfs_seek_data(ip, start);
>
> Now that just the locking and the xfs_iread_extents call is left in
> this function I'd suggest to remove it and instead add duplicates
> of the locking and xfs_iread_extents into xfs_seek_hole and
> xfs_seek_data.
So per my understood, we need to isolate the pre-checking code to
xfs_file_llseek(), and duplicate locking and xfs_iread_extents() to
seek_data/hole. In this way, we could reduce the coupling in terms of
those routines functionality?
>
>> + switch (origin) {
>> + case SEEK_END:
>> + case SEEK_CUR:
>> + offset = generic_file_llseek(file, offset, origin);
>> + goto out;
>
> instead of the goto out just return the generic_file_llseek return
> value directly here.
Definitely!
>
>> + case SEEK_DATA:
>> + case SEEK_HOLE:
>> + if (offset >= i_size_read(inode)) {
>> + ret = -ENXIO;
>> + goto error;
>> + }
>> +
>> + ret = xfs_seek_extent(inode, &offset, origin);
>> + if (ret)
>> + goto error;
>> + }
>> +
>> + if (offset != file->f_pos)
>> + file->f_pos = offset;
>
> doing the offset update outside the case scrope doesn't make much sense.
>
> I'd probably just move the offset check and offset update into the
> low-level xfs_seek_data/xfs_seek_hole helpers - it's a tiny bit of
> duplication, but it keeps the functions self-contained and the
> top-level llseek method just dispatcher into the different routines.
Sorry, those codes are just copied from other file systems, I need to
consolidate them.
Thanks,
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2011-11-23 14:00 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-22 8:19 [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2 Jeff Liu
2011-11-22 8:30 ` Jeff Liu
2011-11-23 9:40 ` Christoph Hellwig
2011-11-23 14:00 ` Jeff Liu [this message]
2011-11-24 3:23 ` Dave Chinner
2011-11-24 9:02 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ECCFC7C.7010207@oracle.com \
--to=jeff.liu@oracle.com \
--cc=chris.mason@oracle.com \
--cc=hch@infradead.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox