* [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
@ 2011-11-22 8:19 Jeff Liu
2011-11-22 8:30 ` Jeff Liu
2011-11-23 9:40 ` Christoph Hellwig
0 siblings, 2 replies; 6+ messages in thread
From: Jeff Liu @ 2011-11-22 8:19 UTC (permalink / raw)
To: xfs; +Cc: Christoph Hellwig, Chris Mason, aelder
Hello,
This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
Changes:
========
1. Merge xfs_find_desired_extent() and xfs_seek_data_hole() into xfs_seek_extent(), and place it at xfs_file.c.
2. Using two different routines xfs_seek_data()/xfs_seek_hole() to handle SEEK_DATA/SEEK_HOLE requests respectively.
3. Remove some worthless result checking code from xfs_file_llseek().
4. s/xfs_mount_t/struct xfs_mount/g.
Tests:
======
In addition to the seek_test.c I have used previously, I have also done a large sparse file copy tests, it works to me.
Hope I have not made obvious stupid mistakes this time. :-P.
Any comments are appreciated!
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
fs/xfs/xfs_file.c | 188 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 187 insertions(+), 1 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 753ed9b..bb2be00 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1141,8 +1141,194 @@ xfs_vm_page_mkwrite(
return block_page_mkwrite(vma, vmf, xfs_get_blocks);
}
+STATIC int
+xfs_seek_data(
+ struct xfs_inode *ip,
+ loff_t *start)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_ifork *ifp;
+ xfs_fileoff_t fsbno;
+ xfs_filblks_t len;
+ loff_t startoff = *start;
+ int error = 0;
+
+ fsbno = XFS_B_TO_FSBT(mp, *start);
+ ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+ len = XFS_B_TO_FSB(mp, ip->i_size);
+
+ for (;;) {
+ struct xfs_bmbt_irec map[2];
+ int nmap = 2;
+ loff_t seekoff;
+
+ error = xfs_bmapi_read(ip, fsbno, len - fsbno, map, &nmap,
+ XFS_BMAPI_ENTIRE);
+ if (error)
+ break;
+
+ /* No extents at given offset, must be beyond EOF */
+ if (!nmap) {
+ error = ENXIO;
+ break;
+ }
+
+ seekoff = XFS_FSB_TO_B(mp, fsbno);
+
+ /*
+ * Hole handling for unwritten extents landed in a hole.
+ * If the next extent is a data extent, then return the
+ * start of it, otherwise we need to move the start offset
+ * and map more blocks.
+ */
+ if (map[0].br_startblock == HOLESTARTBLOCK) {
+ if (map[1].br_startblock == HOLESTARTBLOCK) {
+ fsbno = map[1].br_startoff +
+ map[1].br_blockcount;
+ } else {
+ *start = max_t(loff_t, seekoff,
+ XFS_FSB_TO_B(mp, map[1].br_startoff));
+ break;
+ }
+ }
+
+ /*
+ * Landed in an in-memory data extent or in an allocated
+ * extent.
+ */
+ if (map[0].br_startoff == DELAYSTARTBLOCK ||
+ map[0].br_state == XFS_EXT_NORM) {
+ *start = max_t(loff_t, seekoff,
+ XFS_FSB_TO_B(mp, map[0].br_startoff));
+ break;
+ }
+
+ /* return ENXIO if beyond eof */
+ if (XFS_FSB_TO_B(mp, fsbno) > ip->i_size) {
+ error = ENXIO;
+ break;
+ }
+ }
+
+ if (*start < startoff)
+ *start = startoff;
+
+ return error;
+}
+
+STATIC int
+xfs_seek_hole(
+ struct xfs_inode *ip,
+ loff_t *start)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ int error = 0;
+ loff_t seekoff = *start;
+ loff_t holeoff;
+ xfs_fileoff_t fsbno;
+
+ fsbno = XFS_B_TO_FSBT(mp, *start);
+ error = xfs_bmap_first_unused(NULL, ip, 1, &fsbno, XFS_DATA_FORK);
+ if (error)
+ return error;
+
+ holeoff = XFS_FSB_TO_B(mp, fsbno);
+ if (holeoff <= seekoff)
+ *start = seekoff;
+ else
+ *start = min_t(loff_t, holeoff, ip->i_size);
+
+ return error;
+}
+
+STATIC int
+xfs_seek_extent(
+ struct inode *inode,
+ loff_t *start,
+ u32 type)
+{
+ struct xfs_inode *ip = XFS_I(inode);
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_ifork *ifp;
+ int lock;
+ int error = 0;
+
+ if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
+ ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
+ ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
+ return XFS_ERROR(EINVAL);
+
+ lock = xfs_ilock_map_shared(ip);
+
+ if (XFS_FORCED_SHUTDOWN(mp)) {
+ error = EIO;
+ goto out_lock;
+ }
+
+ XFS_STATS_INC(xs_blk_mapr);
+ ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+
+ ASSERT(ifp->if_ext_max ==
+ XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
+
+ if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+ error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
+ if (error)
+ goto out_lock;
+ }
+
+ if (type == SEEK_HOLE)
+ error = xfs_seek_hole(ip, start);
+ else
+ error = xfs_seek_data(ip, start);
+
+out_lock:
+ xfs_iunlock_map_shared(ip, lock);
+
+ if (error)
+ return -error;
+
+ return 0;
+}
+
+STATIC loff_t
+xfs_file_llseek(
+ struct file *file,
+ loff_t offset,
+ int origin)
+{
+ struct inode *inode = file->f_mapping->host;
+ int ret;
+
+ switch (origin) {
+ case SEEK_END:
+ case SEEK_CUR:
+ offset = generic_file_llseek(file, offset, origin);
+ goto out;
+ case SEEK_DATA:
+ case SEEK_HOLE:
+ if (offset >= i_size_read(inode)) {
+ ret = -ENXIO;
+ goto error;
+ }
+
+ ret = xfs_seek_extent(inode, &offset, origin);
+ if (ret)
+ goto error;
+ }
+
+ if (offset != file->f_pos)
+ file->f_pos = offset;
+
+out:
+ return offset;
+
+error:
+ return ret;
+}
+
const struct file_operations xfs_file_operations = {
- .llseek = generic_file_llseek,
+ .llseek = xfs_file_llseek,
.read = do_sync_read,
.write = do_sync_write,
.aio_read = xfs_file_aio_read,
--
1.7.4.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
2011-11-22 8:19 [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2 Jeff Liu
@ 2011-11-22 8:30 ` Jeff Liu
2011-11-23 9:40 ` Christoph Hellwig
1 sibling, 0 replies; 6+ messages in thread
From: Jeff Liu @ 2011-11-22 8:30 UTC (permalink / raw)
To: xfs; +Cc: Christoph Hellwig, Chris Mason, aelder
CC to Dave.
Thanks,
-Jeff
On 11/22/2011 04:19 PM, Jeff Liu wrote:
> Hello,
>
> This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
>
> Changes:
> ========
> 1. Merge xfs_find_desired_extent() and xfs_seek_data_hole() into xfs_seek_extent(), and place it at xfs_file.c.
> 2. Using two different routines xfs_seek_data()/xfs_seek_hole() to handle SEEK_DATA/SEEK_HOLE requests respectively.
> 3. Remove some worthless result checking code from xfs_file_llseek().
> 4. s/xfs_mount_t/struct xfs_mount/g.
>
> Tests:
> ======
> In addition to the seek_test.c I have used previously, I have also done a large sparse file copy tests, it works to me.
>
> Hope I have not made obvious stupid mistakes this time. :-P.
> Any comments are appreciated!
>
>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>
>
> ---
> fs/xfs/xfs_file.c | 188 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 187 insertions(+), 1 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 753ed9b..bb2be00 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1141,8 +1141,194 @@ xfs_vm_page_mkwrite(
> return block_page_mkwrite(vma, vmf, xfs_get_blocks);
> }
>
> +STATIC int
> +xfs_seek_data(
> + struct xfs_inode *ip,
> + loff_t *start)
> +{
> + struct xfs_mount *mp = ip->i_mount;
> + struct xfs_ifork *ifp;
> + xfs_fileoff_t fsbno;
> + xfs_filblks_t len;
> + loff_t startoff = *start;
> + int error = 0;
> +
> + fsbno = XFS_B_TO_FSBT(mp, *start);
> + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> + len = XFS_B_TO_FSB(mp, ip->i_size);
> +
> + for (;;) {
> + struct xfs_bmbt_irec map[2];
> + int nmap = 2;
> + loff_t seekoff;
> +
> + error = xfs_bmapi_read(ip, fsbno, len - fsbno, map, &nmap,
> + XFS_BMAPI_ENTIRE);
> + if (error)
> + break;
> +
> + /* No extents at given offset, must be beyond EOF */
> + if (!nmap) {
> + error = ENXIO;
> + break;
> + }
> +
> + seekoff = XFS_FSB_TO_B(mp, fsbno);
> +
> + /*
> + * Hole handling for unwritten extents landed in a hole.
> + * If the next extent is a data extent, then return the
> + * start of it, otherwise we need to move the start offset
> + * and map more blocks.
> + */
> + if (map[0].br_startblock == HOLESTARTBLOCK) {
> + if (map[1].br_startblock == HOLESTARTBLOCK) {
> + fsbno = map[1].br_startoff +
> + map[1].br_blockcount;
> + } else {
> + *start = max_t(loff_t, seekoff,
> + XFS_FSB_TO_B(mp, map[1].br_startoff));
> + break;
> + }
> + }
> +
> + /*
> + * Landed in an in-memory data extent or in an allocated
> + * extent.
> + */
> + if (map[0].br_startoff == DELAYSTARTBLOCK ||
> + map[0].br_state == XFS_EXT_NORM) {
> + *start = max_t(loff_t, seekoff,
> + XFS_FSB_TO_B(mp, map[0].br_startoff));
> + break;
> + }
> +
> + /* return ENXIO if beyond eof */
> + if (XFS_FSB_TO_B(mp, fsbno) > ip->i_size) {
> + error = ENXIO;
> + break;
> + }
> + }
> +
> + if (*start < startoff)
> + *start = startoff;
> +
> + return error;
> +}
> +
> +STATIC int
> +xfs_seek_hole(
> + struct xfs_inode *ip,
> + loff_t *start)
> +{
> + struct xfs_mount *mp = ip->i_mount;
> + int error = 0;
> + loff_t seekoff = *start;
> + loff_t holeoff;
> + xfs_fileoff_t fsbno;
> +
> + fsbno = XFS_B_TO_FSBT(mp, *start);
> + error = xfs_bmap_first_unused(NULL, ip, 1, &fsbno, XFS_DATA_FORK);
> + if (error)
> + return error;
> +
> + holeoff = XFS_FSB_TO_B(mp, fsbno);
> + if (holeoff <= seekoff)
> + *start = seekoff;
> + else
> + *start = min_t(loff_t, holeoff, ip->i_size);
> +
> + return error;
> +}
> +
> +STATIC int
> +xfs_seek_extent(
> + struct inode *inode,
> + loff_t *start,
> + u32 type)
> +{
> + struct xfs_inode *ip = XFS_I(inode);
> + struct xfs_mount *mp = ip->i_mount;
> + struct xfs_ifork *ifp;
> + int lock;
> + int error = 0;
> +
> + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
> + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
> + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> + return XFS_ERROR(EINVAL);
> +
> + lock = xfs_ilock_map_shared(ip);
> +
> + if (XFS_FORCED_SHUTDOWN(mp)) {
> + error = EIO;
> + goto out_lock;
> + }
> +
> + XFS_STATS_INC(xs_blk_mapr);
> + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +
> + ASSERT(ifp->if_ext_max ==
> + XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
> +
> + if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
> + if (error)
> + goto out_lock;
> + }
> +
> + if (type == SEEK_HOLE)
> + error = xfs_seek_hole(ip, start);
> + else
> + error = xfs_seek_data(ip, start);
> +
> +out_lock:
> + xfs_iunlock_map_shared(ip, lock);
> +
> + if (error)
> + return -error;
> +
> + return 0;
> +}
> +
> +STATIC loff_t
> +xfs_file_llseek(
> + struct file *file,
> + loff_t offset,
> + int origin)
> +{
> + struct inode *inode = file->f_mapping->host;
> + int ret;
> +
> + switch (origin) {
> + case SEEK_END:
> + case SEEK_CUR:
> + offset = generic_file_llseek(file, offset, origin);
> + goto out;
> + case SEEK_DATA:
> + case SEEK_HOLE:
> + if (offset >= i_size_read(inode)) {
> + ret = -ENXIO;
> + goto error;
> + }
> +
> + ret = xfs_seek_extent(inode, &offset, origin);
> + if (ret)
> + goto error;
> + }
> +
> + if (offset != file->f_pos)
> + file->f_pos = offset;
> +
> +out:
> + return offset;
> +
> +error:
> + return ret;
> +}
> +
> const struct file_operations xfs_file_operations = {
> - .llseek = generic_file_llseek,
> + .llseek = xfs_file_llseek,
> .read = do_sync_read,
> .write = do_sync_write,
> .aio_read = xfs_file_aio_read,
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
2011-11-22 8:19 [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2 Jeff Liu
2011-11-22 8:30 ` Jeff Liu
@ 2011-11-23 9:40 ` Christoph Hellwig
2011-11-23 14:00 ` Jeff Liu
2011-11-24 3:23 ` Dave Chinner
1 sibling, 2 replies; 6+ messages in thread
From: Christoph Hellwig @ 2011-11-23 9:40 UTC (permalink / raw)
To: Jeff Liu; +Cc: Chris Mason, xfs
On Tue, Nov 22, 2011 at 04:19:45PM +0800, Jeff Liu wrote:
> Hello,
>
> This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
Thanks Jeff, this looks pretty good for "simple" implementation,
I only have a few rather cosmetic comments.
Do you plan on updating Josef's old xfstests support patch for
SEEK_DATA/SEEK_HOLE? Also it would be nice to do the pagecache
probing for dirty unwritten extents next to get a better quality
of implementation.
>+STATIC int
>+xfs_seek_data(
>+ struct xfs_inode *ip,
>+ loff_t *start)
>+{
In the XFS code we generally tab-aling the paramter names, just like
you already did for the local variables:
STATIC int
xfs_seek_data(
struct xfs_inode *ip,
loff_t *start)
(that also applies for a few other functions)
> + /*
> + * Hole handling for unwritten extents landed in a hole.
> + * If the next extent is a data extent, then return the
> + * start of it, otherwise we need to move the start offset
> + * and map more blocks.
> + */
I don't think this comment is quite correct. We don't just end up here
for unwritten extents. I'd recommend something like:
/*
* We landed in a hole. Skip to the next extent.
*/
> + if (map[0].br_startblock == HOLESTARTBLOCK) {
> + if (map[1].br_startblock == HOLESTARTBLOCK) {
> + fsbno = map[1].br_startoff +
> + map[1].br_blockcount;
I don't think this code is reachable - xfs_bmapi will never produce
multiple consecutive HOLESTARTBLOCK extents. If you want to ensure
that feel free to add an assert, e.g.
if (map[0].br_startblock == HOLESTARTBLOCK) {
ASSERT(map[1].br_startblock != HOLESTARTBLOCK);
*start = max_t(loff_t, seekoff,
XFS_FSB_TO_B(mp, map[1].br_startoff));
break;
}
This also means that we never have to loop here until we add dirty
unwritten probing - if the second extent doesn't contain data there
won't be any other data extent in this file beyound our offset.
> +
> + /*
> + * Landed in an in-memory data extent or in an allocated
> + * extent.
> + */
> + if (map[0].br_startoff == DELAYSTARTBLOCK ||
> + map[0].br_state == XFS_EXT_NORM) {
I think just checking for br_state == XFS_EXT_NORM should be fine here,
as unwritten extents can be delayed allocated. But until we add probing
for dirty unwritten extents is added we have to treat unwritten extents
as data anyway to avoid data loss.
Note that once unwrittent extent probing also needs to cover the hole
case above and not just this case.
> +STATIC int
> +xfs_seek_extent(
> + struct inode *inode,
> + loff_t *start,
> + u32 type)
> +{
> + struct xfs_inode *ip = XFS_I(inode);
> + struct xfs_mount *mp = ip->i_mount;
> + struct xfs_ifork *ifp;
> + int lock;
> + int error = 0;
> +
> + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
> + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
> + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> + return XFS_ERROR(EINVAL);
I'd recommend moving this check into xfs_file_llseek and even do it
for the normal lseek requests - it's another sanity check for corrupted
filesystems which makes sense everywhere. I also think the return value
should be EFSCORRUPTED.
Also XFS_DINODE_FMT_LOCAL isn't valid for regular files (yet) so it
shouldn't be tested for.
> +
> + lock = xfs_ilock_map_shared(ip);
> +
> + if (XFS_FORCED_SHUTDOWN(mp)) {
> + error = EIO;
> + goto out_lock;
> + }
The shutdown check probably should go into the common lseek code and
done for all cases.
> +
> + XFS_STATS_INC(xs_blk_mapr);
I don't think this counter should be incremented here.
> + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +
> + ASSERT(ifp->if_ext_max ==
> + XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
Please just drop this assert. if_ext_max is pretty useless, and I have
a patch to remove it pending. No adding another use of it in your patch
will make merging a bit easier.
> + if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
> + if (error)
> + goto out_lock;
> + }
> +
> + if (type == SEEK_HOLE)
> + error = xfs_seek_hole(ip, start);
> + else
> + error = xfs_seek_data(ip, start);
Now that just the locking and the xfs_iread_extents call is left in
this function I'd suggest to remove it and instead add duplicates
of the locking and xfs_iread_extents into xfs_seek_hole and
xfs_seek_data.
> + switch (origin) {
> + case SEEK_END:
> + case SEEK_CUR:
> + offset = generic_file_llseek(file, offset, origin);
> + goto out;
instead of the goto out just return the generic_file_llseek return
value directly here.
> + case SEEK_DATA:
> + case SEEK_HOLE:
> + if (offset >= i_size_read(inode)) {
> + ret = -ENXIO;
> + goto error;
> + }
> +
> + ret = xfs_seek_extent(inode, &offset, origin);
> + if (ret)
> + goto error;
> + }
> +
> + if (offset != file->f_pos)
> + file->f_pos = offset;
doing the offset update outside the case scrope doesn't make much sense.
I'd probably just move the offset check and offset update into the
low-level xfs_seek_data/xfs_seek_hole helpers - it's a tiny bit of
duplication, but it keeps the functions self-contained and the
top-level llseek method just dispatcher into the different routines.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
2011-11-23 9:40 ` Christoph Hellwig
@ 2011-11-23 14:00 ` Jeff Liu
2011-11-24 3:23 ` Dave Chinner
1 sibling, 0 replies; 6+ messages in thread
From: Jeff Liu @ 2011-11-23 14:00 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chris Mason, xfs
Hi Christoph,
On 11/23/2011 05:40 PM, Christoph Hellwig wrote:
> On Tue, Nov 22, 2011 at 04:19:45PM +0800, Jeff Liu wrote:
>> Hello,
>>
>> This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
>
> Thanks Jeff, this looks pretty good for "simple" implementation,
> I only have a few rather cosmetic comments.
Thanks for your comments.
>
> Do you plan on updating Josef's old xfstests support patch for
> SEEK_DATA/SEEK_HOLE?
Sure!
Additionally, how about if I write two test cases, only to update
Josef's patch, another to perform a copy tests. i.e, create a sparse
file with dozens of holes, copy it via read/write, and then verify the
contents through cmp(1) for further checking?
> Also it would be nice to do the pagecache
> probing for dirty unwritten extents next to get a better quality
> of implementation.
Ok, I'll try to implement it in next post.
>
>> +STATIC int
>> +xfs_seek_data(
>> + struct xfs_inode *ip,
>> + loff_t *start)
>> +{
>
> In the XFS code we generally tab-aling the paramter names, just like
> you already did for the local variables:
>
> STATIC int
> xfs_seek_data(
> struct xfs_inode *ip,
> loff_t *start)
>
> (that also applies for a few other functions)
Sigh, made a stupid mistake again. :(
>
>> + /*
>> + * Hole handling for unwritten extents landed in a hole.
>> + * If the next extent is a data extent, then return the
>> + * start of it, otherwise we need to move the start offset
>> + * and map more blocks.
>> + */
>
> I don't think this comment is quite correct. We don't just end up here
> for unwritten extents. I'd recommend something like:
>
> /*
> * We landed in a hole. Skip to the next extent.
> */
>
>> + if (map[0].br_startblock == HOLESTARTBLOCK) {
>> + if (map[1].br_startblock == HOLESTARTBLOCK) {
>> + fsbno = map[1].br_startoff +
>> + map[1].br_blockcount;
>
> I don't think this code is reachable - xfs_bmapi will never produce
> multiple consecutive HOLESTARTBLOCK extents. If you want to ensure
> that feel free to add an assert, e.g.
Ah! I know why I failed to triggered this code with many test cases.
I'd like to add the assert in this stage.
>
> if (map[0].br_startblock == HOLESTARTBLOCK) {
> ASSERT(map[1].br_startblock != HOLESTARTBLOCK);
>
> *start = max_t(loff_t, seekoff,
> XFS_FSB_TO_B(mp, map[1].br_startoff));
> break;
> }
>
> This also means that we never have to loop here until we add dirty
> unwritten probing - if the second extent doesn't contain data there
> won't be any other data extent in this file beyound our offset.
>
>> +
>> + /*
>> + * Landed in an in-memory data extent or in an allocated
>> + * extent.
>> + */
>
>> + if (map[0].br_startoff == DELAYSTARTBLOCK ||
>> + map[0].br_state == XFS_EXT_NORM) {
>
> I think just checking for br_state == XFS_EXT_NORM should be fine here,
> as unwritten extents can be delayed allocated. But until we add probing
> for dirty unwritten extents is added we have to treat unwritten extents
> as data anyway to avoid data loss.
>
> Note that once unwrittent extent probing also needs to cover the hole
> case above and not just this case.
Thanks for pointing those out, I'll try to resolve them with page cache
probing for unwritten extents then.
>
>> +STATIC int
>> +xfs_seek_extent(
>> + struct inode *inode,
>> + loff_t *start,
>> + u32 type)
>> +{
>> + struct xfs_inode *ip = XFS_I(inode);
>> + struct xfs_mount *mp = ip->i_mount;
>> + struct xfs_ifork *ifp;
>> + int lock;
>> + int error = 0;
>> +
>> + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
>> + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
>> + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
>> + return XFS_ERROR(EINVAL);
>
> I'd recommend moving this check into xfs_file_llseek and even do it
> for the normal lseek requests - it's another sanity check for corrupted
> filesystems which makes sense everywhere. I also think the return value
> should be EFSCORRUPTED.
>
> Also XFS_DINODE_FMT_LOCAL isn't valid for regular files (yet) so it
> shouldn't be tested for.
>
>> +
>> + lock = xfs_ilock_map_shared(ip);
>> +
>> + if (XFS_FORCED_SHUTDOWN(mp)) {
>> + error = EIO;
>> + goto out_lock;
>> + }
>
> The shutdown check probably should go into the common lseek code and
> done for all cases.
>
>> +
>> + XFS_STATS_INC(xs_blk_mapr);
>
> I don't think this counter should be incremented here.
I will take of above issues.
>
>> + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
>> +
>> + ASSERT(ifp->if_ext_max ==
>> + XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
>
> Please just drop this assert. if_ext_max is pretty useless, and I have
> a patch to remove it pending. No adding another use of it in your patch
> will make merging a bit easier.
This change will reflect in next post too.
>
>> + if (!(ifp->if_flags & XFS_IFEXTENTS)) {
>> + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
>> + if (error)
>> + goto out_lock;
>> + }
>> +
>> + if (type == SEEK_HOLE)
>> + error = xfs_seek_hole(ip, start);
>> + else
>> + error = xfs_seek_data(ip, start);
>
> Now that just the locking and the xfs_iread_extents call is left in
> this function I'd suggest to remove it and instead add duplicates
> of the locking and xfs_iread_extents into xfs_seek_hole and
> xfs_seek_data.
So per my understood, we need to isolate the pre-checking code to
xfs_file_llseek(), and duplicate locking and xfs_iread_extents() to
seek_data/hole. In this way, we could reduce the coupling in terms of
those routines functionality?
>
>> + switch (origin) {
>> + case SEEK_END:
>> + case SEEK_CUR:
>> + offset = generic_file_llseek(file, offset, origin);
>> + goto out;
>
> instead of the goto out just return the generic_file_llseek return
> value directly here.
Definitely!
>
>> + case SEEK_DATA:
>> + case SEEK_HOLE:
>> + if (offset >= i_size_read(inode)) {
>> + ret = -ENXIO;
>> + goto error;
>> + }
>> +
>> + ret = xfs_seek_extent(inode, &offset, origin);
>> + if (ret)
>> + goto error;
>> + }
>> +
>> + if (offset != file->f_pos)
>> + file->f_pos = offset;
>
> doing the offset update outside the case scrope doesn't make much sense.
>
> I'd probably just move the offset check and offset update into the
> low-level xfs_seek_data/xfs_seek_hole helpers - it's a tiny bit of
> duplication, but it keeps the functions self-contained and the
> top-level llseek method just dispatcher into the different routines.
Sorry, those codes are just copied from other file systems, I need to
consolidate them.
Thanks,
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
2011-11-23 9:40 ` Christoph Hellwig
2011-11-23 14:00 ` Jeff Liu
@ 2011-11-24 3:23 ` Dave Chinner
2011-11-24 9:02 ` Christoph Hellwig
1 sibling, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2011-11-24 3:23 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jeff Liu, Chris Mason, xfs
On Wed, Nov 23, 2011 at 04:40:49AM -0500, Christoph Hellwig wrote:
> On Tue, Nov 22, 2011 at 04:19:45PM +0800, Jeff Liu wrote:
> > Hello,
> >
> > This is the V2 attempt to add SEEK_DATA/SEEK_HOLE to XFS.
>
> Thanks Jeff, this looks pretty good for "simple" implementation,
> I only have a few rather cosmetic comments.
>
> Do you plan on updating Josef's old xfstests support patch for
> SEEK_DATA/SEEK_HOLE? Also it would be nice to do the pagecache
> probing for dirty unwritten extents next to get a better quality
> of implementation.
>
> >+STATIC int
> >+xfs_seek_data(
> >+ struct xfs_inode *ip,
> >+ loff_t *start)
> >+{
>
> In the XFS code we generally tab-aling the paramter names, just like
> you already did for the local variables:
>
> STATIC int
> xfs_seek_data(
> struct xfs_inode *ip,
> loff_t *start)
>
> (that also applies for a few other functions)
>
> > + /*
> > + * Hole handling for unwritten extents landed in a hole.
> > + * If the next extent is a data extent, then return the
> > + * start of it, otherwise we need to move the start offset
> > + * and map more blocks.
> > + */
>
> I don't think this comment is quite correct. We don't just end up here
> for unwritten extents. I'd recommend something like:
>
> /*
> * We landed in a hole. Skip to the next extent.
> */
>
> > + if (map[0].br_startblock == HOLESTARTBLOCK) {
> > + if (map[1].br_startblock == HOLESTARTBLOCK) {
> > + fsbno = map[1].br_startoff +
> > + map[1].br_blockcount;
>
> I don't think this code is reachable - xfs_bmapi will never produce
> multiple consecutive HOLESTARTBLOCK extents.
If the hole would overflow br_blockcount (32 bits of FSB units, 16TB
by default), then we should get multiple consecutive hole records
returned.
> If you want to ensure
> that feel free to add an assert, e.g.
>
> if (map[0].br_startblock == HOLESTARTBLOCK) {
> ASSERT(map[1].br_startblock != HOLESTARTBLOCK);
>
> *start = max_t(loff_t, seekoff,
> XFS_FSB_TO_B(mp, map[1].br_startoff));
> break;
> }
>
> This also means that we never have to loop here until we add dirty
> unwritten probing - if the second extent doesn't contain data there
> won't be any other data extent in this file beyound our offset.
I think that's incorrect. A data extent is limited in length by the
on disk format (21 bits of FSB, 8GB in 4k block fs), so if you've
got more than 8GB of data or the file is fragmented after the
current extent then we can still get back multiple data extents
before we find the next hole.
> > +
> > + /*
> > + * Landed in an in-memory data extent or in an allocated
> > + * extent.
> > + */
>
> > + if (map[0].br_startoff == DELAYSTARTBLOCK ||
> > + map[0].br_state == XFS_EXT_NORM) {
>
> I think just checking for br_state == XFS_EXT_NORM should be fine here,
> as unwritten extents can be delayed allocated.
Can they? I'm pretty sure delalloc and unwritten are mutually
exclusive states for an extent. That is, if you write over an
unwritten extent, it gets mapped to the disk blocks but remains an
unwritten extent until the data is written. And, IIRC, preallocation
skips over delayed allocation ranges so you can't turn an existing
delalloc region into a preallocated one...
> But until we add probing
> for dirty unwritten extents is added we have to treat unwritten extents
> as data anyway to avoid data loss.
True.
>
> Note that once unwrittent extent probing also needs to cover the hole
> case above and not just this case.
>
> > +STATIC int
> > +xfs_seek_extent(
> > + struct inode *inode,
> > + loff_t *start,
> > + u32 type)
> > +{
> > + struct xfs_inode *ip = XFS_I(inode);
> > + struct xfs_mount *mp = ip->i_mount;
> > + struct xfs_ifork *ifp;
> > + int lock;
> > + int error = 0;
> > +
> > + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
> > + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
> > + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> > + return XFS_ERROR(EINVAL);
>
> I'd recommend moving this check into xfs_file_llseek and even do it
> for the normal lseek requests - it's another sanity check for corrupted
> filesystems which makes sense everywhere. I also think the return value
> should be EFSCORRUPTED.
>
> Also XFS_DINODE_FMT_LOCAL isn't valid for regular files (yet) so it
> shouldn't be tested for.
I don't think it is necessary at all - the low level bmap functions
already do these checks.
>
> > +
> > + lock = xfs_ilock_map_shared(ip);
> > +
> > + if (XFS_FORCED_SHUTDOWN(mp)) {
> > + error = EIO;
> > + goto out_lock;
> > + }
>
> The shutdown check probably should go into the common lseek code and
> done for all cases.
I think the low level functions also do this check so it's not
explicitly needed here, anyway.
>
> > +
> > + XFS_STATS_INC(xs_blk_mapr);
>
> I don't think this counter should be incremented here.
It's done in the lower layer functions, so shouldn't be here.
>
> > + ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> > +
> > + ASSERT(ifp->if_ext_max ==
> > + XFS_IFORK_SIZE(ip, XFS_DATA_FORK) / (uint)sizeof(xfs_bmbt_rec_t));
>
> Please just drop this assert. if_ext_max is pretty useless, and I have
> a patch to remove it pending. No adding another use of it in your patch
> will make merging a bit easier.
I think it's done in the lower layer functions, anyway.
>
> > + if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> > + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
> > + if (error)
> > + goto out_lock;
> > + }
And that is definitely done in the lower layer bmap functions, so is
not necessary here.
> > +
> > + if (type == SEEK_HOLE)
> > + error = xfs_seek_hole(ip, start);
> > + else
> > + error = xfs_seek_data(ip, start);
>
> Now that just the locking and the xfs_iread_extents call is left in
> this function I'd suggest to remove it and instead add duplicates
> of the locking and xfs_iread_extents into xfs_seek_hole and
> xfs_seek_data.
Actually, it just turns into "lock, call seek/data fucntion, unlock",
so it can probaly go away entirely.
>
> > + switch (origin) {
> > + case SEEK_END:
> > + case SEEK_CUR:
> > + offset = generic_file_llseek(file, offset, origin);
> > + goto out;
>
> instead of the goto out just return the generic_file_llseek return
> value directly here.
>
> > + case SEEK_DATA:
> > + case SEEK_HOLE:
> > + if (offset >= i_size_read(inode)) {
> > + ret = -ENXIO;
> > + goto error;
> > + }
> > +
> > + ret = xfs_seek_extent(inode, &offset, origin);
> > + if (ret)
> > + goto error;
> > + }
> > +
> > + if (offset != file->f_pos)
> > + file->f_pos = offset;
>
> doing the offset update outside the case scrope doesn't make much sense.
>
> I'd probably just move the offset check and offset update into the
> low-level xfs_seek_data/xfs_seek_hole helpers - it's a tiny bit of
> duplication, but it keeps the functions self-contained and the
> top-level llseek method just dispatcher into the different routines.
Definitely.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2
2011-11-24 3:23 ` Dave Chinner
@ 2011-11-24 9:02 ` Christoph Hellwig
0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2011-11-24 9:02 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, Jeff Liu, Chris Mason, xfs
On Thu, Nov 24, 2011 at 02:23:31PM +1100, Dave Chinner wrote:
> > > + if (map[0].br_startblock == HOLESTARTBLOCK) {
> > > + if (map[1].br_startblock == HOLESTARTBLOCK) {
> > > + fsbno = map[1].br_startoff +
> > > + map[1].br_blockcount;
> >
> > I don't think this code is reachable - xfs_bmapi will never produce
> > multiple consecutive HOLESTARTBLOCK extents.
>
> If the hole would overflow br_blockcount (32 bits of FSB units, 16TB
> by default), then we should get multiple consecutive hole records
> returned.
Right, the XFS_FILBLKS_MIN in xfs_bmapi_read will limit it, and we'll
it the same case again in the loop. So yes, we'll need it; and we
should have a test to verify this case.
> > This also means that we never have to loop here until we add dirty
> > unwritten probing - if the second extent doesn't contain data there
> > won't be any other data extent in this file beyound our offset.
>
> I think that's incorrect. A data extent is limited in length by the
> on disk format (21 bits of FSB, 8GB in 4k block fs), so if you've
> got more than 8GB of data or the file is fragmented after the
> current extent then we can still get back multiple data extents
> before we find the next hole.
Indeed. Add fragmented file to what we need to test in QA..
> >
> > I think just checking for br_state == XFS_EXT_NORM should be fine here,
> > as unwritten extents can be delayed allocated.
>
> Can they? I'm pretty sure delalloc and unwritten are mutually
> exclusive states for an extent.
Yes, they _can't_. That was a typo, the rest of the setentence wouldn't
make sense if that was allowed.
> > > + if (ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
> > > + ip->i_d.di_format != XFS_DINODE_FMT_BTREE &&
> > > + ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> > > + return XFS_ERROR(EINVAL);
> >
> > I'd recommend moving this check into xfs_file_llseek and even do it
> > for the normal lseek requests - it's another sanity check for corrupted
> > filesystems which makes sense everywhere. I also think the return value
> > should be EFSCORRUPTED.
> >
> > Also XFS_DINODE_FMT_LOCAL isn't valid for regular files (yet) so it
> > shouldn't be tested for.
>
> I don't think it is necessary at all - the low level bmap functions
> already do these checks.
Indeed, although xfs_bmap_first_unused also allows XFS_DINODE_FMT_LOCAL
format, but I think that is fine.
> > > + lock = xfs_ilock_map_shared(ip);
> > > +
> > > + if (XFS_FORCED_SHUTDOWN(mp)) {
> > > + error = EIO;
> > > + goto out_lock;
> > > + }
> >
> > The shutdown check probably should go into the common lseek code and
> > done for all cases.
>
> I think the low level functions also do this check so it's not
> explicitly needed here, anyway.
xfs_bmapi_read does it, xfs_bmap_first_unused lacks it. And returning
an error ASAP on a normal lseek for the normal lseek cases also makes
a lot of sense.
> >
> > > +
> > > + XFS_STATS_INC(xs_blk_mapr);
> >
> > I don't think this counter should be incremented here.
>
> It's done in the lower layer functions, so shouldn't be here.
It is for xfs_bmapi_read, it isn't for xfs_bmap_first_unused, and then
again it really shouldn't either - it's a counter for xfs_bmapi_read
calls.
> > Now that just the locking and the xfs_iread_extents call is left in
> > this function I'd suggest to remove it and instead add duplicates
> > of the locking and xfs_iread_extents into xfs_seek_hole and
> > xfs_seek_data.
>
> Actually, it just turns into "lock, call seek/data fucntion, unlock",
> so it can probaly go away entirely.
That's what I tried to imply with the above comment.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2011-11-24 9:02 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-22 8:19 [PATCH] Introduce SEEK_DATA/SEEK_HOLE to XFS V2 Jeff Liu
2011-11-22 8:30 ` Jeff Liu
2011-11-23 9:40 ` Christoph Hellwig
2011-11-23 14:00 ` Jeff Liu
2011-11-24 3:23 ` Dave Chinner
2011-11-24 9:02 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox