From: Tao Ma <tao.ma@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: sandeen@sandeen.net, linux-kernel@vger.kernel.org,
xfs@oss.sgi.com, Alex Elder <aelder@sgi.com>,
Christoph Hellwig <hch@lst.de>, "tao.ma" <tao.ma@oracle.com>
Subject: Re: [PATCH v2] xfs: Make fiemap works with sparse file.
Date: Mon, 14 Jun 2010 13:53:11 +0800
Message-ID: <4C15C3C7.5090706@oracle.com>
In-Reply-To: <20100614002705.GA6590@dastard>
On 06/14/2010 08:27 AM, Dave Chinner wrote:
> On Sat, Jun 12, 2010 at 10:08:15AM +0800, Tao Ma wrote:
>> In xfs_vn_fiemap, we set bmv_count to fi_extent_max + 1 and want
>> to return fi_extent_max extents, but actually it won't work for
>> a sparse file.
>
> Define "won't work". i.e. what's the test case? I just created a
> sparse file and checked it, and it reported all the extents in it:
>
> # xfs_bmap -vp testfile
> testfile:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
> 0: [0..7]: hole 8
> 1: [8..15]: 96..103 0 (96..103) 8 00000
> 2: [16..23]: hole 8
> 3: [24..31]: 112..119 0 (112..119) 8 00000
> 4: [32..39]: hole 8
> 5: [40..47]: 128..135 0 (128..135) 8 00000
> 6: [48..55]: hole 8
> 7: [56..63]: 144..151 0 (144..151) 8 00000
> 8: [64..71]: hole 8
> 9: [72..79]: 160..167 0 (160..167) 8 00000
> 10: [80..87]: hole 8
> 11: [88..95]: 176..183 0 (176..183) 8 00000
> 12: [96..103]: hole 8
> 13: [104..111]: 192..199 0 (192..199) 8 00000
> 14: [112..119]: hole 8
> 15: [120..127]: 208..215 0 (208..215) 8 00000

OK, let me explain. In commit
2d1ff3c75a4642062d314634290be6d8da4ffb03, I added the extent-count
query mode of fiemap for xfs. For your test file, that query reports
8 extents (because xfs_fiemap_format does not return holes). So, quite
naturally, a user then tries to fetch all of the extents with:

    fiemap = malloc(sizeof(struct fiemap) + 8 * sizeof(struct fiemap_extent));
    fiemap->fm_extent_count = 8;

But what happens? He gets only 4 extents. Do you think that is
acceptable for a user? We told him the file has 8 extents and he
allocated enough space for them, yet he can't get what he asked for.
Instead he needs:

    fiemap = malloc(sizeof(struct fiemap) + 16 * sizeof(struct fiemap_extent));
    fiemap->fm_extent_count = 16;

to get all 8 extents of your test file.
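
To make the numbers above concrete, here is a minimal model of the
behaviour being described. It is not kernel code: `struct rec`,
`extents_returned` and `make_testfile` are hypothetical names, and the
only assumption taken from the thread is that holes occupy slots in the
'out' vector (sized from fm_extent_count) and are dropped afterwards.

```c
#include <stddef.h>

/* One record as the bmap layer hands it back: a hole or a real extent. */
struct rec {
    int is_hole;
};

/*
 * Model only: 'out' has room for fm_extent_count records (bmv_count
 * minus the header slot), holes consume slots in it, and the formatter
 * then skips the holes when reporting to the caller.
 */
static unsigned int extents_returned(const struct rec *file, size_t nrecs,
                                     unsigned int fm_extent_count)
{
    size_t slots = fm_extent_count;
    unsigned int returned = 0;

    for (size_t i = 0; i < nrecs && i < slots; i++) {
        if (!file[i].is_hole)
            returned++;
    }
    return returned;
}

/* Lay out the test file from the xfs_bmap listing: hole, extent, hole, ... */
static void make_testfile(struct rec *file, size_t nrecs)
{
    for (size_t i = 0; i < nrecs; i++)
        file[i].is_hole = (i % 2 == 0);
}
```

With the 16-record layout from the xfs_bmap listing, a request for 8
slots yields 4 extents in this model and a request for 16 slots yields
all 8, which matches the two malloc cases above.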
> # filefrag -v testfile
> Filesystem type is: 58465342
> File size of testfile is 65536 (16 blocks, blocksize 4096)
> ext logical physical expected length flags
> 0 1 12 1
> 1 3 14 12 1
> 2 5 16 14 1
> 3 7 18 16 1
> 4 9 20 18 1
> 5 11 22 20 1
> 6 13 24 22 1
> 7 15 26 24 1 eof
> testfile: 9 extents found
> #
>
> FWIW, filefrag seems busted - the file has 8 extents, not 9.
yeah, filefrag is really broken.
>
>> The reason is that in xfs_getbmap we will
>> calculate holes and set it in 'out', while out is malloced by
>> bmv_count(fi_extent_max+1) which didn't consider holes. So in the
>> worst case, if 'out' vector looks like
>> [hole, extent, hole, extent, hole, ... hole, extent, hole],
>> we will only return half of fi_extent_max extents.
>
> Right, it's not broken, we simply return less than fi_extent_max
> extents when there are holes. I don't see that as a problem as
> applications have to handle that case anyway, and....
See my test case above. I don't think we want a userspace program to
have to allocate num_extents * 2 + 1 fiemap_extent entries just to get
all of them.
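
For reference, the calling pattern a userspace program would normally
use looks roughly like the sketch below: one FS_IOC_FIEMAP call with
fm_extent_count set to 0 to learn the extent count, then a second call
with exactly that many slots. The helper name `map_file` is
hypothetical, and error handling is reduced to returning NULL.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/*
 * Return a malloc'ed struct fiemap describing 'path', or NULL on error.
 * The caller frees the result.
 */
static struct fiemap *map_file(const char *path)
{
    struct fiemap probe;
    struct fiemap *fm = NULL;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;

    /* Pass 1: fm_extent_count == 0 asks only for the number of extents. */
    memset(&probe, 0, sizeof(probe));
    probe.fm_length = FIEMAP_MAX_OFFSET;
    if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
        goto out;

    /* Pass 2: allocate exactly that many slots and ask for the extents. */
    fm = calloc(1, sizeof(*fm) +
                   probe.fm_mapped_extents * sizeof(struct fiemap_extent));
    if (!fm)
        goto out;
    fm->fm_length = FIEMAP_MAX_OFFSET;
    fm->fm_extent_count = probe.fm_mapped_extents;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        fm = NULL;
    }
out:
    close(fd);
    return fm;
}
```

A caller written this way expects fm_mapped_extents after the second
call to equal the count from the first (unless the file changed in
between), which is exactly the expectation a sparse file breaks here.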
>
>> So in xfs_vn_fiemap, we should consider this worst case. If the
>> user wants fi_extent_max extents, we need an 'out' with a size of
>> 2 * fi_extent_max + 2 (one more for the header).
>
> That's rather dangerous, I think. It relies on other code to catch
> the buffer overrun that this sets up for fragmented, non-sparse
> files. Personally I'd much prefer to return fewer extents for sparse
> files than to add a landmine like this into the kernel code....

We only change the size of our 'out' vector; we don't change
fi_extent_max or anything else related to fiemap. So I think what we
need to care about is keeping our 'out' in good shape, and the fiemap
code should check fi_extent_max itself if we pass it more extents.
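
The argument can be illustrated with a small model. This is a sketch
under assumptions, not the patch itself: `struct rec` and
`format_extents` are hypothetical names, the 'out' vector is sized for
the worst case of alternating holes and extents, and the formatter is
assumed to stop once it has emitted fi_extent_max extents.

```c
#include <stddef.h>

/* One record in the 'out' vector: a hole or a real extent. */
struct rec {
    int is_hole;
};

/*
 * Model only: size 'out' as 2 * fi_extent_max + 1 data slots (the
 * proposal adds one more slot for the header), skip holes, and enforce
 * the caller's limit while formatting.
 */
static unsigned int format_extents(const struct rec *file, size_t nrecs,
                                   unsigned int fi_extent_max)
{
    size_t out_slots = 2 * (size_t)fi_extent_max + 1;
    unsigned int emitted = 0;

    for (size_t i = 0; i < nrecs && i < out_slots; i++) {
        if (file[i].is_hole)
            continue;   /* holes are never reported to the caller */
        if (emitted == fi_extent_max)
            break;      /* never hand back more than was asked for */
        emitted++;
    }
    return emitted;
}
```

In this model a sparse file with 8 extents returns all 8 for
fi_extent_max = 8, and a fragmented non-sparse file still returns at
most 8, but only because the formatting loop checks the limit. That
check is the dependency on other code raised in the quoted objection.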

By the way, maybe there is a better solution to the problem I described
above. If there is a good one, I am happy to accept it.

Regards,
Tao