From: Tao Ma <tao.ma@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com, linux-kernel@vger.kernel.org,
sandeen@sandeen.net, Alex Elder <aelder@sgi.com>,
Christoph Hellwig <hch@lst.de>, "tao.ma" <tao.ma@oracle.com>
Subject: Re: [PATCH v2] xfs: Make fiemap works with sparse file.
Date: Mon, 14 Jun 2010 13:53:11 +0800
Message-ID: <4C15C3C7.5090706@oracle.com>
In-Reply-To: <20100614002705.GA6590@dastard>
On 06/14/2010 08:27 AM, Dave Chinner wrote:
> On Sat, Jun 12, 2010 at 10:08:15AM +0800, Tao Ma wrote:
>> In xfs_vn_fiemap, we set bmv_count to fi_extent_max + 1 and want
>> to return fi_extent_max extents, but actually it won't work for
>> a sparse file.
>
> Define "won't work". i.e. what's the test case? I just created a
> sparse file and checked it, and it reported all the extents in it:
>
> # xfs_bmap -vp testfile
> testfile:
>  EXT: FILE-OFFSET   BLOCK-RANGE AG AG-OFFSET    TOTAL FLAGS
>    0: [0..7]:       hole                            8
>    1: [8..15]:      96..103      0 (96..103)        8 00000
>    2: [16..23]:     hole                            8
>    3: [24..31]:     112..119     0 (112..119)       8 00000
>    4: [32..39]:     hole                            8
>    5: [40..47]:     128..135     0 (128..135)       8 00000
>    6: [48..55]:     hole                            8
>    7: [56..63]:     144..151     0 (144..151)       8 00000
>    8: [64..71]:     hole                            8
>    9: [72..79]:     160..167     0 (160..167)       8 00000
>   10: [80..87]:     hole                            8
>   11: [88..95]:     176..183     0 (176..183)       8 00000
>   12: [96..103]:    hole                            8
>   13: [104..111]:   192..199     0 (192..199)       8 00000
>   14: [112..119]:   hole                            8
>   15: [120..127]:   208..215     0 (208..215)       8 00000

OK, so let me explain. In commit
2d1ff3c75a4642062d314634290be6d8da4ffb03, I added the extent-query mode
of fiemap for xfs. With your test file, it reports that we have
8 extents (because xfs_fiemap_format doesn't return holes). So normally
and naturally, a user starts iterating over all the extents by doing

	fiemap = malloc(sizeof(*fiemap) + 8 * sizeof(struct fiemap_extent));
	fiemap->fm_extent_count = 8;

But what happens? He only gets 4 extents. Do you think that is
acceptable for a user? We told him that we have 8 extents and he
allocated enough space, but he can't get what he wanted. Instead he
needs

	fiemap = malloc(sizeof(*fiemap) + 16 * sizeof(struct fiemap_extent));
	fiemap->fm_extent_count = 16;

to get all 8 extents of your test file.

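For reference, the two-pass pattern I have in mind looks roughly like the
sketch below. It assumes the standard FS_IOC_FIEMAP ioctl from
<linux/fs.h> and <linux/fiemap.h>; the helper names fiemap_buf_size() and
fetch_extents() are made up for illustration and are not part of any
existing tool.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/* Bytes needed for a fiemap request with room for n extents. */
size_t fiemap_buf_size(uint32_t n)
{
	return sizeof(struct fiemap) + (size_t)n * sizeof(struct fiemap_extent);
}

/*
 * Hypothetical helper: map the whole file behind fd.
 * Returns a malloc'ed struct fiemap (caller frees it) or NULL on error.
 */
struct fiemap *fetch_extents(int fd)
{
	struct fiemap probe = { .fm_length = FIEMAP_MAX_OFFSET };
	struct fiemap *fm;

	/* Pass 1: fm_extent_count == 0 only asks how many extents exist. */
	if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
		return NULL;

	/* Pass 2: allocate exactly that many slots and ask for the extents. */
	fm = calloc(1, fiemap_buf_size(probe.fm_mapped_extents));
	if (!fm)
		return NULL;
	fm->fm_length = FIEMAP_MAX_OFFSET;
	fm->fm_extent_count = probe.fm_mapped_extents;
	if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
		free(fm);
		return NULL;
	}
	return fm;
}
```

After the second call, fm->fm_mapped_extents says how many slots were
actually filled. Comparing it with the count from the first pass is
enough to see the mismatch on a sparse file.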
> # filefrag -v testfile
> Filesystem type is: 58465342
> File size of testfile is 65536 (16 blocks, blocksize 4096)
>  ext logical physical expected length flags
>    0       1       12               1
>    1       3       14       12      1
>    2       5       16       14      1
>    3       7       18       16      1
>    4       9       20       18      1
>    5      11       22       20      1
>    6      13       24       22      1
>    7      15       26       24      1 eof
> testfile: 9 extents found
> #
>
> FWIW, filefrag seems busted - the file has 8 extents, not 9.
Yeah, filefrag is really broken.
>
>> The reason is that in xfs_getbmap we will
>> calculate holes and set it in 'out', while out is malloced by
>> bmv_count(fi_extent_max+1) which didn't consider holes. So in the
>> worst case, if 'out' vector looks like
>> [hole, extent, hole, extent, hole, ... hole, extent, hole],
>> we will only return half of fi_extent_max extents.
>
> Right, it's not broken, we simply return less than fi_extent_max
> extents when there are holes. I don't see that as a problem as
> applications have to handle that case anyway, and....
See my test case above. I guess we really don't want a userspace user to
have to allocate num_extents * 2 + 1 fiemap_extent structures to get
them all.
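To make the arithmetic concrete, here is a toy model (plain C, not the
kernel code) of how holes eat slots in 'out'. It assumes the worst-case
layout from the patch description, a strict hole/extent alternation that
starts with a hole, and that slot 0 of 'out' is the header. The function
name extents_seen() is made up for this sketch.

```c
#include <stddef.h>

/*
 * Toy model: the file is hole, extent, hole, extent, ... with
 * file_extents real extents. Every record, hole or extent, takes one
 * slot of 'out', and slot 0 is the header, so only bmv_count - 1 slots
 * carry records. Returns how many real extents fit.
 */
size_t extents_seen(size_t bmv_count, size_t file_extents)
{
	size_t slots = bmv_count ? bmv_count - 1 : 0;
	size_t records = 2 * file_extents;  /* one hole before each extent */
	size_t used = slots < records ? slots : records;

	return used / 2;  /* every second record is a real extent */
}
```

With fi_extent_max = 8, bmv_count is 9 and extents_seen(9, 8) gives 4.
Asking for 16 gives bmv_count = 17 and all 8 extents, and sizing 'out'
as 2 * 8 + 2 = 18 gives the same 8 without the user over-allocating.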
>
>> So in xfs_vn_fiemap, we should consider this worst case. If the
>> user wants fi_extent_max extents, we need a 'out' with size of
>> 2 * fi_extent_max + 2 (one more for the header).
>
> That's rather dangerous, I think. It relies on other code to catch
> the buffer overrun that this sets up for fragmented, non-sparse
> files. Personally I'd much prefer to return fewer extents for sparse
> files than to add a landmine like this into the kernel code....
We only change the size of our 'out'; we don't change fi_extent_max or
anything else related to fiemap. So I think what we need to care about is
keeping our 'out' in good shape, and fiemap should check its own
fi_extent_max if we pass it more extents.
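As a rough illustration of that separation, the sketch below models a
formatting loop that walks a larger 'out', skips holes, and stops once
fi_extent_max extents have been handed over. This is a simplified model
under my own assumptions, not the actual xfs_getbmap or
xfs_fiemap_format code; struct bmap_rec and format_records() are
hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for one record in 'out'. */
struct bmap_rec {
	int is_hole;
};

/*
 * Walk nrecs records, skip holes, and stop as soon as fi_extent_max
 * extents have been passed on. Returns the number of extents passed on,
 * which never exceeds fi_extent_max however large 'out' is.
 */
size_t format_records(const struct bmap_rec *out, size_t nrecs,
		      size_t fi_extent_max)
{
	size_t copied = 0;

	for (size_t i = 0; i < nrecs && copied < fi_extent_max; i++) {
		if (out[i].is_hole)
			continue;  /* holes are not reported to the user */
		copied++;
	}
	return copied;
}
```

In this model a fragmented, non-sparse file that fills all 17 records
with real extents still yields only 8 for fi_extent_max = 8, so the
limit is enforced where the extents are handed over, not by the size of
'out'.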
BTW, maybe there is a better solution for the problem I described above.
If there is a good one, I am happy to accept it.
Regards,
Tao