From: Tao Ma <tao.ma@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: sandeen@sandeen.net, linux-kernel@vger.kernel.org,
	xfs@oss.sgi.com, Alex Elder <aelder@sgi.com>,
	Christoph Hellwig <hch@lst.de>, "tao.ma" <tao.ma@oracle.com>
Subject: Re: [PATCH v2] xfs: Make fiemap works with sparse file.
Date: Mon, 14 Jun 2010 13:53:11 +0800
Message-ID: <4C15C3C7.5090706@oracle.com>
In-Reply-To: <20100614002705.GA6590@dastard>



On 06/14/2010 08:27 AM, Dave Chinner wrote:
> On Sat, Jun 12, 2010 at 10:08:15AM +0800, Tao Ma wrote:
>> In xfs_vn_fiemap, we set bmv_count to fi_extent_max + 1 and want
>> to return fi_extent_max extents, but actually it won't work for
>> a sparse file.
>
> Define "won't work". i.e. what's the test case?  I just created a
> sparse file and checked it, and it reported all the extents in it:
>
> # xfs_bmap -vp testfile
> testfile:
>   EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>     0: [0..7]:          hole                                     8
>     1: [8..15]:         96..103           0 (96..103)            8 00000
>     2: [16..23]:        hole                                     8
>     3: [24..31]:        112..119          0 (112..119)           8 00000
>     4: [32..39]:        hole                                     8
>     5: [40..47]:        128..135          0 (128..135)           8 00000
>     6: [48..55]:        hole                                     8
>     7: [56..63]:        144..151          0 (144..151)           8 00000
>     8: [64..71]:        hole                                     8
>     9: [72..79]:        160..167          0 (160..167)           8 00000
>    10: [80..87]:        hole                                     8
>    11: [88..95]:        176..183          0 (176..183)           8 00000
>    12: [96..103]:       hole                                     8
>    13: [104..111]:      192..199          0 (192..199)           8 00000
>    14: [112..119]:      hole                                     8
>    15: [120..127]:      208..215          0 (208..215)           8 00000
OK, let me explain. In commit
2d1ff3c75a4642062d314634290be6d8da4ffb03, I added the extent query
mode of fiemap for xfs. With your test file, it reports that we
have 8 extents (because xfs_fiemap_format doesn't return holes). So,
naturally, a user then starts to iterate over all the extents by doing

fiemap = malloc(sizeof(*fiemap) + 8 * sizeof(struct fiemap_extent));
fiemap->fm_extent_count = 8;

But what happens? He only gets 4 extents. Do you think that is
acceptable for a user? We told him that the file has 8 extents and he
allocated enough space, but he can't get what he wanted. Instead he needs
fiemap = malloc(sizeof(*fiemap) + 16 * sizeof(struct fiemap_extent));
fiemap->fm_extent_count = 16;
to get the 8 extents of your test file.
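
The two-step usage described above can be sketched in userspace C roughly as follows. This is only a minimal sketch, not code from the patch: the helper names fiemap_alloc_size, count_extents and fetch_extents are made up for illustration, and it assumes the standard FS_IOC_FIEMAP ioctl from <linux/fs.h>, where a call with fm_extent_count == 0 only fills in fm_mapped_extents.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/* Bytes needed for a fiemap header followed by n extent slots.
 * Note sizeof(struct fiemap), not sizeof of a pointer variable. */
size_t fiemap_alloc_size(unsigned int n)
{
	return sizeof(struct fiemap) + (size_t)n * sizeof(struct fiemap_extent);
}

/* Step 1 (hypothetical helper): query mode. With fm_extent_count == 0
 * the kernel only reports how many extents exist. Returns -1 on error. */
long count_extents(int fd)
{
	struct fiemap fm;

	memset(&fm, 0, sizeof(fm));
	fm.fm_length = FIEMAP_MAX_OFFSET;	/* map the whole file */
	fm.fm_extent_count = 0;
	if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
		return -1;
	return (long)fm.fm_mapped_extents;
}

/* Step 2 (hypothetical helper): allocate room for n extents and fetch
 * them. The caller frees the result and must check fm_mapped_extents,
 * which may be smaller than n. Returns NULL on error. */
struct fiemap *fetch_extents(int fd, unsigned int n)
{
	struct fiemap *fm = calloc(1, fiemap_alloc_size(n));

	if (!fm)
		return NULL;
	fm->fm_length = FIEMAP_MAX_OFFSET;
	fm->fm_extent_count = n;
	if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
		free(fm);
		return NULL;
	}
	return fm;
}
```

A caller would pass the result of count_extents() as n to fetch_extents(); the complaint in this thread is that, for the sparse test file, the second call comes back with fm_mapped_extents smaller than the count the first call reported.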

> # filefrag -v testfile
> Filesystem type is: 58465342
> File size of testfile is 65536 (16 blocks, blocksize 4096)
>   ext logical physical expected length flags
>     0       1       12               1
>     1       3       14       12      1
>     2       5       16       14      1
>     3       7       18       16      1
>     4       9       20       18      1
>     5      11       22       20      1
>     6      13       24       22      1
>     7      15       26       24      1 eof
> testfile: 9 extents found
> #
>
> FWIW, filefrag seems busted - the file has 8 extents, not 9.
yeah, filefrag is really broken.
>
>> The reason is that in xfs_getbmap we will
>> calculate holes and set it in 'out', while out is malloced by
>> bmv_count(fi_extent_max+1) which didn't consider holes. So in the
>> worst case, if 'out' vector looks like
>> [hole, extent, hole, extent, hole, ... hole, extent, hole],
>> we will only return half of fi_extent_max extents.
>
> Right, it's not broken, we simply return less than fi_extent_max
> extents when there are holes. I don't see that as a problem as
> applications have to handle that case anyway, and....
See my test case above. I guess we really don't want a userspace program
to have to allocate num_extents * 2 + 1 fiemap_extent entries to get them.
>
>> So in xfs_vn_fiemap, we should consider this worst case. If the
>> user wants fi_extent_max extents, we need an 'out' with a size of
>> 2 * fi_extent_max + 2 (one more for the header).
>
> That's rather dangerous, I think. It relies on other code to catch
> the buffer overrun that this sets up for fragmented, non-sparse
> files. Personally I'd much prefer to return fewer extents for sparse
> files than to add a landmine like this into the kernel code....
We just change the size of our 'out'; we don't change fi_extent_max or
anything else related to fiemap. So I think what we care about is keeping
our 'out' in good shape, and fiemap should handle and check
fi_extent_max itself if we pass it more extents.
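
To make the arithmetic concrete, here is a toy model of the worst case. It is not the xfs_getbmap code: extents_returned and worst_case_bmv_count are hypothetical helpers, and the model simply assumes that 'out' holds one header slot plus bmv_count - 1 entries laid out as [hole, extent, hole, extent, ...], with the fiemap side capping the result at fi_extent_max.

```c
/* Toy model (assumption, not kernel code): 'out' has bmv_count slots,
 * the first one is the header, and the remaining slots alternate
 * hole, extent, hole, extent, ... as in the worst case described above.
 * Returns how many real extents reach the user. */
unsigned int extents_returned(unsigned int bmv_count,
			      unsigned int fi_extent_max)
{
	unsigned int slots = bmv_count > 0 ? bmv_count - 1 : 0;
	unsigned int extents = 0;
	unsigned int i;

	for (i = 0; i < slots && extents < fi_extent_max; i++) {
		int is_hole = (i % 2 == 0);	/* even slots are holes */

		if (!is_hole)
			extents++;	/* only real extents are reported */
	}
	return extents;
}

/* Size of 'out' proposed in the patch: room for fi_extent_max extents,
 * fi_extent_max + 1 holes around them, and one header slot. */
unsigned int worst_case_bmv_count(unsigned int fi_extent_max)
{
	return 2 * fi_extent_max + 2;
}
```

With fi_extent_max = 8, extents_returned(8 + 1, 8) gives 4, matching the test case above, while extents_returned(worst_case_bmv_count(8), 8) gives the full 8. The cap on fi_extent_max in the loop stands in for the check that the fiemap side is expected to do.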

BTW, maybe there is a better solution to the problem I described above.
If there is a good one, I am happy to accept it.

Regards,
Tao

