linux-ext4.vger.kernel.org archive mirror
* extent counting fun
@ 2010-07-05 19:24 Eric Sandeen
  2010-07-05 23:46 ` Andreas Dilger
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Sandeen @ 2010-07-05 19:24 UTC (permalink / raw)
  To: ext4 development

[root@host ~]# filefrag -B /mnt/test/file
/mnt/test/file: 34 extents found
[root@host ~]# filefrag /mnt/test/file
/mnt/test/file: 1058 extents found, perfection would be 1 extent

Hum, is it 34 or 1058? :)

Older filefrag counted contiguous metadata as part of a contiguous
extent... newer filefrag works in fiemap query-only mode by default,
and just takes what fiemap tells it.  The inconsistency is weird
though, and led to a Red Hat bug that I'm inclined to NOTABUG... but
do people think this needs to be made any more consistent?

Should we hack ext3_fiemap() to include the checks for contiguous
metadata?  Or was that too shady/clever to start with ...? :)

-Eric


* Re: extent counting fun
  2010-07-05 19:24 extent counting fun Eric Sandeen
@ 2010-07-05 23:46 ` Andreas Dilger
  2010-07-06  0:50   ` Eric Sandeen
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Dilger @ 2010-07-05 23:46 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: ext4 development

On 2010-07-05, at 13:24, Eric Sandeen wrote:
> [root@host ~]# filefrag -B /mnt/test/file
> /mnt/test/file: 34 extents found
> [root@host ~]# filefrag /mnt/test/file
> /mnt/test/file: 1058 extents found, perfection would be 1 extent
> 
> Hum, is it 34 or 1058? :)

What do the extents look like on disk?  Is this just because it is running on a block-mapped file and is skipping a singleton block periodically for indirect blocks, or is there a bug in the way the extents are being reported?

> Older filefrag counted contiguous metadata as part of a contiguous
> extent... newer filefrag works in fiemap query-only mode by default,
> and just takes what fiemap tells it.  The inconsistency is weird
> though, and led to a Red Hat bug that I'm inclined to NOTABUG... but
> do people think this needs to be made any more consistent?
> 
> Should we hack ext3_fiemap() to include the checks for contiguous
> metadata?  Or was that too shady/clever to start with ...? :)

I wouldn't object to having FIEMAP add a flag for metadata blocks.  I've always thought it would be useful to be able to query/dump metadata blocks (e.g. indirect/index blocks) and the inode itself.

Cheers, Andreas


* Re: extent counting fun
  2010-07-05 23:46 ` Andreas Dilger
@ 2010-07-06  0:50   ` Eric Sandeen
  2010-07-09 21:10     ` Andreas Dilger
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Sandeen @ 2010-07-06  0:50 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: ext4 development

Andreas Dilger wrote:
> On 2010-07-05, at 13:24, Eric Sandeen wrote:
>> [root@host ~]# filefrag -B /mnt/test/file
>> /mnt/test/file: 34 extents found
>> [root@host ~]# filefrag /mnt/test/file
>> /mnt/test/file: 1058 extents found, perfection would be 1 extent
>>
>> Hum, is it 34 or 1058? :)
> 
> What do the extents look like on disk?  Is this just because it is running on a block-mapped file and is skipping a singleton block periodically for indirect blocks, or is there a bug in the way the extents are being reported?

Well, I didn't actually look but I'm 98% sure it's just because it's
not reporting the interspersed metadata blocks.

Sorry, above was on ext3, that wasn't clear, just a stock dd-streamed
file.

>> Older filefrag counted contiguous metadata as part of a contiguous
>> extent... newer filefrag works in fiemap query-only mode by default,
>> and just takes what fiemap tells it.  The inconsistency is weird
>> though, and led to a Red Hat bug that I'm inclined to NOTABUG... but
>> do people think this needs to be made any more consistent?
>>
>> Should we hack ext3_fiemap() to include the checks for contiguous
>> metadata?  Or was that too shady/clever to start with ...? :)
> 
> I wouldn't object to having FIEMAP add a flag for metadata blocks.  I've always thought it would be useful to be able to query/dump metadata blocks (e.g. indirect/index blocks) and the inode itself.

Hm don't we have that already?

Hmm... just xattr I guess.

In any case it's still a question of whether ext3 extent count should
be "fudged" to make blocks separated by metadata look contiguous
or not ...

-Eric


* Re: extent counting fun
  2010-07-06  0:50   ` Eric Sandeen
@ 2010-07-09 21:10     ` Andreas Dilger
  0 siblings, 0 replies; 4+ messages in thread
From: Andreas Dilger @ 2010-07-09 21:10 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: ext4 development

On 2010-07-05, at 18:50, Eric Sandeen wrote:
> Andreas Dilger wrote:
>> What do the extents look like on disk?  Is this just because it is running on a block-mapped file and is skipping a singleton block periodically for indirect blocks, or is there a bug in the way the extents are being reported?
> 
> Well, I didn't actually look but I'm 98% sure it's just because it's not reporting the interspersed metadata blocks.
> 
> Sorry, above was on ext3, that wasn't clear, just a stock dd-streamed file.
> 
>>> I wouldn't object to having FIEMAP add a flag for metadata blocks.  I've always thought it would be useful to be able to query/dump metadata blocks (e.g. indirect/index blocks) and the inode itself.
> 
> Hm don't we have that already?  Hmm... just xattr I guess.

Right.

> In any case it's still a question of whether ext3 extent count should be "fudged" to make blocks separated by metadata look contiguous or not ...

Good question.  I don't know if there is any way to do this easily or correctly at the FIEMAP level, since it IS correctly returning that the file data is not contiguous.  I would be against having the FIEMAP ioctl return an inaccurate extent count just for this, since it would break applications that call it first to find the number of extents to be returned, then call it again and are surprised when there are more extents than they were expecting (i.e. they end up skipping the end of the file).
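
A minimal sketch of that count-then-fetch pattern, for illustration only. The helper names (build_request, parse_reply, list_extents) are made up; the ioctl number and struct layouts are assumed to match linux/fiemap.h on common Linux ABIs, and there is no handling for filesystems that lack FIEMAP or for files that change between the two calls.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B  # assumed value of _IOWR('f', 11, struct fiemap)
# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved
FIEMAP_HDR = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3]
FIEMAP_EXTENT = struct.Struct("=QQQ2QI3I")
FIEMAP_MAX_OFFSET = 0xFFFFFFFFFFFFFFFF


def build_request(extent_count):
    """Pack a header for the whole file, plus room for extent_count extents."""
    hdr = FIEMAP_HDR.pack(0, FIEMAP_MAX_OFFSET, 0, 0, extent_count, 0)
    return bytearray(hdr + bytes(FIEMAP_EXTENT.size * extent_count))


def parse_reply(buf):
    """Return (logical, physical, length, flags) for each extent in buf."""
    mapped = FIEMAP_HDR.unpack_from(buf, 0)[3]
    capacity = (len(buf) - FIEMAP_HDR.size) // FIEMAP_EXTENT.size
    extents = []
    for i in range(min(mapped, capacity)):
        offset = FIEMAP_HDR.size + i * FIEMAP_EXTENT.size
        fields = FIEMAP_EXTENT.unpack_from(buf, offset)
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def fiemap(fd, extent_count):
    """Issue one FIEMAP call; the kernel fills the buffer in place."""
    buf = build_request(extent_count)
    fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
    return buf


def list_extents(path):
    with open(path, "rb") as f:
        # Call 1: fm_extent_count == 0 asks only for the number of extents.
        count = FIEMAP_HDR.unpack_from(fiemap(f.fileno(), 0), 0)[3]
        # Call 2: the buffer is sized from that count, so an understated
        # count would silently drop the extents at the end of the file.
        return parse_reply(fiemap(f.fileno(), count))
```

The second buffer is sized purely from the first call's answer, which is why a "fudged" count from the kernel would be a problem for callers written this way.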

At the filefrag level I can understand that it already knows intimate details of the on-disk layout for FIBMAP, so it wouldn't be terrible to have it "fudge" the extent reporting if there is a one-block gap every $BLOCKSIZE/4 blocks in the file data.  That would mean filefrag could not just "count" the extents, but would need to get the full extent data and process it for block-mapped files.
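
A rough sketch of what such a filefrag-side fudge could look like. This is not what filefrag actually does: the function name coalesce_indirect_gaps and the max_gap parameter are made up, extents are assumed to be (logical, physical, length) tuples in filesystem blocks sorted by logical offset, and the sample layout is invented. Like the FIBMAP heuristic, it only guesses that a skipped block is an indirect block.

```python
def coalesce_indirect_gaps(extents, max_gap=1):
    """Merge logically contiguous extents separated by a small physical gap.

    extents: iterable of (logical, physical, length) in filesystem blocks,
    sorted by logical offset.  Returns a new list of merged extents whose
    length counts data blocks only (skipped blocks are not included).
    """
    merged = []
    phys_end = None  # first physical block past the current merged run
    for logical, physical, length in extents:
        if merged:
            m_logical, m_physical, m_length = merged[-1]
            gap = physical - phys_end
            if logical == m_logical + m_length and 0 <= gap <= max_gap:
                # Treat the skipped block(s) as interleaved metadata.
                merged[-1] = (m_logical, m_physical, m_length + length)
                phys_end = physical + length
                continue
        merged.append((logical, physical, length))
        phys_end = physical + length
    return merged


# Invented layout: one physical block is skipped before each of the
# second and third extents, then the file jumps elsewhere on disk.
sample = [(0, 1000, 12), (12, 1013, 1024), (1036, 2038, 1024), (2060, 9000, 100)]
fudged = coalesce_indirect_gaps(sample)
```

Here the four raw extents collapse to two, because the first three are only separated by single skipped blocks. Note that this needs the full extent list, not just the count.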

Actually requesting the metadata via a new FIEMAP_FLAG_METADATA in the request, and marking the returned extents with FIEMAP_EXTENT_METADATA would avoid the guesswork, since even filefrag with FIBMAP is not really sure that the indirect blocks are in the holes, it is just guessing.


In my previous postings on this topic, I thought it would make sense
to have the "extent" of e.g. an index block be the logical range
of the file that this index block covers.  The inode itself might be
marked with FIEMAP_EXTENT_INODE|FIEMAP_EXTENT_METADATA and cover the
entire logical range of the file.

This would essentially allow the caller to generate a tree of the
metadata by hierarchical ordering of the extent ranges.

FIEMAP_EXTENT_INODE    [  0-  10GB] [itable blk offset, +inode size]
  FIEMAP_EXTENT_METADATA [  0-   1GB] [index blk offset, +blocksize]
    [data, no flag]      [  0- 128MB] [data offset, +128MB]
    [data, no flag]      [128- 256MB] [data offset, +128MB]
     :
     :
    [data, no flag]      [896-1024MB] [data offset, +128MB]
  FIEMAP_EXTENT_METADATA [1GB-   2GB] [index blk offset, +blocksize]
    [data, no flag]      [1024-1152MB] [data offset, +128MB]
     :
     :
  (etc)
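
To illustrate the ordering idea: if each returned extent carried its logical range, a caller could rebuild the tree with a simple stack, nesting every extent under the nearest preceding metadata extent that covers it. The flags are the ones proposed in this mail, not existing FIEMAP flags; their numeric values, the (flags, start, end) tuple shape and build_tree itself are invented for this sketch.

```python
# Hypothetical values for the proposed flags; these are not defined
# in linux/fiemap.h.
FIEMAP_EXTENT_METADATA = 0x1
FIEMAP_EXTENT_INODE = 0x2


def build_tree(extents):
    """Nest extents by logical range.

    extents: (flags, logical_start, logical_end) tuples, ordered so that
    a covering metadata extent comes before the extents it covers.
    Returns the list of root nodes; each node is a dict with "flags",
    "start", "end" and "children".
    """
    roots = []
    stack = []  # chain of metadata nodes enclosing the current position
    for flags, start, end in extents:
        node = {"flags": flags, "start": start, "end": end, "children": []}
        # Drop metadata nodes whose range does not cover this extent.
        while stack and not (stack[-1]["start"] <= start
                             and end <= stack[-1]["end"]):
            stack.pop()
        (stack[-1]["children"] if stack else roots).append(node)
        if flags & FIEMAP_EXTENT_METADATA:
            stack.append(node)
    return roots


# A cut-down version of the layout above, with ranges in MB.
layout = [
    (FIEMAP_EXTENT_INODE | FIEMAP_EXTENT_METADATA, 0, 10240),
    (FIEMAP_EXTENT_METADATA, 0, 1024),
    (0, 0, 128),
    (0, 128, 256),
    (FIEMAP_EXTENT_METADATA, 1024, 2048),
    (0, 1024, 1152),
]
tree = build_tree(layout)
```

The result is a single inode node with two index-block children, each holding the data extents that fall inside its range.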

Cheers, Andreas
