Re: ext4: Used block count in df

From: Eric Sandeen <sandeen@redhat.com>
To: Adil Mujeeb <mujeeb.adil@gmail.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: ext4: Used block count in df
Date: Tue, 12 Feb 2013 10:01:30 -0600
Message-ID: <511A675A.8050004@redhat.com>
In-Reply-To: <CANBXnMkKA7qjMoO-yEGhbyKYQMRzvVA+GExg_KGCWPEHNVnyDQ@mail.gmail.com>

On 2/12/13 12:14 AM, Adil Mujeeb wrote:
> Hi,
> 
>> My only point is, default ext4 statfs behavior is quite complicated, and it
>> looks like you have found a bug related to the calculation of metadata overhead.
> 
> I see.
> Where should I report this issue to get it confirmed by the developers?

Here is fine.  :)

It would be good to file a bug on bugzilla.kernel.org too if you like.

The problem is, I think ext4's metadata behavior has gotten so complex,
the consensus so far seems to be to just accept the inaccuracy in this
style of df reporting:

 * Note: calculating the overhead so we can be compatible with
 * historical BSD practice is quite difficult in the face of
 * clusters/bigalloc.  This is because multiple metadata blocks from
 * different block group can end up in the same allocation cluster.
 * Calculating the exact overhead in the face of clustered allocation
 * requires either O(all block bitmaps) in memory or O(number of block
 * groups**2) in time.  We will still calculate the superblock for
 * older file systems --- and if we come across with a bigalloc file
 * system with zero in s_overhead_clusters the estimate will be close to
 * correct ...

but it is odd behavior, and filing a bug would probably be good.

-Eric

>> It should only be a reporting issue, and should not cause any runtime issues.
> 
> OK, I understand.
> 
> Thanks,
> Adil
> 
> On Mon, Feb 11, 2013 at 11:02 PM, Eric Sandeen <sandeen@redhat.com> wrote:
>> On 2/11/13 12:36 AM, Adil Mujeeb wrote:
>>> Thanks Eric.
>>>
>>>>> I have an observation on EXT4 filesystem. I created filesystem of size
>>>>> 1TB, 4TB, and 7TB and then checked the output of df command.
>>>>
>>>> Telling us which version of e2fsprogs and which kernel would be helpful,
>>>> but:
>>>
>>> It's 1.41.12.
>>>
>>>> It reserves blocks for the superuser (5% by default) and also uses a lot
>>>> of blocks up-front for filesystem metadata - inode tables, block bitmaps,
>>>> and the like.
>>>
>>> I think so too. But under that assumption, the number of used 1KB
>>> blocks should increase as the filesystem size increases, no?
>>>
>>>>
>>>> But what you are seeing here is this:
>>>>
>>>> It also defaults to "bsd df" which does not count filesystem
>>>> metadata when telling you about the number of blocks used.  So in theory,
>>>> a freshly made fs should actually tell you 0 blocks used, I think.
>>>
>>> Agreed, if "bsd df" assumes so.
>>>
>>>> Looking at the dumpe2fs output for the 4t file, I see:
>>>>
>>>> # dumpe2fs -h 4tfile-ext4 | grep -i block
>>>> dumpe2fs 1.41.12 (17-May-2010)
>>>> Block count:              1073741824
>>>> Reserved block count:     53687091
>>>> Free blocks:              1056843748
>>>> ...
>>>>
>>>> and 1073741824-1056843748 is 16898076 4k blocks, or 67592304 1k blocks
>>>> actually used.
>>>>
>>>> If we ask for "minix df" by mounting with -o minixdf, which reports the
>>>> true number of blocks used, we get:
>>>>
>>>> # df 4t-ext4/
>>>> Filesystem           1K-blocks      Used Available Use% Mounted on
>>>> /mnt/test2/mkfs-test/4tfile-ext4
>>>>                      4294967296  67592304 4012626628   2% /mnt/test2/mkfs-test/4t-ext4
>>>>
>>>> I'd say this appears to be a slight inaccuracy in ext4_statfs, coupled with
>>>> the strangeness of the "bsd df" reporting.  It is apparently miscalculating
>>>> the filesystem metadata "overhead."
>>>
>>> In your example, dumpe2fs and minix df are both reporting the same value,
>>> aren't they?
>>>
>>> I am still not able to understand why increasing the filesystem size
>>> decreases the used 1K block count :(
>>> Am I missing something basic here? Sorry if I am not able to catch
>>> your point :(
>>
>> My only point is, default ext4 statfs behavior is quite complicated, and it
>> looks like you have found a bug related to the calculation of metadata overhead.
>>
>> It should only be a reporting issue, and should not cause any runtime issues.
>>
>> Thanks,
>> -Eric
>>
>>> Regards,
>>> Adil
>>
>>
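
The arithmetic in the quoted dumpe2fs and df output can be checked with a short sketch. This is only an illustration, not the kernel's code: the function names `df_figures` and `df_figures_for_path` are made up for this example, and the `overhead_blocks` parameter is a simplified stand-in for the metadata overhead that the default "bsd df" mode subtracts from the total (with -o minixdf nothing is subtracted). The second call assumes, hypothetically, an overhead figure that exactly matches the metadata on a freshly made filesystem.

```python
import os


def df_figures(block_count, free_blocks, reserved_blocks,
               block_size=4096, overhead_blocks=0):
    """Return (total, used, available) in 1K blocks, the way df prints them.

    overhead_blocks models the metadata overhead hidden by "bsd df";
    leave it at 0 to model a mount with -o minixdf.
    """
    scale = block_size // 1024                 # filesystem blocks -> 1K blocks
    total = (block_count - overhead_blocks) * scale
    free = free_blocks * scale
    used = total - free                        # df shows total minus free
    available = free - reserved_blocks * scale # superuser reserve is excluded
    return total, used, available


def df_figures_for_path(path):
    """Same three figures for a mounted filesystem, read via statvfs."""
    st = os.statvfs(path)

    def to_1k(n):
        return n * st.f_frsize // 1024

    return to_1k(st.f_blocks), to_1k(st.f_blocks - st.f_bfree), to_1k(st.f_bavail)


# Figures from the dumpe2fs output of the 4T image, modelled as minixdf.
print(df_figures(1073741824, 1056843748, 53687091))
# -> (4294967296, 67592304, 4012626628)

# Hypothetical: if the overhead were computed exactly, a fresh fs shows 0 used.
print(df_figures(1073741824, 1056843748, 53687091, overhead_blocks=16898076))
# -> (4227374992, 0, 4012626628)
```

The first tuple reproduces the 1K-blocks, Used and Available columns of the minixdf df output quoted above. Any gap between the real overhead and the value the kernel computes shows up directly in the Used column under the default mode.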
