From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: brookxu <brookxu.cn@gmail.com>
Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org
Subject: Re: [PATCH RESEND 4/8] ext4: add the gdt block of meta_bg to system_zone
Date: Tue, 15 Dec 2020 15:13:06 -0500
Message-ID: <X9kY0htqhvFDNn20@mit.edu>
In-Reply-To: <1704f274-fe41-4215-8e6e-ff09d080cdd5@gmail.com>

You did your test on an 80T file system, but that's not where someone
would be using meta_bg.  Meta_bg gets used for much larger file systems
than that!  With meta_bg, we have 3 block group descriptors every 64
block groups.  Each block group describes 128M of storage.  So that
means we are going to have 3 entries in the system zone tree for every
8GB of file system space, or 393,216 entries for every PB.  Given that
each entry is 40 bytes, that means that the block_validity entries
will consume 15 megabytes per PB.
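
A rough sketch of the arithmetic above, in case anyone wants to plug
in different numbers.  It assumes 4k blocks and binary units (1 PB =
2**50 bytes); the constant names are made up for illustration and are
not kernel identifiers.

```python
# Back-of-the-envelope estimate of meta_bg system zone entries per PB.
# Assumptions: 4 KiB blocks, binary units, and the 40-byte entry size
# quoted in the message.
MIB = 2**20
GIB = 2**30
PIB = 2**50

BLOCK_GROUP_BYTES = 128 * MIB   # space described by one block group
GROUPS_PER_META_BG = 64         # block groups per meta_bg with 4k blocks
DESCRIPTORS_PER_META_BG = 3     # primary copy plus two backups
ENTRY_BYTES = 40                # one system zone tree entry

meta_bg_bytes = BLOCK_GROUP_BYTES * GROUPS_PER_META_BG
meta_bgs_per_pb = PIB // meta_bg_bytes
meta_bg_entries_per_pb = meta_bgs_per_pb * DESCRIPTORS_PER_META_BG
meta_bg_tree_bytes = meta_bg_entries_per_pb * ENTRY_BYTES

print(meta_bg_bytes // GIB, "GiB covered by one meta_bg")
print(meta_bg_entries_per_pb, "meta_bg entries per PB")
print(meta_bg_tree_bytes / MIB, "MiB of entries per PB")
```

Changing ENTRY_BYTES or GROUPS_PER_META_BG shows how sensitive the
estimate is to the entry size and the block size.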

Now, one third of these entries overlap with the flex_bg entries
(meta_bg group descriptors are in the first, second, and last block
group of each meta_bg, which spans 64 block groups in 4k file systems),
and of course, the default flex_bg size of 16 block groups means that
there are 524,288 flex_bg entries per PB.  So if we include all backup
sb and block group descriptors, there will be roughly 786,432 entries
in a 1 PB file system.  (I'm ignoring the entries for the backup
superblocks, but that's only about 20 or so extra entries.)

So for a flex_bg 1PB file system, the amount of memory for a
block_validity data structure is roughly 20M, and including all backup
descriptors for meta_bg on a flex_bg + meta_bg setup is roughly 30M.
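
The flex_bg and combined figures can be checked the same way.  This is
only a sketch under the same assumptions as the message (4k blocks,
binary units, 40 bytes per entry, and one third of the meta_bg entries
overlapping flex_bg entries); the names are hypothetical.

```python
# Estimate system zone entries and memory for a 1 PB file system with
# flex_bg alone, and with flex_bg plus all meta_bg descriptor copies.
MIB = 2**20
PIB = 2**50

BLOCK_GROUP_BYTES = 128 * MIB   # space described by one block group
FLEX_BG_GROUPS = 16             # default flex_bg size in block groups
META_BG_GROUPS = 64             # block groups per meta_bg with 4k blocks
ENTRY_BYTES = 40                # one system zone tree entry

flex_bg_entries = PIB // (BLOCK_GROUP_BYTES * FLEX_BG_GROUPS)
meta_bg_entries = 3 * (PIB // (BLOCK_GROUP_BYTES * META_BG_GROUPS))

# One of the three descriptor copies in each meta_bg overlaps a flex_bg
# entry, so only two thirds of the meta_bg entries are counted as new.
extra_meta_bg_entries = meta_bg_entries * 2 // 3
total_entries = flex_bg_entries + extra_meta_bg_entries

flex_only_mib = flex_bg_entries * ENTRY_BYTES / MIB
combined_mib = total_entries * ENTRY_BYTES / MIB

print(flex_bg_entries, "flex_bg entries,", flex_only_mib, "MiB")
print(total_entries, "entries with meta_bg backups,", combined_mib, "MiB")
```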

I agree with you that for a non-meta_bg file system, including all of
the backup superblock and block group descriptors is not going to be
large.  But while protecting the meta_bg group descriptors is
worthwhile, protecting the backup meta_bg's is not free, and will
increase the size of the tree by 33%.

I'm also wondering whether or not Lustre (where they do have some file
systems that are in the PB range) has run into overhead issues with
block_validity.

What do folks think?

						- Ted

