From: "Theodore Ts'o" <tytso@mit.edu>
To: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 12/13] ext4: remove unnecessary check for avoiding multiple update_backups in ext4_flex_group_add
Date: Fri, 18 Aug 2023 01:00:36 -0400 [thread overview]
Message-ID: <20230818050036.GG3464136@mit.edu> (raw)
In-Reply-To: <cfad4b27-3174-1124-1516-a2ddb3843639@huaweicloud.com>
On Fri, Aug 18, 2023 at 11:16:35AM +0800, Kemeng Shi wrote:
> Ah, I guess here is the thing I missed that make this confusing:
> sbi->s_group_desc contains only primary block of each group. i.e.
> sbi->s_group_desc['i'] is the primary gdb block of group 'i'.
Correct. In fact, when we need to modify a block group descriptor for
a group, we call ext4_get_group_desc(), and it references
sbi->s_group_desc to find the appropriate buffer_head for the bg
descriptor that we want to modify.
I'm not sure "only" is the right adjective to use here, since the
whole *point* of s_group_desc[] is to keep the buffer_heads for the
block group descriptor blocks in memory, so we can modify them when we
allocate or free blocks, inodes, etc. And we only modify the primary
block group descriptors.
The secondary, or backup block group descriptors are only by used
e2fsck when the primary block group descriptor has been overwritten,
so we can find the inode table, allocation bitmaps, and so on. So we
do *not* modify them in the course of normal operations, and that's by
design. The only time the kernel will update those block group
descriptors is when we are doing an online resize, and we need make
sure the backup descriptors are updated, so that if the primary
descriptors get completely smashed, we can still get critical
information such as the location of that block group's inode table.
> From add_new_gdb and add_new_gdb_meta_bg, we can find that we always
> add primary gdb block of new group to s_group_desc. To be more specific:
> add_new_gdb
> gdblock = EXT4_SB(sb)->s_sbh->b_blocknr + 1 + gdb_num;
> gdb_bh = ext4_sb_bread(sb, gdblock, 0);
> n_group_desc[gdb_num] = gdb_bh;
>
> add_new_gdb_meta_bg
> gdblock = ext4_meta_bg_first_block_no(sb, group) +
> ext4_bg_has_super(sb, group);
> gdb_bh = ext4_sb_bread(sb, gdblock, 0);
> n_group_desc[gdb_num] = gdb_bh;
Put another way, there are EXT4_DESC_PER_BLOCK(sb) bg descriptors in a
block. For a file system with the 64-bit feature enabled, the size of
the block group descriptor is 128. If the block size is 4096, then we
can fit 32 block group descriptors in a block.
When we add a new block group such that its block group descriptor
will spill into a new block, then we need to expand s_group_desc[]
array, and initialize the new block group descriptor block. And
that's the job of add_new_gdb() and add_new_gdb_meta_bg().
> > More generally, this whole patch series is making it clear that the
> > online resize code is hard to understand, because it's super
> > complicated, leading to potential bugs, and very clearly code which is
> > very hard to maintain. So this may mean we need better comments to
> > make it clear *when* the backup block groups are going to be
> > accomplished, given the various different cases (e.g., no flex_bg or
> > meta_bg, with flex_bg, or with meat_bg).
> >
> Yes, I agree with this. I wonder if a series to add comments of some
> common rules is good to you. Like the information mentioned above
> that s_group_desc contains primary gdb block of each group.
Well, the meaning of s_group_desc[] was obvious to me, but that's why
it's useful to have somone with "new eyes" take a look at code, since
what may be obvious to old hands might not be obvious to someone
looking at the code for the first time --- and so yes, it's probably
worth documenting. The question is where is the best place to put it,
since the primary place where s_group_desc[] is *not* online resize.
s_group_desc[] is initialized in ext4_group_desc_init() in
fs/ext4/super.c, and it is used in fs/ext4/balloc.c, and of course, it
is defined in fs/ext4.h.
- Ted
next prev parent reply other threads:[~2023-08-18 5:01 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-29 12:00 [PATCH 00/13] fixes and cleanups to ext4 resize Kemeng Shi
2023-06-29 12:00 ` [PATCH 01/13] ext4: correct offset of gdb backup in non meta_bg group to update_backups Kemeng Shi
2023-08-16 3:03 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 02/13] ext4: add missed brelse in update_backups Kemeng Shi
2023-08-16 3:03 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 03/13] ext4: correct return value of ext4_convert_meta_bg Kemeng Shi
2023-08-16 3:00 ` Theodore Ts'o
2023-08-17 2:48 ` Kemeng Shi
2023-08-17 13:58 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 04/13] ext4: remove gdb backup copy for meta bg in setup_new_flex_group_blocks Kemeng Shi
2023-08-16 3:10 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 05/13] ext4: fix typo " Kemeng Shi
2023-08-16 3:10 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 06/13] ext4: remove redundant check of count Kemeng Shi
2023-08-16 3:14 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 07/13] ext4: remove commented code in reserve_backup_gdb Kemeng Shi
2023-08-16 3:14 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 08/13] ext4: calculate free_clusters_count in cluster unit in verify_group_input Kemeng Shi
2023-08-16 3:22 ` Theodore Ts'o
2023-08-17 2:57 ` Kemeng Shi
2023-06-29 12:00 ` [PATCH 09/13] ext4: remove EXT4FS_DEBUG defination in resize.c Kemeng Shi
2023-08-16 3:23 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 10/13] ext4: use saved local variable sbi instead of EXT4_SB(sb) Kemeng Shi
2023-08-16 3:24 ` Theodore Ts'o
2023-06-29 12:00 ` [PATCH 11/13] ext4: correct gdblock calculation in add_new_gdb_meta_bg to support non first group Kemeng Shi
2023-08-16 3:45 ` Theodore Ts'o
2023-08-17 3:38 ` Kemeng Shi
2023-08-17 14:03 ` Theodore Ts'o
2023-08-18 2:29 ` Kemeng Shi
2023-08-18 4:07 ` Theodore Ts'o
2023-08-18 7:09 ` Kemeng Shi
2023-08-18 16:54 ` Theodore Ts'o
2023-08-22 2:48 ` Kemeng Shi
2023-06-29 12:00 ` [PATCH 12/13] ext4: remove unnecessary check for avoiding multiple update_backups in ext4_flex_group_add Kemeng Shi
2023-08-16 3:47 ` Theodore Ts'o
2023-08-17 3:53 ` Kemeng Shi
2023-08-17 14:11 ` Theodore Ts'o
2023-08-18 3:16 ` Kemeng Shi
2023-08-18 5:00 ` Theodore Ts'o [this message]
2023-08-18 8:42 ` Kemeng Shi
2023-08-18 16:00 ` Theodore Ts'o
2023-08-22 2:21 ` Kemeng Shi
2023-06-29 12:00 ` [PATCH 13/13] ext4: remove unnecessary initialization of count2 in set_flexbg_block_bitmap Kemeng Shi
2023-08-16 3:48 ` Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230818050036.GG3464136@mit.edu \
--to=tytso@mit.edu \
--cc=adilger.kernel@dilger.ca \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=shikemeng@huaweicloud.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).