From: Kemeng Shi <shikemeng@huaweicloud.com>
To: tytso@mit.edu, adilger.kernel@dilger.ca, ritesh.list@gmail.com
Cc: ojaswin@linux.ibm.com, linux-ext4@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v7 08/12] ext4: Separate block bitmap and buddy bitmap freeing in ext4_group_add_blocks()
Date: Wed, 20 Sep 2023 04:15:28 +0800 [thread overview]
Message-ID: <20230919201532.310085-9-shikemeng@huaweicloud.com> (raw)
In-Reply-To: <20230919201532.310085-1-shikemeng@huaweicloud.com>
This patch separates block bitmap and buddy bitmap freeing in order to
udpate block bitmap with ext4_mb_mark_context in following patch.
The reason why this can be sperated is explained in previous submit.
Put the explanation here to simplify the code archeology to
ext4_group_add_blocks():
Separated freeing is safe with concurrent allocation as long as:
1. Firstly allocate block in buddy bitmap, and then in block bitmap.
2. Firstly free block in block bitmap, and then buddy bitmap.
Then freed block will only be available to allocation when both buddy
bitmap and block bitmap are updated by freeing.
Allocation obeys rule 1 already, just do sperated freeing with rule 2.
Separated freeing has no race with generate_buddy as:
Once ext4_mb_load_buddy_gfp is executed successfully, the update-to-date
buddy page can be found in sbi->s_buddy_cache and no more buddy
initialization of the buddy page will be executed concurrently until
buddy page is unloaded. As we always do free in "load buddy, free,
unload buddy" sequence, separated freeing has no race with generate_buddy.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
fs/ext4/mballoc.c | 54 +++++++++++++++++++++++------------------------
1 file changed, 26 insertions(+), 28 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 730397af6975..404121445fc3 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6685,35 +6685,39 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
ext4_warning(sb, "too many blocks added to group %u",
block_group);
err = -EINVAL;
- goto error_return;
+ goto error_out;
+ }
+
+ err = ext4_mb_load_buddy(sb, block_group, &e4b);
+ if (err)
+ goto error_out;
+
+ if (!ext4_sb_block_valid(sb, NULL, block, count)) {
+ ext4_error(sb, "Adding blocks in system zones - "
+ "Block = %llu, count = %lu",
+ block, count);
+ err = -EINVAL;
+ goto error_clean;
}
bitmap_bh = ext4_read_block_bitmap(sb, block_group);
if (IS_ERR(bitmap_bh)) {
err = PTR_ERR(bitmap_bh);
bitmap_bh = NULL;
- goto error_return;
+ goto error_clean;
}
desc = ext4_get_group_desc(sb, block_group, &gd_bh);
if (!desc) {
err = -EIO;
- goto error_return;
- }
-
- if (!ext4_sb_block_valid(sb, NULL, block, count)) {
- ext4_error(sb, "Adding blocks in system zones - "
- "Block = %llu, count = %lu",
- block, count);
- err = -EINVAL;
- goto error_return;
+ goto error_clean;
}
BUFFER_TRACE(bitmap_bh, "getting write access");
err = ext4_journal_get_write_access(handle, sb, bitmap_bh,
EXT4_JTR_NONE);
if (err)
- goto error_return;
+ goto error_clean;
/*
* We are about to modify some metadata. Call the journal APIs
@@ -6723,7 +6727,7 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
BUFFER_TRACE(gd_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, sb, gd_bh, EXT4_JTR_NONE);
if (err)
- goto error_return;
+ goto error_clean;
for (i = 0, clusters_freed = 0; i < cluster_count; i++) {
BUFFER_TRACE(bitmap_bh, "clear bit");
@@ -6736,26 +6740,14 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
}
}
- err = ext4_mb_load_buddy(sb, block_group, &e4b);
- if (err)
- goto error_return;
-
- /*
- * need to update group_info->bb_free and bitmap
- * with group lock held. generate_buddy look at
- * them with group lock_held
- */
ext4_lock_group(sb, block_group);
mb_clear_bits(bitmap_bh->b_data, bit, cluster_count);
- mb_free_blocks(NULL, &e4b, bit, cluster_count);
free_clusters_count = clusters_freed +
ext4_free_group_clusters(sb, desc);
ext4_free_group_clusters_set(sb, desc, free_clusters_count);
ext4_block_bitmap_csum_set(sb, desc, bitmap_bh);
ext4_group_desc_csum_set(sb, block_group, desc);
ext4_unlock_group(sb, block_group);
- percpu_counter_add(&sbi->s_freeclusters_counter,
- clusters_freed);
if (sbi->s_log_groups_per_flex) {
ext4_group_t flex_group = ext4_flex_group(sbi, block_group);
@@ -6764,8 +6756,6 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
flex_group)->free_clusters);
}
- ext4_mb_unload_buddy(&e4b);
-
/* We dirtied the bitmap block */
BUFFER_TRACE(bitmap_bh, "dirtied bitmap block");
err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
@@ -6776,8 +6766,16 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
if (!err)
err = ret;
-error_return:
+ ext4_lock_group(sb, block_group);
+ mb_free_blocks(NULL, &e4b, bit, cluster_count);
+ ext4_unlock_group(sb, block_group);
+ percpu_counter_add(&sbi->s_freeclusters_counter,
+ clusters_freed);
+
+error_clean:
brelse(bitmap_bh);
+ ext4_mb_unload_buddy(&e4b);
+error_out:
ext4_std_error(sb, err);
return err;
}
--
2.30.0
next prev parent reply other threads:[~2023-09-19 12:16 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-19 20:15 [PATCH v7 00/12] cleanups and unit test for mballoc Kemeng Shi
2023-09-19 20:15 ` [PATCH v7 01/12] ext4: make state in ext4_mb_mark_bb to be bool Kemeng Shi
2023-09-27 6:10 ` Ritesh Harjani
2023-09-27 6:51 ` Kemeng Shi
2023-09-19 20:15 ` [PATCH v7 02/12] ext4: factor out codes to update block bitmap and group descriptor on disk from ext4_mb_mark_bb Kemeng Shi
2023-09-27 8:49 ` Ritesh Harjani
2023-09-28 3:31 ` Kemeng Shi
2023-09-28 4:42 ` Ritesh Harjani
2023-09-28 7:45 ` Kemeng Shi
2023-09-19 20:15 ` [PATCH v7 03/12] ext4: call ext4_mb_mark_context in ext4_free_blocks_simple Kemeng Shi
2023-09-27 8:52 ` Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 04/12] ext4: extend ext4_mb_mark_context to support allocation under journal Kemeng Shi
2023-09-27 9:13 ` Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 05/12] ext4: call ext4_mb_mark_context in ext4_mb_mark_diskspace_used Kemeng Shi
2023-09-27 9:58 ` Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 06/12] ext4: Separate block bitmap and buddy bitmap freeing in ext4_mb_clear_bb() Kemeng Shi
2023-09-27 11:06 ` Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 07/12] ext4: call ext4_mb_mark_context in ext4_mb_clear_bb Kemeng Shi
2023-09-27 11:14 ` Ritesh Harjani
2023-09-19 20:15 ` Kemeng Shi [this message]
2023-09-27 11:16 ` [PATCH v7 08/12] ext4: Separate block bitmap and buddy bitmap freeing in ext4_group_add_blocks() Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 09/12] ext4: call ext4_mb_mark_context " Kemeng Shi
2023-09-27 11:19 ` Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 10/12] ext4: add some kunit stub for mballoc kunit test Kemeng Shi
2023-09-19 20:15 ` [PATCH v7 11/12] ext4: add first unit test for ext4_mb_new_blocks_simple in mballoc Kemeng Shi
2023-09-27 11:39 ` Ritesh Harjani
2023-09-19 20:15 ` [PATCH v7 12/12] ext4: run mballoc test with different layouts setting Kemeng Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230919201532.310085-9-shikemeng@huaweicloud.com \
--to=shikemeng@huaweicloud.com \
--cc=adilger.kernel@dilger.ca \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
--cc=ritesh.list@gmail.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).