* [PATCH 1/2] ext4: reduce computation of overhead during resize
@ 2022-06-30 2:17 Kiselev, Oleg
2022-07-14 13:46 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Kiselev, Oleg @ 2022-06-30 2:17 UTC (permalink / raw)
To: linux-ext4@vger.kernel.org; +Cc: Theodore Ts'o
This patch avoids doing an O(n**2)-complexity walk through every flex group.
Instead, it uses the already computed overhead information for the newly
allocated space, and simply adds it to the previously calculated
overhead stored in the superblock. This drastically reduces the time
taken to resize very large bigalloc filesystems (from 3+ hours for a
64TB fs down to milliseconds).
Signed-off-by: Oleg Kiselev <okiselev@amazon.com>
---
fs/ext4/resize.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 8b70a4701293..2acc9fca99ea 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1380,6 +1380,16 @@ static int ext4_setup_new_descs(handle_t *handle, struct super_block *sb,
return err;
}
+static void ext4_set_overhead(struct super_block *sb,
+ const ext4_grpblk_t overhead)
+{
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ struct ext4_super_block *es = sbi->s_es;
+ sbi->s_overhead += overhead;
+ es->s_overhead_clusters = cpu_to_le32((unsigned long) sbi->s_overhead);
+ smp_wmb();
+}
+
/*
* ext4_update_super() updates the super block so that the newly added
* groups can be seen by the filesystem.
@@ -1482,8 +1492,16 @@ static void ext4_update_super(struct super_block *sb,
/*
* Update the fs overhead information
+ *
+ * For bigalloc, if the superblock already has a properly calculated
+ * overhead, update it wth a value based on numbers already computed
+ * above for the newly allocated capacity.
*/
- ext4_calculate_overhead(sb);
+ if (ext4_has_feature_bigalloc(sb) && (sbi->s_overhead != 0))
+ ext4_set_overhead(sb,
+ EXT4_NUM_B2C(sbi, blocks_count - free_blocks));
+ else
+ ext4_calculate_overhead(sb);
if (test_opt(sb, DEBUG))
printk(KERN_DEBUG "EXT4-fs: added group %u:"
--
2.32.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH 1/2] ext4: reduce computation of overhead during resize
2022-06-30 2:17 [PATCH 1/2] ext4: reduce computation of overhead during resize Kiselev, Oleg
@ 2022-07-14 13:46 ` Jan Kara
2022-07-14 19:53 ` Kiselev, Oleg
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2022-07-14 13:46 UTC (permalink / raw)
To: Kiselev, Oleg; +Cc: linux-ext4@vger.kernel.org, Theodore Ts'o
On Thu 30-06-22 02:17:21, Kiselev, Oleg wrote:
> This patch avoids doing an O(n**2)-complexity walk through every flex group.
> Instead, it uses the already computed overhead information for the newly
> allocated space, and simply adds it to the previously calculated
> overhead stored in the superblock. This drastically reduces the time
> taken to resize very large bigalloc filesystems (from 3+ hours for a
> 64TB fs down to milliseconds).
>
> Signed-off-by: Oleg Kiselev <okiselev@amazon.com>
> ---
> fs/ext4/resize.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
Overall this looks fine, a few smaller comments below.
>
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index 8b70a4701293..2acc9fca99ea 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -1380,6 +1380,16 @@ static int ext4_setup_new_descs(handle_t *handle, struct super_block *sb,
> return err;
> }
>
> +static void ext4_set_overhead(struct super_block *sb,
> + const ext4_grpblk_t overhead)
> +{
ext4_add_overhead() would be a better name I suppose. Also the 'overhead'
should rather be ext4_fsblk_t to be on the safe side...
> + struct ext4_sb_info *sbi = EXT4_SB(sb);
> + struct ext4_super_block *es = sbi->s_es;
Empty line between variable declarations and the code please.
> + sbi->s_overhead += overhead;
> + es->s_overhead_clusters = cpu_to_le32((unsigned long) sbi->s_overhead);
^^^ the typecast looks
bogus here...
> + smp_wmb();
> +}
The barrier without any comment makes me really wonder why it is here...
But I get ext4_calculate_overhead() has is as well so you're just keeping
it.
> +
> /*
> * ext4_update_super() updates the super block so that the newly added
> * groups can be seen by the filesystem.
> @@ -1482,8 +1492,16 @@ static void ext4_update_super(struct super_block *sb,
>
> /*
> * Update the fs overhead information
> + *
> + * For bigalloc, if the superblock already has a properly calculated
> + * overhead, update it wth a value based on numbers already computed
^^ with
> + * above for the newly allocated capacity.
> */
> - ext4_calculate_overhead(sb);
> + if (ext4_has_feature_bigalloc(sb) && (sbi->s_overhead != 0))
> + ext4_set_overhead(sb,
> + EXT4_NUM_B2C(sbi, blocks_count - free_blocks));
> + else
> + ext4_calculate_overhead(sb);
>
> if (test_opt(sb, DEBUG))
> printk(KERN_DEBUG "EXT4-fs: added group %u:"
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH 1/2] ext4: reduce computation of overhead during resize
2022-07-14 13:46 ` Jan Kara
@ 2022-07-14 19:53 ` Kiselev, Oleg
2022-07-15 9:27 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Kiselev, Oleg @ 2022-07-14 19:53 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4@vger.kernel.org, Theodore Ts'o
Thanks for the review, Jan.
> On Jul 14, 2022, at 6:46 AM, Jan Kara <jack@suse.cz> wrote:
>
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Thu 30-06-22 02:17:21, Kiselev, Oleg wrote:
>> This patch avoids doing an O(n**2)-complexity walk through every flex group.
>> Instead, it uses the already computed overhead information for the newly
>> allocated space, and simply adds it to the previously calculated
>> overhead stored in the superblock. This drastically reduces the time
>> taken to resize very large bigalloc filesystems (from 3+ hours for a
>> 64TB fs down to milliseconds).
>>
>> Signed-off-by: Oleg Kiselev <okiselev@amazon.com>
>> ---
>> fs/ext4/resize.c | 20 +++++++++++++++++++-
>> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> Overall this looks fine, a few smaller comments below.
>
>>
>> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
>> index 8b70a4701293..2acc9fca99ea 100644
>> --- a/fs/ext4/resize.c
>> +++ b/fs/ext4/resize.c
>> @@ -1380,6 +1380,16 @@ static int ext4_setup_new_descs(handle_t *handle, struct super_block *sb,
>> return err;
>> }
>>
>> +static void ext4_set_overhead(struct super_block *sb,
>> + const ext4_grpblk_t overhead)
>> +{
>
> ext4_add_overhead() would be a better name I suppose. Also the 'overhead'
> should rather be ext4_fsblk_t to be on the safe side...
Done.
>
>> + struct ext4_sb_info *sbi = EXT4_SB(sb);
>> + struct ext4_super_block *es = sbi->s_es;
>
> Empty line between variable declarations and the code please.
Done
>
>> + sbi->s_overhead += overhead;
>> + es->s_overhead_clusters = cpu_to_le32((unsigned long) sbi->s_overhead);
> ^^^ the typecast looks
> bogus here...
This cast is the reverse of le32_to_cpu() cast done in fs/ext4/super.c:__ext4_fill_super():
sbi->s_overhead = le32_to_cpu(es->s_overhead_clusters);
And follows the logic of casting done in fs/ext4/ioctl.c:set_overhead() and fs/ext4/ioctl.c:ext4_update_overhead().
>
>> + smp_wmb();
>> +}
>
> The barrier without any comment makes me really wonder why it is here...
> But I get ext4_calculate_overhead() has is as well so you're just keeping
> it.
Yes, exactly. I am aware of only one place where s_overhead and s_overhead_clusters have to be coherent, in ext4_update_overhead(), which potentially may be called concurrently with the resize. Keeping the write barrier here seemed like a safe choice, since it’s not in the regular IO path, so the added expense is incurred only very rarely.
>
>> +
>> /*
>> * ext4_update_super() updates the super block so that the newly added
>> * groups can be seen by the filesystem.
>> @@ -1482,8 +1492,16 @@ static void ext4_update_super(struct super_block *sb,
>>
>> /*
>> * Update the fs overhead information
>> + *
>> + * For bigalloc, if the superblock already has a properly calculated
>> + * overhead, update it wth a value based on numbers already computed
> ^^ with
>
>> + * above for the newly allocated capacity.
>> */
>> - ext4_calculate_overhead(sb);
>> + if (ext4_has_feature_bigalloc(sb) && (sbi->s_overhead != 0))
>> + ext4_set_overhead(sb,
>> + EXT4_NUM_B2C(sbi, blocks_count - free_blocks));
>> + else
>> + ext4_calculate_overhead(sb);
>>
>> if (test_opt(sb, DEBUG))
>> printk(KERN_DEBUG "EXT4-fs: added group %u:"
>
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
I’ll resubmit the updated patch.
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH 1/2] ext4: reduce computation of overhead during resize
2022-07-14 19:53 ` Kiselev, Oleg
@ 2022-07-15 9:27 ` Jan Kara
2022-07-15 23:52 ` Kiselev, Oleg
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2022-07-15 9:27 UTC (permalink / raw)
To: Kiselev, Oleg; +Cc: Jan Kara, linux-ext4@vger.kernel.org, Theodore Ts'o
On Thu 14-07-22 19:53:38, Kiselev, Oleg wrote:
> >
> >> + sbi->s_overhead += overhead;
> >> + es->s_overhead_clusters = cpu_to_le32((unsigned long) sbi->s_overhead);
> > ^^^ the typecast looks
> > bogus here...
>
> This cast is the reverse of le32_to_cpu() cast done in fs/ext4/super.c:__ext4_fill_super():
> sbi->s_overhead = le32_to_cpu(es->s_overhead_clusters);
> And follows the logic of casting done in fs/ext4/ioctl.c:set_overhead() and fs/ext4/ioctl.c:ext4_update_overhead().
I didn't mean the cpu_to_le32() call but rather the (unsigned long) part.
That is pointless because sbi->s_overhead is already 'unsigned long' and
even if it was not, I have hard time seeing a reason why would casting to
unsigned long make any difference here.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] ext4: reduce computation of overhead during resize
2022-07-15 9:27 ` Jan Kara
@ 2022-07-15 23:52 ` Kiselev, Oleg
0 siblings, 0 replies; 5+ messages in thread
From: Kiselev, Oleg @ 2022-07-15 23:52 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4@vger.kernel.org, Theodore Ts'o
> On Jul 15, 2022, at 2:27 AM, Jan Kara <jack@suse.cz> wrote:
>
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Thu 14-07-22 19:53:38, Kiselev, Oleg wrote:
>>>
>>>> + sbi->s_overhead += overhead;
>>>> + es->s_overhead_clusters = cpu_to_le32((unsigned long) sbi->s_overhead);
>>> ^^^ the typecast looks
>>> bogus here...
>>
>> This cast is the reverse of le32_to_cpu() cast done in fs/ext4/super.c:__ext4_fill_super():
>> sbi->s_overhead = le32_to_cpu(es->s_overhead_clusters);
>> And follows the logic of casting done in fs/ext4/ioctl.c:set_overhead() and fs/ext4/ioctl.c:ext4_update_overhead().
>
> I didn't mean the cpu_to_le32() call but rather the (unsigned long) part.
> That is pointless because sbi->s_overhead is already 'unsigned long' and
> even if it was not, I have hard time seeing a reason why would casting to
> unsigned long make any difference here.
Got it. You are right. The indent of your comment got mangled by mail, so it looked like it was directed to cpu_to_ie32()!
>
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-07-15 23:52 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-30 2:17 [PATCH 1/2] ext4: reduce computation of overhead during resize Kiselev, Oleg
2022-07-14 13:46 ` Jan Kara
2022-07-14 19:53 ` Kiselev, Oleg
2022-07-15 9:27 ` Jan Kara
2022-07-15 23:52 ` Kiselev, Oleg
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox