* [PATCH] ext4: fix FS_IOC_GETFSMAP handling
@ 2024-10-23 13:59 Theodore Ts'o
2024-10-25 7:06 ` Zhang Yi
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2024-10-23 13:59 UTC (permalink / raw)
To: Ext4 Developers List; +Cc: Darrick J. Wong, Theodore Ts'o
The original implementation ext4's FS_IOC_GETFSMAP handling only
worked when the range of queried blocks included at least one free
(unallocated) block range. This is because how the metadata blocks
were emitted was as a side effect of ext4_mballoc_query_range()
calling ext4_getfsmap_datadev_helper(), and that function was only
called when a free block range was identified. As a result, this
caused generic/365 to fail.
Fix this by creating a new function ext4_getfsmap_meta_helper() which
gets called so that blocks before the first free block range in a
block group can get properly reported.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
fs/ext4/fsmap.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-
fs/ext4/mballoc.c | 18 ++++++++++++----
fs/ext4/mballoc.h | 1 +
3 files changed, 68 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
index df853c4d3a8c..383c6edea6dd 100644
--- a/fs/ext4/fsmap.c
+++ b/fs/ext4/fsmap.c
@@ -185,6 +185,56 @@ static inline ext4_fsblk_t ext4_fsmap_next_pblk(struct ext4_fsmap *fmr)
return fmr->fmr_physical + fmr->fmr_length;
}
+static int ext4_getfsmap_meta_helper(struct super_block *sb,
+ ext4_group_t agno, ext4_grpblk_t start,
+ ext4_grpblk_t len, void *priv)
+{
+ struct ext4_getfsmap_info *info = priv;
+ struct ext4_fsmap *p;
+ struct ext4_fsmap *tmp;
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ ext4_fsblk_t fsb, fs_start, fs_end;
+ int error;
+
+ fs_start = fsb = (EXT4_C2B(sbi, start) +
+ ext4_group_first_block_no(sb, agno));
+ fs_end = fs_start + EXT4_C2B(sbi, len);
+
+ /* Return relevant extents from the meta_list */
+ list_for_each_entry_safe(p, tmp, &info->gfi_meta_list, fmr_list) {
+ if (p->fmr_physical < info->gfi_next_fsblk) {
+ list_del(&p->fmr_list);
+ kfree(p);
+ continue;
+ }
+ if (p->fmr_physical <= fs_start ||
+ p->fmr_physical + p->fmr_length <= fs_end) {
+ /* Emit the retained free extent record if present */
+ if (info->gfi_lastfree.fmr_owner) {
+ error = ext4_getfsmap_helper(sb, info,
+ &info->gfi_lastfree);
+ if (error)
+ return error;
+ info->gfi_lastfree.fmr_owner = 0;
+ }
+ error = ext4_getfsmap_helper(sb, info, p);
+ if (error)
+ return error;
+ fsb = p->fmr_physical + p->fmr_length;
+ if (info->gfi_next_fsblk < fsb)
+ info->gfi_next_fsblk = fsb;
+ list_del(&p->fmr_list);
+ kfree(p);
+ continue;
+ }
+ }
+ if (info->gfi_next_fsblk < fsb)
+ info->gfi_next_fsblk = fsb;
+
+ return 0;
+}
+
+
/* Transform a blockgroup's free record into a fsmap */
static int ext4_getfsmap_datadev_helper(struct super_block *sb,
ext4_group_t agno, ext4_grpblk_t start,
@@ -539,6 +589,7 @@ static int ext4_getfsmap_datadev(struct super_block *sb,
error = ext4_mballoc_query_range(sb, info->gfi_agno,
EXT4_B2C(sbi, info->gfi_low.fmr_physical),
EXT4_B2C(sbi, info->gfi_high.fmr_physical),
+ ext4_getfsmap_meta_helper,
ext4_getfsmap_datadev_helper, info);
if (error)
goto err;
@@ -560,7 +611,8 @@ static int ext4_getfsmap_datadev(struct super_block *sb,
/* Report any gaps at the end of the bg */
info->gfi_last = true;
- error = ext4_getfsmap_datadev_helper(sb, end_ag, last_cluster, 0, info);
+ error = ext4_getfsmap_datadev_helper(sb, end_ag, last_cluster + 1,
+ 0, info);
if (error)
goto err;
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index d73e38323879..92f49d7eb3c0 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6999,13 +6999,14 @@ int
ext4_mballoc_query_range(
struct super_block *sb,
ext4_group_t group,
- ext4_grpblk_t start,
+ ext4_grpblk_t first,
ext4_grpblk_t end,
+ ext4_mballoc_query_range_fn meta_formatter,
ext4_mballoc_query_range_fn formatter,
void *priv)
{
void *bitmap;
- ext4_grpblk_t next;
+ ext4_grpblk_t start, next;
struct ext4_buddy e4b;
int error;
@@ -7016,10 +7017,19 @@ ext4_mballoc_query_range(
ext4_lock_group(sb, group);
- start = max(e4b.bd_info->bb_first_free, start);
+ start = max(e4b.bd_info->bb_first_free, first);
if (end >= EXT4_CLUSTERS_PER_GROUP(sb))
end = EXT4_CLUSTERS_PER_GROUP(sb) - 1;
-
+ if (meta_formatter && start != first) {
+ if (start > end)
+ start = end;
+ ext4_unlock_group(sb, group);
+ error = meta_formatter(sb, group, first, start - first,
+ priv);
+ if (error)
+ goto out_unload;
+ ext4_lock_group(sb, group);
+ }
while (start <= end) {
start = mb_find_next_zero_bit(bitmap, end + 1, start);
if (start > end)
diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
index d8553f1498d3..f8280de3e882 100644
--- a/fs/ext4/mballoc.h
+++ b/fs/ext4/mballoc.h
@@ -259,6 +259,7 @@ ext4_mballoc_query_range(
ext4_group_t agno,
ext4_grpblk_t start,
ext4_grpblk_t end,
+ ext4_mballoc_query_range_fn meta_formatter,
ext4_mballoc_query_range_fn formatter,
void *priv);
--
2.45.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: fix FS_IOC_GETFSMAP handling
2024-10-23 13:59 [PATCH] ext4: fix FS_IOC_GETFSMAP handling Theodore Ts'o
@ 2024-10-25 7:06 ` Zhang Yi
2024-10-25 15:42 ` Theodore Ts'o
0 siblings, 1 reply; 6+ messages in thread
From: Zhang Yi @ 2024-10-25 7:06 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Darrick J. Wong, Ext4 Developers List
On 2024/10/23 21:59, Theodore Ts'o wrote:
> The original implementation ext4's FS_IOC_GETFSMAP handling only
> worked when the range of queried blocks included at least one free
> (unallocated) block range. This is because how the metadata blocks
> were emitted was as a side effect of ext4_mballoc_query_range()
> calling ext4_getfsmap_datadev_helper(), and that function was only
> called when a free block range was identified. As a result, this
> caused generic/365 to fail.
>
> Fix this by creating a new function ext4_getfsmap_meta_helper() which
> gets called so that blocks before the first free block range in a
> block group can get properly reported.
>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
> fs/ext4/fsmap.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-
> fs/ext4/mballoc.c | 18 ++++++++++++----
> fs/ext4/mballoc.h | 1 +
> 3 files changed, 68 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
> index df853c4d3a8c..383c6edea6dd 100644
> --- a/fs/ext4/fsmap.c
> +++ b/fs/ext4/fsmap.c
> @@ -185,6 +185,56 @@ static inline ext4_fsblk_t ext4_fsmap_next_pblk(struct ext4_fsmap *fmr)
> return fmr->fmr_physical + fmr->fmr_length;
> }
>
> +static int ext4_getfsmap_meta_helper(struct super_block *sb,
> + ext4_group_t agno, ext4_grpblk_t start,
> + ext4_grpblk_t len, void *priv)
> +{
> + struct ext4_getfsmap_info *info = priv;
> + struct ext4_fsmap *p;
> + struct ext4_fsmap *tmp;
> + struct ext4_sb_info *sbi = EXT4_SB(sb);
> + ext4_fsblk_t fsb, fs_start, fs_end;
> + int error;
> +
> + fs_start = fsb = (EXT4_C2B(sbi, start) +
> + ext4_group_first_block_no(sb, agno));
> + fs_end = fs_start + EXT4_C2B(sbi, len);
> +
> + /* Return relevant extents from the meta_list */
> + list_for_each_entry_safe(p, tmp, &info->gfi_meta_list, fmr_list) {
> + if (p->fmr_physical < info->gfi_next_fsblk) {
> + list_del(&p->fmr_list);
> + kfree(p);
> + continue;
> + }
> + if (p->fmr_physical <= fs_start ||
> + p->fmr_physical + p->fmr_length <= fs_end) {
> + /* Emit the retained free extent record if present */
> + if (info->gfi_lastfree.fmr_owner) {
> + error = ext4_getfsmap_helper(sb, info,
> + &info->gfi_lastfree);
> + if (error)
> + return error;
> + info->gfi_lastfree.fmr_owner = 0;
> + }
> + error = ext4_getfsmap_helper(sb, info, p);
> + if (error)
> + return error;
> + fsb = p->fmr_physical + p->fmr_length;
> + if (info->gfi_next_fsblk < fsb)
> + info->gfi_next_fsblk = fsb;
> + list_del(&p->fmr_list);
> + kfree(p);
> + continue;
> + }
> + }
> + if (info->gfi_next_fsblk < fsb)
> + info->gfi_next_fsblk = fsb;
> +
> + return 0;
> +}
> +
> +
> /* Transform a blockgroup's free record into a fsmap */
> static int ext4_getfsmap_datadev_helper(struct super_block *sb,
> ext4_group_t agno, ext4_grpblk_t start,
> @@ -539,6 +589,7 @@ static int ext4_getfsmap_datadev(struct super_block *sb,
> error = ext4_mballoc_query_range(sb, info->gfi_agno,
> EXT4_B2C(sbi, info->gfi_low.fmr_physical),
> EXT4_B2C(sbi, info->gfi_high.fmr_physical),
> + ext4_getfsmap_meta_helper,
> ext4_getfsmap_datadev_helper, info);
> if (error)
> goto err;
> @@ -560,7 +611,8 @@ static int ext4_getfsmap_datadev(struct super_block *sb,
>
> /* Report any gaps at the end of the bg */
> info->gfi_last = true;
> - error = ext4_getfsmap_datadev_helper(sb, end_ag, last_cluster, 0, info);
> + error = ext4_getfsmap_datadev_helper(sb, end_ag, last_cluster + 1,
> + 0, info);
> if (error)
> goto err;
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index d73e38323879..92f49d7eb3c0 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -6999,13 +6999,14 @@ int
> ext4_mballoc_query_range(
> struct super_block *sb,
> ext4_group_t group,
> - ext4_grpblk_t start,
> + ext4_grpblk_t first,
> ext4_grpblk_t end,
> + ext4_mballoc_query_range_fn meta_formatter,
> ext4_mballoc_query_range_fn formatter,
> void *priv)
> {
> void *bitmap;
> - ext4_grpblk_t next;
> + ext4_grpblk_t start, next;
> struct ext4_buddy e4b;
> int error;
>
> @@ -7016,10 +7017,19 @@ ext4_mballoc_query_range(
>
> ext4_lock_group(sb, group);
>
> - start = max(e4b.bd_info->bb_first_free, start);
> + start = max(e4b.bd_info->bb_first_free, first);
> if (end >= EXT4_CLUSTERS_PER_GROUP(sb))
> end = EXT4_CLUSTERS_PER_GROUP(sb) - 1;
> -
> + if (meta_formatter && start != first) {
> + if (start > end)
> + start = end;
> + ext4_unlock_group(sb, group);
> + error = meta_formatter(sb, group, first, start - first,
> + priv);
> + if (error)
> + goto out_unload;
> + ext4_lock_group(sb, group);
> + }
Hi, Ted!
Now it seems to be able to handle all the necessary metadata in the
query range here, can we remove the processing of info->gfi_meta_list
in ext4_getfsmap_datadev_helper() as well?
Thanks,
Yi.
> while (start <= end) {
> start = mb_find_next_zero_bit(bitmap, end + 1, start);
> if (start > end)
> diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
> index d8553f1498d3..f8280de3e882 100644
> --- a/fs/ext4/mballoc.h
> +++ b/fs/ext4/mballoc.h
> @@ -259,6 +259,7 @@ ext4_mballoc_query_range(
> ext4_group_t agno,
> ext4_grpblk_t start,
> ext4_grpblk_t end,
> + ext4_mballoc_query_range_fn meta_formatter,
> ext4_mballoc_query_range_fn formatter,
> void *priv);
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: fix FS_IOC_GETFSMAP handling
2024-10-25 7:06 ` Zhang Yi
@ 2024-10-25 15:42 ` Theodore Ts'o
2024-10-28 1:47 ` Zhang Yi
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2024-10-25 15:42 UTC (permalink / raw)
To: Zhang Yi; +Cc: Darrick J. Wong, Ext4 Developers List
On Fri, Oct 25, 2024 at 03:06:07PM +0800, Zhang Yi wrote:
>
> Now it seems to be able to handle all the necessary metadata in the
> query range here, can we remove the processing of info->gfi_meta_list
> in ext4_getfsmap_datadev_helper() as well?
Not without further code refactoring; we still need it if there are
metadata blocks in the middle of the block group. My main emphasis
was keeping the code changes as simple so it would be easy to
backport, and so I didn't do further optimizations.
As a related observation, I'm not entirely sure the current set of
abstractions, where we pass exactly one set of helper/callback
functions down through multiple functions is the best match for ext4.
It should be possible to significantly simplify the call stack by
reworking the GETFSMAP support.
On the other hand, the current, more complicated design might be
useful if at some point in the future, we were to add support for
reverse mapping for online fsck and/or if we add reflink support.
(See how Darrick implemented GETFSMAP for xfs.)
- Ted
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: fix FS_IOC_GETFSMAP handling
2024-10-25 15:42 ` Theodore Ts'o
@ 2024-10-28 1:47 ` Zhang Yi
2024-10-28 3:59 ` Theodore Ts'o
0 siblings, 1 reply; 6+ messages in thread
From: Zhang Yi @ 2024-10-28 1:47 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Darrick J. Wong, Ext4 Developers List
On 2024/10/25 23:42, Theodore Ts'o wrote:
> On Fri, Oct 25, 2024 at 03:06:07PM +0800, Zhang Yi wrote:
>>
>> Now it seems to be able to handle all the necessary metadata in the
>> query range here, can we remove the processing of info->gfi_meta_list
>> in ext4_getfsmap_datadev_helper() as well?
>
> Not without further code refactoring; we still need it if there are
> metadata blocks in the middle of the block group.
IIRC, at the moment, fixed metadata does not appear in the middle of
the block group.
> My main emphasis
> was keeping the code changes as simple so it would be easy to
> backport, and so I didn't do further optimizations.
OK.
>
> As a related observation, I'm not entirely sure the current set of
> abstractions, where we pass exactly one set of helper/callback
> functions down through multiple functions is the best match for ext4.
> It should be possible to significantly simplify the call stack by
> reworking the GETFSMAP support.
Yeah, I have the same feeling too.
>
> On the other hand, the current, more complicated design might be
> useful if at some point in the future, we were to add support for
> reverse mapping for online fsck and/or if we add reflink support.
> (See how Darrick implemented GETFSMAP for xfs.)
>
Yes, the current GETFSMAP implementation on ext4 is incomplete and
has many limitations. If we had reserved mapping, we could achieve
more; for example, we could accurately query the inode to which a
block belongs, and the query efficiency would improve. As Baokun
mentioned earlier, we will take the time to look into the reverse
mapping first to hope to support it and maybe online fsck in the
future.
Thanks,
Yi.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: fix FS_IOC_GETFSMAP handling
2024-10-28 1:47 ` Zhang Yi
@ 2024-10-28 3:59 ` Theodore Ts'o
2024-10-29 2:20 ` Zhang Yi
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2024-10-28 3:59 UTC (permalink / raw)
To: Zhang Yi; +Cc: Darrick J. Wong, Ext4 Developers List
On Mon, Oct 28, 2024 at 09:47:46AM +0800, Zhang Yi wrote:
>
> IIRC, at the moment, fixed metadata does not appear in the middle of
> the block group.
The mke2fs from e2fsprogs will not create a file system like that
normally, no. *However* it is possible for fixed metadata to be in
the middle of the block group:
* If resize2fs does an off-line file system growth without a resize inode
or the reserved blocks in the resize inode has been exhausted
* If e2fsck needs to reallocate some fixed metadata blocks as part of
repairing a (very badly) corrupted file system.
* If there are blocks reported to mke2fs when the file system is created
* If a non-standard mkfs.ext4 is used by some other operating system which
reimplemented mke2fs for some reason (for example, because they wanted
to avoid using GPL'ed code).
So while these cases are quite uncommon, they *can* happen in the
wild, and we want GETFSMAP to be able to handle these file systems
correctly.
Cheers,
- Ted
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ext4: fix FS_IOC_GETFSMAP handling
2024-10-28 3:59 ` Theodore Ts'o
@ 2024-10-29 2:20 ` Zhang Yi
0 siblings, 0 replies; 6+ messages in thread
From: Zhang Yi @ 2024-10-29 2:20 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Darrick J. Wong, Ext4 Developers List
On 2024/10/28 11:59, Theodore Ts'o wrote:
> On Mon, Oct 28, 2024 at 09:47:46AM +0800, Zhang Yi wrote:
>>
>> IIRC, at the moment, fixed metadata does not appear in the middle of
>> the block group.
>
> The mke2fs from e2fsprogs will not create a file system like that
> normally, no. *However* it is possible for fixed metadata to be in
> the middle of the block group:
>
> * If resize2fs does an off-line file system growth without a resize inode
> or the reserved blocks in the resize inode has been exhausted
> * If e2fsck needs to reallocate some fixed metadata blocks as part of
> repairing a (very badly) corrupted file system.
> * If there are blocks reported to mke2fs when the file system is created
> * If a non-standard mkfs.ext4 is used by some other operating system which
> reimplemented mke2fs for some reason (for example, because they wanted
> to avoid using GPL'ed code).
>
> So while these cases are quite uncommon, they *can* happen in the
> wild, and we want GETFSMAP to be able to handle these file systems
> correctly.
>
> Cheers,
>
Ha, I see, thank you for the correction.
Cheers,
Yi.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-10-29 2:20 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-23 13:59 [PATCH] ext4: fix FS_IOC_GETFSMAP handling Theodore Ts'o
2024-10-25 7:06 ` Zhang Yi
2024-10-25 15:42 ` Theodore Ts'o
2024-10-28 1:47 ` Zhang Yi
2024-10-28 3:59 ` Theodore Ts'o
2024-10-29 2:20 ` Zhang Yi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).