From: "Darrick J. Wong" <djwong@kernel.org>
To: bodonnel@redhat.com
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs_repair: -EFSBADCRC needs action when read verifier detects it.
Date: Wed, 26 Feb 2025 10:20:02 -0800
Message-ID: <20250226182002.GU6242@frogsfrogsfrogs>
In-Reply-To: <20250226173335.558221-1-bodonnel@redhat.com>

On Wed, Feb 26, 2025 at 11:32:22AM -0600, bodonnel@redhat.com wrote:
> From: Bill O'Donnell <bodonnel@redhat.com>
>
> For xfs_repair, there is a case when -EFSBADCRC is encountered but not
> acted on. Modify da_read_buf to check for and repair. The current
> implementation fails for the case:
>
> $ xfs_repair xfs_metadump_hosting.dmp.image
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> Metadata CRC error detected at 0x46cde8, xfs_dir3_block block 0xd3c50/0x1000
> bad directory block magic # 0x16011664 in block 0 for directory inode 867467
> corrupt directory block 0 for inode 867467

Curious -- this corrupt directory block fails the magic checks but
process_dir2_data returns 0 because it didn't find any corruption.
So it looks like we release the directory buffer (without dirtying it to
reset the checksum)...

> - agno = 1
> - agno = 2
> - agno = 3
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> - agno = 3
> - agno = 2
> bad directory block magic # 0x16011664 in block 0 for directory inode 867467

...and then it shows up here again...

> Phase 5 - rebuild AG headers and trees...
> - reset superblock...
> Phase 6 - check inode connectivity...
> - resetting contents of realtime bitmap and summary inodes
> - traversing filesystem ...
> bad directory block magic # 0x16011664 for directory inode 867467 block 0: fixing magic # to 0x58444233

...and again here.  Now we reset the magic and dirty the buffer...

> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify and correct link counts...
> Metadata corruption detected at 0x46cc88, xfs_dir3_block block 0xd3c50/0x1000

...but I guess we haven't fixed anything in the buffer, so the verifier
trips.  What code does 0x46cc88 map to in the dir3 block verifier
function?  That might reflect some missing code in process_dir2_data.

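To make the suspicion above concrete, here is a minimal sketch of how a
write verifier can still reject a block after only its magic number has
been reset.  Everything in it is hypothetical: struct toy_dir_block,
toy_write_verify() and toy_fix_magic_only() are invented for illustration
and are not the libxfs structures or verifiers, and the "blkno" field is
only an assumed example of a second self-describing field.  Whether
something like this is what trips at 0x46cc88 is still the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* The value the phase 6 message says the magic was reset to. */
#define TOY_DIR_MAGIC	0x58444233u

/*
 * Hypothetical, heavily simplified block header.  This is NOT the XFS
 * on-disk format; it only has two fields so the idea stays visible.
 */
struct toy_dir_block {
	uint32_t	magic;
	uint32_t	blkno;	/* which block this header claims to be */
};

/*
 * Toy write verifier: every self-describing field has to be sane,
 * not only the magic number.
 */
static bool
toy_write_verify(
	const struct toy_dir_block	*blk,
	uint32_t			expected_blkno)
{
	if (blk->magic != TOY_DIR_MAGIC)
		return false;
	if (blk->blkno != expected_blkno)
		return false;
	return true;
}

/* Toy repair step that resets the magic and touches nothing else. */
static void
toy_fix_magic_only(
	struct toy_dir_block	*blk)
{
	blk->magic = TOY_DIR_MAGIC;
}
```

In this toy model, a block that started with a bad magic and a stale
blkno still fails toy_write_verify() after toy_fix_magic_only(), which is
the shape of failure the phase 7 message would be consistent with.
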
> libxfs_bwrite: write verifier failed on xfs_dir3_block bno 0xd3c50/0x8
> xfs_repair: Releasing dirty buffer to free list!
> xfs_repair: Refusing to write a corrupt buffer to the data device!
> xfs_repair: Lost a write to the data device!
>
> fatal error -- File system metadata writeout failed, err=117. Re-run xfs_repair.
>
>
> With the patch applied:
> $ xfs_repair xfs_metadump_hosting.dmp.image
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> Metadata CRC error detected at 0x46ce28, xfs_dir3_block block 0xd3c50/0x1000
> bad directory block magic # 0x16011664 in block 0 for directory inode 867467
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> - agno = 1
> - agno = 2
> - agno = 3
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> - agno = 2
> - agno = 3
> bad directory block magic # 0x16011664 in block 0 for directory inode 867467
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> Phase 5 - rebuild AG headers and trees...
> - reset superblock...
> Phase 6 - check inode connectivity...
> - resetting contents of realtime bitmap and summary inodes
> - traversing filesystem ...
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> Metadata CRC error detected at 0x46ce28, xfs_dir3_block block 0xd3c50/0x1000
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> bad directory block magic # 0x16011664 for directory inode 867467 block 0: fixing magic # to 0x58444233
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> rebuilding directory inode 867467
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> cache_node_put: node put on refcount 0 (node=0x7f46ac0c5610)
> cache_node_put: node put on node (0x7f46ac0c5610) in MRU list
> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify and correct link counts...
> done
>
> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
> ---
> repair/da_util.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/repair/da_util.c b/repair/da_util.c
> index 7f94f4012062..0a4785e6f69b 100644
> --- a/repair/da_util.c
> +++ b/repair/da_util.c
> @@ -66,6 +66,9 @@ da_read_buf(
>  	}
>  	libxfs_buf_read_map(mp->m_dev, map, nex, LIBXFS_READBUF_SALVAGE,
>  			&bp, ops);
> +	if (bp->b_error == -EFSBADCRC) {
> +		libxfs_buf_relse(bp);

This introduces a use-after-free on the buffer pointer: libxfs_buf_relse()
drops the reference, but da_read_buf still returns bp to its caller below.

--D

> +	}
>  	if (map != map_array)
>  		free(map);
>  	return bp;
> --
> 2.48.1
>
>
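
For readers following the use-after-free comment above, a minimal sketch
of the pointer-lifetime rule involved: once a reference has been released,
the function must not hand the same pointer back to its caller.  The names
here (struct toy_buf, toy_buf_read, toy_buf_relse, toy_read_checked,
TOY_EBADCRC) are all hypothetical stand-ins and not the libxfs buffer
cache API, and returning NULL is only one way to avoid the dangling
pointer.  It is not a claim about what the right fix for da_read_buf is.

```c
#include <stdlib.h>

/* Placeholder error number used only inside this sketch. */
#define TOY_EBADCRC	74

/* Hypothetical stand-in for a cached buffer; not struct xfs_buf. */
struct toy_buf {
	int	error;
	int	refcount;
};

/* Allocate a buffer holding one reference and a simulated read error. */
static struct toy_buf *
toy_buf_read(
	int	simulated_error)
{
	struct toy_buf	*bp = calloc(1, sizeof(*bp));

	if (!bp)
		return NULL;
	bp->error = simulated_error;
	bp->refcount = 1;
	return bp;
}

/* Drop one reference; the memory goes away with the last one. */
static void
toy_buf_relse(
	struct toy_buf	*bp)
{
	if (bp && --bp->refcount == 0)
		free(bp);
}

/*
 * Read a buffer and give up on it if the CRC was bad.  The pointer is
 * cleared after the release so the caller never sees freed memory.
 */
static struct toy_buf *
toy_read_checked(
	int	simulated_error)
{
	struct toy_buf	*bp = toy_buf_read(simulated_error);

	if (bp && bp->error == -TOY_EBADCRC) {
		toy_buf_relse(bp);
		bp = NULL;	/* do not return a released buffer */
	}
	return bp;
}
```

The quoted hunk corresponds to this function without the "bp = NULL"
line: the release happens, and the stale pointer is then returned.
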