* [PATCH] repair: fix wrong logic when validating node magic number
@ 2015-08-13 7:01 Eryu Guan
2015-08-13 7:15 ` Eryu Guan
0 siblings, 1 reply; 5+ messages in thread
From: Eryu Guan @ 2015-08-13 7:01 UTC (permalink / raw)
To: xfs; +Cc: Eryu Guan
Magic number is wrong only when != XFS_DA_NODE_MAGIC and
!= XFS_DA3_NODE_MAGIC.
This is triggered by shared/002 when testing 512 block size XFS.
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
bad magic number febe in block 64 (108) for directory inode 35
......
Fix it by changing "||" to "&&".
Signed-off-by: Eryu Guan <eguan@redhat.com>
---
repair/attr_repair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 62f80e7..83a07a8 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -570,7 +570,7 @@ verify_da_path(xfs_mount_t *mp,
* entry count, verify level
*/
bad = 0;
- if (nodehdr.magic != XFS_DA_NODE_MAGIC ||
+ if (nodehdr.magic != XFS_DA_NODE_MAGIC &&
nodehdr.magic != XFS_DA3_NODE_MAGIC) {
do_warn(
_("bad magic number %x in block %u (%" PRIu64 ") for directory inode %" PRIu64 "\n"),
--
2.4.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] repair: fix wrong logic when validating node magic number
2015-08-13 7:01 [PATCH] repair: fix wrong logic when validating node magic number Eryu Guan
@ 2015-08-13 7:15 ` Eryu Guan
2015-08-27 3:35 ` Zorro Lang
2015-09-01 14:04 ` Brian Foster
0 siblings, 2 replies; 5+ messages in thread
From: Eryu Guan @ 2015-08-13 7:15 UTC (permalink / raw)
To: xfs
On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> != XFS_DA3_NODE_MAGIC.
>
> This is triggered by shared/002 when testing 512 block size XFS.
>
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan (but don't clear) agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> bad magic number febe in block 64 (108) for directory inode 35
> ......
>
> Fix it by changing "||" to "&&".
>
> Signed-off-by: Eryu Guan <eguan@redhat.com>
With this patch applied, shared/002 still fails on 512 block size XFS,
full xfs_repair -n output is
*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
problem with attribute contents in inode 35
would clear attr fork
bad nblocks 67 for inode 35, would reset to 0
bad anextents 5 for inode 35, would reset to 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
*** end xfs_repair output
And a simplified reproducer is just adding >= 577 xattrs to file foo on
512 block size XFS, no dmflaky is needed.
num_xattrs=577
for ((i = 1; i <= $num_xattrs; i++)); do
name="user.attr_$(printf "%04d" $i)"
$SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
done
And it's easily reproduced.
Thanks,
Eryu
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] repair: fix wrong logic when validating node magic number
2015-08-13 7:15 ` Eryu Guan
@ 2015-08-27 3:35 ` Zorro Lang
2015-09-01 14:04 ` Brian Foster
1 sibling, 0 replies; 5+ messages in thread
From: Zorro Lang @ 2015-08-27 3:35 UTC (permalink / raw)
To: Eryu Guan; +Cc: xfs
2015-08-13 15:15 GMT+08:00 Eryu Guan <eguan@redhat.com>:
> On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
>> Magic number is wrong only when != XFS_DA_NODE_MAGIC and
>> != XFS_DA3_NODE_MAGIC.
>>
>> This is triggered by shared/002 when testing 512 block size XFS.
>>
>> Phase 1 - find and verify superblock...
>> Phase 2 - using internal log
>> - scan filesystem freespace and inode maps...
>> - found root inode chunk
>> Phase 3 - for each AG...
>> - scan (but don't clear) agi unlinked lists...
>> - process known inodes and perform inode discovery...
>> - agno = 0
>> bad magic number febe in block 64 (108) for directory inode 35
>> ......
>>
>> Fix it by changing "||" to "&&".
>>
>> Signed-off-by: Eryu Guan <eguan@redhat.com>
>
> With this patch applied, shared/002 still fails on 512 block size XFS,
This failure not only be reproduced on 512 block size XFS. When I increase
the stress of shared/002, this bug can be reproduced on any block size XFS.
For example, on my test machine, when I increase $num_attrs to 6000, this
bug be reproduced on 1k block size xfs. When increased $num_attrs to 80k,
this bug be reproduced on 4k block size xfs.
So this's not a block size related bug.
BTW, xfs_repair can repair this corruption. And from shared/002 output, we can
see shared/002 use getfattr to sure all xattrs haven been wrote in
device correctly.
So this corruption maybe due to xfs log problems.
Thanks,
Zorro Lang
> full xfs_repair -n output is
>
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan (but don't clear) agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> problem with attribute contents in inode 35
> would clear attr fork
> bad nblocks 67 for inode 35, would reset to 0
> bad anextents 5 for inode 35, would reset to 0
> - agno = 1
> - agno = 2
> - agno = 3
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> - agno = 2
> - agno = 3
> No modify flag set, skipping phase 5
> Phase 6 - check inode connectivity...
> - traversing filesystem ...
> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify link counts...
> No modify flag set, skipping filesystem flush and exiting.
> *** end xfs_repair output
>
> And a simplified reproducer is just adding >= 577 xattrs to file foo on
> 512 block size XFS, no dmflaky is needed.
>
> num_xattrs=577
> for ((i = 1; i <= $num_xattrs; i++)); do
> name="user.attr_$(printf "%04d" $i)"
> $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> done
>
> And it's easily reproduced.
>
> Thanks,
> Eryu
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] repair: fix wrong logic when validating node magic number
2015-08-13 7:15 ` Eryu Guan
2015-08-27 3:35 ` Zorro Lang
@ 2015-09-01 14:04 ` Brian Foster
2015-09-02 2:19 ` Eryu Guan
1 sibling, 1 reply; 5+ messages in thread
From: Brian Foster @ 2015-09-01 14:04 UTC (permalink / raw)
To: Eryu Guan; +Cc: xfs
On Thu, Aug 13, 2015 at 03:15:24PM +0800, Eryu Guan wrote:
> On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> > Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> > != XFS_DA3_NODE_MAGIC.
> >
> > This is triggered by shared/002 when testing 512 block size XFS.
> >
> > Phase 1 - find and verify superblock...
> > Phase 2 - using internal log
> > - scan filesystem freespace and inode maps...
> > - found root inode chunk
> > Phase 3 - for each AG...
> > - scan (but don't clear) agi unlinked lists...
> > - process known inodes and perform inode discovery...
> > - agno = 0
> > bad magic number febe in block 64 (108) for directory inode 35
> > ......
> >
> > Fix it by changing "||" to "&&".
> >
> > Signed-off-by: Eryu Guan <eguan@redhat.com>
>
> With this patch applied, shared/002 still fails on 512 block size XFS,
> full xfs_repair -n output is
>
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan (but don't clear) agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> problem with attribute contents in inode 35
> would clear attr fork
> bad nblocks 67 for inode 35, would reset to 0
> bad anextents 5 for inode 35, would reset to 0
> - agno = 1
> - agno = 2
> - agno = 3
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> - agno = 2
> - agno = 3
> No modify flag set, skipping phase 5
> Phase 6 - check inode connectivity...
> - traversing filesystem ...
> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify link counts...
> No modify flag set, skipping filesystem flush and exiting.
> *** end xfs_repair output
>
> And a simplified reproducer is just adding >= 577 xattrs to file foo on
> 512 block size XFS, no dmflaky is needed.
>
> num_xattrs=577
> for ((i = 1; i <= $num_xattrs; i++)); do
> name="user.attr_$(printf "%04d" $i)"
> $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> done
>
> And it's easily reproduced.
>
Thanks for the reproducer. This looks like a bug in xfs_repair. Care to
test the appended hunk?
Brian
---8<---
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 83a07a8..b76618a 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -562,7 +562,7 @@ verify_da_path(xfs_mount_t *mp,
}
newnode = (xfs_da_intnode_t *)XFS_BUF_PTR(bp);
- btree = M_DIROPS(mp)->node_tree_p(node);
+ btree = M_DIROPS(mp)->node_tree_p(newnode);
M_DIROPS(mp)->node_hdr_from_disk(&nodehdr, newnode);
/*
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] repair: fix wrong logic when validating node magic number
2015-09-01 14:04 ` Brian Foster
@ 2015-09-02 2:19 ` Eryu Guan
0 siblings, 0 replies; 5+ messages in thread
From: Eryu Guan @ 2015-09-02 2:19 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Sep 01, 2015 at 10:04:43AM -0400, Brian Foster wrote:
> On Thu, Aug 13, 2015 at 03:15:24PM +0800, Eryu Guan wrote:
> > On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> > > Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> > > != XFS_DA3_NODE_MAGIC.
> > >
> > > This is triggered by shared/002 when testing 512 block size XFS.
> > >
> > > Phase 1 - find and verify superblock...
> > > Phase 2 - using internal log
> > > - scan filesystem freespace and inode maps...
> > > - found root inode chunk
> > > Phase 3 - for each AG...
> > > - scan (but don't clear) agi unlinked lists...
> > > - process known inodes and perform inode discovery...
> > > - agno = 0
> > > bad magic number febe in block 64 (108) for directory inode 35
> > > ......
> > >
> > > Fix it by changing "||" to "&&".
> > >
> > > Signed-off-by: Eryu Guan <eguan@redhat.com>
> >
> > With this patch applied, shared/002 still fails on 512 block size XFS,
> > full xfs_repair -n output is
> >
> > *** xfs_repair -n output ***
> > Phase 1 - find and verify superblock...
> > Phase 2 - using internal log
> > - scan filesystem freespace and inode maps...
> > - found root inode chunk
> > Phase 3 - for each AG...
> > - scan (but don't clear) agi unlinked lists...
> > - process known inodes and perform inode discovery...
> > - agno = 0
> > problem with attribute contents in inode 35
> > would clear attr fork
> > bad nblocks 67 for inode 35, would reset to 0
> > bad anextents 5 for inode 35, would reset to 0
> > - agno = 1
> > - agno = 2
> > - agno = 3
> > - process newly discovered inodes...
> > Phase 4 - check for duplicate blocks...
> > - setting up duplicate extent list...
> > - check for inodes claiming duplicate blocks...
> > - agno = 0
> > - agno = 1
> > - agno = 2
> > - agno = 3
> > No modify flag set, skipping phase 5
> > Phase 6 - check inode connectivity...
> > - traversing filesystem ...
> > - traversal finished ...
> > - moving disconnected inodes to lost+found ...
> > Phase 7 - verify link counts...
> > No modify flag set, skipping filesystem flush and exiting.
> > *** end xfs_repair output
> >
> > And a simplified reproducer is just adding >= 577 xattrs to file foo on
> > 512 block size XFS, no dmflaky is needed.
> >
> > num_xattrs=577
> > for ((i = 1; i <= $num_xattrs; i++)); do
> > name="user.attr_$(printf "%04d" $i)"
> > $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> > done
> >
> > And it's easily reproduced.
> >
>
> Thanks for the reproducer. This looks like a bug in xfs_repair. Care to
> test the appended hunk?
Test passed after applying this patch, thanks! Tested by shared/002 and
the simplified reproducer above.
Thanks,
Eryu
>
> Brian
>
> ---8<---
>
> diff --git a/repair/attr_repair.c b/repair/attr_repair.c
> index 83a07a8..b76618a 100644
> --- a/repair/attr_repair.c
> +++ b/repair/attr_repair.c
> @@ -562,7 +562,7 @@ verify_da_path(xfs_mount_t *mp,
> }
>
> newnode = (xfs_da_intnode_t *)XFS_BUF_PTR(bp);
> - btree = M_DIROPS(mp)->node_tree_p(node);
> + btree = M_DIROPS(mp)->node_tree_p(newnode);
> M_DIROPS(mp)->node_hdr_from_disk(&nodehdr, newnode);
>
> /*
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-09-02 2:19 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-13 7:01 [PATCH] repair: fix wrong logic when validating node magic number Eryu Guan
2015-08-13 7:15 ` Eryu Guan
2015-08-27 3:35 ` Zorro Lang
2015-09-01 14:04 ` Brian Foster
2015-09-02 2:19 ` Eryu Guan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox