* [PATCH] repair: fix wrong logic when validating node magic number
From: Eryu Guan @ 2015-08-13  7:01 UTC
To: xfs; +Cc: Eryu Guan
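The node magic number is bad only when it is neither XFS_DA_NODE_MAGIC
nor XFS_DA3_NODE_MAGIC. With "||" the test is true for every value, so
valid nodes are reported as bad.

This is triggered by shared/002 when testing XFS with a 512-byte block size: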
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad magic number febe in block 64 (108) for directory inode 35
......
Fix it by changing "||" to "&&".
Signed-off-by: Eryu Guan <eguan@redhat.com>
---
repair/attr_repair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 62f80e7..83a07a8 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -570,7 +570,7 @@ verify_da_path(xfs_mount_t *mp,
 		 * entry count, verify level
 		 */
 		bad = 0;
-		if (nodehdr.magic != XFS_DA_NODE_MAGIC ||
+		if (nodehdr.magic != XFS_DA_NODE_MAGIC &&
 		    nodehdr.magic != XFS_DA3_NODE_MAGIC) {
 			do_warn(
 	_("bad magic number %x in block %u (%" PRIu64 ") for directory inode %" PRIu64 "\n"),
--
2.4.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
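
The reason the "||" form can never pass is easy to check in isolation. The
following is a minimal standalone sketch, not code from xfsprogs: the helper
names are hypothetical, and the two magic values are assumptions based on the
XFS on-disk format headers (0xfebe agrees with the "febe" in the warning
quoted in the patch description).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; check xfs_da_format.h before relying on them. */
#define XFS_DA_NODE_MAGIC	0xfebe	/* da btree node, v4 format */
#define XFS_DA3_NODE_MAGIC	0x3ebe	/* da btree node, v5 (CRC) format */

/*
 * Hypothetical helper mirroring the old check. No value can equal both
 * constants at once, so at least one "!=" is always true.
 */
static inline bool magic_is_bad_buggy(uint16_t magic)
{
	return magic != XFS_DA_NODE_MAGIC || magic != XFS_DA3_NODE_MAGIC;
}

/* Hypothetical helper mirroring the fix: bad only if neither matches. */
static inline bool magic_is_bad_fixed(uint16_t magic)
{
	return magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC;
}

/* Small self-check covering both valid magics and one invalid value. */
static inline void check_magic_examples(void)
{
	assert(magic_is_bad_buggy(XFS_DA_NODE_MAGIC));	/* false positive */
	assert(magic_is_bad_buggy(XFS_DA3_NODE_MAGIC));	/* false positive */
	assert(!magic_is_bad_fixed(XFS_DA_NODE_MAGIC));
	assert(!magic_is_bad_fixed(XFS_DA3_NODE_MAGIC));
	assert(magic_is_bad_fixed(0x1234));		/* really bad */
}
```

Calling check_magic_examples() from any main() exercises both variants. By
De Morgan's law, "matches neither A nor B" is !(x == A || x == B), which is
the same as x != A && x != B.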
* Re: [PATCH] repair: fix wrong logic when validating node magic number
From: Eryu Guan @ 2015-08-13  7:15 UTC
To: xfs

On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> != XFS_DA3_NODE_MAGIC.
>
> This is triggered by shared/002 when testing 512 block size XFS.
>
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
>         - scan filesystem freespace and inode maps...
>         - found root inode chunk
> Phase 3 - for each AG...
>         - scan (but don't clear) agi unlinked lists...
>         - process known inodes and perform inode discovery...
>         - agno = 0
> bad magic number febe in block 64 (108) for directory inode 35
> ......
>
> Fix it by changing "||" to "&&".
>
> Signed-off-by: Eryu Guan <eguan@redhat.com>

With this patch applied, shared/002 still fails on 512 block size XFS,
full xfs_repair -n output is

*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
problem with attribute contents in inode 35
would clear attr fork
bad nblocks 67 for inode 35, would reset to 0
bad anextents 5 for inode 35, would reset to 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
*** end xfs_repair output

And a simplified reproducer is just adding >= 577 xattrs to file foo on
512 block size XFS, no dmflaky is needed.

	num_xattrs=577
	for ((i = 1; i <= $num_xattrs; i++)); do
		name="user.attr_$(printf "%04d" $i)"
		$SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
	done

And it's easily reproduced.

Thanks,
Eryu
* Re: [PATCH] repair: fix wrong logic when validating node magic number
From: Zorro Lang @ 2015-08-27  3:35 UTC
To: Eryu Guan; +Cc: xfs

2015-08-13 15:15 GMT+08:00 Eryu Guan <eguan@redhat.com>:
> With this patch applied, shared/002 still fails on 512 block size XFS,

This failure is not limited to 512 block size XFS. When I increase the
stress of shared/002, the bug can be reproduced on XFS with any block
size.

For example, on my test machine, increasing $num_attrs to 6000 reproduces
it on 1k block size XFS, and increasing $num_attrs to 80k reproduces it on
4k block size XFS. So this is not a block-size-related bug.

BTW, xfs_repair can repair this corruption. And from the shared/002
output, we can see that shared/002 uses getfattr to make sure all xattrs
have been written to the device correctly. So this corruption may be due
to XFS log problems.

Thanks,
Zorro Lang
* Re: [PATCH] repair: fix wrong logic when validating node magic number
From: Brian Foster @ 2015-09-01 14:04 UTC
To: Eryu Guan; +Cc: xfs

On Thu, Aug 13, 2015 at 03:15:24PM +0800, Eryu Guan wrote:
> And a simplified reproducer is just adding >= 577 xattrs to file foo on
> 512 block size XFS, no dmflaky is needed.
>
> 	num_xattrs=577
> 	for ((i = 1; i <= $num_xattrs; i++)); do
> 		name="user.attr_$(printf "%04d" $i)"
> 		$SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> 	done
>
> And it's easily reproduced.
>

Thanks for the reproducer. This looks like a bug in xfs_repair. Care to
test the appended hunk?

Brian

---8<---

diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 83a07a8..b76618a 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -562,7 +562,7 @@ verify_da_path(xfs_mount_t *mp,
 		}
 
 		newnode = (xfs_da_intnode_t *)XFS_BUF_PTR(bp);
-		btree = M_DIROPS(mp)->node_tree_p(node);
+		btree = M_DIROPS(mp)->node_tree_p(newnode);
 		M_DIROPS(mp)->node_hdr_from_disk(&nodehdr, newnode);
 
 		/*
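
The one-word change in the hunk above matters because the header and the
entry array must describe the same block. The sketch below illustrates that
pattern only; it is not the xfsprogs code. The struct and every function name
are hypothetical stand-ins (toy_tree_p plays the role of node_tree_p), and the
"last hash" lookup is just an example of a read that goes through the entry
pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a da btree node. */
struct toy_node {
	uint16_t count;		/* number of valid entries */
	uint32_t hashval[4];	/* per-entry hash values */
};

/* Stand-in for node_tree_p(): return the entry array of a node. */
static inline const uint32_t *toy_tree_p(const struct toy_node *n)
{
	return n->hashval;
}

/* Buggy pattern: count is read from newnode, entries from the old node. */
static inline uint32_t last_hash_buggy(const struct toy_node *node,
				       const struct toy_node *newnode)
{
	const uint32_t *btree = toy_tree_p(node);	/* stale block */

	return btree[newnode->count - 1];
}

/* Fixed pattern: header and entries both come from newnode. */
static inline uint32_t last_hash_fixed(const struct toy_node *node,
				       const struct toy_node *newnode)
{
	const uint32_t *btree = toy_tree_p(newnode);

	(void)node;	/* the old block is no longer consulted */
	return btree[newnode->count - 1];
}

/* Self-check: the two variants disagree once the blocks differ. */
static inline void check_node_examples(void)
{
	const struct toy_node oldn = { 2, { 10, 20, 0, 0 } };
	const struct toy_node newn = { 3, { 30, 40, 50, 0 } };

	assert(last_hash_buggy(&oldn, &newn) == 0);	/* old block, slot 2 */
	assert(last_hash_fixed(&oldn, &newn) == 50);	/* new block, slot 2 */
}
```

In the toy case the buggy variant silently returns a value from the wrong
block. In a checker, a mismatch like that would plausibly surface as a
spurious consistency failure, which fits the "problem with attribute
contents" report earlier in the thread, though the thread itself does not
spell out the exact failing comparison.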
* Re: [PATCH] repair: fix wrong logic when validating node magic number
From: Eryu Guan @ 2015-09-02  2:19 UTC
To: Brian Foster; +Cc: xfs

On Tue, Sep 01, 2015 at 10:04:43AM -0400, Brian Foster wrote:
> Thanks for the reproducer. This looks like a bug in xfs_repair. Care to
> test the appended hunk?

Test passed after applying this patch, thanks! Tested by shared/002 and
the simplified reproducer above.

Thanks,
Eryu