public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] repair: fix wrong logic when validating node magic number
@ 2015-08-13  7:01 Eryu Guan
  2015-08-13  7:15 ` Eryu Guan
  0 siblings, 1 reply; 5+ messages in thread
From: Eryu Guan @ 2015-08-13  7:01 UTC (permalink / raw)
  To: xfs; +Cc: Eryu Guan

Magic number is wrong only when != XFS_DA_NODE_MAGIC and
!= XFS_DA3_NODE_MAGIC.

This is triggered by shared/002 when testing 512 block size XFS.

  Phase 1 - find and verify superblock...
  Phase 2 - using internal log
          - scan filesystem freespace and inode maps...
          - found root inode chunk
  Phase 3 - for each AG...
          - scan (but don't clear) agi unlinked lists...
          - process known inodes and perform inode discovery...
          - agno = 0
  bad magic number febe in block 64 (108) for directory inode 35
  ......

Fix it by changing "||" to "&&".

Signed-off-by: Eryu Guan <eguan@redhat.com>
---
 repair/attr_repair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 62f80e7..83a07a8 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -570,7 +570,7 @@ verify_da_path(xfs_mount_t	*mp,
 		 * entry count, verify level
 		 */
 		bad = 0;
-		if (nodehdr.magic != XFS_DA_NODE_MAGIC ||
+		if (nodehdr.magic != XFS_DA_NODE_MAGIC &&
 		    nodehdr.magic != XFS_DA3_NODE_MAGIC)  {
 			do_warn(
 	_("bad magic number %x in block %u (%" PRIu64 ") for directory inode %" PRIu64 "\n"),
-- 
2.4.3

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] repair: fix wrong logic when validating node magic number
  2015-08-13  7:01 [PATCH] repair: fix wrong logic when validating node magic number Eryu Guan
@ 2015-08-13  7:15 ` Eryu Guan
  2015-08-27  3:35   ` Zorro Lang
  2015-09-01 14:04   ` Brian Foster
  0 siblings, 2 replies; 5+ messages in thread
From: Eryu Guan @ 2015-08-13  7:15 UTC (permalink / raw)
  To: xfs

On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> != XFS_DA3_NODE_MAGIC.
> 
> This is triggered by shared/002 when testing 512 block size XFS.
> 
>   Phase 1 - find and verify superblock...
>   Phase 2 - using internal log
>           - scan filesystem freespace and inode maps...
>           - found root inode chunk
>   Phase 3 - for each AG...
>           - scan (but don't clear) agi unlinked lists...
>           - process known inodes and perform inode discovery...
>           - agno = 0
>   bad magic number febe in block 64 (108) for directory inode 35
>   ......
> 
> Fix it by changing "||" to "&&".
> 
> Signed-off-by: Eryu Guan <eguan@redhat.com>

With this patch applied, shared/002 still fails on 512 block size XFS,
full xfs_repair -n output is                                                                                                                  
                                                                                                                                              
*** xfs_repair -n output ***                                                                                                                  
Phase 1 - find and verify superblock...                                                                                                       
Phase 2 - using internal log                                                                                                                  
        - scan filesystem freespace and inode maps...                                                                                         
        - found root inode chunk                                                                                                              
Phase 3 - for each AG...                                                                                                                      
        - scan (but don't clear) agi unlinked lists...                                                                                        
        - process known inodes and perform inode discovery...                                                                                 
        - agno = 0                                                                                                                            
problem with attribute contents in inode 35                                                                                                   
would clear attr fork                                                                                                                         
bad nblocks 67 for inode 35, would reset to 0                                                                                                 
bad anextents 5 for inode 35, would reset to 0                                                                                                
        - agno = 1                                                                                                                            
        - agno = 2                                                                                                                            
        - agno = 3                                                                                                                            
        - process newly discovered inodes...                                                                                                  
Phase 4 - check for duplicate blocks...                                                                                                       
        - setting up duplicate extent list...                                                                                                 
        - check for inodes claiming duplicate blocks...                                                                                       
        - agno = 0                                                                                                                            
        - agno = 1                                                                                                                            
        - agno = 2                                                                                                                            
        - agno = 3                                                                                                                            
No modify flag set, skipping phase 5                                                                                                          
Phase 6 - check inode connectivity...                                                                                                         
        - traversing filesystem ...                                                                                                           
        - traversal finished ...                                                                                                              
        - moving disconnected inodes to lost+found ...                                                                                        
Phase 7 - verify link counts...                                                                                                               
No modify flag set, skipping filesystem flush and exiting.                                                                                    
*** end xfs_repair output

And a simplified reproducer is just adding >= 577 xattrs to file foo on
512 block size XFS, no dmflaky is needed.

num_xattrs=577
for ((i = 1; i <= $num_xattrs; i++)); do
        name="user.attr_$(printf "%04d" $i)"
        $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
done

And it's easily reproduced.

Thanks,
Eryu

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] repair: fix wrong logic when validating node magic number
  2015-08-13  7:15 ` Eryu Guan
@ 2015-08-27  3:35   ` Zorro Lang
  2015-09-01 14:04   ` Brian Foster
  1 sibling, 0 replies; 5+ messages in thread
From: Zorro Lang @ 2015-08-27  3:35 UTC (permalink / raw)
  To: Eryu Guan; +Cc: xfs

2015-08-13 15:15 GMT+08:00 Eryu Guan <eguan@redhat.com>:
> On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
>> Magic number is wrong only when != XFS_DA_NODE_MAGIC and
>> != XFS_DA3_NODE_MAGIC.
>>
>> This is triggered by shared/002 when testing 512 block size XFS.
>>
>>   Phase 1 - find and verify superblock...
>>   Phase 2 - using internal log
>>           - scan filesystem freespace and inode maps...
>>           - found root inode chunk
>>   Phase 3 - for each AG...
>>           - scan (but don't clear) agi unlinked lists...
>>           - process known inodes and perform inode discovery...
>>           - agno = 0
>>   bad magic number febe in block 64 (108) for directory inode 35
>>   ......
>>
>> Fix it by changing "||" to "&&".
>>
>> Signed-off-by: Eryu Guan <eguan@redhat.com>
>
> With this patch applied, shared/002 still fails on 512 block size XFS,


This failure not only be reproduced on 512 block size XFS. When I increase
the stress of shared/002, this bug can be reproduced on any block size XFS.

For example, on my test machine, when I increase $num_attrs to 6000, this
bug be reproduced on 1k block size xfs. When increased $num_attrs to 80k,
this bug be reproduced on 4k block size xfs.

So this's not a block size related bug.

BTW, xfs_repair can repair this corruption. And from shared/002 output, we can
see shared/002 use getfattr to sure all xattrs haven been wrote in
device correctly.
So this corruption maybe due to xfs log problems.

Thanks,
Zorro Lang

> full xfs_repair -n output is
>
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
>         - scan filesystem freespace and inode maps...
>         - found root inode chunk
> Phase 3 - for each AG...
>         - scan (but don't clear) agi unlinked lists...
>         - process known inodes and perform inode discovery...
>         - agno = 0
> problem with attribute contents in inode 35
> would clear attr fork
> bad nblocks 67 for inode 35, would reset to 0
> bad anextents 5 for inode 35, would reset to 0
>         - agno = 1
>         - agno = 2
>         - agno = 3
>         - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
>         - setting up duplicate extent list...
>         - check for inodes claiming duplicate blocks...
>         - agno = 0
>         - agno = 1
>         - agno = 2
>         - agno = 3
> No modify flag set, skipping phase 5
> Phase 6 - check inode connectivity...
>         - traversing filesystem ...
>         - traversal finished ...
>         - moving disconnected inodes to lost+found ...
> Phase 7 - verify link counts...
> No modify flag set, skipping filesystem flush and exiting.
> *** end xfs_repair output
>
> And a simplified reproducer is just adding >= 577 xattrs to file foo on
> 512 block size XFS, no dmflaky is needed.
>
> num_xattrs=577
> for ((i = 1; i <= $num_xattrs; i++)); do
>         name="user.attr_$(printf "%04d" $i)"
>         $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> done
>
> And it's easily reproduced.
>
> Thanks,
> Eryu
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] repair: fix wrong logic when validating node magic number
  2015-08-13  7:15 ` Eryu Guan
  2015-08-27  3:35   ` Zorro Lang
@ 2015-09-01 14:04   ` Brian Foster
  2015-09-02  2:19     ` Eryu Guan
  1 sibling, 1 reply; 5+ messages in thread
From: Brian Foster @ 2015-09-01 14:04 UTC (permalink / raw)
  To: Eryu Guan; +Cc: xfs

On Thu, Aug 13, 2015 at 03:15:24PM +0800, Eryu Guan wrote:
> On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> > Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> > != XFS_DA3_NODE_MAGIC.
> > 
> > This is triggered by shared/002 when testing 512 block size XFS.
> > 
> >   Phase 1 - find and verify superblock...
> >   Phase 2 - using internal log
> >           - scan filesystem freespace and inode maps...
> >           - found root inode chunk
> >   Phase 3 - for each AG...
> >           - scan (but don't clear) agi unlinked lists...
> >           - process known inodes and perform inode discovery...
> >           - agno = 0
> >   bad magic number febe in block 64 (108) for directory inode 35
> >   ......
> > 
> > Fix it by changing "||" to "&&".
> > 
> > Signed-off-by: Eryu Guan <eguan@redhat.com>
> 
> With this patch applied, shared/002 still fails on 512 block size XFS,
> full xfs_repair -n output is                                                                                                                  
>                                                                                                                                               
> *** xfs_repair -n output ***                                                                                                                  
> Phase 1 - find and verify superblock...                                                                                                       
> Phase 2 - using internal log                                                                                                                  
>         - scan filesystem freespace and inode maps...                                                                                         
>         - found root inode chunk                                                                                                              
> Phase 3 - for each AG...                                                                                                                      
>         - scan (but don't clear) agi unlinked lists...                                                                                        
>         - process known inodes and perform inode discovery...                                                                                 
>         - agno = 0                                                                                                                            
> problem with attribute contents in inode 35                                                                                                   
> would clear attr fork                                                                                                                         
> bad nblocks 67 for inode 35, would reset to 0                                                                                                 
> bad anextents 5 for inode 35, would reset to 0                                                                                                
>         - agno = 1                                                                                                                            
>         - agno = 2                                                                                                                            
>         - agno = 3                                                                                                                            
>         - process newly discovered inodes...                                                                                                  
> Phase 4 - check for duplicate blocks...                                                                                                       
>         - setting up duplicate extent list...                                                                                                 
>         - check for inodes claiming duplicate blocks...                                                                                       
>         - agno = 0                                                                                                                            
>         - agno = 1                                                                                                                            
>         - agno = 2                                                                                                                            
>         - agno = 3                                                                                                                            
> No modify flag set, skipping phase 5                                                                                                          
> Phase 6 - check inode connectivity...                                                                                                         
>         - traversing filesystem ...                                                                                                           
>         - traversal finished ...                                                                                                              
>         - moving disconnected inodes to lost+found ...                                                                                        
> Phase 7 - verify link counts...                                                                                                               
> No modify flag set, skipping filesystem flush and exiting.                                                                                    
> *** end xfs_repair output
> 
> And a simplified reproducer is just adding >= 577 xattrs to file foo on
> 512 block size XFS, no dmflaky is needed.
> 
> num_xattrs=577
> for ((i = 1; i <= $num_xattrs; i++)); do
>         name="user.attr_$(printf "%04d" $i)"
>         $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> done
> 
> And it's easily reproduced.
> 

Thanks for the reproducer. This looks like a bug in xfs_repair. Care to
test the appended hunk?

Brian

---8<---

diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 83a07a8..b76618a 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -562,7 +562,7 @@ verify_da_path(xfs_mount_t	*mp,
 		}
 
 		newnode = (xfs_da_intnode_t *)XFS_BUF_PTR(bp);
-		btree = M_DIROPS(mp)->node_tree_p(node);
+		btree = M_DIROPS(mp)->node_tree_p(newnode);
 		M_DIROPS(mp)->node_hdr_from_disk(&nodehdr, newnode);
 
 		/*

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] repair: fix wrong logic when validating node magic number
  2015-09-01 14:04   ` Brian Foster
@ 2015-09-02  2:19     ` Eryu Guan
  0 siblings, 0 replies; 5+ messages in thread
From: Eryu Guan @ 2015-09-02  2:19 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Tue, Sep 01, 2015 at 10:04:43AM -0400, Brian Foster wrote:
> On Thu, Aug 13, 2015 at 03:15:24PM +0800, Eryu Guan wrote:
> > On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
> > > Magic number is wrong only when != XFS_DA_NODE_MAGIC and
> > > != XFS_DA3_NODE_MAGIC.
> > > 
> > > This is triggered by shared/002 when testing 512 block size XFS.
> > > 
> > >   Phase 1 - find and verify superblock...
> > >   Phase 2 - using internal log
> > >           - scan filesystem freespace and inode maps...
> > >           - found root inode chunk
> > >   Phase 3 - for each AG...
> > >           - scan (but don't clear) agi unlinked lists...
> > >           - process known inodes and perform inode discovery...
> > >           - agno = 0
> > >   bad magic number febe in block 64 (108) for directory inode 35
> > >   ......
> > > 
> > > Fix it by changing "||" to "&&".
> > > 
> > > Signed-off-by: Eryu Guan <eguan@redhat.com>
> > 
> > With this patch applied, shared/002 still fails on 512 block size XFS,
> > full xfs_repair -n output is                                                                                                                  
> >                                                                                                                                               
> > *** xfs_repair -n output ***                                                                                                                  
> > Phase 1 - find and verify superblock...                                                                                                       
> > Phase 2 - using internal log                                                                                                                  
> >         - scan filesystem freespace and inode maps...                                                                                         
> >         - found root inode chunk                                                                                                              
> > Phase 3 - for each AG...                                                                                                                      
> >         - scan (but don't clear) agi unlinked lists...                                                                                        
> >         - process known inodes and perform inode discovery...                                                                                 
> >         - agno = 0                                                                                                                            
> > problem with attribute contents in inode 35                                                                                                   
> > would clear attr fork                                                                                                                         
> > bad nblocks 67 for inode 35, would reset to 0                                                                                                 
> > bad anextents 5 for inode 35, would reset to 0                                                                                                
> >         - agno = 1                                                                                                                            
> >         - agno = 2                                                                                                                            
> >         - agno = 3                                                                                                                            
> >         - process newly discovered inodes...                                                                                                  
> > Phase 4 - check for duplicate blocks...                                                                                                       
> >         - setting up duplicate extent list...                                                                                                 
> >         - check for inodes claiming duplicate blocks...                                                                                       
> >         - agno = 0                                                                                                                            
> >         - agno = 1                                                                                                                            
> >         - agno = 2                                                                                                                            
> >         - agno = 3                                                                                                                            
> > No modify flag set, skipping phase 5                                                                                                          
> > Phase 6 - check inode connectivity...                                                                                                         
> >         - traversing filesystem ...                                                                                                           
> >         - traversal finished ...                                                                                                              
> >         - moving disconnected inodes to lost+found ...                                                                                        
> > Phase 7 - verify link counts...                                                                                                               
> > No modify flag set, skipping filesystem flush and exiting.                                                                                    
> > *** end xfs_repair output
> > 
> > And a simplified reproducer is just adding >= 577 xattrs to file foo on
> > 512 block size XFS, no dmflaky is needed.
> > 
> > num_xattrs=577
> > for ((i = 1; i <= $num_xattrs; i++)); do
> >         name="user.attr_$(printf "%04d" $i)"
> >         $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> > done
> > 
> > And it's easily reproduced.
> > 
> 
> Thanks for the reproducer. This looks like a bug in xfs_repair. Care to
> test the appended hunk?

Test passed after applying this patch, thanks! Tested by shared/002 and
the simplified reproducer above.

Thanks,
Eryu

> 
> Brian
> 
> ---8<---
> 
> diff --git a/repair/attr_repair.c b/repair/attr_repair.c
> index 83a07a8..b76618a 100644
> --- a/repair/attr_repair.c
> +++ b/repair/attr_repair.c
> @@ -562,7 +562,7 @@ verify_da_path(xfs_mount_t	*mp,
>  		}
>  
>  		newnode = (xfs_da_intnode_t *)XFS_BUF_PTR(bp);
> -		btree = M_DIROPS(mp)->node_tree_p(node);
> +		btree = M_DIROPS(mp)->node_tree_p(newnode);
>  		M_DIROPS(mp)->node_hdr_from_disk(&nodehdr, newnode);
>  
>  		/*

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-09-02  2:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-13  7:01 [PATCH] repair: fix wrong logic when validating node magic number Eryu Guan
2015-08-13  7:15 ` Eryu Guan
2015-08-27  3:35   ` Zorro Lang
2015-09-01 14:04   ` Brian Foster
2015-09-02  2:19     ` Eryu Guan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox