linux-ext4.vger.kernel.org archive mirror
* Rare xfsqa test failure
@ 2009-08-18 11:57 Theodore Ts'o
  2009-08-18 14:56 ` Theodore Tso
  2009-08-18 17:07 ` Andreas Dilger
  0 siblings, 2 replies; 7+ messages in thread
From: Theodore Ts'o @ 2009-08-18 11:57 UTC (permalink / raw)
  To: linux-ext4


As a heads up, I'm seeing a rare xfsqa test failure with the stable
portion of the ext4 patch queue; it doesn't hit all the time, but when
it does, i_size is corrupted:

Inode 22047, i_size is 922788, should be 942080.  Fix?

922788/4096 is 225 plus a fraction, while 942080/4096 is 230.  The
debugfs information is as follows:

debugfs:  stat <22047>
Inode: 22047   Type: regular    Mode:  0666   Flags: 0x80000
Generation: 3536075281    Version: 0x00000000:00000001
User:     0   Group:     0   Size: 922788
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 1320
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x4a8a953b:546bc3d4 -- Tue Aug 18 07:49:15 2009
 atime: 0x4a8a953b:29927210 -- Tue Aug 18 07:49:15 2009
 mtime: 0x4a8a953b:546bc3d4 -- Tue Aug 18 07:49:15 2009
crtime: 0x4a8a951c:5ac789dc -- Tue Aug 18 07:48:44 2009
Size of extra inode fields: 28
EXTENTS:
(65-80): 60720-60735, (81-222 [uninit]): 1181574-1181715, (223-229): 1181716-1181722
debugfs:  

So it looks like there's a race which can cause ext4 to somehow miss an
i_size update.

						- Ted
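
The block arithmetic in the message above can be checked with a few lines of shell. This is only a sketch using the numbers quoted from e2fsck and debugfs; the 4096-byte block size is taken from the division in the message, and reading "Blockcount" as 512-byte units is an assumption.

```shell
#!/bin/sh
# Values copied from the e2fsck and debugfs output quoted above.
block_size=4096    # assumed filesystem block size (the message divides by 4096)
i_size=922788      # size recorded in the inode
fsck_size=942080   # size e2fsck says the inode should have
blockcount=1320    # "Blockcount" from debugfs, assumed to be in 512-byte units

# Number of blocks needed to hold a given byte count, rounding up.
blocks_for() {
    echo $(( ($1 + block_size - 1) / block_size ))
}

echo "i_size spans $(blocks_for "$i_size") blocks; the last holds $(( i_size % block_size )) bytes"
echo "e2fsck size spans $(blocks_for "$fsck_size") blocks"
echo "allocated blocks: $(( blockcount * 512 / block_size ))"
```

Under those assumptions the allocated block count comes out to the same number of blocks as the three extents listed by debugfs (16 + 142 + 7), so the on-disk allocation is self-consistent and only the recorded size looks short.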



* Re: Rare xfsqa test failure
  2009-08-18 11:57 Rare xfsqa test failure Theodore Ts'o
@ 2009-08-18 14:56 ` Theodore Tso
  2009-08-18 17:07 ` Andreas Dilger
  1 sibling, 0 replies; 7+ messages in thread
From: Theodore Tso @ 2009-08-18 14:56 UTC (permalink / raw)
  To: linux-ext4

On Tue, Aug 18, 2009 at 07:57:42AM -0400, Theodore Ts'o wrote:
> 
> As a heads up, I'm seeing a rare xfsqa test failure with the stable
> portion of the ext4 patch queue; it doesn't hit all the time, but when
> it does, i_size is corrupted:

The problem shows up with stock 2.6.31-rc4.

I just noticed I'm running with a patch to xfsqa that I had forgotten
to push upstream.  xfsqa's check_generic_filesystem doesn't force a
filesystem check, so people running the tests wouldn't have noticed
the problem.

Here's the patch; I'll send it to the XFS folks.

						- Ted

commit fbbeb08db507e26f61f44451ce52f9bac24cd8fa
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Tue Aug 18 10:51:37 2009 -0400

    Add ext2/3/4-specific _check_extN_filesystem function
    
    The _check_generic_filesystem function doesn't force a full filesystem
    check, so filesystem inconsistencies after a test wouldn't be noticed.
    To fix this, I added an extN specific check filesystem function.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/common.rc b/common.rc
index 82b0d51..da5f99e 100644
--- a/common.rc
+++ b/common.rc
@@ -865,6 +865,52 @@ _check_generic_filesystem()
     return 0
 }
 
+# Check an ext2/3/4 filesystem
+#
+_check_extN_filesystem()
+{
+    device=$1
+
+    # If type is set, we're mounted
+    type=`_fs_type $device`
+    ok=1
+
+    if [ "$type" = "$FSTYP" ]
+    then
+        # mounted ...
+        mountpoint=`_umount_or_remount_ro $device`
+    fi
+
+    e2fsck -nf $device >$tmp.fsck 2>&1
+    if [ $? -ne 0 ]
+    then
+        echo "_check_extN_filesystem: filesystem on $device is inconsistent (see $seq.full)"
+
+        echo "_check_extN_filesystem: filesystem on $device is inconsistent" >>$here/$seq.full
+        echo "*** e2fsck output ***"                          >>$here/$seq.full
+        cat $tmp.fsck                                         >>$here/$seq.full
+        echo "*** end e2fsck output"                          >>$here/$seq.full
+
+        ok=0
+    fi
+    rm -f $tmp.fsck
+
+    if [ $ok -eq 0 ]
+    then
+        echo "*** mount output ***"                           >>$here/$seq.full
+        _mount                                                >>$here/$seq.full
+        echo "*** end mount output"                           >>$here/$seq.full
+    elif [ "$type" = "$FSTYP" ]
+    then
+	# was mounted ...
+	_mount_or_remount_rw "$MOUNT_OPTIONS" $device $mountpoint
+	ok=$?
+    fi
+
+    [ $ok -eq 0 ] && exit 1
+    return 0
+}
+
 # run xfs_check and friends on a FS.
 
 _check_xfs_filesystem()
@@ -1033,6 +1079,9 @@ _check_test_fs()
     udf)
 	# do nothing for now
 	;;
+    ext2|ext3|ext4)
+	_check_extN_filesystem $TEST_DEV
+	;;
     *)
 	_check_generic_filesystem $TEST_DEV
 	;;
@@ -1059,6 +1108,9 @@ _check_scratch_fs()
     nfs*)
 	# Don't know how to check an NFS filesystem, yet.
 	;;
+    ext2|ext3|ext4)
+	_check_extN_filesystem $SCRATCH_DEV
+	;;
     *)
 	_check_generic_filesystem $SCRATCH_DEV
 	;;
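
The patch above treats any non-zero exit status from `e2fsck -nf` as a failure. As a side note, e2fsck's exit status is a bitmask documented in e2fsck(8), so a test harness could report what went wrong in more detail. The helper below is a hypothetical sketch and is not part of the patch or of xfsqa; the name `describe_e2fsck_status` is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of xfsqa): decode an e2fsck exit status,
# which is a bitwise OR of the flags listed in e2fsck(8).
describe_e2fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean"
        return 0
    fi
    out=""
    if [ $(( status & 1 )) -ne 0 ];   then out="$out errors-corrected"; fi
    if [ $(( status & 2 )) -ne 0 ];   then out="$out reboot-needed"; fi
    if [ $(( status & 4 )) -ne 0 ];   then out="$out errors-uncorrected"; fi
    if [ $(( status & 8 )) -ne 0 ];   then out="$out operational-error"; fi
    if [ $(( status & 16 )) -ne 0 ];  then out="$out usage-error"; fi
    if [ $(( status & 32 )) -ne 0 ];  then out="$out cancelled"; fi
    if [ $(( status & 128 )) -ne 0 ]; then out="$out shared-library-error"; fi
    # Strip the leading space left by the first append.
    echo "${out# }"
}

# Possible use after a forced read-only check (device name is a placeholder):
#   e2fsck -nf "$device" >"$tmp.fsck" 2>&1
#   describe_e2fsck_status $?
```

Since `-n` opens the filesystem read-only and answers "no" to every prompt, an inconsistent filesystem would be expected to show up with the "errors-uncorrected" bit set rather than "errors-corrected".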


* Re: Rare xfsqa test failure
  2009-08-18 11:57 Rare xfsqa test failure Theodore Ts'o
  2009-08-18 14:56 ` Theodore Tso
@ 2009-08-18 17:07 ` Andreas Dilger
  2009-08-18 21:42   ` Theodore Tso
  1 sibling, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2009-08-18 17:07 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

On Aug 18, 2009  07:57 -0400, Theodore Ts'o wrote:
> As a heads up, I'm seeing a rare xfsqa test failure with the stable
> portion of the ext4 patch queue; it doesn't hit all the time, but when
> it does, i_size is corrupted:
> 
> Inode 22047, i_size is 922788, should be 942080.  Fix?
> 
> 922788/4096 is 225 plus a fraction, while 942080/4096 is 230.  The
> debugfs information is as follows:
> 
> debugfs:  stat <22047>
> Inode: 22047   Type: regular    Mode:  0666   Flags: 0x80000
> Generation: 3536075281    Version: 0x00000000:00000001
> User:     0   Group:     0   Size: 922788
> File ACL: 0    Directory ACL: 0
> Links: 1   Blockcount: 1320
> Fragment:  Address: 0    Number: 0    Size: 0
>  ctime: 0x4a8a953b:546bc3d4 -- Tue Aug 18 07:49:15 2009
>  atime: 0x4a8a953b:29927210 -- Tue Aug 18 07:49:15 2009
>  mtime: 0x4a8a953b:546bc3d4 -- Tue Aug 18 07:49:15 2009
> crtime: 0x4a8a951c:5ac789dc -- Tue Aug 18 07:48:44 2009
> Size of extra inode fields: 28
> EXTENTS:
> (65-80): 60720-60735, (81-222 [uninit]): 1181574-1181715, (223-229): 1181716-1181722
> debugfs:  
> 
> So it looks like there's a race which can cause ext4 to somehow miss an
> i_size update.

Are you sure it is a failure to update i_size, or is it possibly an
fallocate that extends the block count beyond i_size?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



* Re: Rare xfsqa test failure
  2009-08-18 17:07 ` Andreas Dilger
@ 2009-08-18 21:42   ` Theodore Tso
  2009-08-19 15:28     ` Eric Sandeen
  0 siblings, 1 reply; 7+ messages in thread
From: Theodore Tso @ 2009-08-18 21:42 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4

On Tue, Aug 18, 2009 at 11:07:05AM -0600, Andreas Dilger wrote:
> > EXTENTS:
> > (65-80): 60720-60735, (81-222 [uninit]): 1181574-1181715, (223-229): 1181716-1181722
> > debugfs:  
> > 
> > So it looks like there's a race which can cause ext4 to somehow miss an
> > i_size update.
> 
> Are you sure it is a failure to update i_size, or is it possibly an
> fallocate that extends the block count beyond i_size?

Look at the EXTENTS report from debugfs; blocks 81-222 are
uninitialized from an fallocate, but blocks 223-229 are initialized.

	      	      		     	   - Ted
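
The argument above is that an initialized extent lies past the recorded i_size, so the size cannot be explained by preallocation alone. A rough sketch of that reasoning in shell follows. It only parses the extent list as debugfs printed it in this thread, assumes a 4096-byte block size, and is not e2fsck's actual logic; the function name `last_init_block` is made up for illustration.

```shell
#!/bin/sh
# Extent list as printed by debugfs in this thread, joined onto one line.
extents='(65-80): 60720-60735, (81-222 [uninit]): 1181574-1181715, (223-229): 1181716-1181722'
block_size=4096   # assumed, matching the division in the original report

# Print the highest logical block covered by an initialized extent.
last_init_block() {
    printf '%s\n' "$1" | tr ',' '\n' | awk '
        !/uninit/ {
            # Extract the logical range "(start-end)" or "(block)".
            if (match($0, /\([0-9]+(-[0-9]+)?/)) {
                range = substr($0, RSTART + 1, RLENGTH - 1)
                n = split(range, r, "-")
                if (r[n] + 0 > last) last = r[n] + 0
            }
        }
        END { print last + 0 }'
}

last=$(last_init_block "$extents")
echo "last initialized logical block: $last"
echo "size rounded up to the end of that block: $(( (last + 1) * block_size ))"
```

With the extents from this report, the rounded-up size equals the value e2fsck proposed, which is consistent with e2fsck deriving its suggestion from the last initialized block rather than from the preallocated range.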


* Re: Rare xfsqa test failure
  2009-08-18 21:42   ` Theodore Tso
@ 2009-08-19 15:28     ` Eric Sandeen
  2009-08-19 16:40       ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2009-08-19 15:28 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Andreas Dilger, linux-ext4

Theodore Tso wrote:
> On Tue, Aug 18, 2009 at 11:07:05AM -0600, Andreas Dilger wrote:
>>> EXTENTS:
>>> (65-80): 60720-60735, (81-222 [uninit]): 1181574-1181715, (223-229): 1181716-1181722
>>> debugfs:  
>>>
>>> So it looks like there's a race which can cause ext4 to somehow miss an
>>> i_size update.
>> Are you sure it is a failure to update i_size, or is it possibly an
>> fallocate that extends the block count beyond i_size?
> 
> Look at the EXTENTS report from debugfs; blocks 81-222 are
> uninitialized from an fallocate, but blocks 223-229 are initialized.
> 
> 	      	      		     	   - Ted

This was from test 013?

If so, that calls ltp's fsstress, which does not call fallocate nor
posix_fallocate.  It only does preallocation on xfs via the old
xfs-specific ioctl (though I suppose we should add it...)

-Eric


* Re: Rare xfsqa test failure
  2009-08-19 15:28     ` Eric Sandeen
@ 2009-08-19 16:40       ` Christoph Hellwig
  2009-08-19 16:51         ` Eric Sandeen
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2009-08-19 16:40 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Theodore Tso, Andreas Dilger, linux-ext4

On Wed, Aug 19, 2009 at 10:28:04AM -0500, Eric Sandeen wrote:
> If so, that calls ltp's fsstress, which does not call fallocate nor
> posix_fallocate.  It only does preallocation on xfs via the old
> xfs-specific ioctl (though I suppose we should add it...)

Which in modern kernels is implemented in common code and gets routed to
->fallocate.



* Re: Rare xfsqa test failure
  2009-08-19 16:40       ` Christoph Hellwig
@ 2009-08-19 16:51         ` Eric Sandeen
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2009-08-19 16:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Theodore Tso, Andreas Dilger, linux-ext4

Christoph Hellwig wrote:
> On Wed, Aug 19, 2009 at 10:28:04AM -0500, Eric Sandeen wrote:
>> If so, that calls ltp's fsstress, which does not call fallocate nor
>> posix_fallocate.  It only does preallocation on xfs via the old
>> xfs-specific ioctl (though I suppose we should add it...)
> 
> Which in modern kernels is implemented in common code and gets routed to
> ->fallocate.

Oh right....

I was thinking it didn't try any xfs ioctls on non-xfs filesystems but I
guess that's not the case.

-Eric
