public inbox for linux-kernel@vger.kernel.org
* 2.6.0-test5-mm3 & XFS FS Corruption
@ 2003-09-21 15:47 Walt H
  2003-09-21 18:08 ` 2.6.0-test5-mm3 & XFS FS Corruption (or not?) Walt H
  0 siblings, 1 reply; 6+ messages in thread
From: Walt H @ 2003-09-21 15:47 UTC (permalink / raw)
  To: linux-kernel, Linux XFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 1329 bytes --]

Hello,

Just recently upgraded to 2.6.0-test5-mm3 and am experiencing some
corruption issues on an XFS filesystem. I've been tracking the mm series
through the 2.6.0-test kernels and this is the first I've seen.

I use rsync to back up a volume on a software raid device to another raid
set created from device mapper. The target is a 2 drive setup connected
to a PDC20276 which is defined as a raid0 in the Promise BIOS. There's
no ata-raid in 2.6 yet, so I use dm which works fine.

An rsync from the software raid to the Promise set results in FS
corruption on an XFS filesystem. The rsync will usually complete, but I
am unable to use the filesystem afterward. For example, copying a backup
fstab over the existing one fails with "Operation not permitted"
(strace output attached).

If I umount the filesystem and run xfs_repair on it, it will proceed
till the end where it rebuilds directory inode 256 - this is the only
item needing repair. This happens every time. I can then mount it and
copy my file with no problem. This corruption is consistent and so far
has happened each time I use rsync to back up since going to -mm3. I've
tried patching in a CVS pull of the xfs filesystem from 9/20/2003 and
have the same results.

Any ideas? Let me know if you need more information or would like me to
try something. Thanks,

-Walt

[-- Attachment #2: fstab-strace.txt --]
[-- Type: text/plain, Size: 3849 bytes --]

execve("/bin/cp", ["cp", "fstab.backup", "fstab"], [/* 41 vars */]) = 0
uname({sysname="Linux", nodename="waltsathlon.localhost.net", release="2.6.0-test5-mm3", version="#2 SMP Fri Sep 19 19:34:53 PDT 2003", machine="i686"}) = 0
brk(0)                                  = 0x8056000
open("/etc/ld.so.preload", O_RDONLY)    = 3
fstat64(3, {st_dev=makedev(9, 4), st_ino=1711, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=131072, st_blocks=0, st_size=0, st_atime=2003/09/19-20:20:20, st_mtime=2003/09/02-20:05:15, st_ctime=2003/09/02-20:05:15}) = 0
close(3)                                = 0
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_dev=makedev(9, 4), st_ino=606953, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=131072, st_blocks=216, st_size=107854, st_atime=2003/09/20-21:38:16, st_mtime=2003/09/20-21:06:32, st_ctime=2003/09/20-21:06:32}) = 0
mmap2(NULL, 107854, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40000000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\0305A"..., 1024) = 1024
fstat64(3, {st_dev=makedev(9, 4), st_ino=686802, st_mode=S_IFREG|0755, st_nlink=1, st_uid=0, st_gid=0, st_blksize=131072, st_blocks=2840, st_size=1452573, st_atime=2003/09/20-21:38:16, st_mtime=2003/08/08-20:07:48, st_ctime=2003/09/12-07:10:22}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001b000
mmap2(0x4133c000, 1215204, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4133c000
mprotect(0x4145f000, 23268, PROT_NONE)  = 0
mmap2(0x4145f000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x122) = 0x4145f000
mmap2(0x41463000, 6884, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x41463000
close(3)                                = 0
munmap(0x40000000, 107854)              = 0
brk(0)                                  = 0x8056000
brk(0x8057000)                          = 0x8057000
brk(0)                                  = 0x8057000
geteuid32()                             = 0
umask(0)                                = 022
lstat64("fstab", {st_dev=makedev(254, 4), st_ino=12583479, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=838, st_atime=2003/09/20-21:33:54, st_mtime=2003/09/19-20:11:19, st_ctime=2003/09/20-21:33:54}) = 0
stat64("fstab", {st_dev=makedev(254, 4), st_ino=12583479, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=838, st_atime=2003/09/20-21:33:54, st_mtime=2003/09/19-20:11:19, st_ctime=2003/09/20-21:33:54}) = 0
stat64("fstab.backup", {st_dev=makedev(254, 4), st_ino=12583203, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=697, st_atime=2003/09/20-08:32:58, st_mtime=2003/07/14-16:29:28, st_ctime=2003/07/15-17:38:17}) = 0
stat64("fstab", {st_dev=makedev(254, 4), st_ino=12583479, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=838, st_atime=2003/09/20-21:33:54, st_mtime=2003/09/19-20:11:19, st_ctime=2003/09/20-21:33:54}) = 0
open("fstab.backup", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_dev=makedev(254, 4), st_ino=12583203, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=697, st_atime=2003/09/20-08:32:58, st_mtime=2003/07/14-16:29:28, st_ctime=2003/07/15-17:38:17}) = 0
open("fstab", O_WRONLY|O_TRUNC|O_LARGEFILE) = -1 EPERM (Operation not permitted)
write(2, "cp: ", 4cp: )                     = 4
write(2, "cannot create regular file `fsta"..., 34cannot create regular file `fstab') = 34
write(2, ": Operation not permitted", 25: Operation not permitted) = 25
write(2, "\n", 1
)                       = 1
close(3)                                = 0
_exit(1)                                = ?
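
The failing step in the trace above is the second open(): the process
runs with euid 0 and the target has mode 0644, so ordinary permission
bits would not normally produce this error. A minimal sketch for
retrying just that step outside of cp follows. The helper name
try_overwrite is hypothetical, and O_LARGEFILE is left out for
portability, so this only approximates what cp did.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Mirror the two open() calls cp makes in the trace.
 * Returns 0 if both opens succeed, otherwise the errno of the call
 * that failed (for example EPERM, as seen in the trace). */
int try_overwrite(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return errno;

    /* No O_CREAT here: as in the trace, the target already exists
     * and is only opened for writing and truncated. */
    int out = open(dst, O_WRONLY | O_TRUNC);
    int err = (out < 0) ? errno : 0;   /* save errno before close() */

    if (out >= 0)
        close(out);
    close(in);
    return err;
}
```

Calling try_overwrite("fstab.backup", "fstab") in the affected directory
and printing strerror() of a nonzero result would show whether the open
itself still fails, independent of anything else cp does.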

* Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)
  2003-09-21 15:47 2.6.0-test5-mm3 & XFS FS Corruption Walt H
@ 2003-09-21 18:08 ` Walt H
  2003-09-21 19:48   ` Steve Lord
  0 siblings, 1 reply; 6+ messages in thread
From: Walt H @ 2003-09-21 18:08 UTC (permalink / raw)
  To: Walt H; +Cc: linux-kernel, Linux XFS Mailing List

Just a follow-up to my earlier post:

I've put the xfs code from mm2 into the mm3 tree and all files get
copied and I can manually copy the fstab.backup file afterward. I
realized that the "rebuilding directory inode 256" was the lost+found
directory, which contained 4 old zero length files. That was the key.
XFS under -mm2 doesn't care about old lost+found directories, while -mm3
does. If I remove the source lost+found/ and retry the rsync with -mm3,
it finishes fine and I can copy fstab files. Adding a bogus lost+found
dir with any file in it at the source, and retrying the rsync will lead
to a state where I can't overwrite the existing /etc/fstab file at the
end. So it doesn't look like there's actually any filesystem corruption,
just a strange bug. Hope that helps,

-Walt



* Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)
  2003-09-21 18:08 ` 2.6.0-test5-mm3 & XFS FS Corruption (or not?) Walt H
@ 2003-09-21 19:48   ` Steve Lord
  2003-09-22  1:01     ` Walt H
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Lord @ 2003-09-21 19:48 UTC (permalink / raw)
  To: Walt H; +Cc: Linux Kernel, Linux XFS Mailing List

On Sun, 2003-09-21 at 13:08, Walt H wrote:
> Just a follow-up to my earlier post:
> 
> I've put the xfs code from mm2 into the mm3 tree and all files get
> copied and I can manually copy the fstab.backup file afterward. I
> realized that the "rebuilding directory inode 256" was the lost+found
> directory, which contained 4 old zero length files. That was the key.
> XFS under -mm2 doesn't care about old lost+found directories, while -mm3
> does. If I remove the source lost+found/ and retry the rsync with -mm3,
> it finishes fine and I can copy fstab files. Adding a bogus lost+found
> dir with any file in it at the source, and retrying the rsync will lead
> to a state where I can't overwrite the existing /etc/fstab file at the
> end. So it doesn't look like there's actually any filesystem corruption,
> just a strange bug. Hope that helps,
> 
> -Walt
> 

If I am correct, test5-mm3 contains a bad version of the xfs code: there
was a bug where the i_flags field was set up from an uninitialized stack
variable. mm3 came out during the two days this was in Linus's tree.
I had some very odd behavior with this code base: rm -r -f would try to
cd into files and other bizarre things, and files could appear to be
immutable or append-only or other things they were not. This sounds
similar to the behavior that you saw. It is fixed in the latest code
Linus has.

Steve






* Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)
  2003-09-21 19:48   ` Steve Lord
@ 2003-09-22  1:01     ` Walt H
  2003-09-22  1:12       ` Nathan Scott
  0 siblings, 1 reply; 6+ messages in thread
From: Walt H @ 2003-09-22  1:01 UTC (permalink / raw)
  To: Steve Lord; +Cc: Linux Kernel, Linux XFS Mailing List

Steve Lord wrote:

> 
> If I am correct, test5-mm3 contains a bad version of the xfs code: there
> was a bug where the i_flags field was set up from an uninitialized stack
> variable. mm3 came out during the two days this was in Linus's tree.
> I had some very odd behavior with this code base: rm -r -f would try to
> cd into files and other bizarre things, and files could appear to be
> immutable or append-only or other things they were not. This sounds
> similar to the behavior that you saw. It is fixed in the latest code
> Linus has.
> 
> Steve

Thanks for the reply, Steve. I'm guessing that this code hasn't hit CVS
yet, as I can still reproduce it with a current CVS @ 9/21/03 ~ 17:30
PST. Sounds like this is a known issue, so I'll just go back to the xfs
code from -mm2 for now.

-Walt





* Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)
  2003-09-22  1:01     ` Walt H
@ 2003-09-22  1:12       ` Nathan Scott
  2003-09-22  1:28         ` Walt H
  0 siblings, 1 reply; 6+ messages in thread
From: Nathan Scott @ 2003-09-22  1:12 UTC (permalink / raw)
  To: Walt H; +Cc: Linux Kernel, Linux XFS Mailing List

On Sun, Sep 21, 2003 at 06:01:06PM -0700, Walt H wrote:
> Steve Lord wrote:
> > 
> > If I am correct, test5-mm3 contains a bad version of the xfs code: there
> > was a bug where the i_flags field was set up from an uninitialized stack
> > variable. mm3 came out during the two days this was in Linus's tree.
> > I had some very odd behavior with this code base: rm -r -f would try to
> > cd into files and other bizarre things, and files could appear to be
> > immutable or append-only or other things they were not. This sounds
> > similar to the behavior that you saw. It is fixed in the latest code
> > Linus has.
> 
> Thanks for the reply, Steve. I'm guessing that this code hasn't hit CVS
> yet, as I can still reproduce it with a current CVS @ 9/21/03 ~ 17:30
> PST. Sounds like this is a known issue, so I'll just go back to the xfs
> code from -mm2 for now.
> 

The fix is below. I'd be interested in whether or not you still have
problems after applying this.

thanks.

-- 
Nathan


--- /usr/tmp/TmpDir.2990917-0/linux/fs/xfs/linux/xfs_vnode.c_1.117	Mon Sep 22 11:10:21 2003
+++ linux/fs/xfs/linux/xfs_vnode.c	Fri Sep 19 13:17:14 2003
@@ -200,7 +200,7 @@
 	vn_trace_entry(vp, "vn_revalidate", (inst_t *)__return_address);
 	ASSERT(vp->v_fbhv != NULL);
 
-	va.va_mask = XFS_AT_STAT;
+	va.va_mask = XFS_AT_STAT|XFS_AT_GENCOUNT;
 	VOP_GETATTR(vp, &va, 0, NULL, error);
 	if (!error) {
 		inode = LINVFS_GET_IP(vp);
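
The one-line change above widens the request mask passed to
VOP_GETATTR. One plausible reading, consistent with the uninitialized
stack variable described earlier in the thread, is that the getattr
call fills in only the fields named in the mask, so a caller that reads
a field it never requested sees leftover stack contents. The sketch
below illustrates that general pattern only. Every name in it (struct
attrs, AT_SIZE, AT_FLAGS, FLAG_IMMUTABLE, get_attrs, looks_immutable)
is hypothetical and none of it is taken from the XFS source.

```c
#include <string.h>

/* Hypothetical names throughout; this is not the XFS code. */
#define AT_SIZE        0x1u
#define AT_FLAGS       0x2u
#define FLAG_IMMUTABLE 0x1u

struct attrs {
    unsigned mask;   /* in: which fields the caller wants filled */
    long     size;   /* out: valid only if AT_SIZE was requested */
    unsigned flags;  /* out: valid only if AT_FLAGS was requested */
};

/* Fills only the fields named in a->mask, like a mask-driven getattr. */
void get_attrs(struct attrs *a)
{
    if (a->mask & AT_SIZE)
        a->size = 838;
    if (a->mask & AT_FLAGS)
        a->flags = 0;          /* the file is not really immutable */
}

/* Returns nonzero if a caller using this mask would conclude that the
 * file is immutable. */
int looks_immutable(unsigned mask)
{
    struct attrs a;

    /* Stand-in for leftover stack contents. Reading a genuinely
     * uninitialized field is undefined behaviour in C, so the garbage
     * is written explicitly to keep the sketch deterministic. */
    memset(&a, 0xA5, sizeof a);

    a.mask = mask;
    get_attrs(&a);
    return (a.flags & FLAG_IMMUTABLE) != 0;
}
```

With the narrow mask, looks_immutable(AT_SIZE) reports the file as
immutable because the flags field was never written. With the wider
mask, looks_immutable(AT_SIZE | AT_FLAGS) reports it correctly, which
is the same shape of fix as adding a bit to va_mask.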


* Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)
  2003-09-22  1:12       ` Nathan Scott
@ 2003-09-22  1:28         ` Walt H
  0 siblings, 0 replies; 6+ messages in thread
From: Walt H @ 2003-09-22  1:28 UTC (permalink / raw)
  To: Nathan Scott; +Cc: Linux Kernel, Linux XFS Mailing List

Nathan Scott wrote:

> The fix is below. I'd be interested in whether or not you still have
> problems after applying this.
> 
> thanks.
> 

That appears to have cleared it up. I reran the tests from my earlier
e-mail (creating a bogus lost+found, etc.) and couldn't get the
filesystem to fail. Mind you, I only ran an rsync over a 2GB filesystem,
but previously the problem was exhibited 100% of the time. I'll bang on
this for a while. Hopefully you don't hear back from me right away :)
Thanks,

-Walt

