* xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case from Roger Willcocks
@ 2008-12-22 16:58 Christoph Hellwig
2008-12-22 20:28 ` xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case from " Roger Willcocks
0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2008-12-22 16:58 UTC (permalink / raw)
To: Roger Willcocks; +Cc: xfs
Hi Roger,
I believe the xfstests case 111 is based on a report by you. Do you
remember what was going on there? From a look at the testcase it
overwrites an inode cluster and then tries to bulkstat the inodes in it.
This works fine with a non-debug kernel, but fails on debug kernels
because they panic.
Do you remember what the testcase was looking for? I suspect we should
just not run it for debug kernels, but I'd like to know more about it
so we can add comments describing it.
Cheers,
Christoph
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case from Roger Willcocks
2008-12-22 16:58 xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case from Roger Willcocks Christoph Hellwig
@ 2008-12-22 20:28 ` Roger Willcocks
2008-12-22 20:50 ` Christoph Hellwig
0 siblings, 1 reply; 3+ messages in thread
From: Roger Willcocks @ 2008-12-22 20:28 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
> Hi Roger,
>
> I believe the xfstests case 111 is based on a report by you. Do you
> remember what was going on there? From a look at the testcase it
> overwrites an inode cluster and then tries to bulkstat the inodes in it.
> This works fine with a non-debug kernel, but fails on debug kernels
> because they panic.
>
> Do you remember what the testcase was looking for? I suspect we should
> just not run it for debug kernels, but I'd like to know more about it
> so we can add comments describing it.
>
> Cheers,
> Christoph
>
Hi Christoph,
here are the relevant extracts from our in-house bugzilla (bug 3675). Since
the problem only occurs when the disk is corrupted, I don't see any problem
with skipping the test on debug kernels.
** 2006-02-01
xfs_fsr can get into a state where one processor spends 100% of its time
looping in the kernel. The application can't be killed. 'top' shows it using
50% CPU (i.e. all of one of the two processors).
oprofile reveals that one processor spends about 2/3 of its time in xfs.ko.
It looks like the offending syscall is xfs_bulkstat.
** 2006-02-03
Looks like xfs_itobp (map inode number to disk buffer) detects a corrupted
inode (bad magic number). That causes a break out of a loop in xfs_bulkstat,
which skips setting the terminating condition of the containing loop.
I'll file a bug report with SGI.
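
The failure mode described above can be illustrated with a simplified model. This is a sketch only, not the actual xfs_bulkstat code: walk_buggy, walk_fixed, the `ok` array, and the `max_passes` budget are all hypothetical names introduced for illustration. The inner loop breaks out on a bad cluster; in the buggy variant the cursor the outer loop depends on is never advanced past that cluster, so the outer loop retries it forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model: ok[c] is false when cluster c has a bad magic number.
 * Returns the number of outer-loop passes needed to walk all n clusters,
 * or -1 if more than max_passes would be needed (the walk is stuck). */
static long walk_buggy(const bool *ok, size_t n, long max_passes)
{
    size_t next = 0;
    long passes = 0;

    assert(ok != NULL);
    while (next < n) {
        if (++passes > max_passes)
            return -1;          /* budget exhausted: model of the hang */
        for (size_t c = next; c < n; c++) {
            if (!ok[c])
                break;          /* bug: 'next' is not advanced past c */
            next = c + 1;
        }
    }
    return passes;
}

/* Same walk, but the bad cluster is skipped before breaking out, so the
 * outer loop's terminating condition can still be reached. */
static long walk_fixed(const bool *ok, size_t n, long max_passes)
{
    size_t next = 0;
    long passes = 0;

    assert(ok != NULL);
    while (next < n) {
        if (++passes > max_passes)
            return -1;
        for (size_t c = next; c < n; c++) {
            if (!ok[c]) {
                next = c + 1;   /* step over the bad cluster */
                break;
            }
            next = c + 1;
        }
    }
    return passes;
}
```

With one bad cluster in the middle of three, walk_buggy exhausts any pass budget it is given, while walk_fixed finishes in two passes.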
** 2006-02-03
SGI say 'Ayup, I think you're right'-
http://marc.theaimsgroup.com/?t=113889680200006
** 2006-02-07
A bad inode magic number can cause the xfs_bulkstat syscall to get stuck
looping in the kernel.
To reproduce: (don't try this at home folks!) -
mkfs.xfs /dev/sda
mount filesystem and create 1000 or so files (I copied a handy 313-byte
file).
run this program:
---------
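#include <sys/types.h>
#include <stdio.h>      /* perror */
#include <unistd.h>
#include <fcntl.h>

char buffer[32768];

/* Overwrite every "IN" byte pair (the XFS on-disk inode magic) found
 * from buffer offset 2048 onward with "XX". */
void nuke(void)
{
    int i;
    for (i = 2048; i < 32768-1; i++)
        if (buffer[i] == 'I' && buffer[i+1] == 'N')
            buffer[i] = buffer[i+1] = 'X';
}

int main(int argc, char* argv[])
{
    /* WARNING: this deliberately corrupts the filesystem on /dev/sda. */
    int f = open("/dev/sda", O_RDWR);
    if (f < 0) { perror("open"); return 1; }

    /* Read 32768 bytes at byte offset 32768, corrupt them, write them back. */
    if (lseek(f, 32768, SEEK_SET) < 0) perror("lseek");
    if (read(f, buffer, 32768) != 32768) perror("read");
    nuke();
    if (lseek(f, 32768, SEEK_SET) < 0) perror("lseek");
    if (write(f, buffer, 32768) != 32768) perror("write");
    close(f);
    return 0;
}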
---------
mount the disk and run xfs_fsr. It immediately gets stuck in a kernel loop.
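
Before overwriting anything, it may be useful to check how many byte pairs the scan in the program above would actually hit. The following is a minimal read-only sketch under the same assumptions as that program (a plain 'I','N' byte match starting at a given offset); count_magic is a hypothetical helper, not part of xfstests or xfsprogs.

```c
#include <stddef.h>

/* Count "IN" byte pairs in buf[start..len-1] without modifying the buffer.
 * Uses the same match condition as nuke() in the reproducer. */
static size_t count_magic(const char *buf, size_t start, size_t len)
{
    size_t hits = 0;

    for (size_t i = start; i + 1 < len; i++)
        if (buf[i] == 'I' && buf[i + 1] == 'N')
            hits++;
    return hits;
}
```

Calling count_magic(buffer, 2048, 32768) on the block read by the reproducer would report how many pairs nuke() is about to overwrite. A count of zero would suggest that no inode cluster lies in that window on the device in question.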
** 2006-02-08
SGI have added a corresponding regression test to the xfs_cmds package
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/111?rev=1.1
--
Roger
* Re: xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case from Roger Willcocks
2008-12-22 20:28 ` xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case from " Roger Willcocks
@ 2008-12-22 20:50 ` Christoph Hellwig
0 siblings, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2008-12-22 20:50 UTC (permalink / raw)
To: Roger Willcocks; +Cc: Christoph Hellwig, xfs
On Mon, Dec 22, 2008 at 08:28:59PM -0000, Roger Willcocks wrote:
> Hi Christoph,
>
> here are the relevant extracts from our in-house bugzilla (bug 3675).
> Since the problem only occurs when the disk is corrupted, I don't see any
> problem with skipping the test on debug kernels.
Thanks a lot, those are some very helpful notes. I'll put a shortened
version of them into the testcase as a comment.