public inbox for linux-kernel@vger.kernel.org
* [PATCH] handle ext3 directory corruption better
@ 2006-10-20 19:17 Eric Sandeen
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2006-10-20 19:17 UTC
  To: Linux Kernel Mailing List; +Cc: ext4 development

(as previously discussed on the ext4 list)

I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz

Basically, it makes a filesystem, splats some random bits over it,
then tries to mount it and do some simple filesystem actions.

At best, the filesystem catches the corruption gracefully.
At worst, things spin out of control.

As you might guess, we found a couple of places in ext3 where things
spin out of control :)

First, we had a corrupted directory that was never checked
for consistency... it was corrupt, and pointed to another bad "entry"
of length 0.  The for() loop looped forever: since the entry's rec_len
was 0, ext3_next_entry(de) returned the same pointer, and we kept
looking at it over and over and over and over... I modeled this check
and subsequent action on what is done for other directory types
in ext3_readdir...

(adding this check adds some computational expense; I am testing 
a followup patch to reduce the number of times we check and re-check
these directory entries, in all cases.  Thanks for the idea, Andreas).
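
To illustrate the first problem, here is a rough userspace sketch of the
loop, assuming a simplified entry layout.  The names toy_dirent,
toy_put_entry, toy_check_entry, toy_walk_block and toy_next_block_pos are
made up for illustration and are not the kernel's; the checks are only
loosely modeled on what ext3_check_dir_entry does.  Without such a check, an
entry with rec_len == 0 leaves the offset unchanged and the walk never ends.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified entry header; not the kernel's on-disk struct. */
struct toy_dirent {
	uint32_t inode;
	uint16_t rec_len;	/* bytes from this entry to the next one */
	uint8_t  name_len;
	uint8_t  file_type;
};

#define TOY_HDR_LEN	8	/* size of the fixed header above */
#define TOY_MIN_REC_LEN	12	/* header plus a 1-byte name, padded to 4 */

/* Helper for experiments: write an entry header at byte offset off. */
static void toy_put_entry(unsigned char *block, size_t off, uint32_t inode,
			  uint16_t rec_len, uint8_t name_len)
{
	struct toy_dirent de = { inode, rec_len, name_len, 0 };

	memcpy(block + off, &de, sizeof(de));
}

/* Return 1 if the entry at off looks sane, 0 if it is corrupt. */
static int toy_check_entry(const unsigned char *block, size_t block_size,
			   size_t off)
{
	struct toy_dirent de;

	memcpy(&de, block + off, sizeof(de));
	if (de.rec_len < TOY_MIN_REC_LEN)	/* catches rec_len == 0 */
		return 0;
	if (de.rec_len % 4 != 0)
		return 0;
	if ((size_t)de.name_len + TOY_HDR_LEN > de.rec_len)
		return 0;
	if (off + de.rec_len > block_size)	/* crosses the block end */
		return 0;
	return 1;
}

/*
 * Walk one directory block and return the number of sane entries seen.
 * Sets *corrupt and stops at the first bad entry instead of spinning.
 */
static int toy_walk_block(const unsigned char *block, size_t block_size,
			  int *corrupt)
{
	size_t off = 0;
	int count = 0;

	*corrupt = 0;
	while (off + TOY_HDR_LEN <= block_size) {
		struct toy_dirent de;

		if (!toy_check_entry(block, block_size, off)) {
			*corrupt = 1;
			break;
		}
		memcpy(&de, block + off, sizeof(de));
		count++;
		off += de.rec_len;	/* advances by at least 12 bytes */
	}
	return count;
}

/*
 * Same arithmetic the patch uses to skip f_pos to the start of the next
 * block; block_size must be a power of two.
 */
static uint64_t toy_next_block_pos(uint64_t pos, uint32_t block_size)
{
	return (pos | (uint64_t)(block_size - 1)) + 1;
}
```

Feeding toy_walk_block a block whose second entry has rec_len 0 stops the
walk after one entry with *corrupt set, rather than looping.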

Next we had a root directory inode with a corrupted size, which claimed
to be > 200M on a 4M filesystem.  There was really only 1 block in the
directory, but because the size was so large, readdir kept coming back
for more, spewing thousands of printks along the way.

Per Andreas' suggestion, if we're in this read error condition and we're
trying to read an offset which is greater than i_blocks' worth of bytes
(i_blocks << 9, since i_blocks counts 512-byte units), stop trying, and
break out of the loop.
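
As a back-of-the-envelope illustration of the second problem, here is a small
userspace simulation.  It is only a sketch: toy_readdir_errors is a made-up
name, and it assumes the allocated blocks are mapped contiguously from offset
0 and that every hole is hit on a block boundary.  It counts how many "hole"
reports a readdir-style loop would emit with and without the f_pos check.

```c
#include <stdint.h>

/*
 * Simulate a readdir-style loop over a directory whose i_size is bogus.
 * Returns the number of "directory contains a hole" reports emitted.
 * with_check selects whether the i_blocks-based bailout is applied.
 */
static unsigned long toy_readdir_errors(uint64_t i_size, uint64_t i_blocks,
					uint32_t block_size, int with_check)
{
	/* i_blocks counts 512-byte units, so << 9 converts it to bytes. */
	uint64_t mapped_bytes = i_blocks << 9;
	uint64_t f_pos = 0;
	unsigned long errors = 0;

	while (f_pos < i_size) {
		if (f_pos < mapped_bytes) {
			f_pos += block_size;	/* block was read fine */
			continue;
		}
		errors++;	/* stands in for the ext3_error() report */
		/* corrupt size?  Maybe no more blocks to read */
		if (with_check && f_pos > mapped_bytes)
			break;
		f_pos += block_size;
	}
	return errors;
}
```

For a directory claiming 200M with a single 1K block allocated (i_blocks of
2), the unchecked loop reports a hole for every remaining block in the
claimed size, while the checked loop gives up after the second report.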

With these two changes, ext3 survives the fsfuzz test quite well.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Index: linux-2.6.18/fs/ext3/namei.c
===================================================================
--- linux-2.6.18.orig/fs/ext3/namei.c
+++ linux-2.6.18/fs/ext3/namei.c
@@ -551,6 +551,15 @@ static int htree_dirblock_to_tree(struct
 					   dir->i_sb->s_blocksize -
 					   EXT3_DIR_REC_LEN(0));
 	for (; de < top; de = ext3_next_entry(de)) {
+		if (!ext3_check_dir_entry("htree_dirblock_to_tree", dir, de, bh,
+					(block<<EXT3_BLOCK_SIZE_BITS(dir->i_sb))
+						+((char *)de - bh->b_data))) {
+			/* On error, skip the f_pos to the next block. */
+			dir_file->f_pos = (dir_file->f_pos |
+					(dir->i_sb->s_blocksize - 1)) + 1;
+			brelse (bh);
+			return count;
+		}
 		ext3fs_dirhash(de->name, de->name_len, hinfo);
 		if ((hinfo->hash < start_hash) ||
 		    ((hinfo->hash == start_hash) &&
Index: linux-2.6.18/fs/ext3/dir.c
===================================================================
--- linux-2.6.18.orig/fs/ext3/dir.c
+++ linux-2.6.18/fs/ext3/dir.c
@@ -151,6 +151,9 @@ static int ext3_readdir(struct file * fi
 			ext3_error (sb, "ext3_readdir",
 				"directory #%lu contains a hole at offset %lu",
 				inode->i_ino, (unsigned long)filp->f_pos);
+			/* corrupt size?  Maybe no more blocks to read */
+			if (filp->f_pos > inode->i_blocks << 9)
+				break;
 			filp->f_pos += sb->s_blocksize - offset;
 			continue;
 		}




* Re: [PATCH] handle ext3 directory corruption better
@ 2006-10-21 15:29 Steve Grubb
  2006-10-31  9:57 ` Pavel Machek
  0 siblings, 1 reply; 7+ messages in thread
From: Steve Grubb @ 2006-10-21 15:29 UTC
  To: linux-kernel

>I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
>http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz

Oops, I didn't know this was going to be mentioned, and I had deleted the file.
The current release is 0.5, and I've symlinked it to the address Eric mentioned
above. That said, I would like to say a couple of things about the program.

Bugs found by fuzzing fall into 2 categories: robustness and security. It's
very possible to overflow stacks or find signed/unsigned conversion problems
which can be exploited by malicious users. People also expect the OS to
tolerate errors. If you have defective media, you may need to access
the drive in an attempt to salvage what you can. Or maybe someone walks by with
a specially doctored USB stick and jams it in your desktop computer while you
are away. The automounter then mounts and reads the initial directory...boom.

To help find these kinds of problems, I worked on a program, fsfuzz, that can
create all sorts of errors. The initial idea for the program comes from LMH.
The tool saves the image that crashed your machine so that you can replay the
problem and study it. This program has killed all the file systems it
currently supports in the latest rawhide kernel - except swap. Virtually
every file system in the current kernel can be used to oops or lock up a
machine. Currently supported filesystems include:

ext2/3
swap
iso9660
vfat/msdos
cramfs
squashfs
xfs
hfs
gfs2

The program follows this general pattern: it creates an
initial file system image, corrupts it, then loopback mounts it, and tries
various operations. If that passes, it corrupts the image in a different way
and repeats.

The initial image is created in one of 2 ways, either dd a file or mkdir a 
directory depending on what the filesystem creation tools call for. To 
corrupt the image, a version of mangle is used. Mangle is a program that 
corrupts about 10% of the data and favors bytes with a value > 128 to induce 
signed/unsigned problems. The corrupted image is exercised by a program 
called run_test. I separated it from the main program so that you can replay 
a test and debug what is happening.
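
For readers who want a feel for the corruption step, here is a minimal
sketch of a mangle-style routine. It is not the code shipped in fsfuzzer:
toy_rand and toy_mangle are hypothetical names, and the 1-in-10 and 3-in-4
ratios are my own assumptions chosen to match the "about 10%" and "favors
high bytes" description above. A seeded PRNG is used so that a run can be
replayed.

```c
#include <stddef.h>
#include <stdint.h>

/* Small xorshift32 PRNG so runs are repeatable; state must be nonzero. */
static uint32_t toy_rand(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Overwrite roughly 1 byte in 10 with a random value, biased so that most
 * replacement values have the top bit set (values of 128 and up).
 * Returns the number of bytes overwritten.
 */
static size_t toy_mangle(unsigned char *buf, size_t len, uint32_t seed)
{
	uint32_t state = seed ? seed : 1;	/* avoid the all-zero state */
	size_t touched = 0;

	for (size_t i = 0; i < len; i++) {
		unsigned char v;

		if (toy_rand(&state) % 10 != 0)
			continue;		/* leave ~90% untouched */
		v = (unsigned char)(toy_rand(&state) >> 8);
		if (toy_rand(&state) % 4 != 0)
			v |= 0x80;		/* assumed bias: 3 in 4 forced high */
		buf[i] = v;
		touched++;
	}
	return touched;
}
```

Because the seed fully determines which bytes change, keeping the seed next
to the pristine image is enough to regenerate a corrupted image later.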

So, to wrap up...anyone that has anything to do with file system development 
may want to give this tool a try to see how robust any given file system is. 
The program can be grabbed here:

http://people.redhat.com/sgrubb/files/fsfuzzer-0.5.tar.gz

There is a README file that explains more about using it. I am taking patches
if you have any ideas about supporting file systems not already covered or 
ideas for new tests to exercise different filesystem operations.

-Steve


* Re: [PATCH] handle ext3 directory corruption better
  2006-10-21 15:29 [PATCH] handle ext3 directory corruption better Steve Grubb
@ 2006-10-31  9:57 ` Pavel Machek
  2006-10-31 13:35   ` Phillip Susi
  2006-11-01 15:29   ` Steve Grubb
  0 siblings, 2 replies; 7+ messages in thread
From: Pavel Machek @ 2006-10-31  9:57 UTC
  To: Steve Grubb; +Cc: linux-kernel

Hi!

> The initial image is created in one of 2 ways, either dd a file or mkdir a 
> directory depending on what the filesystem creation tools call for. To 
> corrupt the image, a version of mangle is used. Mangle is a program that 
> corrupts about 10% of the data and favors bytes with a value > 128 to induce 
> signed/unsigned problems. The corrupted image is exercised by a program 
> called run_test. I separated it from the main program so that you can replay 
> a test and debug what is happening.
> 
> So, to wrap up...anyone that has anything to do with file system development 
> may want to give this tool a try to see how robust any given file system is. 
> The program can be grabbed here:
> 
> http://people.redhat.com/sgrubb/files/fsfuzzer-0.5.tar.gz

Nice... can you run the same tool against fsck, too?

I did that some time ago, with a less evil tool, and got some
interesting results.

(Expectation is that no matter how you corrupt the fs, fsck will get it
back to a consistent state...)
						Pavel
-- 
Thanks for all the (sleeping) penguins.


* Re: [PATCH] handle ext3 directory corruption better
  2006-10-31  9:57 ` Pavel Machek
@ 2006-10-31 13:35   ` Phillip Susi
  2006-11-01 15:29   ` Steve Grubb
  1 sibling, 0 replies; 7+ messages in thread
From: Phillip Susi @ 2006-10-31 13:35 UTC
  To: Pavel Machek; +Cc: Steve Grubb, linux-kernel

Another expectation is that after the fsck, you won't lose any of the
data that you could still access by mounting the damaged filesystem.  There
are a lot of horror stories out there of people having only a slightly
damaged fs, and after an fsck, they lost a lot more data.

Pavel Machek wrote:
> 
> Nice... can you run the same tool against fsck, too?
> 
> I did that some time ago, with a less evil tool, and got some
> interesting results.
> 
> (Expectation is that no matter how you corrupt the fs, fsck will get it
> back to a consistent state...)
> 						Pavel



* Re: [PATCH] handle ext3 directory corruption better
  2006-10-31  9:57 ` Pavel Machek
  2006-10-31 13:35   ` Phillip Susi
@ 2006-11-01 15:29   ` Steve Grubb
  2006-11-10 20:57     ` Eric Sandeen
  1 sibling, 1 reply; 7+ messages in thread
From: Steve Grubb @ 2006-11-01 15:29 UTC
  To: Pavel Machek; +Cc: linux-kernel

On Tuesday 31 October 2006 04:57, Pavel Machek wrote:
> Nice... can you run the same tool against fsck, too?

I'll see if I can make that work, too. The fuzzer tries to preserve the bad
image so that you can replay the problem for debugging. I think it's just a
matter of making another copy and using that one instead.

-Steve


* Re: [PATCH] handle ext3 directory corruption better
  2006-11-01 15:29   ` Steve Grubb
@ 2006-11-10 20:57     ` Eric Sandeen
  2006-11-12 13:39       ` Pavel Machek
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2006-11-10 20:57 UTC
  To: Steve Grubb; +Cc: Pavel Machek, linux-kernel

Steve Grubb wrote:
> On Tuesday 31 October 2006 04:57, Pavel Machek wrote:
>> Nice... can you run the same tool against fsck, too?
> 
> I'll see if I can make that work, too. The fuzzer tries to preserve the bad 
> image so that you can replay the problem for debugging. I think it's just a 
> matter of making another copy and using that one instead.

I played with this on xfs a little bit in my spare time, found some
xfs_repair problems.  :)  I'm sure other fs's would have issues as well.

Ideally it would probably be good for the tool to have a "use" mode (try
to use the corrupted fs) and a "check" mode (try to fsck the corrupted fs).

In use   mode, it'd be:  mkfs, fuzz, mount, populate (etc), unmount.
In check mode, it'd be:  mkfs, mount, populate, unmount, fuzz, fsck.

-Eric


* Re: [PATCH] handle ext3 directory corruption better
  2006-11-10 20:57     ` Eric Sandeen
@ 2006-11-12 13:39       ` Pavel Machek
  0 siblings, 0 replies; 7+ messages in thread
From: Pavel Machek @ 2006-11-12 13:39 UTC
  To: Eric Sandeen; +Cc: Steve Grubb, linux-kernel

Hi!

> >> Nice... can you run the same tool against fsck, too?
> > 
> > I'll see if I can make that work, too. The fuzzer tries to preserve the bad 
> > image so that you can replay the problem for debugging. I think it's just a 
> > matter of making another copy and using that one instead.
> 
> I played with this on xfs a little bit in my spare time, found some
> xfs_repair problems.  :)  I'm sure other fs's would have issues as well.

Yes... I played with a similar tool a few years ago on ext2, and it led
to fixing a couple of bugs in e2fsck, too. vfat/reiser were too buggy
for this test to be useful.

> Ideally it would probably be good for the tool to have a "use" mode (try
> to use the corrupted fs) and a "check" mode (try to fsck the corrupted fs).
> 
> In use   mode, it'd be:  mkfs, fuzz, mount, populate (etc), unmount.
> In check mode, it'd be:  mkfs, mount, populate, unmount, fuzz, fsck.

Yes, that's what I did back then.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

