public inbox for linux-kernel@vger.kernel.org
* RE: ramdisk corruption problems - was: RE: pivot_root and initrd kernel panic woes
From: Torrey Hoffman @ 2001-12-20 19:06 UTC
  To: Tachino Nobuhiro; +Cc: andersen, linux-kernel, viro

Yes, this does fix the problem.  Thank you very much!

Hopefully something like this will make it into 2.4.18?

(I have to admit I applied your patch to 2.4.17-rc2 and tested
the result, but did not test -rc2 without your patch.
However, -rc1 has the bug and I don't think any other changes 
between -rc1 and -rc2 would have fixed this.  If someone out 
there wants me to, I can recompile and test -rc2 vanilla.)

Torrey Hoffman


Tachino Nobuhiro (tachino@open.nm.fujitsu.co.jp) wrote:
 
> Hello,
> 
> The following patch may fix your problem.
> 
> diff -r -u linux-2.4.17-rc2.org/drivers/block/rd.c linux-2.4.17-rc2/drivers/block/rd.c
> --- linux-2.4.17-rc2.org/drivers/block/rd.c	Thu Dec 20 20:30:57 2001
> +++ linux-2.4.17-rc2/drivers/block/rd.c	Thu Dec 20 20:46:53 2001
> @@ -194,9 +194,11 @@
>  static int ramdisk_readpage(struct file *file, struct page * page)
>  {
>  	if (!Page_Uptodate(page)) {
> -		memset(kmap(page), 0, PAGE_CACHE_SIZE);
> -		kunmap(page);
> -		flush_dcache_page(page);
> +		if (!page->buffers) {
> +			memset(kmap(page), 0, PAGE_CACHE_SIZE);
> +			kunmap(page);
> +			flush_dcache_page(page);
> +		}
>  		SetPageUptodate(page);
>  	}
>  	UnlockPage(page);
> 
> 
> grow_dev_page() creates a page that is not Uptodate but has valid
> buffers, so it is wrong for ramdisk_readpage() to clear the whole
> page unconditionally.
> 
> 
> At Tue, 18 Dec 2001 12:14:49 -0800,
> Torrey Hoffman wrote:
> > 
> > More information!  And a workaround!
> > 
> > I conjecture that the ramdisk driver (post 2.4.9) only grabs
> > VM pages properly if it is accessed directly, as a dd to 
> > /dev/ram0 does.  I further conjecture that accessing the 
> > ramdisk through a mounted filesystem does not grab pages 
> > properly.
> > 
> > The reason I believe this is that removing the call to 
> > "freeramdisk" from my original script avoids corruption.  
> > 
> > Another way to avoid ramdisk corruption is to 
> > "dd if=/dev/zero of=/dev/ram0 bs=1k count=4000" 
> > immediately after the call to freeramdisk.
> > 
> > If my conjecture is right, then the corruption is caused 
> > because mke2fs on a "freed" /dev/ram0 doesn't touch every 
> > block of the fs, leaving "holes" where pages are not properly 
> > grabbed from the VM. The resulting filesystem appears to work, 
> > but dd'ing from /dev/ram0 gets a broken filesystem image.
> > 
> > Note that "freeramdisk /dev/ram0" is pretty much just:
> > #define BLKFLSBUF  _IO(0x12,97) /* flush buffer cache */
> > f = open("/dev/ram0", O_RDWR);
> > ioctl(f, BLKFLSBUF);
> > 
> > To experiment for yourself, stick the following script in
> > a subdirectory which also contains a "testdir" directory
> > with about 3 MB of data.
> > 
> > - - - - - - - -
> > #!/bin/bash
> > 
> > # to tickle the bug, do the freeramdisk but not the
> > # dd from /dev/zero to /dev/ram0.  
> > 
> > freeramdisk /dev/ram0
> > #dd if=/dev/zero of=/dev/ram0 bs=1k count=4000
> > 
> > mke2fs -m0 /dev/ram0 4000
> > mount -t ext2 /dev/ram0 /mnt/ramdisk
> > rm -rf /mnt/ramdisk/*
> > 
> > cp -a ./testdir /mnt/ramdisk
> > umount /dev/ram0
> > 
> > dd if=/dev/ram0 of=ram0.img bs=1k count=4000
> > dd if=ram0.img of=/dev/ram0 bs=1k count=4000
> > 
> > mount -t ext2 /dev/ram0 /mnt/ramdisk
> > diff -q -r ./testdir /mnt/ramdisk/testdir
> > 
> > # If diff reports mismatches, you saw the bug.
> > 
> > umount /dev/ram0
> > - - - - - - - -
> > 
> > If the gods of the VM and VFS don't bother to look at 
> > it, I might take a peek at the relevant kernel code myself.  
> > Might take two months of study before I know enough though.
> > 
> > Torrey
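
The reasoning in Tachino's note can be modeled in user space. The following is a minimal sketch, not kernel code: struct model_page, MODEL_PAGE_SIZE, and all function names are hypothetical stand-ins for struct page, PAGE_CACHE_SIZE, and ramdisk_readpage(). It assumes only what the patch states, namely that a page can hold valid buffer data while not yet being marked Uptodate.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096 /* stand-in for PAGE_CACHE_SIZE (assumed value) */

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct model_page {
    unsigned char data[MODEL_PAGE_SIZE];
    bool uptodate;    /* models Page_Uptodate(page) */
    bool has_buffers; /* models page->buffers != NULL */
};

/* Pre-patch behaviour: zero any page that is not yet Uptodate. */
void readpage_old(struct model_page *pg)
{
    assert(pg != NULL);
    if (!pg->uptodate) {
        memset(pg->data, 0, MODEL_PAGE_SIZE);
        pg->uptodate = true;
    }
}

/* Patched behaviour: zero the page only if no buffers are attached. */
void readpage_patched(struct model_page *pg)
{
    assert(pg != NULL);
    if (!pg->uptodate) {
        if (!pg->has_buffers)
            memset(pg->data, 0, MODEL_PAGE_SIZE);
        pg->uptodate = true;
    }
}

/* Models the state attributed to grow_dev_page(): data is present via
 * buffers, but the page itself is not marked Uptodate. */
void make_buffered_page(struct model_page *pg, unsigned char fill)
{
    assert(pg != NULL);
    memset(pg->data, fill, MODEL_PAGE_SIZE);
    pg->uptodate = false;
    pg->has_buffers = true;
}
```

Calling readpage_old() on a page prepared by make_buffered_page() wipes its data, while readpage_patched() leaves the data intact and only sets the Uptodate flag.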

* RE: ramdisk corruption problems - was: RE: pivot_root and initrd kernel panic woes
From: Torrey Hoffman @ 2001-12-18 20:14 UTC
  To: 'andersen@codepoet.org'; +Cc: linux-kernel, 'viro@math.psu.edu'

More information!  And a workaround!

I conjecture that the ramdisk driver (post 2.4.9) only grabs
VM pages properly if it is accessed directly, as a dd to 
/dev/ram0 does.  I further conjecture that accessing the 
ramdisk through a mounted filesystem does not grab pages 
properly.

The reason I believe this is that removing the call to 
"freeramdisk" from my original script avoids corruption.  

Another way to avoid ramdisk corruption is to 
"dd if=/dev/zero of=/dev/ram0 bs=1k count=4000" 
immediately after the call to freeramdisk.

If my conjecture is right, then the corruption is caused 
because mke2fs on a "freed" /dev/ram0 doesn't touch every 
block of the fs, leaving "holes" where pages are not properly 
grabbed from the VM. The resulting filesystem appears to work, 
but dd'ing from /dev/ram0 gets a broken filesystem image.

Note that "freeramdisk /dev/ram0" is pretty much just:
#define BLKFLSBUF  _IO(0x12,97) /* flush buffer cache */
f = open("/dev/ram0", O_RDWR);
ioctl(f, BLKFLSBUF);

To experiment for yourself, stick the following script in
a subdirectory which also contains a "testdir" directory
with about 3 MB of data.

- - - - - - - -
#!/bin/bash

# to tickle the bug, do the freeramdisk but not the
# dd from /dev/zero to /dev/ram0.  

freeramdisk /dev/ram0
#dd if=/dev/zero of=/dev/ram0 bs=1k count=4000

mke2fs -m0 /dev/ram0 4000
mount -t ext2 /dev/ram0 /mnt/ramdisk
rm -rf /mnt/ramdisk/*

cp -a ./testdir /mnt/ramdisk
umount /dev/ram0

dd if=/dev/ram0 of=ram0.img bs=1k count=4000
dd if=ram0.img of=/dev/ram0 bs=1k count=4000

mount -t ext2 /dev/ram0 /mnt/ramdisk
diff -q -r ./testdir /mnt/ramdisk/testdir

# If diff reports mismatches, you saw the bug.

umount /dev/ram0
- - - - - - - -

If the gods of the VM and VFS don't bother to look at 
it, I might take a peek at the relevant kernel code myself.  
Might take two months of study before I know enough though.

Torrey
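
For anyone who wants to issue the ioctl without busybox, the freeramdisk outline earlier in this message can be filled out roughly as follows. This is a sketch under assumptions: flush_block_device() is a hypothetical helper name, and the fallback #define repeats the value quoted in this message for systems where <sys/ioctl.h> does not expose BLKFLSBUF.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef BLKFLSBUF
#define BLKFLSBUF _IO(0x12, 97) /* flush buffer cache (value quoted in the thread) */
#endif

/* Hypothetical helper: open a block device and ask the kernel to flush
 * its buffer cache. Returns 0 on success, -1 on failure with errno set
 * by open() or ioctl(). */
int flush_block_device(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, BLKFLSBUF);
    int saved_errno = errno; /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;

    return rc < 0 ? -1 : 0;
}
```

A call such as flush_block_device("/dev/ram0") would normally need root privileges, and on the affected kernels it is the step that precedes the corruption in the script above.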


* ramdisk corruption problems - was: RE: pivot_root and initrd kernel panic woes
From: Torrey Hoffman @ 2001-12-18  1:44 UTC
  To: 'andersen@codepoet.org'; +Cc: linux-kernel, 'viro@math.psu.edu'

Thanks to Erik Andersen and Al Viro; your responses to my first
report helped me produce this much more accurate report.

I've narrowed it down quite a bit.  It's a problem with ramdisk 
corruption on some 2.4 kernels, not specifically initrd, and 
definitely not a problem with booting initrd.  

executive summary:
  dd'ing from /dev/ram0 usually produces a corrupted ramdisk
  image.

This is reproducible on:
 2.4.12 
 2.4.16 
 2.4.17-pre2 + low-latency patch + custom tweaks 

However, I cannot reproduce the problem on:
 2.4.8-26mdk  (default Mandrake 8.1 kernel)
 2.4.9 

On 2.4.10, I can't do the test, seems like ramdisk was *really*
broken on that kernel.  (no room on ramdisk after mke2fs???)

I now have a simple script that checks for the problem and was 
tried on each of the kernels listed above:

---------------------------------------------------
#!/bin/bash

umount /dev/ram0
./rootfs/bin/busybox freeramdisk /dev/ram0

mke2fs -m0 /dev/ram0 4000
mount -t ext2 /dev/ram0 /mnt/ramdisk
cp -a rootfs/* /mnt/ramdisk
umount /dev/ram0
dd if=/dev/ram0 of=initrd bs=1k count=4000 

dd if=initrd of=/dev/ram0 bs=1k count=4000 
mount -t ext2 /dev/ram0 /mnt/ramdisk

diff -q -r /mnt/ramdisk/bin ./rootfs/bin
diff -q -r /mnt/ramdisk/lib ./rootfs/lib

---------------------------------------------------

On kernels with the problem, the scripted diff reports that most
(or all?) of the binaries and libraries are corrupt.  This contradicts
my earlier problem report; sorry about that.

I'm eager to help track this down, I can test patches, supply 
more information, give you a tar.gz of the contents of my rootfs,
or do whatever it takes.   In the meantime I've gone back to 
the default Mandrake 2.4.8 kernel.  It's noticeably slower. :-(


A few loose ends...

Erik Andersen wrote:
> Any particular reason you are using a version of busybox that is 
> quite old?  You really should get a newer release -- I've fixed a 
> lot of bugs since then.

Yes, that is one of the reasons I'm working on this again - I'm 
updating my old initrd with the new kernel and the latest versions 
of all the tools I'm using, including busybox.  BTW, thanks 
for your work on busybox, it's great.

[...]

> Can you successfully chroot into your rootfs dir?

Good suggestion, that helped set me on the right track to finding
the problem.  I can chroot into my rootfs "source" directory, but
(unsurprisingly) attempting to chroot into a corrupted image segfaults.
That's how I discovered the corruption.  I'm not sure how I missed it
before; I'll blame it on Friday afternoon confusion.

Torrey
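
The diff in the script only says that files differ, not how. A rough way to characterize the damage is to compare a good copy of a file (from ./rootfs) with the copy read back from the ramdisk, one page-sized chunk at a time, and count how many differing chunks are entirely zero in the ramdisk copy. The sketch below is an illustration under assumptions: compare_images(), struct page_diff, and the 4096-byte CHECK_PAGE_SIZE are hypothetical choices, and loading the two files into memory is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHECK_PAGE_SIZE 4096 /* assumed chunk size; not stated in the thread */

struct page_diff {
    size_t differing;      /* chunks whose contents differ */
    size_t zeroed_in_copy; /* differing chunks that are all zero in the copy */
};

/* Returns true if the first n bytes at p are all zero. */
static bool is_all_zero(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            return false;
    return true;
}

/* Compare two equally sized buffers chunk by chunk. The final chunk may
 * be shorter than CHECK_PAGE_SIZE when len is not a multiple of it. */
struct page_diff compare_images(const unsigned char *orig,
                                const unsigned char *copy, size_t len)
{
    struct page_diff result = {0, 0};

    assert(orig != NULL && copy != NULL);
    for (size_t off = 0; off < len; off += CHECK_PAGE_SIZE) {
        size_t n = len - off < CHECK_PAGE_SIZE ? len - off : CHECK_PAGE_SIZE;

        if (memcmp(orig + off, copy + off, n) != 0) {
            result.differing++;
            if (is_all_zero(copy + off, n))
                result.zeroed_in_copy++;
        }
    }
    return result;
}
```

If most differing chunks turn out to be zero-filled, that would be consistent with pages being cleared rather than overwritten with unrelated data.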
