public inbox for linux-kernel@vger.kernel.org
* filesystem corruption (ReiserFS, 2.6.6): regions replaced by \000 bytes
@ 2004-05-28 12:28 David Madore
  2004-05-28 12:46 ` Chris Mason
From: David Madore @ 2004-05-28 12:28 UTC
  To: linux-kernel

Hi folks.

I'm afraid this bug-report will be rather worthless as it is, because
the bug has proven remarkably elusive and has defeated all my attempts
to track it down to a precise test case or set of circumstances.  But
since it seems important, I thought it might be worth a post anyway.
Any help is appreciated in clarifying the circumstances which trigger
the problem, or generally in making this report more useful.

The bottom line: I've experienced file corruption, of the following
nature: consecutive regions (all, it seems, aligned on 256-byte
boundaries, and typically around 1kB or 2kB in length) of seemingly
random files are replaced by null bytes.  The filesystem is ReiserFS
(but as nearly all my filesystems are ReiserFS anyway, I cannot
conclusively say that the bug is indeed in ReiserFS).  The problem has
occurred with kernel versions Debian-packaged 2.6.6-1-686-smp and
home-compiled 2.6.6-mm2 (SMP also).  The hardware is an Intel
bi-PII450 (with 256MB RAM) using aic7xxx as low-level disk driver; and
I have good reasons to think that the hardware is sound (and the
memory banks in particular).  Distribution is Debian Sarge (with a few
unstable packages as well).  System load is moderate.  The affected
files were typically corrupted during operation of Debian "apt-get":
either while updating the apt-cache (which became corrupted) or while
extracting packages (which randomly corrupted newly extracted files).
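
For anyone wanting to check their own files for the pattern described
above, here is a minimal sketch in Python.  It is an illustration only,
not a tool used in this report: the function name find_zero_runs and
the 1kB minimum run length (min_blocks=4) are assumptions chosen to
match the symptoms, and a file that legitimately contains long runs of
zeros will show up as a false positive.

```python
BLOCK = 256  # alignment of the corrupted regions, as observed in the report


def find_zero_runs(data: bytes, block: int = BLOCK, min_blocks: int = 4):
    """Return (offset, length) pairs for runs of all-zero, block-aligned
    regions that are at least min_blocks blocks long.

    min_blocks=4 (1 kB) is an assumed threshold, not a measured one.
    A trailing partial block is ignored.
    """
    zero_block = bytes(block)
    nblocks = len(data) // block
    runs = []
    start = None  # index of the first block of the current zero run
    for i in range(nblocks):
        if data[i * block:(i + 1) * block] == zero_block:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_blocks:
                runs.append((start * block, (i - start) * block))
            start = None
    # A run may extend to the end of the file.
    if start is not None and nblocks - start >= min_blocks:
        runs.append((start * block, (nblocks - start) * block))
    return runs
```

Reading a suspect file in binary mode and passing its contents to
find_zero_runs() gives the byte offset and length of each candidate
region, which can then be compared against a known-good copy.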

That's about all I can say for sure.  I have tried to reproduce the
problem by stress-testing the filesystem (creating large numbers of
small files, or small numbers of large files, containing ARC4 streams
produced by <URL: ftp://quatramaran.ens.fr/pub/madore/misc/arc4gen.c >
and checking them afterward against the same procedural generator) --
but to no avail: even with heavily concurrent access I have not
been able to reproduce a single occurrence of the bug.  Maybe apt-get
has a specific way of writing files (but I can't really think how;
might it use mmap() in some way? that doesn't sound plausible) which
makes it trigger the bug.
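
The write-then-verify approach described above can be sketched as
follows.  This is not arc4gen.c and does not use ARC4: as a stand-in it
derives the byte stream from SHA-256 in counter mode, which is equally
reproducible from a seed.  The function names, the file naming scheme
and the use of the file index as the seed are all assumptions made for
the example.

```python
import hashlib
import os


def stream(seed: bytes, length: int) -> bytes:
    """Deterministic pseudo-random bytes (SHA-256 in counter mode).

    Stand-in for the ARC4 generator mentioned in the text.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def file_path(directory: str, index: int) -> str:
    # Hypothetical naming scheme: f000000, f000001, ...
    return os.path.join(directory, f"f{index:06d}")


def write_files(directory: str, count: int, size: int) -> None:
    """Create `count` files of `size` bytes, each seeded by its index."""
    for i in range(count):
        with open(file_path(directory, i), "wb") as f:
            f.write(stream(str(i).encode(), size))


def verify_files(directory: str, count: int, size: int):
    """Regenerate each stream and return (path, first_bad_offset) for
    every file whose contents differ from what was written."""
    bad = []
    for i in range(count):
        path = file_path(directory, i)
        with open(path, "rb") as f:
            data = f.read()
        expected = stream(str(i).encode(), size)
        if data != expected:
            common = min(len(data), len(expected))
            offset = next(
                (k for k in range(common) if data[k] != expected[k]),
                common,  # contents agree but the lengths differ
            )
            bad.append((path, offset))
    return bad
```

Because the expected contents are regenerated from the seed, no
reference copy needs to be stored.  Note that a read-back done right
after writing may be served from the page cache, so unmounting and
remounting the filesystem between write_files() and verify_files() is
probably needed to exercise what actually reached the disk.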

I also have a UP box (Intel PIII600 with 384MB RAM and IDE disks)
which, even though it has a nearly identical setup and is used in a
roughly equal way, has not experienced any kind of corruption; so
maybe the bug is SMP-specific (which would explain it going more or
less unnoticed) -- on the other hand, a friend of mine has mentioned
having observed similar problems on a UP box with 2.6.5 installed
under very heavy load on ReiserFS, but I can't say more here.  I'm
really sorry that all this is very fuzzy.

Any suggestions (either on how to fix the problem or to work around
it, or on how to reproduce it experimentally) are welcome.  A friendly
pat on the back would also be welcome. :-)

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.eleves.ens.fr:8080/home/madore/ )

Thread overview: 15+ messages
2004-05-28 12:28 filesystem corruption (ReiserFS, 2.6.6): regions replaced by \000 bytes David Madore
2004-05-28 12:46 ` Chris Mason
2004-05-28 13:05   ` Lenar Lõhmus
2004-05-28 16:24   ` Tomas Szepe
2004-05-28 16:29     ` Chris Mason
2004-05-28 16:42       ` Pat
2004-05-28 16:54         ` Chris Mason
2004-05-28 16:45       ` Tomas Szepe
2004-05-28 16:55         ` Chris Mason
2004-05-28 16:58         ` Steven Cole
2004-05-29 11:56     ` Lenar Lõhmus
     [not found]   ` <1085750828.1914.385.camel@tribesman.namesys.com>
     [not found]     ` <1085751695.22636.3163.camel@watt.suse.com>
2004-05-31 16:48       ` I would like to see ReiserFS V3 enter a feature freeze real soon Hans Reiser
2004-06-01 11:37         ` Chris Mason
2004-06-01 17:02           ` Hans Reiser
2004-06-01 18:53             ` Chris Mason
