From: Hans Reiser <reiser@namesys.com>
To: Andreas Dilger <adilger@clusterfs.com>
Cc: Oleg Drokin <green@namesys.com>,
Zygo Blaxell <eazgwmir@umail.furryterror.org>,
reiserfs-list@namesys.com
Subject: Re: Corrupted/unreadable journal: reiser vs. ext3
Date: Fri, 14 Feb 2003 22:19:05 +0300
Message-ID: <3E4D4129.8040103@namesys.com>
In-Reply-To: <20030214120630.O22930@schatzie.adilger.int>

Andreas Dilger wrote:
>On Feb 14, 2003 16:34 +0300, Hans Reiser wrote:
>
>
>>Andreas Dilger wrote:
>>
>>
>>>Yeah, I keep giving him good reasons to change his mind, even a little,
>>>like "have 'reiserfsck -a' just check the superblock and return with a
>>>code > 1 if there is an error" so that an admin can at least do something
>>>about it if the filesystem is broken, before it gets mounted/written to
>>>again and the brokenness multiplies unknown to the user...
>>>
>>>
>>I don't understand you.
>>
>>
>
>Ok, so the reiserfs kernel code detects an error on disk, what does it
>do? Print out an error message, maybe BUG? There is an "error" field
>in the reiserfs superblock, I hope it is set when the kernel detects
>something bad.
>
>So, now what happens? Maybe the user doesn't read their syslog and
>doesn't see the error, or the error is just a prelude to memory corruption
>which causes the system to crash. When the system boots again, it goes
>on its merry way, mounting the reiserfs filesystem with _known_ errors
>on it, using bad allocation bitmaps, directory btrees, etc. and maybe
>double allocating blocks or overwriting blocks from other files causing
>them to become corrupt, etc, etc, etc. Until finally the filesystem is
>totally corrupt, the system crashes miserably, the user emails this list
>and reiserfsck has an impossible job trying to fix the filesystem.
>
>Instead, what I propose is to have "reiserfsck -a" AS A STARTING POINT
>simply check for a valid reiserfs superblock and the absence of the
>"error" flag before declaring the filesystem clean and allowing the
>system to boot.
>
>What's even worse, the reiserfs_read_super (at least 2.4.18 RH kernel)
>code OVERWRITES the superblock error status at mount time, making it
>worse than useless, since each mount hides any errors that were detected
>before the crash:
>
> s->u.reiserfs_sb.s_mount_state = SB_REISERFS_STATE(s);
> s->u.reiserfs_sb.s_mount_state = REISERFS_VALID_FS ;
>
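[Editor's sketch, not part of the original message. It illustrates the two points quoted above: a quick "-a" style check that only inspects the superblock and returns an exit code above 1 when the error flag is set, and a mount path that keeps the on-disk state instead of overwriting it. The struct, the SKETCH_* names and both functions are hypothetical and do not reflect the real reiserfs on-disk layout or the reiserfsck source. The exit code 4 follows the fsck(8) convention for "errors left uncorrected".]

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a superblock (NOT the real layout). */
struct sb_view {
    char     magic[10];   /* magic string, NUL-terminated          */
    uint16_t state;       /* bit flags recorded by the kernel      */
};

/* Illustrative values only; assumptions made for this sketch. */
#define SKETCH_MAGIC     "ReIsEr2Fs"
#define SKETCH_VALID_FS  0x1
#define SKETCH_ERROR_FS  0x2

/* fsck(8) exit code convention: 0 = clean, 4 = errors left uncorrected. */
enum { FSCK_OK = 0, FSCK_UNCORRECTED = 4 };

/*
 * Quick pre-mount check: valid magic and no error flag means "clean".
 * Anything else returns a code > 1 so the boot scripts can stop and
 * let the admin act before the filesystem is written to again.
 */
static int quick_check(const struct sb_view *sb)
{
    if (memcmp(sb->magic, SKETCH_MAGIC, sizeof(SKETCH_MAGIC)) != 0)
        return FSCK_UNCORRECTED;
    if (sb->state & SKETCH_ERROR_FS)
        return FSCK_UNCORRECTED;
    return FSCK_OK;
}

/*
 * Mount-time state handling that preserves a previously recorded
 * error instead of unconditionally assigning the "valid" value.
 */
static uint16_t state_after_mount(const struct sb_view *sb)
{
    return sb->state;   /* keep whatever the last session recorded */
}
```

With this shape, a superblock whose state carries the error bit makes quick_check() return 4, and the bit is still visible after state_after_mount(), so a later check can see it.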
Andreas seems reasonable. Vitaly, what are your thoughts?
>
>
>
>>>Next, add journal replay to reiserfsck if it isn't already there,
>>>
>>>
>>>
>>Why, when it is in the kernel?
>>
>>
>
>Because that is the next stage to allowing reiserfsck to do checks on the
>filesystem after a crash. Are you telling me you would rather (and you
>must, because it obviously currently does) have reiserfsck just throw
>away everything in the journal, leaving possibly inconsistent data in
>the filesystem for it to check? Or maybe make the user mount the
>filesystem (which obviously has problems or they wouldn't be running
>reiserfsck to do a full check) just to clear out the journal and maybe
>risk crashing or corruption if the filesystem is strangely corrupted?
>
Vitaly, answer this.
>
>
>
>>Maybe having some code to check whether fsck was run in the last 3
>>months, and if not then if the user types y in the next 30 seconds
>>during boot it will be run, would make sense.
>>
>>
>
>Sure, that would be great, given the prevalence of memory errors and
>IDE DMA errors that show up these days, which the filesystem and the
>journal can do nothing about.
>
>
>
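[Editor's sketch, not part of the original message. It shows the time-based test suggested above: decide whether a check is due from the time of the last fsck rather than from a mount count. The function name, the 90-day constant standing in for "3 months", and the choice to treat a timestamp in the future as overdue are all assumptions. The 30-second boot prompt is not modelled.]

```c
#include <stdbool.h>
#include <time.h>

/* "3 months" approximated as 90 days; an assumption for this sketch. */
#define SKETCH_CHECK_INTERVAL_SECS (90L * 24 * 60 * 60)

/*
 * Returns true when the last recorded fsck is at least the interval
 * in the past. A last_check later than now (clock reset, damaged
 * field) is treated as overdue, which errs on the side of checking.
 */
static bool fsck_overdue(time_t last_check, time_t now)
{
    if (last_check > now)
        return true;
    return difftime(now, last_check) >= (double)SKETCH_CHECK_INTERVAL_SECS;
}
```

A boot script could call this with the stored last-check time and time(NULL), and only then offer the user the option to run the check.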
>>The ext2 tradition of checking the number of mounts since the last fsck
>>is simply counting the wrong thing.
>>
>>
>
>It's only a matter of defaults, safe vs. fast... e2fsck defaults to safe,
>checking occasionally for possible corruption, vs. reiserfs waiting for
>fatal corruption before forcing the user to run reiserfsck, which is so
>heavily discouraged (on the list, in the documentation, when run) that
>nobody runs it for fear of damaging their filesystem further.
>
It is probably not so dangerous anymore.
> You are well aware
>that the e2fsck check intervals can be tuned per-filesystem and even
>disabled if desired (it prints options for how to do this at mke2fs time
>and is clearly documented for the experienced user). For a boot-once-a-day
>machine, the default is to check about once a month (at most 6 months for
>the time check), and if machines are crashing more often, then they should
>probably be checked more often because _something_ has to be causing crashes.
>
The idea that how often you boot determines how often it checks is just
silly, sorry.
>
>Having reiserfsck just do read-only checks shouldn't force you to type
>"yes" (and we mean "yes" because this is so scary, mere mortals shouldn't
>be doing this). Hans, you've always talked about making things easy for
>the average user (error messages and such), don't you think that a data
>consistency check should be made a little less intimidating for the user too?
>
>
>
I think that you should have to agree that you have time to wait for
fsck before you get stuck with a one-day fsck of a large server.
--
Hans