From: Kuba Ober <kuba@mareimbrium.org>
To: berthiaume_wayne@emc.com
Cc: reiserfs-list@namesys.com
Subject: Re: A couple of questions
Date: Fri, 17 May 2002 12:03:21 -0400
Message-ID: <200205171203.21328.kuba@mareimbrium.org>
In-Reply-To: <93F527C91A6ED411AFE10050040665D0049BFA37@corpusmx1.us.dg.com>

On Friday 17 May 2002 09:11 am, berthiaume_wayne@emc.com wrote:
> Kuba, I guess the question that should be posed this way: What is
> the downside of not asking the user and just fixing what can be fixed? Is
> there a potential for unrepairable damage if you were to fix blindly
> without "user" intervention?

The downside is that with fscks that are quick hacks, you really require the
user to know a lot, and you ask complicated questions.

Fsck should *first* get all the information it can from the fs and digest it.
Only then can it try to fix things, and ask questions about things that are
doubtful. There are certain things that are 99.999995% true, almost
assertions, because certain damage patterns have extremely slim chances of
occurring.

This is essentially a way to formulate the fsck algorithm in terms similar to
those of some expert systems.

Example: I'll use FAT16 as the example fs, since I assume most people know it
well enough. With a FAT16 filesystem, it was quite easy to tell clusters
occupied by directories from clusters occupied by file data. And then, there
was more data that increased the probability that you indeed had a proper
directory cluster. It might have gone in steps:
(assuming all FAT copies were zeroed)
1. Read all disk clusters, detect those that are probable directories based
solely on cluster contents. Define an "is-directory" property for each
cluster. Assign 0 to this property for clusters that failed detection
in this step, and 0.8 for clusters that were detected.
2. Check for mutual links between the directories detected thus far (the
forward and backward links). Bump the "is-directory" probabilities of
clusters that have passed to 1.0.
3. Assign "is-first-cluster" probabilities to all clusters. Set each to the
value of "is-directory" from the directory cluster that contains an entry
pointing to this cluster, or 0 if nothing points to it.
4. Check for consecutive directory clusters, starting at all clusters having
is-first-cluster > 0 && is-directory > 0. Bump "is-directory" based on
best-known neighbors, etc, ...
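
The four steps above can be sketched on a toy in-memory model. This is only
an illustration under stated assumptions, not a real FAT16 parser: the
Cluster fields (looks_like_dir, children, parent) are hypothetical stand-ins
for what a content scan would produce, and the step 4 bump rule (average
with the predecessor) is my own guess, since the text leaves it open.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    # Hypothetical scan results; a real tool would derive them from raw bytes.
    looks_like_dir: bool = False                        # content-only heuristic
    children: List[int] = field(default_factory=list)   # first clusters named by entries
    parent: Optional[int] = None                        # target of the ".." entry
    is_directory: float = 0.0
    is_first_cluster: float = 0.0


def score(clusters: Dict[int, Cluster]) -> None:
    # Step 1: content-only detection gives 0.8 or 0.
    for c in clusters.values():
        c.is_directory = 0.8 if c.looks_like_dir else 0.0

    # Step 2: a forward link (entry) matched by a backward link ("..")
    # bumps both directories to 1.0.
    for num, c in clusters.items():
        if c.is_directory == 0.0:
            continue
        for child_num in c.children:
            child = clusters.get(child_num)
            if child and child.is_directory > 0.0 and child.parent == num:
                c.is_directory = 1.0
                child.is_directory = 1.0

    # Step 3: is-first-cluster inherits is-directory of the pointing directory.
    for c in clusters.values():
        c.is_first_cluster = 0.0
    for c in clusters.values():
        for child_num in c.children:
            child = clusters.get(child_num)
            if child is not None:
                child.is_first_cluster = max(child.is_first_cluster, c.is_directory)

    # Step 4: walk physically consecutive clusters from each likely first
    # cluster and bump probable continuations (assumed rule: average).
    for num in sorted(clusters):
        start = clusters[num]
        if not (start.is_first_cluster > 0.0 and start.is_directory > 0.0):
            continue
        prev, nxt_num = start, num + 1
        while nxt_num in clusters:
            nxt = clusters[nxt_num]
            if nxt.is_directory == 0.0 or nxt.is_first_cluster > 0.0:
                break
            nxt.is_directory = max(nxt.is_directory,
                                   (nxt.is_directory + prev.is_directory) / 2)
            prev, nxt_num = nxt, nxt_num + 1
```

Multiply-linked entries and loops are ignored here as well; the point is
only that every cluster ends up with graded confidence values before any
repair decision is made.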

There were many shortcuts taken here, since I ignore multiply-linked entries,
loops, etc. It was meant as an example of the idea, not an implementation.
There is a lot of the what-if kind of approach in fs recovery, and by taking
an expert-system, fuzzy-logic (i.e. non-binary) approach, a lot of knowledge
can be gained about a filesystem without asking a single question. We really
only need to look for answers when a recovery decision depends on a piece of
information and we consider the information we have to be too doubtful.

That also means that the fsck/recovery program needs to do a lot of stuff, a
lot more than one thinks. The typical "multi-pass" approach, where errors are
fixed from lower level to higher level, is wrong, since it inherently either
loses information, or doesn't have it yet in earlier steps.

There can be only three passes: gather data from the media, ask the user
additional questions if they are needed, do the fixes. I don't see it any
other way, and I have always thought of an ideal fsck tool in these terms
since I was about 12 (late 80's, already had a third HD in my 286/8 machine,
and had done a few recovery operations with diskeditor). An example of the
wrong approach is, say, Norton Disk Doctor on FAT16: it would first check the
FAT, fix that, and only afterwards check & fix the directory structure -- it
loses a lot of information that each of the passes keeps to itself, e.g.
fixing cluster chains in the FAT doesn't really look at what those clusters
contain, etc.
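
As a rough sketch of the three-pass structure, assuming each fact gathered
from the media carries a confidence between 0 and 1: the function and
parameter names below (three_pass_fsck, gather, ask, fix, needed) and the
0.95/0.05 thresholds are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List


def three_pass_fsck(gather: Callable[[], Dict[str, float]],
                    ask: Callable[[str], bool],
                    fix: Callable[[Dict[str, bool]], object],
                    needed: List[str],
                    accept: float = 0.95,
                    reject: float = 0.05):
    # Pass 1: read-only scan; nothing on the media is modified yet.
    facts = gather()

    # Pass 2: ask only about facts that a fix depends on AND that are
    # still too doubtful. Facts missing from the scan count as doubtful.
    decisions: Dict[str, bool] = {}
    for name in needed:
        confidence = facts.get(name, 0.5)
        if confidence >= accept:
            decisions[name] = True
        elif confidence <= reject:
            decisions[name] = False
        else:
            decisions[name] = ask(name)

    # Pass 3: all writes happen here, with every decision already known.
    return fix(decisions)
```

A doubtful fact that no fix depends on never produces a question, and no
write happens before all the information has been gathered.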

Cheers, Kuba
