From: Stephan von Krawczynski <skraw@ithnet.com>
To: "dbz" <hwallenstone@gmx.de>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: Some very basic questions
Date: Mon, 27 Oct 2008 16:43:52 +0100
Message-ID: <20081027164352.f7e39d1e.skraw@ithnet.com>
In-Reply-To: <11a001c93453$7ec465a0$0a00a8c0@ALDI2>

On Wed, 22 Oct 2008 16:35:55 +0200
"dbz" <hwallenstone@gmx.de> wrote:

> concerning this discussion, I'd like to put up some "requests" which
> strongly oppose those brought up initially:
> 
> - if you run into an error in the fs structure or any IO error that prevents 
> you from bringing the fs into a consistent state, please simply oops. If a 
> user feels that availability is a main issue, he has to use a failover 
> solution. In this case a fast and clean cut is desirable and no
> "pray-and-hope-mode" or "90%-mode". If availability is not the issue, it is
> in any case most important that data on the fs is safe. If you don't oops,
> you risk doing further damage to the filesystem and ending up with a
> completely destroyed fs.

Hi Gerald,

this is a good proposal for explaining why most failover setups do indeed not
work. If you look at the numerous internet howtos about building failover, you
will recognise that 95% talk about servers that synchronise their fs with all
kinds of tools _offline_, like drbd - or choose some network-dependent raid,
like nbd or enbd. What all of these have in common is that they are unreliable
simply because of the mount needed during failover. In your example: if box 1
oopses because of some error, chances are that box 2, trying to mount the very
same data (which it should be, because of raid or sync), will fail to mount,
too. That leaves you with exactly nothing in hand.
 
> - if you get any IO error, please **don't** put up a number of retries or 
> anything. If the device reports an error simply believe it. It is bad enough 
> that many block drivers or controllers try to be smart and put up hundreds 
> of retries. Adding further retries you only end up in wasting hours on 
> useless retries. If availability is an issue, the user again has to put up a 
> failover solution. Again, a clean cut is what is needed. The user has to 
> make sure he uses an appropriate configuration according to the importance of
> his data (mirroring on the fs and/or RAID, failover ...)

Well, this leaves you with my proposal to optionally stop retrying, marking
files or (better) blocks as dead.
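
A minimal sketch of what such an optional policy could look like, purely for
illustration. Nothing here is btrfs code: the names BlockReader, RetryPolicy,
BlockIOError and the read_block callback are hypothetical and assumed only for
this example. With max_retries set to 0 the first reported error is believed,
the block is marked dead, and later reads of it fail fast without touching the
device again.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


class BlockIOError(Exception):
    """Raised by the (hypothetical) device layer when a block read fails."""


@dataclass
class RetryPolicy:
    # 0 means: believe the first error and never retry.
    max_retries: int = 0


@dataclass
class BlockReader:
    # Hypothetical device read function: block number -> block contents.
    read_block: Callable[[int], bytes]
    policy: RetryPolicy = field(default_factory=RetryPolicy)
    dead_blocks: Set[int] = field(default_factory=set)

    def read(self, block_no: int) -> Optional[bytes]:
        """Return the block contents, or None if the block is dead."""
        if block_no in self.dead_blocks:
            # Already known bad: fail fast, do not touch the device again.
            return None
        # One initial attempt plus the configured number of retries.
        for _attempt in range(self.policy.max_retries + 1):
            try:
                return self.read_block(block_no)
            except BlockIOError:
                continue
        # All attempts failed: mark the block dead and carry on.
        self.dead_blocks.add(block_no)
        return None
```

A caller would construct BlockReader(device_read) for the no-retry behaviour,
or BlockReader(device_read, RetryPolicy(max_retries=2)) to allow a bounded
number of retries before the block is given up.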
 
> - if during mount something unexpected comes up and you can't be sure that
> the fs will work properly, please deny mounting and request a fsck. This can
> be easily handled by a start- or mount-script. During mount, take the time
> you need to ensure that the fs looks proper and safe to use. I'd rather know
> during boot that something is wrong than run with a foul fs and end up
> with data loss or any other mixup later on.

As explained above, it is exactly the lack of parallel mounts that leaves you
with very little time during mount. A failover that takes even 10 minutes to
re-mount is no failover, it is sh.t. ext? btw hardly ever mounts TBs in under
10 minutes.
 
> - btrfs is no cluster fs, so there is no point of even thinking about it. If 
> somebody feels he needs multiple writeable mounts of the same fs, please use 
> a cluster fs. Of course, you have to live with the tradeoffs. Dreaming of a 
> fs that uses something like witchcraft to do things like locking, quorums, 
> cache synchronisation without penalty and, of course, without any 
> configuration, is pointless.

This reads pretty much like "a processor is a processor and not multiple
processors". We all know today that those days are over. In 5 years you
will pretty much be saying the same about "single fs" vs. "cluster fs".
 
> In my opinion, the whole thing comes up from the idea of using cheap hardware
> and out-of-the-box configurations to keep promises of reliability and 
> availability which are not realistic. There is a reason why there are more 
> expensive HDDs, RAIDs, SANs with volume mirroring, multipathing and so on. 
> Simply ignoring the fact that you have to use the proper tools to address
> specific problems and praying to the tooth fairy to put a
> solve-all-my-problems-fs under your pillow is no solution. I'd rather have a
> solid fs with deterministic behavior and some state-of-the-art features.

Well, sorry to say, but I am beginning to sound a bit like Joseph Stiglitz
trying to explain why neoliberalism does not work out.
Please accept that this world is full of failures of all kinds. If you deny
that, all your models and ideas will only be failures, too.
All I am saying is that we should accept that dead sectors, braindead
firmware programmers, production in jungle environments, transportation in
rough areas, high temperatures, high humidity, harddisks that have no disks
and so on are facts of life. And only a child's answer can be: "oops"
(sorry, could not resist this one ;-)
 
> Just my 2c.
> (Gerald) 

-- 
Regards,
Stephan
