Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure

From: Ric Wheeler <rwheeler@redhat.com>
To: "Ted Ts'o" <tytso@mit.edu>
Cc: Amir Goldstein <amir73il@gmail.com>,
	Bernd Schubert <bs_lists@aakef.fastmail.fm>,
	linux-ext4@vger.kernel.org, Bernd Schubert <bschubert@ddn.com>
Subject: Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure
Date: Sun, 24 Oct 2010 09:55:21 -0400
Message-ID: <4CC43AC9.8000409@redhat.com>
In-Reply-To: <20101023221714.GB24650@thunk.org>

  On 10/23/2010 06:17 PM, Ted Ts'o wrote:
> On Sat, Oct 23, 2010 at 06:00:05PM +0200, Amir Goldstein wrote:
>> IMHO, and I've said it before, the mount flag which Bernd requests
>> already exists, namely 'errors=', both as mount option and as
>> persistent default, but it is not enforced correctly at mount time.
>> If an administrator decides that the correct behavior when an error is
>> detected is abort or remount-ro, what's the sense in letting the
>> filesystem mount read-write without fixing the problem?
> Again, consider the case of the root filesystem containing an error.
> When the error is first discovered during the course of the system's
> operation, and it's set to errors=panic, you want to immediately
> reboot the system.  But then, when the root file system is mounted, it
> would be bad to have the system immediately panic again.  Instead,
> what you want to have happen is to allow e2fsck to run, correct the
> file system errors, and then the system can go back to normal operation.
>
> So the current behavior was deliberately designed to be the way that
> it is, and the difference is between "what do you do when you come
> across a file system error", which is what the errors= mount option is
> all about, and "this file system has some kind of error associated
> with it".  Just because it has an error associated with it does not
> mean that immediately rebooting is the right thing to do, even if the
> file system is set to "errors=panic".  In fact, in the case of a root
> file system, it is manifestly the wrong thing to do.  If we did what
> you suggested, then the system would be trapped in a reboot loop
> forever.
>
> 							- Ted

I am still fuzzy on the use case here.

In any shared ext* file system (pacemaker or other), you have some basic rules:

* you cannot have the file system mounted on more than one node
* failover must fence out any other nodes before starting recovery
* failover (once the node is assured that it is uniquely mounting the file 
system) must do any recovery required to clean up the state
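
To make the ordering in those rules concrete, here is a minimal sketch in Python. It is a toy model only, not pacemaker's API or anything ext* provides: SharedVolume, fail_over and the fence callback are hypothetical names invented for illustration, and "recovery" is just a flag standing in for journal replay or e2fsck.

```python
from dataclasses import dataclass, field


@dataclass
class SharedVolume:
    """Toy model of one ext* volume on shared storage (hypothetical)."""
    mounted_on: set = field(default_factory=set)
    needs_recovery: bool = True


def fail_over(volume, new_node, all_nodes, fence=lambda node: True):
    """Take over `volume` on `new_node`, applying the three rules in order.

    `fence` is an assumed callback that returns True once a node is
    guaranteed to be unable to touch the storage any more.
    """
    others = set(all_nodes) - {new_node}

    # Rule 2: fence out every other node before starting recovery.
    for node in sorted(others):
        if not fence(node):
            raise RuntimeError(f"could not fence {node}; refusing to mount")
        # A fenced node can no longer hold the mount.
        volume.mounted_on.discard(node)

    # Rule 1: the file system must not be mounted on more than one node.
    if volume.mounted_on - {new_node}:
        raise RuntimeError("volume is still mounted elsewhere")

    # Rule 3: do any required recovery only once we are the unique owner.
    if volume.needs_recovery:
        volume.needs_recovery = False  # stand-in for journal replay / fsck

    volume.mounted_on.add(new_node)
    return volume
```

The point of the sketch is only the order of operations: if fencing fails, the takeover stops before recovery or mount is attempted.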

Using ext* (or xfs) in an active/passive cluster with failover rules that 
follow the above is really common today.

I don't see what the use case here is - are we trying to pretend that pacemaker 
+ ext* allows us to have a single, shared file system in a cluster mounted on 
multiple nodes?

Why not use ocfs2 or gfs2 for that?

Thanks!

Ric
