linux-ext4.vger.kernel.org archive mirror
From: Bernd Schubert <bs_lists@aakef.fastmail.fm>
To: "Ted Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, Bernd Schubert <bschubert@ddn.com>
Subject: Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure
Date: Fri, 22 Oct 2010 20:54:41 +0200
Message-ID: <201010222054.42083.bs_lists@aakef.fastmail.fm>
In-Reply-To: <20101022183219.GQ3127@thunk.org>

On Friday, October 22, 2010, Ted Ts'o wrote:
> On Fri, Oct 22, 2010 at 07:42:49PM +0200, Bernd Schubert wrote:
> > No, it is far more difficult than that. The devices are managed by
> > pacemaker.  Which means: I/O errors come up -> Lustre complains
> > about that in its proc file. Pacemaker monitoring fails, so
> > pacemaker stops the device and starts it again.
> 
> I'm not sure what errors you're referring to, but if the errors are

There are multiple ways for Lustre to tell you that there is a problem.
Errors from the underlying filesystem are just one of many.

> related to file system inconsistencies, by definition umounting and
> re-mounting isn't going to fix things, and could result in more
> damage.  For certain errors, you really do need to run e2fsck before
> remounting the device.

Yes, and that is exactly why I'm asking for another mount option that refuses
the mount when the filesystem itself already knows it has recorded an error.

> 
> Can you not change pacemaker to stop the device, run e2fsck, and then
> remount the file system?

I am sure I could spend the next 4 weeks writing code that would allow doing
that with Lustre and pacemaker. But at the same time, it seems far easier
to add another mount flag to ext4...
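
For illustration only, the check under discussion can be approximated in
userspace. The sketch below assumes its input is the header text printed by
`dumpe2fs -h <device>` and looks at the "Filesystem state:" line; the function
names are hypothetical and not part of ext4, Lustre, or pacemaker. It is not
the proposed mount option, and it only sees the error flag in the main
superblock, so an error that was recorded only in the journal (the case in the
subject line) may not show up here.

```python
import re


def filesystem_state(dumpe2fs_header: str) -> str:
    """Return the value of the 'Filesystem state:' line, or '' if absent."""
    match = re.search(r"^Filesystem state:[ \t]*(.+)$",
                      dumpe2fs_header, re.MULTILINE)
    return match.group(1).strip() if match else ""


def mount_allowed(dumpe2fs_header: str) -> bool:
    """Hypothetical policy: refuse to mount if an error flag is recorded.

    Fails closed: if no state line is found, the mount is refused as well.
    """
    state = filesystem_state(dumpe2fs_header)
    return state != "" and "with errors" not in state
```

A resource agent or wrapper script could call `mount_allowed()` before
mounting and leave the device stopped for a manual e2fsck when it returns
False.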

I also cannot simply set a max_failcount=1 in pacemaker, as that would
completely defeat the HA concept. There are so many ways to increase the
failcount, for example Lustre bugs (ext4 unrelated), pacemaker bugs, human
errors (something missing on one node, but available on another), etc. A few
failures (ext4 unrelated) are absolutely 'normal' over a couple of months and
there is no reason not to allow that.

I'm not asking you to implement another feature, but I'm asking if a patch to
add a new option would be accepted. I also cannot promise to implement that
any time soon, given that I will leave DDN at the end of November. But it seems
to be an option useful for everyone, including on my desktop. So I will either
do that over the next 4 weeks when I find a minute, or around Christmas.

Thanks,
Bernd

-- 
Bernd Schubert
DataDirect Networks

Thread overview: 30+ messages
2010-10-22 13:33 ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure Bernd Schubert
2010-10-22 17:25 ` Ted Ts'o
2010-10-22 17:42   ` Bernd Schubert
2010-10-22 18:32     ` Ted Ts'o
2010-10-22 18:54       ` Bernd Schubert [this message]
2010-10-23 16:00   ` Amir Goldstein
2010-10-23 17:46     ` Bernd Schubert
2010-10-23 22:26       ` Ted Ts'o
2010-10-23 23:56         ` Bernd Schubert
2010-10-24  0:20           ` Bernd Schubert
2010-10-24  1:08             ` Ted Ts'o
2010-10-24 14:42               ` Bernd Schubert
2010-10-23 22:17     ` Ted Ts'o
2010-10-24  8:50       ` Amir Goldstein
2010-10-24 13:55       ` Ric Wheeler
2010-10-24 14:30         ` Bernd Schubert
2010-10-24 15:20           ` Ric Wheeler
2010-10-24 15:39             ` Bernd Schubert
2010-10-24 15:49               ` Ric Wheeler
2010-10-24 16:16                 ` Bernd Schubert
2010-10-24 16:43                   ` Ric Wheeler
2010-10-25 10:14                     ` Andreas Dilger
2010-10-25 11:45                       ` Ric Wheeler
2010-10-25 12:54                         ` Ric Wheeler
2010-10-25 14:57                           ` Andreas Dilger
2010-10-25 19:49                             ` Ric Wheeler
2010-10-25 20:08                               ` Bernd Schubert
2010-10-25 20:10                                 ` Ric Wheeler
2010-10-25 19:43                       ` Eric Sandeen
2010-10-25 20:37                         ` Bernd Schubert
