From: Ric Wheeler <rwheeler@redhat.com>
To: Andreas Dilger <andreas.dilger@oracle.com>
Cc: Bernd Schubert <bschubert@ddn.com>, "Ted Ts'o" <tytso@mit.edu>,
Amir Goldstein <amir73il@gmail.com>,
Bernd Schubert <bs_lists@aakef.fastmail.fm>,
Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure
Date: Mon, 25 Oct 2010 07:45:50 -0400
Message-ID: <4CC56DEE.8020306@redhat.com>
In-Reply-To: <2D4557FB-DE12-43C3-A277-EE4DD82F0BFF@oracle.com>
On 10/25/2010 06:14 AM, Andreas Dilger wrote:
> On 2010-10-25, at 00:43, Ric Wheeler wrote:
>> On 10/24/2010 12:16 PM, Bernd Schubert wrote:
>>> ... sometimes the error state is only set *after* mounting the filesystem,
>>> so it is difficult to script. And as I also wrote, running e2fsck from that
>>> script to do a complete fs check is not appropriate, as that might
>>> simply time out. Again, this is not Lustre specific. So after some discussion,
>>> the proposed solution is to add a "journal recovery only" option to e2fsck
>>> and to run that before the mount. I will add that to the 'lustre_server'
>>> agent (which is part of Lustre now), but leave it to someone else to do that
>>> for the 'Filesystem' agent script (I'm not using that script myself and
>>> IMHO it is already too complex, as it tries to support all filesystems;
>>> shell code is not ideal anymore then).
>> Why not simply have your script attempt to mount the file system? If it succeeds, it will replay the journal. If it fails, you will need to fall back to the long fsck which is unavoidable.
> I don't really agree with this. The whole reason for having the error flag in the superblock and ALWAYS running e2fsck at mount time to replay the journal is that e2fsck should be done before mounting the filesystem.
>
> I really dislike the reiserfs/XFS model where a filesystem is mounted and fsck is not run in advance, and then if there is a serious error in the filesystem this needs to be detected by the kernel, the filesystem unmounted, e2fsck started, and the filesystem remounted... That's just backward.
>
Even if you disagree with the model, that would seem to solve the issue for
Bernd without having to make a change in the utilities.
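
A minimal sketch of that mount-first approach, assuming the agent can be
written in Python rather than shell. The function name mount_or_fsck, the
injectable run parameter, and the returned status strings are hypothetical
and only serve the illustration; the mount and e2fsck invocations use
standard options (-f forces a check, -p repairs automatically without
prompting).

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's exit status.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def mount_or_fsck(device: str, mountpoint: str, run: Runner = run_command) -> str:
    """Try to mount first; fall back to a full e2fsck only if the mount fails."""
    # A successful mount of a journaled ext3/ext4 filesystem replays the journal.
    if run(["mount", device, mountpoint]) == 0:
        return "mounted"

    # The mount failed, so the long check is unavoidable.
    status = run(["e2fsck", "-f", "-p", device])

    # e2fsck exit status: 0 = no errors, 1 = errors corrected.
    # Anything else (reboot needed, errors left uncorrected, operational
    # error) is treated here as a failure that needs an administrator.
    if status not in (0, 1):
        return "fsck-failed"

    if run(["mount", device, mountpoint]) == 0:
        return "mounted-after-fsck"
    return "mount-failed"
```

Note that this deliberately skips running e2fsck before the first mount,
which is exactly the part of the model you object to.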
Thanks!
Ric
>> We spend a lot of time on testing to make sure that ext* can be shot at any point and come back after a storage outage and still mount.
> Sure, it can still mount, but the only thing it might be able to do is detect the error and remount the filesystem read-only or panic... That's why e2fsck should ALWAYS be run BEFORE the filesystem is mounted.
>
> Bernd's issue (the part that I agree with) is that the error may only be recorded in the journal, not in the ext3 superblock, and there is no easy way to detect this from userspace. Allowing e2fsck to only replay the journal is useful for this problem. Another similar issue is that if tune2fs is run on an unmounted filesystem that hasn't had a journal replay, then it may modify the superblock, but journal replay will clobber this. There are other similar issues.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>