From: Theodore Ts'o <tytso@mit.edu>
To: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>,
"Luck, Tony" <tony.luck@intel.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
"Kleen, Andi" <andi.kleen@intel.com>,
"Wu, Fengguang" <fengguang.wu@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Jan Kara <jack@suse.cz>, Jun'ichi Nomura <j-nomura@ce.jp.nec.com>,
Akira Fujita <a-fujita@rs.jp.nec.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page
Date: Mon, 29 Oct 2012 14:24:56 -0400 [thread overview]
Message-ID: <20121029182455.GA7098@thunk.org> (raw)
In-Reply-To: <m27gq9r2cu.fsf@firstfloor.org>
On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote:
> > Agreed, if we're going to add an xattr, then we might as well store
>
> I don't think an xattr makes sense for this. It's sufficient to keep
> this state in memory.
>
> In general these error paths are hard to test and it's important
> to keep them as simple as possible. Doing IO and other complexities
> just doesn't make sense. Just have the simplest possible path
> that can do the job.
It's actually pretty easy to test this particular one, and certainly
one of the things I'd strongly encourage in this patch series is the
introduction of an interface via madvise and fadvise that allows us to
simulate an ECC hard error event. So I don't think "it's hard to
test" is a reason not to do the right thing. Let's make it easy to
test, and the include it in xfstests. All of the core file systems
are regularly running regression tests via xfstests, so if we define a
standard way of testing this function (this is why I suggested
fadvise/madvise instead of an ioctl, but in a pinch we could do this
via an ioctl instead).
Note that the problem that we're dealing with is buffered writes; so
it's quite possible that the process which wrote the file, thus
dirtying the page cache, has already exited; so there's no way we can
guarantee we can inform the process which wrote the file via a signal
or a error code return. It's also possible that part of the file has
been written out to the disk, so forcibly crashing the system and
rebooting isn't necessarily going to save the file from being
corrupted.
Also, if you're going to keep this state in memory, what happens if
the inode gets pushed out of memory? Do we pin the inode? Or do we
just say that it's stored into memory until some non-deterministic
time (as far as userspace programs are concerned, but if they are
running in a tight memory cgroup, it might be very short time later)
suddenly the state gets lost?
I think it's fair that if there are file systems that don't have a
single bit they can allocate in the inode, we can either accept
Jun'ichi's suggest to just forcibly crash the system, or we can allow
the state to be stored in memory. But I suspect the core file systems
that might be used by enterprise-class workloads will want to provide
something better.
I'm not that convinced that we need to insert an xattr; after all, not
all file systems support xattrs at all, and I think a single bit
indicating that the file has corrupted data is sufficient. But I
think it would be useful to at least optionally support a persistent
storage of this bit.
- Ted
next prev parent reply other threads:[~2012-10-29 18:25 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-25 15:12 [PATCH 0/3] HWPOISON: improve error_remove_page() Naoya Horiguchi
2012-10-25 15:12 ` [PATCH 1/3] mm: print out information of file affected by memory error Naoya Horiguchi
2012-10-25 19:32 ` Jan Kara
2012-10-25 20:34 ` Naoya Horiguchi
2012-10-25 15:12 ` [PATCH 2/3] ext4: introduce ext4_error_remove_page Naoya Horiguchi
2012-10-25 19:39 ` Jan Kara
2012-10-26 6:12 ` Theodore Ts'o
2012-10-26 16:55 ` Luck, Tony
2012-10-26 18:46 ` Theodore Ts'o
2012-10-26 22:24 ` Luck, Tony
2012-10-27 22:16 ` Theodore Ts'o
2012-10-28 1:57 ` Naoya Horiguchi
2012-10-29 1:16 ` Dave Chinner
2012-10-29 2:40 ` Theodore Ts'o
2012-10-29 10:37 ` Andi Kleen
2012-10-29 11:05 ` Jun'ichi Nomura
2012-10-29 18:24 ` Theodore Ts'o [this message]
2012-10-29 18:55 ` Jan Kara
2012-10-29 19:07 ` Andi Kleen
2012-10-29 21:47 ` Naoya Horiguchi
2012-10-30 0:00 ` Jun'ichi Nomura
2012-10-29 18:11 ` Luck, Tony
2012-10-31 0:21 ` Dave Chinner
2012-10-26 18:50 ` Naoya Horiguchi
2012-10-25 15:12 ` [PATCH 3/3] ext3: introduce ext3_error_remove_page Naoya Horiguchi
2012-10-25 19:45 ` Jan Kara
2012-10-25 20:35 ` Naoya Horiguchi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121029182455.GA7098@thunk.org \
--to=tytso@mit.edu \
--cc=a-fujita@rs.jp.nec.com \
--cc=akpm@linux-foundation.org \
--cc=andi.kleen@intel.com \
--cc=andi@firstfloor.org \
--cc=david@fromorbit.com \
--cc=fengguang.wu@intel.com \
--cc=j-nomura@ce.jp.nec.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).