From: Boaz Harrosh <bharrosh@panasas.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
<linux-fsdevel@vger.kernel.org>, <linux-ext4@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@infradead.org>,
Al Viro <viro@zeniv.linux.org.uk>,
LKML <linux-kernel@vger.kernel.org>,
Edward Shishkin <edward@redhat.com>
Subject: Re: [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error
Date: Tue, 17 Jan 2012 12:46:08 +0200 [thread overview]
Message-ID: <4F155170.5000206@panasas.com> (raw)
In-Reply-To: <CA+55aFxZ8dF8WagoyQPYTm92R1ZKd0G_tztqmAc+jrv0LkWGAA@mail.gmail.com>
On 01/17/2012 02:59 AM, Linus Torvalds wrote:
> On Mon, Jan 16, 2012 at 4:36 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> Jan is right, Linus. His definition of what up-to-date means for
>> dirty buffers is correct, especially in the case of write errors.
>
> It's not a dirty buffer any more.
>
> Go look. We've long since cleared the dirty bit.
>
> So stop spouting garbage.
>
> My argument is simple: the contents ARE NOT CORRECT ENOUGH to be
> called "up-to-date and clean".
>
> And I outlined the two choices:
>
> - mark it dirty and continue trying to write it out forever
>
> - invalidate it.
>
> Anything else is crazy talk. And marking it dirty forever isn't really
> an option. So..
>
> Linus
I think this conversation is an hint to the fact that the page_cache-page
state machine is clear as mud. And I thought it was only me. For years
I want to catch some VFS guru to sit down and finally explain to me all
the stages and how they are expressed in page-flag bits.
Back to the conversation. The way I understood it (Which is probably wrong)
1. The application dirties a page it is in a *dirty* state.
2. Write-out begins, page goes into that in-write-out state (Am I correct)
Now the page comes back from write-out with an error. As Linus stated we can
not put it back to *dirty* state because it will probably never clear.
(We did bunch of retrys on the block level). And we can't keep it in-write-out
surly. But I think we should surly *not* put it in *not-clean* state. Because
that one implies reading and the worse we can do is read that page as it is
now.
Therefor I agree with Jan. That the best is to use that extra error bit
to indicate an *error-state*, which is up to the FS to handle.
If it was a read error - error-is-set clean-is-cleared
If it was a write err - error-is-set clean-is-set.
All the rest of the Kernel should consider these as a they are error-sate
and I really like Jan's patch of inspecting for error-bit and not the
not-clean in a write-out which is darn confusing. (Regardless of the meaning
of the clean-bit)
Now the filesystem needs to do something about these pages like put them in a Jurnal,
shove them in a recovery workQ or whatever. All the VFS/MM can do is like Linus
said wait until they are plain removed which is effectively like invalidating them.
(In the case the FS did nothing to fix it)
I wish there was some heavy logging when the VFS/MM trashes error-set but clean-set
pages (Write-errors), even a write-out of these buffers to some global journal, of
which tools can extract and amend later. (Like the USB snatched too soon example)
So I see Linus point of "we can't go back to any of the old states" but let's not
overload the clean-bit and use the proper error-bit like Jan suggested.
My $0.017
Boaz
next prev parent reply other threads:[~2012-01-17 10:53 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-05 14:40 [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 1/3] fs: Convert checks for write IO errors from !buffer_uptodate to buffer_write_io_error Jan Kara
2012-01-05 14:40 ` [PATCH 2/3] fs: Do not clear uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 3/3] ext2: Replace tests of write IO errors using buffer_uptodate Jan Kara
2012-01-05 22:16 ` [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Andrew Morton
2012-01-15 2:19 ` Linus Torvalds
2012-01-16 16:01 ` Jan Kara
2012-01-16 18:55 ` Linus Torvalds
2012-01-16 19:06 ` Linus Torvalds
2012-01-17 0:36 ` Dave Chinner
2012-01-17 0:59 ` Linus Torvalds
2012-01-17 10:46 ` Boaz Harrosh [this message]
2012-01-23 3:04 ` Dave Chinner
2012-01-23 21:47 ` Ted Ts'o
2012-01-23 23:49 ` Linus Torvalds
2012-01-24 6:12 ` Dave Chinner
2012-01-24 7:10 ` Linus Torvalds
2012-01-24 12:13 ` Jan Kara
2012-01-24 0:36 ` Dave Chinner
2012-01-26 12:17 ` Ric Wheeler
2012-01-26 20:51 ` Jan Kara
2012-01-26 20:58 ` Ric Wheeler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F155170.5000206@panasas.com \
--to=bharrosh@panasas.com \
--cc=akpm@linux-foundation.org \
--cc=david@fromorbit.com \
--cc=edward@redhat.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).