From: Boaz Harrosh <bharrosh@panasas.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
<linux-fsdevel@vger.kernel.org>, <linux-ext4@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@infradead.org>,
Al Viro <viro@zeniv.linux.org.uk>,
LKML <linux-kernel@vger.kernel.org>,
Edward Shishkin <edward@redhat.com>
Subject: Re: [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error
Date: Tue, 17 Jan 2012 12:46:08 +0200
Message-ID: <4F155170.5000206@panasas.com>
In-Reply-To: <CA+55aFxZ8dF8WagoyQPYTm92R1ZKd0G_tztqmAc+jrv0LkWGAA@mail.gmail.com>
On 01/17/2012 02:59 AM, Linus Torvalds wrote:
> On Mon, Jan 16, 2012 at 4:36 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> Jan is right, Linus. His definition of what up-to-date means for
>> dirty buffers is correct, especially in the case of write errors.
>
> It's not a dirty buffer any more.
>
> Go look. We've long since cleared the dirty bit.
>
> So stop spouting garbage.
>
> My argument is simple: the contents ARE NOT CORRECT ENOUGH to be
> called "up-to-date and clean".
>
> And I outlined the two choices:
>
> - mark it dirty and continue trying to write it out forever
>
> - invalidate it.
>
> Anything else is crazy talk. And marking it dirty forever isn't really
> an option. So..
>
> Linus
I think this conversation is a hint that the page-cache page
state machine is clear as mud. And I thought it was only me. For years
I have wanted to catch some VFS guru to sit down and finally explain to me all
the stages and how they are expressed in page-flag bits.
Back to the conversation. The way I understood it (which is probably wrong):
1. The application dirties a page; it is in a *dirty* state.
2. Write-out begins; the page goes into an in-write-out state. (Am I correct?)
Now the page comes back from write-out with an error. As Linus stated, we cannot
put it back into the *dirty* state because it will probably never clear.
(We already did a bunch of retries at the block level.) And we surely can't keep it
in-write-out. But I think we should surely *not* put it in the *not-clean* state, because
that one implies reading, and the worst we can do is read that page as it is
now.
Therefore I agree with Jan that the best option is to use that extra error bit
to indicate an *error-state*, which is up to the FS to handle.
If it was a read error  - error-is-set, clean-is-cleared
If it was a write error - error-is-set, clean-is-set
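To make the states above concrete, here is a minimal user-space sketch of
the lifecycle and the two error encodings. It is only an illustration of
the proposal in this mail, not kernel code: the TOY_* flags, struct
toy_page, and all toy_* helpers are hypothetical names, and "clean" is
used the way this mail uses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for a toy page; these are NOT the kernel's page flags. */
enum {
	TOY_CLEAN    = 1u << 0, /* "clean" as used in this mail: no re-read needed */
	TOY_DIRTY    = 1u << 1, /* modified by the application, not yet written */
	TOY_WRITEOUT = 1u << 2, /* write-out in flight */
	TOY_ERROR    = 1u << 3, /* last I/O on this page failed */
};

struct toy_page {
	unsigned int flags;
};

/* Step 1: the application dirties the page. */
static void toy_dirty(struct toy_page *p)
{
	assert(p != NULL);
	p->flags |= TOY_DIRTY;
}

/* Step 2: write-out begins; the dirty bit is dropped while I/O is in flight. */
static void toy_start_writeout(struct toy_page *p)
{
	assert(p != NULL && (p->flags & TOY_DIRTY));
	p->flags &= ~(unsigned int)TOY_DIRTY;
	p->flags |= TOY_WRITEOUT;
}

/* Write-out completes. On failure: error-is-set, clean-is-set. */
static void toy_end_writeout(struct toy_page *p, bool ok)
{
	assert(p != NULL && (p->flags & TOY_WRITEOUT));
	p->flags &= ~(unsigned int)TOY_WRITEOUT;
	if (!ok)
		p->flags |= TOY_ERROR; /* clean stays set; page is not re-dirtied */
}

/* Read completes. On failure: error-is-set, clean-is-cleared. */
static void toy_end_read(struct toy_page *p, bool ok)
{
	assert(p != NULL);
	if (ok) {
		p->flags |= TOY_CLEAN;
	} else {
		p->flags |= TOY_ERROR;
		p->flags &= ~(unsigned int)TOY_CLEAN;
	}
}

/* Everyone outside the FS only needs to ask this one question. */
static bool toy_in_error_state(const struct toy_page *p)
{
	assert(p != NULL);
	return (p->flags & TOY_ERROR) != 0;
}
```

In this model a failed write leaves the page with TOY_ERROR and TOY_CLEAN
both set and neither TOY_DIRTY nor TOY_WRITEOUT, so nothing retries the
write forever and nothing is tempted to re-read stale data from disk.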
All the rest of the kernel should treat both of these as an error state,
and I really like Jan's patch that inspects the error bit rather than
not-clean after a write-out, which is darn confusing. (Regardless of the meaning
of the clean-bit.)
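A rough sketch of why testing the error bit is less confusing than testing
not-uptodate, again as a toy user-space model. The struct toy_buf and the
four helpers below are hypothetical; they only loosely mirror the
!buffer_uptodate versus buffer_write_io_error checks named in the patch
titles of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer; NOT the kernel's struct buffer_head. */
struct toy_buf {
	bool uptodate;       /* contents are valid in memory */
	bool write_io_error; /* a write of this buffer failed */
};

/* Completion that clears uptodate on a failed write (the behaviour being argued about). */
static void end_write_clear_uptodate(struct toy_buf *b, bool ok)
{
	assert(b != NULL);
	if (!ok) {
		b->write_io_error = true;
		b->uptodate = false;
	}
}

/* Completion that only records the error and leaves uptodate alone. */
static void end_write_keep_uptodate(struct toy_buf *b, bool ok)
{
	assert(b != NULL);
	if (!ok)
		b->write_io_error = true;
}

/* Indirect test: infers "the write failed" from "not uptodate". */
static bool old_check_write_failed(const struct toy_buf *b)
{
	assert(b != NULL);
	return !b->uptodate;
}

/* Direct test: asks the error bit. */
static bool new_check_write_failed(const struct toy_buf *b)
{
	assert(b != NULL);
	return b->write_io_error;
}
```

The indirect test also fires for a buffer that was simply never read, and
it stops working once the completion path keeps uptodate set. The direct
test gives the same answer under either completion policy.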
Now the filesystem needs to do something about these pages, like put them in a journal,
shove them in a recovery workqueue, or whatever. All the VFS/MM can do is, like Linus
said, wait until they are plainly removed, which is effectively like invalidating them
(in the case where the FS did nothing to fix it).
I wish there was some heavy logging when the VFS/MM trashes error-set but clean-set
pages (write errors), or even a write-out of these buffers to some global journal, from
which tools can extract and amend later. (Like the USB snatched too soon example.)
So I see Linus's point of "we can't go back to any of the old states", but let's not
overload the clean-bit; let's use the proper error-bit like Jan suggested.
My $0.017
Boaz
Thread overview: 23+ messages
2012-01-05 14:40 [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 1/3] fs: Convert checks for write IO errors from !buffer_uptodate to buffer_write_io_error Jan Kara
2012-01-05 14:40 ` [PATCH 2/3] fs: Do not clear uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 3/3] ext2: Replace tests of write IO errors using buffer_uptodate Jan Kara
2012-01-05 22:16 ` [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Andrew Morton
2012-01-15 2:19 ` Linus Torvalds
2012-01-15 2:19 ` Linus Torvalds
2012-01-16 16:01 ` Jan Kara
2012-01-16 18:55 ` Linus Torvalds
2012-01-16 19:06 ` Linus Torvalds
2012-01-17 0:36 ` Dave Chinner
2012-01-17 0:59 ` Linus Torvalds
2012-01-17 10:46 ` Boaz Harrosh [this message]
2012-01-23 3:04 ` Dave Chinner
2012-01-23 21:47 ` Ted Ts'o
2012-01-23 23:49 ` Linus Torvalds
2012-01-24 6:12 ` Dave Chinner
2012-01-24 7:10 ` Linus Torvalds
2012-01-24 12:13 ` Jan Kara
2012-01-24 0:36 ` Dave Chinner
2012-01-26 12:17 ` Ric Wheeler
2012-01-26 20:51 ` Jan Kara
2012-01-26 20:58 ` Ric Wheeler