From: Andrew Morton <akpm@digeo.com>
To: Ross Biro <rossb@google.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [BUG] Failed writes marked clean?
Date: Fri, 08 Nov 2002 12:57:19 -0800
Message-ID: <3DCC252F.65C0F70B@digeo.com>
In-Reply-To: 3DCC1EB5.4020303@google.com

Ross Biro wrote:
> 
> Perhaps I'm reading the code incorrectly, but in kernel versions 2.4.18
> and 2.5.46 it looks to me like in the case of a write, ll_rw_block
> always clears the dirty bit.  In the event of an error, nothing resets
> the dirty bit and the uptodate flag is cleared.  This means that if the
> same block needs to be read again, the buffer cache will see that the
> buffer is not uptodate and attempt to read the old contents of the
> buffer off of the device.  If the read succeeds the kernel ends up
> corrupting data.

That's correct, for metadata.  It may not be fully accurate for
file data, where the page state comes into play as well.
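
To make the sequence concrete, here is a minimal userspace sketch of
the metadata case.  It is not kernel code: toy_buffer, toy_disk,
toy_write and toy_read are hypothetical names invented for
illustration, and the flag handling only models the behaviour
described above (dirty always cleared, uptodate cleared on error).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8

/* Hypothetical stand-in for a buffer; not the kernel's buffer_head. */
struct toy_buffer {
	bool dirty;
	bool uptodate;
	char data[TOY_BLOCK_SIZE];
};

/* Hypothetical one-block disk; fail_writes simulates a write error. */
struct toy_disk {
	bool fail_writes;
	char block[TOY_BLOCK_SIZE];
};

/*
 * Models the reported behaviour: the dirty bit is always cleared,
 * and on error uptodate is cleared and nothing redirties the buffer.
 */
static int toy_write(struct toy_disk *d, struct toy_buffer *b)
{
	assert(d != NULL && b != NULL);
	b->dirty = false;
	if (d->fail_writes) {
		b->uptodate = false;
		return -1;
	}
	memcpy(d->block, b->data, TOY_BLOCK_SIZE);
	return 0;
}

/* A later lookup refills any buffer that is not uptodate from the disk. */
static void toy_read(struct toy_disk *d, struct toy_buffer *b)
{
	assert(d != NULL && b != NULL);
	if (!b->uptodate) {
		memcpy(b->data, d->block, TOY_BLOCK_SIZE);
		b->uptodate = true;
	}
}
```

After a failed toy_write() the buffer is neither dirty nor uptodate,
so the next toy_read() replaces the new contents with the stale
on-disk block and nothing is left to notice the loss.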

The handling of IO errors is very weird.  Especially for writes.
And poorly tested.  It needs a big revamp and testing.

> It seems to me that a better solution would be to mark the buffer as
> dirty and uptodate and then attempt to propagate the error as far back
> as possible.  Ideally something can be done to correct the problem at a
> higher level.  Before I dive in and attempt to do something about this,
> I wanted to make sure I was not missing anything important.  So am I
> full of it, or could this really be a problem?
> 

Well before going and changing stuff, we need to decide what to
change it _to_.  What do we want to happen if there's a read error?
And a write error?

For reads, it makes sense for the page/buffer to be left not uptodate,
and return an error.

For write errors, marking the page/buffer not uptodate doesn't make
a lot of sense.  Marking it clean makes sense if we're not going to retry
the write.  Marking it dirty, uptodate and unmapped would make sense
if we want to go and try a different part of the disk.  But it
doesn't make sense if the whole disk is dead.

Also, think about what a write error _means_.  Unless the disk is truly
ancient, it means that the device has run out of alternate space for
the block, or all writes are failing.  i.e. it is a serious failure.

So perhaps the appropriate strategy on write errors is to mark the
device readonly and to drop all write data on the floor.  That means
clean+mapped+uptodate.

So yes, I think I agree with myself.  Write errors should leave the
page/buffer clean, uptodate, mapped, PageError (whatever the latter
means...)
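
A rough sketch of that policy, again as a userspace model rather than
kernel code.  The names toy_bdev, toy_buf, toy_end_write and
toy_submit_write are hypothetical, and flipping a read_only flag on
the device is an assumption about how "mark the device readonly"
might look, not an existing interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; read_only is set after the first write error. */
struct toy_bdev {
	bool read_only;
};

/* Hypothetical per-buffer state bits discussed above. */
struct toy_buf {
	bool dirty;
	bool uptodate;
	bool mapped;
	bool error;
};

/*
 * Write completion under the proposed policy: the buffer ends up
 * clean either way, uptodate and mapped are left alone, and a
 * failure sets the error flag and turns the device read-only.
 */
static void toy_end_write(struct toy_bdev *dev, struct toy_buf *buf, bool io_ok)
{
	assert(dev != NULL && buf != NULL);
	buf->dirty = false;		/* the write is not retried */
	if (!io_ok) {
		buf->error = true;
		dev->read_only = true;
	}
}

/* Once the device is read-only, further write data is simply dropped. */
static int toy_submit_write(struct toy_bdev *dev, struct toy_buf *buf, bool io_ok)
{
	assert(dev != NULL && buf != NULL);
	if (dev->read_only) {
		buf->dirty = false;
		return -1;
	}
	toy_end_write(dev, buf, io_ok);
	return io_ok ? 0 : -1;
}
```

The point of the model is that a failed write never clears uptodate,
so a later read of the same buffer is served from memory instead of
refetching stale data from the device.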
