linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: Andi Kleen <andi@firstfloor.org>
Cc: hugh@veritas.com, riel@redhat.com, akpm@linux-foundation.org,
	chris.mason@oracle.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, fengguang.wu@intel.com
Subject: Re: [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v3
Date: Mon, 1 Jun 2009 14:05:38 +0200	[thread overview]
Message-ID: <20090601120537.GF5018@wotan.suse.de> (raw)
In-Reply-To: <20090528134520.GH1065@one.firstfloor.org>

On Thu, May 28, 2009 at 03:45:20PM +0200, Andi Kleen wrote:
> On Thu, May 28, 2009 at 02:08:54PM +0200, Nick Piggin wrote:
> > Then the data can not have been consumed, by DMA or otherwise? What
> 
> When the data was consumed we get a different machine check
> (or a different error if it was consumed by a IO device)
> 
> This code right now just handles the case of "CPU detected a page is broken
> is wrong, but hasn't consumed it yet"

OK. Out of curiosity, how often do you expect to see uncorrectable ECC
errors?

 
> > about transient kernel references to the (pagecache/anonymous) page
> > (such as, find_get_page for read(2), or get_user_pages callers).	
> 
> There are always races, after all the CPU could be just about 
> right now to consume. If we lose the race there will be just
> another machine check that stops the consumption of bad data.
> The hardware takes care of that.
> 
> The code here doesn't try to be a 100% coverage of all
> cases (that's obviously impossible), just to handle
> common page types. I also originally had ideas for more handlers,
> but found out how hard it is to test, so I burried a lot of fancy
> ideas :-)
> 
> If there are left over references we complain at least.

Well I don't know about that, because you'd probably have races
between lookups and taking reference. But nevermind.

 
> > > > > +	/*
> > > > > +	 * remove_from_page_cache assumes (mapping && !mapped)
> > > > > +	 */
> > > > > +	if (page_mapping(p) && !page_mapped(p)) {
> > > > > +		remove_from_page_cache(p);
> > > > > +		page_cache_release(p);
> > > > > +	}
> > > > 
> > > > remove_mapping would probably be a better idea. Otherwise you can
> > > > probably introduce pagecache removal vs page fault races whi
> > > > will make the kernel bug.
> > > 
> > > Can you be more specific about the problems?
> > 
> > Hmm, actually now that we hold the page lock over __do_fault (at least
> > for pagecache pages), this may not be able to trigger the race I was
> > thinking of (page becoming mapped). But I think still it is better
> > to use remove_mapping which is the standard way to remove such a page.
> 
> I had this originally, but Fengguang redid it because there was
> trouble with the reference count. remove_mapping always expects it to
> be 2, which we cannot guarantee.

OK, but it should still definitely use truncate code.

> > > > > +	if (mapping) {
> > > > > +		/*
> > > > > +		 * Truncate does the same, but we're not quite the same
> > > > > +		 * as truncate. Needs more checking, but keep it for now.
> > > > > +		 */
> > > > 
> > > > What's different about truncate? It would be good to reuse as much as possible.
> > > 
> > > Truncating removes the block on disk (we don't). Truncating shrinks
> > > the end of the file (we don't). It's more "temporal hole punch"
> > > Probably from the VM point of view it's very similar, but it's
> > > not the same.
> > 
> > Right, I just mean the pagecache side of the truncate. So you should
> > use truncate_inode_pages_range here.
> 
> Why?  I remember I was trying to use that function very early on but
> there was some problem.  For once it does its own locking which
> would conflict with ours.

Just extract the part where it has the page locked into a common
function.


> Also we already do a lot of the stuff it does (like unmapping).

That's OK, it will not try to unmap again if it sees !page_mapped.
 

> Is there anything concretely wrong with the current code?


/*
 * Truncate does the same, but we're not quite the same
 * as truncate. Needs more checking, but keep it for now.
 */

I guess that it duplicates this tricky truncate code and also
says it is different (but AFAIKS it is doing exactly the same
thing).



> > Well, the dirty data has never been corrupted before (ie. the data
> > in pagecache has been OK). It was just unable to make it back to
> > backing store. So a program could retry the write/fsync/etc or
> > try to write the data somewhere else.
> 
> In theory it could, but in practice it is very unlikely it would.

very unlikely to try rewriting the page again or taking evasive
action to save the data somewhere else? I think that's a bold
assumption.

At the very least, having a prompt "IO error, check your hotplug
device / network connection / etc and try again" I don't think
sounds unreasonable at all.


> > It kind of wants a new error code, but I can't imagine the difficulty
> > in doing that...
> 
> I don't think it's a good idea to change programs for this normally,
> most wouldn't anyways. Even kernel programmers have trouble with
> memory suddenly going bad, for user space programmers it would
> be pure voodoo.
 
No, I mean as something to distinguish it between the  case of IO
device going bad (in which case the userspace program has a number
of things it might try to do). 


> > Just seems overengineered. We could rewrite any if/switch statement like
> > that (and actually the compiler probably will if it is beneficial).
> 
> The reason I like it is that it separates the functions cleanly,
> without that there would be a dispatcher from hell. Yes it's a bit
> ugly that there is a lot of manual bit checking around now too,
> but as you go into all the corner cases originally clean code
> always tends to get more ugly (and this is a really ugly problem)

Well... it is just writing another dispatcher from hell in a
different way really, isn't it? How is it so much better than
a simple switch or if/elseif/else statement?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2009-06-01 12:05 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-27 20:12 [PATCH] [0/16] HWPOISON: Intro Andi Kleen
2009-05-27 20:12 ` [PATCH] [1/16] HWPOISON: Add page flag for poisoned pages Andi Kleen
2009-05-27 20:35   ` Larry H.
2009-05-27 21:15   ` Alan Cox
2009-05-28  7:54     ` Andi Kleen
2009-05-29 16:10       ` Rik van Riel
2009-05-29 16:37         ` Andi Kleen
2009-05-29 16:34           ` Rik van Riel
2009-05-29 18:24             ` Andi Kleen
2009-05-29 18:26               ` Rik van Riel
2009-05-29 18:42                 ` Andi Kleen
2009-05-27 20:12 ` [PATCH] [2/16] HWPOISON: Export poison flag in /proc/kpageflags Andi Kleen
2009-05-29 16:37   ` Rik van Riel
2009-05-27 20:12 ` [PATCH] [3/16] HWPOISON: Export some rmap vma locking to outside world Andi Kleen
2009-05-27 20:12 ` [PATCH] [4/16] HWPOISON: Add support for poison swap entries v2 Andi Kleen
2009-05-28  8:46   ` Hidehiro Kawai
2009-05-28  9:11     ` Wu Fengguang
2009-05-28 10:42     ` Andi Kleen
2009-05-27 20:12 ` [PATCH] [5/16] HWPOISON: Add new SIGBUS error codes for hardware poison signals Andi Kleen
2009-05-27 20:12 ` [PATCH] [6/16] HWPOISON: Add basic support for poisoned pages in fault handler v2 Andi Kleen
2009-05-29  4:15   ` Hidehiro Kawai
2009-05-29  6:28     ` Andi Kleen
2009-05-27 20:12 ` [PATCH] [7/16] HWPOISON: Add various poison checks in mm/memory.c Andi Kleen
2009-05-27 20:12 ` [PATCH] [8/16] HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler Andi Kleen
2009-05-27 20:12 ` [PATCH] [9/16] HWPOISON: Use bitmask/action code for try_to_unmap behaviour Andi Kleen
2009-05-28  7:27   ` Nick Piggin
2009-05-28  8:03     ` Andi Kleen
2009-05-28  8:28       ` Nick Piggin
2009-05-28  9:02         ` Andi Kleen
2009-05-28 12:26           ` Nick Piggin
2009-05-27 20:12 ` [PATCH] [10/16] HWPOISON: Handle hardware poisoned pages in try_to_unmap Andi Kleen
2009-05-27 20:12 ` [PATCH] [11/16] HWPOISON: Handle poisoned pages in set_page_dirty() Andi Kleen
2009-05-27 20:12 ` [PATCH] [12/16] HWPOISON: check and isolate corrupted free pages Andi Kleen
2009-05-27 20:12 ` [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v3 Andi Kleen
2009-05-28  8:26   ` Nick Piggin
2009-05-28  9:31     ` Andi Kleen
2009-05-28 12:08       ` Nick Piggin
2009-05-28 13:45         ` Andi Kleen
2009-05-28 14:50           ` Wu Fengguang
2009-06-04  6:25             ` Nai Xia
2009-06-07 16:02               ` Wu Fengguang
2009-06-08 11:06                 ` Nai Xia
2009-06-08 12:31                   ` Wu Fengguang
2009-06-08 14:46                     ` Nai Xia
2009-06-09  6:48                       ` Wu Fengguang
2009-06-09 10:48                         ` Nick Piggin
2009-06-09 12:15                           ` Wu Fengguang
2009-06-09 12:17                             ` Nick Piggin
2009-06-09 12:47                               ` Wu Fengguang
2009-06-09 13:36                                 ` Nai Xia
2009-05-28 16:56           ` Russ Anderson
2009-05-30  6:42             ` Andi Kleen
2009-06-01 11:39               ` Nick Piggin
2009-06-01 18:19                 ` Andi Kleen
2009-06-01 12:05           ` Nick Piggin [this message]
2009-06-01 18:51             ` Andi Kleen
2009-06-02 12:10               ` Nick Piggin
2009-06-02 12:34                 ` Andi Kleen
2009-06-02 12:37                   ` Nick Piggin
2009-06-02 12:55                     ` Andi Kleen
2009-06-02 13:03                       ` Nick Piggin
2009-06-02 13:20                         ` Andi Kleen
2009-06-02 13:19                           ` Nick Piggin
2009-06-02 13:46                             ` Andi Kleen
2009-06-02 13:47                               ` Nick Piggin
2009-06-02 14:05                                 ` Andi Kleen
2009-06-02 13:30                     ` Wu Fengguang
2009-06-02 14:07                       ` Nick Piggin
2009-05-28  9:59     ` Wu Fengguang
2009-05-28 10:11       ` Andi Kleen
2009-05-28 10:33         ` Wu Fengguang
2009-05-28 10:51           ` Andi Kleen
2009-05-28 11:03             ` Wu Fengguang
2009-05-28 12:15             ` Nick Piggin
2009-05-28 13:48               ` Andi Kleen
2009-05-28 12:23       ` Nick Piggin
2009-05-28 13:54         ` Wu Fengguang
2009-06-01 11:50           ` Nick Piggin
2009-06-01 14:05             ` Wu Fengguang
2009-06-01 14:40               ` Nick Piggin
2009-06-02 11:14                 ` Wu Fengguang
2009-06-02 12:19                   ` Nick Piggin
2009-06-02 12:51                     ` Wu Fengguang
2009-06-02 14:33                       ` Nick Piggin
2009-06-03 10:21                       ` Jens Axboe
2009-06-01 21:11               ` Hugh Dickins
2009-06-01 21:41                 ` Andi Kleen
2009-06-01 18:32             ` Andi Kleen
2009-06-02 12:00               ` Nick Piggin
2009-06-02 12:47                 ` Andi Kleen
2009-06-02 12:57                   ` Nick Piggin
2009-06-02 13:25                     ` Andi Kleen
2009-06-02 13:24                       ` Nick Piggin
2009-06-02 13:41                         ` Andi Kleen
2009-06-02 13:40                           ` Nick Piggin
2009-06-02 13:53                           ` Wu Fengguang
2009-06-02 14:06                             ` Andi Kleen
2009-06-02 14:12                               ` Wu Fengguang
2009-06-02 14:21                                 ` Nick Piggin
2009-06-02 13:46                     ` Wu Fengguang
2009-06-02 14:08                       ` Andi Kleen
2009-06-02 14:10                         ` Wu Fengguang
2009-06-02 14:14                           ` Nick Piggin
2009-06-02 15:17                       ` Nick Piggin
2009-06-02 17:27                         ` Andi Kleen
2009-06-03  9:35                           ` Nick Piggin
2009-06-03 11:24                             ` Andi Kleen
2009-06-02 13:02                   ` Wu Fengguang
2009-06-02 15:09                   ` Nick Piggin
2009-06-02 17:19                     ` Andi Kleen
2009-06-03  6:24                       ` Nick Piggin
2009-06-03 15:51               ` Wu Fengguang
2009-06-03 16:05                 ` Andi Kleen
2009-05-27 20:12 ` [PATCH] [14/16] HWPOISON: FOR TESTING: Enable memory failure code unconditionally Andi Kleen
2009-05-27 20:12 ` [PATCH] [15/16] HWPOISON: Add madvise() based injector for hardware poisoned pages v3 Andi Kleen
2009-05-27 20:12 ` [PATCH] [16/16] HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090601120537.GF5018@wotan.suse.de \
    --to=npiggin@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=chris.mason@oracle.com \
    --cc=fengguang.wu@intel.com \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).