From: Andi Kleen <andi@firstfloor.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
	hugh@veritas.com, riel@redhat.com, akpm@linux-foundation.org,
	chris.mason@oracle.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, fengguang.wu@intel.com
Subject: Re: [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v3
Date: Tue, 2 Jun 2009 14:55:38 +0200
Message-ID: <20090602125538.GH1065@one.firstfloor.org>
In-Reply-To: <20090602123720.GF1392@wotan.suse.de>

On Tue, Jun 02, 2009 at 02:37:20PM +0200, Nick Piggin wrote:
> Because I don't see any difference (see my previous patch). I
> still don't know what it is supposed to be doing differently.
> So if you reinvent your own that looks close enough to truncate
> to warrant a comment to say /* this is close to truncate but
> not quite */, then yes I insist that you say exactly why it is
> not quite like truncate ;)

I will just delete that comment because it apparently causes so 
much confusion.

> 
>  
> > > I'm suggesting that EIO is traditionally for when the data is
> > > still dirty in pagecache and was not able to get back to backing
> > > store. Do you deny that?
> > 
> > Yes. That is exactly the case when memory-failure triggers EIO
> > 
> > Memory error on a dirty file mapped page.
> 
> But it is no longer dirty, and the problem was not that the data
> was unable to be written back.

Sorry, I don't understand. What do you mean by "no longer dirty"?

Of course it's still dirty; it just has to be discarded because it's
corrupted.

> > > And I think the application might try to handle the case of a
> > > page becoming corrupted differently. Do you deny that?
> > 
> > You mean a clean file-mapped page? In this case there is no EIO,
> > memory-failure just drops the page and it is reloaded.
> > 
> > If the page is dirty we trigger EIO which as you said above is the 
> > right reaction.
> 
> No, I mean the difference between the case of a dirty page unable to
> be written to backing store, and the case of a dirty page becoming
> corrupted.

Nick, I'm really having a hard time following you here.

What exactly do you want?

A new errno? Or something else? If so, what precisely?

I currently don't see any sane way to report this to the application
through write().  That is because adding a new errno for something
is incredibly hard and often impossible, and that's certainly
the case here.

The application can detect it if it maps the 
shared page and waits for a SIGBUS, but not through write().

But I doubt there will really be any apps that do anything differently
here anyway. A clever app could retry a few times if it still
has a copy of the data, but that might even make sense on normal
IO errors (e.g. on a SAN).
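
As a rough sketch of what such a retry could look like in an application
that still holds its own copy of the data: the helper name
write_sync_retry() and the retry policy are assumptions for illustration,
not something this patch set provides or requires.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: rewrite buf at offset off and fsync the file,
 * retrying the whole sequence up to max_tries times when the failure
 * is EIO. Returns 0 on success, -1 with errno set on failure. */
int write_sync_retry(int fd, const void *buf, size_t len, off_t off,
                     int max_tries)
{
    assert(buf != NULL && max_tries > 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        size_t done = 0;
        int failed = 0;

        while (done < len) {
            ssize_t n = pwrite(fd, (const char *)buf + done, len - done,
                               off + (off_t)done);
            if (n < 0 && errno == EINTR)
                continue;          /* interrupted, try the same chunk again */
            if (n <= 0) {
                if (n == 0)
                    errno = EIO;   /* no progress: treat as an IO error */
                failed = 1;
                break;
            }
            done += (size_t)n;
        }

        if (!failed && fsync(fd) == 0)
            return 0;              /* data written and flushed */
        if (errno != EIO)
            return -1;             /* only EIO is retried */
    }

    errno = EIO;
    return -1;
}
```

Note that the caller cannot tell from the errno whether the EIO came from
a failed writeback or from a corrupted page being discarded; the retry is
the same in both cases.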

> 
> 
> > > OK, given the range of errors that APIs are defined to return,
> > > then maybe EIO is the best option. I don't suppose it is possible
> > > to expand them to return something else?
> > 
> > Expand the syscalls to return other errnos on specific
> > kinds of IO error?
> >  
> > Of course that's possible, but it has the problem that you
> > would need to fix all the applications that expect EIO for
> > IO error. The latter I consider infeasible.
> 
> They would presumably exit or do some default thing, which I
> think would be fine.

No, it's not fine if they already handle EIO. Consider, e.g.,
a sophisticated database, which likely has sophisticated
IO error handling too (e.g. it only aborts the current commit).

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

