Re: [PATCH 1/5] HWPOISON: define VM_FAULT_HWPOISON to 0 when feature is disabled

From: Wu Fengguang <fengguang.wu@intel.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Nick Piggin <npiggin@suse.de>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Andi Kleen <andi@firstfloor.org>,
	"riel@redhat.com" <riel@redhat.com>,
	"chris.mason@oracle.com" <chris.mason@oracle.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 1/5] HWPOISON: define VM_FAULT_HWPOISON to 0 when feature is disabled
Date: Sat, 13 Jun 2009 00:14:31 +0800
Message-ID: <20090612161431.GB5680@localhost>
In-Reply-To: <20090612153620.GB23483@elte.hu>

On Fri, Jun 12, 2009 at 11:36:20PM +0800, Ingo Molnar wrote:
> 
> * Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > On Fri, Jun 12, 2009 at 09:17:54PM +0800, Ingo Molnar wrote:
> > > 
> > > * Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > 
> > > > Hi Ingo,
> > > > 
> > > > On Fri, Jun 12, 2009 at 07:22:58PM +0800, Ingo Molnar wrote:
> > > > > 
> > > > > * Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > > > 
> > > > > > So as to eliminate one #ifdef in the c source.
> > > > > > 
> > > > > > Proposed by Nick Piggin.
> > > > > > 
> > > > > > CC: Nick Piggin <npiggin@suse.de>
> > > > > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > > > > ---
> > > > > >  arch/x86/mm/fault.c |    3 +--
> > > > > >  include/linux/mm.h  |    7 ++++++-
> > > > > >  2 files changed, 7 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > --- sound-2.6.orig/arch/x86/mm/fault.c
> > > > > > +++ sound-2.6/arch/x86/mm/fault.c
> > > > > > @@ -819,14 +819,13 @@ do_sigbus(struct pt_regs *regs, unsigned
> > > > > >  	tsk->thread.error_code	= error_code;
> > > > > >  	tsk->thread.trap_no	= 14;
> > > > > >  
> > > > > > -#ifdef CONFIG_MEMORY_FAILURE
> > > > > >  	if (fault & VM_FAULT_HWPOISON) {
> > > > > >  		printk(KERN_ERR
> > > > > >  	"MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
> > > > > >  			tsk->comm, tsk->pid, address);
> > > > > >  		code = BUS_MCEERR_AR;
> > > > > >  	}
> > > > > > -#endif
> > > > > 
> > > > > Btw., anything like this should happen in close cooperation with 
> > > > > the x86 tree, not as some pure MM feature. I dont see Cc:s and 
> > > > > nothing that indicates that realization. What's going on here?
> > > > 
> > > > Ah sorry for the ignorance!  Andi has a nice overview of the big 
> > > > picture here: http://lkml.org/lkml/2009/6/3/371
> > > > 
> > > > In the above chunk, the process is trying to access the already 
> > > > corrupted page and thus shall be killed, otherwise it will either 
> > > > silently consume corrupted data, or will trigger another (deadly) 
> > > > MCE event and bring down the whole machine.
> > > 
> > > This seems like trying to handle a failure mode that cannot be and 
> > > shouldnt be 'handled' really. If there's an 'already corrupted' page 
> > > then the box should go down hard and fast, and we should not risk 
> > > _even more user data corruption_ by trying to 'continue' in the hope 
> > > of having hit some 'harmless' user process that can be killed ...
> > > 
> > > So i find the whole feature rather dubious - what's the point? We 
> > > should panic at this point - we just corrupted user data so that 
> > > piece of hardware cannot be trusted. Nor can any subsequent kernel 
> > > bug messages be trusted.
> > > 
> > > Do we really want this in the core Linux VM and in the architecture 
> > > pagefault handling code and elsewhere? Am i the only one who finds 
> > > this concept of 'handling' user data corruption rather dubious?
> > 
> > - The corrupted data only impacts one or more process(es)
> > - The corrupted data has not been consumed yet
> > 
> > The data corruption has not caused real hurt yet, and can be 
> > isolated to prevent future accesses.  So it makes sense to just 
> > kill the impacted process(es).
> 
> Dunno, this just looks like a license to allow more crappy hardware, 
> hm? I'm all for _logging_ errors, but hwpoison is not about that: it 
> is about allowing the hardware to limp along in 'enterprise' setups, 
> with a (false looking) 'guarantee' that everything is fine.
> 
> There's no guarantee that the fault doesnt hit something critical - 
> and by allowing 'harmless' faults we push up the noise level.
> 
> Any move from us to make faulty hardware more acceptable by 
> "handling" it in a percentage of cases (and crashing/corrupting in 
> other cases) is futile IMHO - it just sends the wrong general 
> message.
> 
> I.e. i think this thinking misses the general harm on for example 
> the quality of kernel bugreports: if such a system corrupts memory, 
> and crashes in a weird way - we'll get a weird kernel-crash report. 
> If it 'only' corrupts some user process in a 'harmless' way, we wont 
> get a crash report. Say the kernel crashes in 10% of the cases, 
> user-space crashes in 90% of the cases.
> 
> If we allow that 90% to continue, we make the 10% "bad" crash 
> proportion more prominent in our stats too. I.e. by allowing 
> 'harmless' bugs to be more acceptable in practice, we indirectly 
> increase the proportion of _bad_ crashes as well.
> 
> Do you accept that general point or am i wrong?
> 
> Computing along the von Neumann principles really depends on having 
> a sufficiently well working piece of hardware that one can trust 
> with a reasonable certainty. Probabilistic computing is fine too in 
> certain isolated fields where you say want some probabilistic result 
> to begin with (say the result of some property of the physical 
> world) - but in general purpose hardware i doubt it's the right kind 
> of approach ...

NAND flash is crappy - it is continuously rotting - it's wrong to
encourage its usage by inventing wear leveling and checksum algorithms
and building SSDs on top of them.

Wireless networking is crappy - it is so much more unreliable than fibre networks.

PC servers are crappy - Google invented the Google File System? Damn it!


HWPOISON is a reliability-enabling feature - if it enables the prevalence
of crappy hardware, let's celebrate changing the world~~

Thanks,
Fengguang
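
For readers following the patch quoted above: the idea of defining the
flag to 0 when the feature is disabled can be sketched in plain C. This
is only an illustration, not the kernel code. All DEMO_* names and
demo_sigbus_code() are hypothetical stand-ins for CONFIG_MEMORY_FAILURE,
VM_FAULT_HWPOISON, the BUS_* codes and do_sigbus().

```c
/*
 * Hypothetical stand-in for CONFIG_MEMORY_FAILURE: build with
 * -DDEMO_MEMORY_FAILURE to "enable" the feature.
 */
#ifdef DEMO_MEMORY_FAILURE
#define DEMO_FAULT_HWPOISON 0x0010u
#else
#define DEMO_FAULT_HWPOISON 0u	/* flag test below is always false */
#endif

#define DEMO_FAULT_SIGBUS 0x0002u

/* Illustrative signal codes, modelled on BUS_ADRERR / BUS_MCEERR_AR. */
enum demo_code { DEMO_BUS_ADRERR = 2, DEMO_BUS_MCEERR_AR = 4 };

/* Pick the signal code for a fault; no #ifdef is needed at the call site. */
static int demo_sigbus_code(unsigned int fault)
{
	int code = DEMO_BUS_ADRERR;

	/* With the flag defined to 0 this condition is a constant false,
	 * so a compiler is free to drop the whole branch. */
	if (fault & DEMO_FAULT_HWPOISON)
		code = DEMO_BUS_MCEERR_AR;

	return code;
}
```

With the feature disabled (the default here), any fault value yields
DEMO_BUS_ADRERR. Building with -DDEMO_MEMORY_FAILURE makes the same
unmodified function return DEMO_BUS_MCEERR_AR when bit 0x0010 is set.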
