From: Wu Fengguang <fengguang.wu@intel.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Mel Gorman <mel@csn.ul.ie>, Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
"riel@redhat.com" <riel@redhat.com>,
"chris.mason@oracle.com" <chris.mason@oracle.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 00/22] HWPOISON: Intro (v5)
Date: Mon, 15 Jun 2009 22:22:25 +0800
Message-ID: <20090615142225.GA11167@localhost>
In-Reply-To: <20090615122528.GA13256@wotan.suse.de>
On Mon, Jun 15, 2009 at 08:25:28PM +0800, Nick Piggin wrote:
> On Mon, Jun 15, 2009 at 08:10:01PM +0800, Wu Fengguang wrote:
> > On Mon, Jun 15, 2009 at 03:19:07PM +0800, Nick Piggin wrote:
> > > > For KVM you need early kill, for the others it remains to be seen.
> > >
> > > Right. It's almost like you need to do a per-process thing, and
> > > those that can handle things (such as the new SIGBUS or the new
> > > EIO) could get those, and others could be killed.
> >
> > To send early SIGBUS kills to processes who has called
> > sigaction(SIGBUS, ...)? KVM will sure do that. For other apps we
> > don't mind they can understand that signal at all.
>
> For apps that hook into SIGBUS for some other means and
Yes, I was referring to the apps that call sigaction(SIGBUS); the
others will be late killed anyway.

> do not understand the new type of SIGBUS signal? What about
> those?

We introduced two new SIGBUS codes:

        BUS_MCEERR_AO=5         for early kill
        BUS_MCEERR_AR=4         for late kill

I'd assume a legacy application will handle them in the same way (both
are unexpected codes to the application).

We don't care whether the application is killed by BUS_MCEERR_AO
or BUS_MCEERR_AR; that depends on its SIGBUS handler implementation.

But (in the rare case) if the handler

- refuses to die on BUS_MCEERR_AR, it may create a busy loop and a
  flood of SIGBUS signals, which is a bug in the application.
  BUS_MCEERR_AO is sent only once and won't lead to busy loops.

- does something that hurts itself (i.e. data safety) on BUS_MCEERR_AO,
  it may well hurt itself the same way on BUS_MCEERR_AR. The latter is
  unavoidable, so the application must be fixed anyway.
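
As an illustration only, here is a minimal userspace sketch of a SIGBUS
handler that tells the two codes apart via si_code. The names
classify_sigbus(), sigbus_handler(), install_sigbus_handler() and the
mce_action enum are hypothetical and not part of the patch set; the
fallback #defines use the values quoted above and only matter when the
libc headers do not provide them yet. What the application actually does
on BUS_MCEERR_AO is an assumption here (it just sets a flag).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Fallbacks for older libc headers; values as quoted in this mail. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical policy names, for illustration only. */
enum mce_action {
	MCE_NOT_MCE,	/* some other SIGBUS (alignment, mmap beyond EOF, ...) */
	MCE_DROP_PAGE,	/* AO: page is poisoned but not being consumed */
	MCE_MUST_DIE	/* AR: the faulting access cannot be completed */
};

static enum mce_action classify_sigbus(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AO:
		return MCE_DROP_PAGE;
	case BUS_MCEERR_AR:
		return MCE_MUST_DIE;
	default:
		return MCE_NOT_MCE;
	}
}

/* Set by the handler; the main loop would check it and use the
 * reported address to stop using the page (not shown). */
static volatile sig_atomic_t poison_reported;

static void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
	(void)sig;
	(void)ctx;

	if (classify_sigbus(si->si_code) == MCE_DROP_PAGE) {
		/* One-time notification: returning here is safe. */
		poison_reported = 1;
		return;
	}

	/*
	 * AR or an unknown code: simply returning would re-run the
	 * faulting access and loop on SIGBUS, so restore the default
	 * action and re-raise to die.
	 */
	signal(SIGBUS, SIG_DFL);
	raise(SIGBUS);
}

static int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A legacy handler that ignores si_code would fall into neither of the
first two branches, which is why the sketch treats unknown codes the
same way as BUS_MCEERR_AR.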
>
> > > Early-kill for KVM does seem like reasonable justification on the
> > > surface, but when I think more about it, I wonder does the guest
> > > actually stand any better chance to correct the error if it is
> > > reported at time T rather than T+delta? (who knows what the page
> > > will be used for at any given time).
> >
> > Early kill makes a lot difference for KVM. Think about the vast
> > amount of clean page cache pages. With early kill the page can be
> > trivially isolated. With late kill the whole virtual machine dies
> > hard.
>
> Why? In both cases it will enter the exception handler and
> attempt to do something about it... in both cases I would
> have thought there is some chance that the page error is not
> recoverable and some chance it is recoverable. Or am I
> missing something?
The early kill / late kill to KVM from the POV of the host kernel matches
the MCE AO/AR events inside the KVM guest kernel. The key difference
between AO and AR is whether the page is _being_ consumed.

It's a lot harder (if possible at all) to stop an active consumer.
For example, the clean cache pages can be consumed in many ways:

- be accessed by read()/write() or mapped read/write
- be reclaimed and then allocated for whatever new usage, for example,
  be zeroed by __GFP_ZERO, or be inserted into another file and start
  read/write IO and be accessed by the disk driver via DMA, or even be
  allocated for kernel slabs..

Frankly speaking, I don't know how to stop all the above consumers.
We now simply die on AR events.
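
To make the "not yet consumed" case concrete, here is a rough userspace
analogy, not kernel code and not what the guest kernel does: if a process
learns early (AO) that a page holding clean, re-readable data is bad, it
can map a fresh page over the same address before anything touches the
old one. The helpers page_base() and replace_clean_page() are
hypothetical names; the sketch assumes the data can be refilled from its
backing store afterwards and that it is called from normal context, not
from inside the signal handler.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address (e.g. si_addr from the signal) down to its page. */
static void *page_base(void *addr)
{
	uintptr_t psz = (uintptr_t)sysconf(_SC_PAGESIZE);

	return (void *)((uintptr_t)addr & ~(psz - 1));
}

/*
 * Replace the page containing addr with a fresh zero-filled anonymous
 * page at the same virtual address. The old contents are discarded,
 * so this only makes sense for clean data that can be read again.
 * Returns 0 on success, -1 on failure.
 */
static int replace_clean_page(void *addr)
{
	size_t psz = (size_t)sysconf(_SC_PAGESIZE);
	void *base = page_base(addr);
	void *p;

	p = mmap(base, psz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	return p == base ? 0 : -1;
}
```

Nothing comparable is possible in the AR case: the access that hit the
poison is already in flight, so there is no point at which the page could
be swapped out from under its consumer.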
> Anyway, I would like to see a basic analysis of those probabilities
> to justify early kill. Not saying there is no justification, but
> it would be helpful to see why.
That's fine. I'd be glad if the above explanation paves the way to
solutions for AR events :)
Thanks,
Fengguang