From: Ingo Molnar <mingo@elte.hu>
To: Tony Luck <tony.luck@intel.com>
Cc: "Joe Perches" <joe@perches.com>,
"Mauro Carvalho Chehab" <mchehab@redhat.com>,
"Hidetoshi Seto" <seto.hidetoshi@jp.fujitsu.com>,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
"bluesmoke-devel@lists.sourceforge.net"
<bluesmoke-devel@lists.sourceforge.net>,
"Linux Edac Mailing List" <linux-edac@vger.kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Ingo Molnar" <mingo@redhat.com>,
"Ben Woodard" <woodard@redhat.com>,
"Matt Domsch" <Matt_Domsch@dell.com>,
"Doug Thompson" <dougthompson@xmission.com>,
"Borislav Petkov" <bp@amd64.org>,
"Young, Brent" <brent.young@intel.com>,
"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Arnaldo Carvalho de Melo" <acme@redhat.com>
Subject: Re: Hardware Error Kernel Mini-Summit
Date: Wed, 19 May 2010 00:00:02 +0200
Message-ID: <20100518220002.GA23739@elte.hu>
In-Reply-To: <AANLkTimPRvuoW5-OcPlPx5cvnCTJa7xhAQDSLrYziB4j@mail.gmail.com>
* Tony Luck <tony.luck@intel.com> wrote:
> > This gives us a broad platform to add various RAS
> > events as well, beyond raw hardware events: we could
> > for example add events for various system anomalies such
> > as lockup messages, kernel warnings/oopses, IOMMU
> > exceptions - maybe even pure software concepts such as
> > fatal segmentation fault events, etc. etc.
>
> This looks like sticky ground. I can see the event
> mechanism passing data to a user daemon working well for
> all kinds of corrected and minor errors. But when you
> start talking about lockups and fatal errors things get
> a lot trickier. Often the main concern at this point is
> error containment. Making sure that the flaky data
> doesn't become visible (saved to storage, transmitted to
> the network, etc.). [...]
I was pointing beyond the narrow hardware (memory) error
point of view, towards a more generic 'system health'
thinking.
In the broader view it may make sense, for example, to
define policy over an excessive number of segfaults on a
server system (where excessive segfaults are an anomaly),
or a suspiciously large number of soft IO errors, etc.
But yes, of course, when it comes to hard memory errors,
those take precedence, and handling them (and
saving/propagating information about them while we still
can) is a priority.
> [...] Getting from a machine check handler through some
> context switches (and page faults etc.) to a user level
> daemon before the error gets recorded looks to be really
> hard.
As Boris mentioned too, critical policy action can and
will be done straight in the kernel.
Ingo