From: Ingo Molnar <mingo@elte.hu>
To: Huang Ying <ying.huang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>, Len Brown <lenb@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <andi@firstfloor.org>,
"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
Borislav Petkov <petkovbb@googlemail.com>,
"H. Peter Anvin" <hpa@zytor.com>, Don Zickus <dzickus@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Mauro Carvalho Chehab <mchehab@redhat.com>,
"Luck, Tony" <tony.luck@intel.com>
Subject: Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support
Date: Tue, 26 Oct 2010 12:15:36 +0200 [thread overview]
Message-ID: <20101026101536.GC16552@elte.hu> (raw)
In-Reply-To: <1288083163.2862.592.camel@yhuang-dev>
* Huang Ying <ying.huang@intel.com> wrote:
> Hi, Thomas,
>
> On Tue, 2010-10-26 at 12:53 +0800, Thomas Gleixner wrote:
> > B1;2401;0cLen,
> >
> > On Mon, 25 Oct 2010, Len Brown wrote:
> >
> > > > NAKed-by: Ingo Molnar <mingo@elte.hu>
> > >
> > > Everybody knows that Linux has a lot to learn about RAS.
> > >
> > > I think to catch up, we need to play to Linux's strengths
> > > of continuous improvement. If we halt patches in this area
> > > then we could wait forever for the "perfect design".
> >
> > it's not about perfect design. It's about creating new user space
> > ABIs. The patches introduce another error reporting user space ABI
> > with an ad hoc "fits the needs" design.
> >
> > This is my major point of objection.
> >
> > I agree that Linux needs improvement on the RAS side, but does this
> > lack of features justify a new user space ABI which is totally
> > disconnected to existing RAS facilities ?
> >
> > No, it does not. It's not our problem that Intel wasted time on
> > creating another character device driver to report errors to user
> > space. The time spent to do so would have been sufficient to do a
> > proper integration into the existing infrastructure.
> >
> > I would not care at all if these patches would just introduce some
> > weird in kernel interfaces as we can clean that up at will. But
> > introducing a new user space ABI is setting the disconnect of RAS
> > related facilities into stone.
> >
> > From Kconfig:
> >
> > EDAC is designed to report errors in the core system.
> > These are low-level errors that are reported in the CPU or
> > supporting chipset or other subsystems:
> > memory errors, cache errors, PCI errors, thermal throttling, etc..
> > If unsure, select 'Y'.
> >
> > So please explain why your error reporting is so different from the
> > above that it justifies a separate facility. And you better come up
> > with a real good explanation other than we looked at EDAC and it did
> > not fit our needs.
>
> As far as I know, EDAC guys plan to use some other "perfect interface" in the
> future. So I think the current state is really waiting for the "perfect design".
Not sure what you mean by this, but Boris has posted links to his latest patch-set
in this thread, see:
http://kerneltrap.org/mailarchive/linux-kernel/2010/8/6/4603847
The Git coordinates are:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git, branch tip/perf/parse-events
The 'persistent events' facility he has prototyped there appears to be a good
potential match for the ERST store.
It would be very useful to have another feature there: to mark persistent events as
'dump into syslog on bootup', so that for example the contents of the ERST log could
be dumped right on bootup. [but ERST would not be the only persistent event that
could be marked like that.]
Note that we dont need/want other ABI accesses to the ERST log (i.e. we dont want
/dev/erst-dbg), because we want the benefits of the generalization: tooling (RAS and
other tooling) should learn how to deal with persistent events - not learn how to
deal with ERST logs ... or with warm bootup RAM-embedded logs ... or to deal with
kcrash embedded kernel logs ... etc.
There are many obvious advantages from implementing it like that: there's no need to
special-code ERST to printk or ERST to whatever other facility cross links - it
would be part of a generic/uniform event logging facility to begin with. ERST would
only implement its own, narrow, hardware-specific event accessor methods - nothing
else. Basically a small 'event driver'. This would be the most optimal, smallest,
easiest to maintain approach - with no facility duplication and no fragmentation.
It's certainly more work as well _for the first such example_ - but from that point
on any new hardware facility can be added with ease, and those too will fit into
existing tooling in a very natural way.
So please help out with the persistent events work. If you need any pointers we'd be
glad to help.
Thanks,
Ingo
prev parent reply other threads:[~2010-10-26 10:15 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-25 7:43 [PATCH -v2 0/9] ACPI, APEI patches for 2.6.37 Huang Ying
2010-10-25 7:43 ` [PATCH -v2 1/9] ACPI, APEI, Add ERST record ID cache Huang Ying
2010-10-25 7:43 ` [PATCH -v2 2/9] Add lock-less version of bitmap_set/clear Huang Ying
2010-10-25 7:43 ` [PATCH -v2 3/9] lock-less NULL terminated single list implementation Huang Ying
2010-10-25 7:43 ` [PATCH -v2 4/9] lock-less general memory allocator Huang Ying
2010-10-25 7:43 ` [PATCH -v2 5/9] Hardware error device core Huang Ying
2010-10-25 7:43 ` [PATCH -v2 6/9] Hardware error record persistent support Huang Ying
2010-10-25 7:43 ` [PATCH -v2 7/9] ACPI, APEI, Use ERST for hardware error persisting before panic Huang Ying
2010-10-25 7:43 ` [PATCH -v2 8/9] ACPI, APEI, Report GHES error record with hardware error device core Huang Ying
2010-10-25 7:43 ` [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support Huang Ying
2010-10-25 8:45 ` [NAK] " Ingo Molnar
2010-10-25 8:58 ` Huang Ying
2010-10-25 9:19 ` Andi Kleen
2010-10-25 11:15 ` Ingo Molnar
2010-10-25 12:04 ` Mauro Carvalho Chehab
2010-10-25 17:07 ` Tony Luck
2010-10-25 17:19 ` Mauro Carvalho Chehab
2010-10-25 12:37 ` Andi Kleen
2010-10-25 12:55 ` Ingo Molnar
2010-10-25 13:02 ` Ingo Molnar
2010-10-25 13:11 ` Andi Kleen
2010-10-25 13:47 ` Ingo Molnar
2010-10-25 15:14 ` Andi Kleen
2010-10-25 17:10 ` Ingo Molnar
2010-10-27 8:25 ` Ingo Molnar
2010-10-25 16:38 ` Thomas Gleixner
2010-10-25 9:25 ` Ingo Molnar
2010-10-25 17:14 ` Tony Luck
2010-10-25 20:23 ` Borislav Petkov
2010-10-25 21:23 ` Tony Luck
2010-10-25 21:51 ` Borislav Petkov
2010-10-25 23:35 ` Tony Luck
2010-10-26 6:26 ` Borislav Petkov
2010-10-26 1:06 ` Len Brown
2010-10-26 4:53 ` Thomas Gleixner
2010-10-26 7:22 ` Ingo Molnar
2010-10-26 7:30 ` Huang Ying
2010-10-26 7:55 ` Ingo Molnar
2010-10-26 8:32 ` Huang Ying
2010-10-26 10:03 ` Ingo Molnar
2010-10-26 8:38 ` Andi Kleen
2010-10-26 10:00 ` Thomas Gleixner
2010-10-26 8:52 ` Huang Ying
2010-10-26 10:15 ` Ingo Molnar [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20101026101536.GC16552@elte.hu \
--to=mingo@elte.hu \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=dzickus@redhat.com \
--cc=hpa@zytor.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab@redhat.com \
--cc=petkovbb@googlemail.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox