From: Ingo Molnar <mingo@elte.hu>
To: Huang Ying <ying.huang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>, Len Brown <lenb@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <andi@firstfloor.org>,
"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
Borislav Petkov <petkovbb@googlemail.com>,
"H. Peter Anvin" <hpa@zytor.com>, Don Zickus <dzickus@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Mauro Carvalho Chehab <mchehab@redhat.com>,
"Luck, Tony" <tony.luck@intel.com>
Subject: Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support
Date: Tue, 26 Oct 2010 12:15:36 +0200
Message-ID: <20101026101536.GC16552@elte.hu>
In-Reply-To: <1288083163.2862.592.camel@yhuang-dev>
* Huang Ying <ying.huang@intel.com> wrote:
> Hi, Thomas,
>
> On Tue, 2010-10-26 at 12:53 +0800, Thomas Gleixner wrote:
> > Len,
> >
> > On Mon, 25 Oct 2010, Len Brown wrote:
> >
> > > > NAKed-by: Ingo Molnar <mingo@elte.hu>
> > >
> > > Everybody knows that Linux has a lot to learn about RAS.
> > >
> > > I think to catch up, we need to play to Linux's strengths
> > > of continuous improvement. If we halt patches in this area
> > > then we could wait forever for the "perfect design".
> >
> > it's not about perfect design. It's about creating new user space
> > ABIs. The patches introduce another error reporting user space ABI
> > with an ad hoc "fits the needs" design.
> >
> > This is my major point of objection.
> >
> > I agree that Linux needs improvement on the RAS side, but does this
> > lack of features justify a new user space ABI which is totally
> > disconnected to existing RAS facilities ?
> >
> > No, it does not. It's not our problem that Intel wasted time on
> > creating another character device driver to report errors to user
> > space. The time spent to do so would have been sufficient to do a
> > proper integration into the existing infrastructure.
> >
> > I would not care at all if these patches would just introduce some
> > weird in kernel interfaces as we can clean that up at will. But
> > introducing a new user space ABI is setting the disconnect of RAS
> > related facilities into stone.
> >
> > From Kconfig:
> >
> > EDAC is designed to report errors in the core system.
> > These are low-level errors that are reported in the CPU or
> > supporting chipset or other subsystems:
> > memory errors, cache errors, PCI errors, thermal throttling, etc..
> > If unsure, select 'Y'.
> >
> > So please explain why your error reporting is so different from the
> > above that it justifies a separate facility. And you better come up
> > with a real good explanation other than we looked at EDAC and it did
> > not fit our needs.
>
> As far as I know, EDAC guys plan to use some other "perfect interface" in the
> future. So I think the current state is really waiting for the "perfect design".

Not sure what you mean by this, but Boris has posted links to his latest patch-set
in this thread, see:

  http://kerneltrap.org/mailarchive/linux-kernel/2010/8/6/4603847

The Git coordinates are:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git, branch tip/perf/parse-events

The 'persistent events' facility he has prototyped there appears to be a good
potential match for the ERST store.

It would be very useful to have another feature there: to mark persistent events as
'dump into syslog on bootup', so that for example the contents of the ERST log could
be dumped right on bootup. [but ERST would not be the only persistent event that
could be marked like that.]

Note that we don't need/want other ABI accesses to the ERST log (i.e. we don't want
/dev/erst-dbg), because we want the benefits of the generalization: tooling (RAS and
other tooling) should learn how to deal with persistent events - not learn how to
deal with ERST logs ... or with warm bootup RAM-embedded logs ... or to deal with
kcrash embedded kernel logs ... etc.

There are many obvious advantages to implementing it like that: there's no need to
special-code ERST-to-printk or ERST-to-whatever-other-facility cross links - it
would be part of a generic/uniform event logging facility to begin with. ERST would
only implement its own, narrow, hardware-specific event accessor methods - nothing
else. Basically a small 'event driver'. This would be the smallest, easiest to
maintain approach - with no facility duplication and no fragmentation.

It's certainly more work as well _for the first such example_ - but from that point
on any new hardware facility can be added with ease, and those too will fit into
existing tooling in a very natural way.

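To make the 'small event driver' split concrete, here is a rough sketch in plain
userspace C. It is only an illustration under stated assumptions: every name in it
(struct pevent_source, struct pevent_ops, PEVENT_DUMP_ON_BOOT, pevent_boot_dump(),
the erst_* stubs) is hypothetical and is not taken from Boris's patch-set or from the
APEI code, and the "ERST" backend is an in-memory array standing in for firmware
access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag: replay this source's records into the log at boot. */
#define PEVENT_DUMP_ON_BOOT 0x1u

struct pevent_source;

/* The only part a hardware backend would have to provide. */
struct pevent_ops {
	/* Copy record 'idx' into buf; return its length, or -1 if there is none. */
	int (*read)(struct pevent_source *src, size_t idx, char *buf, size_t len);
	/* Drop all stored records. */
	void (*clear)(struct pevent_source *src);
};

struct pevent_source {
	const char *name;
	unsigned int flags;
	const struct pevent_ops *ops;
	void *priv;		/* backend-private state */
};

/* --- stand-in "ERST" backend: an in-memory array, not real firmware --- */
struct fake_erst {
	const char *records[4];
	size_t count;
};

static int erst_read(struct pevent_source *src, size_t idx, char *buf, size_t len)
{
	struct fake_erst *e = src->priv;

	if (idx >= e->count || len == 0)
		return -1;
	snprintf(buf, len, "%s", e->records[idx]);
	return (int)strlen(buf);
}

static void erst_clear(struct pevent_source *src)
{
	struct fake_erst *e = src->priv;

	e->count = 0;
}

const struct pevent_ops erst_ops = {
	.read	= erst_read,
	.clear	= erst_clear,
};

/* --- generic side: knows nothing about ERST --- */
int pevent_boot_dump(struct pevent_source *src, FILE *log)
{
	char buf[128];
	int n = 0;

	assert(src && src->ops && src->ops->read);

	/* Only sources marked 'dump on bootup' are replayed. */
	if (!(src->flags & PEVENT_DUMP_ON_BOOT))
		return 0;

	for (size_t i = 0; src->ops->read(src, i, buf, sizeof(buf)) >= 0; i++, n++)
		fprintf(log, "%s: %s\n", src->name, buf);

	return n;	/* number of records written to the log */
}
```

In this sketch a backend fills in a struct pevent_source with its own ops and private
state, and the generic pevent_boot_dump() does the rest. A second backend (say, a
warm-boot RAM log) would only add another pair of read/clear functions; the generic
side and any tooling built on it would stay unchanged.
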
So please help out with the persistent events work. If you need any pointers we'd be
glad to help.

Thanks,

	Ingo