From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Borislav Petkov <bp@amd64.org>
Cc: "Liu, Jinsong" <jinsong.liu@intel.com>,
tony.luck@intel.com, x86@kernel.org, linux-edac@vger.kernel.org,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [Xen-devel] [PATCH 1/3] Add mcelog support for xen platform
Date: Mon, 23 Apr 2012 11:06:57 -0400 [thread overview]
Message-ID: <20120423150657.GA24481@phenom.dumpdata.com> (raw)
In-Reply-To: <20120423060921.GA29378@aftab.osrc.amd.com>
> > This driver is not that much different from the APEI bridge to MCE code -
> > it just that instead of reading APEI blob data it reads it from an hypercall.
>
> Let me ask you this: is APEI a virtualization solution of some sort?
>
> No, it is the old windoze RAS crap but I guess Linux has to support it
> now too through BIOS. And x86 vendors will have to support it too.
>
> So it is piece of the firmware we'd have to deal with too.
>
> Now xen is a whole another deal - it is purely a piece of software.
Perfect. Software is more elastic than hardware - so the Xen ABI
for the MCE can be changed then to reflect the new format if required.
>
> > The fix seems quite easy - you change the 'struct mce' and 'mce_log()'
> > along with the drivers that use it.
>
> This is exactly what I have a problem with: having to take care of xen
> too. "No, Boris, nope, we cannot take your new feature because it breaks
> xen." and also "Have you tested this on xen too?" where the only thing I
> do is _hardware_ enablement and improving software support for it. And
> xen is not hardware...
Delegate testing to sub-maintainers. In this case that would be me
and Liu.
>
> [..]
>
> > If you are worried about breaking something, then you can just send
> > the change to me or Liu to test it out before committing API changes
> > in the MCE code.
>
> This probably sounds good now but I don't think code changes like
> that ever run as smoothly. Whenever there's breakage, there'll always
> be people screaming against it - I just don't want code that enables
Right, regressions are bad.
> hardware to be crippled and unable to change because it breaks
> completely unrelated pieces - it is bad as it is now.
Can you point me to the existing examples of MCE's badness?
I remember the Greg KK's patches to the dynamic vs static and per-cpu
initialization - but that wasn't due to the MCE API. I think that
was due to Key Sievens transition from SysFS subsystem API to device API
patches that broke bunch of stuff.
>
> > > And this has happened already with the whole microcode loading debacle.
> >
> > My recollection is that the existing microcode API had major issues that
> > could not fixed. The only fix was to make it be very early in the bootup
> > processes and that is what hpa would like developers to focus on.
>
> That was one side of the problem. The other was, AFAICR, creating a xen
> microcode driver which was "on the same level" as the hardware microcode
> drivers, which was completely bull*.
I think of Xen as the hypervisors on PowerPC boxes - for certain operations
you have to use hypercalls to do some hardware operations.
>
> The problem is xen growing stuff everywhere in arch/x86/ and this way,
> maybe even unwillingly, crippling development of hardware-related
> features. I know you're willing to help and I know you mean it well, but
> there's always some other problem in practice.
I am not sure I see why we cannot fix the practical problems as they pop
up?
>
> Now I keep wondering, why don't you guys simply create your own mcelog
> ring buffer and interface on the userspace tool side instead of hooking
> into lowlevel kernel stuff? I mean, the code is there, you simply have
> to copy it into arch/xen/ or whatever you have there. Why do you have to
Nowadays the kernel can transition to run under lguest, KVM, Xen or
baremetal as a single binary image instead of multiple compiled
kernels for a specific virtualization framework. As such, there is
no 'arch/xen,lguest,kvm', instead there is alternative_asm that patches
the low-level calls (set_pte, load_cr3, spinlock, time, etc), for the
appropiate virtualization (or CPU if done under baremetal) offering. Hence
the arch/x86 has expanded to support baremtal and virtualization
extensions in it (called paravirt_ops).
> hook into arch/x86/ instead of doing your own stuff?
I think what you are suggesting is to _not_ reuse existing APIs. That
seems counter-intuive to general software development. There are
exceptions of course - when the existing API needs to change a lot
(or needs to be thrown out), and there is this one little driver that
keeps on using the old interface and can't change - at that point I can
see the purpose of forking it. But until then - using existing APIs is
the way to go.
And I (along with Liu) will keep the Xen MCE driver evolving as it
needs to conform to the new kernel mcheck API.
>
> > > So, my suggestion is to copy the pieces you need and create your own xen
> > > version of /dev/mcelog and add it to dom0 so that there's no hooking
> > > into baremetal code and whenever a dom0 kernel is running, you can
> > > reroute the mcelog userspace tool to read /dev/xen_mcelog or whatever
> > > and not hook into the x86 versions.
> > >
> > > Because, if you'd hooked into it, just imagine one fine day, when we
> > > remove mcelog support, what screaming the xen people will be doing when
> > > mcelog doesn't work anymore.
> >
> > You would have more screaming from the distro camp about removing
> > /dev/mcelog.
>
> How do you know that? Don't you think that we probably would've talked
> to them already and made preparations for conversion first?
I was Googling around for it and I couldn't find anything that says
MCE is removed (which could be very well the fault of my poor
Googling-skills) and its replacement user-space program. Please do
point me to the URL so I can get some idea of what is brewing.
The one thing I saw was this https://lkml.org/lkml/2012/3/2/312 which
pointed to the /dev/mcelog struct changing (which then got NACK-ed), but
nothing about the internal 'struct mce' being dropped from drivers?
I couldn't find anything in the Documentation/feature-removal-schedule.txt
There are hints of ras_printk and /sys/devices/system/ras/agent but
they are related to printk (and I see that mce_log would use it too), but
that just seems to [from a driver perspective] to be the output code-paths
[like the MCE decoders] - and it allows to output to be put in a trace
buffer as well - instead of just in /dev/mcelog.
If the distros choose to stop using /dev/mcelog and use another mechanism
(and the MCE check drivers still do their job of collecting the data and
sending it downstream), then I don't see why "screaming the xen people will
be doing" - as they still get the MCE errors - in whatever way the distros
has choosen to present it.
>
> > But if that is your choice, I would send you an email asking how I
> > need to retool this driver to work with the new MCE gen2 code that you
> > had in mind.
>
> As I said above, I'm very sceptical this will ever work, I guess I'd
> have to live and see.
>
> Now, with your own buffer solution, nothing breaks and all is happy,
> a win-win, if you wish. I think this is much simpler and easier a
> solution.
Not sure what you mean by 'own buffer solution'. Are you talking about
using the trace_mce_record or the ras_printk instead of the mce_log?
I would think that is the job of the MCE decoders?
Please keep in mind that this driver is not trying to decode anything -
it just lifting raw events, massaging them a bit, and sending them
downstream. Similar to how the APEI does it.
next prev parent reply other threads:[~2012-04-23 15:12 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-19 13:29 [PATCH 1/3] Add mcelog support for xen platform Liu, Jinsong
2012-04-20 18:40 ` Konrad Rzeszutek Wilk
2012-04-20 20:00 ` Liu, Jinsong
2012-04-20 20:00 ` Liu, Jinsong
2012-04-21 4:14 ` [Xen-devel] " Konrad Rzeszutek Wilk
2012-04-21 10:45 ` Borislav Petkov
2012-04-23 2:18 ` Konrad Rzeszutek Wilk
2012-04-23 6:09 ` Borislav Petkov
2012-04-23 15:06 ` Konrad Rzeszutek Wilk [this message]
2012-04-23 15:59 ` Borislav Petkov
2012-04-23 16:23 ` Luck, Tony
2012-04-23 16:54 ` Konrad Rzeszutek Wilk
2012-04-23 16:17 ` Liu, Jinsong
2012-04-23 15:27 ` Luck, Tony
2012-04-23 15:32 ` Konrad Rzeszutek Wilk
2012-04-23 16:17 ` Luck, Tony
2012-04-23 16:51 ` Konrad Rzeszutek Wilk
2012-04-23 15:43 ` Liu, Jinsong
2012-04-23 16:26 ` H. Peter Anvin
2012-04-23 17:05 ` Konrad Rzeszutek Wilk
2012-04-23 17:23 ` H. Peter Anvin
2012-04-23 16:17 ` H. Peter Anvin
2012-04-23 17:03 ` Konrad Rzeszutek Wilk
2012-04-23 17:16 ` H. Peter Anvin
2012-04-24 8:27 ` Andi Kleen
2012-04-24 8:27 ` Andi Kleen
2012-04-25 8:11 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120423150657.GA24481@phenom.dumpdata.com \
--to=konrad.wilk@oracle.com \
--cc=bp@amd64.org \
--cc=hpa@zytor.com \
--cc=jinsong.liu@intel.com \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.