xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Borislav Petkov <bp@amd64.org>
Cc: "Liu, Jinsong" <jinsong.liu@intel.com>,
	tony.luck@intel.com, x86@kernel.org, linux-edac@vger.kernel.org,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [Xen-devel] [PATCH 1/3] Add mcelog support for xen platform
Date: Mon, 23 Apr 2012 11:06:57 -0400	[thread overview]
Message-ID: <20120423150657.GA24481@phenom.dumpdata.com> (raw)
In-Reply-To: <20120423060921.GA29378@aftab.osrc.amd.com>

> > This driver is not that much different from the APEI bridge to MCE code -
> > it just that instead of reading APEI blob data it reads it from an hypercall.
> 
> Let me ask you this: is APEI a virtualization solution of some sort?
> 
> No, it is the old windoze RAS crap but I guess Linux has to support it
> now too through BIOS. And x86 vendors will have to support it too.
> 
> So it is piece of the firmware we'd have to deal with too.
> 
> Now xen is a whole another deal - it is purely a piece of software.

Perfect. Software is more elastic than hardware - so the Xen ABI
for the MCE can be changed then to reflect the new format if required.
> 
> > The fix seems quite easy - you change the 'struct mce' and 'mce_log()'
> > along with the drivers that use it.
> 
> This is exactly what I have a problem with: having to take care of xen
> too. "No, Boris, nope, we cannot take your new feature because it breaks
> xen." and also "Have you tested this on xen too?" where the only thing I
> do is _hardware_ enablement and improving software support for it. And
> xen is not hardware...

Delegate testing to sub-maintainers. In this case that would be me
and Liu.

> 
> [..]
> 
> > If you are worried about breaking something, then you can just send
> > the change to me or Liu to test it out before committing API changes
> > in the MCE code.
> 
> This probably sounds good now but I don't think code changes like
> that ever run as smoothly. Whenever there's breakage, there'll always
> be people screaming against it - I just don't want code that enables

Right, regressions are bad.

> hardware to be crippled and unable to change because it breaks
> completely unrelated pieces - it is bad as it is now.

Can you point me to the existing examples of MCE's badness?

I remember the Greg KK's patches to the dynamic vs static and per-cpu
initialization - but that wasn't due to the MCE API. I think that
was due to Key Sievens transition from SysFS subsystem API to device API
patches that broke bunch of stuff.

> 
> > > And this has happened already with the whole microcode loading debacle.
> > 
> > My recollection is that the existing microcode API had major issues that
> > could not fixed. The only fix was to make it be very early in the bootup
> > processes and that is what hpa would like developers to focus on.
> 
> That was one side of the problem. The other was, AFAICR, creating a xen
> microcode driver which was "on the same level" as the hardware microcode
> drivers, which was completely bull*.

I think of Xen as the hypervisors on PowerPC boxes - for certain operations
you have to use hypercalls to do some hardware operations.

> 
> The problem is xen growing stuff everywhere in arch/x86/ and this way,
> maybe even unwillingly, crippling development of hardware-related
> features. I know you're willing to help and I know you mean it well, but
> there's always some other problem in practice.

I am not sure I see why we cannot fix the practical problems as they pop
up?

> 
> Now I keep wondering, why don't you guys simply create your own mcelog
> ring buffer and interface on the userspace tool side instead of hooking
> into lowlevel kernel stuff? I mean, the code is there, you simply have
> to copy it into arch/xen/ or whatever you have there. Why do you have to

Nowadays the kernel can transition to run under lguest, KVM, Xen or
baremetal as a single binary image instead of multiple compiled
kernels for a specific virtualization framework. As such, there is
no 'arch/xen,lguest,kvm', instead there is alternative_asm that patches
the low-level calls (set_pte, load_cr3, spinlock, time, etc), for the
appropiate virtualization (or CPU if done under baremetal) offering. Hence
the arch/x86 has expanded to support baremtal and virtualization
extensions in it (called paravirt_ops).

> hook into arch/x86/ instead of doing your own stuff?

I think what you are suggesting is to _not_ reuse existing APIs. That
seems counter-intuive to general software development. There are
exceptions of course - when the existing API needs to change a lot
(or needs to be thrown out), and there is this one little driver that
keeps on using the old interface and can't change - at that point I can
see the purpose of forking it. But until then - using existing APIs is
the way to go.

And I (along with Liu) will keep the Xen MCE driver evolving as it
needs to conform to the new kernel mcheck API.

> 
> > > So, my suggestion is to copy the pieces you need and create your own xen
> > > version of /dev/mcelog and add it to dom0 so that there's no hooking
> > > into baremetal code and whenever a dom0 kernel is running, you can
> > > reroute the mcelog userspace tool to read /dev/xen_mcelog or whatever
> > > and not hook into the x86 versions.
> > > 
> > > Because, if you'd hooked into it, just imagine one fine day, when we
> > > remove mcelog support, what screaming the xen people will be doing when
> > > mcelog doesn't work anymore.
> > 
> > You would have more screaming from the distro camp about removing
> > /dev/mcelog.
> 
> How do you know that? Don't you think that we probably would've talked
> to them already and made preparations for conversion first?

I was Googling around for it and I couldn't find anything that says
MCE is removed (which could be very well the fault of my poor
Googling-skills) and its replacement user-space program. Please do
point me to the URL so I can get some idea of what is brewing.

The one thing I saw was this https://lkml.org/lkml/2012/3/2/312 which
pointed to the /dev/mcelog struct changing (which then got NACK-ed), but
nothing about the internal 'struct mce' being dropped from drivers?

I couldn't find anything in the Documentation/feature-removal-schedule.txt 

There are hints of ras_printk and  /sys/devices/system/ras/agent but
they are related to printk (and I see that mce_log would use it too), but
that just seems to [from a driver perspective] to be the output code-paths
[like the MCE decoders] - and it allows to output to be put in a trace
buffer as well - instead of just in /dev/mcelog.

If the distros choose to stop using /dev/mcelog and use another mechanism
(and the MCE check drivers still do their job of collecting the data and
sending it downstream), then I don't see why "screaming the xen people will
be doing" - as they still get the MCE errors - in whatever way the distros
has choosen to present it.

> 
> > But if that is your choice, I would send you an email asking how I
> > need to retool this driver to work with the new MCE gen2 code that you
> > had in mind.
> 
> As I said above, I'm very sceptical this will ever work, I guess I'd
> have to live and see.
> 
> Now, with your own buffer solution, nothing breaks and all is happy,
> a win-win, if you wish. I think this is much simpler and easier a
> solution.

Not sure what you mean by 'own buffer solution'. Are you talking about
using the trace_mce_record or the ras_printk instead of the mce_log?
I would think that is the job of the MCE decoders?

Please keep in mind that this driver is not trying to decode anything -
it just lifting raw events, massaging them a bit, and sending them
downstream. Similar to how the APEI does it.

  reply	other threads:[~2012-04-23 15:06 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-19 13:29 [PATCH 1/3] Add mcelog support for xen platform Liu, Jinsong
2012-04-20 18:40 ` Konrad Rzeszutek Wilk
2012-04-20 20:00   ` Liu, Jinsong
2012-04-21  4:14 ` [Xen-devel] " Konrad Rzeszutek Wilk
2012-04-21 10:45   ` Borislav Petkov
2012-04-23  2:18     ` Konrad Rzeszutek Wilk
2012-04-23  6:09       ` Borislav Petkov
2012-04-23 15:06         ` Konrad Rzeszutek Wilk [this message]
2012-04-23 15:59           ` Borislav Petkov
2012-04-23 16:23             ` Luck, Tony
2012-04-23 16:54             ` Konrad Rzeszutek Wilk
2012-04-23 16:17         ` Liu, Jinsong
2012-04-23 15:27     ` Luck, Tony
2012-04-23 15:32       ` Konrad Rzeszutek Wilk
2012-04-23 16:17         ` Luck, Tony
2012-04-23 16:51           ` Konrad Rzeszutek Wilk
2012-04-23 15:43       ` Liu, Jinsong
2012-04-23 16:26         ` H. Peter Anvin
2012-04-23 17:05           ` Konrad Rzeszutek Wilk
2012-04-23 17:23             ` H. Peter Anvin
2012-04-23 16:17       ` H. Peter Anvin
2012-04-23 17:03         ` Konrad Rzeszutek Wilk
2012-04-23 17:16           ` H. Peter Anvin
2012-04-24  8:27     ` Andi Kleen
2012-04-25  8:11       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120423150657.GA24481@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=bp@amd64.org \
    --cc=hpa@zytor.com \
    --cc=jinsong.liu@intel.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).