From: Andi Kleen <andi@firstfloor.org>
To: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>,
linux-kernel@vger.kernel.org, hpa@zytor.com, x86@kernel.org,
Huang Ying <ying.huang@intel.com>,
Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH 02/31] x86: MCE: Improve mce_get_rip v3
Date: Wed, 27 May 2009 09:23:02 +0200
Message-ID: <20090527072302.GR846@one.firstfloor.org>
In-Reply-To: <4A1CC19C.3010409@jp.fujitsu.com>
On Wed, May 27, 2009 at 01:29:16PM +0900, Hidetoshi Seto wrote:
> Andi Kleen wrote:
> > From: Huang Ying <ying.huang@intel.com>
> >
> > Assume RIP is valid when either EIPV or RIPV are set.
>
> Bad description.
> If RIP means "restart IP" that is valid only if RIPV is set,
> this sentence doesn't make sense completely.
No, it doesn't mean restart IP; it just means the normal instruction
pointer, like everywhere else.
>
> > This influences
> > whether the machine check exception handler decides to return or panic.
>
> I suppose you are pointing logics in:
Yes.
>
> mce_get_rip(&m, regs);
> :
> panicm = m;
> :
> /*
> * If the EIPV bit is set, it means the saved IP is the
> * instruction which caused the MCE.
> */
> if (m.mcgstatus & MCG_STATUS_EIPV)
> user_space = panicm.ip && (panicm.cs & 3);
>
> /*
> * If we know that the error was in user space, send a
> * SIGBUS. Otherwise, panic if tolerance is low.
> *
> * force_sig() takes an awful lot of locks and has a slight
> * risk of deadlocking.
> */
> if (user_space) {
> force_sig(SIGBUS, current);
> } else if (panic_on_oops || tolerant < 2) {
> mce_panic("Uncorrected machine check",
> &panicm, mcestart);
> }
>
> So EIPV without RIPV will be no ip and will result in panic,
> while expected result is SIGBUS.
First, this is only for the !MCA recovery case. In the MCA recovery
case we have more information and can decide better.
In this case, no EIPV means that the kernel isn't sure where the
error occurred, so it cannot safely decide whether it was user space
or kernel space, and in the tolerant == 2 case it has to panic
just in case a kernel kill would cause a deadlock.
With MCA recovery this whole thing is replaced by a new, improved
mechanism using the high-level handler.
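
For illustration only, here is a minimal user-space sketch of the quoted
decision fragment. It models just the lines quoted above, not the full
handler or the MCA recovery path. The names mce_sketch, mce_action and
mce_decide() are hypothetical; only the MCG_STATUS bit positions are
architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions in IA32_MCG_STATUS. */
#define MCG_STATUS_RIPV (1ULL << 0)	/* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)	/* saved IP is associated with the error */

/* Hypothetical types, introduced only for this sketch. */
enum mce_action { MCE_ACT_NONE, MCE_ACT_SIGBUS, MCE_ACT_PANIC };

struct mce_sketch {
	uint64_t mcgstatus;
	uint64_t ip;
	uint16_t cs;
};

/* Mirrors the quoted fragment: SIGBUS for user space, else maybe panic. */
enum mce_action mce_decide(const struct mce_sketch *m,
			   bool panic_on_oops, int tolerant)
{
	bool user_space = false;

	/* ip/cs are only trusted for the user/kernel decision with EIPV. */
	if (m->mcgstatus & MCG_STATUS_EIPV)
		user_space = m->ip && (m->cs & 3);

	if (user_space)
		return MCE_ACT_SIGBUS;
	if (panic_on_oops || tolerant < 2)
		return MCE_ACT_PANIC;
	return MCE_ACT_NONE;
}

/* Small self-check showing the two cases discussed in this thread. */
void mce_decide_demo(void)
{
	struct mce_sketch user = { MCG_STATUS_EIPV, 0x400000, 0x33 };
	struct mce_sketch unknown = { MCG_STATUS_RIPV, 0x400000, 0x33 };

	assert(mce_decide(&user, false, 1) == MCE_ACT_SIGBUS);
	/* Without EIPV the origin is unknown, so low tolerance panics. */
	assert(mce_decide(&unknown, false, 1) == MCE_ACT_PANIC);
}
```

In this model a record without EIPV is never treated as user space, which
is why it falls through to the panic check.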
> >
> > Also in addition do not force the RIP to be valid with the exact
> > register MSRs.
>
> I think the forced one is EIP:
> > - m->mcgstatus |= MCG_STATUS_EIPV;
True. Changed.
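
A rough user-space sketch of the behaviour the patch description asks
for, under stated assumptions: take IP and CS from the saved registers
when either RIPV or EIPV is set, and let an exact-IP MSR (if the CPU has
one) override the IP without forcing EIPV in mcgstatus. This is a model,
not the patch itself. regs_sketch, mce_rip_sketch, get_rip_sketch() and
the have_rip_msr/rip_msr_value parameters are hypothetical stand-ins for
pt_regs, struct mce and the MSR read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MCG_STATUS_RIPV (1ULL << 0)	/* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)	/* saved IP is associated with the error */

/* Hypothetical stand-ins for pt_regs and struct mce. */
struct regs_sketch {
	uint64_t ip;
	uint64_t cs;
};

struct mce_rip_sketch {
	uint64_t mcgstatus;
	uint64_t ip;
	uint16_t cs;
};

void get_rip_sketch(struct mce_rip_sketch *m, const struct regs_sketch *regs,
		    int have_rip_msr, uint64_t rip_msr_value)
{
	/* The saved IP is considered usable if either valid bit is set. */
	if (regs && (m->mcgstatus & (MCG_STATUS_RIPV | MCG_STATUS_EIPV))) {
		m->ip = regs->ip;
		m->cs = (uint16_t)regs->cs;
	} else {
		m->ip = 0;
		m->cs = 0;
	}

	/*
	 * An exact-IP MSR overrides the IP only. mcgstatus is left alone
	 * (no forced EIPV), and CS still comes from the saved registers.
	 */
	if (have_rip_msr)
		m->ip = rip_msr_value;
}

/* Small self-check: EIPV alone is enough to pick up the saved IP/CS. */
void get_rip_demo(void)
{
	struct regs_sketch regs = { 0x400000, 0x33 };
	struct mce_rip_sketch m = { MCG_STATUS_EIPV, 0, 0 };

	get_rip_sketch(&m, &regs, 0, 0);
	assert(m.ip == 0x400000 && m.cs == 0x33);
}
```

Note that in this sketch CS is taken from the stack even when the MSR
supplies the IP, which is the point raised in the quoted text below.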
>
> And please note that it keep use CS on stack even if MSR is available.
>
> I made an alternative patch for this, with no functional change.
> Please consider replacing.
No, sorry. I got burned too much last time you touched the description
of this simple patch. I think my description is simple and to the point,
and this patch doesn't really deserve anything more.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.