From: Andi Kleen <andi@firstfloor.org>
To: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>,
hpa@zytor.com, linux-kernel@vger.kernel.org, mingo@elte.hu,
tglx@linutronix.de
Subject: Re: [PATCH] [20/28] x86: MCE: Switch x86 machine check handler to Monarch election.
Date: Fri, 17 Apr 2009 15:09:44 +0200
Message-ID: <20090417130944.GL14687@one.firstfloor.org>
In-Reply-To: <49E866D3.1020003@jp.fujitsu.com>
Thanks for your review.
On Fri, Apr 17, 2009 at 08:24:03PM +0900, Hidetoshi Seto wrote:
> > + goto out;
> > + if ((s64)*t < SPINUNIT) {
> > + /* CHECKME: Make panic default for 1 too? */
> > + if (tolerant < 1)
> > + mce_panic("Timeout synchronizing machine check over CPUs",
> > + NULL, NULL);
>
> Assuming we came here from mce_start() and panic, I suppose no MCE log
> would appear on the console, since no CPU has invoked mce_log(&m) yet.
Well, it would be rather random whether the CPU that detects the timeout
actually has something useful to report.
More likely the useful information is in the banks of some CPU that
doesn't answer.
On the other hand, the real log will come out to disk after reboot from
the CPU registers (especially together with the new mce panic=30 default).
> Is it expected behavior?
Kind of expected. We could probably fix it by adding a fallback path here,
but for the reasons above I doubt it would improve things in practice.
> > + }
> > + *t -= SPINUNIT;
> > +out:
> > + touch_nmi_watchdog();
> > + return 0;
> > +}
> (snip)
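For readers following the timeout discussion, here is a minimal userspace sketch of the countdown pattern in the quoted hunk: a signed budget is charged one SPINUNIT per spin step, and the wait gives up once the budget can no longer cover another step. The names budget_timed_out() and wait_for_callin() and the SPINUNIT value of 100 are assumptions made for illustration; the panic path, the tolerant check and the NMI watchdog touch from the patch are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define SPINUNIT 100	/* ns charged per spin step; value assumed for this sketch */

/*
 * Hypothetical stand-in for the timeout check: returns 1 once the
 * remaining budget cannot cover another step, otherwise charges one
 * step against the budget and returns 0.
 */
static int budget_timed_out(int64_t *budget_ns)
{
	if (*budget_ns < SPINUNIT)
		return 1;
	*budget_ns -= SPINUNIT;
	return 0;
}

/*
 * Spin until *arrived equals expected or the budget runs out.
 * Returns the number of spin steps taken, or -1 on timeout.
 */
static int wait_for_callin(const int *arrived, int expected, int64_t budget_ns)
{
	int steps = 0;

	assert(budget_ns >= 0);
	while (*arrived != expected) {
		if (budget_timed_out(&budget_ns))
			return -1;
		steps++;	/* a real handler would delay for SPINUNIT here */
	}
	return steps;
}
```

With a budget of 250 the helper allows two steps and reports a timeout on the third check, which is the point where the quoted code decides between panic and a purely local fallback.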
> > +/*
> > + * Start of Monarch synchronization. This waits until all CPUs have
> > + * entered the ecception handler and then determines if any of them
>                   ^^^^^^^^^
>                   exception
Thanks, fixed. I did actually run the spell checker on the comments,
but that one must have slipped through somehow.
>
> > + while (atomic_read(&mce_callin) != cpus) {
> > + if (mce_timed_out(&timeout)) {
> > + atomic_set(&global_nwo, 0);
> > + *order = -1;
> > + return no_way_out;
> > + }
> > + ndelay(SPINUNIT);
> > + }
> > +
> > + /*
> > + * Cache the global no_way_out state.
> > + */
> > + nwo = atomic_read(&global_nwo);
> > +
> > + /*
> > + * Monarch starts executing now, the others wait.
> > + */
> > + if (*order == 1) {
> > + atomic_set(&global_nwo, 0);
>
> Monarch should clear global_nwo after all Subjects have read it.
The subjects don't care about the global nwo state; it only matters to the
Monarch, who does the panic in mce_end(). The only exception would be a
timeout, but in that case all the decisions are local only anyway.
We ensure that all the subjects have written it first.
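The ordering argument above can be replayed in a small single-threaded userspace sketch. It assumes each CPU adds its local no_way_out verdict to global_nwo before incrementing mce_callin (that write is not part of the quoted hunk), and it uses C11 atomics as stand-ins for the kernel's atomic_t. The helpers call_in(), after_barrier() and replay_event() are hypothetical names for illustration, not functions from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_CPUS 16	/* arbitrary limit for this sketch */

/* Userspace stand-ins for the kernel counters named in the patch. */
static atomic_int global_nwo;	/* accumulated no_way_out verdicts */
static atomic_int mce_callin;	/* number of CPUs that have called in */

/*
 * Phase 1: publish the local verdict first, then call in.
 * Returns the call-in order; order 1 is the Monarch.
 */
static int call_in(int local_nwo)
{
	atomic_fetch_add(&global_nwo, local_nwo);
	return atomic_fetch_add(&mce_callin, 1) + 1;
}

/*
 * Phase 2: only reached once mce_callin equals the CPU count, so
 * every write to global_nwo has already happened.
 */
static int after_barrier(int order)
{
	int nwo = atomic_load(&global_nwo);	/* cache the global state */

	if (order == 1)
		atomic_store(&global_nwo, 0);	/* Monarch resets it */
	return nwo;
}

/* Replay one event and return the value the Monarch cached. */
static int replay_event(const int *local_nwo, int cpus)
{
	int order[MAX_CPUS];
	int monarch_nwo = -1;

	assert(cpus > 0 && cpus <= MAX_CPUS);
	atomic_store(&global_nwo, 0);
	atomic_store(&mce_callin, 0);

	for (int i = 0; i < cpus; i++)
		order[i] = call_in(local_nwo[i]);
	assert(atomic_load(&mce_callin) == cpus);

	for (int i = 0; i < cpus; i++) {
		int seen = after_barrier(order[i]);

		if (order[i] == 1)
			monarch_nwo = seen;	/* subjects' reads are unused */
	}
	return monarch_nwo;
}
```

In this replay the Monarch reads and clears global_nwo before any subject reads it, so the subjects see 0. That is harmless here because only the Monarch's cached value is returned, which mirrors the claim that only the Monarch's view feeds the panic decision.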
-Andi
--
ak@linux.intel.com -- Speaking for myself only.