From: Chen Gong <gong.chen@linux.intel.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Robert Richter <robert.richter@amd.com>,
Andrew Morton <akpm@linux-foundation.org>,
seiji.aguchi@hds.com, vgoyal@redhat.com, mjg@redhat.com,
tony.luck@intel.com, gong.chen@intel.com, satoru.moriya@hds.com,
avi@redhat.com
Subject: Re: [PATCH 1/3] x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus
Date: Wed, 12 Oct 2011 10:35:42 +0800
Message-ID: <4E94FCFE.4090300@linux.intel.com>
In-Reply-To: <1318346686-12349-2-git-send-email-dzickus@redhat.com>
On 2011/10/11 23:24, Don Zickus wrote:
> A recent discussion started talking about the locking on the pstore fs
> and how it relates to the kmsg infrastructure. We noticed it was possible
> for userspace to r/w to the pstore fs (grabbing the locks in the process)
> and block the panic path from r/w to the same fs.
>
> The reason was the cpu with the lock could be doing work while the crashing
> cpu is panic'ing. Busting those spinlocks might cause those cpus to step
> on each other's data. Fine, fair enough.
>
> It was suggested it would be nice to serialize the panic path (ie stop
> the other cpus) and have only one cpu running. This would allow us to
> bust the spinlocks and not worry about another cpu stepping on the data.
>
> Of course, smp_send_stop() does this in the panic case. kmsg_dump() would
> have to be moved to be called after it. Easy enough.
>
> The only problem is that on x86 the smp_send_stop() function uses the
> REBOOT_VECTOR. Any cpu with irqs disabled (as pstore and its backend
> ERST would have) blocks this IPI and thus does not stop. This makes it
> difficult to reliably log data to the pstore fs.
>
> The patch below switches from the REBOOT_VECTOR to NMI (and mimics what
> kdump does). Switching to NMI allows us to deliver the IPI when irqs are
> disabled, increasing the reliability of this function.
>
> However, Andi carefully noted that on some machines this approach does not
> work because of broken BIOSes or whatever.
>
> To help accommodate this, the next couple of patches will run a selftest and
> provide a knob to disable it.
>
> V2:
> uses atomic ops to serialize the cpu that shuts everyone down
>
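The V2 note above says atomic ops decide which cpu gets to shut everyone else down. As a rough userspace illustration only (not the kernel code), the same first-caller-wins idea can be sketched with C11 atomics. The names NO_CPU, try_become_stopper() and demo_election() are hypothetical, and atomic_compare_exchange_strong() is assumed here as a stand-in for the kernel's atomic_cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_CPU (-1)	/* hypothetical name for the "nobody yet" value */

/* Userspace stand-in for the patch's stopping_cpu variable. */
static atomic_int stopping_cpu = NO_CPU;

/* Returns 1 if this caller claimed the stopper role, 0 if it was beaten. */
static int try_become_stopper(int cpu)
{
	int expected = NO_CPU;

	/* The swap succeeds only while stopping_cpu still holds NO_CPU. */
	return atomic_compare_exchange_strong(&stopping_cpu, &expected, cpu);
}

/* Usage: two simulated cpus try to claim the role one after the other. */
static void demo_election(void)
{
	int first = try_become_stopper(2);
	int second = try_become_stopper(5);

	assert(first == 1);	/* cpu 2 got there first */
	assert(second == 0);	/* cpu 5 was beaten and must back off */
	assert(atomic_load(&stopping_cpu) == 2);
}
```

Only the first caller still sees -1 and claims the role. Every later caller fails the compare and returns early, which is what the "did someone beat us here?" check in the patch depends on.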
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
>
> [note] this patch sits on top of another NMI infrastructure change I have
> submitted, so the nmi registration might not apply cleanly without that patch.
> ---
> arch/x86/kernel/smp.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 013e7eb..7bdbf6a 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -28,6 +28,7 @@
> #include <asm/mmu_context.h>
> #include <asm/proto.h>
> #include <asm/apic.h>
> +#include <asm/nmi.h>
> /*
> * Some notes on x86 processor bugs affecting SMP operation:
> *
> @@ -147,6 +148,59 @@ void native_send_call_func_ipi(const struct cpumask *mask)
> free_cpumask_var(allbutself);
> }
>
> +static atomic_t stopping_cpu = ATOMIC_INIT(-1);
> +
> +static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
> +{
> + /* We are registered on stopping cpu too, avoid spurious NMI */
> + if (raw_smp_processor_id() == atomic_read(&stopping_cpu))
> + return NMI_HANDLED;
> +
> + stop_this_cpu(NULL);
> +
> + return NMI_HANDLED;
> +}
> +
> +static void native_nmi_stop_other_cpus(int wait)
> +{
> + unsigned long flags;
> + unsigned long timeout;
> +
> + if (reboot_force)
> + return;
> +
> + /*
> + * Use an own vector here because smp_call_function
> + * does lots of things not suitable in a panic situation.
> + */
> + if (num_online_cpus() > 1) {
> + /* did someone beat us here? */
> + if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
> + return;
> +
> + if (register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
> + NMI_FLAG_FIRST, "smp_stop"))
> + return; /* return what? */
> +
> + /* sync above data before sending NMI */
> + wmb();
> +
> + apic->send_IPI_allbutself(NMI_VECTOR);
> +
> + /*
> + * Don't wait longer than a second if the caller
> + * didn't ask us to wait.
> + */
> + timeout = USEC_PER_SEC;
> + while (num_online_cpus() > 1 && (wait || timeout--))
> + udelay(1);
In this patch and the next patch, how about using the same logic as in commit 74d91e3c6
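
For reference, the wait loop quoted above polls until only one cpu is left online, and gives up after USEC_PER_SEC polls unless the caller asked to wait. A minimal userspace sketch of that loop shape follows. Everything except the loop condition is an assumption made for illustration: wait_for_others(), struct fake_machine, fake_online() and demo_wait() are hypothetical, and the empty loop body replaces udelay(1).

```c
#include <assert.h>
#include <stdbool.h>

#define USEC_PER_SEC 1000000UL

/* Hypothetical callback: reports how many simulated cpus are online. */
typedef int (*online_fn)(void *ctx);

/*
 * Same loop shape as the quoted code: poll until at most one cpu is left.
 * With wait == 0 the loop gives up after USEC_PER_SEC polls.
 * Returns true if the other cpus went offline, false on timeout.
 */
static bool wait_for_others(online_fn online, void *ctx, int wait)
{
	unsigned long timeout = USEC_PER_SEC;

	while (online(ctx) > 1 && (wait || timeout--))
		;	/* the quoted kernel loop calls udelay(1) here */

	return online(ctx) <= 1;
}

/* Simulated machine: one cpu drops offline per poll until floor remain. */
struct fake_machine {
	int online;
	int floor;
};

static int fake_online(void *ctx)
{
	struct fake_machine *m = ctx;
	int now = m->online;

	if (m->online > m->floor)
		m->online--;
	return now;
}

/* Usage: a cpu that never stops makes the non-waiting caller time out. */
static void demo_wait(void)
{
	struct fake_machine ok = { .online = 4, .floor = 1 };
	struct fake_machine stuck = { .online = 4, .floor = 2 };
	bool stopped = wait_for_others(fake_online, &ok, 0);
	bool timed_out = !wait_for_others(fake_online, &stuck, 0);

	assert(stopped);
	assert(timed_out);
}
```

Note that with a nonzero wait the loop has no upper bound, so a cpu that never goes offline would keep the caller spinning.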
Thread overview: 12+ messages
2011-10-11 15:24 [PATCH 0/3] Use NMI to stop cpus Don Zickus
2011-10-11 15:24 ` [PATCH 1/3] x86, reboot: Use NMI instead of REBOOT_VECTOR " Don Zickus
2011-10-12 2:35 ` Chen Gong [this message]
2011-10-12 12:51 ` Don Zickus
2011-10-13 8:17 ` Chen Gong
2011-10-12 7:30 ` Ingo Molnar
2011-10-12 12:54 ` Don Zickus
2011-10-12 16:33 ` Ingo Molnar
2011-10-11 15:24 ` [PATCH 2/3] x86, NMI: Add NMI IPI selftest Don Zickus
2011-10-12 7:27 ` Ingo Molnar
2011-10-11 15:24 ` [PATCH 3/3] x86, NMI: knob to disable using NMI IPIs to stop cpus Don Zickus
2011-10-12 7:28 ` Ingo Molnar