public inbox for linux-kernel@vger.kernel.org
From: Chen Gong <gong.chen@linux.intel.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
	x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Robert Richter <robert.richter@amd.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	seiji.aguchi@hds.com, vgoyal@redhat.com, mjg@redhat.com,
	tony.luck@intel.com, gong.chen@intel.com, satoru.moriya@hds.com,
	avi@redhat.com
Subject: Re: [PATCH 1/3] x86, reboot:  Use NMI instead of REBOOT_VECTOR to stop cpus
Date: Wed, 12 Oct 2011 10:35:42 +0800
Message-ID: <4E94FCFE.4090300@linux.intel.com>
In-Reply-To: <1318346686-12349-2-git-send-email-dzickus@redhat.com>

On 2011/10/11 23:24, Don Zickus wrote:
> A recent discussion started talking about the locking on the pstore fs
> and how it relates to the kmsg infrastructure.  We noticed it was possible
> for userspace to r/w to the pstore fs (grabbing the locks in the process)
> and block the panic path from r/w to the same fs.
>
> The reason was the cpu with the lock could be doing work while the crashing
> cpu is panic'ing.  Busting those spinlocks might cause those cpus to step
> on each other's data.  Fine, fair enough.
>
> It was suggested it would be nice to serialize the panic path (ie stop
> the other cpus) and have only one cpu running.  This would allow us to
> bust the spinlocks and not worry about another cpu stepping on the data.
>
> Of course, smp_send_stop() does this in the panic case.  kmsg_dump() would
> have to be moved to be called after it.  Easy enough.
>
> The only problem is on x86 the smp_send_stop() function calls the
> REBOOT_VECTOR.  Any cpu with irqs disabled (which pstore and its backend
> ERST would do) blocks this IPI and thus does not stop.  This makes it
> difficult to reliably log data to the pstore fs.
>
> The patch below switches from the REBOOT_VECTOR to NMI (and mimics what
> kdump does).  Switching to NMI allows us to deliver the IPI when irqs are
> disabled, increasing the reliability of this function.
>
> However, Andi carefully noted that on some machines this approach does not
> work because of broken BIOSes or whatever.
>
> To help accommodate this, the next couple of patches will run a selftest and
> provide a knob to disable.
>
> V2:
>    uses atomic ops to serialize the cpu that shuts everyone down
>
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
>
> [note] this patch sits on top of another NMI infrastructure change I have
> submitted, so the nmi registration might not apply cleanly without that patch.
> ---
>   arch/x86/kernel/smp.c |   58 +++++++++++++++++++++++++++++++++++++++++++++++-
>   1 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 013e7eb..7bdbf6a 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -28,6 +28,7 @@
>   #include <asm/mmu_context.h>
>   #include <asm/proto.h>
>   #include <asm/apic.h>
> +#include <asm/nmi.h>
>   /*
>    *	Some notes on x86 processor bugs affecting SMP operation:
>    *
> @@ -147,6 +148,59 @@ void native_send_call_func_ipi(const struct cpumask *mask)
>   	free_cpumask_var(allbutself);
>   }
>
> +static atomic_t stopping_cpu = ATOMIC_INIT(-1);
> +
> +static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
> +{
> +	/* We are registered on stopping cpu too, avoid spurious NMI */
> +	if (raw_smp_processor_id() == atomic_read(&stopping_cpu))
> +		return NMI_HANDLED;
> +
> +	stop_this_cpu(NULL);
> +
> +	return NMI_HANDLED;
> +}
> +
> +static void native_nmi_stop_other_cpus(int wait)
> +{
> +	unsigned long flags;
> +	unsigned long timeout;
> +
> +	if (reboot_force)
> +		return;
> +
> +	/*
> +	 * Use an own vector here because smp_call_function
> +	 * does lots of things not suitable in a panic situation.
> +	 */
> +	if (num_online_cpus() > 1) {
> +		/* did someone beat us here? */
> +		if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
> +			return;
> +
> +		if (register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
> +					 NMI_FLAG_FIRST, "smp_stop"))
> +			return;		/* return what? */
> +
> +		/* sync above data before sending NMI */
> +		wmb();
> +
> +		apic->send_IPI_allbutself(NMI_VECTOR);
> +
> +		/*
> +		 * Don't wait longer than a second if the caller
> +		 * didn't ask us to wait.
> +		 */
> +		timeout = USEC_PER_SEC;
> +		while (num_online_cpus() > 1 && (wait || timeout--))
> +			udelay(1);

In this patch and the next one, how about using the same logic as in commit 74d91e3c6?
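
For reference, the "did someone beat us here?" check in the quoted hunk is a
compare-and-swap election: only the first cpu to swap its id into
stopping_cpu goes on to send the NMI. Below is a minimal userspace sketch of
that idea, assuming C11 atomics in place of the kernel's atomic_t API. The
helper name claim_stopper() is hypothetical and does not appear in the patch.
The point it illustrates is that the comparison against -1 applies to the
outcome of the exchange, not to the cpu id being stored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* -1 means "no cpu has claimed the stop role yet", like ATOMIC_INIT(-1). */
static atomic_int stopping_cpu = -1;

/*
 * Hypothetical helper (not from the patch): try to become the cpu that
 * stops all the others. Returns true only for the first caller; every
 * later caller sees a value other than -1 and backs off.
 */
static bool claim_stopper(int cpu)
{
	int expected = -1;

	/* cpu ids must be non-negative so they never collide with the sentinel */
	assert(cpu >= 0);

	/* succeeds only if stopping_cpu still holds -1, then stores cpu */
	return atomic_compare_exchange_strong(&stopping_cpu, &expected, cpu);
}
```

With this shape, a second cpu entering the panic path concurrently returns
early instead of registering the handler and sending NMIs a second time.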


Thread overview: 12+ messages
2011-10-11 15:24 [PATCH 0/3] Use NMI to stop cpus Don Zickus
2011-10-11 15:24 ` [PATCH 1/3] x86, reboot: Use NMI instead of REBOOT_VECTOR " Don Zickus
2011-10-12  2:35   ` Chen Gong [this message]
2011-10-12 12:51     ` Don Zickus
2011-10-13  8:17       ` Chen Gong
2011-10-12  7:30   ` Ingo Molnar
2011-10-12 12:54     ` Don Zickus
2011-10-12 16:33       ` Ingo Molnar
2011-10-11 15:24 ` [PATCH 2/3] x86, NMI: Add NMI IPI selftest Don Zickus
2011-10-12  7:27   ` Ingo Molnar
2011-10-11 15:24 ` [PATCH 3/3] x86, NMI: knob to disable using NMI IPIs to stop cpus Don Zickus
2011-10-12  7:28   ` Ingo Molnar