From: Chen Gong <gong.chen@linux.intel.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
	x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Robert Richter <robert.richter@amd.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	seiji.aguchi@hds.com, vgoyal@redhat.com, mjg@redhat.com,
	tony.luck@intel.com, gong.chen@intel.com, satoru.moriya@hds.com,
	avi@redhat.com
Subject: Re: [PATCH 1/3] x86, reboot:  Use NMI instead of REBOOT_VECTOR to stop cpus
Date: Wed, 12 Oct 2011 10:35:42 +0800
Message-ID: <4E94FCFE.4090300@linux.intel.com>
In-Reply-To: <1318346686-12349-2-git-send-email-dzickus@redhat.com>

On 2011/10/11 23:24, Don Zickus wrote:
> A recent discussion started talking about the locking on the pstore fs
> and how it relates to the kmsg infrastructure.  We noticed it was possible
> for userspace to r/w to the pstore fs (grabbing the locks in the process)
> and block the panic path from r/w to the same fs.
>
> The reason was the cpu with the lock could be doing work while the crashing
> cpu is panic'ing.  Busting those spinlocks might cause those cpus to step
> on each other's data.  Fine, fair enough.
>
> It was suggested it would be nice to serialize the panic path (ie stop
> the other cpus) and have only one cpu running.  This would allow us to
> bust the spinlocks and not worry about another cpu stepping on the data.
>
> Of course, smp_send_stop() does this in the panic case.  kmsg_dump() would
> have to be moved to be called after it.  Easy enough.
>
> The only problem is on x86 the smp_send_stop() function calls the
> REBOOT_VECTOR.  Any cpu with irqs disabled (which pstore and its backend
> ERST would do) blocks this IPI and thus does not stop.  This makes it
> difficult to reliably log data to the pstore fs.
>
> The patch below switches from the REBOOT_VECTOR to NMI (and mimics what
> kdump does).  Switching to NMI allows us to deliver the IPI when irqs are
> disabled, increasing the reliability of this function.
>
> However, Andi carefully noted that on some machines this approach does not
> work because of broken BIOSes or whatever.
>
> To help accommodate this, the next couple of patches will run a selftest and
> provide a knob to disable.
>
> V2:
>    uses atomic ops to serialize the cpu that shuts everyone down
>
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
>
> [note] this patch sits on top of another NMI infrastructure change I have
> submitted, so the nmi registration might not apply cleanly without that patch.
> ---
>   arch/x86/kernel/smp.c |   58 +++++++++++++++++++++++++++++++++++++++++++++++-
>   1 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 013e7eb..7bdbf6a 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -28,6 +28,7 @@
>   #include <asm/mmu_context.h>
>   #include <asm/proto.h>
>   #include <asm/apic.h>
> +#include <asm/nmi.h>
>   /*
>    *	Some notes on x86 processor bugs affecting SMP operation:
>    *
> @@ -147,6 +148,59 @@ void native_send_call_func_ipi(const struct cpumask *mask)
>   	free_cpumask_var(allbutself);
>   }
>
> +static atomic_t stopping_cpu = ATOMIC_INIT(-1);
> +
> +static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
> +{
> +	/* We are registered on stopping cpu too, avoid spurious NMI */
> +	if (raw_smp_processor_id() == atomic_read(&stopping_cpu))
> +		return NMI_HANDLED;
> +
> +	stop_this_cpu(NULL);
> +
> +	return NMI_HANDLED;
> +}
> +
> +static void native_nmi_stop_other_cpus(int wait)
> +{
> +	unsigned long flags;
> +	unsigned long timeout;
> +
> +	if (reboot_force)
> +		return;
> +
> +	/*
> +	 * Use an own vector here because smp_call_function
> +	 * does lots of things not suitable in a panic situation.
> +	 */
> +	if (num_online_cpus() > 1) {
> +		/* did someone beat us here? */
> +		if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
> +			return;
> +
> +		if (register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
> +					 NMI_FLAG_FIRST, "smp_stop"))
> +			return;		/* return what? */
> +
> +		/* sync above data before sending NMI */
> +		wmb();
> +
> +		apic->send_IPI_allbutself(NMI_VECTOR);
> +
> +		/*
> +		 * Don't wait longer than a second if the caller
> +		 * didn't ask us to wait.
> +		 */
> +		timeout = USEC_PER_SEC;
> +		while (num_online_cpus() > 1 && (wait || timeout--))
> +			udelay(1);

In this patch and the next one, how about using the same logic as in commit 74d91e3c6?
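
For reference, the two pieces of logic in the quoted hunk (electing a single
stopping cpu and the bounded wait loop) can be sketched in userspace C11.
This is only an illustration under stated assumptions, not kernel code and
not the logic of the commit mentioned above: claim_stop_path() and
wait_for_others() are hypothetical names, C11 atomics stand in for
atomic_cmpxchg(), and the udelay(1) call is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* -1 means no cpu has claimed the stop path yet (like ATOMIC_INIT(-1)). */
static atomic_int stopping_cpu = -1;

/*
 * Hypothetical helper: returns true only for the first caller.
 * The new value (the cpu id) is the last argument of the exchange;
 * the "did someone beat us" decision is made on the result of the
 * exchange, not inside its argument list.
 */
static bool claim_stop_path(int cpu)
{
	int expected = -1;

	return atomic_compare_exchange_strong(&stopping_cpu, &expected, cpu);
}

/*
 * Hypothetical helper: poll *others (a count of other cpus still online)
 * until it reaches zero or the budget is used up. With wait != 0 the
 * budget is never consulted, so the loop has no upper bound, which is
 * the same behaviour as the quoted while loop.
 */
static bool wait_for_others(atomic_int *others, int wait, unsigned long budget)
{
	while (atomic_load(others) > 0 && (wait || budget--))
		; /* the kernel version calls udelay(1) here */

	return atomic_load(others) == 0;
}
```

A second call to claim_stop_path() returns false and leaves stopping_cpu
untouched, which is what lets a late caller back out early.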

