All of lore.kernel.org
 help / color / mirror / Atom feed
From: Keir Fraser <keir.xen@gmail.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>, xen-devel@lists.xen.org
Cc: Jan Beulich <JBeulich@suse.com>
Subject: Re: [PATCH] xen/kexec: Prevent deadlock because of one_cpu_only()
Date: Fri, 05 Apr 2013 22:21:36 +0100	[thread overview]
Message-ID: <CD84FEF0.4F07B%keir.xen@gmail.com> (raw)
In-Reply-To: <a559ec5225fcba963d13.1365191556@andrewcoop.uk.xensource.com>

On 05/04/2013 20:52, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:

> During automated testing of the Xen crash path via fatal NMIs, we discovered
> the following deadlock:
> 
> [22:47:04 UTC] (XEN) ****************************************
> [22:47:04 UTC] (XEN) Panic on CPU 0:
> [22:47:04 UTC] (XEN) FATAL TRAP: vector = 2 (nmi)
> [22:47:04 UTC] (XEN) [error_code=0000] , IN INTERRUPT CONTEXT
> [22:47:04 UTC] (XEN) ****************************************
> [22:47:04 UTC] (XEN)
> [22:47:04 UTC] (XEN) Reboot in five seconds...
> [22:47:05 UTC] (XEN) DMAR_IQA_REG = 839b34002
> [22:47:05 UTC] (XEN) DMAR_IQH_REG = 0
> [22:47:05 UTC] (XEN) DMAR_IQT_REG = 26b0
> [22:47:05 UTC] (XEN)
> [22:47:05 UTC] (XEN) ****************************************
> [22:47:05 UTC] (XEN) Panic on CPU 0:
> [22:47:05 UTC] (XEN) queue invalidate wait descriptor was not executed
> [22:47:05 UTC] (XEN) ****************************************
> [22:47:05 UTC] (XEN)
> [22:47:05 UTC] (XEN) Reboot in five seconds...
> [23:09:54 UTC]  The server is not powered on.  The Virtual Serial Port is not
> available.
> 
> The problem is quite difficult to reproduce (seen twice, unable to reproduce
> manually), but inspection of kexec_crash() path shows a certain deadlock
> because of one_cpu_only().
> 
> 
> This patch replaces the use of one_cpu_only() with a recursive spinlock which
> provides exclusion between different CPUs racing into kexec_crash(), but
> prevents deadlock if the same CPU reenters kexec_crash() again.

Does kexec_crash() not mind being reentered in this way?

 -- Keir

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> diff -r 996fbd7657bb -r a559ec5225fc xen/common/kexec.c
> --- a/xen/common/kexec.c
> +++ b/xen/common/kexec.c
> @@ -234,13 +234,6 @@ void __init set_kexec_crash_area_size(u6
>      }
>  }
>  
> -static void one_cpu_only(void)
> -{
> -    /* Only allow the first cpu to continue - force other cpus to spin */
> -    if ( test_and_set_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
> -        for ( ; ; ) ;
> -}
> -
>  /* Save the registers in the per-cpu crash note buffer. */
>  void kexec_crash_save_cpu(void)
>  {
> @@ -294,19 +287,25 @@ static void kexec_common_shutdown(void)
>      watchdog_disable();
>      console_start_sync();
>      spin_debug_disable();
> -    one_cpu_only();
>      acpi_dmar_reinstate();
>  }
>  
>  void kexec_crash(void)
>  {
> +    static DEFINE_SPINLOCK(crash_lock);
>      int pos;
>  
>      pos = (test_bit(KEXEC_FLAG_CRASH_POS, &kexec_flags) != 0);
>      if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) )
>          return;
>  
> +    /* In a cascade failure, it is possible to re-enter kexec_crash on the
> same
> +     * CPU.  This spinlock is deliberately never unlocked, to allow
> reentrance
> +     * on the same CPU, but provides excusion from different CPUs racing. */
> +    spin_lock_recursive(&crash_lock);
> +
>      kexecing = TRUE;
> +    set_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags);
>  
>      kexec_common_shutdown();
>      kexec_crash_save_cpu();

  reply	other threads:[~2013-04-05 21:21 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-05 19:52 [PATCH] xen/kexec: Prevent deadlock because of one_cpu_only() Andrew Cooper
2013-04-05 21:21 ` Keir Fraser [this message]
2013-04-06 15:03   ` Andrew Cooper
2013-04-08  7:48     ` Jan Beulich
2013-04-08  9:47       ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CD84FEF0.4F07B%keir.xen@gmail.com \
    --to=keir.xen@gmail.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.