All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Jose Ricardo Ziviani <joserz@linux.ibm.com>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, bharata@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH] Fix a deadlock case in the CPU hotplug flow
Date: Mon, 3 Sep 2018 11:47:08 +1000	[thread overview]
Message-ID: <20180903014708.GC2679@umbus.fritz.box> (raw)
In-Reply-To: <20180902141904.14479-1-joserz@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 4311 bytes --]

On Sun, Sep 02, 2018 at 11:19:04AM -0300, Jose Ricardo Ziviani wrote:
> We need to set cs->halted to 1 before calling ppc_set_compat. The reason
> is that ppc_set_compat kicks up the new thread created to manage the
> hotplugged KVM virtual CPU and the code drives directly to KVM_RUN
> ioctl. When cs->halted is 1, the code:
> 
> int kvm_cpu_exec(CPUState *cpu)
> ...
>      if (kvm_arch_process_async_events(cpu)) {
>          atomic_set(&cpu->exit_request, 0);
>          return EXCP_HLT;
>      }
> ...
> 
> returns before it reaches KVM_RUN, giving time to the main thread to
> finish its job. Otherwise we can fall in a deadlock because the KVM
> thread will issue the KVM_RUN ioctl while the main thread is setting up
> KVM registers. Depending on how these jobs are scheduled we'll end up
> freezing QEMU.
> 
> The following output shows kvm_vcpu_ioctl sleeping because it cannot get
> the mutex and never will.
> PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr.
> 
> STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL
> 
> PID: 61564  TASK: c000003e981e0780  CPU: 48  COMMAND: "qemu-system-ppc"
>  #0 [c000003e982679a0] __schedule at c000000000b10a44
>  #1 [c000003e98267a60] schedule at c000000000b113a8
>  #2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910
>  #3 [c000003e98267ab0] __mutex_lock at c000000000b132ec
>  #4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm]
>  #5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30
>  #6 [c000003e98267dc0] ksys_ioctl at c000000000408674
>  #7 [c000003e98267e10] sys_ioctl at c0000000004086f8
>  #8 [c000003e98267e30] system_call at c00000000000b488
> 
> crash> struct -x kvm.vcpus 0xc000003da0000000
> vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...}
> 
> crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000
>   mutex.owner = {
>     counter = 0xc000003a23a5c881 <- flag 1: waiters
>   },
> 
> crash> bt 0xc000003a23a5c880
> PID: 61579  TASK: c000003a23a5c880  CPU: 9   COMMAND: "CPU 4/KVM"
> (active)
> 
> crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000
>   mutex.wait_list = {
>     next = 0xc000003e98267b10,
>     prev = 0xc000003e98267b10
>   },
> 
> crash> struct -x mutex_waiter.task 0xc000003e98267b10
>   task = 0xc000003e981e0780
> 
> The following command-line was used to reproduce the problem (note: gdb
> and trace can change the results).
> 
>  $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \
>      -enable-kvm -m 4096 \
>      -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \
>      -display none -nographic \
>      -drive file=disk1.qcow2,format=qcow2
>  ...
>  (qemu) device_add host-spapr-cpu-core,core-id=4
> [no interaction is possible after it, only SIGKILL to take the terminal
> back]
> 
> Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>

Applied, thanks.

> ---
>  hw/ppc/spapr_cpu_core.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 876f0b3d9d..a73b244a3f 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -34,16 +34,16 @@ static void spapr_cpu_reset(void *opaque)
>  
>      cpu_reset(cs);
>  
> -    /* Set compatibility mode to match the boot CPU, which was either set
> -     * by the machine reset code or by CAS. This should never fail.
> -     */
> -    ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, &error_abort);
> -
>      /* All CPUs start halted.  CPU0 is unhalted from the machine level
>       * reset code and the rest are explicitly started up by the guest
>       * using an RTAS call */
>      cs->halted = 1;
>  
> +    /* Set compatibility mode to match the boot CPU, which was either set
> +     * by the machine reset code or by CAS. This should never fail.
> +     */
> +    ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, &error_abort);
> +
>      env->spr[SPR_HIOR] = 0;
>  
>      lpcr = env->spr[SPR_LPCR];

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2018-09-03  3:00 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-02 14:19 [Qemu-devel] [PATCH] Fix a deadlock case in the CPU hotplug flow Jose Ricardo Ziviani
2018-09-02 16:12 ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2018-09-03  1:47 ` David Gibson [this message]
  -- strict thread matches above, loose matches on Subject: below --
2018-09-02  5:01 [Qemu-devel] " Jose Ricardo Ziviani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180903014708.GC2679@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=joserz@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.