From: Ingo Molnar <mingo@kernel.org>
To: Chris J Arges <chris.j.arges@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Rafael David Tinoco <inaddy@ubuntu.com>,
	Peter Anvin <hpa@zytor.com>,
	Jiang Liu <jiang.liu@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Gema Gomez <gema.gomez-solano@canonical.com>,
	the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [PATCH] smp/call: Detect stuck CSD locks
Date: Thu, 16 Apr 2015 13:04:23 +0200
Message-ID: <20150416110423.GA15760@gmail.com>
In-Reply-To: <20150415195452.GA19953@canonical.com>


* Chris J Arges <chris.j.arges@canonical.com> wrote:

> Ingo,
> 
> Below are the patches and data I've gathered from the reproducer. My 
> methodology was as described previously; however I used gdb on the 
> qemu process in order to breakpoint L1 once we've detected the hang. 
> This made dumping the kvm_lapic structures on L0 more reliable.

Thanks!

So I have trouble interpreting the L1 backtrace, because it shows 
something entirely new (to me).

First let's clarify the terminology, to make sure I got the workload 
right:

 - L0 is the host kernel, running native Linux. It's not locking up.

 - L1 is the guest kernel, running virtualized Linux. This is the one 
   that is locking up.

 - L2 is the nested guest kernel, running whatever test workload you 
   were running - this is obviously locking up together with L1.

Right?

So with that cleared up, the backtrace on L1 looks like this:

> * Crash dump backtrace from L1:
> 
> crash> bt -a
> PID: 26     TASK: ffff88013a4f1400  CPU: 0   COMMAND: "ksmd"
>  #0 [ffff88013a5039f0] machine_kexec at ffffffff8109d3ec
>  #1 [ffff88013a503a50] crash_kexec at ffffffff8114a763
>  #2 [ffff88013a503b20] panic at ffffffff818068e0
>  #3 [ffff88013a503ba0] csd_lock_wait at ffffffff8113f1e4
>  #4 [ffff88013a503bf0] generic_exec_single at ffffffff8113f2d0
>  #5 [ffff88013a503c60] smp_call_function_single at ffffffff8113f417
>  #6 [ffff88013a503c90] smp_call_function_many at ffffffff8113f7a4
>  #7 [ffff88013a503d20] flush_tlb_page at ffffffff810b3bf9
>  #8 [ffff88013a503d50] ptep_clear_flush at ffffffff81205e5e
>  #9 [ffff88013a503d80] try_to_merge_with_ksm_page at ffffffff8121a445
> #10 [ffff88013a503e00] ksm_scan_thread at ffffffff8121ac0e
> #11 [ffff88013a503ec0] kthread at ffffffff810df0fb
> #12 [ffff88013a503f50] ret_from_fork at ffffffff8180fc98

So this one, VCPU0, is trying to send an IPI to VCPU1:

> PID: 1674   TASK: ffff8800ba4a9e00  CPU: 1   COMMAND: "qemu-system-x86"
>  #0 [ffff88013fd05e20] crash_nmi_callback at ffffffff81091521
>  #1 [ffff88013fd05e30] nmi_handle at ffffffff81062560
>  #2 [ffff88013fd05ea0] default_do_nmi at ffffffff81062b0a
>  #3 [ffff88013fd05ed0] do_nmi at ffffffff81062c88
>  #4 [ffff88013fd05ef0] end_repeat_nmi at ffffffff81812241
>     [exception RIP: vmx_vcpu_run+992]
>     RIP: ffffffff8104cef0  RSP: ffff88013940bcb8  RFLAGS: 00000082
>     RAX: 0000000080000202  RBX: ffff880139b30000  RCX: ffff880139b30000
>     RDX: 0000000000000200  RSI: ffff880139b30000  RDI: ffff880139b30000
>     RBP: ffff88013940bd28   R8: 00007fe192b71110   R9: 00007fe192b71140
>     R10: 00007fff66407d00  R11: 00007fe1927f0060  R12: 0000000000000000
>     R13: 0000000000000001  R14: 0000000000000001  R15: 0000000000000000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> --- <NMI exception stack> ---
>  #5 [ffff88013940bcb8] vmx_vcpu_run at ffffffff8104cef0
>  #6 [ffff88013940bcf8] vmx_handle_external_intr at ffffffff81040c18
>  #7 [ffff88013940bd30] kvm_arch_vcpu_ioctl_run at ffffffff8101b5ad
>  #8 [ffff88013940be00] kvm_vcpu_ioctl at ffffffff81007894
>  #9 [ffff88013940beb0] do_vfs_ioctl at ffffffff81253190
> #10 [ffff88013940bf30] sys_ioctl at ffffffff81253411
> #11 [ffff88013940bf80] system_call_fastpath at ffffffff8180fd4d

So the problem, as far as I can see, is that L1's VCPU1 appears to be 
looping with interrupts disabled:

>     RIP: ffffffff8104cef0  RSP: ffff88013940bcb8  RFLAGS: 00000082

Note that RFLAGS (0x82) does not have bit 9 (0x200, the IF flag) set - 
so it's executing with interrupts disabled.

That is why the IPI does not get through to it, while kdump's NMI, 
which is not maskable, had no problem getting through.
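
As a quick sanity check of the RFLAGS reading above, here is a minimal 
sketch in Python (not kernel code). X86_EFLAGS_IF mirrors the kernel's 
name for bit 9; the helper names interrupts_enabled() and 
can_deliver() are hypothetical and only model the rule that maskable 
interrupts need IF set while NMIs do not:

```python
# Bit 9 of RFLAGS is the interrupt enable flag (IF), value 0x200.
X86_EFLAGS_IF = 1 << 9


def interrupts_enabled(rflags: int) -> bool:
    """Return True if the IF bit is set in the given RFLAGS value."""
    return bool(rflags & X86_EFLAGS_IF)


def can_deliver(rflags: int, is_nmi: bool) -> bool:
    """Simplified model: NMIs ignore IF, maskable interrupts (IPIs) need it."""
    return is_nmi or interrupts_enabled(rflags)


if __name__ == "__main__":
    rflags = 0x82  # value taken from the CPU 1 register dump
    print("IF set:", interrupts_enabled(rflags))
    print("IPI deliverable:", can_deliver(rflags, is_nmi=False))
    print("NMI deliverable:", can_deliver(rflags, is_nmi=True))
```

This deliberately ignores everything else that can block delivery 
(NMI nesting, APIC state, virtualization controls); it only decodes 
the one bit discussed here.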

This (assuming all backtraces are exact!):

>  #5 [ffff88013940bcb8] vmx_vcpu_run at ffffffff8104cef0
>  #6 [ffff88013940bcf8] vmx_handle_external_intr at ffffffff81040c18
>  #7 [ffff88013940bd30] kvm_arch_vcpu_ioctl_run at ffffffff8101b5ad

suggests that we called vmx_vcpu_run() from 
vmx_handle_external_intr(), and that we are executing L2 guest code 
with interrupts disabled.

How is this supposed to work? What mechanism does KVM have against an 
(untrusted) guest interrupt handler locking up?

I might be misunderstanding how this works at the KVM level, but from 
the APIC perspective the situation appears to be pretty clear: CPU1's 
interrupts are turned off, so it cannot receive IPIs, and the CSD wait 
will eventually time out.
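
To make the "wait that eventually times out" idea concrete, here is a 
minimal user-space sketch in Python of a bounded wait on a lock flag. 
It is not the kernel's csd_lock_wait() and not the patch under 
discussion; wait_for_unlock(), is_locked, timeout_s and poll_s are 
hypothetical names, and the sleep-based polling stands in for what 
would be a busy-wait in the kernel:

```python
import time
from typing import Callable


def wait_for_unlock(is_locked: Callable[[], bool],
                    timeout_s: float,
                    poll_s: float = 0.001) -> bool:
    """Poll is_locked() until it returns False or timeout_s elapses.

    Returns True if the lock was released in time, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while is_locked():
        if time.monotonic() >= deadline:
            return False  # the target never released the lock
        time.sleep(poll_s)
    return True


if __name__ == "__main__":
    # A target that never responds, like a CPU with interrupts off.
    stuck = wait_for_unlock(lambda: True, timeout_s=0.05)
    print("released in time:", stuck)
```

The point of the timeout is only diagnostic: it turns a silent hang of 
the waiter into a report that the target never responded.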

Now obviously it appears to be anomalous (assuming my analysis is 
correct) that the interrupt handler has locked up, but it's 
immaterial: a nested kernel must not allow its guest to lock it up.

Thanks,

	Ingo

