KVM HV crash - Alexander Graf

From: Alexander Graf <agraf@suse.de>
To: kvm-ppc@vger.kernel.org
Subject: KVM HV crash
Date: Mon, 14 Jul 2014 11:24:07 +0000
Message-ID: <53C3BDD7.4020609@suse.de>

Hi Paul,

I've just seen a crash on POWER7 in HV KVM on a host where I run HV and 
PR KVM VMs in parallel based on the latest code (linus/master merged 
with for-3.16 merged with kvm-ppc-queue plus some PR patches):

Unable to handle kernel paging request for data at address 0x0000000c
Faulting instruction address: 0xc0000000000667d8
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA PowerNV
Modules linked in: kvm_hv kvm_pr kvm
CPU: 4 PID: 4973 Comm: qemu-system-ppc Not tainted 3.16.0-rc4-1.1-default+ #84
task: c000000f170addb0 ti: c000000f17154000 task.ti: c000000f17154000
NIP: c0000000000667d8 LR: d000000007d61ee8 CTR: c0000000000667c0
REGS: c000000f171574a0 TRAP: 0300   Not tainted  (3.16.0-rc4-1.1-default+)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 24002428  XER: 00000000
CFAR: c000000000009358 DAR: 000000000000000c DSISR: 42000000 SOFTE: 1
GPR00: d000000007d63be8 c000000f17157720 c0000000018c47d8 0000000000006020
GPR04: 0000000000000000 0000000000000001 c00000000190fb18 0000000000000000
GPR08: 0000000000000c00 0000000000000000 0000000000000004 d000000007d6a420
GPR12: c0000000000667c0 c00000000fdc2400 d000000007d73ef8 0000000000000007
GPR16: c000000f17157858 c000000f19db1000 c000000001948f88 d000000007d73ef8
GPR20: d000000007d73ef8 0000000000000004 0000000000000004 c000000001948584
GPR24: c000000f171577d0 c000000f19db0ff8 c00000000fdc3f00 c000000f15941da0
GPR28: c000000f15941da0 c000000f19db0fc0 c000000f19db0fc0 c000000f15941da0
NIP [c0000000000667d8] .xics_wake_cpu+0x18/0x30
LR [d000000007d61ee8] .kvmppc_start_thread+0x78/0xd0 [kvm_hv]
Call Trace:
[c000000f17157720] [c000000f171577d0] 0xc000000f171577d0 (unreliable)
[c000000f171577a0] [d000000007d63be8] .kvmppc_run_vcpu+0x5f8/0xef0 [kvm_hv]
[c000000f17157920] [d000000007d650c4] .kvmppc_vcpu_run_hv+0x344/0x740 [kvm_hv]
[c000000f171579f0] [d0000000079bd7cc] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
[c000000f17157a60] [d0000000079ba154] .kvm_arch_vcpu_ioctl_run+0x54/0x1a0 [kvm]
[c000000f17157af0] [d0000000079b4b58] .kvm_vcpu_ioctl+0x4b8/0x740 [kvm]
[c000000f17157cb0] [c000000000278858] .do_vfs_ioctl+0xa8/0x710
[c000000f17157d90] [c000000000278f84] .SyS_ioctl+0xc4/0xe0
[c000000f17157e30] [c00000000000a260] syscall_exit+0x0/0xa0
Instruction dump:
7c0004ac 91490004 39200001 992d028c 4e800020 60000000 3d22001a 78631f24
3929a788 39400004 7d29182a 7c0004ac <9949000c> 39200001 992d028c 4e800020
---[ end trace c05a9dd08ee8f5fc ]---



(gdb) x /10i xics_wake_cpu
    0xc0000000000667c0 <xics_wake_cpu>:    addis   r9,r2,26
    0xc0000000000667c4 <xics_wake_cpu+4>:    rldicr  r3,r3,3,60
    0xc0000000000667c8 <xics_wake_cpu+8>:    addi r9,r9,-22648
    0xc0000000000667cc <xics_wake_cpu+12>:    li      r10,4
    0xc0000000000667d0 <xics_wake_cpu+16>:    ldx     r9,r9,r3
    0xc0000000000667d4 <xics_wake_cpu+20>:    sync
    0xc0000000000667d8 <xics_wake_cpu+24>:    stb r10,12(r9)
    0xc0000000000667dc <xics_wake_cpu+28>:    li      r9,1
    0xc0000000000667e0 <xics_wake_cpu+32>:    stb r9,652(r13)
    0xc0000000000667e4 <xics_wake_cpu+36>:    blr

(gdb) x /5i 0xc0000000000667d8
    0xc0000000000667d8 <xics_wake_cpu+24>:    stb r10,12(r9)

(gdb) l *0xc0000000000667d8
0xc0000000000667d8 is in xics_wake_cpu (./arch/powerpc/include/asm/io.h:169).
169    DEF_MMIO_OUT_D(out_8,   8, stb);

(gdb) l *(kvmppc_run_vcpu+0x5f8)
0x3be8 is in kvmppc_run_vcpu (arch/powerpc/kvm/book3s_hv.c:1631).
1626
1627
1628        vc->pcpu = smp_processor_id();
1629        list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
1630            kvmppc_start_thread(vcpu);
1631            kvmppc_create_dtl_entry(vcpu, vc);
1632        }
1633
1634        /* Set this explicitly in case thread 0 doesn't have a vcpu */
1635        get_paca()->kvm_hstate.kvm_vcore = vc;

So r9 is a NULL pointer (DAR is 0xc, which matches the 12(r9) store).
r9 is loaded from a table indexed by the CPU ID. r3 at the time of the
fault is 0x6020. If I read the rldicr correctly (a left shift by 3),
that means cpu_id was 3076, which is a ridiculous number. The system
only has 16 cores.
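
The arithmetic above can be double-checked with a small user-space
sketch. This is not kernel code; the helper names are made up for
illustration, and the constants are the GPR03 and DAR values from the
oops:

```c
#include <assert.h>
#include <stdint.h>

/* rldicr r3,r3,3,60 is the extended mnemonic sldi r3,r3,3: a left shift
 * by 3, i.e. cpu * 8, used to index an array of 8-byte entries. */
static uint64_t index_to_offset(uint64_t cpu)
{
    return cpu << 3;
}

/* Invert the shift to recover the cpu index from r3 at fault time. */
static uint64_t offset_to_index(uint64_t r3)
{
    return r3 >> 3;
}

/* stb r10,12(r9) faults at DAR = r9 + 12, so r9 = DAR - 12. */
static uint64_t base_from_dar(uint64_t dar)
{
    return dar - 12;
}

/* Check the arithmetic against the register values in the oops. */
static void check_oops_values(void)
{
    assert(offset_to_index(0x6020) == 3076); /* GPR03 */
    assert(index_to_offset(3076) == 0x6020); /* round trip */
    assert(base_from_dar(0xc) == 0);         /* DAR, so r9 was NULL */
}
```

Calling check_oops_values() from a test program passes silently, which
agrees with the reading that the index was 3076 and the loaded pointer
was NULL.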

So I guess something in this calculation in kvmppc_start_thread() went 
wrong:

           cpu = vc->pcpu + vcpu->arch.ptid;

Either one of the pointers (vc or vcpu) was incorrect, or the values
inside them were.
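
As a debugging aid, a range check on that sum could catch the bad value
before it is used as an index. The following is only a sketch under
assumptions: wake_target_valid() and its nr_cpus parameter are
hypothetical and do not exist in the kernel, and the check is written
as stand-alone C rather than against the real vc/vcpu structures:

```c
#include <stdbool.h>

/* Hypothetical helper, not kernel code: returns true if pcpu + ptid
 * names a CPU index below nr_cpus. */
static bool wake_target_valid(int pcpu, int ptid, int nr_cpus)
{
    if (pcpu < 0 || ptid < 0 || nr_cpus <= 0)
        return false;
    if (pcpu >= nr_cpus)
        return false;
    /* Compare against the remaining headroom so the sum cannot overflow. */
    return ptid < nr_cpus - pcpu;
}
```

With an assumed bound of 1024, wake_target_valid(4, 0, 1024) holds,
while any pcpu/ptid pair summing to 3076 is rejected. Such a check
would not explain where the bad value came from, only flag it earlier.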


Have you seen this before? Any idea what could be going wrong?

Alex
