From: Alexander Graf <agraf@suse.de>
To: kvm-ppc@vger.kernel.org
Subject: KVM HV crash
Date: Mon, 14 Jul 2014 11:24:07 +0000
Message-ID: <53C3BDD7.4020609@suse.de>
Hi Paul,
I've just seen a crash on POWER7 in HV KVM on a host where I run HV and
PR KVM VMs in parallel based on the latest code (linus/master merged
with for-3.16 merged with kvm-ppc-queue plus some PR patches):
Unable to handle kernel paging request for data at address 0x0000000c
Faulting instruction address: 0xc0000000000667d8
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA PowerNV
Modules linked in: kvm_hv kvm_pr kvm
CPU: 4 PID: 4973 Comm: qemu-system-ppc Not tainted 3.16.0-rc4-1.1-default+ #84
task: c000000f170addb0 ti: c000000f17154000 task.ti: c000000f17154000
NIP: c0000000000667d8 LR: d000000007d61ee8 CTR: c0000000000667c0
REGS: c000000f171574a0 TRAP: 0300 Not tainted (3.16.0-rc4-1.1-default+)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 24002428 XER: 00000000
CFAR: c000000000009358 DAR: 000000000000000c DSISR: 42000000 SOFTE: 1
GPR00: d000000007d63be8 c000000f17157720 c0000000018c47d8 0000000000006020
GPR04: 0000000000000000 0000000000000001 c00000000190fb18 0000000000000000
GPR08: 0000000000000c00 0000000000000000 0000000000000004 d000000007d6a420
GPR12: c0000000000667c0 c00000000fdc2400 d000000007d73ef8 0000000000000007
GPR16: c000000f17157858 c000000f19db1000 c000000001948f88 d000000007d73ef8
GPR20: d000000007d73ef8 0000000000000004 0000000000000004 c000000001948584
GPR24: c000000f171577d0 c000000f19db0ff8 c00000000fdc3f00 c000000f15941da0
GPR28: c000000f15941da0 c000000f19db0fc0 c000000f19db0fc0 c000000f15941da0
NIP [c0000000000667d8] .xics_wake_cpu+0x18/0x30
LR [d000000007d61ee8] .kvmppc_start_thread+0x78/0xd0 [kvm_hv]
Call Trace:
[c000000f17157720] [c000000f171577d0] 0xc000000f171577d0 (unreliable)
[c000000f171577a0] [d000000007d63be8] .kvmppc_run_vcpu+0x5f8/0xef0 [kvm_hv]
[c000000f17157920] [d000000007d650c4] .kvmppc_vcpu_run_hv+0x344/0x740 [kvm_hv]
[c000000f171579f0] [d0000000079bd7cc] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
[c000000f17157a60] [d0000000079ba154] .kvm_arch_vcpu_ioctl_run+0x54/0x1a0 [kvm]
[c000000f17157af0] [d0000000079b4b58] .kvm_vcpu_ioctl+0x4b8/0x740 [kvm]
[c000000f17157cb0] [c000000000278858] .do_vfs_ioctl+0xa8/0x710
[c000000f17157d90] [c000000000278f84] .SyS_ioctl+0xc4/0xe0
[c000000f17157e30] [c00000000000a260] syscall_exit+0x0/0xa0
Instruction dump:
7c0004ac 91490004 39200001 992d028c 4e800020 60000000 3d22001a 78631f24
3929a788 39400004 7d29182a 7c0004ac <9949000c> 39200001 992d028c 4e800020
---[ end trace c05a9dd08ee8f5fc ]---
(gdb) x /10i xics_wake_cpu
0xc0000000000667c0 <xics_wake_cpu>: addis r9,r2,26
0xc0000000000667c4 <xics_wake_cpu+4>: rldicr r3,r3,3,60
0xc0000000000667c8 <xics_wake_cpu+8>: addi r9,r9,-22648
0xc0000000000667cc <xics_wake_cpu+12>: li r10,4
0xc0000000000667d0 <xics_wake_cpu+16>: ldx r9,r9,r3
0xc0000000000667d4 <xics_wake_cpu+20>: sync
0xc0000000000667d8 <xics_wake_cpu+24>: stb r10,12(r9)
0xc0000000000667dc <xics_wake_cpu+28>: li r9,1
0xc0000000000667e0 <xics_wake_cpu+32>: stb r9,652(r13)
0xc0000000000667e4 <xics_wake_cpu+36>: blr
(gdb) x /5i 0xc0000000000667d8
0xc0000000000667d8 <xics_wake_cpu+24>: stb r10,12(r9)
(gdb) l *0xc0000000000667d8
0xc0000000000667d8 is in xics_wake_cpu
(./arch/powerpc/include/asm/io.h:169).
169 DEF_MMIO_OUT_D(out_8, 8, stb);
(gdb) l *(kvmppc_run_vcpu+0x5f8)
0x3be8 is in kvmppc_run_vcpu (arch/powerpc/kvm/book3s_hv.c:1631).
1626
1627
1628 vc->pcpu = smp_processor_id();
1629 list_for_each_entry(vcpu, &vc->runnable_threads,
arch.run_list) {
1630 kvmppc_start_thread(vcpu);
1631 kvmppc_create_dtl_entry(vcpu, vc);
1632 }
1633
1634 /* Set this explicitly in case thread 0 doesn't have a vcpu */
1635 get_paca()->kvm_hstate.kvm_vcore = vc;
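So r9 is a NULL pointer: the stb to 12(r9) faults at address 0xc, which
matches DAR. r9 is loaded from a table indexed by the CPU ID. r3 at the
time of the fault is 0x6020, which is the CPU ID after the rldicr has
shifted it left by 3. If I read the rldicr correctly, that means cpu_id
was 3076, which is a ridiculous number. The system only has 16 cores.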
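The arithmetic above can be double-checked with a small stand-alone C
sketch. It is not kernel code; the helper names are made up for
illustration, and it assumes that rldicr r3,r3,3,60 acts as a plain
64-bit left shift by 3 (no high bits are lost for values this small):

```c
#include <assert.h>
#include <stdint.h>

/* rldicr rA,rS,3,60 rotates left by 3 and clears the low 3 bits,
 * which is a 64-bit left shift by 3: the index is scaled by 8,
 * the size of a pointer on ppc64. */
static uint64_t scale_index(uint64_t index)
{
	return index << 3;
}

/* Undo the scaling to recover the index from the post-shift r3 value.
 * Only valid while no high bits were shifted out. */
static uint64_t index_from_scaled(uint64_t scaled)
{
	return scaled >> 3;
}

/* Effective address of "stb r10,12(r9)" for a given base in r9. */
static uint64_t stb_target(uint64_t base)
{
	return base + 12;
}

/* Check the values from the oops: r3 = 0x6020, DAR = 0xc. */
static void check_oops_values(void)
{
	assert(index_from_scaled(0x6020) == 3076);
	assert(scale_index(3076) == 0x6020);
	assert(stb_target(0) == 0xc);	/* r9 == NULL gives DAR 0xc */
}
```

Calling check_oops_values() from a test program passes: 0x6020 >> 3 is
0xc04, i.e. 3076, and a NULL base plus the 12-byte offset gives the
faulting address 0xc.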
So I guess something in this calculation in kvmppc_start_thread() went
wrong:
cpu = vc->pcpu + vcpu->arch.ptid;
Either one of the pointers (vc or vcpu) was incorrect, or the values
inside them were.
Have you seen this before? Any idea what could be going wrong?
Alex