From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marc Zyngier
Subject: Re: Intermittent guest kernel crashes with v4.5-rc6.
Date: Wed, 2 Mar 2016 14:16:44 +0000
Message-ID: <56D6F5CC.5020101@arm.com>
References: <56D6F113.9020605@codeaurora.org>
In-Reply-To: <56D6F113.9020605@codeaurora.org>
To: Shanker Donthineni, kvmarm@lists.cs.columbia.edu
List-Id: kvmarm@lists.cs.columbia.edu

On 02/03/16 13:56, Shanker Donthineni wrote:
>
> For some reason v4.5-rc6 kernel is not stable for guest machines on
> Qualcomm server platforms. We are getting IABT translation faults
> while booting the guest kernel. The problem disappears with the
> following code snippet (insert "dsb ish" instruction just before
> switching to EL1 guest). I am using v4.5-rc6 kernel for both host
> and guest machines.
>
> Please let me know if you have any thoughts or ideas for tracing this
> problem.
>
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>         ldp     x0, x1, [sp], #16
>
>         // Do not touch any register after this!
> +       dsb ish
>         eret
>  ENDPROC(__guest_enter)
>
>
> Using below QEMU command for launching guest machine:
>
> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
>     -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>     -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>     -append 'earlycon=earlycon=pl011,0x09000000 \
>     console=ttyAMA0,115200 root=/dev/ram'
>
>
> Guest machine crash log messages:
>
> [    0.000000] Booting Linux on physical CPU 0x0
> [    0.000000] Boot CPU: AArch64 Processor [510f2811]
> [    0.000000] Bad mode in Synchronous Abort handler detected, code 0x8600000f -- IABT (current EL)
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
> [    0.000000] task: ffffffc000d52200 ti: ffffffc000d44000 task.ti: ffffffc000d44000
> [    0.000000] PC is at early_init_dt_scan_root+0x28/0x94
> [    0.000000] LR is at of_scan_flat_dt+0x9c/0xd0
> [    0.000000] pc : [] lr : [] pstate: 800003c5
> [    0.000000] sp : ffffffc000d47e80
> [    0.000000] x29: ffffffc000d47e80 x28: 0000000000000000
>

If you're getting a prefetch abort, it would be interesting to find out
what instruction is there, whether the page is mapped at stage-2 or not,
what are the stage-2 permissions... Basically, a full description of the
memory state.

Also, does it work if you do a "dsb ishst" instead?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
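
For reference, the syndrome value in the quoted crash log (code 0x8600000f) can be decoded by hand. Below is a minimal sketch in Python, assuming the ARMv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, and, for instruction aborts, IFSC in bits [5:0]). The helper name decode_iabt_esr and the table FAULT_CLASSES are made up for illustration; they are not a kernel or QEMU API, and only the most common fault classes are covered.

```python
# Hypothetical helper for illustration only; not part of the kernel or QEMU.
# Assumes the ARMv8-A ESR_ELx layout: EC = bits [31:26], IL = bit 25,
# IFSC = bits [5:0] when EC reports an instruction abort.

# Upper four bits of IFSC select the fault class; the lower two bits give
# the translation table level for these classes.
FAULT_CLASSES = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}

EXCEPTION_CLASSES = {
    0x20: "instruction abort from a lower exception level",
    0x21: "instruction abort without a change in exception level",
}


def decode_iabt_esr(esr):
    """Split an instruction-abort syndrome into its main fields."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    ifsc = esr & 0x3F
    fault_class = ifsc >> 2
    known = fault_class in FAULT_CLASSES
    return {
        "ec": ec,
        "ec_name": EXCEPTION_CLASSES.get(ec, "not an instruction abort"),
        "il": il,
        "ifsc": ifsc,
        "fault": FAULT_CLASSES[fault_class] if known else "other",
        # The level is only meaningful for the classes listed above.
        "level": (ifsc & 0x3) if known else None,
    }


if __name__ == "__main__":
    info = decode_iabt_esr(0x8600000F)
    print("EC   = 0x%02x (%s)" % (info["ec"], info["ec_name"]))
    print("IL   = %d" % info["il"])
    print("IFSC = 0x%02x (%s, level %s)" % (info["ifsc"], info["fault"], info["level"]))
```

With the value from the log, EC comes out as 0x21, which matches the "IABT (current EL)" text. IFSC is 0x0f, which under this encoding reads as a level 3 permission fault; a level 3 translation fault would be 0x07. That is worth checking against the architecture manual when collecting the description of the memory state.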