From: Shanker Donthineni
Subject: Re: Intermittent guest kernel crashes with v4.5-rc6.
Date: Wed, 2 Mar 2016 08:59:42 -0600
Message-ID: <56D6FFDE.9050704@codeaurora.org>
References: <56D6F113.9020605@codeaurora.org> <56D6F5CC.5020101@arm.com>
In-Reply-To: <56D6F5CC.5020101@arm.com>
To: Marc Zyngier , kvmarm@lists.cs.columbia.edu
List-Id: kvmarm@lists.cs.columbia.edu

Hi Marc,

Thanks for your quick reply.

On 03/02/2016 08:16 AM, Marc Zyngier wrote:
> On 02/03/16 13:56, Shanker Donthineni wrote:
>> For some reason v4.5-rc6 kernel is not stable for guest machines on
>> Qualcomm server platforms.
>> We are getting IABT translation faults while booting the guest kernel.
>> The problem disappears with
>> the following code snippet (insert "dsb ish" instruction just before
>> switching to EL1 guest). I am
>> using v4.5-rc6 kernel for both host and guest machines.
>>
>> Please let me know if you have any thoughts or ideas for tracing this
>> problem.
>>
>> --- a/arch/arm64/kvm/hyp/entry.S
>> +++ b/arch/arm64/kvm/hyp/entry.S
>> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>>         ldp     x0, x1, [sp], #16
>>
>>         // Do not touch any register after this!
>> +       dsb ish
>>         eret
>>  ENDPROC(__guest_enter)
>>
>>
>> Using below QEMU command for launching guest machine:
>>
>> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
>>     -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>>     -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>>     -append 'earlycon=earlycon=pl011,0x09000000 \
>>     console=ttyAMA0,115200 root=/dev/ram'
>>
>>
>> Guest machine crash log messages:
>>
>> [    0.000000] Booting Linux on physical CPU 0x0
>> [    0.000000] Boot CPU: AArch64 Processor [510f2811]
>> [    0.000000] Bad mode in Synchronous Abort handler detected, code
>> 0x8600000f -- IABT (current EL)
>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
>> [    0.000000] task: ffffffc000d52200 ti: ffffffc000d44000 task.ti:
>> ffffffc000d44000
>> [    0.000000] PC is at early_init_dt_scan_root+0x28/0x94
>> [    0.000000] LR is at of_scan_flat_dt+0x9c/0xd0
>> [    0.000000] pc : [] lr : []
>> pstate: 800003c5
>> [    0.000000] sp : ffffffc000d47e80
>> [    0.000000] x29: ffffffc000d47e80 x28: 0000000000000000
>>
> If you're getting a prefetch abort, it would be interesting to find out
> what instruction is there, whether the page is mapped at stage-2 or not,
> what are the stage-2 permissions... Basically, a full description of the
> memory state.
>
> Also, does it work if you do a "dsb ishst" instead?
>
> Thanks,
>
> M.

Most of the time it faults at ldr/str instructions. I have verified the
stage-1 page and the corresponding stage-2 page attributes (SH, AP, PERM),
PA, etc. after the IABT, and everything matches perfectly. I am very
confident that the stage-1/stage-2 MMU page tables are correct.

The "dsb ishst" instruction also fixes the problem.

One more interesting observation: if I retry the instruction fetch that
caused the IABT, the second fetch succeeds and I don't see the IABT. I
used the experimental code below to test.

--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -346,6 +346,7 @@ el1_sync:
        b.eq    el1_undef
        cmp     x24, #ESR_ELx_EC_BREAKPT_CUR    // debug exception in EL1
        b.ge    el1_dbg
+       kernel_exit 1
        b       el1_inv
 el1_da:

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
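
As a reading aid for the quoted crash log: the code 0x8600000f reported by the "Bad mode" handler is an ESR_ELx syndrome value, and it can be split into its architectural fields. The sketch below is only an illustration and is not part of the kernel or of this thread; the helper name decode_esr is hypothetical, and the only assumption is the standard ARMv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with the instruction fault status code in ISS[5:0]).

```python
# Minimal sketch: split an ESR_ELx value into its fields.
# decode_esr is a hypothetical helper, not a kernel or library function.
#
# ARMv8-A ESR_ELx layout assumed here:
#   EC   = bits [31:26]  exception class
#   IL   = bit  [25]     instruction length (1 = 32-bit instruction)
#   ISS  = bits [24:0]   instruction specific syndrome
#   IFSC = ISS[5:0]      fault status code (meaningful for instruction aborts)

def decode_esr(esr: int) -> dict:
    """Return the EC, IL, ISS and IFSC fields of a 32-bit ESR_ELx value."""
    return {
        "ec": (esr >> 26) & 0x3F,
        "il": (esr >> 25) & 0x1,
        "iss": esr & 0x1FFFFFF,
        "ifsc": esr & 0x3F,
    }


if __name__ == "__main__":
    # Syndrome value taken from the quoted guest crash log.
    for name, value in decode_esr(0x8600000F).items():
        print(f"{name:>4} = {value:#x}")
```

The EC field comes out as 0x21, the class for an instruction abort taken without a change in exception level, which agrees with the "IABT (current EL)" text in the log. The IFSC value can be looked up in the fault status code table of the ARM Architecture Reference Manual to identify the fault type and level.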