Linux KVM/arm64 development list
From: Shanker Donthineni <shankerd@codeaurora.org>
To: Marc Zyngier <marc.zyngier@arm.com>, kvmarm@lists.cs.columbia.edu
Subject: Re: Intermittent guest kernel crashes with v4.5-rc6.
Date: Wed, 2 Mar 2016 08:59:42 -0600
Message-ID: <56D6FFDE.9050704@codeaurora.org>
In-Reply-To: <56D6F5CC.5020101@arm.com>

Hi Marc,

Thanks for your quick reply.

On 03/02/2016 08:16 AM, Marc Zyngier wrote:
> On 02/03/16 13:56, Shanker Donthineni wrote:
>> For some reason v4.5-rc6 kernel is not stable for guest machines on
>> Qualcomm server platforms.
>> We are getting IABT translation faults while booting the guest kernel.
>> The problem disappears with
>> the following code snippet (insert "dsb ish" instruction just before
>> switching to EL1 guest). I am
>> using v4.5-rc6 kernel for both host and guest machines.
>>
>> Please let me know if you have any thoughts or ideas for tracing this
>> problem.
>>
>> --- a/arch/arm64/kvm/hyp/entry.S
>> +++ b/arch/arm64/kvm/hyp/entry.S
>> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>>           ldp     x0, x1, [sp], #16
>>
>>           // Do not touch any register after this!
>> +       dsb ish
>>           eret
>>    ENDPROC(__guest_enter)
>>
>>
>> Using below QEMU command for launching guest machine:
>>
>> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3  \
>> -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>> -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>> -append 'earlycon=earlycon=pl011,0x09000000  \
>> console=ttyAMA0,115200 root=/dev/ram'
>>
>>
>> Guest machine crash log messages:
>>
>> [    0.000000] Booting Linux on physical CPU 0x0
>> [    0.000000] Boot CPU: AArch64 Processor [510f2811]
>> [    0.000000] Bad mode in Synchronous Abort handler detected, code
>> 0x8600000f -- IABT (current EL)
>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
>> [    0.000000] task: ffffffc000d52200 ti: ffffffc000d44000 task.ti:
>> ffffffc000d44000
>> [    0.000000] PC is at early_init_dt_scan_root+0x28/0x94
>> [    0.000000] LR is at of_scan_flat_dt+0x9c/0xd0
>> [    0.000000] pc : [<ffffffc000cb32e8>] lr : [<ffffffc000cb3248>]
>> pstate: 800003c5
>> [    0.000000] sp : ffffffc000d47e80
>> [    0.000000] x29: ffffffc000d47e80 x28: 0000000000000000
>>
> If you're getting a prefetch abort, it would be interesting to find out
> what instruction is there, whether the page is mapped at stage-2 or not,
> what are the stage-2 permissions... Basically, a full description of the
> memory state.
>
> Also, does it work if you do a "dsb ishst" instead?
>
> Thanks,
>
> 	M.

Most of the time it faults at ldr/str instructions. After the IABT I
verified the stage-1 page and the corresponding stage-2 page attributes
(SH, AP, PERM), the PA, etc., and everything matches perfectly. I am
very confident that the stage-1/stage-2 MMU page tables are correct.
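
For anyone cross-checking the syndrome value in the quoted crash log
(0x8600000f), the sketch below splits an ESR_ELx value into its EC, IL
and fault status code fields. It is only an illustration, assuming the
standard ARMv8-A ESR_ELx layout; the helper names (esr_ec, esr_il,
esr_fsc, fsc_type, fsc_level, esr_describe) are made up for this
example and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* ESR_ELx layout (ARMv8-A): EC in bits [31:26], IL in bit 25,
 * ISS in bits [24:0]. For aborts, ISS[5:0] is the fault status code. */
#define ESR_EC_SHIFT 26
#define ESR_EC_MASK  0x3fu
#define ESR_IL_SHIFT 25
#define ESR_FSC_MASK 0x3fu

/* Hypothetical names for the four fault classes encoded as 0b00TTLL. */
enum fsc_type {
    FSC_ADDRESS_SIZE = 0,
    FSC_TRANSLATION  = 1,
    FSC_ACCESS_FLAG  = 2,
    FSC_PERMISSION   = 3,
    FSC_OTHER        = 4
};

static inline unsigned esr_ec(uint32_t esr)
{
    return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

static inline unsigned esr_il(uint32_t esr)
{
    return (esr >> ESR_IL_SHIFT) & 1u;
}

static inline unsigned esr_fsc(uint32_t esr)
{
    return esr & ESR_FSC_MASK;
}

/* Classify a fault status code; codes above 0x0f are not decoded here. */
static inline enum fsc_type fsc_type(unsigned fsc)
{
    if (fsc > 0x0fu)
        return FSC_OTHER;
    return (enum fsc_type)(fsc >> 2);
}

/* Lookup level, only meaningful when fsc_type() is not FSC_OTHER. */
static inline unsigned fsc_level(unsigned fsc)
{
    return fsc & 3u;
}

/* Format the decoded fields into buf; returns snprintf()'s result. */
static inline int esr_describe(uint32_t esr, char *buf, size_t len)
{
    unsigned fsc = esr_fsc(esr);

    assert(buf != NULL);
    return snprintf(buf, len, "EC=0x%02x IL=%u FSC=0x%02x type=%d level=%u",
                    esr_ec(esr), esr_il(esr), fsc,
                    (int)fsc_type(fsc), fsc_level(fsc));
}
```

Calling esr_describe(0x8600000f, buf, sizeof(buf)) from a small test
program and printing buf is enough to compare the fields against the
kernel's own "IABT (current EL)" message.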

Using "dsb ishst" instead also fixes the problem.

One more interesting observation: if I retry the instruction fetch that
caused the IABT, the second fetch succeeds and I don't see the IABT. I
used the experimental code below to test this.

--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -346,6 +346,7 @@ el1_sync:
         b.eq    el1_undef
         cmp     x24, #ESR_ELx_EC_BREAKPT_CUR    // debug exception in EL1
         b.ge    el1_dbg
+       kernel_exit 1
         b       el1_inv
  el1_da:


-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

