From: Marc Zyngier <marc.zyngier@arm.com>
To: Shanker Donthineni <shankerd@codeaurora.org>,
kvmarm@lists.cs.columbia.edu
Subject: Re: Intermittent guest kernel crashes with v4.5-rc6.
Date: Wed, 2 Mar 2016 15:09:48 +0000
Message-ID: <56D7023C.7050309@arm.com>
In-Reply-To: <56D6FFDE.9050704@codeaurora.org>

On 02/03/16 14:59, Shanker Donthineni wrote:
> Hi Marc,
>
> Thanks for your quick reply.
>
> On 03/02/2016 08:16 AM, Marc Zyngier wrote:
>> On 02/03/16 13:56, Shanker Donthineni wrote:
>>> For some reason v4.5-rc6 kernel is not stable for guest machines on
>>> Qualcomm server platforms.
>>> We are getting IABT translation faults while booting the guest kernel.
>>> The problem disappears with
>>> the following code snippet (insert "dsb ish" instruction just before
>>> switching to EL1 guest). I am
>>> using v4.5-rc6 kernel for both host and guest machines.
>>>
>>> Please let me know if you have any thoughts or ideas for tracing this
>>> problem.
>>>
>>> --- a/arch/arm64/kvm/hyp/entry.S
>>> +++ b/arch/arm64/kvm/hyp/entry.S
>>> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>>> ldp x0, x1, [sp], #16
>>>
>>> // Do not touch any register after this!
>>> + dsb ish
>>> eret
>>> ENDPROC(__guest_enter)
>>>
>>>
>>> Using below QEMU command for launching guest machine:
>>>
>>> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
>>> -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>>> -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>>> -append 'earlycon=earlycon=pl011,0x09000000 \
>>> console=ttyAMA0,115200 root=/dev/ram'
>>>
>>>
>>> Guest machine crash log messages:
>>>
>>> [ 0.000000] Booting Linux on physical CPU 0x0
>>> [ 0.000000] Boot CPU: AArch64 Processor [510f2811]
>>> [ 0.000000] Bad mode in Synchronous Abort handler detected, code
>>> 0x8600000f -- IABT (current EL)
>>> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
>>> [ 0.000000] task: ffffffc000d52200 ti: ffffffc000d44000 task.ti:
>>> ffffffc000d44000
>>> [ 0.000000] PC is at early_init_dt_scan_root+0x28/0x94
>>> [ 0.000000] LR is at of_scan_flat_dt+0x9c/0xd0
>>> [ 0.000000] pc : [<ffffffc000cb32e8>] lr : [<ffffffc000cb3248>]
>>> pstate: 800003c5
>>> [ 0.000000] sp : ffffffc000d47e80
>>> [ 0.000000] x29: ffffffc000d47e80 x28: 0000000000000000
>>>
>> If you're getting a prefetch abort, it would be interesting to find out
>> what instruction is there, whether the page is mapped at stage-2 or not,
>> what are the stage-2 permissions... Basically, a full description of the
>> memory state.
>>
>> Also, does it work if you do a "dsb ishst" instead?
>>
>> Thanks,
>>
>> M.
>
> Most of the time it faults at ldr/str instructions. I have verified
> the stage-1 page and the corresponding stage-2 page attributes
> (SH, AP, PERM), PA etc. after the IABT, and everything matches
> perfectly. I am very confident that the stage-1/stage-2 MMU page
> tables are correct.
>
> The "dsb ishst" instruction also fixes the problem.
>
> One more interesting observation: if I retry the instruction fetch
> that caused the IABT, the second fetch is successful and I don't see
> an IABT. I used the experimental code below to test.
>
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -346,6 +346,7 @@ el1_sync:
> b.eq el1_undef
> cmp x24, #ESR_ELx_EC_BREAKPT_CUR // debug exception in EL1
> b.ge el1_dbg
> + kernel_exit 1
> b el1_inv
> el1_da:
>
>
OK, that's pretty scary, especially considering that we don't have a DSB
on that path. Do you ever see it exploding at EL0?

Thanks,

M.
--
Jazz is not dead. It just smells funny...
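
Editorial aside, not part of the original message: the crash log quoted above reports "code 0x8600000f -- IABT (current EL)". As a reading aid, here is a minimal sketch of how that value could be split into its fields, assuming it is a raw ESR_EL1 value and assuming the ARMv8-A layout (EC in bits [31:26], IL in bit 25, and the fault status code in ISS bits [5:0] for aborts). The function name `decode_esr` and the lookup tables are hypothetical helpers written for this note; they are not kernel code.

```python
# Hypothetical helper for reading the "code 0x..." value in the crash log.
# Field positions are assumed from the ARMv8-A ESR_ELx layout.

EC_SHIFT = 26          # exception class lives in bits [31:26]
EC_MASK = 0x3F
IL_BIT = 1 << 25       # instruction length bit
FSC_MASK = 0x3F        # fault status code, ISS bits [5:0] (aborts only)

# Only the abort classes relevant to this thread are listed.
EC_NAMES = {
    0x20: "IABT (lower EL)",
    0x21: "IABT (current EL)",
    0x24: "DABT (lower EL)",
    0x25: "DABT (current EL)",
}

# Upper four bits of the fault status code select the fault type;
# the lower two bits give the translation table level.
FAULT_TYPES = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Split a 32-bit ESR value into EC, IL and (for aborts) fault info."""
    ec = (esr >> EC_SHIFT) & EC_MASK
    result = {
        "ec": ec,
        "class": EC_NAMES.get(ec, "other"),
        "il": bool(esr & IL_BIT),
        "fault": None,
        "level": None,
    }
    if ec in EC_NAMES:
        fsc = esr & FSC_MASK
        fault = FAULT_TYPES.get(fsc >> 2)
        if fault is not None:
            result["fault"] = fault
            result["level"] = fsc & 0x3
    return result


if __name__ == "__main__":
    print(decode_esr(0x8600000F))
```

Under these assumptions, 0x8600000f decodes to EC 0x21 (instruction abort taken at the current EL) with a fault status of 0b001111, which this table reads as a level 3 permission fault rather than a translation fault. That may be worth keeping in mind when comparing against the stage-1 and stage-2 permissions discussed in the thread.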