From: Christopher Covington <cov@codeaurora.org>
To: Shanker Donthineni <shankerd@codeaurora.org>,
Marc Zyngier <marc.zyngier@arm.com>,
kvmarm@lists.cs.columbia.edu
Subject: Re: Intermittent guest kernel crashes with v4.5-rc6.
Date: Mon, 18 Apr 2016 11:56:35 -0400
Message-ID: <571503B3.6060001@codeaurora.org>
In-Reply-To: <56DE48B6.4060705@codeaurora.org>
On 03/07/2016 10:36 PM, Shanker Donthineni wrote:
> On 03/03/2016 08:38 AM, Marc Zyngier wrote:
>> On 03/03/16 14:26, Shanker Donthineni wrote:
>>> On 03/03/2016 08:03 AM, Marc Zyngier wrote:
>>>> On 03/03/16 13:25, Shanker Donthineni wrote:
>>>>> On 03/02/2016 11:35 AM, Marc Zyngier wrote:
>>>>>> On 02/03/16 15:48, Shanker Donthineni wrote:
>>>>>>
>>>>>>> We haven't started running heavy workloads in VMs. So far we
>>>>>>> have noticed this random behavior only during guest
>>>>>>> kernel boot (at EL1).
>>>>>>>
>>>>>>> We didn't see this problem on 4.3 kernel. Do you think it is
>>>>>>> related to TLB conflicts?
>>>>>> I cannot imagine why a DSB would solve a TLB conflict. But the fact that
>>>>>> you didn't see it crashing on 4.3 is a good indication that something
>>>>>> else is at play.
>>>>>>
>>>>>> In 4.5, we've rewritten a large part of KVM in C, which has changed the
>>>>>> ordering of the various accesses a lot. It could be that a latent
>>>>>> problem is now exposed more widely.
>>>>>>
>>>>>> Can you try moving this DSB around and find out what is the earliest
>>>>>> point where it solves this problem? Some sort of bisection?
>>>>> The furthest up I can move 'dsb ishst' is to the beginning of
>>>>> __guest_enter(), but not outside of this function.
>>>>>
>>>>> I don't understand why it fails with the code below; the branch
>>>>> instruction seems to be causing problems.
>>>>>
>>>>> /* Jump in the fire! */
>>>>> + dsb(ishst);
>>>>> exit_code = __guest_enter(vcpu, host_ctxt);
>>>>> /* And we're baaack! */
>>>> That's very worrying. I can't see how the branch can have an influence
>>>> on the DSB (nor why the DSB has an influence on the rest of the
>>>> execution, btw).
>>>>
>>>> What if you replace the DSB with an ISB? Do you observe a similar
>>>> behaviour (works if the barrier is in __guest_enter, but not if it is
>>>> outside)?
>>> I have already tried with an isb, without success. In another
>>> experiment I flushed the stage-2 TLBs before calling __guest_enter(),
>>> and it fixed the problem.
>> I suspected something like that. But it is such a massive hammer that it
>> will hide any sort of subtle bug (HW *and* SW).
>>
>>>> Another thing worth looking at is what happened just before we decided
>>>> to get back into the guest. Or to put it differently, what was the
>>>> reason to exit in the first place. Was it a Stage-2 fault by any chance?
>>> I will collect as much debug data as possible and send you the
>>> results. I went through your refactored KVM 'C' code and did not
>>> find anything suspicious. I am thinking that maybe Qualcomm CPUs
>>> have very aggressive prefetch logic that is causing the problem.
>> OK. Please keep me posted about your findings. Also maybe involving some
>> HW people would be a good idea (running something in an emulator, for
>> example...).
This has been confirmed to be a hardware defect with a firmware workaround.
Regards,
Christopher Covington
--
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project