Linux KVM/arm64 development list
From: Marc Zyngier <marc.zyngier@arm.com>
To: Shanker Donthineni <shankerd@codeaurora.org>,
	kvmarm@lists.cs.columbia.edu
Subject: Re: Intermittent guest kernel crashes with v4.5-rc6.
Date: Thu, 3 Mar 2016 14:38:43 +0000
Message-ID: <56D84C73.3070305@arm.com>
In-Reply-To: <56D849AF.2040606@codeaurora.org>

On 03/03/16 14:26, Shanker Donthineni wrote:
> 
> 
> On 03/03/2016 08:03 AM, Marc Zyngier wrote:
>> On 03/03/16 13:25, Shanker Donthineni wrote:
>>>
>>> On 03/02/2016 11:35 AM, Marc Zyngier wrote:
>>>> On 02/03/16 15:48, Shanker Donthineni wrote:
>>>>
>>>>> We haven't started running heavy workloads in VMs. So far we
>>>>> have seen this intermittent behavior only during guest
>>>>> kernel boot (at EL1).
>>>>>
>>>>> We didn't see this problem on 4.3 kernel. Do you think it is
>>>>> related to TLB conflicts?
>>>> I cannot imagine why a DSB would solve a TLB conflict. But the fact that
>>>> you didn't see it crashing on 4.3 is a good indication that something
>>>> else is at play.
>>>>
>>>> In 4.5, we've rewritten a large part of KVM in C, which has changed the
>>>> ordering of the various accesses a lot. It could be that a latent
>>>> problem is now exposed more widely.
>>>>
>>>> Can you try moving this DSB around and find out what is the earliest
>>>> point where it solves this problem? Some sort of bisection?
>>> The furthest up I can move 'dsb ishst' is the beginning of
>>> __guest_enter(), but not outside of this function.
>>>
>>> I don't understand why it fails with the code below; the branch
>>> instruction seems to be causing problems.
>>>
>>>     /* Jump in the fire! */
>>> +  dsb(ishst);
>>>     exit_code = __guest_enter(vcpu, host_ctxt);
>>>     /* And we're baaack! */
>> That's very worrying. I can't see how the branch can have an influence
>> on the DSB (nor why the DSB has an influence on the rest of the
>> execution, btw).
>>
>> What if you replace the DSB with an ISB? Do you observe a similar
>> behaviour (works if the barrier is in __guest_enter, but not if it is
>> outside)?
> I have already tried with an ISB, without success. I did another
> experiment: flushing the stage-2 TLBs before calling __guest_enter()
> fixed the problem.

I suspected something like that. But it is such a massive hammer that it
will hide any sort of subtle bug (HW *and* SW).
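
To make the experiment above concrete, here is a minimal userspace sketch of
the ordering being described: a whole-VMID stage-2 invalidate immediately
before guest entry. Everything in it is a hypothetical stand-in (the stubs
only record which step ran; flush_stage2_tlb(), guest_enter_stub() and
run_guest_with_flush() are illustrative names, not the KVM functions), and
the barrier/TLBI sequence is an assumption about what such a flush typically
looks like, not the code that was actually tested.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins: each "instruction" just appends its name to a trace. */
static char trace[256];

static void record(const char *step)
{
	/* Guard the fixed-size trace buffer (step + ';' + NUL must fit). */
	assert(strlen(trace) + strlen(step) + 2 <= sizeof(trace));
	strcat(trace, step);
	strcat(trace, ";");
}

static void dsb_ishst(void)         { record("dsb ishst"); }
static void dsb_ish(void)           { record("dsb ish"); }
static void isb(void)               { record("isb"); }
static void tlbi_vmalls12e1is(void) { record("tlbi vmalls12e1is"); }

/*
 * Assumed shape of a whole-VMID invalidate: make prior page-table
 * stores visible, invalidate stage-1 and stage-2 entries for the
 * current VMID, wait for completion, then resynchronise the pipeline.
 */
static void flush_stage2_tlb(void)
{
	dsb_ishst();
	tlbi_vmalls12e1is();
	dsb_ish();
	isb();
}

/* Stand-in for the world switch into the guest; returns an exit code. */
static int guest_enter_stub(void)
{
	record("guest_enter");
	return 0;
}

/* The experiment: flush on every entry, then enter the guest. */
static int run_guest_with_flush(void)
{
	flush_stage2_tlb();
	return guest_enter_stub();
}
```

Note that the sequence already contains a DSB and an ISB on top of the
invalidate itself, which is part of why it is such a blunt instrument: it
changes ordering, timing and TLB contents all at once.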

> 
>> Another thing worth looking at is what happened just before we decided
>> to get back into the guest. Or to put it differently, what was the
>> reason to exit in the first place. Was it a Stage-2 fault by any chance?
> 
> I will collect as much debug data as possible and send you the
> results. I went through your refactored KVM C code and did not
> find anything suspicious. I am thinking that maybe Qualcomm CPUs
> have very aggressive prefetch logic that is causing the problem.

OK. Please keep me posted about your findings. Also maybe involving some
HW people would be a good idea (running something in an emulator, for
example...).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

Thread overview: 13+ messages
2016-03-02 13:56 Intermittent guest kernel crashes with v4.5-rc6 Shanker Donthineni
2016-03-02 14:16 ` Marc Zyngier
2016-03-02 14:59   ` Shanker Donthineni
2016-03-02 15:09     ` Marc Zyngier
2016-03-02 15:48       ` Shanker Donthineni
2016-03-02 17:35         ` Marc Zyngier
2016-03-03 13:25           ` Shanker Donthineni
2016-03-03 14:03             ` Marc Zyngier
2016-03-03 14:26               ` Shanker Donthineni
2016-03-03 14:38                 ` Marc Zyngier [this message]
     [not found]                   ` <56DE48B6.4060705@codeaurora.org>
2016-04-18 15:56                     ` Christopher Covington
2016-04-18 16:00                       ` Marc Zyngier
2016-03-02 14:48 ` Marc Zyngier
