From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alexander Graf <agraf@suse.de>
Subject: Re: [PATCH 00/33] KVM: PPC: Fix IRQ race in magic page code
Date: Mon, 28 Jul 2014 16:10:58 +0200
Message-ID: <53D659F1.1000301@suse.de>
References: <1403472217-22263-1-git-send-email-agraf@suse.de>	 <1403635989.26908.25.camel@snotra.buserror.net> <53A9FE8D.1060300@suse.de>	 <1403651745.2435.49.camel@snotra.buserror.net> <53AA0C78.60703@suse.de> <1403655689.2435.55.camel@snotra.buserror.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
To: Scott Wood <scottwood@freescale.com>
Return-path: <kvm-ppc-owner@vger.kernel.org>
In-Reply-To: <1403655689.2435.55.camel@snotra.buserror.net>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org


On 25.06.14 02:21, Scott Wood wrote:
> On Wed, 2014-06-25 at 01:40 +0200, Alexander Graf wrote:
>> On 25.06.14 01:15, Scott Wood wrote:
>>> On Wed, 2014-06-25 at 00:41 +0200, Alexander Graf wrote:
>>>> On 24.06.14 20:53, Scott Wood wrote:
>>>> The timer interrupt works, but I'm not fully convinced that it's a good
>>>> idea for things like MC events which we also block during critical
>>>> sections on e500v2.
>>> Are you concerned about the guest seeing machine checks that are (more)
>>> asynchronous with the error condition?  e500v2 machine checks are always
>>> asynchronous.  From the core manual:
>>>
>>>           Machine check interrupts are typically caused by a hardware or
>>>           memory subsystem failure or by an attempt to access an invalid
>>>           address. They may be caused indirectly by execution of an
>>>           instruction, but may not be recognized or reported until long
>>>           after the processor has executed past the instruction that
>>>           caused the machine check. As such, machine check interrupts are
>>>           not thought of as synchronous or asynchronous nor as precise or
>>>           imprecise.
>>>
>>> I don't think the lag would be a problem, and certainly it's better than
>>> the current situation.
>> So what value would you set the timer to? If the value is too small, we
>> never finish the critical section. If it's too big, we add lots of jitter.
> Maybe something like 100us?
>
> Single stepping would be better, though.
>
>>>> Single stepping is hard enough to get right on interaction between QEMU,
>>>> KVM and the guest. I didn't really want to make that stuff any more
>>>> complicated.
>>> I'm not sure that it would add much complexity.  We'd just need to check
>>> whether any source other than the magic page wants DBCR0_IC turned on,
>>> to determine whether to exit to userspace or not.
>> What if the guest is single stepping itself? How do we determine when to
>> unset the bit again? When we get out of the critical section? How do we
>> know what the value was before we set it?
> Keep track of each requester of single stepping separately, and only
> ever set the real bit by ORing them.
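
For reference, the bookkeeping described above could be sketched roughly as
below. This is only an illustration, not code from the KVM tree: the
requester IDs, the struct and all sstep_* helper names are made up, and the
DBCR0_IC value is an assumption that should be checked against reg_booke.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the instruction-complete debug enable bit in DBCR0. */
#define DBCR0_IC 0x08000000u

/* Hypothetical requester IDs, one bit per party that may want stepping. */
#define SSTEP_REQ_USERSPACE  (1u << 0) /* debugger in userspace, e.g. QEMU */
#define SSTEP_REQ_GUEST      (1u << 1) /* guest enabled it in its own DBCR0 */
#define SSTEP_REQ_MAGIC_PAGE (1u << 2) /* host stepping a critical section */

/* Hypothetical per-vcpu state: who currently wants single stepping. */
struct sstep_state {
	uint32_t requesters;
};

static void sstep_request(struct sstep_state *s, uint32_t who)
{
	s->requesters |= who;
}

static void sstep_release(struct sstep_state *s, uint32_t who)
{
	s->requesters &= ~who;
}

/*
 * Derive the value for the real register. The hardware bit is never
 * toggled directly; it is always recomputed from the set of requesters,
 * so there is no "previous value" to remember when one of them goes away.
 */
static uint32_t sstep_apply(const struct sstep_state *s, uint32_t dbcr0)
{
	dbcr0 &= ~DBCR0_IC;
	if (s->requesters)
		dbcr0 |= DBCR0_IC;
	return dbcr0;
}

/* Only a step that userspace asked for needs an exit to userspace. */
static bool sstep_exit_to_userspace(const struct sstep_state *s)
{
	return (s->requesters & SSTEP_REQ_USERSPACE) != 0;
}
```

With this layout, leaving the critical section is just
sstep_release(&s, SSTEP_REQ_MAGIC_PAGE) followed by sstep_apply(); a guest
that is single stepping itself keeps its own request bit, so the real bit
stays set for it.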

Considering that Paul has started working on integrating the in-kernel
emulator with KVM, I think we're best off just waiting for that and
then using it :).


Alex