Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time

Linux KVM/arm64 development list
 help / color / mirror / Atom feed

From: Mario Smarduch <m.smarduch@samsung.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: kvm@vger.kernel.org, Marc Zyngier <marc.zyngier@arm.com>,
	borntraeger@de.ibm.com, Paolo Bonzini <pbonzini@redhat.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time
Date: Mon, 01 Jun 2015 08:48:22 -0700	[thread overview]
Message-ID: <556C7EC6.3010105@samsung.com> (raw)
In-Reply-To: <20150531065923.GB763@cbox>

On 05/30/2015 11:59 PM, Christoffer Dall wrote:
> Hi Mario,
> 
> On Fri, May 29, 2015 at 03:34:47PM -0700, Mario Smarduch wrote:
>> On 05/28/2015 11:49 AM, Christoffer Dall wrote:
>>> Until now we have been calling kvm_guest_exit after re-enabling
>>> interrupts when we come back from the guest, but this has the
>>> unfortunate effect that CPU time accounting done in the context of timer
>>> interrupts occurring while the guest is running doesn't properly notice
>>> that the time since the last tick was spent in the guest.
>>>
>>> Inspired by the comment in the x86 code, move the kvm_guest_exit() call
>>> below the local_irq_enable() call and change __kvm_guest_exit() to
>>> kvm_guest_exit(), because we are now calling this function with
>>> interrupts enabled.  We have to now explicitly disable preemption and
>>> not enable preemption before we've called kvm_guest_exit(), since
>>> otherwise we could be preempted and everything happening before we
>>> eventually get scheduled again would be accounted for as guest time.
>>>
>>> At the same time, move the trace_kvm_exit() call outside of the atomic
>>> section, since there is no reason for us to do that with interrupts
>>> disabled.
>>>
>>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
>>> ---
>>> This patch is based on kvm/queue, because it has the kvm_guest_enter/exit
>>> rework recently posted by Christian Borntraeger.  I hope I got the logic
>>> of this right, there were 2 slightly worrying facts about this:
>>>
>>> First, we now enable and disable and enable interrupts on each exit
>>> path, but I couldn't see any performance overhead on hackbench - yes the
>>> only benchmark we care about.
>>>
>>> Second, looking at the ppc and mips code, they seem to also call
>>> kvm_guest_exit() before enabling interrupts, so I don't understand how
>>> guest CPU time accounting works on those architectures.
>>>
>>> Changes since v1:
>>>  - Tweak comment and commit text based on Marc's feedback.
>>>  - Explicitly disable preemption and enable it only after kvm_guest_exit().
>>>
>>>  arch/arm/kvm/arm.c | 21 +++++++++++++++++----
>>>  1 file changed, 17 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index e41cb11..fe8028d 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -532,6 +532,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>  		kvm_vgic_flush_hwstate(vcpu);
>>>  		kvm_timer_flush_hwstate(vcpu);
>>>  
>>> +		preempt_disable();
>>>  		local_irq_disable();
>>>  
>>>  		/*
>>> @@ -544,6 +545,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>  
>>>  		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
>>>  			local_irq_enable();
>>> +			preempt_enable();
>>>  			kvm_timer_sync_hwstate(vcpu);
>>>  			kvm_vgic_sync_hwstate(vcpu);
>>>  			continue;
>>> @@ -559,8 +561,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>  		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>>>  
>>>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>>> -		__kvm_guest_exit();
>>> -		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
>>> +		/*
>>> +		 * Back from guest
>>> +		 *************************************************************/
>>> +
>>>  		/*
>>>  		 * We may have taken a host interrupt in HYP mode (ie
>>>  		 * while executing the guest). This interrupt is still
>>> @@ -574,8 +578,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>  		local_irq_enable();
>>>  
>>>  		/*
>>> -		 * Back from guest
>>> -		 *************************************************************/
>>> +		 * We do local_irq_enable() before calling kvm_guest_exit() so
>>> +		 * that if a timer interrupt hits while running the guest we
>>> +		 * account that tick as being spent in the guest.  We enable
>>> +		 * preemption after calling kvm_guest_exit() so that if we get
>>> +		 * preempted we make sure ticks after that is not counted as
>>> +		 * guest time.
>>> +		 */
>>> +		kvm_guest_exit();
>>> +		trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
>>> +		preempt_enable();
>>> +
>>>  
>>>  		kvm_timer_sync_hwstate(vcpu);
>>>  		kvm_vgic_sync_hwstate(vcpu);
>>>
>>
>> Hi Christoffer,
>>  so currently we take a snap shot when we enter the guest
>> (tsk->vtime_snap) and upon exit add the time we spent in
>> the guest and update accrued time, which appears correct.
> 
> not on arm64, because we don't select HAVE_VIRT_CPU_ACCOUNTING_GEN.  Or
> am I missing something obvious here?
I see what you mean we can't use cycle based accounting to accrue
Guest time.

> 
>>
>> With this patch it appears that interrupts running
>> in host mode are accrued to Guest time, and additional preemption
>> latency is added.
>>
> It is true that interrupt processing in host mode (if they hit on a CPU
> when it is running a guest) are accrued to guest time, but without this
> patch on current arm64 we accrue no CPU time to guest time at all, which
> is hardly more correct.
Yes if only ticks are supported then it's definitely better!

Nevertheless with high interrupt rate it will complicate root causing
issues, a lot of that time will go to guest.

> 
> If this patch is incorrect, then how does it work on x86, where
> handle_external_intr() is called (with a barrier in between) before
> kvm_guest_exit(), and where handle_external_intr() is simply
> local_irq_enable() on SVM and something more complicated on VMX ?
> 
> Finally, how exactly is preemption latency added here?   Won't IRQ
> processing run with higher priority than any task on your system, so the
> order of (1) process pending IRQs (2) call schedule if needed is still
> preserved here, but we call kvm_guest_exit() between (1) and (2) instead
> of before (1).

I may be missing something, but on return from interrupt with preempt
disabled we can't take the need resched path. And need to return
to KVM no?
> 
> It is entirely possible that I'm missing the mark here and everything
> gets solved by enabling HAVE_VIRT_CPU_ACCOUNTING_GEN or we need some
> extra logic?

I think something to look into for us, taking a low issue to high level
application - for state machine based type of applications (I guess like
NFV) it cause problems  to root cause issues, a lot of activities
run between ticks. For VM cycle based accounting is probably a must
in that case.

> 
> Thanks,
> -Christoffer
>

next prev parent reply	other threads:[~2015-06-01 15:38 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-28 18:49 [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time Christoffer Dall
2015-05-29 22:34 ` Mario Smarduch
2015-05-31  6:59   ` Christoffer Dall
2015-06-01 15:48     ` Mario Smarduch [this message]
2015-06-02  9:27       ` Christoffer Dall
2015-06-02 11:55         ` Christoffer Dall
2015-06-05 12:24         ` Mario Smarduch
2015-06-08 11:35           ` Christoffer Dall
2015-06-09 23:04             ` Mario Smarduch
2015-06-01  7:47 ` Christian Borntraeger
2015-06-01  9:08   ` Christoffer Dall
2015-06-01  9:21     ` Christian Borntraeger
2015-06-01 13:35       ` Christoffer Dall
2015-06-01 13:37         ` Christian Borntraeger
2015-06-02  9:28           ` Christoffer Dall
2015-06-01 11:34   ` Paolo Bonzini
2015-06-01 11:42     ` Christian Borntraeger
2015-06-01 11:52       ` Paolo Bonzini
2015-06-08 17:50 ` Marc Zyngier
2015-06-09 14:43   ` Christoffer Dall
2015-06-09 16:39     ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=556C7EC6.3010105@samsung.com \
    --to=m.smarduch@samsung.com \
    --cc=borntraeger@de.ibm.com \
    --cc=christoffer.dall@linaro.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox