Linux KVM/arm64 development list
From: Marc Zyngier <marc.zyngier@arm.com>
To: Andrew Jones <drjones@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	kvmarm@lists.cs.columbia.edu, andre.przywara@arm.com
Subject: Re: [PATCH 0/8] KVM/ARM: Guest Entry/Exit optimizations
Date: Wed, 10 Feb 2016 12:24:06 +0000
Message-ID: <56BB2BE6.8020600@arm.com>
In-Reply-To: <20160210120211.fjp2sxzivdhxug6p@hawk.localdomain>

On 10/02/16 12:02, Andrew Jones wrote:
> On Wed, Feb 10, 2016 at 08:34:21AM +0000, Marc Zyngier wrote:
>> On 09/02/16 20:59, Christoffer Dall wrote:
>>> On Mon, Feb 08, 2016 at 11:40:14AM +0000, Marc Zyngier wrote:
>>>> I've recently been looking at our entry/exit costs, and profiling
>>>> figures did show some very low-hanging fruit.
>>>>
>>>> The most obvious cost is that accessing the GIC HW is slow. As in
>>>> "deadly slow", especially when GICv2 is involved. So not hammering the
>>>> HW when there is nothing to write is immediately beneficial, as this
>>>> is the most common case (whatever people seem to think, interrupts
>>>> are a *rare* event).
>>>>
>>>> Another easy thing to fix is the way we handle trapped system
>>>> registers. We do insist on (mostly) sorting them, but we do perform a
>>>> linear search on trap. We can switch to a binary search for free, and
>>>> get immediate benefits (the PMU code, being extremely trap-happy,
>>>> benefits immediately from this).
>>>>
>>>> With these in place, I see an improvement of 20 to 30% (depending on
>>>> the platform) on our world-switch cycle count when running a set of
>>>> hand-crafted guests that are designed to only perform traps.
>>>
>>> I'm curious about the weight of these two?  My guess based on the
>>> measurement work I did is that the GIC is by far the worst sinner, but
>>> that was exacerbated on X-Gene compared to Seattle.
>>
>> Indeed, the GIC is the real pig. 80% of the benefit is provided by not
>> accessing it when not absolutely required. The sysreg access is only
>> visible for workloads that are extremely trap-happy, but that's what
>> happens as soon as you start exercising the PMU code.
>>
>>>>
>>>> Methodology:
>>>>
>>>> * NULL-hypercall guest: Perform 65536 PSCI_0_2_FN_PSCI_VERSION calls,
>>>> and then a power-off:
>>>>
>>>> __start:
>>>> 	mov	x19, #(1 << 16)
>>>> 1:	mov	x0, #0x84000000
>>>> 	hvc	#0
>>>> 	sub	x19, x19, #1
>>>> 	cbnz	x19, 1b
>>>> 	mov	x0, #0x84000000
>>>> 	add	x0, x0, #9
>>>> 	hvc	#0
>>>> 	b	.
>>>>
>>>> * sysreg trap guest: Perform 2^20 PMSELR_EL0 accesses, and power-off:
>>>>
>>>> __start:
>>>> 	mov	x19, #(1 << 20)
>>>> 1:	mrs	x0, PMSELR_EL0
>>>> 	sub	x19, x19, #1
>>>> 	cbnz	x19, 1b
>>>> 	mov	x0, #0x84000000
>>>> 	add	x0, x0, #9
>>>> 	hvc	#0
>>>> 	b	.
>>>>
>>>> * These guests are profiled using perf and kvmtool:
>>>>
>>>> taskset -c 1 perf stat -e cycles:kh lkvm run -c1 --kernel do_sysreg.bin 2>&1 >/dev/null| grep cycles
>>>
>>> these would be good to add to kvm-unit-tests so we can keep an eye on
>>> this sort of thing...
> 
> I can work on that. (Actually I already had put this on my TODO when I
> saw this series. Your interest in it just bumped it up in priority :-)

Ah! You're in charge, then! ;-)

>>
>> Yeah, I was thinking of that too. In the meantime, I've also created a
>> GICv2 self-IPI test case, which has led to further improvement (a 10%
>> reduction in the number of cycles on Seattle). The ugly thing about that
>> test is that it knows where kvmtool places the GIC (I didn't fancy
>> parsing the DT in assembly code). Hopefully there is a way to abstract this.
> 
> I have a simple IPI test written for kvm-unit-tests already[*], but it's
> been lying around for a while. I can dust it off and make a self-IPI
> test out of it yet today though. I've been hesitating to post any gic
> related stuff to kvm-unit-tests, because I know Andre has been looking
> into it (and he has the gic expertise to do it more cleanly than I). I'll
> go ahead and post my little thing now though, as he can always review it
> and/or clean it up later :-)
> 
> [*] https://github.com/rhdrjones/kvm-unit-tests/commit/05af9b0361ac5eab58f46e5451e585c9625c3b75

For the record, the test case I've been running is this:

__start:
	mov	x19, #(1 << 20)

	mov	x0, #0x3fff0000		// Dist
	mov	x1, #0x3ffd0000		// CPU
	mov	w2, #1
	str	w2, [x0]		// Enable Group0
	ldr	w2, =0xa0a0a0a0
	str	w2, [x0, #0x400]	// A0 priority for SGI0-3
	mov	w2, #0x0f
	str	w2, [x0, #0x100]	// Enable SGI0-3
	mov	w2, #0xf0
	str	w2, [x1, #4]		// PMR
	mov	w2, #1
	str	w2, [x1]		// Enable CPU interface
	
1:
	mov	w2, #(2 << 24)		// Interrupt self with SGI0
	str	w2, [x0, #0xf00]

2:	ldr	w2, [x1, #0x0c]		// GICC_IAR
	cmp	w2, #0x3ff
	b.ne	3f

	wfi
	b	2b

3:	str	w2, [x1, #0x10]		// EOI

	sub	x19, x19, #1
	cbnz	x19, 1b

// Die
	mov	x0, #0x84000000
	add	x0, x0, #9
	hvc	#0
	b	.

Feel free to adapt it so it fits in your framework if you find it useful
(but I guess you'll be inclined to rewrite it in C).
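
If C is the way you go, the loop could look roughly like the sketch
below. None of it comes from an existing framework: the helper names
(reg_write, reg_read, sgir_self, self_ipi_loop) are made up for
illustration, the offsets are the GICv2 ones used in the assembly
above, and real MMIO code would want proper accessors with barriers
rather than plain volatile stores.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 register offsets, the same ones the assembly uses. */
#define GICD_CTLR        0x000  /* distributor control */
#define GICD_ISENABLER0  0x100  /* set-enable for interrupt IDs 0-31 */
#define GICD_IPRIORITYR0 0x400  /* priorities for interrupt IDs 0-3 */
#define GICD_SGIR        0xf00  /* software generated interrupt */
#define GICC_CTLR        0x00   /* CPU interface control */
#define GICC_PMR         0x04   /* priority mask */
#define GICC_IAR         0x0c   /* interrupt acknowledge */
#define GICC_EOIR        0x10   /* end of interrupt */

#define GIC_SPURIOUS     0x3ff  /* IAR value when nothing is pending */

/* Hypothetical accessors: plain 32-bit volatile accesses, no barriers. */
static inline void reg_write(volatile uint32_t *base, uint32_t off, uint32_t val)
{
	base[off / 4] = val;
}

static inline uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
	return base[off / 4];
}

static inline void wait_for_interrupt(void)
{
#if defined(__aarch64__)
	__asm__ volatile("wfi");
#endif
	/* On other architectures this is a no-op, so the caller just polls. */
}

/* GICD_SGIR value targeting only the requesting CPU (filter 0b10). */
static inline uint32_t sgir_self(unsigned int sgi)
{
	assert(sgi < 16);
	return (2u << 24) | sgi;
}

/* Returns the number of interrupts acknowledged and EOIed. */
static uint32_t self_ipi_loop(volatile uint32_t *dist, volatile uint32_t *cpu,
			      uint32_t count)
{
	uint32_t handled = 0;

	reg_write(dist, GICD_CTLR, 1);			/* Enable Group0 */
	reg_write(dist, GICD_IPRIORITYR0, 0xa0a0a0a0);	/* A0 priority for SGI0-3 */
	reg_write(dist, GICD_ISENABLER0, 0x0f);		/* Enable SGI0-3 */
	reg_write(cpu, GICC_PMR, 0xf0);			/* PMR */
	reg_write(cpu, GICC_CTLR, 1);			/* Enable CPU interface */

	while (count--) {
		uint32_t iar;

		reg_write(dist, GICD_SGIR, sgir_self(0));	/* Interrupt self with SGI0 */

		/* Wait until the acknowledge register reports a real interrupt. */
		while ((iar = reg_read(cpu, GICC_IAR)) == GIC_SPURIOUS)
			wait_for_interrupt();

		reg_write(cpu, GICC_EOIR, iar);			/* EOI */
		handled++;
	}

	return handled;
}
```

With the kvmtool placement hardcoded in the assembly, the equivalent
call would pass 0x3fff0000 as dist, 0x3ffd0000 as cpu and 1 << 20 as
count. Taking the bases as parameters is only there so that whatever
finds the GIC (DT parsing, for instance) can live outside the loop.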

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
