kvm.vger.kernel.org archive mirror
* [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations
@ 2016-02-17 16:40 Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 01/17] arm64: KVM: Switch the sys_reg search to be a binary search Marc Zyngier
                   ` (17 more replies)
  0 siblings, 18 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

I've recently been looking at our entry/exit costs, and profiling
figures did show some very low hanging fruits.

The most obvious cost is that accessing the GIC HW is slow. As in
"deadly slow", especially when GICv2 is involved. So not hammering the
HW when there is nothing to write (or even to read) is immediately
beneficial, as this is the most common case (whatever people may
think, interrupts are a *rare* event). Similar work has also been done
for GICv3, with a reduced impact (it was less "bad" to start with).

Another easy thing to fix is the way we handle trapped system
registers. We insist on (mostly) sorting the tables, yet we still
perform a linear search on each trap. We can switch to a binary
search for free, and get an immediate benefit (the PMU code, being
extremely trap-happy, profits from this right away).

With these in place, I see an improvement of 10 to 40% (depending on
the platform) on our world-switch cycle count when running a set of
hand-crafted guests that are designed to only perform traps.

Please note that VM exits are actually a rare event on ARM, so don't
expect your guest to be 40% faster overall; in practice this will
hardly make a noticeable difference.

Methodology:

* NULL-hypercall guest: Perform 2^20 PSCI_0_2_FN_PSCI_VERSION calls,
and then a power-off:

__start:
	mov	x19, #(1 << 20)
1:	mov	x0, #0x84000000
	hvc	#0
	sub	x19, x19, #1
	cbnz	x19, 1b
	mov	x0, #0x84000000
	add	x0, x0, #9
	hvc	#0
	b	.

* Self IPI guest: Inject and handle 2^20 SGI0 interrupts using GICv2
or GICv3, and then power-off:

__start:
	mov	x19, #(1 << 20)

	mrs	x0, id_aa64pfr0_el1
	ubfx	x0, x0, #24, #4
	and	x0, x0, #0xf
	cbz	x0, do_v2

	mrs	x0, s3_0_c12_c12_5	// ICC_SRE_EL1
	and	x0, x0, #1		// SRE bit
	cbnz	x0, do_v3

do_v2:
	mov	x0, #0x3fff0000		// Dist
	mov	x1, #0x3ffd0000		// CPU
	mov	w2, #1
	str	w2, [x0]		// Enable Group0
	ldr	w2, =0xa0a0a0a0
	str	w2, [x0, 0x400]		// A0 priority for SGI0-3
	mov	w2, #0x0f
	str	w2, [x0, #0x100]	// Enable SGI0-3
	mov	w2, #0xf0
	str	w2, [x1, #4]		// PMR
	mov	w2, #1
	str	w2, [x1]		// Enable CPU interface
	
1:
	mov	w2, #(2 << 24)		// Interrupt self with SGI0
	str	w2, [x0, #0xf00]

2:	ldr	w2, [x1, #0x0c]		// GICC_IAR
	cmp	w2, #0x3ff
	b.ne	3f

	wfi
	b	2b

3:	str	w2, [x1, #0x10]		// EOI

	sub	x19, x19, #1
	cbnz	x19, 1b

die:
	mov	x0, #0x84000000
	add	x0, x0, #9
	hvc	#0
	b	.

do_v3:
	mov	x0, #0x3fff0000		// Dist
	mov	x1, #0x3fbf0000		// Redist 0
	mov	x2, #0x10000
	add	x1, x1, x2		// SGI page
	mov	w2, #2
	str	w2, [x0]		// Enable Group1
	ldr	w2, =0xa0a0a0a0
	str	w2, [x1, 0x400]		// A0 priority for SGI0-3
	mov	w2, #0x0f
	str	w2, [x1, #0x100]	// Enable SGI0-3
	mov	w2, #0xf0
	msr	S3_0_c4_c6_0, x2	// PMR
	mov	w2, #1
	msr	S3_0_C12_C12_7, x2	// Enable Group1

1:
	mov	x2, #1
	msr	S3_0_c12_c11_5, x2	// Self SGI0

2:	mrs	x2, S3_0_c12_c12_0	// Read IAR1
	cmp	w2, #0x3ff
	b.ne	3f

	wfi
	b	2b

3:	msr	S3_0_c12_c12_1, x2	// EOI

	sub	x19, x19, #1
	cbnz	x19, 1b

	b	die

* sysreg trap guest: Perform 2^20 PMSELR_EL0 accesses, and power-off:

__start:
	mov	x19, #(1 << 20)
1:	mrs	x0, PMSELR_EL0
	sub	x19, x19, #1
	cbnz	x19, 1b
	mov	x0, #0x84000000
	add	x0, x0, #9
	hvc	#0
	b	.

* These guests are profiled using perf and kvmtool:

taskset -c 1 perf stat -e cycles:kh lkvm run -c1 --kernel do_sysreg.bin 2>&1 >/dev/null| grep cycles

The result is then divided by the number of iterations (2^20).

These tests have been run on three different platforms (two GICv2
based, and one with GICv3 and legacy mode) and have shown significant
improvements in all cases. I've only touched the arm64 GIC code, but
obviously the 32bit code should use it as well once we've migrated it
to C.

Vanilla v4.5-rc4
	     A             B            C-v2         C-v3
Null HVC:   8462          6566          6572         6505
Self SGI:  11961          8690          9541         8629
SysReg:     8952          6979          7212         7180

Patched v4.5-rc4
	     A             B            C-v2         C-v3
Null HVC:   5219  -38%    3957  -39%    5175  -21%   5158  -20%
Self SGI:   8946  -25%    6658  -23%    8547  -10%   7299  -15%
SysReg:     5314  -40%    4190  -40%    5417  -25%   5414  -24%

I've pushed out a branch (kvm-arm64/suck-less) to the usual location,
based on -rc4 + a few fixes I also posted today.

Thanks,

	M.

* From v1:
  - Fixed a nasty bug dealing with the active Priority Register
  - Maintenance interrupt lazy saving
  - More LR hackery
  - Adapted most of the series for GICv3 as well

Marc Zyngier (17):
  arm64: KVM: Switch the sys_reg search to be a binary search
  ARM: KVM: Properly sort the invariant table
  ARM: KVM: Enforce sorting of all CP tables
  ARM: KVM: Rename struct coproc_reg::is_64 to is_64bit
  ARM: KVM: Switch the CP reg search to be a binary search
  KVM: arm/arm64: timer: Add active state caching
  arm64: KVM: vgic-v2: Avoid accessing GICH registers
  arm64: KVM: vgic-v2: Save maintenance interrupt state only if required
  arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function
  arm64: KVM: vgic-v2: Do not save an LR known to be empty
  arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit
  arm64: KVM: vgic-v3: Avoid accessing ICH registers
  arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  arm64: KVM: vgic-v3: Do not save an LR known to be empty
  arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation

 arch/arm/kvm/arm.c              |   1 +
 arch/arm/kvm/coproc.c           |  74 +++++----
 arch/arm/kvm/coproc.h           |   8 +-
 arch/arm64/kvm/hyp/vgic-v2-sr.c | 144 +++++++++++++----
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 333 ++++++++++++++++++++++++++--------------
 arch/arm64/kvm/sys_regs.c       |  40 ++---
 include/kvm/arm_arch_timer.h    |   5 +
 include/kvm/arm_vgic.h          |   8 +-
 virt/kvm/arm/arch_timer.c       |  31 ++++
 virt/kvm/arm/vgic-v2-emul.c     |  10 +-
 virt/kvm/arm/vgic-v3.c          |   4 +-
 11 files changed, 452 insertions(+), 206 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 01/17] arm64: KVM: Switch the sys_reg search to be a binary search
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 02/17] ARM: KVM: Properly sort the invariant table Marc Zyngier
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

Our 64bit sys_reg table is about 90 entries long (so far, and the
PMU support is likely to increase this). This means that on average,
it takes 45 comparisons to find the right entry (and actually the
full 90 if we have to search the invariant table).

Not the most efficient thing, especially when you consider that this
table is already sorted. Switching to a binary search effectively
reduces the search to about 7 comparisons. Slightly better!

As an added bonus, the comparison is done by comparing all the
fields at once, instead of one at a time.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 40 ++++++++++++++++++++++------------------
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2e90371..86dfb08 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -20,6 +20,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/bsearch.h>
 #include <linux/kvm_host.h>
 #include <linux/mm.h>
 #include <linux/uaccess.h>
@@ -942,29 +943,32 @@ static const struct sys_reg_desc *get_target_table(unsigned target,
 	}
 }
 
+#define reg_to_match_value(x)						\
+	({								\
+		unsigned long val;					\
+		val  = (x)->Op0 << 14;					\
+		val |= (x)->Op1 << 11;					\
+		val |= (x)->CRn << 7;					\
+		val |= (x)->CRm << 3;					\
+		val |= (x)->Op2;					\
+		val;							\
+	 })
+
+static int match_sys_reg(const void *key, const void *elt)
+{
+	const unsigned long pval = (unsigned long)key;
+	const struct sys_reg_desc *r = elt;
+
+	return pval - reg_to_match_value(r);
+}
+
 static const struct sys_reg_desc *find_reg(const struct sys_reg_params *params,
 					 const struct sys_reg_desc table[],
 					 unsigned int num)
 {
-	unsigned int i;
-
-	for (i = 0; i < num; i++) {
-		const struct sys_reg_desc *r = &table[i];
+	unsigned long pval = reg_to_match_value(params);
 
-		if (params->Op0 != r->Op0)
-			continue;
-		if (params->Op1 != r->Op1)
-			continue;
-		if (params->CRn != r->CRn)
-			continue;
-		if (params->CRm != r->CRm)
-			continue;
-		if (params->Op2 != r->Op2)
-			continue;
-
-		return r;
-	}
-	return NULL;
+	return bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
 }
 
 int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run)
-- 
2.1.4


* [PATCH v2 02/17] ARM: KVM: Properly sort the invariant table
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 01/17] arm64: KVM: Switch the sys_reg search to be a binary search Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 03/17] ARM: KVM: Enforce sorting of all CP tables Marc Zyngier
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

Not having the invariant table properly sorted is an oddity, and
may get in the way of future optimisations. Let's fix it.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/coproc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index f3d88dc..16c74f8 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -645,6 +645,9 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 3), is32, NULL, get_TLBTR },
 	{ CRn( 0), CRm( 0), Op1( 0), Op2( 6), is32, NULL, get_REVIDR },
 
+	{ CRn( 0), CRm( 0), Op1( 1), Op2( 1), is32, NULL, get_CLIDR },
+	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
+
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, NULL, get_ID_PFR0 },
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 1), is32, NULL, get_ID_PFR1 },
 	{ CRn( 0), CRm( 1), Op1( 0), Op2( 2), is32, NULL, get_ID_DFR0 },
@@ -660,9 +663,6 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 3), is32, NULL, get_ID_ISAR3 },
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 4), is32, NULL, get_ID_ISAR4 },
 	{ CRn( 0), CRm( 2), Op1( 0), Op2( 5), is32, NULL, get_ID_ISAR5 },
-
-	{ CRn( 0), CRm( 0), Op1( 1), Op2( 1), is32, NULL, get_CLIDR },
-	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };
 
 /*
-- 
2.1.4



* [PATCH v2 03/17] ARM: KVM: Enforce sorting of all CP tables
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 01/17] arm64: KVM: Switch the sys_reg search to be a binary search Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 02/17] ARM: KVM: Properly sort the invariant table Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 04/17] ARM: KVM: Rename struct coproc_reg::is_64 to is_64bit Marc Zyngier
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

Since we're obviously terrible at sorting the CP tables, make sure
we're going to do it properly (or fail to boot). arm64 has had the
same mechanism for a while, and nobody ever broke it...

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/coproc.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 16c74f8..03f5d14 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -381,17 +381,26 @@ static const struct coproc_reg cp15_regs[] = {
 	{ CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar},
 };
 
+static int check_reg_table(const struct coproc_reg *table, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 1; i < n; i++) {
+		if (cmp_reg(&table[i-1], &table[i]) >= 0) {
+			kvm_err("reg table %p out of order (%d)\n", table, i - 1);
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 /* Target specific emulation tables */
 static struct kvm_coproc_target_table *target_tables[KVM_ARM_NUM_TARGETS];
 
 void kvm_register_target_coproc_table(struct kvm_coproc_target_table *table)
 {
-	unsigned int i;
-
-	for (i = 1; i < table->num; i++)
-		BUG_ON(cmp_reg(&table->table[i-1],
-			       &table->table[i]) >= 0);
-
+	BUG_ON(check_reg_table(table->table, table->num));
 	target_tables[table->target] = table;
 }
 
@@ -1210,8 +1219,8 @@ void kvm_coproc_table_init(void)
 	unsigned int i;
 
 	/* Make sure tables are unique and in order. */
-	for (i = 1; i < ARRAY_SIZE(cp15_regs); i++)
-		BUG_ON(cmp_reg(&cp15_regs[i-1], &cp15_regs[i]) >= 0);
+	BUG_ON(check_reg_table(cp15_regs, ARRAY_SIZE(cp15_regs)));
+	BUG_ON(check_reg_table(invariant_cp15, ARRAY_SIZE(invariant_cp15)));
 
 	/* We abuse the reset function to overwrite the table itself. */
 	for (i = 0; i < ARRAY_SIZE(invariant_cp15); i++)
-- 
2.1.4


* [PATCH v2 04/17] ARM: KVM: Rename struct coproc_reg::is_64 to is_64bit
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (2 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 03/17] ARM: KVM: Enforce sorting of all CP tables Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 05/17] ARM: KVM: Switch the CP reg search to be a binary search Marc Zyngier
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

As we're going to play some tricks on struct coproc_reg, make sure
its 64bit indicator field matches that of struct coproc_params.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/coproc.c | 4 ++--
 arch/arm/kvm/coproc.h | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 03f5d14..2a67f00 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -423,7 +423,7 @@ static const struct coproc_reg *find_reg(const struct coproc_params *params,
 	for (i = 0; i < num; i++) {
 		const struct coproc_reg *r = &table[i];
 
-		if (params->is_64bit != r->is_64)
+		if (params->is_64bit != r->is_64bit)
 			continue;
 		if (params->CRn != r->CRn)
 			continue;
@@ -1105,7 +1105,7 @@ static int write_demux_regids(u64 __user *uindices)
 static u64 cp15_to_index(const struct coproc_reg *reg)
 {
 	u64 val = KVM_REG_ARM | (15 << KVM_REG_ARM_COPROC_SHIFT);
-	if (reg->is_64) {
+	if (reg->is_64bit) {
 		val |= KVM_REG_SIZE_U64;
 		val |= (reg->Op1 << KVM_REG_ARM_OPC1_SHIFT);
 		/*
diff --git a/arch/arm/kvm/coproc.h b/arch/arm/kvm/coproc.h
index 88d24a3..5acd097 100644
--- a/arch/arm/kvm/coproc.h
+++ b/arch/arm/kvm/coproc.h
@@ -37,7 +37,7 @@ struct coproc_reg {
 	unsigned long Op1;
 	unsigned long Op2;
 
-	bool is_64;
+	bool is_64bit;
 
 	/* Trapped access from guest, if non-NULL. */
 	bool (*access)(struct kvm_vcpu *,
@@ -141,7 +141,7 @@ static inline int cmp_reg(const struct coproc_reg *i1,
 		return i1->Op1 - i2->Op1;
 	if (i1->Op2 != i2->Op2)
 		return i1->Op2 - i2->Op2;
-	return i2->is_64 - i1->is_64;
+	return i2->is_64bit - i1->is_64bit;
 }
 
 
@@ -150,8 +150,8 @@ static inline int cmp_reg(const struct coproc_reg *i1,
 #define CRm64(_x)       .CRn = _x, .CRm = 0
 #define Op1(_x) 	.Op1 = _x
 #define Op2(_x) 	.Op2 = _x
-#define is64		.is_64 = true
-#define is32		.is_64 = false
+#define is64		.is_64bit = true
+#define is32		.is_64bit = false
 
 bool access_vm_reg(struct kvm_vcpu *vcpu,
 		   const struct coproc_params *p,
-- 
2.1.4


* [PATCH v2 05/17] ARM: KVM: Switch the CP reg search to be a binary search
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (3 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 04/17] ARM: KVM: Rename struct coproc_reg::is_64 to is_64bit Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 06/17] KVM: arm/arm64: timer: Add active state caching Marc Zyngier
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

Doing a linear search is a bit silly when we can do a binary search.
Not that we trap so many things that it has become a burden yet, but
it makes sense to align this with the arm64 code.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/coproc.c | 41 +++++++++++++++++++++++------------------
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 2a67f00..4f1c869 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -16,6 +16,8 @@
  * along with this program; if not, write to the Free Software
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
+
+#include <linux/bsearch.h>
 #include <linux/mm.h>
 #include <linux/kvm_host.h>
 #include <linux/uaccess.h>
@@ -414,29 +416,32 @@ static const struct coproc_reg *get_target_table(unsigned target, size_t *num)
 	return table->table;
 }
 
+#define reg_to_match_value(x)						\
+	({								\
+		unsigned long val;					\
+		val  = (x)->CRn << 11;					\
+		val |= (x)->CRm << 7;					\
+		val |= (x)->Op1 << 4;					\
+		val |= (x)->Op2 << 1;					\
+		val |= !(x)->is_64bit;					\
+		val;							\
+	 })
+
+static int match_reg(const void *key, const void *elt)
+{
+	const unsigned long pval = (unsigned long)key;
+	const struct coproc_reg *r = elt;
+
+	return pval - reg_to_match_value(r);
+}
+
 static const struct coproc_reg *find_reg(const struct coproc_params *params,
 					 const struct coproc_reg table[],
 					 unsigned int num)
 {
-	unsigned int i;
-
-	for (i = 0; i < num; i++) {
-		const struct coproc_reg *r = &table[i];
+	unsigned long pval = reg_to_match_value(params);
 
-		if (params->is_64bit != r->is_64bit)
-			continue;
-		if (params->CRn != r->CRn)
-			continue;
-		if (params->CRm != r->CRm)
-			continue;
-		if (params->Op1 != r->Op1)
-			continue;
-		if (params->Op2 != r->Op2)
-			continue;
-
-		return r;
-	}
-	return NULL;
+	return bsearch((void *)pval, table, num, sizeof(table[0]), match_reg);
 }
 
 static int emulate_cp15(struct kvm_vcpu *vcpu,
-- 
2.1.4



* [PATCH v2 06/17] KVM: arm/arm64: timer: Add active state caching
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (4 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 05/17] ARM: KVM: Switch the CP reg search to be a binary search Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers Marc Zyngier
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

Programming the active state in the (re)distributor can be an
expensive operation, so it makes some sense to try and reduce the
number of accesses as much as possible. So far, we program the
active state on each VM entry, but there is some opportunity to
do less.

An obvious solution is to cache the active state in memory, and
only program it in the HW when conditions change. But because the
HW can also change things under our feet (the active state can
transition from 1 to 0 when the guest performs an EOI), some
precautions have to be taken, which amount to only caching an
"inactive" state, and always programming it otherwise.

With this in place, we observe a reduction of around 700 cycles
on a 2GHz GICv2 platform for a NULL hypercall.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/arm.c           |  1 +
 include/kvm/arm_arch_timer.h |  5 +++++
 virt/kvm/arm/arch_timer.c    | 31 +++++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index dda1959..af7c1a3 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -320,6 +320,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	vcpu->cpu = -1;
 
 	kvm_arm_set_running_vcpu(NULL);
+	kvm_timer_vcpu_put(vcpu);
 }
 
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 1800227..b651aed 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -55,6 +55,9 @@ struct arch_timer_cpu {
 
 	/* VGIC mapping */
 	struct irq_phys_map		*map;
+
+	/* Active IRQ state caching */
+	bool				active_cleared_last;
 };
 
 int kvm_timer_hyp_init(void);
@@ -74,4 +77,6 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
 void kvm_timer_schedule(struct kvm_vcpu *vcpu);
 void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
 
+void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 69bca18..bfec447 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -34,6 +34,11 @@ static struct timecounter *timecounter;
 static struct workqueue_struct *wqueue;
 static unsigned int host_vtimer_irq;
 
+void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.timer_cpu.active_cleared_last = false;
+}
+
 static cycle_t kvm_phys_timer_read(void)
 {
 	return timecounter->cc->read(timecounter->cc);
@@ -130,6 +135,7 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level)
 
 	BUG_ON(!vgic_initialized(vcpu->kvm));
 
+	timer->active_cleared_last = false;
 	timer->irq.level = new_level;
 	trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->map->virt_irq,
 				   timer->irq.level);
@@ -242,10 +248,35 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
 	else
 		phys_active = false;
 
+	/*
+	 * We want to avoid hitting the (re)distributor as much as
+	 * possible, as this is a potentially expensive MMIO access
+	 * (not to mention locks in the irq layer), and a solution for
+	 * this is to cache the "active" state in memory.
+	 *
+	 * Things to consider: we cannot cache an "active set" state,
+	 * because the HW can change this behind our back (it becomes
+	 * "clear" in the HW). We must then restrict the caching to
+	 * the "clear" state.
+	 *
+	 * The cache is invalidated on:
+	 * - vcpu put, indicating that the HW cannot be trusted to be
+	 *   in a sane state on the next vcpu load,
+	 * - any change in the interrupt state
+	 *
+	 * Usage conditions:
+	 * - cached value is "active clear"
+	 * - value to be programmed is "active clear"
+	 */
+	if (timer->active_cleared_last && !phys_active)
+		return;
+
 	ret = irq_set_irqchip_state(timer->map->irq,
 				    IRQCHIP_STATE_ACTIVE,
 				    phys_active);
 	WARN_ON(ret);
+
+	timer->active_cleared_last = !phys_active;
 }
 
 /**
-- 
2.1.4


* [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (5 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 06/17] KVM: arm/arm64: timer: Add active state caching Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-02 23:08   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required Marc Zyngier
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

GICv2 registers are *slow*. As in "terrifyingly slow". Which is bad.
But we're equally bad, as we make a point of accessing them even if
we don't have any interrupt in flight.

A good solution is to first find out if we have anything useful to
write into the GIC, and if we don't, to simply not do it. This
involves tracking which LRs actually have something valid there.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v2-sr.c | 72 ++++++++++++++++++++++++++++-------------
 include/kvm/arm_vgic.h          |  2 ++
 2 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
index e717612..5ab8d63 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
@@ -38,28 +38,41 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 
 	nr_lr = vcpu->arch.vgic_cpu.nr_lr;
 	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
-	cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
-	eisr0  = readl_relaxed(base + GICH_EISR0);
-	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
-	if (unlikely(nr_lr > 32)) {
-		eisr1  = readl_relaxed(base + GICH_EISR1);
-		elrsr1 = readl_relaxed(base + GICH_ELRSR1);
-	} else {
-		eisr1 = elrsr1 = 0;
-	}
+
+	if (vcpu->arch.vgic_cpu.live_lrs) {
+		eisr0  = readl_relaxed(base + GICH_EISR0);
+		elrsr0 = readl_relaxed(base + GICH_ELRSR0);
+		cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
+		cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
+
+		if (unlikely(nr_lr > 32)) {
+			eisr1  = readl_relaxed(base + GICH_EISR1);
+			elrsr1 = readl_relaxed(base + GICH_ELRSR1);
+		} else {
+			eisr1 = elrsr1 = 0;
+		}
+
 #ifdef CONFIG_CPU_BIG_ENDIAN
-	cpu_if->vgic_eisr  = ((u64)eisr0 << 32) | eisr1;
-	cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
+		cpu_if->vgic_eisr  = ((u64)eisr0 << 32) | eisr1;
+		cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
 #else
-	cpu_if->vgic_eisr  = ((u64)eisr1 << 32) | eisr0;
-	cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
+		cpu_if->vgic_eisr  = ((u64)eisr1 << 32) | eisr0;
+		cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
 #endif
-	cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
 
-	writel_relaxed(0, base + GICH_HCR);
+		for (i = 0; i < nr_lr; i++)
+			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
+				cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
 
-	for (i = 0; i < nr_lr; i++)
-		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
+		writel_relaxed(0, base + GICH_HCR);
+
+		vcpu->arch.vgic_cpu.live_lrs = 0;
+	} else {
+		cpu_if->vgic_eisr = 0;
+		cpu_if->vgic_elrsr = ~0UL;
+		cpu_if->vgic_misr = 0;
+		cpu_if->vgic_apr = 0;
+	}
 }
 
 /* vcpu is already in the HYP VA space */
@@ -70,15 +83,30 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
 	struct vgic_dist *vgic = &kvm->arch.vgic;
 	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
 	int i, nr_lr;
+	u64 live_lrs = 0;
 
 	if (!base)
 		return;
 
-	writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
-	writel_relaxed(cpu_if->vgic_vmcr, base + GICH_VMCR);
-	writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
-
 	nr_lr = vcpu->arch.vgic_cpu.nr_lr;
+
 	for (i = 0; i < nr_lr; i++)
-		writel_relaxed(cpu_if->vgic_lr[i], base + GICH_LR0 + (i * 4));
+		if (cpu_if->vgic_lr[i] & GICH_LR_STATE)
+			live_lrs |= 1UL << i;
+
+	if (live_lrs) {
+		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
+		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
+		for (i = 0; i < nr_lr; i++) {
+			u32 val = 0;
+
+			if (live_lrs & (1UL << i))
+				val = cpu_if->vgic_lr[i];
+
+			writel_relaxed(val, base + GICH_LR0 + (i * 4));
+		}
+	}
+
+	writel_relaxed(cpu_if->vgic_vmcr, base + GICH_VMCR);
+	vcpu->arch.vgic_cpu.live_lrs = live_lrs;
 }
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 13a3d53..f473fd6 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -321,6 +321,8 @@ struct vgic_cpu {
 
 	/* Protected by the distributor's irq_phys_map_lock */
 	struct list_head	irq_phys_map_list;
+
+	u64		live_lrs;
 };
 
 #define LR_EMPTY	0xff
-- 
2.1.4



* [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (6 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-02 23:08   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function Marc Zyngier
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

Next on our list of useless accesses is the maintenance interrupt
status registers (GICH_MISR, GICH_EISR{0,1}).

It is pointless to save them if we haven't asked for a maintenance
interrupt in the first place, which can only happen for two reasons:
- Underflow: GICH_HCR_UIE will be set,
- EOI: GICH_LR_EOI will be set.

These conditions can be checked on the in-memory copies of the regs.
Should either of these two conditions be true, we must read GICH_MISR.
We can then check for GICH_MISR_EOI, and only read GICH_EISR* when
that bit is set.

This means that in most cases, we don't have to save them at all.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v2-sr.c | 54 +++++++++++++++++++++++++++++++++++------
 1 file changed, 47 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
index 5ab8d63..1bda5ce 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
@@ -23,6 +23,49 @@
 
 #include "hyp.h"
 
+static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu,
+					    void __iomem *base)
+{
+	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
+	u32 eisr0, eisr1;
+	int i;
+	bool expect_mi;
+
+	expect_mi = !!(cpu_if->vgic_hcr & GICH_HCR_UIE);
+
+	for (i = 0; i < nr_lr; i++) {
+		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
+				continue;
+
+		expect_mi |= (!(cpu_if->vgic_lr[i] & GICH_LR_HW) &&
+			      (cpu_if->vgic_lr[i] & GICH_LR_EOI));
+	}
+
+	if (expect_mi) {
+		cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
+
+		if (cpu_if->vgic_misr & GICH_MISR_EOI) {
+			eisr0  = readl_relaxed(base + GICH_EISR0);
+			if (unlikely(nr_lr > 32))
+				eisr1  = readl_relaxed(base + GICH_EISR1);
+			else
+				eisr1 = 0;
+		} else {
+			eisr0 = eisr1 = 0;
+		}
+	} else {
+		cpu_if->vgic_misr = 0;
+		eisr0 = eisr1 = 0;
+	}
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+	cpu_if->vgic_eisr = ((u64)eisr0 << 32) | eisr1;
+#else
+	cpu_if->vgic_eisr = ((u64)eisr1 << 32) | eisr0;
+#endif
+}
+
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 {
@@ -30,7 +73,7 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 	struct vgic_dist *vgic = &kvm->arch.vgic;
 	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
-	u32 eisr0, eisr1, elrsr0, elrsr1;
+	u32 elrsr0, elrsr1;
 	int i, nr_lr;
 
 	if (!base)
@@ -40,26 +83,23 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
 
 	if (vcpu->arch.vgic_cpu.live_lrs) {
-		eisr0  = readl_relaxed(base + GICH_EISR0);
 		elrsr0 = readl_relaxed(base + GICH_ELRSR0);
-		cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
 		cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
 
 		if (unlikely(nr_lr > 32)) {
-			eisr1  = readl_relaxed(base + GICH_EISR1);
 			elrsr1 = readl_relaxed(base + GICH_ELRSR1);
 		} else {
-			eisr1 = elrsr1 = 0;
+			elrsr1 = 0;
 		}
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
-		cpu_if->vgic_eisr  = ((u64)eisr0 << 32) | eisr1;
 		cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
 #else
-		cpu_if->vgic_eisr  = ((u64)eisr1 << 32) | eisr0;
 		cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
 #endif
 
+		save_maint_int_state(vcpu, base);
+ 
 		for (i = 0; i < nr_lr; i++)
 			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
 				cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (7 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-02 23:08   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty Marc Zyngier
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

In order to make the saving path slightly more readable and to
prepare for some more optimizations, let's move the GICH_ELRSR
saving to its own function.

No functional change.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v2-sr.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
index 1bda5ce..c281374 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
@@ -66,6 +66,25 @@ static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu,
 #endif
 }
 
+static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
+{
+	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
+	u32 elrsr0, elrsr1;
+
+	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
+	if (unlikely(nr_lr > 32))
+		elrsr1 = readl_relaxed(base + GICH_ELRSR1);
+	else
+		elrsr1 = 0;
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+	cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
+#else
+	cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
+#endif
+}
+
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 {
@@ -73,7 +92,6 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 	struct vgic_dist *vgic = &kvm->arch.vgic;
 	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
-	u32 elrsr0, elrsr1;
 	int i, nr_lr;
 
 	if (!base)
@@ -83,22 +101,10 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
 
 	if (vcpu->arch.vgic_cpu.live_lrs) {
-		elrsr0 = readl_relaxed(base + GICH_ELRSR0);
-		cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
-
-		if (unlikely(nr_lr > 32)) {
-			elrsr1 = readl_relaxed(base + GICH_ELRSR1);
-		} else {
-			elrsr1 = 0;
-		}
-
-#ifdef CONFIG_CPU_BIG_ENDIAN
-		cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
-#else
-		cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
-#endif
+		cpu_if->vgic_apr = readl_relaxed(base + GICH_APR);
 
 		save_maint_int_state(vcpu, base);
+		save_elrsr(vcpu, base);
  
 		for (i = 0; i < nr_lr; i++)
 			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
-- 
2.1.4

* [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (8 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-02 23:08   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit Marc Zyngier
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

On exit, any empty LR will be signaled in GICH_ELRSR*, which
means that we do not have to save it; we can simply clear
its state in the in-memory copy.

Take this opportunity to move the LR saving code into its
own function.
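The saving strategy above can be modelled in isolation (a sketch, not
the kernel code: `hw_lr[]` stands in for the MMIO GICH_LRn registers,
and `save_lrs()` here is a hypothetical standalone version):

```c
#include <assert.h>
#include <stdint.h>

#define GICH_LR_STATE	(3U << 28)	/* pending+active bits, as in the kernel */

/*
 * Model of the LR save path: an LR flagged empty in ELRSR needs no
 * MMIO read, we only clear the state bits of the in-memory copy.
 * Returns the number of MMIO reads performed.
 */
static int save_lrs(uint32_t *mem_lr, const uint32_t *hw_lr,
		    uint64_t elrsr, uint64_t live_lrs, int nr_lr)
{
	int i, reads = 0;

	for (i = 0; i < nr_lr; i++) {
		if (!(live_lrs & (1ULL << i)))
			continue;

		if (elrsr & (1ULL << i)) {	/* known empty: skip MMIO */
			mem_lr[i] &= ~GICH_LR_STATE;
			continue;
		}

		mem_lr[i] = hw_lr[i];		/* would be readl_relaxed() */
		reads++;
	}
	return reads;
}
```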

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v2-sr.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
index c281374..3dbbc6b 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
@@ -85,6 +85,25 @@ static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
 #endif
 }
 
+static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
+{
+	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
+	int i;
+
+	for (i = 0; i < nr_lr; i++) {
+		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
+			continue;
+
+		if (cpu_if->vgic_elrsr & (1UL << i)) {
+			cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
+			continue;
+		}
+
+		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
+	}
+}
+
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 {
@@ -92,12 +111,10 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 	struct vgic_dist *vgic = &kvm->arch.vgic;
 	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
-	int i, nr_lr;
 
 	if (!base)
 		return;
 
-	nr_lr = vcpu->arch.vgic_cpu.nr_lr;
 	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
 
 	if (vcpu->arch.vgic_cpu.live_lrs) {
@@ -105,10 +122,7 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 
 		save_maint_int_state(vcpu, base);
 		save_elrsr(vcpu, base);
- 
-		for (i = 0; i < nr_lr; i++)
-			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
-				cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
+		save_lrs(vcpu, base);
 
 		writel_relaxed(0, base + GICH_HCR);
 
-- 
2.1.4

* [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (9 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-02 23:08   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit Marc Zyngier
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

So far, we're always writing all possible LRs, setting the empty
ones to a zero value. This is obviously doing a lot of work for
nothing, and we're better off clearing only those we've actually
dirtied on the exit path (it is very rare to inject more than one
interrupt at a time anyway).
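The resulting restore path can be sketched on its own (illustrative:
`restore_lrs()` is a hypothetical standalone model, with `hw_lr[]`
standing in for the GICH_LRn registers; the save path has already
wiped every LR it dirtied, so restore skips the non-live ones):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the restore path after this change: only LRs that carry
 * state are written back; the rest are already zero in hardware.
 * Returns the number of MMIO writes performed.
 */
static int restore_lrs(uint32_t *hw_lr, const uint32_t *mem_lr,
		       uint16_t live_lrs, int nr_lr)
{
	int i, writes = 0;

	for (i = 0; i < nr_lr; i++) {
		if (!(live_lrs & (1U << i)))
			continue;		/* already zero in hardware */

		hw_lr[i] = mem_lr[i];		/* would be writel_relaxed() */
		writes++;
	}
	return writes;
}
```

With a single injected interrupt, this turns nr_lr MMIO writes into one.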

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
index 3dbbc6b..e53f131 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
@@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
 		}
 
 		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
+		writel_relaxed(0, base + GICH_LR0 + (i * 4));
 	}
 }
 
@@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
 		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
 		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
 		for (i = 0; i < nr_lr; i++) {
-			u32 val = 0;
-
-			if (live_lrs & (1UL << i))
-				val = cpu_if->vgic_lr[i];
+			if (!(live_lrs & (1UL << i)))
+				continue;
 
-			writel_relaxed(val, base + GICH_LR0 + (i * 4));
+			writel_relaxed(cpu_if->vgic_lr[i],
+				       base + GICH_LR0 + (i * 4));
 		}
 	}
 
-- 
2.1.4


* [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (10 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-02 23:08   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 13/17] arm64: KVM: vgic-v3: Avoid accessing ICH registers Marc Zyngier
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

The GICD_SGIR register lives a long way from the beginning of
the handler array, which is searched linearly. As this is hit
pretty often, let's move it up. This saves us some precious
cycles when the guest is generating IPIs.
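The cost being attacked here is easy to model: the dist range table is
scanned linearly, so an entry's position determines how many compares
an access pays. A standalone sketch (the struct and offsets are
illustrative stand-ins for vgic_io_range, with 0xf00 matching
GICD_SGIR's documented offset):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the vgic_io_range table and its lookup. */
struct io_range {
	uint32_t base;
	uint32_t len;
	const char *name;	/* stands in for the handler pointer */
};

/* Linear search, counting how many entries are probed. */
static const struct io_range *find_range(const struct io_range *tbl,
					 size_t n, uint32_t offset,
					 int *probes)
{
	size_t i;

	for (i = 0; i < n; i++) {
		(*probes)++;
		if (offset >= tbl[i].base &&
		    offset < tbl[i].base + tbl[i].len)
			return &tbl[i];
	}
	return NULL;
}
```

Moving the hottest entry to the front changes nothing functionally,
but a frequently-hit offset is now found on the first probe.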

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic-v2-emul.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 1390797..1b0bee0 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -321,6 +321,11 @@ static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu,
 
 static const struct vgic_io_range vgic_dist_ranges[] = {
 	{
+		.base		= GIC_DIST_SOFTINT,
+		.len		= 4,
+		.handle_mmio	= handle_mmio_sgi_reg,
+	},
+	{
 		.base		= GIC_DIST_CTRL,
 		.len		= 12,
 		.bits_per_irq	= 0,
@@ -387,11 +392,6 @@ static const struct vgic_io_range vgic_dist_ranges[] = {
 		.handle_mmio	= handle_mmio_cfg_reg,
 	},
 	{
-		.base		= GIC_DIST_SOFTINT,
-		.len		= 4,
-		.handle_mmio	= handle_mmio_sgi_reg,
-	},
-	{
 		.base		= GIC_DIST_SGI_PENDING_CLEAR,
 		.len		= VGIC_NR_SGIS,
 		.handle_mmio	= handle_mmio_sgi_clear,
-- 
2.1.4

* [PATCH v2 13/17] arm64: KVM: vgic-v3: Avoid accessing ICH registers
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (11 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-02-17 16:40 ` [PATCH v2 14/17] arm64: KVM: vgic-v3: Save maintenance interrupt state only if required Marc Zyngier
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

Just like on GICv2, we're a bit hammer-happy with GICv3, and access
its registers more often than we should.

Adopt a policy similar to what we do for GICv2, only saving/restoring
the minimal set of registers. As we no longer access the registers
linearly (we may skip some), the convoluted accessors become
slightly simpler, and we can drop the ugly indexing macro that
tended to confuse the reviewers.
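The key piece of bookkeeping on the restore side is computing which
LRs are live from the in-memory copies alone. A minimal sketch
(standalone model; `compute_live_lrs()` is hypothetical, but the
ICH_LR_STATE mask mirrors the kernel's definition):

```c
#include <assert.h>
#include <stdint.h>

#define ICH_LR_STATE	(3ULL << 62)	/* pending+active, as in the kernel */

/*
 * Build the live_lrs bitmap from the in-memory LR copies: an LR is
 * live iff its state field is non-zero. No register access needed.
 */
static uint16_t compute_live_lrs(const uint64_t *lr, int max_lr_idx)
{
	uint16_t live = 0;
	int i;

	for (i = 0; i <= max_lr_idx; i++)
		if (lr[i] & ICH_LR_STATE)
			live |= 1 << i;
	return live;
}
```

On the next exit, only the LRs flagged in this bitmap need to be
considered for saving.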

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 294 +++++++++++++++++++++++++---------------
 include/kvm/arm_vgic.h          |   6 -
 virt/kvm/arm/vgic-v3.c          |   4 +-
 3 files changed, 183 insertions(+), 121 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 93f6c5c..b62c923 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -39,12 +39,104 @@
 		asm volatile("msr_s " __stringify(r) ", %0" : : "r" (__val));\
 	} while (0)
 
-/* vcpu is already in the HYP VA space */
+static u64 __hyp_text __gic_v3_get_lr(unsigned int lr)
+{
+	switch (lr & 0xf) {
+	case 0:
+		return read_gicreg(ICH_LR0_EL2);
+	case 1:
+		return read_gicreg(ICH_LR1_EL2);
+	case 2:
+		return read_gicreg(ICH_LR2_EL2);
+	case 3:
+		return read_gicreg(ICH_LR3_EL2);
+	case 4:
+		return read_gicreg(ICH_LR4_EL2);
+	case 5:
+		return read_gicreg(ICH_LR5_EL2);
+	case 6:
+		return read_gicreg(ICH_LR6_EL2);
+	case 7:
+		return read_gicreg(ICH_LR7_EL2);
+	case 8:
+		return read_gicreg(ICH_LR8_EL2);
+	case 9:
+		return read_gicreg(ICH_LR9_EL2);
+	case 10:
+		return read_gicreg(ICH_LR10_EL2);
+	case 11:
+		return read_gicreg(ICH_LR11_EL2);
+	case 12:
+		return read_gicreg(ICH_LR12_EL2);
+	case 13:
+		return read_gicreg(ICH_LR13_EL2);
+	case 14:
+		return read_gicreg(ICH_LR14_EL2);
+	case 15:
+		return read_gicreg(ICH_LR15_EL2);
+	}
+
+	unreachable();
+}
+
+static void __hyp_text __gic_v3_set_lr(u64 val, int lr)
+{
+	switch (lr & 0xf) {
+	case 0:
+		write_gicreg(val, ICH_LR0_EL2);
+		break;
+	case 1:
+		write_gicreg(val, ICH_LR1_EL2);
+		break;
+	case 2:
+		write_gicreg(val, ICH_LR2_EL2);
+		break;
+	case 3:
+		write_gicreg(val, ICH_LR3_EL2);
+		break;
+	case 4:
+		write_gicreg(val, ICH_LR4_EL2);
+		break;
+	case 5:
+		write_gicreg(val, ICH_LR5_EL2);
+		break;
+	case 6:
+		write_gicreg(val, ICH_LR6_EL2);
+		break;
+	case 7:
+		write_gicreg(val, ICH_LR7_EL2);
+		break;
+	case 8:
+		write_gicreg(val, ICH_LR8_EL2);
+		break;
+	case 9:
+		write_gicreg(val, ICH_LR9_EL2);
+		break;
+	case 10:
+		write_gicreg(val, ICH_LR10_EL2);
+		break;
+	case 11:
+		write_gicreg(val, ICH_LR11_EL2);
+		break;
+	case 12:
+		write_gicreg(val, ICH_LR12_EL2);
+		break;
+	case 13:
+		write_gicreg(val, ICH_LR13_EL2);
+		break;
+	case 14:
+		write_gicreg(val, ICH_LR14_EL2);
+		break;
+	case 15:
+		write_gicreg(val, ICH_LR15_EL2);
+		break;
+	}
+}
+
 void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 	u64 val;
-	u32 max_lr_idx, nr_pri_bits;
 
 	/*
 	 * Make sure stores to the GIC via the memory mapped interface
@@ -53,68 +145,58 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 	dsb(st);
 
 	cpu_if->vgic_vmcr  = read_gicreg(ICH_VMCR_EL2);
-	cpu_if->vgic_misr  = read_gicreg(ICH_MISR_EL2);
-	cpu_if->vgic_eisr  = read_gicreg(ICH_EISR_EL2);
-	cpu_if->vgic_elrsr = read_gicreg(ICH_ELSR_EL2);
 
-	write_gicreg(0, ICH_HCR_EL2);
-	val = read_gicreg(ICH_VTR_EL2);
-	max_lr_idx = vtr_to_max_lr_idx(val);
-	nr_pri_bits = vtr_to_nr_pri_bits(val);
+	if (vcpu->arch.vgic_cpu.live_lrs) {
+		int i;
+		u32 max_lr_idx, nr_pri_bits;
 
-	switch (max_lr_idx) {
-	case 15:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(15)] = read_gicreg(ICH_LR15_EL2);
-	case 14:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(14)] = read_gicreg(ICH_LR14_EL2);
-	case 13:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(13)] = read_gicreg(ICH_LR13_EL2);
-	case 12:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(12)] = read_gicreg(ICH_LR12_EL2);
-	case 11:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(11)] = read_gicreg(ICH_LR11_EL2);
-	case 10:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(10)] = read_gicreg(ICH_LR10_EL2);
-	case 9:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(9)] = read_gicreg(ICH_LR9_EL2);
-	case 8:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(8)] = read_gicreg(ICH_LR8_EL2);
-	case 7:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(7)] = read_gicreg(ICH_LR7_EL2);
-	case 6:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(6)] = read_gicreg(ICH_LR6_EL2);
-	case 5:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(5)] = read_gicreg(ICH_LR5_EL2);
-	case 4:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(4)] = read_gicreg(ICH_LR4_EL2);
-	case 3:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(3)] = read_gicreg(ICH_LR3_EL2);
-	case 2:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(2)] = read_gicreg(ICH_LR2_EL2);
-	case 1:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(1)] = read_gicreg(ICH_LR1_EL2);
-	case 0:
-		cpu_if->vgic_lr[VGIC_V3_LR_INDEX(0)] = read_gicreg(ICH_LR0_EL2);
-	}
+		cpu_if->vgic_misr  = read_gicreg(ICH_MISR_EL2);
+		cpu_if->vgic_eisr  = read_gicreg(ICH_EISR_EL2);
+		cpu_if->vgic_elrsr = read_gicreg(ICH_ELSR_EL2);
 
-	switch (nr_pri_bits) {
-	case 7:
-		cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
-		cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
-	case 6:
-		cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
-	default:
-		cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
-	}
+		write_gicreg(0, ICH_HCR_EL2);
+		val = read_gicreg(ICH_VTR_EL2);
+		max_lr_idx = vtr_to_max_lr_idx(val);
+		nr_pri_bits = vtr_to_nr_pri_bits(val);
 
-	switch (nr_pri_bits) {
-	case 7:
-		cpu_if->vgic_ap1r[3] = read_gicreg(ICH_AP1R3_EL2);
-		cpu_if->vgic_ap1r[2] = read_gicreg(ICH_AP1R2_EL2);
-	case 6:
-		cpu_if->vgic_ap1r[1] = read_gicreg(ICH_AP1R1_EL2);
-	default:
-		cpu_if->vgic_ap1r[0] = read_gicreg(ICH_AP1R0_EL2);
+		for (i = 0; i <= max_lr_idx; i++) {
+			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
+				cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
+		}
+
+		switch (nr_pri_bits) {
+		case 7:
+			cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
+			cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
+		case 6:
+			cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
+		default:
+			cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
+		}
+
+		switch (nr_pri_bits) {
+		case 7:
+			cpu_if->vgic_ap1r[3] = read_gicreg(ICH_AP1R3_EL2);
+			cpu_if->vgic_ap1r[2] = read_gicreg(ICH_AP1R2_EL2);
+		case 6:
+			cpu_if->vgic_ap1r[1] = read_gicreg(ICH_AP1R1_EL2);
+		default:
+			cpu_if->vgic_ap1r[0] = read_gicreg(ICH_AP1R0_EL2);
+		}
+
+		vcpu->arch.vgic_cpu.live_lrs = 0;
+	} else {
+		cpu_if->vgic_misr  = 0;
+		cpu_if->vgic_eisr  = 0;
+		cpu_if->vgic_elrsr = 0xffff;
+		cpu_if->vgic_ap0r[0] = 0;
+		cpu_if->vgic_ap0r[1] = 0;
+		cpu_if->vgic_ap0r[2] = 0;
+		cpu_if->vgic_ap0r[3] = 0;
+		cpu_if->vgic_ap1r[0] = 0;
+		cpu_if->vgic_ap1r[1] = 0;
+		cpu_if->vgic_ap1r[2] = 0;
+		cpu_if->vgic_ap1r[3] = 0;
 	}
 
 	val = read_gicreg(ICC_SRE_EL2);
@@ -128,6 +210,8 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu)
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 	u64 val;
 	u32 max_lr_idx, nr_pri_bits;
+	u16 live_lrs = 0;
+	int i;
 
 	/*
 	 * VFIQEn is RES1 if ICC_SRE_EL1.SRE is 1. This causes a
@@ -140,68 +224,51 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu)
 	write_gicreg(cpu_if->vgic_sre, ICC_SRE_EL1);
 	isb();
 
-	write_gicreg(cpu_if->vgic_hcr, ICH_HCR_EL2);
-	write_gicreg(cpu_if->vgic_vmcr, ICH_VMCR_EL2);
-
 	val = read_gicreg(ICH_VTR_EL2);
 	max_lr_idx = vtr_to_max_lr_idx(val);
 	nr_pri_bits = vtr_to_nr_pri_bits(val);
 
-	switch (nr_pri_bits) {
-	case 7:
-		 write_gicreg(cpu_if->vgic_ap0r[3], ICH_AP0R3_EL2);
-		 write_gicreg(cpu_if->vgic_ap0r[2], ICH_AP0R2_EL2);
-	case 6:
-		 write_gicreg(cpu_if->vgic_ap0r[1], ICH_AP0R1_EL2);
-	default:
-		 write_gicreg(cpu_if->vgic_ap0r[0], ICH_AP0R0_EL2);
+	for (i = 0; i <= max_lr_idx; i++) {
+		if (cpu_if->vgic_lr[i] & ICH_LR_STATE)
+			live_lrs |= (1 << i);
 	}
 
-	switch (nr_pri_bits) {
-	case 7:
-		 write_gicreg(cpu_if->vgic_ap1r[3], ICH_AP1R3_EL2);
-		 write_gicreg(cpu_if->vgic_ap1r[2], ICH_AP1R2_EL2);
-	case 6:
-		 write_gicreg(cpu_if->vgic_ap1r[1], ICH_AP1R1_EL2);
-	default:
-		 write_gicreg(cpu_if->vgic_ap1r[0], ICH_AP1R0_EL2);
-	}	 	                           
+	write_gicreg(cpu_if->vgic_vmcr, ICH_VMCR_EL2);
+
+	if (live_lrs) {
+		write_gicreg(cpu_if->vgic_hcr, ICH_HCR_EL2);
+
+		switch (nr_pri_bits) {
+		case 7:
+			write_gicreg(cpu_if->vgic_ap0r[3], ICH_AP0R3_EL2);
+			write_gicreg(cpu_if->vgic_ap0r[2], ICH_AP0R2_EL2);
+		case 6:
+			write_gicreg(cpu_if->vgic_ap0r[1], ICH_AP0R1_EL2);
+		default:
+			write_gicreg(cpu_if->vgic_ap0r[0], ICH_AP0R0_EL2);
+		}
+
+		switch (nr_pri_bits) {
+		case 7:
+			write_gicreg(cpu_if->vgic_ap1r[3], ICH_AP1R3_EL2);
+			write_gicreg(cpu_if->vgic_ap1r[2], ICH_AP1R2_EL2);
+		case 6:
+			write_gicreg(cpu_if->vgic_ap1r[1], ICH_AP1R1_EL2);
+		default:
+			write_gicreg(cpu_if->vgic_ap1r[0], ICH_AP1R0_EL2);
+		}
 		 	                           
-	switch (max_lr_idx) {
-	case 15:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(15)], ICH_LR15_EL2);
-	case 14:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(14)], ICH_LR14_EL2);
-	case 13:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(13)], ICH_LR13_EL2);
-	case 12:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(12)], ICH_LR12_EL2);
-	case 11:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(11)], ICH_LR11_EL2);
-	case 10:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(10)], ICH_LR10_EL2);
-	case 9:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(9)], ICH_LR9_EL2);
-	case 8:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(8)], ICH_LR8_EL2);
-	case 7:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(7)], ICH_LR7_EL2);
-	case 6:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(6)], ICH_LR6_EL2);
-	case 5:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(5)], ICH_LR5_EL2);
-	case 4:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(4)], ICH_LR4_EL2);
-	case 3:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(3)], ICH_LR3_EL2);
-	case 2:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(2)], ICH_LR2_EL2);
-	case 1:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(1)], ICH_LR1_EL2);
-	case 0:
-		write_gicreg(cpu_if->vgic_lr[VGIC_V3_LR_INDEX(0)], ICH_LR0_EL2);
+		for (i = 0; i <= max_lr_idx; i++) {
+			val = 0;
+
+			if (live_lrs & (1 << i))
+				val = cpu_if->vgic_lr[i];
+
+			__gic_v3_set_lr(val, i);
+		}
 	}
 
+
 	/*
 	 * Ensures that the above will have reached the
 	 * (re)distributors. This ensure the guest will read the
@@ -209,6 +276,7 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu)
 	 */
 	isb();
 	dsb(sy);
+	vcpu->arch.vgic_cpu.live_lrs = live_lrs;
 
 	/*
 	 * Prevent the guest from touching the GIC system registers if
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index f473fd6..281caf8 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -279,12 +279,6 @@ struct vgic_v2_cpu_if {
 	u32		vgic_lr[VGIC_V2_MAX_LRS];
 };
 
-/*
- * LRs are stored in reverse order in memory. make sure we index them
- * correctly.
- */
-#define VGIC_V3_LR_INDEX(lr)		(VGIC_V3_MAX_LRS - 1 - lr)
-
 struct vgic_v3_cpu_if {
 #ifdef CONFIG_KVM_ARM_VGIC_V3
 	u32		vgic_hcr;
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 453eafd..11b5ff6 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -42,7 +42,7 @@ static u32 ich_vtr_el2;
 static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
 {
 	struct vgic_lr lr_desc;
-	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[VGIC_V3_LR_INDEX(lr)];
+	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[lr];
 
 	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
 		lr_desc.irq = val & ICH_LR_VIRTUALID_MASK;
@@ -106,7 +106,7 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
 		lr_val |= ((u64)lr_desc.hwirq) << ICH_LR_PHYS_ID_SHIFT;
 	}
 
-	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[VGIC_V3_LR_INDEX(lr)] = lr_val;
+	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[lr] = lr_val;
 
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr);
-- 
2.1.4


* [PATCH v2 14/17] arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (12 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 13/17] arm64: KVM: vgic-v3: Avoid accessing ICH registers Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-03 19:21   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 15/17] arm64: KVM: vgic-v3: Do not save an LR known to be empty Marc Zyngier
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

Next on our list of useless accesses is the maintenance interrupt
status registers (ICH_MISR_EL2, ICH_EISR_EL2).

It is pointless to save them if we haven't asked for a maintenance
interrupt in the first place, which can only happen for two reasons:
- Underflow: ICH_HCR_UIE will be set,
- EOI: ICH_LR_EOI will be set.

These conditions can be checked on the in-memory copies of the
registers. Should either of these conditions be true, we must read
ICH_MISR_EL2. We can then check for ICH_MISR_EOI, and only when it is
set read ICH_EISR_EL2.

This means that in most cases, we don't have to save them at all.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index b62c923..3463c0a 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -133,6 +133,36 @@ static void __hyp_text __gic_v3_set_lr(u64 val, int lr)
 	}
 }
 
+static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu, int nr_lr)
+{
+	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
+	int i;
+	bool expect_mi;
+
+	expect_mi = !!(cpu_if->vgic_hcr & ICH_HCR_UIE);
+
+	for (i = 0; i < nr_lr; i++) {
+		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
+				continue;
+
+		expect_mi |= (!(cpu_if->vgic_lr[i] & ICH_LR_HW) &&
+			      (cpu_if->vgic_lr[i] & ICH_LR_EOI));
+	}
+
+	if (expect_mi) {
+		cpu_if->vgic_misr  = read_gicreg(ICH_MISR_EL2);
+
+		if (cpu_if->vgic_misr & ICH_MISR_EOI) {
+			cpu_if->vgic_eisr = read_gicreg(ICH_EISR_EL2);
+		} else {
+			cpu_if->vgic_eisr = 0;
+		}
+	} else {
+		cpu_if->vgic_misr = 0;
+		cpu_if->vgic_eisr = 0;
+	}
+}
+
 void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
@@ -150,8 +180,6 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 		int i;
 		u32 max_lr_idx, nr_pri_bits;
 
-		cpu_if->vgic_misr  = read_gicreg(ICH_MISR_EL2);
-		cpu_if->vgic_eisr  = read_gicreg(ICH_EISR_EL2);
 		cpu_if->vgic_elrsr = read_gicreg(ICH_ELSR_EL2);
 
 		write_gicreg(0, ICH_HCR_EL2);
@@ -159,6 +187,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 		max_lr_idx = vtr_to_max_lr_idx(val);
 		nr_pri_bits = vtr_to_nr_pri_bits(val);
 
+		save_maint_int_state(vcpu, max_lr_idx + 1);
+
 		for (i = 0; i <= max_lr_idx; i++) {
 			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
 				cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
-- 
2.1.4


* [PATCH v2 15/17] arm64: KVM: vgic-v3: Do not save an LR known to be empty
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (13 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 14/17] arm64: KVM: vgic-v3: Save maintenance interrupt state only if required Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-03 19:21   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 16/17] arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit Marc Zyngier
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

On exit, any empty LR will be signaled in ICH_ELRSR_EL2, which
means that we do not have to save it; we can simply clear
its state in the in-memory copy.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 3463c0a..ff67296 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -190,8 +190,15 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 		save_maint_int_state(vcpu, max_lr_idx + 1);
 
 		for (i = 0; i <= max_lr_idx; i++) {
-			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
-				cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
+			if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
+				continue;
+
+			if (cpu_if->vgic_elrsr & (1 << i)) {
+				cpu_if->vgic_lr[i] &= ~ICH_LR_STATE;
+				continue;
+			}
+
+			cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
 		}
 
 		switch (nr_pri_bits) {
-- 
2.1.4

* [PATCH v2 16/17] arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (14 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 15/17] arm64: KVM: vgic-v3: Do not save an LR known to be empty Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-03 19:21   ` Christoffer Dall
  2016-02-17 16:40 ` [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation Marc Zyngier
  2016-02-29  0:57 ` [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Mihai Claudiu Caraman
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

So far, we're always writing all possible LRs, setting the empty
ones to a zero value. This is obviously doing a lot of work for
nothing, and we're better off clearing only those we've actually
dirtied on the exit path (it is very rare to inject more than one
interrupt at a time anyway).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index ff67296..5f12c57 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -199,6 +199,7 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 			}
 
 			cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
+			__gic_v3_set_lr(0, i);
 		}
 
 		switch (nr_pri_bits) {
@@ -296,12 +297,10 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu)
 		}
 		 	                           
 		for (i = 0; i <= max_lr_idx; i++) {
-			val = 0;
-
-			if (live_lrs & (1 << i))
-				val = cpu_if->vgic_lr[i];
+			if (!(live_lrs & (1 << i)))
+				continue;
 
-			__gic_v3_set_lr(val, i);
+			__gic_v3_set_lr(cpu_if->vgic_lr[i], i);
 		}
 	}
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (15 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 16/17] arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit Marc Zyngier
@ 2016-02-17 16:40 ` Marc Zyngier
  2016-03-03 19:21   ` Christoffer Dall
  2016-02-29  0:57 ` [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Mihai Claudiu Caraman
  17 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-17 16:40 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

The GICv3 specification mandates that the ICH_AP0Rn_EL2 registers are
set to zero when running guests that use the memory-mapped registers.

This is fine, as we initialize all ICH_AP0Rn_EL2 registers to
zero, and restore them on entry. But it also means that we
do not need to save these registers on exit. Profit!

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 5f12c57..c2f173d 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -202,14 +202,17 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 			__gic_v3_set_lr(0, i);
 		}
 
-		switch (nr_pri_bits) {
-		case 7:
-			cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
-			cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
-		case 6:
-			cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
-		default:
-			cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
+		/* ICH_AP0Rn is only valid for SRE==1 */
+		if (cpu_if->vgic_sre & ICC_SRE_EL1_SRE) {
+			switch (nr_pri_bits) {
+			case 7:
+				cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
+				cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
+			case 6:
+				cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
+			default:
+				cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
+			}
 		}
 
 		switch (nr_pri_bits) {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* RE: [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations
  2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
                   ` (16 preceding siblings ...)
  2016-02-17 16:40 ` [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation Marc Zyngier
@ 2016-02-29  0:57 ` Mihai Claudiu Caraman
  2016-02-29  8:26   ` Marc Zyngier
  17 siblings, 1 reply; 37+ messages in thread
From: Mihai Claudiu Caraman @ 2016-02-29  0:57 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	kvmarm@lists.cs.columbia.edu

Reported-by: Mihai Caraman <mihai.caraman@freescale.com>
Tested-by: Mihai Caraman <mihai.caraman@freescale.com>

40% improvements here and there will make the difference. 

Thanks,
Mike

> -----Original Message-----
> From: kvmarm-bounces@lists.cs.columbia.edu [mailto:kvmarm-bounces@lists.cs.columbia.edu] On Behalf Of Marc Zyngier
> Sent: Wednesday, February 17, 2016 6:41 PM
> To: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: kvm@vger.kernel.org; linux-arm-kernel@lists.infradead.org; kvmarm@lists.cs.columbia.edu
> Subject: [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations
> 
> I've recently been looking at our entry/exit costs, and profiling figures did show some very low-hanging fruit.
> 
> The most obvious cost is that accessing the GIC HW is slow. As in "deadly slow", especially when GICv2 is involved. So not hammering the HW when there is nothing to write (and even to read) is immediately beneficial, as this is the most common case (whatever people seem to think, interrupts are a *rare* event). Similar work has also been done for GICv3, with a reduced impact (it was less "bad" to start with).
> 
> Another easy thing to fix is the way we handle trapped system registers. We do insist on (mostly) sorting them, but we do perform a linear search on trap. We can switch to a binary search for free, and get immediate benefits (the PMU code, being extremely trap-happy, benefits immediately from this).
> 
> With these in place, I see an improvement of 10 to 40% (depending on the platform) on our world-switch cycle count when running a set of hand-crafted guests that are designed to only perform traps.
> 
> Please note that VM exits are actually a rare event on ARM. So don't expect your guest to be 40% faster; this will hardly make a noticeable difference.
> 
> Methodology:
> 
> * NULL-hypercall guest: Perform 2^20 PSCI_0_2_FN_PSCI_VERSION calls, and then a power-off:
> 
> __start:
> 	mov	x19, #(1 << 16)
> 1:	mov	x0, #0x84000000
> 	hvc	#0
> 	sub	x19, x19, #1
> 	cbnz	x19, 1b
> 	mov	x0, #0x84000000
> 	add	x0, x0, #9
> 	hvc	#0
> 	b	.
> 
> * Self IPI guest: Inject and handle 2^20 SGI0 using GICv2 or GICv3, and then power-off:
> 
> __start:
> 	mov	x19, #(1 << 20)
> 
> 	mrs	x0, id_aa64pfr0_el1
> 	ubfx	x0, x0, #24, #4
> 	and	x0, x0, #0xf
> 	cbz	x0, do_v2
> 
> 	mrs	x0, s3_0_c12_c12_5	// ICC_SRE_EL1
> 	and	x0, x0, #1		// SRE bit
> 	cbnz	x0, do_v3
> 
> do_v2:
> 	mov	x0, #0x3fff0000		// Dist
> 	mov	x1, #0x3ffd0000		// CPU
> 	mov	w2, #1
> 	str	w2, [x0]		// Enable Group0
> 	ldr	w2, =0xa0a0a0a0
> 	str	w2, [x0, 0x400]		// A0 priority for SGI0-3
> 	mov	w2, #0x0f
> 	str	w2, [x0, #0x100]	// Enable SGI0-3
> 	mov	w2, #0xf0
> 	str	w2, [x1, #4]		// PMR
> 	mov	w2, #1
> 	str	w2, [x1]		// Enable CPU interface
> 	
> 1:
> 	mov	w2, #(2 << 24)		// Interrupt self with SGI0
> 	str	w2, [x0, #0xf00]
> 
> 2:	ldr	w2, [x1, #0x0c]		// GICC_IAR
> 	cmp	w2, #0x3ff
> 	b.ne	3f
> 
> 	wfi
> 	b	2b
> 
> 3:	str	w2, [x1, #0x10]		// EOI
> 
> 	sub	x19, x19, #1
> 	cbnz	x19, 1b
> 
> die:
> 	mov	x0, #0x84000000
> 	add	x0, x0, #9
> 	hvc	#0
> 	b	.
> 
> do_v3:
> 	mov	x0, #0x3fff0000		// Dist
> 	mov	x1, #0x3fbf0000		// Redist 0
> 	mov	x2, #0x10000
> 	add	x1, x1, x2		// SGI page
> 	mov	w2, #2
> 	str	w2, [x0]		// Enable Group1
> 	ldr	w2, =0xa0a0a0a0
> 	str	w2, [x1, 0x400]		// A0 priority for SGI0-3
> 	mov	w2, #0x0f
> 	str	w2, [x1, #0x100]	// Enable SGI0-3
> 	mov	w2, #0xf0
> 	msr	S3_0_c4_c6_0, x2	// PMR
> 	mov	w2, #1
> 	msr	S3_0_C12_C12_7, x2	// Enable Group1
> 
> 1:
> 	mov	x2, #1
> 	msr	S3_0_c12_c11_5, x2	// Self SGI0
> 
> 2:	mrs	x2, S3_0_c12_c12_0	// Read IAR1
> 	cmp	w2, #0x3ff
> 	b.ne	3f
> 
> 	wfi
> 	b	2b
> 
> 3:	msr	S3_0_c12_c12_1, x2	// EOI
> 
> 	sub	x19, x19, #1
> 	cbnz	x19, 1b
> 
> 	b	die
> 
> * sysreg trap guest: Perform 2^20 PMSELR_EL0 accesses, and power-off:
> 
> __start:
> 	mov	x19, #(1 << 20)
> 1:	mrs	x0, PMSELR_EL0
> 	sub	x19, x19, #1
> 	cbnz	x19, 1b
> 	mov	x0, #0x84000000
> 	add	x0, x0, #9
> 	hvc	#0
> 	b	.
> 
> * These guests are profiled using perf and kvmtool:
> 
> taskset -c 1 perf stat -e cycles:kh lkvm run -c1 --kernel do_sysreg.bin 2>&1 >/dev/null| grep cycles
> 
> The result is then divided by the number of iterations (2^20).
> 
> These tests have been run on three different platforms (two GICv2 based, and one with GICv3 and legacy mode) and have shown significant improvements in all cases. I've only touched the arm64 GIC code, but obviously the 32bit code should use it as well once we've migrated it to C.
> 
> Vanilla v4.5-rc4
> 	     A             B            C-v2         C-v3
> Null HVC:   8462          6566          6572         6505
> Self SGI:  11961          8690          9541         8629
> SysReg:     8952          6979          7212         7180
> 
> Patched v4.5-rc4
> 	     A             B            C-v2         C-v3
> Null HVC:   5219  -38%    3957  -39%    5175  -21%   5158  -20%
> Self SGI:   8946  -25%    6658  -23%    8547  -10%   7299  -15%
> SysReg:     5314  -40%    4190  -40%    5417  -25%   5414  -24%
> 
> I've pushed out a branch (kvm-arm64/suck-less) to the usual location, based on -rc4 + a few fixes I also posted today.
> 
> Thanks,
> 
> 	M.
> 
> * From v1:
>   - Fixed a nasty bug dealing with the active Priority Register
>   - Maintenance interrupt lazy saving
>   - More LR hackery
>   - Adapted most of the series for GICv3 as well
> 
> Marc Zyngier (17):
>   arm64: KVM: Switch the sys_reg search to be a binary search
>   ARM: KVM: Properly sort the invariant table
>   ARM: KVM: Enforce sorting of all CP tables
>   ARM: KVM: Rename struct coproc_reg::is_64 to is_64bit
>   ARM: KVM: Switch the CP reg search to be a binary search
>   KVM: arm/arm64: timer: Add active state caching
>   arm64: KVM: vgic-v2: Avoid accessing GICH registers
>   arm64: KVM: vgic-v2: Save maintenance interrupt state only if required
>   arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function
>   arm64: KVM: vgic-v2: Do not save an LR known to be empty
>   arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
>   arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit
>   arm64: KVM: vgic-v3: Avoid accessing ICH registers
>   arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
>   arm64: KVM: vgic-v3: Do not save an LR known to be empty
>   arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
>   arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation
> 
>  arch/arm/kvm/arm.c              |   1 +
>  arch/arm/kvm/coproc.c           |  74 +++++----
>  arch/arm/kvm/coproc.h           |   8 +-
>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 144 +++++++++++++----
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 333 ++++++++++++++++++++++++++--------------
>  arch/arm64/kvm/sys_regs.c       |  40 ++---
>  include/kvm/arm_arch_timer.h    |   5 +
>  include/kvm/arm_vgic.h          |   8 +-
>  virt/kvm/arm/arch_timer.c       |  31 ++++
>  virt/kvm/arm/vgic-v2-emul.c     |  10 +-
>  virt/kvm/arm/vgic-v3.c          |   4 +-
>  11 files changed, 452 insertions(+), 206 deletions(-)
> 
> --
> 2.1.4
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations
  2016-02-29  0:57 ` [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Mihai Claudiu Caraman
@ 2016-02-29  8:26   ` Marc Zyngier
  2016-02-29 10:43     ` Mihai Claudiu Caraman
  0 siblings, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-02-29  8:26 UTC (permalink / raw)
  To: Mihai Claudiu Caraman, Christoffer Dall
  Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu

On 29/02/16 00:57, Mihai Claudiu Caraman wrote:
> Reported-by: Mihai Caraman <mihai.caraman@freescale.com>

Reported-by? Where? Did I miss an email (wouldn't surprise me)?

> Tested-by: Mihai Caraman <mihai.caraman@freescale.com>
> 
> 40% improvements here and there will make the difference. 

I'm sceptical that you will see a significant improvement on any
*realistic* workload, but thanks for testing!

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations
  2016-02-29  8:26   ` Marc Zyngier
@ 2016-02-29 10:43     ` Mihai Claudiu Caraman
  0 siblings, 0 replies; 37+ messages in thread
From: Mihai Claudiu Caraman @ 2016-02-29 10:43 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu

> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com] 
> Sent: Monday, February 29, 2016 10:26 AM
> To: Mihai Claudiu Caraman <mike.caraman@nxp.com>; Christoffer Dall <christoffer.dall@linaro.org>
> Cc: kvm@vger.kernel.org; linux-arm-kernel@lists.infradead.org; kvmarm@lists.cs.columbia.edu
> Subject: Re: [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations
> 
> On 29/02/16 00:57, Mihai Claudiu Caraman wrote:
> > Reported-by: Mihai Caraman <mihai.caraman@freescale.com>
> 
> Reported-by? Where? Did I miss an email (wouldn't surprise me)?

Private meeting between ARM and Freescale/NXP, it was me.

> 
> > Tested-by: Mihai Caraman <mihai.caraman@freescale.com>
> > 
> > 40% improvements here and there will make the difference. 
> 
> I'm sceptical that you will see a significant improvement

1% improvement with this patch is good enough.

> on any *realistic* workload

You will hear about this sooner than you think ;-)

> , but thanks for testing!

Thanks for the patch, we plan to contribute as well.

> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...

-Mike

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers
  2016-02-17 16:40 ` [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers Marc Zyngier
@ 2016-03-02 23:08   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-02 23:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:39PM +0000, Marc Zyngier wrote:
> GICv2 registers are *slow*. As in "terrifyingly slow". Which is bad.
> But we're equally bad, as we make a point of accessing them even if
> we don't have any interrupt in flight.
> 
> A good solution is to first find out if we have anything useful to
> write into the GIC, and if we don't, to simply not do it. This
> involves tracking which LRs actually have something valid there.
> 
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

nice find with the APR in this one; my Reviewed-by is still valid here.

Thanks,
-Christoffer

> ---
>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 72 ++++++++++++++++++++++++++++-------------
>  include/kvm/arm_vgic.h          |  2 ++
>  2 files changed, 52 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> index e717612..5ab8d63 100644
> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> @@ -38,28 +38,41 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  
>  	nr_lr = vcpu->arch.vgic_cpu.nr_lr;
>  	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
> -	cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
> -	eisr0  = readl_relaxed(base + GICH_EISR0);
> -	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> -	if (unlikely(nr_lr > 32)) {
> -		eisr1  = readl_relaxed(base + GICH_EISR1);
> -		elrsr1 = readl_relaxed(base + GICH_ELRSR1);
> -	} else {
> -		eisr1 = elrsr1 = 0;
> -	}
> +
> +	if (vcpu->arch.vgic_cpu.live_lrs) {
> +		eisr0  = readl_relaxed(base + GICH_EISR0);
> +		elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> +		cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
> +		cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
> +
> +		if (unlikely(nr_lr > 32)) {
> +			eisr1  = readl_relaxed(base + GICH_EISR1);
> +			elrsr1 = readl_relaxed(base + GICH_ELRSR1);
> +		} else {
> +			eisr1 = elrsr1 = 0;
> +		}
> +
>  #ifdef CONFIG_CPU_BIG_ENDIAN
> -	cpu_if->vgic_eisr  = ((u64)eisr0 << 32) | eisr1;
> -	cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
> +		cpu_if->vgic_eisr  = ((u64)eisr0 << 32) | eisr1;
> +		cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
>  #else
> -	cpu_if->vgic_eisr  = ((u64)eisr1 << 32) | eisr0;
> -	cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
> +		cpu_if->vgic_eisr  = ((u64)eisr1 << 32) | eisr0;
> +		cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
>  #endif
> -	cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
>  
> -	writel_relaxed(0, base + GICH_HCR);
> +		for (i = 0; i < nr_lr; i++)
> +			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
> +				cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
>  
> -	for (i = 0; i < nr_lr; i++)
> -		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> +		writel_relaxed(0, base + GICH_HCR);
> +
> +		vcpu->arch.vgic_cpu.live_lrs = 0;
> +	} else {
> +		cpu_if->vgic_eisr = 0;
> +		cpu_if->vgic_elrsr = ~0UL;
> +		cpu_if->vgic_misr = 0;
> +		cpu_if->vgic_apr = 0;
> +	}
>  }
>  
>  /* vcpu is already in the HYP VA space */
> @@ -70,15 +83,30 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
>  	struct vgic_dist *vgic = &kvm->arch.vgic;
>  	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
>  	int i, nr_lr;
> +	u64 live_lrs = 0;
>  
>  	if (!base)
>  		return;
>  
> -	writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
> -	writel_relaxed(cpu_if->vgic_vmcr, base + GICH_VMCR);
> -	writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
> -
>  	nr_lr = vcpu->arch.vgic_cpu.nr_lr;
> +
>  	for (i = 0; i < nr_lr; i++)
> -		writel_relaxed(cpu_if->vgic_lr[i], base + GICH_LR0 + (i * 4));
> +		if (cpu_if->vgic_lr[i] & GICH_LR_STATE)
> +			live_lrs |= 1UL << i;
> +
> +	if (live_lrs) {
> +		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
> +		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
> +		for (i = 0; i < nr_lr; i++) {
> +			u32 val = 0;
> +
> +			if (live_lrs & (1UL << i))
> +				val = cpu_if->vgic_lr[i];
> +
> +			writel_relaxed(val, base + GICH_LR0 + (i * 4));
> +		}
> +	}
> +
> +	writel_relaxed(cpu_if->vgic_vmcr, base + GICH_VMCR);
> +	vcpu->arch.vgic_cpu.live_lrs = live_lrs;
>  }
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 13a3d53..f473fd6 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -321,6 +321,8 @@ struct vgic_cpu {
>  
>  	/* Protected by the distributor's irq_phys_map_lock */
>  	struct list_head	irq_phys_map_list;
> +
> +	u64		live_lrs;
>  };
>  
>  #define LR_EMPTY	0xff
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required
  2016-02-17 16:40 ` [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required Marc Zyngier
@ 2016-03-02 23:08   ` Christoffer Dall
  2016-03-03  8:28     ` Marc Zyngier
  0 siblings, 1 reply; 37+ messages in thread
From: Christoffer Dall @ 2016-03-02 23:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, linux-arm-kernel, kvmarm

On Wed, Feb 17, 2016 at 04:40:40PM +0000, Marc Zyngier wrote:
> Next on our list of useless accesses is the maintenance interrupt
> status registers (GICH_MISR, GICH_EISR{0,1}).
> 
> It is pointless to save them if we haven't asked for a maintenance
> interrupt in the first place, which can only happen for two reasons:
> - Underflow: GICH_HCR_UIE will be set,
> - EOI: GICH_LR_EOI will be set.
> 
> These conditions can be checked on the in-memory copies of the regs.
> Should any of these two condition be valid, we must read GICH_MISR.
> We can then check for GICH_MISR_EOI, and only when set read
> GICH_EISR*.
> 
> This means that in most cases, we don't have to save them at all.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 54 +++++++++++++++++++++++++++++++++++------
>  1 file changed, 47 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> index 5ab8d63..1bda5ce 100644
> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> @@ -23,6 +23,49 @@
>  
>  #include "hyp.h"
>  
> +static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu,
> +					    void __iomem *base)
> +{
> +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> +	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
> +	u32 eisr0, eisr1;
> +	int i;
> +	bool expect_mi;
> +
> +	expect_mi = !!(cpu_if->vgic_hcr & GICH_HCR_UIE);
> +
> +	for (i = 0; i < nr_lr; i++) {
> +		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
> +				continue;
> +
> +		expect_mi |= (!(cpu_if->vgic_lr[i] & GICH_LR_HW) &&
> +			      (cpu_if->vgic_lr[i] & GICH_LR_EOI));
> +	}

Just eyeballing the code, it really feels crazy that this is faster than
reading two registers, but I believe that may just be the case given the
speed of the GIC.

As an alternative to this loop, you could keep a counter for the number
of requested EOI MIs and whenever we program an LR with the EOI bit set,
then we increment the counter, and whenever we clear such an LR, we
decrement the counter, and then you can just check here if it's
non-zero.  What do you think?  Is it worth it?  Does it make the code
even worse?

I can also write that on top of this patch if you'd like.
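For what it's worth, the counter idea can be sketched in isolation. Everything below (the constant values, struct layout, and function names) is invented for illustration and is not the kernel's actual code; it only mirrors the roles of GICH_LR_HW, GICH_LR_EOI and GICH_HCR_UIE:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values; the real definitions live in the kernel's GIC headers. */
#define LR_HW   (1u << 31)
#define LR_EOI  (1u << 19)
#define HCR_UIE (1u << 1)

struct vgic_sketch {
	uint32_t hcr;
	uint32_t lr[4];
	int eoi_mi_count;	/* LRs currently asking for an EOI maintenance interrupt */
};

/* Program an LR, keeping the counter in sync: a software (non-HW) LR with
 * the EOI bit set requests a maintenance interrupt on completion. */
static void set_lr(struct vgic_sketch *s, int i, uint32_t val)
{
	bool was = !(s->lr[i] & LR_HW) && (s->lr[i] & LR_EOI);
	bool now = !(val & LR_HW) && (val & LR_EOI);

	s->eoi_mi_count += (int)now - (int)was;
	s->lr[i] = val;
}

/* Replaces the per-LR scan on the save path: a maintenance interrupt is
 * only possible if underflow was requested or the counter is non-zero. */
static bool expect_mi(const struct vgic_sketch *s)
{
	return (s->hcr & HCR_UIE) || s->eoi_mi_count;
}
```

Whether the extra bookkeeping on the program/clear paths ends up cheaper than the existing loop is exactly the trade-off in question, since the loop only walks live_lrs, which is almost always a single bit.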

In any case, for this functionality:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

> +
> +	if (expect_mi) {
> +		cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
> +
> +		if (cpu_if->vgic_misr & GICH_MISR_EOI) {
> +			eisr0  = readl_relaxed(base + GICH_EISR0);
> +			if (unlikely(nr_lr > 32))
> +				eisr1  = readl_relaxed(base + GICH_EISR1);
> +			else
> +				eisr1 = 0;
> +		} else {
> +			eisr0 = eisr1 = 0;
> +		}
> +	} else {
> +		cpu_if->vgic_misr = 0;
> +		eisr0 = eisr1 = 0;
> +	}
> +
> +#ifdef CONFIG_CPU_BIG_ENDIAN
> +	cpu_if->vgic_eisr = ((u64)eisr0 << 32) | eisr1;
> +#else
> +	cpu_if->vgic_eisr = ((u64)eisr1 << 32) | eisr0;
> +#endif
> +}
> +
>  /* vcpu is already in the HYP VA space */
>  void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  {
> @@ -30,7 +73,7 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
>  	struct vgic_dist *vgic = &kvm->arch.vgic;
>  	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
> -	u32 eisr0, eisr1, elrsr0, elrsr1;
> +	u32 elrsr0, elrsr1;
>  	int i, nr_lr;
>  
>  	if (!base)
> @@ -40,26 +83,23 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
>  
>  	if (vcpu->arch.vgic_cpu.live_lrs) {
> -		eisr0  = readl_relaxed(base + GICH_EISR0);
>  		elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> -		cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR);
>  		cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
>  
>  		if (unlikely(nr_lr > 32)) {
> -			eisr1  = readl_relaxed(base + GICH_EISR1);
>  			elrsr1 = readl_relaxed(base + GICH_ELRSR1);
>  		} else {
> -			eisr1 = elrsr1 = 0;
> +			elrsr1 = 0;
>  		}
>  
>  #ifdef CONFIG_CPU_BIG_ENDIAN
> -		cpu_if->vgic_eisr  = ((u64)eisr0 << 32) | eisr1;
>  		cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
>  #else
> -		cpu_if->vgic_eisr  = ((u64)eisr1 << 32) | eisr0;
>  		cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
>  #endif
>  
> +		save_maint_int_state(vcpu, base);
> + 
>  		for (i = 0; i < nr_lr; i++)
>  			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
>  				cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function
  2016-02-17 16:40 ` [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function Marc Zyngier
@ 2016-03-02 23:08   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-02 23:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:41PM +0000, Marc Zyngier wrote:
> In order to make the saving path slightly more readable and
> prepare for some more optimizations, let's more the GICH_ELRSR

s/more/move/

> saving to its own function.
> 
> No functional change.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 36 +++++++++++++++++++++---------------
>  1 file changed, 21 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> index 1bda5ce..c281374 100644
> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> @@ -66,6 +66,25 @@ static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu,
>  #endif
>  }
>  
> +static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
> +{
> +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> +	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
> +	u32 elrsr0, elrsr1;
> +
> +	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> +	if (unlikely(nr_lr > 32))
> +		elrsr1 = readl_relaxed(base + GICH_ELRSR1);
> +	else
> +		elrsr1 = 0;
> +
> +#ifdef CONFIG_CPU_BIG_ENDIAN
> +	cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
> +#else
> +	cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
> +#endif
> +}
> +
>  /* vcpu is already in the HYP VA space */
>  void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  {
> @@ -73,7 +92,6 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
>  	struct vgic_dist *vgic = &kvm->arch.vgic;
>  	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
> -	u32 elrsr0, elrsr1;
>  	int i, nr_lr;
>  
>  	if (!base)
> @@ -83,22 +101,10 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
>  
>  	if (vcpu->arch.vgic_cpu.live_lrs) {
> -		elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> -		cpu_if->vgic_apr    = readl_relaxed(base + GICH_APR);
> -
> -		if (unlikely(nr_lr > 32)) {
> -			elrsr1 = readl_relaxed(base + GICH_ELRSR1);
> -		} else {
> -			elrsr1 = 0;
> -		}
> -
> -#ifdef CONFIG_CPU_BIG_ENDIAN
> -		cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
> -#else
> -		cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
> -#endif
> +		cpu_if->vgic_apr = readl_relaxed(base + GICH_APR);
>  
>  		save_maint_int_state(vcpu, base);
> +		save_elrsr(vcpu, base);
>   
>  		for (i = 0; i < nr_lr; i++)
>  			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
> -- 
> 2.1.4
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty
  2016-02-17 16:40 ` [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty Marc Zyngier
@ 2016-03-02 23:08   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-02 23:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:42PM +0000, Marc Zyngier wrote:
> On exit, any empty LR will be signaled in GICH_ELRSR*. Which
> means that we do not have to save it, and we can just clear
> its state in the in-memory copy.
> 
> Take this opportunity to move the LR saving code into its
> own function.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> index c281374..3dbbc6b 100644
> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> @@ -85,6 +85,25 @@ static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
>  #endif
>  }
>  
> +static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
> +{
> +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> +	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
> +	int i;
> +
> +	for (i = 0; i < nr_lr; i++) {
> +		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
> +			continue;
> +
> +		if (cpu_if->vgic_elrsr & (1UL << i)) {
> +			cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
> +			continue;
> +		}
> +
> +		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> +	}
> +}
> +
>  /* vcpu is already in the HYP VA space */
>  void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  {
> @@ -92,12 +111,10 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
>  	struct vgic_dist *vgic = &kvm->arch.vgic;
>  	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
> -	int i, nr_lr;
>  
>  	if (!base)
>  		return;
>  
> -	nr_lr = vcpu->arch.vgic_cpu.nr_lr;
>  	cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR);
>  
>  	if (vcpu->arch.vgic_cpu.live_lrs) {
> @@ -105,10 +122,7 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  
>  		save_maint_int_state(vcpu, base);
>  		save_elrsr(vcpu, base);
> - 
> -		for (i = 0; i < nr_lr; i++)
> -			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
> -				cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> +		save_lrs(vcpu, base);
>  
>  		writel_relaxed(0, base + GICH_HCR);
>  
> -- 
> 2.1.4
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  2016-02-17 16:40 ` [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit Marc Zyngier
@ 2016-03-02 23:08   ` Christoffer Dall
  2016-03-03  8:14     ` Marc Zyngier
  2016-03-03 15:58     ` Marc Zyngier
  0 siblings, 2 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-02 23:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:43PM +0000, Marc Zyngier wrote:
> So far, we're always writing all possible LRs, setting the empty
> ones with a zero value. This is obvious doing a low of work for

s/low/lot/

> nothing, and we're better off clearing those we've actually
> dirtied on the exit path (it is very rare to inject more than one
> interrupt at a time anyway).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> index 3dbbc6b..e53f131 100644
> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> @@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
>  		}
>  
>  		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
>  	}
>  }
>  
> @@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
>  		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
>  		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
>  		for (i = 0; i < nr_lr; i++) {
> -			u32 val = 0;
> -
> -			if (live_lrs & (1UL << i))
> -				val = cpu_if->vgic_lr[i];
> +			if (!(live_lrs & (1UL << i)))
> +				continue;

how can we be sure that the LRs are clear when we launch our first VM on
a given physical CPU?  Don't we need to flush the LRs during VGIC init
time?
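One possible shape for such an init-time flush, as a standalone sketch: the buffer-backed writel_relaxed() stand-in and the function name are invented here, not the kernel's actual implementation (only the GICH_LR0 offset of 0x100 matches the GICv2 architecture):

```c
#include <stdint.h>

#define GICH_LR0 0x100	/* offset of the first List Register in the GICH frame */

/* Stand-in for the kernel's writel_relaxed(); it just stores to a plain
 * buffer so the sketch is self-contained and testable. */
static void writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Hypothetical one-time wipe of all LRs when the vgic comes up on a CPU,
 * so the lazy restore path can assume unused LRs already read as zero. */
static void vgic_v2_init_lrs(volatile uint32_t *base, int nr_lr)
{
	int i;

	for (i = 0; i < nr_lr; i++)
		writel_relaxed(0, base + GICH_LR0 / 4 + i);
}
```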

>  
> -			writel_relaxed(val, base + GICH_LR0 + (i * 4));
> +			writel_relaxed(cpu_if->vgic_lr[i],
> +				       base + GICH_LR0 + (i * 4));
>  		}
>  	}
>  
> -- 
> 2.1.4
> 

Otherwise LGTM.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit
  2016-02-17 16:40 ` [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit Marc Zyngier
@ 2016-03-02 23:08   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-02 23:08 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:44PM +0000, Marc Zyngier wrote:
> The GICD_SGIR register lives a long way from the beginning of
> the handler array, which is searched linearly. As this is hit
> pretty often, let's move it up. This saves us some precious
> cycles when the guest is generating IPIs.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

> ---
>  virt/kvm/arm/vgic-v2-emul.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
> index 1390797..1b0bee0 100644
> --- a/virt/kvm/arm/vgic-v2-emul.c
> +++ b/virt/kvm/arm/vgic-v2-emul.c
> @@ -321,6 +321,11 @@ static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu,
>  
>  static const struct vgic_io_range vgic_dist_ranges[] = {
>  	{
> +		.base		= GIC_DIST_SOFTINT,
> +		.len		= 4,
> +		.handle_mmio	= handle_mmio_sgi_reg,
> +	},
> +	{
>  		.base		= GIC_DIST_CTRL,
>  		.len		= 12,
>  		.bits_per_irq	= 0,
> @@ -387,11 +392,6 @@ static const struct vgic_io_range vgic_dist_ranges[] = {
>  		.handle_mmio	= handle_mmio_cfg_reg,
>  	},
>  	{
> -		.base		= GIC_DIST_SOFTINT,
> -		.len		= 4,
> -		.handle_mmio	= handle_mmio_sgi_reg,
> -	},
> -	{
>  		.base		= GIC_DIST_SGI_PENDING_CLEAR,
>  		.len		= VGIC_NR_SGIS,
>  		.handle_mmio	= handle_mmio_sgi_clear,
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  2016-03-02 23:08   ` Christoffer Dall
@ 2016-03-03  8:14     ` Marc Zyngier
  2016-03-03 15:58     ` Marc Zyngier
  1 sibling, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-03-03  8:14 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Thu, 3 Mar 2016 00:08:19 +0100
Christoffer Dall <christoffer.dall@linaro.org> wrote:

> On Wed, Feb 17, 2016 at 04:40:43PM +0000, Marc Zyngier wrote:
> > So far, we're always writing all possible LRs, setting the empty
> > ones with a zero value. This is obvious doing a low of work for
> 
> s/low/lot/
> 
> > nothing, and we're better off clearing those we've actually
> > dirtied on the exit path (it is very rare to inject more than one
> > interrupt at a time anyway).
> > 
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> > index 3dbbc6b..e53f131 100644
> > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> > @@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
> >  		}
> >  
> >  		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> > +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
> >  	}
> >  }
> >  
> > @@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
> >  		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
> >  		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
> >  		for (i = 0; i < nr_lr; i++) {
> > -			u32 val = 0;
> > -
> > -			if (live_lrs & (1UL << i))
> > -				val = cpu_if->vgic_lr[i];
> > +			if (!(live_lrs & (1UL << i)))
> > +				continue;
> 
> how can we be sure that the LRs are clear when we launch our first VM on
> a given physical CPU?  Don't we need to flush the LRs during VGIC init
> time?

Very good point. The registers reset to zero, but there is indeed no
guarantee that something hasn't been clobbered by EL3, or left over
from an earlier crash.

I'll add a patch to that effect.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required
  2016-03-02 23:08   ` Christoffer Dall
@ 2016-03-03  8:28     ` Marc Zyngier
  0 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-03-03  8:28 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, linux-arm-kernel, kvmarm

On Thu, 3 Mar 2016 00:08:06 +0100
Christoffer Dall <christoffer.dall@linaro.org> wrote:

> On Wed, Feb 17, 2016 at 04:40:40PM +0000, Marc Zyngier wrote:
> > Next on our list of useless accesses is the maintenance interrupt
> > status registers (GICH_MISR, GICH_EISR{0,1}).
> > 
> > It is pointless to save them if we haven't asked for a maintenance
> > interrupt the first place, which can only happen for two reasons:
> > - Underflow: GICH_HCR_UIE will be set,
> > - EOI: GICH_LR_EOI will be set.
> > 
> > These conditions can be checked on the in-memory copies of the regs.
> > Should any of these two condition be valid, we must read GICH_MISR.
> > We can then check for GICH_MISR_EOI, and only when set read
> > GICH_EISR*.
> > 
> > This means that in most case, we don't have to save them at all.
> > 
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  arch/arm64/kvm/hyp/vgic-v2-sr.c | 54 +++++++++++++++++++++++++++++++++++------
> >  1 file changed, 47 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> > index 5ab8d63..1bda5ce 100644
> > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> > @@ -23,6 +23,49 @@
> >  
> >  #include "hyp.h"
> >  
> > +static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu,
> > +					    void __iomem *base)
> > +{
> > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > +	int nr_lr = vcpu->arch.vgic_cpu.nr_lr;
> > +	u32 eisr0, eisr1;
> > +	int i;
> > +	bool expect_mi;
> > +
> > +	expect_mi = !!(cpu_if->vgic_hcr & GICH_HCR_UIE);
> > +
> > +	for (i = 0; i < nr_lr; i++) {
> > +		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
> > +				continue;
> > +
> > +		expect_mi |= (!(cpu_if->vgic_lr[i] & GICH_LR_HW) &&
> > +			      (cpu_if->vgic_lr[i] & GICH_LR_EOI));
> > +	}
> 
> Just eyeballing the code, it really feels crazy that this is faster than
> reading two registers, but I believe that may just be the case given the
> speed of the GIC.

I've measured things like 300 cycles per access on some implementation.
Pretty scary. On the other hand, this loop is usually short (we
usually have very few LRs, and even fewer are actually live), so it
ends up being a couple of memory accesses at most (completely dwarfed
by the cost of the MMIO access).

> 
> As an alternative to this loop, you could keep a counter for the number
> of requested EOI MIs and whenever we program an LR with the EOI bit set,
> then we increment the counter, and whenever we clear such an LR, we
> decrement the counter, and then you can just check here if it's
> non-zero.  What do you think?  Is it worth it?  Does it make the code
> even worse?

This could indeed be a clearer/cleaner optimization.

> 
> I can also write that on top of this patch if you'd like.

Yes please! :-)

> In any case, for this functionality:
> 
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

Thanks!

	M.
-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  2016-03-02 23:08   ` Christoffer Dall
  2016-03-03  8:14     ` Marc Zyngier
@ 2016-03-03 15:58     ` Marc Zyngier
  2016-03-04 11:36       ` Christoffer Dall
  1 sibling, 1 reply; 37+ messages in thread
From: Marc Zyngier @ 2016-03-03 15:58 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On 02/03/16 23:08, Christoffer Dall wrote:
> On Wed, Feb 17, 2016 at 04:40:43PM +0000, Marc Zyngier wrote:
>> So far, we're always writing all possible LRs, setting the empty
>> ones with a zero value. This is obvious doing a low of work for
> 
> s/low/lot/
> 
>> nothing, and we're better off clearing those we've actually
>> dirtied on the exit path (it is very rare to inject more than one
>> interrupt at a time anyway).
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
>> index 3dbbc6b..e53f131 100644
>> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
>> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
>> @@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
>>  		}
>>  
>>  		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
>> +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
>>  	}
>>  }
>>  
>> @@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
>>  		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
>>  		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
>>  		for (i = 0; i < nr_lr; i++) {
>> -			u32 val = 0;
>> -
>> -			if (live_lrs & (1UL << i))
>> -				val = cpu_if->vgic_lr[i];
>> +			if (!(live_lrs & (1UL << i)))
>> +				continue;
> 
> how can we be sure that the LRs are clear when we launch our first VM on
> a given physical CPU?  Don't we need to flush the LRs during VGIC init
> time?
> 
>>  
>> -			writel_relaxed(val, base + GICH_LR0 + (i * 4));
>> +			writel_relaxed(cpu_if->vgic_lr[i],
>> +				       base + GICH_LR0 + (i * 4));
>>  		}
>>  	}
>>  
>> -- 
>> 2.1.4
>>
> 
> otherwie LGTM.

So how about this, just before this patch (I'll obviously do something similar for GICv3):

From d9a80c4c406450190a68abee302c7d9a0034c62a Mon Sep 17 00:00:00 2001
From: Marc Zyngier <marc.zyngier@arm.com>
Date: Thu, 3 Mar 2016 15:43:58 +0000
Subject: [PATCH] KVM: arm/arm64: vgic-v2: Reset LRs at boot time

In order to let make the GICv2 code more lazy in the way it
accesses the LRs, it is necessary to start with a clean slate.

Let's reset the LRs on each CPU when the vgic is probed.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic-v2.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index ff02f08..67ec334 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -176,6 +176,15 @@ static const struct vgic_ops vgic_v2_ops = {
 
 static struct vgic_params vgic_v2_params;
 
+static void vgic_cpu_init_lrs(void *params)
+{
+	struct vgic_params *vgic = params;
+	int i;
+
+	for (i = 0; i < vgic->nr_lr; i++)
+		writel_relaxed(0, vgic->vctrl_base + GICH_LR0 + (i * 4));
+}
+
 /**
  * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
  * @node:	pointer to the DT node
@@ -257,6 +266,9 @@ int vgic_v2_probe(struct device_node *vgic_node,
 
 	vgic->type = VGIC_V2;
 	vgic->max_gic_vcpus = VGIC_V2_MAX_CPUS;
+
+	on_each_cpu(vgic_cpu_init_lrs, vgic, 1);
+
 	*ops = &vgic_v2_ops;
 	*params = vgic;
 	goto out;


Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 14/17] arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  2016-02-17 16:40 ` [PATCH v2 14/17] arm64: KVM: vgic-v3: Save maintenance interrupt state only if required Marc Zyngier
@ 2016-03-03 19:21   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-03 19:21 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, linux-arm-kernel, kvmarm

On Wed, Feb 17, 2016 at 04:40:46PM +0000, Marc Zyngier wrote:
> Next on our list of useless accesses is the maintenance interrupt
> status registers (ICH_MISR_EL2, ICH_EISR_EL2).
> 
> It is pointless to save them if we haven't asked for a maintenance
> interrupt the first place, which can only happen for two reasons:
> - Underflow: ICH_HCR_UIE will be set,
> - EOI: ICH_LR_EOI will be set.
> 
> These conditions can be checked on the in-memory copies of the regs.
> Should any of these two condition be valid, we must read GICH_MISR.
> We can then check for ICH_MISR_EOI, and only when set read
> ICH_EISR_EL2.
> 
> This means that in most case, we don't have to save them at all.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index b62c923..3463c0a 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -133,6 +133,36 @@ static void __hyp_text __gic_v3_set_lr(u64 val, int lr)
>  	}
>  }
>  
> +static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu, int nr_lr)
> +{
> +	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
> +	int i;
> +	bool expect_mi;
> +
> +	expect_mi = !!(cpu_if->vgic_hcr & ICH_HCR_UIE);
> +
> +	for (i = 0; i < nr_lr; i++) {
> +		if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
> +				continue;
> +
> +		expect_mi |= (!(cpu_if->vgic_lr[i] & ICH_LR_HW) &&
> +			      (cpu_if->vgic_lr[i] & ICH_LR_EOI));
> +	}
> +
> +	if (expect_mi) {
> +		cpu_if->vgic_misr  = read_gicreg(ICH_MISR_EL2);
> +
> +		if (cpu_if->vgic_misr & ICH_MISR_EOI) {
> +			cpu_if->vgic_eisr = read_gicreg(ICH_EISR_EL2);
> +		} else {
> +			cpu_if->vgic_eisr = 0;
> +		}

nit: you can drop the curly braces here

> +	} else {
> +		cpu_if->vgic_misr = 0;
> +		cpu_if->vgic_eisr = 0;
> +	}
> +}
> +
>  void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
> @@ -150,8 +180,6 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  		int i;
>  		u32 max_lr_idx, nr_pri_bits;
>  
> -		cpu_if->vgic_misr  = read_gicreg(ICH_MISR_EL2);
> -		cpu_if->vgic_eisr  = read_gicreg(ICH_EISR_EL2);
>  		cpu_if->vgic_elrsr = read_gicreg(ICH_ELSR_EL2);
>  
>  		write_gicreg(0, ICH_HCR_EL2);
> @@ -159,6 +187,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  		max_lr_idx = vtr_to_max_lr_idx(val);
>  		nr_pri_bits = vtr_to_nr_pri_bits(val);
>  
> +		save_maint_int_state(vcpu, max_lr_idx + 1);
> +
>  		for (i = 0; i <= max_lr_idx; i++) {
>  			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
>  				cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
> -- 
> 2.1.4
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 15/17] arm64: KVM: vgic-v3: Do not save an LR known to be empty
  2016-02-17 16:40 ` [PATCH v2 15/17] arm64: KVM: vgic-v3: Do not save an LR known to be empty Marc Zyngier
@ 2016-03-03 19:21   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-03 19:21 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:47PM +0000, Marc Zyngier wrote:
> On exit, any empty LR will be signaled in ICH_ELRSR_EL2. Which
> means that we do not have to save it, and we can just clear
> its state in the in-memory copy.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 3463c0a..ff67296 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -190,8 +190,15 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  		save_maint_int_state(vcpu, max_lr_idx + 1);
>  
>  		for (i = 0; i <= max_lr_idx; i++) {
> -			if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i))
> -				cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
> +			if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i)))
> +				continue;
> +
> +			if (cpu_if->vgic_elrsr & (1 << i)) {
> +				cpu_if->vgic_lr[i] &= ~ICH_LR_STATE;
> +				continue;
> +			}
> +
> +			cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
>  		}
>  
>  		switch (nr_pri_bits) {
> -- 
> 2.1.4
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 16/17] arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  2016-02-17 16:40 ` [PATCH v2 16/17] arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit Marc Zyngier
@ 2016-03-03 19:21   ` Christoffer Dall
  0 siblings, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2016-03-03 19:21 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, linux-arm-kernel, kvmarm

On Wed, Feb 17, 2016 at 04:40:48PM +0000, Marc Zyngier wrote:
> So far, we're always writing all possible LRs, setting the empty
> ones with a zero value. This is obvious doing a low of work for
> nothing, and we're better off clearing those we've actually
> dirtied on the exit path (it is very rare to inject more than one
> interrupt at a time anyway).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index ff67296..5f12c57 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -199,6 +199,7 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  			}
>  
>  			cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
> +			__gic_v3_set_lr(0, i);
>  		}
>  
>  		switch (nr_pri_bits) {
> @@ -296,12 +297,10 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu)
>  		}
>  		 	                           
>  		for (i = 0; i <= max_lr_idx; i++) {
> -			val = 0;
> -
> -			if (live_lrs & (1 << i))
> -				val = cpu_if->vgic_lr[i];
> +			if (!(live_lrs & (1 << i)))
> +				continue;
>  
> -			__gic_v3_set_lr(val, i);
> +			__gic_v3_set_lr(cpu_if->vgic_lr[i], i);
>  		}
>  	}
>  
> -- 
> 2.1.4
> 

same comment here with initializing the LRs, but otherwise:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation
  2016-02-17 16:40 ` [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation Marc Zyngier
@ 2016-03-03 19:21   ` Christoffer Dall
  2016-03-04  8:54     ` Marc Zyngier
  0 siblings, 1 reply; 37+ messages in thread
From: Christoffer Dall @ 2016-03-03 19:21 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On Wed, Feb 17, 2016 at 04:40:49PM +0000, Marc Zyngier wrote:
> The GICv3 specification mandates that ICH_AP0Rn_EL2 are set to
> zero when running guests that use the memory mapped registers.
> 
> This is fine, as we initialize all ICH_AP0Rn_EL2 registers to
> zero, and restore them on entry. But it also means that we
> do not need to save these registers on exit. Profit!
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 5f12c57..c2f173d 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -202,14 +202,17 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>  			__gic_v3_set_lr(0, i);
>  		}
>  
> -		switch (nr_pri_bits) {
> -		case 7:
> -			cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
> -			cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
> -		case 6:
> -			cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
> -		default:
> -			cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
> +		/* ICH_AP0Rn is only valid for SRE==1 */
> +		if (cpu_if->vgic_sre & ICC_SRE_EL1_SRE) {
> +			switch (nr_pri_bits) {
> +			case 7:
> +				cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
> +				cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
> +			case 6:
> +				cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
> +			default:
> +				cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
> +			}
>  		}
>  
>  		switch (nr_pri_bits) {
> -- 
> 2.1.4
> 

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation
  2016-03-03 19:21   ` Christoffer Dall
@ 2016-03-04  8:54     ` Marc Zyngier
  0 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-03-04  8:54 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On 03/03/16 19:21, Christoffer Dall wrote:
> On Wed, Feb 17, 2016 at 04:40:49PM +0000, Marc Zyngier wrote:
>> The GICv3 specification mandates that ICH_AP0Rn_EL2 are set to
>> zero when running guests that use the memory mapped registers.
>>
>> This is fine, as we initialize all ICH_AP0Rn_EL2 registers to
>> zero, and restore them on entry. But it also means that we
>> do not need to save these registers on exit. Profit!
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 19 +++++++++++--------
>>  1 file changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
>> index 5f12c57..c2f173d 100644
>> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
>> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
>> @@ -202,14 +202,17 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
>>  			__gic_v3_set_lr(0, i);
>>  		}
>>  
>> -		switch (nr_pri_bits) {
>> -		case 7:
>> -			cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
>> -			cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
>> -		case 6:
>> -			cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
>> -		default:
>> -			cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
>> +		/* ICH_AP0Rn is only valid for SRE==1 */
>> +		if (cpu_if->vgic_sre & ICC_SRE_EL1_SRE) {
>> +			switch (nr_pri_bits) {
>> +			case 7:
>> +				cpu_if->vgic_ap0r[3] = read_gicreg(ICH_AP0R3_EL2);
>> +				cpu_if->vgic_ap0r[2] = read_gicreg(ICH_AP0R2_EL2);
>> +			case 6:
>> +				cpu_if->vgic_ap0r[1] = read_gicreg(ICH_AP0R1_EL2);
>> +			default:
>> +				cpu_if->vgic_ap0r[0] = read_gicreg(ICH_AP0R0_EL2);
>> +			}
>>  		}
>>  
>>  		switch (nr_pri_bits) {
>> -- 
>> 2.1.4
>>
> 
> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
> 

Looks like this patch, despite working fine on A57+GIC500, introduces a
regression on an internal build of the AEMv8 model. I'm chasing the
modelling guys in order to find out about it. In the meantime, I'll keep
it on the back-burner, just in case.

Thanks.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  2016-03-03 15:58     ` Marc Zyngier
@ 2016-03-04 11:36       ` Christoffer Dall
  2016-03-04 11:45         ` Marc Zyngier
  0 siblings, 1 reply; 37+ messages in thread
From: Christoffer Dall @ 2016-03-04 11:36 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, linux-arm-kernel, kvmarm

On Thu, Mar 03, 2016 at 03:58:08PM +0000, Marc Zyngier wrote:
> On 02/03/16 23:08, Christoffer Dall wrote:
> > On Wed, Feb 17, 2016 at 04:40:43PM +0000, Marc Zyngier wrote:
> >> So far, we're always writing all possible LRs, setting the empty
> >> ones with a zero value. This is obvious doing a low of work for
> > 
> > s/low/lot/
> > 
> >> nothing, and we're better off clearing those we've actually
> >> dirtied on the exit path (it is very rare to inject more than one
> >> interrupt at a time anyway).
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +++++-----
> >>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> >> index 3dbbc6b..e53f131 100644
> >> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
> >> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> >> @@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
> >>  		}
> >>  
> >>  		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> >> +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
> >>  	}
> >>  }
> >>  
> >> @@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
> >>  		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
> >>  		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
> >>  		for (i = 0; i < nr_lr; i++) {
> >> -			u32 val = 0;
> >> -
> >> -			if (live_lrs & (1UL << i))
> >> -				val = cpu_if->vgic_lr[i];
> >> +			if (!(live_lrs & (1UL << i)))
> >> +				continue;
> > 
> > how can we be sure that the LRs are clear when we launch our first VM on
> > a given physical CPU?  Don't we need to flush the LRs during VGIC init
> > time?
> > 
> >>  
> >> -			writel_relaxed(val, base + GICH_LR0 + (i * 4));
> >> +			writel_relaxed(cpu_if->vgic_lr[i],
> >> +				       base + GICH_LR0 + (i * 4));
> >>  		}
> >>  	}
> >>  
> >> -- 
> >> 2.1.4
> >>
> > 
> > otherwie LGTM.
> 
> So how about this, just before this patch (I'll obviously do something similar for GICv3):
> 
> From d9a80c4c406450190a68abee302c7d9a0034c62a Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <marc.zyngier@arm.com>
> Date: Thu, 3 Mar 2016 15:43:58 +0000
> Subject: [PATCH] KVM: arm/arm64: vgic-v2: Reset LRs at boot time
> 
> In order to let make the GICv2 code more lazy in the way it
> accesses the LRs, it is necessary to start with a clean slate.

The first sentence needs some love I think :)

> 
> Let's reset the LRs on each CPU when the vgic is probed.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/vgic-v2.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index ff02f08..67ec334 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -176,6 +176,15 @@ static const struct vgic_ops vgic_v2_ops = {
>  
>  static struct vgic_params vgic_v2_params;
>  
> +static void vgic_cpu_init_lrs(void *params)
> +{
> +	struct vgic_params *vgic = params;
> +	int i;
> +
> +	for (i = 0; i < vgic->nr_lr; i++)
> +		writel_relaxed(0, vgic->vctrl_base + GICH_LR0 + (i * 4));
> +}
> +
>  /**
>   * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
>   * @node:	pointer to the DT node
> @@ -257,6 +266,9 @@ int vgic_v2_probe(struct device_node *vgic_node,
>  
>  	vgic->type = VGIC_V2;
>  	vgic->max_gic_vcpus = VGIC_V2_MAX_CPUS;
> +
> +	on_each_cpu(vgic_cpu_init_lrs, vgic, 1);
> +
>  	*ops = &vgic_v2_ops;
>  	*params = vgic;
>  	goto out;
> 
> 
> Thanks,
> 
> 	M.

This looks good to me.  My only concern is that we're now accessing the
control interface from EL1 for the first time (I think), but that should
work just fine.

-Christoffer

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
  2016-03-04 11:36       ` Christoffer Dall
@ 2016-03-04 11:45         ` Marc Zyngier
  0 siblings, 0 replies; 37+ messages in thread
From: Marc Zyngier @ 2016-03-04 11:45 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: Andrew Jones, linux-arm-kernel, kvm, kvmarm

On 04/03/16 11:36, Christoffer Dall wrote:
> On Thu, Mar 03, 2016 at 03:58:08PM +0000, Marc Zyngier wrote:
>> On 02/03/16 23:08, Christoffer Dall wrote:
>>> On Wed, Feb 17, 2016 at 04:40:43PM +0000, Marc Zyngier wrote:
>>>> So far, we're always writing all possible LRs, setting the empty
>>>> ones with a zero value. This is obvious doing a low of work for
>>>
>>> s/low/lot/
>>>
>>>> nothing, and we're better off clearing those we've actually
>>>> dirtied on the exit path (it is very rare to inject more than one
>>>> interrupt at a time anyway).
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +++++-----
>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
>>>> index 3dbbc6b..e53f131 100644
>>>> --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c
>>>> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
>>>> @@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
>>>>  		}
>>>>  
>>>>  		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
>>>> +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
>>>>  	}
>>>>  }
>>>>  
>>>> @@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
>>>>  		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
>>>>  		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
>>>>  		for (i = 0; i < nr_lr; i++) {
>>>> -			u32 val = 0;
>>>> -
>>>> -			if (live_lrs & (1UL << i))
>>>> -				val = cpu_if->vgic_lr[i];
>>>> +			if (!(live_lrs & (1UL << i)))
>>>> +				continue;
>>>
>>> how can we be sure that the LRs are clear when we launch our first VM on
>>> a given physical CPU?  Don't we need to flush the LRs during VGIC init
>>> time?
>>>
>>>>  
>>>> -			writel_relaxed(val, base + GICH_LR0 + (i * 4));
>>>> +			writel_relaxed(cpu_if->vgic_lr[i],
>>>> +				       base + GICH_LR0 + (i * 4));
>>>>  		}
>>>>  	}
>>>>  
>>>> -- 
>>>> 2.1.4
>>>>
>>>
>>> otherwie LGTM.
>>
>> So how about this, just before this patch (I'll obviously do something similar for GICv3):
>>
>> From d9a80c4c406450190a68abee302c7d9a0034c62a Mon Sep 17 00:00:00 2001
>> From: Marc Zyngier <marc.zyngier@arm.com>
>> Date: Thu, 3 Mar 2016 15:43:58 +0000
>> Subject: [PATCH] KVM: arm/arm64: vgic-v2: Reset LRs at boot time
>>
>> In order to let make the GICv2 code more lazy in the way it
>> accesses the LRs, it is necessary to start with a clean slate.
> 
> The first sentence need some love I think :)

Erm... yes. -ENOPARSE! ;-)

> 
>>
>> Let's reset the LRs on each CPU when the vgic is probed.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/vgic-v2.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
>> index ff02f08..67ec334 100644
>> --- a/virt/kvm/arm/vgic-v2.c
>> +++ b/virt/kvm/arm/vgic-v2.c
>> @@ -176,6 +176,15 @@ static const struct vgic_ops vgic_v2_ops = {
>>  
>>  static struct vgic_params vgic_v2_params;
>>  
>> +static void vgic_cpu_init_lrs(void *params)
>> +{
>> +	struct vgic_params *vgic = params;
>> +	int i;
>> +
>> +	for (i = 0; i < vgic->nr_lr; i++)
>> +		writel_relaxed(0, vgic->vctrl_base + GICH_LR0 + (i * 4));
>> +}
>> +
>>  /**
>>   * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
>>   * @node:	pointer to the DT node
>> @@ -257,6 +266,9 @@ int vgic_v2_probe(struct device_node *vgic_node,
>>  
>>  	vgic->type = VGIC_V2;
>>  	vgic->max_gic_vcpus = VGIC_V2_MAX_CPUS;
>> +
>> +	on_each_cpu(vgic_cpu_init_lrs, vgic, 1);
>> +
>>  	*ops = &vgic_v2_ops;
>>  	*params = vgic;
>>  	goto out;
>>
>>
>> Thanks,
>>
>> 	M.
> 
> This looks good to me.  Only concern is that we're now accessing the
> control interface from EL1 for the first time (I think), but that should
> work just fine though.

We already access it for reading GICH_VTR, but that's fine, as the
access is actually controlled by the page tables. Not what the
architects had in mind, but hey... ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-03-04 11:45 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-17 16:40 [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 01/17] arm64: KVM: Switch the sys_reg search to be a binary search Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 02/17] ARM: KVM: Properly sort the invariant table Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 03/17] ARM: KVM: Enforce sorting of all CP tables Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 04/17] ARM: KVM: Rename struct coproc_reg::is_64 to is_64bit Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 05/17] ARM: KVM: Switch the CP reg search to be a binary search Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 06/17] KVM: arm/arm64: timer: Add active state caching Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers Marc Zyngier
2016-03-02 23:08   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required Marc Zyngier
2016-03-02 23:08   ` Christoffer Dall
2016-03-03  8:28     ` Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function Marc Zyngier
2016-03-02 23:08   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty Marc Zyngier
2016-03-02 23:08   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit Marc Zyngier
2016-03-02 23:08   ` Christoffer Dall
2016-03-03  8:14     ` Marc Zyngier
2016-03-03 15:58     ` Marc Zyngier
2016-03-04 11:36       ` Christoffer Dall
2016-03-04 11:45         ` Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit Marc Zyngier
2016-03-02 23:08   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 13/17] arm64: KVM: vgic-v3: Avoid accessing ICH registers Marc Zyngier
2016-02-17 16:40 ` [PATCH v2 14/17] arm64: KVM: vgic-v3: Save maintenance interrupt state only if required Marc Zyngier
2016-03-03 19:21   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 15/17] arm64: KVM: vgic-v3: Do not save an LR known to be empty Marc Zyngier
2016-03-03 19:21   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 16/17] arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit Marc Zyngier
2016-03-03 19:21   ` Christoffer Dall
2016-02-17 16:40 ` [PATCH v2 17/17] arm64: KVM: vgic-v3: Do not save ICH_AP0Rn_EL2 for GICv2 emulation Marc Zyngier
2016-03-03 19:21   ` Christoffer Dall
2016-03-04  8:54     ` Marc Zyngier
2016-02-29  0:57 ` [PATCH v2 00/17] KVM/ARM: Guest Entry/Exit optimizations Mihai Claudiu Caraman
2016-02-29  8:26   ` Marc Zyngier
2016-02-29 10:43     ` Mihai Claudiu Caraman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).