kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic
@ 2016-12-06  6:41 Shannon Zhao
  2016-12-06  6:41 ` [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time Shannon Zhao
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Shannon Zhao @ 2016-12-06  6:41 UTC (permalink / raw)
  To: kvmarm, marc.zyngier, christoffer.dall

From: Shannon Zhao <shannon.zhao@linaro.org>

Commit 50926d8(KVM: arm/arm64: The GIC is dead, long live the GIC)
removes the old vgic and commit 9097773(KVM: arm/arm64: vgic-new: 
vgic_init: implement kvm_vgic_hyp_init) doesn't reset LRs for new-vgic
when probing GIC. These two patches add the missing part.

BTW, here is a strange problem on Huawei D03 board when using
upstream kernel that android guest with a goldfish_fb will hang with
rcu_stall and interrupt timeout for goldfish_fb. We apply these patches
but the problem still exists, while if we revert the commit
b40c489(arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit) the guest runs
well.

We add a trace in kvm_vgic_flush_hwstate() to print the value of 
compute_ap_list_depth(vcpu) and the value of vgic_lr before calling
vgic_flush_lr_state(). The first output shows that the ap_list_depth is zero
but the first one in vgic_lr is 10a0000000002001. I don't understand why
there is a valued one in vgic_lr since the memory of vgic_lr is zero
allocated. I think It should be zero when the vcpu first run and first
call kvm_vgic_flush_hwstate().

qemu-system-aar-6673  [016] ....   501.969251: kvm_vgic_flush_hwstate: VCPU: 0, lits-count: 0, LR: 10a0000000002001, 0, 0, 0

I also add a trace at the end of vgic_flush_lr_state() which shows the
kvm_vgic_global_state.nr_lr is 4, used_lrs is 0 and all LRs in vgic_lr
are zero.

qemu-system-aar-6673  [016] ....   501.969254: vgic_flush_lr_state_nuke: kvm_vgic_global_state.nr_lr is :4, irq1:0, irq2:0, irq3:0, irq4:0

But the trace at the beginning of kvm_vgic_sync_hwstate() shows the
first one of vgic_lr is 10a0000000002001.

qemu-system-aar-6673  [016] ....   501.969261: kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 10a0000000002001, 0, 0, 0

The above three trace outputs are printed by the first KVM_ENTRY/EXIT of VCPU 0.

Shannon Zhao (2):
  arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time
  KVM: arm/arm64: vgic-v2: Add the missing resetting LRs at boot time

 virt/kvm/arm/vgic/vgic-v2.c | 11 +++++++++++
 virt/kvm/arm/vgic/vgic-v3.c |  7 +++++++
 2 files changed, 18 insertions(+)

-- 
2.0.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time
  2016-12-06  6:41 [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Shannon Zhao
@ 2016-12-06  6:41 ` Shannon Zhao
  2016-12-06 11:38   ` Marc Zyngier
  2016-12-06  6:41 ` [PATCH 2/2] KVM: arm/arm64: vgic-v2: " Shannon Zhao
  2016-12-06 11:47 ` [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Marc Zyngier
  2 siblings, 1 reply; 11+ messages in thread
From: Shannon Zhao @ 2016-12-06  6:41 UTC (permalink / raw)
  To: kvmarm, marc.zyngier, christoffer.dall

From: Shannon Zhao <shannon.zhao@linaro.org>

This is the corresponding part of commit 0d98d00(arm64: KVM:
vgic-v3: Reset LRs at boot time) which is missed for new-vgic.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
---
 virt/kvm/arm/vgic/vgic-v3.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 5c9f974..7262f3b 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -307,6 +307,11 @@ int vgic_v3_map_resources(struct kvm *kvm)
 	return ret;
 }
 
+static void vgic_cpu_init_lrs(void *params)
+{
+	kvm_call_hyp(__vgic_v3_init_lrs);
+}
+
 /**
  * vgic_v3_probe - probe for a GICv3 compatible interrupt controller in DT
  * @node:	pointer to the DT node
@@ -361,5 +366,7 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
 	kvm_vgic_global_state.type = VGIC_V3;
 	kvm_vgic_global_state.max_gic_vcpus = VGIC_V3_MAX_CPUS;
 
+	on_each_cpu(vgic_cpu_init_lrs, NULL, 1);
+
 	return 0;
 }
-- 
2.0.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] KVM: arm/arm64: vgic-v2: Add the missing resetting LRs at boot time
  2016-12-06  6:41 [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Shannon Zhao
  2016-12-06  6:41 ` [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time Shannon Zhao
@ 2016-12-06  6:41 ` Shannon Zhao
  2016-12-06 11:39   ` Marc Zyngier
  2016-12-06 11:47 ` [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Marc Zyngier
  2 siblings, 1 reply; 11+ messages in thread
From: Shannon Zhao @ 2016-12-06  6:41 UTC (permalink / raw)
  To: kvmarm, marc.zyngier, christoffer.dall

From: Shannon Zhao <shannon.zhao@linaro.org>

This is the corresponding part of commit d6400d7(KVM: arm/arm64:
vgic-v2: Reset LRs at boot time) which is missed for new-vgic.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
---
 virt/kvm/arm/vgic/vgic-v2.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 9bab867..c636a19 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -300,6 +300,15 @@ int vgic_v2_map_resources(struct kvm *kvm)
 
 DEFINE_STATIC_KEY_FALSE(vgic_v2_cpuif_trap);
 
+static void vgic_cpu_init_lrs(void *params)
+{
+	int i;
+
+	for (i = 0; i < kvm_vgic_global_state.nr_lr; i++)
+		writel_relaxed(0, kvm_vgic_global_state.vctrl_base +
+				  GICH_LR0 + (i * 4));
+}
+
 /**
  * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
  * @node:	pointer to the DT node
@@ -368,6 +377,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 	kvm_vgic_global_state.type = VGIC_V2;
 	kvm_vgic_global_state.max_gic_vcpus = VGIC_V2_MAX_CPUS;
 
+	on_each_cpu(vgic_cpu_init_lrs, NULL, 1);
+
 	kvm_info("vgic-v2@%llx\n", info->vctrl.start);
 
 	return 0;
-- 
2.0.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time
  2016-12-06  6:41 ` [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time Shannon Zhao
@ 2016-12-06 11:38   ` Marc Zyngier
  0 siblings, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2016-12-06 11:38 UTC (permalink / raw)
  To: Shannon Zhao, kvmarm, christoffer.dall

Hi Shannon,

On 06/12/16 06:41, Shannon Zhao wrote:
> From: Shannon Zhao <shannon.zhao@linaro.org>
> 
> This is the corresponding part of commit 0d98d00(arm64: KVM:
> vgic-v3: Reset LRs at boot time) which is missed for new-vgic.
> 
> Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
> ---
>  virt/kvm/arm/vgic/vgic-v3.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
> index 5c9f974..7262f3b 100644
> --- a/virt/kvm/arm/vgic/vgic-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-v3.c
> @@ -307,6 +307,11 @@ int vgic_v3_map_resources(struct kvm *kvm)
>  	return ret;
>  }
>  
> +static void vgic_cpu_init_lrs(void *params)
> +{
> +	kvm_call_hyp(__vgic_v3_init_lrs);
> +}
> +
>  /**
>   * vgic_v3_probe - probe for a GICv3 compatible interrupt controller in DT
>   * @node:	pointer to the DT node
> @@ -361,5 +366,7 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
>  	kvm_vgic_global_state.type = VGIC_V3;
>  	kvm_vgic_global_state.max_gic_vcpus = VGIC_V3_MAX_CPUS;
>  
> +	on_each_cpu(vgic_cpu_init_lrs, NULL, 1);

I don't think that's enough. If you have CPU hotplug, your CPU will come
up with reset values long after boot. I think you need something that is
triggered from cpu_init_hyp_mode/cpu_hyp_reinit.

> +
>  	return 0;
>  }
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: arm/arm64: vgic-v2: Add the missing resetting LRs at boot time
  2016-12-06  6:41 ` [PATCH 2/2] KVM: arm/arm64: vgic-v2: " Shannon Zhao
@ 2016-12-06 11:39   ` Marc Zyngier
  2016-12-15  9:09     ` Shannon Zhao
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2016-12-06 11:39 UTC (permalink / raw)
  To: Shannon Zhao, kvmarm, christoffer.dall

On 06/12/16 06:41, Shannon Zhao wrote:
> From: Shannon Zhao <shannon.zhao@linaro.org>
> 
> This is the corresponding part of commit d6400d7(KVM: arm/arm64:
> vgic-v2: Reset LRs at boot time) which is missed for new-vgic.
> 
> Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
> ---
>  virt/kvm/arm/vgic/vgic-v2.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
> index 9bab867..c636a19 100644
> --- a/virt/kvm/arm/vgic/vgic-v2.c
> +++ b/virt/kvm/arm/vgic/vgic-v2.c
> @@ -300,6 +300,15 @@ int vgic_v2_map_resources(struct kvm *kvm)
>  
>  DEFINE_STATIC_KEY_FALSE(vgic_v2_cpuif_trap);
>  
> +static void vgic_cpu_init_lrs(void *params)
> +{
> +	int i;
> +
> +	for (i = 0; i < kvm_vgic_global_state.nr_lr; i++)
> +		writel_relaxed(0, kvm_vgic_global_state.vctrl_base +
> +				  GICH_LR0 + (i * 4));
> +}
> +
>  /**
>   * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
>   * @node:	pointer to the DT node
> @@ -368,6 +377,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
>  	kvm_vgic_global_state.type = VGIC_V2;
>  	kvm_vgic_global_state.max_gic_vcpus = VGIC_V2_MAX_CPUS;
>  
> +	on_each_cpu(vgic_cpu_init_lrs, NULL, 1);
> +
>  	kvm_info("vgic-v2@%llx\n", info->vctrl.start);
>  
>  	return 0;
> 

Same remark as the GICv3 version.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic
  2016-12-06  6:41 [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Shannon Zhao
  2016-12-06  6:41 ` [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time Shannon Zhao
  2016-12-06  6:41 ` [PATCH 2/2] KVM: arm/arm64: vgic-v2: " Shannon Zhao
@ 2016-12-06 11:47 ` Marc Zyngier
  2016-12-07  7:45   ` Shannon Zhao
  2 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2016-12-06 11:47 UTC (permalink / raw)
  To: Shannon Zhao, kvmarm, christoffer.dall

On 06/12/16 06:41, Shannon Zhao wrote:
> From: Shannon Zhao <shannon.zhao@linaro.org>
> 
> Commit 50926d8(KVM: arm/arm64: The GIC is dead, long live the GIC)
> removes the old vgic and commit 9097773(KVM: arm/arm64: vgic-new: 
> vgic_init: implement kvm_vgic_hyp_init) doesn't reset LRs for new-vgic
> when probing GIC. These two patches add the missing part.
> 
> BTW, here is a strange problem on Huawei D03 board when using
> upstream kernel that android guest with a goldfish_fb will hang with
> rcu_stall and interrupt timeout for goldfish_fb. We apply these patches
> but the problem still exists, while if we revert the commit
> b40c489(arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit) the guest runs
> well.
> 
> We add a trace in kvm_vgic_flush_hwstate() to print the value of 
> compute_ap_list_depth(vcpu) and the value of vgic_lr before calling
> vgic_flush_lr_state(). The first output shows that the ap_list_depth is zero
> but the first one in vgic_lr is 10a0000000002001. I don't understand why
> there is a valued one in vgic_lr since the memory of vgic_lr is zero
> allocated. I think It should be zero when the vcpu first run and first
> call kvm_vgic_flush_hwstate().
> 
> qemu-system-aar-6673  [016] ....   501.969251: kvm_vgic_flush_hwstate: VCPU: 0, lits-count: 0, LR: 10a0000000002001, 0, 0, 0
> 
> I also add a trace at the end of vgic_flush_lr_state() which shows the
> kvm_vgic_global_state.nr_lr is 4, used_lrs is 0 and all LRs in vgic_lr
> are zero.
> 
> qemu-system-aar-6673  [016] ....   501.969254: vgic_flush_lr_state_nuke: kvm_vgic_global_state.nr_lr is :4, irq1:0, irq2:0, irq3:0, irq4:0
> 
> But the trace at the beginning of kvm_vgic_sync_hwstate() shows the
> first one of vgic_lr is 10a0000000002001.
> 
> qemu-system-aar-6673  [016] ....   501.969261: kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 10a0000000002001, 0, 0, 0
> 
> The above three trace outputs are printed by the first KVM_ENTRY/EXIT of VCPU 0.

Decoding this LR value is interesting:

10a0000000002001
| |         | LPI 8193
| |
| Priority 0xa0
|
Group1

Someone is injecting an LPI behind your back. If nobody populates this,
then you may want to investigate what is happening on the host side. Is
there anyone using this interrupt?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic
  2016-12-06 11:47 ` [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Marc Zyngier
@ 2016-12-07  7:45   ` Shannon Zhao
  2016-12-07  8:10     ` Marc Zyngier
  0 siblings, 1 reply; 11+ messages in thread
From: Shannon Zhao @ 2016-12-07  7:45 UTC (permalink / raw)
  To: Marc Zyngier, kvmarm, christoffer.dall



On 2016/12/6 19:47, Marc Zyngier wrote:
> On 06/12/16 06:41, Shannon Zhao wrote:
>> From: Shannon Zhao <shannon.zhao@linaro.org>
>>
>> Commit 50926d8(KVM: arm/arm64: The GIC is dead, long live the GIC)
>> removes the old vgic and commit 9097773(KVM: arm/arm64: vgic-new: 
>> vgic_init: implement kvm_vgic_hyp_init) doesn't reset LRs for new-vgic
>> when probing GIC. These two patches add the missing part.
>>
>> BTW, here is a strange problem on Huawei D03 board when using
>> upstream kernel that android guest with a goldfish_fb will hang with
>> rcu_stall and interrupt timeout for goldfish_fb. We apply these patches
>> but the problem still exists, while if we revert the commit
>> b40c489(arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit) the guest runs
>> well.
>>
>> We add a trace in kvm_vgic_flush_hwstate() to print the value of 
>> compute_ap_list_depth(vcpu) and the value of vgic_lr before calling
>> vgic_flush_lr_state(). The first output shows that the ap_list_depth is zero
>> but the first one in vgic_lr is 10a0000000002001. I don't understand why
>> there is a valued one in vgic_lr since the memory of vgic_lr is zero
>> allocated. I think It should be zero when the vcpu first run and first
>> call kvm_vgic_flush_hwstate().
>>
>> qemu-system-aar-6673  [016] ....   501.969251: kvm_vgic_flush_hwstate: VCPU: 0, lits-count: 0, LR: 10a0000000002001, 0, 0, 0
>>
>> I also add a trace at the end of vgic_flush_lr_state() which shows the
>> kvm_vgic_global_state.nr_lr is 4, used_lrs is 0 and all LRs in vgic_lr
>> are zero.
>>
>> qemu-system-aar-6673  [016] ....   501.969254: vgic_flush_lr_state_nuke: kvm_vgic_global_state.nr_lr is :4, irq1:0, irq2:0, irq3:0, irq4:0
>>
>> But the trace at the beginning of kvm_vgic_sync_hwstate() shows the
>> first one of vgic_lr is 10a0000000002001.
>>
>> qemu-system-aar-6673  [016] ....   501.969261: kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 10a0000000002001, 0, 0, 0
>>
>> The above three trace outputs are printed by the first KVM_ENTRY/EXIT of VCPU 0.
> 
> Decoding this LR value is interesting:
> 
> 10a0000000002001
> | |         | LPI 8193
> | |
> | Priority 0xa0
> |
> Group1
> 
> Someone is injecting an LPI behind your back. If nobody populates this,
> then you may want to investigate what is happening on the host side. Is
> there anyone using this interrupt?
> 

For this guest, I think nobody populates this LR, but on the host, there
is a LPI interrupt 8193. It's a interrupt of eth2

MBIGEN-V2 8193 Edge      eth2-tx0

It's a little confused to me that the LR registers should only be used
for VM, right? Why does the interrupt on host would affect the LRs?

Thanks,
-- 
Shannon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic
  2016-12-07  7:45   ` Shannon Zhao
@ 2016-12-07  8:10     ` Marc Zyngier
  2016-12-07 11:00       ` Shannon Zhao
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2016-12-07  8:10 UTC (permalink / raw)
  To: Shannon Zhao, kvmarm, christoffer.dall

On 07/12/16 07:45, Shannon Zhao wrote:
> 
> 
> On 2016/12/6 19:47, Marc Zyngier wrote:
>> On 06/12/16 06:41, Shannon Zhao wrote:
>>> From: Shannon Zhao <shannon.zhao@linaro.org>
>>>
>>> Commit 50926d8(KVM: arm/arm64: The GIC is dead, long live the GIC)
>>> removes the old vgic and commit 9097773(KVM: arm/arm64: vgic-new: 
>>> vgic_init: implement kvm_vgic_hyp_init) doesn't reset LRs for new-vgic
>>> when probing GIC. These two patches add the missing part.
>>>
>>> BTW, here is a strange problem on Huawei D03 board when using
>>> upstream kernel that android guest with a goldfish_fb will hang with
>>> rcu_stall and interrupt timeout for goldfish_fb. We apply these patches
>>> but the problem still exists, while if we revert the commit
>>> b40c489(arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit) the guest runs
>>> well.
>>>
>>> We add a trace in kvm_vgic_flush_hwstate() to print the value of 
>>> compute_ap_list_depth(vcpu) and the value of vgic_lr before calling
>>> vgic_flush_lr_state(). The first output shows that the ap_list_depth is zero
>>> but the first one in vgic_lr is 10a0000000002001. I don't understand why
>>> there is a valued one in vgic_lr since the memory of vgic_lr is zero
>>> allocated. I think It should be zero when the vcpu first run and first
>>> call kvm_vgic_flush_hwstate().
>>>
>>> qemu-system-aar-6673  [016] ....   501.969251: kvm_vgic_flush_hwstate: VCPU: 0, lits-count: 0, LR: 10a0000000002001, 0, 0, 0
>>>
>>> I also add a trace at the end of vgic_flush_lr_state() which shows the
>>> kvm_vgic_global_state.nr_lr is 4, used_lrs is 0 and all LRs in vgic_lr
>>> are zero.
>>>
>>> qemu-system-aar-6673  [016] ....   501.969254: vgic_flush_lr_state_nuke: kvm_vgic_global_state.nr_lr is :4, irq1:0, irq2:0, irq3:0, irq4:0
>>>
>>> But the trace at the beginning of kvm_vgic_sync_hwstate() shows the
>>> first one of vgic_lr is 10a0000000002001.
>>>
>>> qemu-system-aar-6673  [016] ....   501.969261: kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 10a0000000002001, 0, 0, 0
>>>
>>> The above three trace outputs are printed by the first KVM_ENTRY/EXIT of VCPU 0.
>>
>> Decoding this LR value is interesting:
>>
>> 10a0000000002001
>> | |         | LPI 8193
>> | |
>> | Priority 0xa0
>> |
>> Group1
>>
>> Someone is injecting an LPI behind your back. If nobody populates this,
>> then you may want to investigate what is happening on the host side. Is
>> there anyone using this interrupt?
>>
> 
> For this guest, I think nobody populates this LR, but on the host, there
> is a LPI interrupt 8193. It's a interrupt of eth2
> 
> MBIGEN-V2 8193 Edge      eth2-tx0
> 
> It's a little confused to me that the LR registers should only be used
> for VM, right? Why does the interrupt on host would affect the LRs?

It should never have an impact, but I'm worried that this could be a HW
bug where the physical side of the ITS leaks into the virtual one. You
have a GICv4, right?

It'd be interesting to find out what happens if you leave this interrupt
disabled (don't enable eth2) and see if that interrupt magically appears
or not.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic
  2016-12-07  8:10     ` Marc Zyngier
@ 2016-12-07 11:00       ` Shannon Zhao
  2016-12-08 12:32         ` Christoffer Dall
  0 siblings, 1 reply; 11+ messages in thread
From: Shannon Zhao @ 2016-12-07 11:00 UTC (permalink / raw)
  To: Marc Zyngier, kvmarm, christoffer.dall



On 2016/12/7 16:10, Marc Zyngier wrote:
> On 07/12/16 07:45, Shannon Zhao wrote:
>>
>>
>> On 2016/12/6 19:47, Marc Zyngier wrote:
>>> On 06/12/16 06:41, Shannon Zhao wrote:
>>>> From: Shannon Zhao <shannon.zhao@linaro.org>
>>>>
>>>> Commit 50926d8(KVM: arm/arm64: The GIC is dead, long live the GIC)
>>>> removes the old vgic and commit 9097773(KVM: arm/arm64: vgic-new: 
>>>> vgic_init: implement kvm_vgic_hyp_init) doesn't reset LRs for new-vgic
>>>> when probing GIC. These two patches add the missing part.
>>>>
>>>> BTW, here is a strange problem on Huawei D03 board when using
>>>> upstream kernel that android guest with a goldfish_fb will hang with
>>>> rcu_stall and interrupt timeout for goldfish_fb. We apply these patches
>>>> but the problem still exists, while if we revert the commit
>>>> b40c489(arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit) the guest runs
>>>> well.
>>>>
>>>> We add a trace in kvm_vgic_flush_hwstate() to print the value of 
>>>> compute_ap_list_depth(vcpu) and the value of vgic_lr before calling
>>>> vgic_flush_lr_state(). The first output shows that the ap_list_depth is zero
>>>> but the first one in vgic_lr is 10a0000000002001. I don't understand why
>>>> there is a valued one in vgic_lr since the memory of vgic_lr is zero
>>>> allocated. I think It should be zero when the vcpu first run and first
>>>> call kvm_vgic_flush_hwstate().
>>>>
>>>> qemu-system-aar-6673  [016] ....   501.969251: kvm_vgic_flush_hwstate: VCPU: 0, lits-count: 0, LR: 10a0000000002001, 0, 0, 0
>>>>
>>>> I also add a trace at the end of vgic_flush_lr_state() which shows the
>>>> kvm_vgic_global_state.nr_lr is 4, used_lrs is 0 and all LRs in vgic_lr
>>>> are zero.
>>>>
>>>> qemu-system-aar-6673  [016] ....   501.969254: vgic_flush_lr_state_nuke: kvm_vgic_global_state.nr_lr is :4, irq1:0, irq2:0, irq3:0, irq4:0
>>>>
>>>> But the trace at the beginning of kvm_vgic_sync_hwstate() shows the
>>>> first one of vgic_lr is 10a0000000002001.
>>>>
>>>> qemu-system-aar-6673  [016] ....   501.969261: kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 10a0000000002001, 0, 0, 0
>>>>
>>>> The above three trace outputs are printed by the first KVM_ENTRY/EXIT of VCPU 0.
>>>
>>> Decoding this LR value is interesting:
>>>
>>> 10a0000000002001
>>> | |         | LPI 8193
>>> | |
>>> | Priority 0xa0
>>> |
>>> Group1
>>>
>>> Someone is injecting an LPI behind your back. If nobody populates this,
>>> then you may want to investigate what is happening on the host side. Is
>>> there anyone using this interrupt?
>>>
>>
>> For this guest, I think nobody populates this LR, but on the host, there
>> is a LPI interrupt 8193. It's a interrupt of eth2
>>
>> MBIGEN-V2 8193 Edge      eth2-tx0
>>
>> It's a little confused to me that the LR registers should only be used
>> for VM, right? Why does the interrupt on host would affect the LRs?
> 
> It should never have an impact, but I'm worried that this could be a HW
> bug where the physical side of the ITS leaks into the virtual one. You
> have a GICv4, right?
Yes, the hardware supports GICv4 but I think current kernel doesn't
enable it.

> 
> It'd be interesting to find out what happens if you leave this interrupt
> disabled (don't enable eth2) and see if that interrupt magically appears
> or not.
> 
Ah, I found the guest uses ITS and there is a irq number 8193. If I use
a qemu without ITS feature then there is no such irq in trace output.

But there is still unexpected LR in vgic_lr[] array of irq 27. Nobody
calls vgic_update_irq_pending for irq 27 before below trace outputs.

 qemu-system-aar-6681  [021] ....  1081.718849: kvm_vgic_flush_hwstate:
VCPU: 0, lits-count: 0, LR: 0, 0, 0
 qemu-system-aar-6681  [021] ....  1081.718849: vgic_flush_lr_state:
used lr count is :0, irq1:0, irq2:0, irq3:0, irq4:0
 qemu-system-aar-6681  [021] d...  1081.718850: kvm_entry: PC:
0xffffff8008432940
 qemu-system-aar-6681  [021] ....  1081.718852: kvm_exit: TRAP: HSR_EC:
0x0024 (DABT_LOW), PC: 0xffffff8008432954
 qemu-system-aar-6681  [021] ....  1081.718852:
kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 0, 0, 0, 0
 qemu-system-aar-6681  [021] ....  1081.718855: kvm_vgic_flush_hwstate:
VCPU: 0, lits-count: 0, LR: 50a002000000001b, 0, 0, 0
 qemu-system-aar-6681  [021] ....  1081.718855: vgic_flush_lr_state:
used lr count is :0, irq1:0, irq2:0, irq3:0, irq4:0
 qemu-system-aar-6681  [021] d...  1081.718856: kvm_entry: PC:
0xffffff8008432958
 qemu-system-aar-6681  [021] ....  1081.718858: kvm_exit: TRAP: HSR_EC:
0x0024 (DABT_LOW), PC: 0xffffff800843291c

Thanks,
-- 
Shannon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic
  2016-12-07 11:00       ` Shannon Zhao
@ 2016-12-08 12:32         ` Christoffer Dall
  0 siblings, 0 replies; 11+ messages in thread
From: Christoffer Dall @ 2016-12-08 12:32 UTC (permalink / raw)
  To: Shannon Zhao; +Cc: Marc Zyngier, kvmarm

On Wed, Dec 07, 2016 at 07:00:35PM +0800, Shannon Zhao wrote:
> 
> 
> On 2016/12/7 16:10, Marc Zyngier wrote:
> > On 07/12/16 07:45, Shannon Zhao wrote:
> >>
> >>
> >> On 2016/12/6 19:47, Marc Zyngier wrote:
> >>> On 06/12/16 06:41, Shannon Zhao wrote:
> >>>> From: Shannon Zhao <shannon.zhao@linaro.org>
> >>>>
> >>>> Commit 50926d8(KVM: arm/arm64: The GIC is dead, long live the GIC)
> >>>> removes the old vgic and commit 9097773(KVM: arm/arm64: vgic-new: 
> >>>> vgic_init: implement kvm_vgic_hyp_init) doesn't reset LRs for new-vgic
> >>>> when probing GIC. These two patches add the missing part.
> >>>>
> >>>> BTW, here is a strange problem on Huawei D03 board when using
> >>>> upstream kernel that android guest with a goldfish_fb will hang with
> >>>> rcu_stall and interrupt timeout for goldfish_fb. We apply these patches
> >>>> but the problem still exists, while if we revert the commit
> >>>> b40c489(arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit) the guest runs
> >>>> well.
> >>>>
> >>>> We add a trace in kvm_vgic_flush_hwstate() to print the value of 
> >>>> compute_ap_list_depth(vcpu) and the value of vgic_lr before calling
> >>>> vgic_flush_lr_state(). The first output shows that the ap_list_depth is zero
> >>>> but the first one in vgic_lr is 10a0000000002001. I don't understand why
> >>>> there is a valued one in vgic_lr since the memory of vgic_lr is zero
> >>>> allocated. I think It should be zero when the vcpu first run and first
> >>>> call kvm_vgic_flush_hwstate().
> >>>>
> >>>> qemu-system-aar-6673  [016] ....   501.969251: kvm_vgic_flush_hwstate: VCPU: 0, lits-count: 0, LR: 10a0000000002001, 0, 0, 0
> >>>>
> >>>> I also add a trace at the end of vgic_flush_lr_state() which shows the
> >>>> kvm_vgic_global_state.nr_lr is 4, used_lrs is 0 and all LRs in vgic_lr
> >>>> are zero.
> >>>>
> >>>> qemu-system-aar-6673  [016] ....   501.969254: vgic_flush_lr_state_nuke: kvm_vgic_global_state.nr_lr is :4, irq1:0, irq2:0, irq3:0, irq4:0
> >>>>
> >>>> But the trace at the beginning of kvm_vgic_sync_hwstate() shows the
> >>>> first one of vgic_lr is 10a0000000002001.
> >>>>
> >>>> qemu-system-aar-6673  [016] ....   501.969261: kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 10a0000000002001, 0, 0, 0
> >>>>
> >>>> The above three trace outputs are printed by the first KVM_ENTRY/EXIT of VCPU 0.
> >>>
> >>> Decoding this LR value is interesting:
> >>>
> >>> 10a0000000002001
> >>> | |         | LPI 8193
> >>> | |
> >>> | Priority 0xa0
> >>> |
> >>> Group1
> >>>
> >>> Someone is injecting an LPI behind your back. If nobody populates this,
> >>> then you may want to investigate what is happening on the host side. Is
> >>> there anyone using this interrupt?
> >>>
> >>
> >> For this guest, I think nobody populates this LR, but on the host, there
> >> is a LPI interrupt 8193. It's a interrupt of eth2
> >>
> >> MBIGEN-V2 8193 Edge      eth2-tx0
> >>
> >> It's a little confused to me that the LR registers should only be used
> >> for VM, right? Why does the interrupt on host would affect the LRs?
> > 
> > It should never have an impact, but I'm worried that this could be a HW
> > bug where the physical side of the ITS leaks into the virtual one. You
> > have a GICv4, right?
> Yes, the hardware supports GICv4 but I think current kernel doesn't
> enable it.
> 
> > 
> > It'd be interesting to find out what happens if you leave this interrupt
> > disabled (don't enable eth2) and see if that interrupt magically appears
> > or not.
> > 
> Ah, I found the guest uses ITS and there is a irq number 8193. If I use
> a qemu without ITS feature then there is no such irq in trace output.
> 
> But there is still unexpected LR in vgic_lr[] array of irq 27. Nobody
> calls vgic_update_irq_pending for irq 27 before below trace outputs.
> 
>  qemu-system-aar-6681  [021] ....  1081.718849: kvm_vgic_flush_hwstate:
> VCPU: 0, lits-count: 0, LR: 0, 0, 0
>  qemu-system-aar-6681  [021] ....  1081.718849: vgic_flush_lr_state:
> used lr count is :0, irq1:0, irq2:0, irq3:0, irq4:0
>  qemu-system-aar-6681  [021] d...  1081.718850: kvm_entry: PC:
> 0xffffff8008432940
>  qemu-system-aar-6681  [021] ....  1081.718852: kvm_exit: TRAP: HSR_EC:
> 0x0024 (DABT_LOW), PC: 0xffffff8008432954
>  qemu-system-aar-6681  [021] ....  1081.718852:
> kvm_vgic_sync_hwstate_vgic_lr: VCPU: 0, used_lrs: 0, LR: 0, 0, 0, 0
>  qemu-system-aar-6681  [021] ....  1081.718855: kvm_vgic_flush_hwstate:
> VCPU: 0, lits-count: 0, LR: 50a002000000001b, 0, 0, 0
>  qemu-system-aar-6681  [021] ....  1081.718855: vgic_flush_lr_state:
> used lr count is :0, irq1:0, irq2:0, irq3:0, irq4:0
>  qemu-system-aar-6681  [021] d...  1081.718856: kvm_entry: PC:
> 0xffffff8008432958
>  qemu-system-aar-6681  [021] ....  1081.718858: kvm_exit: TRAP: HSR_EC:
> 0x0024 (DABT_LOW), PC: 0xffffff800843291c
> 

You could write a debug function that compares the GIC view of the LR
and the actual hardware value whenever vcpu_load and vcpu_put are
called, and see if this is a bug in the vgic code or this is related to
migrating vcpu threads around on cores that come and go.

Another thing to try is to pin each vcpu thread to physical cores and
see if you still see this problem.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: arm/arm64: vgic-v2: Add the missing resetting LRs at boot time
  2016-12-06 11:39   ` Marc Zyngier
@ 2016-12-15  9:09     ` Shannon Zhao
  0 siblings, 0 replies; 11+ messages in thread
From: Shannon Zhao @ 2016-12-15  9:09 UTC (permalink / raw)
  To: Marc Zyngier, kvmarm, christoffer.dall

Hi Marc,

On 2016/12/6 19:39, Marc Zyngier wrote:
> On 06/12/16 06:41, Shannon Zhao wrote:
>> From: Shannon Zhao <shannon.zhao@linaro.org>
>>
>> This is the corresponding part of commit d6400d7(KVM: arm/arm64:
>> vgic-v2: Reset LRs at boot time) which is missed for new-vgic.
>>
>> Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
>> ---
>>  virt/kvm/arm/vgic/vgic-v2.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
>> index 9bab867..c636a19 100644
>> --- a/virt/kvm/arm/vgic/vgic-v2.c
>> +++ b/virt/kvm/arm/vgic/vgic-v2.c
>> @@ -300,6 +300,15 @@ int vgic_v2_map_resources(struct kvm *kvm)
>>  
>>  DEFINE_STATIC_KEY_FALSE(vgic_v2_cpuif_trap);
>>  
>> +static void vgic_cpu_init_lrs(void *params)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < kvm_vgic_global_state.nr_lr; i++)
>> +		writel_relaxed(0, kvm_vgic_global_state.vctrl_base +
>> +				  GICH_LR0 + (i * 4));
>> +}
Since this function will use kvm_vgic_global_state which is initialized
by kvm_vgic_hyp_init, if we call it in cpu_hyp_reinit/cpu_init_hyp_mode,
the kvm_vgic_global_state is not initialized for now. Is that fine to
move kvm_vgic_hyp_init to the first place in init_subsystems?

Thanks,
-- 
Shannon

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-12-15  9:09 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-06  6:41 [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Shannon Zhao
2016-12-06  6:41 ` [PATCH 1/2] arm64: KVM: vgic-v3: Add the missing resetting LRs at boot time Shannon Zhao
2016-12-06 11:38   ` Marc Zyngier
2016-12-06  6:41 ` [PATCH 2/2] KVM: arm/arm64: vgic-v2: " Shannon Zhao
2016-12-06 11:39   ` Marc Zyngier
2016-12-15  9:09     ` Shannon Zhao
2016-12-06 11:47 ` [PATCH 0/2] Add the missing resetting LRs at boot time for new-vgic Marc Zyngier
2016-12-07  7:45   ` Shannon Zhao
2016-12-07  8:10     ` Marc Zyngier
2016-12-07 11:00       ` Shannon Zhao
2016-12-08 12:32         ` Christoffer Dall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).