From: wanghaibin <wanghaibin.wang@huawei.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: marc.zyngier@arm.com, cdall@linaro.org,
	kvmarm@lists.cs.columbia.edu, wu.wubin@huawei.com,
	andre.przywara@arm.com
Subject: Re: [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset
Date: Tue, 12 Sep 2017 19:15:44 +0800
Message-ID: <59B7C1E0.1080109@huawei.com>
In-Reply-To: <fbf06eb5-f894-1719-616c-79ccf3322c1f@redhat.com>

On 2017/9/11 2:46, Auger Eric wrote:

> Hi Wanghaibin,
> 
> On 07/09/2017 13:28, Auger Eric wrote:
>> Hi Wanghaibin,
>>
>> On 07/09/2017 03:32, wanghaibin wrote:
>>> On 2017/9/7 0:20, Auger Eric wrote:
>>>
>>>> Hi,
>>>>
>>>> On 06/09/2017 15:05, wanghaibin wrote:
>>>>> This patch fixes the failure to save the ITS tables during migration.
>>>>>
>>>>> When the virtual machine is booting and its devices have not been initialized,
>>>>> all of the virtual DTEs/ITEs may be invalid. If we migrate at this moment, the save
>>>>> tables interface traverses the device list and checks whether each DTE is valid.
>>>>> If one is not, it returns -EINVAL.
>>>>
>>>> The issue on save is less clear to me. We are not checking the "dte" are
>>>> valid as it is said above. We are scrolling the ITS lists - which may be
>>>> empty - and dump them in guest memory.
>>>>
>>>> On save() there are quite few checks that can cause a failure.
>>>> vgic_its_check_id() can be among them. This typically requires the
>>>> GITS_BASER to have been properly set. Failing on save looks OK to me in
>>>> such situation.
>>>>
>>>> Sorry but I don't get the purpose of this patch. Does it fix a save failure?
>>>
>>>
>>> Yes. For save, vgic_its_check_id() checks whether the L1 DTE is valid, with
>>> code like this:
>>>
>>> 	/* Each 1st level entry is represented by a 64-bit value. */
>>> 	if (kvm_read_guest(its->dev->kvm,
>>> 			   BASER_ADDRESS(baser) + index * sizeof(indirect_ptr),
>>> 			   &indirect_ptr, sizeof(indirect_ptr)))
>>> 		return false;
>>>
>>> 	indirect_ptr = le64_to_cpu(indirect_ptr);
>>>
>>> 	/* check the valid bit of the first level entry */
>>> 	if (!(indirect_ptr & BIT_ULL(63)))
>>> 		return false;
>>>
>>> If the entry is invalid, vgic_its_check_id() returns false and the save returns -EINVAL.
>>>
>>> And from the cover letter, the problem happens when no PCI device has been probed (the guest driver has not
>>> issued any MAPD or MAPTI), so the L1 DTEs are all invalid at that point. Just as you said, if we migrate at this
>>> moment we scroll the ITS lists, then check_id fails and the save interface fails.
>>>
>>> I think the root cause is that the device lists are never freed: at reset/reboot the ITS dev/col/itt lists are
>>> not freed and set to NULL, so the save interface fails.
>>> This patch tries to free these resources when the VM reboots or resets.
>> OK understood. Indeed none of the device/collection lists should be non
>> empty at that stage, i.e. when GITS_BASERn have not been written yet and
>> are marked invalid.
>>
>> For solving the specific save() issue here, I think the best is to check
>> the validity bit of the GITS_BASER (col, device) and if invalid do nothing.
> 
> Actually the above proposal does not work as GITS_BASERn is not properly
> reset. Maybe the best way is to introduce an ITS KVM device reset IOCTL
> in the control group. Upon this command we could properly reset the
> requested registers and the lists.


Yes, these lists should be freed when the vits is reset.

This patch relies on has_run_once and vcpu_init to detect that a vcpu reset has happened,
and then scrolls through all kvm devices to find the vits device and free its lists.
I think that is a little odd too.

If we can add the reset IOCTL, I think that would be the best way.
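
To make the failure and the effect of a reset concrete, here is a rough user-space
sketch. It is not kernel code and not a proposed implementation: every toy_* name,
the 4-entry table and the singly linked list are made up for illustration. It only
models the two facts discussed above: the save path rejects a cached device whose
L1 entry has the valid bit (bit 63) clear, and a reset that frees the list lets the
save succeed again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_L1_VALID   (1ULL << 63)  /* valid bit of a first-level entry */
#define TOY_L1_ENTRIES 4             /* toy table size, illustration only */

struct toy_device {                  /* stand-in for one cached ITS device */
	uint32_t device_id;
	struct toy_device *next;
};

struct toy_its {
	uint64_t l1_table[TOY_L1_ENTRIES]; /* models guest memory behind GITS_BASER */
	struct toy_device *devices;        /* models the in-kernel device list */
};

/* Same idea as the quoted check: an ID is usable only if its L1 entry is valid. */
static bool toy_check_id(const struct toy_its *its, uint32_t id)
{
	if (id >= TOY_L1_ENTRIES)
		return false;
	return (its->l1_table[id] & TOY_L1_VALID) != 0;
}

/* Models the guest setting up its table and mapping a device. */
static int toy_map_device(struct toy_its *its, uint32_t id)
{
	struct toy_device *dev;

	if (id >= TOY_L1_ENTRIES)
		return -EINVAL;
	dev = malloc(sizeof(*dev));
	if (!dev)
		return -ENOMEM;
	dev->device_id = id;
	dev->next = its->devices;
	its->devices = dev;
	its->l1_table[id] |= TOY_L1_VALID;
	return 0;
}

/* Models a guest reboot: the tables are invalid again, the list is left stale. */
static void toy_guest_reboot(struct toy_its *its)
{
	memset(its->l1_table, 0, sizeof(its->l1_table));
}

/* Models save: walk the cached list and fail on the first invalid entry. */
static int toy_save(const struct toy_its *its)
{
	const struct toy_device *dev;

	for (dev = its->devices; dev; dev = dev->next)
		if (!toy_check_id(its, dev->device_id))
			return -EINVAL;
	return 0;
}

/* Models the reset being discussed: free the list and clear the table. */
static void toy_reset(struct toy_its *its)
{
	assert(its != NULL);
	while (its->devices) {
		struct toy_device *next = its->devices->next;

		free(its->devices);
		its->devices = next;
	}
	memset(its->l1_table, 0, sizeof(its->l1_table));
}
```

In this model, calling toy_map_device(), then toy_guest_reboot(), then toy_save()
gives -EINVAL because the stale list still holds the device. Calling toy_reset()
first leaves an empty list, so toy_save() has nothing to check and returns 0.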

Thanks.

> 
> Thanks
> 
> Eric
>>
>> Then we need to have a more global discussion about whether, when and
>> where the device and collection lists need to be freed.
>>
>> If you want I can respin with above suggestion and add the valid pointer
>> to the entry_fn_t to handle the restore path. Up to you.


I have long wanted to contribute code to the community, but so far I have not managed to.
So I would like to collect the proposed solutions for this problem and try to fix it myself first. May I?

Thanks.

>>
>> Thanks
>>
>> Eric
>>
>>
>>> BTW: these lists will be rebuilt when the rebooted VM runs the PCI device probe step.
>>
>>>
>>> Thanks
>>>
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>
>>>>>
>>>>> This patch tries to free the ITS list resources when the VM reboots or resets, to avoid this.
>>>>>
>>>>> Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com>
>>>>> ---
>>>>>  virt/kvm/arm/arm.c           |  5 ++++-
>>>>>  virt/kvm/arm/vgic/vgic-its.c | 10 ++++++++++
>>>>>  virt/kvm/arm/vgic/vgic.h     |  1 +
>>>>>  3 files changed, 15 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>>>> index a39a1e1..db7632d 100644
>>>>> --- a/virt/kvm/arm/arm.c
>>>>> +++ b/virt/kvm/arm/arm.c
>>>>> @@ -46,6 +46,7 @@
>>>>>  #include <asm/kvm_coproc.h>
>>>>>  #include <asm/kvm_psci.h>
>>>>>  #include <asm/sections.h>
>>>>> +#include "vgic.h"
>>>>>  
>>>>>  #ifdef REQUIRES_VIRT
>>>>>  __asm__(".arch_extension	virt");
>>>>> @@ -901,8 +902,10 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>>>>>  	 * Ensure a rebooted VM will fault in RAM pages and detect if the
>>>>>  	 * guest MMU is turned off and flush the caches as needed.
>>>>>  	 */
>>>>> -	if (vcpu->arch.has_run_once)
>>>>> +	if (vcpu->arch.has_run_once) {
>>>>>  		stage2_unmap_vm(vcpu->kvm);
>>>>> +		vgic_its_free_resource(vcpu->kvm);
>>>>> +	}
>>>>>  
>>>>>  	vcpu_reset_hcr(vcpu);
>>>>>  
>>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>>>> index 25d614f..5c20352 100644
>>>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>>>> @@ -2467,6 +2467,16 @@ static int vgic_its_get_attr(struct kvm_device *dev,
>>>>>  	.has_attr = vgic_its_has_attr,
>>>>>  };
>>>>>  
>>>>> +void vgic_its_free_resource(struct kvm *kvm)
>>>>> +{
>>>>> +	struct kvm_device *dev, *tmp;
>>>>> +
>>>>> +	list_for_each_entry_safe(dev, tmp, &kvm->devices, vm_node) {
>>>>> +		if(dev->ops == &kvm_arm_vgic_its_ops)
>>>>> +			vgic_its_free_list(kvm, dev->private);
>>>>> +	}
>>>>> +}
>>>>> +
>>>>>  int kvm_vgic_register_its_device(void)
>>>>>  {
>>>>>  	return kvm_register_device_ops(&kvm_arm_vgic_its_ops,
>>>>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>>>>> index c2be5b7..fbcbdfd 100644
>>>>> --- a/virt/kvm/arm/vgic/vgic.h
>>>>> +++ b/virt/kvm/arm/vgic/vgic.h
>>>>> @@ -222,5 +222,6 @@ int vgic_v3_line_level_info_uaccess(struct kvm_vcpu *vcpu, bool is_write,
>>>>>  
>>>>>  bool lock_all_vcpus(struct kvm *kvm);
>>>>>  void unlock_all_vcpus(struct kvm *kvm);
>>>>> +void vgic_its_free_resource(struct kvm *kvm);
>>>>>  
>>>>>  #endif
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>>
> 
> .
> 
