From: wanghaibin <wanghaibin.wang@huawei.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: cdall@linaro.org, "Huangweidong (C)" <weidong.huang@huawei.com>,
marc.zyngier@arm.com, andre.przywara@arm.com,
kvmarm@lists.cs.columbia.edu, wu.wubin@huawei.com
Subject: Re: [RFC PATCH 0/3] fix migrate failed when vm is in booting
Date: Thu, 21 Sep 2017 20:17:57 +0800 [thread overview]
Message-ID: <59C3ADF5.5010202@huawei.com> (raw)
In-Reply-To: <17cf4b4e-f3b1-ffa1-062c-460454aef049@redhat.com>
On 2017/9/20 15:16, Auger Eric wrote:
> Hi Wanghaibin,
>
> On 20/09/2017 03:57, wanghaibin wrote:
>> On 2017/9/6 21:05, wanghaibin wrote:
>>
>>> We have a test scenario: vmlife and migrate fixed test.
>>>
>>> Here is a problem; VM migration failed caused the qemu core which gdb trace:
>>
>>
>> I tested migration with these patches in these days, and here is another scene can expose another problem.
>>
>> scene: migrate at the moment which vm boot in UEFI (EDK2, before the guest kernel boot) while vm reboot.
>> core file(at destination) gdb trace:
>> (gdb) bt
>> #0 0x0000ffff978dae84 in raise () from /usr/lib64/libc.so.6
>> #1 0x0000ffff978dcb80 in abort () from /usr/lib64/libc.so.6
>> #2 0x0000000000465468 in kvm_device_access ()
>> #3 0x00000000004a2d68 in kvm_arm_its_post_load ()
>> #4 0x0000000000629038 in gicv3_its_post_load ()
>> #5 0x00000000006c042c in vmstate_load_state ()
>> #6 0x000000000047acf8 in qemu_loadvm_section_start_full ()
>> #7 0x000000000047b234 in qemu_loadvm_state_main ()
>> #8 0x000000000047cdb8 in qemu_loadvm_state ()
>> #9 0x00000000006bf3cc in process_incoming_migration_co ()
>> #10 0x00000000007b40f0 in coroutine_trampoline ()
>> #11 0x0000ffff978ebfa0 in ?? () from /usr/lib64/libc.so.6
>>
>> The reason of the bug is vITS device doesn't reset it's registers value correctly, in qemu vits device reset logic, the iidr is set to 0,
>> and the kvm_arm_its_post_load check the iidr is 0 and return (without put any registers).
>
> That's correct this is what currently prevents from writing the
> registers on reset. My initial goal was that this code would be executed
> only on restore() and not on reset(), hence this check.
>
>>
>> Here maybe a problem when migrate at that moment (vm boot in UEFI), the restore interface can check the corresponding BASERn registers
>> valid successfully, and flush info from guest memory to kvm will failed in future (for example, the BASER[CT].valid=1, but the CTE maybe invalid).
>>
>> So, we should fix the vITS device reset interface, and there may be a rule that needs clarification:
>> In arm-vgic-its.txt, it has been described about ITS restore sequence:
>> d) restore the ITS in the following order:
>> 1. Restore GITS_CBASER
>> 2. Restore all other GITS_ registers, except GITS_CTLR!
>> 3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES)
>> 4. Restore GITS_CTLR
>>
>> But this sequence is inappropriate for reset interface based on our currently implementation , for example, reset interface write creadr/baser with
>> the initial value must rely on its disable firstly.
>
> After the discussion with Christoffer, I think the preferred solution is
> to introduce a new KVM ITS device reset IOCTL and not write individual
> registers as this sequence would do. Anyway as you point ed out, this
> sequence does not work as is. Typically writing CREADR/CWRITER with 0
> fails at the moment because CBASER == 0.
>
> So may I suggest I follow up with an implementation of the reset IOCTL +
> some cleanup patchs? Please an you respin the restore fix according to
> what we discussed?
Sure, Thanks.
>
> Thanks
>
> Eric
>>
>> So, minimal change (fix in qemu) maybe like:
>>
>> diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c
>> index 39903d5..9602dd3 100644
>> --- a/hw/intc/arm_gicv3_its_kvm.c
>> +++ b/hw/intc/arm_gicv3_its_kvm.c
>> @@ -160,7 +160,8 @@ static void kvm_arm_its_post_load(GICv3ITSState *s)
>> int i;
>>
>> if (!s->iidr) {
>> - return;
>> + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> + GITS_CTLR, &s->ctlr, true, &error_abort);
>> }
>>
>> kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> @@ -190,8 +191,10 @@ static void kvm_arm_its_post_load(GICv3ITSState *s)
>> KVM_DEV_ARM_ITS_RESTORE_TABLES, NULL, true,
>> &error_abort);
>>
>> - kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> - GITS_CTLR, &s->ctlr, true, &error_abort);
>> + if (s->iidr) {
>> + kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> + GITS_CTLR, &s->ctlr, true, &error_abort);
>> + }
>> }
>>
>> Any suggestion ?
>>
>> Thanks.
>>
>>>
>>> #0 0x0000ffffb023fe84 in raise () from /usr/lib64/libc.so.6
>>> #1 0x0000ffffb0241b80 in abort () from /usr/lib64/libc.so.6
>>> #2 0x000000000046b408 in kvm_device_access (fd=<optimized out>, group=4, attr=1, val=<optimized out>, write=<optimized out>)
>>> at /usr/src/debug/qemu-2.6.0/kvm-all.c:2064
>>> #3 0x000000000059ade8 in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at vl.c:1728
>>> #4 0x000000000045736c in do_vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:744
>>> #5 0x00000000004573bc in vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1434
>>> #6 0x0000000000457428 in vm_stop_force_state (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1442
>>> #7 0x00000000006d68d0 in migration_completion (s=s@entry=0xb92d60 <current_migration.37525>, current_active_state=4,
>>> current_active_state@entry=65535, old_vm_running=0xffffb03c2000, old_vm_running@entry=0xfffe934fc53f,
>>> start_time=0xfffe934fcce0, start_time@entry=0xfffe934fc540) at migration/migration.c:1798
>>> #8 0x00000000006d7480 in migration_thread (opaque=0xb92d60 <current_migration.37525>) at migration/migration.c:1995
>>> #9 0x0000ffffb0398dc4 in start_thread () from /usr/lib64/libpthread.so.0
>>> #10 0x0000ffffb02ef020 in thread_start () from /usr/lib64/libc.so.6
>>>
>>> qemu failed log :
>>> 2017-08-28T12:34:29.654396Z qemu-kvm: KVM_SET_DEVICE_ATTR failed: Invalid argument
>>>
>>> I try to debug it, and I found the migrate at the moment that the vm is still booting.
>>>
>>> So just change the guest like :
>>> diff --git a/drivers/pci/host/pci-host-common.c b/drivers/pci/host/pci-host-common.c
>>> index e9a53ba..d73fc03 100644
>>> --- a/drivers/pci/host/pci-host-common.c
>>> +++ b/drivers/pci/host/pci-host-common.c
>>> @@ -23,6 +23,7 @@
>>> #include <linux/of_pci.h>
>>> #include <linux/pci-ecam.h>
>>> #include <linux/platform_device.h>
>>> +#include <linux/delay.h>
>>>
>>> static int gen_pci_parse_request_of_pci_ranges(struct device *dev,
>>> struct list_head *resources, struct resource **bus_range)
>>> @@ -119,6 +120,7 @@ int pci_host_common_probe(struct platform_device *pdev,
>>> struct pci_bus *bus, *child;
>>> struct pci_config_window *cfg;
>>> struct list_head resources;
>>> + int i;
>>>
>>> type = of_get_property(np, "device_type", NULL);
>>> if (!type || strcmp(type, "pci")) {
>>> @@ -164,6 +166,8 @@ int pci_host_common_probe(struct platform_device *pdev,
>>> pcie_bus_configure_settings(child);
>>> }
>>>
>>> + for (i=0;i<20000; i++)
>>> + udelay(1000);
>>> pci_bus_add_devices(bus);
>>> return 0;
>>> }
>>>
>>> And migrate at this delay time, It must be failed.
>>>
>>> This patchset try to fix this problem.
>>>
>>> BTW: This patchset just a demo, haven't do more test, and the implement maybe a little evil.
>>>
>>> Thanks
>>>
>>>
>>>
>>> wanghaibin (3):
>>> kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function
>>> kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset
>>> kvm: arm/arm64: vgic-its: fix return value for restore
>>>
>>> virt/kvm/arm/arm.c | 5 ++++-
>>> virt/kvm/arm/vgic/vgic-its.c | 26 +++++++++++++++++++-------
>>> virt/kvm/arm/vgic/vgic.h | 1 +
>>> 3 files changed, 24 insertions(+), 8 deletions(-)
>>>
>>
>>
>>
>> _______________________________________________
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>>
>
> .
>
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
prev parent reply other threads:[~2017-09-21 12:20 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin
2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin
2017-09-12 8:50 ` wanghaibin
2017-09-12 10:08 ` Auger Eric
2017-09-13 19:13 ` Christoffer Dall
2017-09-13 19:14 ` Christoffer Dall
2017-09-16 1:59 ` wanghaibin
2017-09-16 22:17 ` Christoffer Dall
2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin
2017-09-06 16:20 ` Auger Eric
2017-09-07 1:32 ` wanghaibin
2017-09-07 11:28 ` Auger Eric
2017-09-10 18:46 ` Auger Eric
2017-09-12 11:15 ` wanghaibin
2017-09-13 8:49 ` Auger Eric
2017-09-13 19:34 ` Christoffer Dall
2017-09-13 21:13 ` Auger Eric
2017-09-14 5:34 ` Christoffer Dall
2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin
2017-09-06 15:18 ` Auger Eric
2017-09-13 20:02 ` Christoffer Dall
2017-09-13 21:25 ` Auger Eric
2017-09-14 5:35 ` Christoffer Dall
2017-09-13 20:04 ` Christoffer Dall
2017-09-14 8:30 ` Auger Eric
2017-09-16 2:02 ` wanghaibin
2017-09-20 1:57 ` [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin
2017-09-20 7:16 ` Auger Eric
2017-09-21 12:17 ` wanghaibin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=59C3ADF5.5010202@huawei.com \
--to=wanghaibin.wang@huawei.com \
--cc=andre.przywara@arm.com \
--cc=cdall@linaro.org \
--cc=eric.auger@redhat.com \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=marc.zyngier@arm.com \
--cc=weidong.huang@huawei.com \
--cc=wu.wubin@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox