All of lore.kernel.org
 help / color / mirror / Atom feed
From: wanghaibin <wanghaibin.wang@huawei.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: cdall@linaro.org, "Huangweidong (C)" <weidong.huang@huawei.com>,
	marc.zyngier@arm.com, andre.przywara@arm.com,
	kvmarm@lists.cs.columbia.edu, wu.wubin@huawei.com
Subject: Re: [RFC PATCH 0/3] fix migrate failed when vm is in booting
Date: Thu, 21 Sep 2017 20:17:57 +0800	[thread overview]
Message-ID: <59C3ADF5.5010202@huawei.com> (raw)
In-Reply-To: <17cf4b4e-f3b1-ffa1-062c-460454aef049@redhat.com>

On 2017/9/20 15:16, Auger Eric wrote:

> Hi Wanghaibin,
> 
> On 20/09/2017 03:57, wanghaibin wrote:
>> On 2017/9/6 21:05, wanghaibin wrote:
>>
>>> We have a test scenario: vmlife and migrate fixed test.
>>>
>>> Here is a problem; VM migration failed caused the qemu core which gdb trace:
>>
>>
>> I tested migration with these patches in these days, and here is another scene can expose another problem.
>>
>> scene: migrate at the moment which vm boot in UEFI (EDK2, before the guest kernel boot) while vm reboot.
>> core file(at destination) gdb trace:
>> (gdb) bt
>> #0  0x0000ffff978dae84 in raise () from /usr/lib64/libc.so.6
>> #1  0x0000ffff978dcb80 in abort () from /usr/lib64/libc.so.6
>> #2  0x0000000000465468 in kvm_device_access ()
>> #3  0x00000000004a2d68 in kvm_arm_its_post_load ()
>> #4  0x0000000000629038 in gicv3_its_post_load ()
>> #5  0x00000000006c042c in vmstate_load_state ()
>> #6  0x000000000047acf8 in qemu_loadvm_section_start_full ()
>> #7  0x000000000047b234 in qemu_loadvm_state_main ()
>> #8  0x000000000047cdb8 in qemu_loadvm_state ()
>> #9  0x00000000006bf3cc in process_incoming_migration_co ()
>> #10 0x00000000007b40f0 in coroutine_trampoline ()
>> #11 0x0000ffff978ebfa0 in ?? () from /usr/lib64/libc.so.6
>>
>> The reason of the bug is vITS device doesn't reset it's registers value correctly, in qemu vits device reset logic, the iidr is set to 0,
>> and the kvm_arm_its_post_load check the iidr is 0 and return (without put any registers).
> 
> That's correct this is what currently prevents from writing the
> registers on reset. My initial goal was that this code would be executed
> only on restore() and not on reset(), hence this check.
> 
>>
>> Here maybe a problem when migrate at that moment (vm boot in UEFI), the restore interface can check the corresponding BASERn registers
>> valid successfully, and flush info from guest memory to kvm will failed in future (for example, the BASER[CT].valid=1, but the CTE maybe invalid).
>>
>> So, we should fix the vITS device reset interface, and there may be a rule that needs clarification:
>> In arm-vgic-its.txt, it has been described about ITS restore sequence:
>> d) restore the ITS in the following order:
>>    1. Restore GITS_CBASER
>>    2. Restore all other GITS_ registers, except GITS_CTLR!
>>    3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES)
>>    4. Restore GITS_CTLR
>>
>> But this sequence is inappropriate for reset interface based on our currently implementation , for example, reset interface write creadr/baser with
>> the initial value must rely on its disable firstly.
> 
> After the discussion with Christoffer, I think the preferred solution is
> to introduce a new KVM ITS device reset IOCTL and not write individual
> registers as this sequence would do. Anyway as you point ed out, this
> sequence does not work as is. Typically writing CREADR/CWRITER with 0
> fails at the moment because CBASER == 0.
> 
> So may I suggest I follow up with an implementation of the reset IOCTL +
> some cleanup patchs? Please an you respin the restore fix according to
> what we discussed?



Sure, Thanks.

> 
> Thanks
> 
> Eric
>>
>> So, minimal change (fix in qemu) maybe like:
>>
>> diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c
>> index 39903d5..9602dd3 100644
>> --- a/hw/intc/arm_gicv3_its_kvm.c
>> +++ b/hw/intc/arm_gicv3_its_kvm.c
>> @@ -160,7 +160,8 @@ static void kvm_arm_its_post_load(GICv3ITSState *s)
>>      int i;
>>
>>      if (!s->iidr) {
>> -        return;
>> +        kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> +                          GITS_CTLR, &s->ctlr, true, &error_abort);
>>      }
>>
>>      kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> @@ -190,8 +191,10 @@ static void kvm_arm_its_post_load(GICv3ITSState *s)
>>                        KVM_DEV_ARM_ITS_RESTORE_TABLES, NULL, true,
>>                        &error_abort);
>>
>> -    kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> -                      GITS_CTLR, &s->ctlr, true, &error_abort);
>> +    if (s->iidr) {
>> +        kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
>> +                          GITS_CTLR, &s->ctlr, true, &error_abort);
>> +    }
>>  }
>>
>> Any suggestion ?
>>
>> Thanks.
>>
>>>
>>> #0  0x0000ffffb023fe84 in raise () from /usr/lib64/libc.so.6
>>> #1  0x0000ffffb0241b80 in abort () from /usr/lib64/libc.so.6
>>> #2  0x000000000046b408 in kvm_device_access (fd=<optimized out>, group=4, attr=1, val=<optimized out>, write=<optimized out>)
>>>     at /usr/src/debug/qemu-2.6.0/kvm-all.c:2064
>>> #3  0x000000000059ade8 in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at vl.c:1728
>>> #4  0x000000000045736c in do_vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:744
>>> #5  0x00000000004573bc in vm_stop (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1434
>>> #6  0x0000000000457428 in vm_stop_force_state (state=state@entry=RUN_STATE_FINISH_MIGRATE) at /usr/src/debug/qemu-2.6.0/cpus.c:1442
>>> #7  0x00000000006d68d0 in migration_completion (s=s@entry=0xb92d60 <current_migration.37525>, current_active_state=4, 
>>>     current_active_state@entry=65535, old_vm_running=0xffffb03c2000, old_vm_running@entry=0xfffe934fc53f, 
>>>     start_time=0xfffe934fcce0, start_time@entry=0xfffe934fc540) at migration/migration.c:1798
>>> #8  0x00000000006d7480 in migration_thread (opaque=0xb92d60 <current_migration.37525>) at migration/migration.c:1995
>>> #9  0x0000ffffb0398dc4 in start_thread () from /usr/lib64/libpthread.so.0
>>> #10 0x0000ffffb02ef020 in thread_start () from /usr/lib64/libc.so.6
>>>
>>> qemu failed log :
>>> 2017-08-28T12:34:29.654396Z qemu-kvm: KVM_SET_DEVICE_ATTR failed: Invalid argument
>>>
>>> I try to debug it, and I found the migrate at the moment that the vm is still booting.
>>>
>>> So just change the guest like :
>>> diff --git a/drivers/pci/host/pci-host-common.c b/drivers/pci/host/pci-host-common.c
>>> index e9a53ba..d73fc03 100644
>>> --- a/drivers/pci/host/pci-host-common.c
>>> +++ b/drivers/pci/host/pci-host-common.c
>>> @@ -23,6 +23,7 @@
>>>  #include <linux/of_pci.h>
>>>  #include <linux/pci-ecam.h>
>>>  #include <linux/platform_device.h>
>>> +#include <linux/delay.h>
>>>  
>>>  static int gen_pci_parse_request_of_pci_ranges(struct device *dev,
>>>                        struct list_head *resources, struct resource **bus_range)
>>> @@ -119,6 +120,7 @@ int pci_host_common_probe(struct platform_device *pdev,
>>>         struct pci_bus *bus, *child;
>>>         struct pci_config_window *cfg;
>>>         struct list_head resources;
>>> +       int i;
>>>  
>>>         type = of_get_property(np, "device_type", NULL);
>>>         if (!type || strcmp(type, "pci")) {
>>> @@ -164,6 +166,8 @@ int pci_host_common_probe(struct platform_device *pdev,
>>>                         pcie_bus_configure_settings(child);
>>>         }
>>>  
>>> +       for (i=0;i<20000; i++)
>>> +               udelay(1000);
>>>         pci_bus_add_devices(bus);
>>>         return 0;
>>>  }
>>>
>>> And migrate at this delay time, It must be failed.
>>>
>>> This patchset try to fix this problem.
>>>
>>> BTW: This patchset just a demo, haven't do more test, and the implement maybe a little evil.
>>>
>>> Thanks
>>>
>>>
>>>
>>> wanghaibin (3):
>>>   kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function
>>>   kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset
>>>   kvm: arm/arm64: vgic-its: fix return value for restore
>>>
>>>  virt/kvm/arm/arm.c           |  5 ++++-
>>>  virt/kvm/arm/vgic/vgic-its.c | 26 +++++++++++++++++++-------
>>>  virt/kvm/arm/vgic/vgic.h     |  1 +
>>>  3 files changed, 24 insertions(+), 8 deletions(-)
>>>
>>
>>
>>
>> _______________________________________________
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>>
> 
> .
> 



_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

      reply	other threads:[~2017-09-21 12:20 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-06 13:05 [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin
2017-09-06 13:05 ` [RFC PATCH 1/3] kvm: arm/arm64: vgic-vits: separate vgic_its_free_list() function wanghaibin
2017-09-12  8:50   ` wanghaibin
2017-09-12 10:08     ` Auger Eric
2017-09-13 19:13       ` Christoffer Dall
2017-09-13 19:14   ` Christoffer Dall
2017-09-16  1:59     ` wanghaibin
2017-09-16 22:17       ` Christoffer Dall
2017-09-06 13:05 ` [RFC PATCH 2/3] kvm: arm/arm64: vgic-vits: free its resource when vm reboot/reset wanghaibin
2017-09-06 16:20   ` Auger Eric
2017-09-07  1:32     ` wanghaibin
2017-09-07 11:28       ` Auger Eric
2017-09-10 18:46         ` Auger Eric
2017-09-12 11:15           ` wanghaibin
2017-09-13  8:49             ` Auger Eric
2017-09-13 19:34   ` Christoffer Dall
2017-09-13 21:13     ` Auger Eric
2017-09-14  5:34       ` Christoffer Dall
2017-09-06 13:05 ` [RFC PATCH 3/3] kvm: arm/arm64: vgic-its: fix return value for restore wanghaibin
2017-09-06 15:18   ` Auger Eric
2017-09-13 20:02     ` Christoffer Dall
2017-09-13 21:25       ` Auger Eric
2017-09-14  5:35         ` Christoffer Dall
2017-09-13 20:04   ` Christoffer Dall
2017-09-14  8:30   ` Auger Eric
2017-09-16  2:02     ` wanghaibin
2017-09-20  1:57 ` [RFC PATCH 0/3] fix migrate failed when vm is in booting wanghaibin
2017-09-20  7:16   ` Auger Eric
2017-09-21 12:17     ` wanghaibin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59C3ADF5.5010202@huawei.com \
    --to=wanghaibin.wang@huawei.com \
    --cc=andre.przywara@arm.com \
    --cc=cdall@linaro.org \
    --cc=eric.auger@redhat.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=marc.zyngier@arm.com \
    --cc=weidong.huang@huawei.com \
    --cc=wu.wubin@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.