From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Wangxin (Alexander)" <wangxinxin.wang@huawei.com>, mst@redhat.com
Cc: "Wuchenye \(karot,
Cloud Infrastructure Service Product Dept\)"
<wuchenye@huawei.com>,
"Zhoujian \(jay\)" <jianjay.zhou@huawei.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"quintela@redhat.com" <quintela@redhat.com>
Subject: Re: [RFC]migration: stop/start device at the end of live migration concurrently
Date: Mon, 1 Mar 2021 16:02:23 +0000
Message-ID: <YD0QD+6IZ2LkNnRN@work-vm>
In-Reply-To: <c716d92c659149f6bdb00c9aa642abf9@huawei.com>
* Wangxin (Alexander) (wangxinxin.wang@huawei.com) wrote:
> Hi all,
(copying in Michael as the vhost-user maintainer).
> We found that the migration downtime reaches a few seconds when live
> migrating a huge VM with 224 vCPUs / 180 GiB / 16 vhost-user NICs (x32 queues) /
> 24 vhost-user-blk disks (x4 queues); most of that time is spent stopping
> the devices on the source and starting them on the destination.
I suspect that's more vhost-user devices than anyone else has run on a
single VM!
> Our idea is to stop the devices through multiple threads at the end of
> migration. To be more specific, we create a thread pool at the beginning of
> live migration; when the migration thread calls the virtio_vmstate_change
> callback to stop or start a device in vm_state_notify, it submits a request
> to the thread pool to handle the callback concurrently.
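Just to make sure I'm following, a rough sketch of the shape you're describing,
using a plain GLib thread pool; the names here (DeviceStateJob,
submit_device_state_change, the pool size) are mine for illustration and not
from your patch:

/* Rough sketch only: run the per-device stop/start callbacks from a GLib
 * thread pool instead of sequentially from the migration thread. */
#include <glib.h>
#include <stdbool.h>

typedef struct DeviceStateJob {
    void (*cb)(void *opaque, bool running); /* e.g. a virtio_vmstate_change-style hook */
    void *opaque;
    bool running;                           /* false = stop, true = start */
} DeviceStateJob;

static GThreadPool *device_state_pool;

static void run_device_state_change(gpointer data, gpointer user_data)
{
    DeviceStateJob *job = data;

    job->cb(job->opaque, job->running);     /* stops or starts one device */
    g_free(job);
}

static void set_up_multithread(void)
{
    /* Created once at the beginning of live migration. */
    device_state_pool = g_thread_pool_new(run_device_state_change, NULL,
                                          8 /* workers */, FALSE, NULL);
}

static void submit_device_state_change(void (*cb)(void *, bool),
                                       void *opaque, bool running)
{
    DeviceStateJob *job = g_new0(DeviceStateJob, 1);

    job->cb = cb;
    job->opaque = opaque;
    job->running = running;
    g_thread_pool_push(device_state_pool, job, NULL);
}

The migration thread then still has to wait for all submitted jobs to finish
(e.g. a counter plus condition variable, or draining the pool with
g_thread_pool_free(pool, FALSE, TRUE)) before it can treat the devices as
stopped.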
>
> We live migrated the VM and measured the time spent at the different stages
> of stopping/starting the devices.
>
> - - -
>                                          Original   With concurrent
>                                                     state change
> Src  disk  get vring base                   36ms        18ms
>            disable guest notify             48ms        32ms
>            disable host notify             300ms       120ms
>      net   get vring base                 1376ms       294ms
>            disable host notify            1011ms       116ms
>            disable guest notify             59ms        40ms
> - - -
> Dst  net   enable guest notify             310ms        97ms
>            set memtable                     48ms        20ms
>            enable host notify             2022ms       114ms
>      disk  enable host notify              312ms        78ms
>            enable guest notify              32ms        23ms
>            set memtable                     16ms        10ms
>
> Total downtime                            5600ms       962ms
>
> However, there are some side effects:
> 1. When the host notify or guest notify is disabled concurrently, the VM
> crashes because the same notify is disabled from different threads. We
> currently add two different locks to work around this, but that is a hack
> and may introduce other problems.
>
> 2. Since the QEMU BQL is already held by the migration thread before the
> devices are stopped in migration_completion, a deadlock occurs in the
> following scenario:
>   migration thread [thread 1]            thread-pool worker
>   ---------------------------            ------------------
>   set_up_multithread
>   ...
>   migration_completion()                 # get QEMU BQL
>     qemu_mutex_lock_iothread()
>     vm_stop_force_state()
>       ...
>       submit stopping-device request
>       to thread pool          ------->   virtio_vmstate_change
>                                            virtio_set_status
>                                            ...
>                                            memory_region_transaction_begin
>                                            ...
>                                            prepare_mmio_access
>                                              qemu_mutex_iothread_locked()  # false
>                                              qemu_mutex_lock_iothread()    # deadlock
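For anyone wondering why the worker ends up in qemu_mutex_lock_iothread()
there: prepare_mmio_access() takes the BQL whenever the calling thread does
not already hold it. Simplified from QEMU's softmmu memory access path,
roughly:

/* Simplified excerpt, shown only to illustrate why the pool thread tries to
 * take the BQL that the migration thread already holds. */
static bool prepare_mmio_access(MemoryRegion *mr)
{
    bool release_lock = false;

    if (!qemu_mutex_iothread_locked()) {
        /* The pool worker does not hold the BQL, so it blocks here while the
         * migration thread holds it and waits for the worker: deadlock. */
        qemu_mutex_lock_iothread();
        release_lock = true;
    }
    if (mr->flush_coalesced_mmio) {
        qemu_flush_coalesced_mmio_buffer();
    }

    return release_lock;
}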
>
> For now we add another lock to replace the BQL in this scenario, but we
> think this is not reliable enough and carries the risk that other code paths
> also take the QEMU BQL while the devices are being stopped. Our question is:
> how should the conflict with the QEMU BQL be handled properly?
>
> Any advice will be appreciated, thanks.
To me it feels like the alternative here would be to explicitly split each of
these stages into two: one part that sends the request to the vhost-user
device and another that waits for the response from it (i.e. in the
vhost_user case, after the vhost_user_write but before the vhost_user_read).
So instead of parallelising everything in threads, you'd parallelise all of
the corresponding operations, so that all of the get_vring_base's happen at
the same time.

Michael: Would it make sense to change VhostOps get_vring_base (and many of
the others) into two-part operations?
(or maybe coroutines with a yield in???)
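Roughly this shape, say for get_vring_base; the _begin/_finish split and
those names are only illustrative (today this is the single blocking
vhost_user_get_vring_base(), and the message plumbing below is simplified
from hw/virtio/vhost-user.c):

/* Illustrative sketch only: split the blocking call into two halves so the
 * caller can issue the request to every vhost-user device first and collect
 * the replies afterwards.  The _begin/_finish names are made up. */
static int vhost_user_get_vring_base_begin(struct vhost_dev *dev,
                                           struct vhost_vring_state *ring)
{
    VhostUserMsg msg = {
        .hdr.request = VHOST_USER_GET_VRING_BASE,
        .hdr.flags = VHOST_USER_VERSION,
        .payload.state = *ring,
        .hdr.size = sizeof(msg.payload.state),
    };

    /* Send the request but do not wait for the reply yet. */
    return vhost_user_write(dev, &msg, NULL, 0);
}

static int vhost_user_get_vring_base_finish(struct vhost_dev *dev,
                                            struct vhost_vring_state *ring)
{
    VhostUserMsg msg;

    /* Now block for this device's reply and validate it. */
    if (vhost_user_read(dev, &msg) < 0) {
        return -1;
    }
    if (msg.hdr.request != VHOST_USER_GET_VRING_BASE ||
        msg.hdr.size != sizeof(msg.payload.state)) {
        return -1;
    }
    *ring = msg.payload.state;
    return 0;
}

The device stop path would then loop over all the devices calling _begin(),
and only afterwards loop again calling _finish(), so the round trips to the
vhost-user backends overlap instead of being serialised one device at a time.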
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK