From: Stefan Hajnoczi <stefanha@redhat.com>
To: 858585 jemmy <jemmy858585@gmail.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>, Fam Zheng <famz@redhat.com>,
qemu-block@nongnu.org, quintela@redhat.com,
qemu-devel@nongnu.org, dgilbert@redhat.com,
Lidong Chen <lidongchen@tencent.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v3] migration/block:limit the time used for block migration
Date: Tue, 11 Apr 2017 16:32:34 +0100 [thread overview]
Message-ID: <20170411153234.GA5940@stefanha-x1.localdomain> (raw)
In-Reply-To: <CAOGPPbfZjX0zsZZqhc0v6u2KNo8oDU9RvqTL-rW9W0Oid4ovMQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5596 bytes --]
On Tue, Apr 11, 2017 at 09:06:37PM +0800, 858585 jemmy wrote:
> On Tue, Apr 11, 2017 at 8:19 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> > On Mon, Apr 10, 2017 at 9:52 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >> On Sat, Apr 08, 2017 at 09:17:58PM +0800, 858585 jemmy wrote:
> >>> On Fri, Apr 7, 2017 at 7:34 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >>> > On Fri, Apr 07, 2017 at 09:30:33AM +0800, 858585 jemmy wrote:
> >>> >> On Thu, Apr 6, 2017 at 10:02 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >>> >> > On Wed, Apr 05, 2017 at 05:27:58PM +0800, jemmy858585@gmail.com wrote:
> >>> >> >
> >>> >> > A proper solution is to refactor the synchronous code to make it
> >>> >> > asynchronous. This might require invoking the system call from a
> >>> >> > thread pool worker.
> >>> >> >
> >>> >>
> >>> >> yes, i agree with you, but this is a big change.
> >>> >> I will try to find how to optimize this code, maybe need a long time.
> >>> >>
> >>> >> this patch is not a perfect solution, but can alleviate the problem.
> >>> >
> >>> > Let's try to understand the problem fully first.
> >>> >
> >>>
> >>> when migrate the vm with high speed, i find vnc response slowly sometime.
> >>> not only vnc response slowly, virsh console aslo response slowly sometime.
> >>> and the guest os block io performance is also reduce.
> >>>
> >>> the bug can be reproduce by this command:
> >>> virsh migrate-setspeed 165cf436-312f-47e7-90f2-f8aa63f34893 900
> >>> virsh migrate --live 165cf436-312f-47e7-90f2-f8aa63f34893
> >>> --copy-storage-inc qemu+ssh://10.59.163.38/system
> >>>
> >>> and --copy-storage-all have no problem.
> >>> virsh migrate --live 165cf436-312f-47e7-90f2-f8aa63f34893
> >>> --copy-storage-all qemu+ssh://10.59.163.38/system
> >>>
> >>> compare the difference between --copy-storage-inc and
> >>> --copy-storage-all. i find out the reason is
> >>> mig_save_device_bulk invoke bdrv_is_allocated, but bdrv_is_allocated
> >>> is synchronous and maybe wait
> >>> for a long time.
> >>>
> >>> i write this code to measure the time used by brdrv_is_allocated()
> >>>
> >>> 279 static int max_time = 0;
> >>> 280 int tmp;
> >>>
> >>> 288 clock_gettime(CLOCK_MONOTONIC_RAW, &ts1);
> >>> 289 ret = bdrv_is_allocated(blk_bs(bb), cur_sector,
> >>> 290 MAX_IS_ALLOCATED_SEARCH, &nr_sectors);
> >>> 291 clock_gettime(CLOCK_MONOTONIC_RAW, &ts2);
> >>> 292
> >>> 293
> >>> 294 tmp = (ts2.tv_sec - ts1.tv_sec)*1000000000L
> >>> 295 + (ts2.tv_nsec - ts1.tv_nsec);
> >>> 296 if (tmp > max_time) {
> >>> 297 max_time=tmp;
> >>> 298 fprintf(stderr, "max_time is %d\n", max_time);
> >>> 299 }
> >>>
> >>> the test result is below:
> >>>
> >>> max_time is 37014
> >>> max_time is 1075534
> >>> max_time is 17180913
> >>> max_time is 28586762
> >>> max_time is 49563584
> >>> max_time is 103085447
> >>> max_time is 110836833
> >>> max_time is 120331438
> >>>
> >>> bdrv_is_allocated is called after qemu_mutex_lock_iothread.
> >>> and the main thread is also call qemu_mutex_lock_iothread.
> >>> so cause the main thread maybe wait for a long time.
> >>>
> >>> if (bmds->shared_base) {
> >>> qemu_mutex_lock_iothread();
> >>> aio_context_acquire(blk_get_aio_context(bb));
> >>> /* Skip unallocated sectors; intentionally treats failure as
> >>> * an allocated sector */
> >>> while (cur_sector < total_sectors &&
> >>> !bdrv_is_allocated(blk_bs(bb), cur_sector,
> >>> MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
> >>> cur_sector += nr_sectors;
> >>> }
> >>> aio_context_release(blk_get_aio_context(bb));
> >>> qemu_mutex_unlock_iothread();
> >>> }
> >>>
> >>> #0 0x00007f107322f264 in __lll_lock_wait () from /lib64/libpthread.so.0
> >>> #1 0x00007f107322a508 in _L_lock_854 () from /lib64/libpthread.so.0
> >>> #2 0x00007f107322a3d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> >>> #3 0x0000000000949ecb in qemu_mutex_lock (mutex=0xfc51a0) at
> >>> util/qemu-thread-posix.c:60
> >>> #4 0x0000000000459e58 in qemu_mutex_lock_iothread () at /root/qemu/cpus.c:1516
> >>> #5 0x0000000000945322 in os_host_main_loop_wait (timeout=28911939) at
> >>> util/main-loop.c:258
> >>> #6 0x00000000009453f2 in main_loop_wait (nonblocking=0) at util/main-loop.c:517
> >>> #7 0x00000000005c76b4 in main_loop () at vl.c:1898
> >>> #8 0x00000000005ceb77 in main (argc=49, argv=0x7fff921182b8,
> >>> envp=0x7fff92118448) at vl.c:4709
> >>
> >> The following patch moves bdrv_is_allocated() into bb's AioContext. It
> >> will execute without blocking other I/O activity.
> >>
> >> Compile-tested only.
> > i will try this patch.
>
> Hi Stefan:
> It work for virtio. i will test ide later.
> Do you have any suggestion about the test case?
1. When testing virtio-blk it's interesting to try both -object
iothread,id=iothread0 -device virtio-blk-pci,iothread=iothread0,...
and without iothread. The code paths are different so there may be
bugs that only occur with iothread or without iothread.
2. The guest should be submitting I/O requests to increase the chance of
race conditions. You could run "dd if=/dev/vda of=/dev/null
iflag=direct bs=4k &" 8 times inside the guest to generate I/O.
Thanks,
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]
next prev parent reply other threads:[~2017-04-11 15:32 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-05 9:27 [Qemu-devel] [PATCH v3] migration/block:limit the time used for block migration jemmy858585
2017-04-05 9:34 ` Daniel P. Berrange
2017-04-05 10:44 ` 858585 jemmy
2017-04-06 3:18 ` 858585 jemmy
2017-04-06 14:02 ` Stefan Hajnoczi
2017-04-07 1:30 ` 858585 jemmy
2017-04-07 8:26 ` 858585 jemmy
2017-04-07 11:33 ` Stefan Hajnoczi
2017-04-08 10:09 ` Paolo Bonzini
2017-04-10 10:01 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-04-09 13:06 ` [Qemu-devel] " 858585 jemmy
2017-04-07 11:34 ` Stefan Hajnoczi
2017-04-08 13:17 ` 858585 jemmy
2017-04-10 13:52 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-04-11 12:19 ` 858585 jemmy
2017-04-11 13:06 ` 858585 jemmy
2017-04-11 15:32 ` Stefan Hajnoczi [this message]
2017-05-03 3:44 ` 858585 jemmy
2017-05-03 13:31 ` 858585 jemmy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170411153234.GA5940@stefanha-x1.localdomain \
--to=stefanha@redhat.com \
--cc=dgilbert@redhat.com \
--cc=famz@redhat.com \
--cc=jemmy858585@gmail.com \
--cc=lidongchen@tencent.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).