From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Yury Kotov <yury-kotov@yandex-team.ru>
Cc: "Juan Quintela" <quintela@redhat.com>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Evgeny Yakovlev" <wrfsh@yandex-team.ru>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] migration: fix migration shutdown
Date: Wed, 3 Apr 2019 20:05:03 +0100 [thread overview]
Message-ID: <20190403190503.GI2790@work-vm> (raw)
In-Reply-To: <20190403083841.830-1-yury-kotov@yandex-team.ru>
* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> It fixes heap-use-after-free which was found by clang's ASAN.
>
> Control flow of this use-after-free:
> main_thread:
> * Got SIGTERM and completes main loop
> * Calls migration_shutdown
> - migrate_fd_cancel (so, migration_thread begins to complete)
> - object_unref(OBJECT(current_migration));
>
> migration_thread:
> * migration_iteration_finish -> schedule cleanup bh
> * object_unref(OBJECT(s)); (Now, current_migration is freed)
> * exits
>
> main_thread:
> * Calls vm_shutdown -> drain bdrvs -> main loop
> -> cleanup_bh -> use after free
>
> If you want to reproduce, these couple of sleeps will help:
> vl.c:4613:
> migration_shutdown();
> + sleep(2);
> migration.c:3269:
> + sleep(1);
> trace_migration_thread_after_loop();
> migration_iteration_finish(s);
Fun; I hadn't realised you could get more main loop after exiting
the main loop.
Is this fallout from my 892ae715b6b or is this just another case it
didn't catch?
I guess it moved the shutdown early, so it probably is fallout, and
so probably does need to go to 4.0 ?
> Original output:
> qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>)
> =================================================================
> ==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210
> at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188
> READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0)
> #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23
> #1 0x5555594fde0a in aio_bh_call util/async.c:90:5
> #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13
> #3 0x555559524783 in aio_poll util/aio-posix.c:725:17
> #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5
> #5 0x5555573bddf6 in virtio_blk_data_plane_stop
> hw/block/dataplane/virtio-blk.c:282:5
> #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9
> #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5
> #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9
> #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9
> #10 0x555557c36713 in vm_state_notify vl.c:1605:9
> #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9
> #12 0x55555716eeff in vm_shutdown cpus.c:1092:12
> #13 0x555557c4283e in main vl.c:4617:5
> #14 0x7fffdfdb482f in __libc_start_main
> (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
> #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118)
>
> 0x61900001d210 is located 144 bytes inside of 952-byte region
> [0x61900001d180,0x61900001d538)
> freed by thread T6 (live_migration) here:
> #0 0x555556f76782 in __interceptor_free
> /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3
> #1 0x555558d5fa94 in object_finalize qom/object.c:618:9
> #2 0x555558d57651 in object_unref qom/object.c:1068:9
> #3 0x555558a55588 in migration_thread migration/migration.c:3272:5
> #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9
> #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
>
> previously allocated by thread T0 (qemu-vm-0) here:
> #0 0x555556f76b03 in __interceptor_malloc
> /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3
> #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8)
> #2 0x555558d58031 in object_new qom/object.c:640:12
> #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25
> #4 0x555557c41398 in main vl.c:4320:5
> #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
>
> Thread T6 (live_migration) created by T0 (qemu-vm-0) here:
> #0 0x555556f5f0dd in pthread_create
> /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3
> #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11
> #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5
> #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5
> #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5
> #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9
> #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5
> #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5
> #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11
> #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11
> #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9
> #11 0x5555594fde0a in aio_bh_call util/async.c:90:5
> #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13
> #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5
> #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5
> #15 0x7ffff6ede196 in g_main_context_dispatch
> (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196)
>
> SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23
> in migrate_fd_cleanup
> Shadow bytes around the buggy address:
> 0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> 0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> 0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> 0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> 0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> =>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> Array cookie: ac
> Intra object redzone: bb
> ASan internal: fe
> Left alloca redzone: ca
> Right alloca redzone: cb
> Shadow gap: cc
> ==31958==ABORTING
>
> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> ---
> migration/migration.c | 30 ++++++++++++++++++++++++------
> 1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 609e0df5d0..54d82bb88b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
> int *current_active_state,
> int new_state);
> static void migrate_fd_cancel(MigrationState *s);
> +static void migrate_join_thread(MigrationState *s);
> +static void migrate_fd_cleanup(void *opaque);
>
> void migration_object_init(void)
> {
> @@ -176,6 +178,17 @@ void migration_shutdown(void)
> * stop the migration using this structure
> */
> migrate_fd_cancel(current_migration);
> + /*
> + * migration_thread schedules cleanup_bh, but migration_shutdown is called
> + * from the end of main, so cleanu_up may not be called because main-loop is
> + * complete. So, wait for the complition of migration_thread, cancel bh and
> + * call cleanup ourselves.
> + */
(Some typos)
> + migrate_join_thread(current_migration);
I'll admit to being a bit nervous that something in here could be
hanging and thus we could hang here rather than die as SIGTERM is
expecting us to.
> + if (current_migration->cleanup_bh) {
> + qemu_bh_cancel(current_migration->cleanup_bh);
Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?
> + migrate_fd_cleanup(current_migration);
> + }
> object_unref(OBJECT(current_migration));
> }
>
> @@ -1495,6 +1508,16 @@ static void block_cleanup_parameters(MigrationState *s)
> }
> }
>
> +static void migrate_join_thread(MigrationState *s)
> +{
> + qemu_mutex_unlock_iothread();
> + if (s->migration_thread_running) {
> + qemu_thread_join(&s->thread);
> + s->migration_thread_running = false;
> + }
> + qemu_mutex_lock_iothread();
> +}
> +
> static void migrate_fd_cleanup(void *opaque)
> {
> MigrationState *s = opaque;
> @@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
> QEMUFile *tmp;
>
> trace_migrate_fd_cleanup();
> - qemu_mutex_unlock_iothread();
> - if (s->migration_thread_running) {
> - qemu_thread_join(&s->thread);
> - s->migration_thread_running = false;
> - }
> - qemu_mutex_lock_iothread();
> + migrate_join_thread(s);
>
> multifd_save_cleanup();
> qemu_mutex_lock(&s->qemu_file_lock);
Dave
> --
> 2.21.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2019-04-03 19:05 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-03 8:38 [Qemu-devel] [PATCH] migration: fix migration shutdown Yury Kotov
2019-04-03 19:05 ` Dr. David Alan Gilbert [this message]
2019-04-04 8:12 ` Yury Kotov
2019-04-05 9:07 ` Dr. David Alan Gilbert
2019-04-08 11:36 ` Yury Kotov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190403190503.GI2790@work-vm \
--to=dgilbert@redhat.com \
--cc=alex.bennee@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=wrfsh@yandex-team.ru \
--cc=yury-kotov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.