From: "Daniel P. Berrangé" <berrange@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: pkrempa@redhat.com, qemu-devel@nongnu.org, peterx@redhat.com,
quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH] Migration+TLS: Fix crash due to double cleanup
Date: Tue, 1 May 2018 11:11:27 +0100 [thread overview]
Message-ID: <20180501101127.GK5708@redhat.com> (raw)
In-Reply-To: <20180501100035.GJ5708@redhat.com>
On Tue, May 01, 2018 at 11:00:35AM +0100, Daniel P. Berrangé wrote:
> On Mon, Apr 30, 2018 at 07:59:43PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > During a TLS connect we see:
> > migration_channel_connect calls
> > migration_tls_channel_connect
> > (calls after TLS setup)
> > migration_channel_connect
> >
> > My previous error handling fix made migration_channel_connect
> > call migrate_fd_connect in all cases; unfortunately the above
> > means it gets called twice and crashes doing double cleanup.
> >
> > Fixes: 688a3dcba98
>
> This fixes the crash, but we're still seeing error messages duplicated
>
> (qemu) migrate_set_parameter tls-creds tls0
> (qemu) migrate tcp:localhost:9000
> qemu-system-x86_64: Certificate does not match the hostname localhost
> qemu-system-x86_64: Certificate does not match the hostname localhost
>
> git bisect points to 688a3dcba98 as the cause of these double
> errors still.
FYI the stack traces look like this..
The first error message is printed in this context:
#0 0x0000555555bb6c20 in error_report (fmt=0x555555d0d77e "%s") at util/qemu-error.c:273
#1 0x0000555555bb5fa5 in error_report_err (err=0x55555749cb00) at util/error.c:228
#2 0x0000555555a6d1f8 in migrate_fd_cleanup (opaque=opaque@entry=0x5555569925a0) at migration/migration.c:1106
#3 0x0000555555a6e2aa in migrate_fd_connect (s=s@entry=0x5555569925a0, error_in=0x55555768de80) at migration/migration.c:2387
#4 0x0000555555a6f8e7 in migration_channel_connect (s=s@entry=0x5555569925a0, ioc=ioc@entry=0x555557e1de10, hostname=hostname@entry=0x0, error=<optimized out>) at migration/channel.c:83
#5 0x0000555555a6f2d6 in migration_tls_outgoing_handshake (task=<optimized out>, opaque=0x5555569925a0) at migration/tls.c:124
#6 0x0000555555b6d2b2 in qio_task_complete (task=task@entry=0x5555576ae620) at io/task.c:142
#7 0x0000555555b68c74 in qio_channel_tls_handshake_task (ioc=ioc@entry=0x555557e1de10, task=task@entry=0x5555576ae620) at io/channel-tls.c:171
#8 0x0000555555b6975a in qio_channel_tls_handshake (ioc=ioc@entry=0x555557e1de10, func=func@entry=0x555555a6f250 <migration_tls_outgoing_handshake>, opaque=opaque@entry=0x5555569925a0, destroy=destroy@entry=0x0) at io/channel-tls.c:215
#9 0x0000555555a6f6ac in migration_tls_channel_connect (s=s@entry=0x5555569925a0, ioc=ioc@entry=0x555556a01000, hostname=hostname@entry=0x555556bc0c30 "localhost", errp=errp@entry=0x7fffffffd778) at migration/tls.c:159
#10 0x0000555555a6f967 in migration_channel_connect (s=0x5555569925a0, ioc=ioc@entry=0x555556a01000, hostname=0x555556bc0c30 "localhost", error=<optimized out>) at migration/channel.c:73
#11 0x0000555555a6e5f4 in socket_outgoing_migration (task=<optimized out>, opaque=0x555556cc7de0) at migration/socket.c:85
#12 0x0000555555b6d2b2 in qio_task_complete (task=task@entry=0x5555576ce3b0) at io/task.c:142
#13 0x0000555555b6d3a2 in gio_task_thread_result (opaque=0x5555578eff50) at io/task.c:88
#14 0x00007ffff5da1577 in g_idle_dispatch () at /lib64/libglib-2.0.so.0
#15 0x00007ffff5da4b77 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#16 0x0000555555baf677 in glib_pollfds_poll () at util/main-loop.c:214
#17 0x0000555555baf677 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
#18 0x0000555555baf677 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:515
#19 0x00005555557b9157 in main_loop () at vl.c:1935
#20 0x00005555557b9157 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4767
The second error message is printed in this context:
#0 0x0000555555bb6c20 in error_report (fmt=fmt@entry=0x555555d0d77e "%s") at util/qemu-error.c:273
#1 0x0000555555904035 in hmp_migrate_status_cb (opaque=0x555556bc0ba0) at hmp.c:1909
#2 0x0000555555baefac in timerlist_run_timers (timer_list=0x555556a0d760) at util/qemu-timer.c:536
#3 0x0000555555baf1b7 in qemu_clock_run_timers (type=QEMU_CLOCK_REALTIME) at util/qemu-timer.c:547
#4 0x0000555555baf1b7 in qemu_clock_run_all_timers () at util/qemu-timer.c:662
#5 0x0000555555baf69a in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:521
#6 0x00005555557b9157 in main_loop () at vl.c:1935
#7 0x00005555557b9157 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4767
The second stack trace is the error reporting context that I added originally
in
commit d59ce6f34434bf47a9b26138c908650bf9a24be1
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Wed Apr 27 11:05:00 2016 +0100
migration: add reporting of errors for outgoing migration
So the first stack trace is the new duplicate.
Which error reporting context is "better" though, I don't know ?
My patch was based on the view that, although alot of code uses error_report,
long term all migration would eventually need to be able to filter an
'Error *errp' back up the stack, so that we can pass it back to QMP / HMP via
'info migrate' / query-migrate. So I decided to leave the error_report_err
call to the hmp.c code, as long term that's the only place that would need
to print to the console.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
next prev parent reply other threads:[~2018-05-01 10:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-30 18:59 [Qemu-devel] [PATCH] Migration+TLS: Fix crash due to double cleanup Dr. David Alan Gilbert (git)
2018-05-01 10:00 ` Daniel P. Berrangé
2018-05-01 10:11 ` Daniel P. Berrangé [this message]
2018-05-01 10:57 ` Dr. David Alan Gilbert
2018-05-01 11:21 ` Daniel P. Berrangé
2018-05-01 12:51 ` Daniel P. Berrangé
2018-05-02 12:03 ` Peter Krempa
2018-05-02 12:07 ` Daniel P. Berrangé
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180501101127.GK5708@redhat.com \
--to=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=peterx@redhat.com \
--cc=pkrempa@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).