* [Qemu-devel] colo-compare: segfault and assert on colo_compare_finalize
@ 2017-08-07 16:39 Eduardo Otubo
From: Eduardo Otubo @ 2017-08-07 16:39 UTC
  To: qemu-devel
  Cc: zhang.zhanghailiang, wency, zhangchen.fnst, wang.guang55, wang.yong155

(please ignore my last email, looks like mutt wants to play games lately)

Hi all,

I have found a problem in colo-compare that leads to a segmentation fault
when calling qemu like this:

 $ qemu-system-x86_64 -S -machine pc -object colo-compare,id=test-object

First I got a failed assertion:

 (qemu-system-x86_64:7887): GLib-CRITICAL **: g_main_loop_quit: assertion 'loop != NULL' failed

From this it looks like s->compare_loop is NULL in the function
colo_compare_finalize(). I added a NULL check there and the assertion
went away. But then there's the segfault:

 Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
 0x00007ffff333f79e in pthread_join () from /lib64/libpthread.so.0
 (gdb) bt
 #0  0x00007ffff333f79e in pthread_join () at /lib64/libpthread.so.0
 #1  0x0000555555c379d2 in qemu_thread_join (thread=0x7ffff7ff5160) at util/qemu-thread-posix.c:547
 #2  0x0000555555adfc1a in colo_compare_finalize (obj=0x7ffff7fd3010) at net/colo-compare.c:867
 #3  0x0000555555b2cd87 in object_deinit (obj=0x7ffff7fd3010, type=0x5555567432e0) at qom/object.c:453
 #4  0x0000555555b2cdf9 in object_finalize (data=0x7ffff7fd3010) at qom/object.c:467
 #5  0x0000555555b2dd80 in object_unref (obj=0x7ffff7fd3010) at qom/object.c:902
 #6  0x0000555555b319a5 in user_creatable_add_type (type=0x5555567499a0 "colo-compare", id=0x555556749960 "test-object", qdict=0x555556835750, v=0x55555681a3f0, errp=0x7fffffffde58) at qom/object_interfaces.c:105
 #7  0x0000555555b31b02 in user_creatable_add_opts (opts=0x555556749910, errp=0x7fffffffde58) at qom/object_interfaces.c:135
 #8  0x0000555555b31bfd in user_creatable_add_opts_foreach (opaque=0x5555558e9c39 <object_create_delayed>, opts=0x555556749910, errp=0x0) at qom/object_interfaces.c:159
 #9  0x0000555555c4aecf in qemu_opts_foreach (list=0x555556157ac0 <qemu_object_opts>, func=0x555555b31b6f <user_creatable_add_opts_foreach>, opaque=0x5555558e9c39 <object_create_delayed>, errp=0x0) at util/qemu-option.c:1104
 #10 0x00005555558edb75 in main (argc=6, argv=0x7fffffffe2d8, envp=0x7fffffffe310) at vl.c:4520

At this point '&s->thread' is '0'. Are this segfault and the assertion
mentioned above triggered because I'm creating a colo-compare object
without any other parameter? If so, a simple workaround and error check
should do it. Otherwise I'll debug a little more.

Best regards,

--
Eduardo Otubo
Senior Software Engineer // Red Hat Hyper-V Virtualization, Berlin, DE
IRC: otubo@{RedHat, OFTC, Freenode}

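The backtrace above shows colo_compare_finalize() being reached through object_unref() in user_creatable_add_type(), which suggests (as an assumption, not a confirmed diagnosis) that the object is torn down before its thread and main loop were ever created. A minimal sketch of the kind of guard that would cover that case follows. It is not QEMU code: CompareState, compare_start() and compare_finalize() are hypothetical names, and plain pthreads stand in for QEMU's thread wrappers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the object state; not the QEMU struct. */
typedef struct CompareState {
    pthread_t thread;      /* valid only when thread_started is true */
    bool thread_started;   /* set once pthread_create() has succeeded */
    int joined;            /* counts joins, for demonstration only */
} CompareState;

static void *compare_worker(void *opaque)
{
    (void)opaque;
    return NULL;           /* a real worker would run its main loop here */
}

/* Plays the role of the "complete" step; returns 0 on success. */
static int compare_start(CompareState *s)
{
    if (pthread_create(&s->thread, NULL, compare_worker, s) != 0) {
        return -1;
    }
    s->thread_started = true;
    return 0;
}

/* Plays the role of the finalizer; safe on a never-started object. */
static void compare_finalize(CompareState *s)
{
    if (!s->thread_started) {
        return;            /* nothing was created, so nothing to join */
    }
    pthread_join(s->thread, NULL);
    s->thread_started = false;
    s->joined++;
}
```

Calling compare_finalize() on a zero-initialized CompareState returns without touching the thread handle, which is the situation the command line above appears to create. Tracking the started state explicitly avoids relying on the value of an uninitialized thread handle.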
* Re: [Qemu-devel] colo-compare: segfault and assert on colo_compare_finalize
@ 2017-08-08 11:30 Hailiang Zhang
From: Hailiang Zhang @ 2017-08-08 11:30 UTC
  To: otubo, qemu-devel
  Cc: wency, zhangchen.fnst, wang.guang55, wang.yong155

Hi,

Did you test this branch?
https://github.com/coloft/qemu/tree/colo-for-qemu-2.10-2017-4-22

This seems to be a known problem. I'm not quite sure, but it may be
fixed by this patch:

b19456dd0ea4eb418ad093f092adbb882be13054
char: Fix removing wrong GSource that be found by fd_in_tag

    We use fd_in_tag to find a GSource, fd_in_tag is return value of
    g_source_attach(GSource *source, GMainContext *context), the return
    value is unique only in the same context, so we may get the same
    values with different 'context' parameters.

    It is no problem to find the right fd_in_tag by using
    g_main_context_find_source_by_id(GMainContext *context, guint source_id)
    while there is only one default main context.

    But colo-compare tries to create/use its own context, and if we pass
    wrong 'context' parameter with right fd_in_tag, we will find a wrong
    GSource to handle. We tried to fix the related codes in commit
    b43decb015a6efeb9e3cdbdb80f6547ad7248a4c, but it didn't fix the bug
    completely, because we still have some codes didn't pass *right*
    context parameter for remove_fd_in_watch().

    Let's fix it by record the GSource directly instead of fd_in_tag.

    Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
    Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    Message-Id: <1492564532-91680-1-git-send-email-zhang.zhanghailiang@huawei.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Actually, we have already rewritten this part, so please follow the later series.

Thanks,
Hailiang

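The commit message quoted in the reply above says that the id returned by g_source_attach() is unique only within one GMainContext. The following toy model illustrates that pitfall in plain C without GLib. ToyContext, ToySource, toy_attach() and toy_find_by_id() are hypothetical stand-ins that only mimic per-context id counters; they are not GLib API and do not reproduce GLib's actual id allocation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SOURCES 8

/* Hypothetical stand-in for an event source. */
typedef struct ToySource {
    const char *name;
} ToySource;

/* Each context hands out its own ids, starting from 1. */
typedef struct ToyContext {
    ToySource *sources[TOY_MAX_SOURCES];
    unsigned int next_id;
} ToyContext;

/* Attach a source; the returned id is unique only within ctx (0 on failure). */
static unsigned int toy_attach(ToyContext *ctx, ToySource *src)
{
    if (ctx->next_id >= TOY_MAX_SOURCES) {
        return 0;
    }
    unsigned int id = ++ctx->next_id;
    ctx->sources[id - 1] = src;
    return id;
}

/* Look up a source by id in the given context; NULL if the id is unknown. */
static ToySource *toy_find_by_id(ToyContext *ctx, unsigned int id)
{
    if (id == 0 || id > ctx->next_id) {
        return NULL;
    }
    return ctx->sources[id - 1];
}
```

If one source is attached to each of two contexts, both calls return the same id in this model, so a lookup with the right id in the wrong context silently returns the other context's source. Keeping the source pointer itself, as the commit message proposes, removes the need for a lookup that depends on passing the right context.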
* Re: [Qemu-devel] colo-compare: segfault and assert on colo_compare_finalize
@ 2017-08-08 14:29 Eduardo Otubo
From: Eduardo Otubo @ 2017-08-08 14:29 UTC
  To: Hailiang Zhang, qemu-devel
  Cc: wency, zhangchen.fnst, wang.guang55, wang.yong155

On 08/08/2017 01:30 PM, Hailiang Zhang wrote:
> Hi,
>
> Did you test this branch
> https://github.com/coloft/qemu/tree/colo-for-qemu-2.10-2017-4-22 ?
>
> This seems to be an already known problem, I'm not quite sure, it may be
> fixed by this patch

It's not :(

Using your branch I don't see the assert() error anymore, but the
segfault remains and apparently in the same place:

 Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
 0x00007ffff30ec79e in pthread_join () from /lib64/libpthread.so.0
 (gdb) bt
 #0  0x00007ffff30ec79e in pthread_join () at /lib64/libpthread.so.0
 #1  0x0000555555c0b807 in qemu_thread_join (thread=0x7ffff7ff5130) at util/qemu-thread-posix.c:504
 #2  0x0000555555ac96c2 in colo_compare_finalize (obj=0x7ffff7fd3010) at net/colo-compare.c:873
 #3  0x0000555555b12f90 in object_deinit (obj=0x7ffff7fd3010, type=0x5555566e4c70) at qom/object.c:454
 #4  0x0000555555b13002 in object_finalize (data=0x7ffff7fd3010) at qom/object.c:468
 #5  0x0000555555b13f92 in object_unref (obj=0x7ffff7fd3010) at qom/object.c:903
 #6  0x0000555555b17a3e in user_creatable_add_type (type=0x5555566eb270 "colo-compare", id=0x5555566eb230 "test-object", qdict=0x555556781bc0, v=0x5555567798f0, errp=0x7fffffffde58) at qom/object_interfaces.c:104
 #7  0x0000555555b17b9b in user_creatable_add_opts (opts=0x5555566eb1e0, errp=0x7fffffffde58) at qom/object_interfaces.c:134
 #8  0x0000555555b17c96 in user_creatable_add_opts_foreach (opaque=0x5555558e3562 <object_create_delayed>, opts=0x5555566eb1e0, errp=0x0) at qom/object_interfaces.c:158
 #9  0x0000555555c1e634 in qemu_opts_foreach (list=0x5555561077a0 <qemu_object_opts>, func=0x555555b17c08 <user_creatable_add_opts_foreach>, opaque=0x5555558e3562 <object_create_delayed>, errp=0x0) at util/qemu-option.c:1114
 #10 0x00005555558e7441 in main (argc=6, argv=0x7fffffffe2d8, envp=0x7fffffffe310) at vl.c:4455

So, from what you're saying, this is *not* caused by the lack of
parameter on the command line. Right?

>
> b19456dd0ea4eb418ad093f092adbb882be13054
> char: Fix removing wrong GSource that be found by fd_in_tag
> We use fd_in_tag to find a GSource, fd_in_tag is return value of
> g_source_attach(GSource *source, GMainContext *context), the return
> value is unique only in the same context, so we may get the same values
> with different 'context' parameters. It is no problem to find the right
> fd_in_tag by using g_main_context_find_source_by_id(GMainContext
> *context, guint source_id) while there is only one default main context.
> But colo-compare tries to create/use its own context, and if we pass
> wrong 'context' parameter with right fd_in_tag, we will find a wrong
> GSource to handle. We tried to fix the related codes in commit
> b43decb015a6efeb9e3cdbdb80f6547ad7248a4c, but it didn't fix the bug
> completely, because we still have some codes didn't pass *right* context
> parameter for remove_fd_in_watch(). Let's fix it by record the GSource
> directly instead of fd_in_tag. Signed-off-by: zhanghailiang
> <zhang.zhanghailiang@huawei.com> Reviewed-by: Marc-André Lureau
> <marcandre.lureau@redhat.com> Message-Id:
> <1492564532-91680-1-git-send-email-zhang.zhanghailiang@huawei.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Actually, we have
> already re-writed this part, and please follow the later series.