qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] colo-compare: fix three bugs
@ 2017-04-20  7:46 zhanghailiang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread zhanghailiang
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: zhanghailiang @ 2017-04-20  7:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang, zhanghailiang

Hi, 

This series fixes three bugs found in our test, please review.

Thanks.

zhanghailiang (3):
  colo-compare: serialize compare thread's initialization with main
    thread
  colo-compare: Check main_loop value before call g_main_loop_quit
  colo-compare: fix a memory leak

 net/colo-compare.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-20  7:46 [Qemu-devel] [PATCH 0/3] colo-compare: fix three bugs zhanghailiang
@ 2017-04-20  7:46 ` zhanghailiang
  2017-04-24  4:10   ` Jason Wang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit zhanghailiang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 3/3] colo-compare: fix a memory leak zhanghailiang
  2 siblings, 1 reply; 14+ messages in thread
From: zhanghailiang @ 2017-04-20  7:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang, zhanghailiang

We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
to detach watched fd from default main context, so it has chance to
handle the same watched fd with main thread concurrently, which will
trigger an error report:
"qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed."

Fix it by serializing compare thread's initialization with main thread.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 54e6d40..a6bf419 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,7 @@ typedef struct CompareState {
     GHashTable *connection_track_table;
     /* compare thread, a thread for each NIC */
     QemuThread thread;
+    QemuSemaphore thread_ready;
 
     GMainContext *worker_context;
     GMainLoop *compare_loop;
@@ -557,6 +558,8 @@ static void *colo_compare_thread(void *opaque)
                           (GSourceFunc)check_old_packet_regular, s, NULL);
     g_source_attach(timeout_source, s->worker_context);
 
+    qemu_sem_post(&s->thread_ready);
+
     g_main_loop_run(s->compare_loop);
 
     g_source_unref(timeout_source);
@@ -707,12 +710,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
                                                       connection_key_equal,
                                                       g_free,
                                                       connection_destroy);
+    qemu_sem_init(&s->thread_ready, 0);
 
     sprintf(thread_name, "colo-compare %d", compare_id);
     qemu_thread_create(&s->thread, thread_name,
                        colo_compare_thread, s,
                        QEMU_THREAD_JOINABLE);
     compare_id++;
+    qemu_sem_wait(&s->thread_ready);
+    qemu_sem_destroy(&s->thread_ready);
 
     return;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit
  2017-04-20  7:46 [Qemu-devel] [PATCH 0/3] colo-compare: fix three bugs zhanghailiang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread zhanghailiang
@ 2017-04-20  7:46 ` zhanghailiang
  2017-04-24  4:13   ` Jason Wang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 3/3] colo-compare: fix a memory leak zhanghailiang
  2 siblings, 1 reply; 14+ messages in thread
From: zhanghailiang @ 2017-04-20  7:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang, zhanghailiang

If some errors happen before main_loop is initialized in colo
compare thread, qemu will go into finalizing process where
we call g_main_loop_quit(s->main_loop), if main_loop is NULL, there
will be an error report:
"(process:14861): GLib-CRITICAL **: g_main_loop_quit: assertion 'loop != NULL' failed".

We need to check if main_loop is NULL or not before call g_main_loop_quit().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a6bf419..d6a5e4c 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -770,7 +770,9 @@ static void colo_compare_finalize(Object *obj)
                              s->worker_context, true);
     qemu_chr_fe_deinit(&s->chr_out);
 
-    g_main_loop_quit(s->compare_loop);
+    if (s->compare_loop) {
+        g_main_loop_quit(s->compare_loop);
+    }
     qemu_thread_join(&s->thread);
 
     /* Release all unhandled packets after compare thead exited */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 3/3] colo-compare: fix a memory leak
  2017-04-20  7:46 [Qemu-devel] [PATCH 0/3] colo-compare: fix three bugs zhanghailiang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread zhanghailiang
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit zhanghailiang
@ 2017-04-20  7:46 ` zhanghailiang
  2 siblings, 0 replies; 14+ messages in thread
From: zhanghailiang @ 2017-04-20  7:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang, zhanghailiang

g_timeout_source_new() will initialize GSource's reference count to 1,
and g_source_attach() will increase the count by 1, so it will not be
enough to call just g_source_unref() which only reduce the value by 1.
It will lead to memory leak.

We need to call g_source_destroy() before g_source_unref().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index d6a5e4c..97bf0e5 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -562,7 +562,9 @@ static void *colo_compare_thread(void *opaque)
 
     g_main_loop_run(s->compare_loop);
 
+    g_source_destroy(timeout_source);
     g_source_unref(timeout_source);
+
     g_main_loop_unref(s->compare_loop);
     g_main_context_unref(s->worker_context);
     return NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread zhanghailiang
@ 2017-04-24  4:10   ` Jason Wang
  2017-04-24  6:03     ` Hailiang Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Wang @ 2017-04-24  4:10 UTC (permalink / raw)
  To: zhanghailiang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang



On 2017年04月20日 15:46, zhanghailiang wrote:
> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
> to detach watched fd from default main context, so it has chance to
> handle the same watched fd with main thread concurrently, which will
> trigger an error report:
> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed."

Anyway to prevent fd from being handled by main thread before creating 
colo thread? Using semaphore seems not elegant.

Thanks

>
> Fix it by serializing compare thread's initialization with main thread.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   net/colo-compare.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 54e6d40..a6bf419 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -83,6 +83,7 @@ typedef struct CompareState {
>       GHashTable *connection_track_table;
>       /* compare thread, a thread for each NIC */
>       QemuThread thread;
> +    QemuSemaphore thread_ready;
>   
>       GMainContext *worker_context;
>       GMainLoop *compare_loop;
> @@ -557,6 +558,8 @@ static void *colo_compare_thread(void *opaque)
>                             (GSourceFunc)check_old_packet_regular, s, NULL);
>       g_source_attach(timeout_source, s->worker_context);
>   
> +    qemu_sem_post(&s->thread_ready);
> +
>       g_main_loop_run(s->compare_loop);
>   
>       g_source_unref(timeout_source);
> @@ -707,12 +710,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>                                                         connection_key_equal,
>                                                         g_free,
>                                                         connection_destroy);
> +    qemu_sem_init(&s->thread_ready, 0);
>   
>       sprintf(thread_name, "colo-compare %d", compare_id);
>       qemu_thread_create(&s->thread, thread_name,
>                          colo_compare_thread, s,
>                          QEMU_THREAD_JOINABLE);
>       compare_id++;
> +    qemu_sem_wait(&s->thread_ready);
> +    qemu_sem_destroy(&s->thread_ready);
>   
>       return;
>   }

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit
  2017-04-20  7:46 ` [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit zhanghailiang
@ 2017-04-24  4:13   ` Jason Wang
  2017-04-24  6:06     ` Hailiang Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Wang @ 2017-04-24  4:13 UTC (permalink / raw)
  To: zhanghailiang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang



On 2017年04月20日 15:46, zhanghailiang wrote:
> If some errors happen before main_loop is initialized in colo
> compare thread, qemu will go into finalizing process where
> we call g_main_loop_quit(s->main_loop), if main_loop is NULL, there
> will be an error report:
> "(process:14861): GLib-CRITICAL **: g_main_loop_quit: assertion 'loop != NULL' failed".
>
> We need to check if main_loop is NULL or not before call g_main_loop_quit().

Do we need check and fail early in colo_compare_thread() too?

Thanks

>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   net/colo-compare.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index a6bf419..d6a5e4c 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -770,7 +770,9 @@ static void colo_compare_finalize(Object *obj)
>                                s->worker_context, true);
>       qemu_chr_fe_deinit(&s->chr_out);
>   
> -    g_main_loop_quit(s->compare_loop);
> +    if (s->compare_loop) {
> +        g_main_loop_quit(s->compare_loop);
> +    }
>       qemu_thread_join(&s->thread);
>   
>       /* Release all unhandled packets after compare thead exited */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-24  4:10   ` Jason Wang
@ 2017-04-24  6:03     ` Hailiang Zhang
  2017-04-25  8:41       ` Jason Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Hailiang Zhang @ 2017-04-24  6:03 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, weidong.huang

On 2017/4/24 12:10, Jason Wang wrote:
>
> On 2017年04月20日 15:46, zhanghailiang wrote:
>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
>> to detach watched fd from default main context, so it has chance to
>> handle the same watched fd with main thread concurrently, which will
>> trigger an error report:
>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed."
> Anyway to prevent fd from being handled by main thread before creating
> colo thread? Using semaphore seems not elegant.

So how about calling qemu_mutex_lock_iothread() before qemu_chr_fe_set_handlers() ?

> Thanks
>
>> Fix it by serializing compare thread's initialization with main thread.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>    net/colo-compare.c | 6 ++++++
>>    1 file changed, 6 insertions(+)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 54e6d40..a6bf419 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -83,6 +83,7 @@ typedef struct CompareState {
>>        GHashTable *connection_track_table;
>>        /* compare thread, a thread for each NIC */
>>        QemuThread thread;
>> +    QemuSemaphore thread_ready;
>>    
>>        GMainContext *worker_context;
>>        GMainLoop *compare_loop;
>> @@ -557,6 +558,8 @@ static void *colo_compare_thread(void *opaque)
>>                              (GSourceFunc)check_old_packet_regular, s, NULL);
>>        g_source_attach(timeout_source, s->worker_context);
>>    
>> +    qemu_sem_post(&s->thread_ready);
>> +
>>        g_main_loop_run(s->compare_loop);
>>    
>>        g_source_unref(timeout_source);
>> @@ -707,12 +710,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>>                                                          connection_key_equal,
>>                                                          g_free,
>>                                                          connection_destroy);
>> +    qemu_sem_init(&s->thread_ready, 0);
>>    
>>        sprintf(thread_name, "colo-compare %d", compare_id);
>>        qemu_thread_create(&s->thread, thread_name,
>>                           colo_compare_thread, s,
>>                           QEMU_THREAD_JOINABLE);
>>        compare_id++;
>> +    qemu_sem_wait(&s->thread_ready);
>> +    qemu_sem_destroy(&s->thread_ready);
>>    
>>        return;
>>    }
>
> .
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit
  2017-04-24  4:13   ` Jason Wang
@ 2017-04-24  6:06     ` Hailiang Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Hailiang Zhang @ 2017-04-24  6:06 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, weidong.huang

On 2017/4/24 12:13, Jason Wang wrote:
>
> On 2017年04月20日 15:46, zhanghailiang wrote:
>> If some errors happen before main_loop is initialized in colo
>> compare thread, qemu will go into finalizing process where
>> we call g_main_loop_quit(s->main_loop), if main_loop is NULL, there
>> will be an error report:
>> "(process:14861): GLib-CRITICAL **: g_main_loop_quit: assertion 'loop != NULL' failed".
>>
>> We need to check if main_loop is NULL or not before call g_main_loop_quit().
> Do we need check and fail early in colo_compare_thread() too?

Yes, we need to check there too, will add the check in next version, thanks.

> Thanks
>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>    net/colo-compare.c | 4 +++-
>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index a6bf419..d6a5e4c 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -770,7 +770,9 @@ static void colo_compare_finalize(Object *obj)
>>                                 s->worker_context, true);
>>        qemu_chr_fe_deinit(&s->chr_out);
>>    
>> -    g_main_loop_quit(s->compare_loop);
>> +    if (s->compare_loop) {
>> +        g_main_loop_quit(s->compare_loop);
>> +    }
>>        qemu_thread_join(&s->thread);
>>    
>>        /* Release all unhandled packets after compare thead exited */
>
> .
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-24  6:03     ` Hailiang Zhang
@ 2017-04-25  8:41       ` Jason Wang
  2017-04-25  9:59         ` Hailiang Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Wang @ 2017-04-25  8:41 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, weidong.huang



On 2017年04月24日 14:03, Hailiang Zhang wrote:
> On 2017/4/24 12:10, Jason Wang wrote:
>>
>> On 2017年04月20日 15:46, zhanghailiang wrote:
>>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
>>> to detach watched fd from default main context, so it has chance to
>>> handle the same watched fd with main thread concurrently, which will
>>> trigger an error report:
>>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src == 
>>> ((void *)0)' failed."
>> Anyway to prevent fd from being handled by main thread before creating
>> colo thread? Using semaphore seems not elegant.
>
> So how about calling qemu_mutex_lock_iothread() before 
> qemu_chr_fe_set_handlers() ?

Looks better, but I needs more information e.g how main thread can touch it?

Thanks

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-25  8:41       ` Jason Wang
@ 2017-04-25  9:59         ` Hailiang Zhang
  2017-04-25 11:33           ` Jason Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Hailiang Zhang @ 2017-04-25  9:59 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, weidong.huang

On 2017/4/25 16:41, Jason Wang wrote:
>
> On 2017年04月24日 14:03, Hailiang Zhang wrote:
>> On 2017/4/24 12:10, Jason Wang wrote:
>>> On 2017年04月20日 15:46, zhanghailiang wrote:
>>>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
>>>> to detach watched fd from default main context, so it has chance to
>>>> handle the same watched fd with main thread concurrently, which will
>>>> trigger an error report:
>>>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src ==
>>>> ((void *)0)' failed."
>>> Anyway to prevent fd from being handled by main thread before creating
>>> colo thread? Using semaphore seems not elegant.
>> So how about calling qemu_mutex_lock_iothread() before
>> qemu_chr_fe_set_handlers() ?
> Looks better, but I needs more information e.g how main thread can touch it?

Hmm, this happened quite occasionally, and we didn't catch the first place (backtrace)
of removing fd from been watched, but  from the codes logic, we found there should
be such possible cases:
tcp_chr_write (Or tcp_chr_read/tcp_chr_sync_read/chr_disconnect)
  ->tcp_chr_disconnect (Or char_socket_finalize)
     ->tcp_chr_free_connection
       -> remove_fd_in_watch(chr);

Anyway, it needs the protection from been freed twice.

Thanks,
Hailiang
> Thanks
>
> .
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-25  9:59         ` Hailiang Zhang
@ 2017-04-25 11:33           ` Jason Wang
  2017-04-26  7:51             ` Hailiang Zhang
  2017-05-04  2:51             ` Hailiang Zhang
  0 siblings, 2 replies; 14+ messages in thread
From: Jason Wang @ 2017-04-25 11:33 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, weidong.huang



On 2017年04月25日 17:59, Hailiang Zhang wrote:
> On 2017/4/25 16:41, Jason Wang wrote:
>>
>> On 2017年04月24日 14:03, Hailiang Zhang wrote:
>>> On 2017/4/24 12:10, Jason Wang wrote:
>>>> On 2017年04月20日 15:46, zhanghailiang wrote:
>>>>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
>>>>> to detach watched fd from default main context, so it has chance to
>>>>> handle the same watched fd with main thread concurrently, which will
>>>>> trigger an error report:
>>>>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src ==
>>>>> ((void *)0)' failed."
>>>> Anyway to prevent fd from being handled by main thread before creating
>>>> colo thread? Using semaphore seems not elegant.
>>> So how about calling qemu_mutex_lock_iothread() before
>>> qemu_chr_fe_set_handlers() ?
>> Looks better, but I needs more information e.g how main thread can 
>> touch it?
>
> Hmm, this happened quite occasionally, and we didn't catch the first 
> place (backtrace)
> of removing fd from been watched, but  from the codes logic, we found 
> there should
> be such possible cases:
> tcp_chr_write (Or tcp_chr_read/tcp_chr_sync_read/chr_disconnect)
>  ->tcp_chr_disconnect (Or char_socket_finalize)
>     ->tcp_chr_free_connection
>       -> remove_fd_in_watch(chr);
>
> Anyway, it needs the protection from been freed twice.
>
> Thanks,
> Hailiang

Still a little bit confused. The question is how could main thread still 
call tcp_chr_write or other in the above case?

Thanks

>> Thanks
>>
>> .
>>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-25 11:33           ` Jason Wang
@ 2017-04-26  7:51             ` Hailiang Zhang
  2017-05-04  2:51             ` Hailiang Zhang
  1 sibling, 0 replies; 14+ messages in thread
From: Hailiang Zhang @ 2017-04-26  7:51 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang

On 2017/4/25 19:33, Jason Wang wrote:
>
> On 2017年04月25日 17:59, Hailiang Zhang wrote:
>> On 2017/4/25 16:41, Jason Wang wrote:
>>> On 2017年04月24日 14:03, Hailiang Zhang wrote:
>>>> On 2017/4/24 12:10, Jason Wang wrote:
>>>>> On 2017年04月20日 15:46, zhanghailiang wrote:
>>>>>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
>>>>>> to detach watched fd from default main context, so it has chance to
>>>>>> handle the same watched fd with main thread concurrently, which will
>>>>>> trigger an error report:
>>>>>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src ==
>>>>>> ((void *)0)' failed."
>>>>> Anyway to prevent fd from being handled by main thread before creating
>>>>> colo thread? Using semaphore seems not elegant.
>>>> So how about calling qemu_mutex_lock_iothread() before
>>>> qemu_chr_fe_set_handlers() ?
>>> Looks better, but I needs more information e.g how main thread can
>>> touch it?
>> Hmm, this happened quite occasionally, and we didn't catch the first
>> place (backtrace)
>> of removing fd from been watched, but  from the codes logic, we found
>> there should
>> be such possible cases:
>> tcp_chr_write (Or tcp_chr_read/tcp_chr_sync_read/chr_disconnect)
>>   ->tcp_chr_disconnect (Or char_socket_finalize)
>>      ->tcp_chr_free_connection
>>        -> remove_fd_in_watch(chr);
>>
>> Anyway, it needs the protection from been freed twice.
>>
>> Thanks,
>> Hailiang
> Still a little bit confused. The question is how could main thread still
> call tcp_chr_write or other in the above case?

The 'char_socekt_finalize' ? Hmm, I'd better to reproduce it again to get the first
time of removing the fd been watched...

> Thanks
>
>>> Thanks
>>>
>>> .
>>>
>>
>
> .
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-04-25 11:33           ` Jason Wang
  2017-04-26  7:51             ` Hailiang Zhang
@ 2017-05-04  2:51             ` Hailiang Zhang
  2017-05-05  3:03               ` Jason Wang
  1 sibling, 1 reply; 14+ messages in thread
From: Hailiang Zhang @ 2017-05-04  2:51 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang

Hi Jason,

On 2017/4/25 19:33, Jason Wang wrote:
>
> On 2017年04月25日 17:59, Hailiang Zhang wrote:
>> On 2017/4/25 16:41, Jason Wang wrote:
>>> On 2017年04月24日 14:03, Hailiang Zhang wrote:
>>>> On 2017/4/24 12:10, Jason Wang wrote:
>>>>> On 2017年04月20日 15:46, zhanghailiang wrote:
>>>>>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is used
>>>>>> to detach watched fd from default main context, so it has chance to
>>>>>> handle the same watched fd with main thread concurrently, which will
>>>>>> trigger an error report:
>>>>>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src ==
>>>>>> ((void *)0)' failed."
>>>>> Anyway to prevent fd from being handled by main thread before creating
>>>>> colo thread? Using semaphore seems not elegant.
>>>> So how about calling qemu_mutex_lock_iothread() before
>>>> qemu_chr_fe_set_handlers() ?
>>> Looks better, but I needs more information e.g how main thread can
>>> touch it?
>> Hmm, this happened quite occasionally, and we didn't catch the first
>> place (backtrace)
>> of removing fd from been watched, but  from the codes logic, we found
>> there should
>> be such possible cases:
>> tcp_chr_write (Or tcp_chr_read/tcp_chr_sync_read/chr_disconnect)
>>   ->tcp_chr_disconnect (Or char_socket_finalize)
>>      ->tcp_chr_free_connection
>>        -> remove_fd_in_watch(chr);
>>
>> Anyway, it needs the protection from been freed twice.
>>
>> Thanks,
>> Hailiang
> Still a little bit confused. The question is how could main thread still
> call tcp_chr_write or other in the above case?

Finally, we reproduced this bug (We use qemu 2.6), and got the follow backtrace of this problem:

(gdb) thread apply all bt

Thread 7 (Thread 0x7f407a1ff700 (LWP 23144)):
#0  0x00007f41037e0db5 in _int_malloc () from /usr/lib64/libc.so.6
#1  0x00007f41037e3b96 in calloc () from /usr/lib64/libc.so.6
#2  0x00007f41041ad4d7 in g_malloc0 () from /usr/lib64/libglib-2.0.so.0
#3  0x00007f41041a5437 in g_source_new () from /usr/lib64/libglib-2.0.so.0
#4  0x00007f410a2cec9c in qio_channel_create_fd_watch (ioc=ioc@entry=0x7f410d6238c0, fd=20, condition=condition@entry=
     (G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL)) at io/channel-watch.c:259
#5  0x00007f410a2ced01 in qio_channel_create_socket_watch (ioc=ioc@entry=0x7f410d6238c0, socket=<optimized out>,
     condition=condition@entry=(G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL)) at io/channel-watch.c:311
#6  0x00007f410a2cbea7 in qio_channel_socket_create_watch (ioc=0x7f410d6238c0, condition=(G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL))
     at io/channel-socket.c:732
#7  0x00007f410a2c94d2 in qio_channel_create_watch (ioc=0x7f410d6238c0, condition=condition@entry=
     (G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL)) at io/channel.c:132
#8  0x00007f410a003cd6 in io_watch_poll_prepare (source=0x7f4070000d00, timeout_=<optimized out>) at qemu-char.c:883
#9  0x00007f41041a72ed in g_main_context_prepare () from /usr/lib64/libglib-2.0.so.0
#10 0x00007f41041a7b7b in g_main_context_iterate.isra.24 () from /usr/lib64/libglib-2.0.so.0
#11 0x00007f41041a7fba in g_main_loop_run () from /usr/lib64/libglib-2.0.so.0
#12 0x00007f410a1e528f in colo_compare_thread (opaque=0x7f410d7d6800) at net/colo-compare.c:651
#13 0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
#14 0x00007f410385971d in clone () from /usr/lib64/libc.so.6

Thread 6 (Thread 0x7f40799fe700 (LWP 19368)):
#0  0x00007f4103b2f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007f410a3138d1 in qemu_cond_wait (cond=cond@entry=0x7f410cce44c0, mutex=mutex@entry=0x7f410cce44f0)
     at util/qemu-thread-posix.c:132
---Type <return> to continue, or q <return> to quit---
#2  0x00007f410a22b1a3 in vnc_worker_thread_loop (queue=queue@entry=0x7f410cce44c0) at ui/vnc-jobs.c:228
#3  0x00007f410a22b810 in vnc_worker_thread (arg=0x7f410cce44c0) at ui/vnc-jobs.c:335
#4  0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007f410385971d in clone () from /usr/lib64/libc.so.6

Thread 5 (Thread 0x7f407abff700 (LWP 19366)):
#0  0x00007f4103b2f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007f410a3138d1 in qemu_cond_wait (cond=cond@entry=0x7f410a9fc368 <mlock_struct+40>,
     mutex=mutex@entry=0x7f410a9fc340 <mlock_struct>) at util/qemu-thread-posix.c:132
#2  0x00007f4109e99060 in mlock_wait () at /work/zhanghailiang/qemu-kvm/exec.c:392
#3  mlock_thread (opaque=<optimized out>) at /work/zhanghailiang/qemu-kvm/exec.c:407
#4  0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007f410385971d in clone () from /usr/lib64/libc.so.6

Thread 4 (Thread 0x7f40fcd83700 (LWP 19364)):
#0  0x00007f4103b2f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007f410a3138d1 in qemu_cond_wait (cond=<optimized out>, mutex=mutex@entry=0x7f410aa66ca0 <qemu_global_mutex>)
     at util/qemu-thread-posix.c:132
#2  0x00007f4109ed5b3b in qemu_kvm_wait_io_event (cpu=0x7f410c2bda30) at /work/zhanghailiang/qemu-kvm/cpus.c:1087
#3  qemu_kvm_cpu_thread_fn (arg=0x7f410c2bda30) at /work/zhanghailiang/qemu-kvm/cpus.c:1126
#4  0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007f410385971d in clone () from /usr/lib64/libc.so.6

Thread 3 (Thread 0x7f40fd584700 (LWP 19363)):
#0  0x00007f4103b2f6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007f410a3138d1 in qemu_cond_wait (cond=<optimized out>, mutex=mutex@entry=0x7f410aa66ca0 <qemu_global_mutex>)
---Type <return> to continue, or q <return> to quit---
     at util/qemu-thread-posix.c:132
#2  0x00007f4109ed5b3b in qemu_kvm_wait_io_event (cpu=0x7f410c24e690) at /work/zhanghailiang/qemu-kvm/cpus.c:1087
#3  qemu_kvm_cpu_thread_fn (arg=0x7f410c24e690) at /work/zhanghailiang/qemu-kvm/cpus.c:1126
#4  0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007f410385971d in clone () from /usr/lib64/libc.so.6

Thread 2 (Thread 0x7f40fde76700 (LWP 19311)):
#0  0x00007f4103853e99 in syscall () from /usr/lib64/libc.so.6
#1  0x00007f410a313d52 in futex_wait (val=4294967295, ev=0x7f410afe4fc8 <rcu_call_ready_event>) at util/qemu-thread-posix.c:301
#2  qemu_event_wait (ev=ev@entry=0x7f410afe4fc8 <rcu_call_ready_event>) at util/qemu-thread-posix.c:408
#3  0x00007f410a329846 in call_rcu_thread (opaque=<optimized out>) at util/rcu.c:250
#4  0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007f410385971d in clone () from /usr/lib64/libc.so.6

Thread 1 (Thread 0x7f4109b73bc0 (LWP 19310)):
#0  0x00007f41037985d7 in raise () from /usr/lib64/libc.so.6
#1  0x00007f4103799cc8 in abort () from /usr/lib64/libc.so.6
#2  0x00007f4103791546 in __assert_fail_base () from /usr/lib64/libc.so.6
#3  0x00007f41037915f2 in __assert_fail () from /usr/lib64/libc.so.6
#4  0x00007f410a003778 in io_watch_poll_finalize (source=<optimized out>) at qemu-char.c:919
#5  0x00007f41041a4de2 in g_source_unref_internal () from /usr/lib64/libglib-2.0.so.0
#6  0x00007f41041a4fee in g_source_iter_next () from /usr/lib64/libglib-2.0.so.0
#7  0x00007f41041a728b in g_main_context_prepare () from /usr/lib64/libglib-2.0.so.0
#8  0x00007f410a23fdca in glib_pollfds_fill (cur_timeout=<synthetic pointer>) at main-loop.c:196
#9  os_host_main_loop_wait (timeout=2491118968) at main-loop.c:235
#10 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:517
---Type <return> to continue, or q <return> to quit---
#11 0x00007f4109e890fd in main_loop () at vl.c:2202
#12 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:5124

In this case,  main thread and colo compare thread process the same GSource concurrently.
(I still didn't figure out why g_main_context_prepare call finalize callback here).

Thanks,


> Thanks
>
>>> Thanks
>>>
>>> .
>>>
>>
>
> .
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread
  2017-05-04  2:51             ` Hailiang Zhang
@ 2017-05-05  3:03               ` Jason Wang
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Wang @ 2017-05-05  3:03 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: qemu-devel, weidong.huang



On 2017年05月04日 10:51, Hailiang Zhang wrote:
> Hi Jason,
>
> On 2017/4/25 19:33, Jason Wang wrote:
>>
>> On 2017年04月25日 17:59, Hailiang Zhang wrote:
>>> On 2017/4/25 16:41, Jason Wang wrote:
>>>> On 2017年04月24日 14:03, Hailiang Zhang wrote:
>>>>> On 2017/4/24 12:10, Jason Wang wrote:
>>>>>> On 2017年04月20日 15:46, zhanghailiang wrote:
>>>>>>> We call qemu_chr_fe_set_handlers() in colo-compare thread, it is 
>>>>>>> used
>>>>>>> to detach watched fd from default main context, so it has chance to
>>>>>>> handle the same watched fd with main thread concurrently, which 
>>>>>>> will
>>>>>>> trigger an error report:
>>>>>>> "qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src ==
>>>>>>> ((void *)0)' failed."
>>>>>> Anyway to prevent fd from being handled by main thread before 
>>>>>> creating
>>>>>> colo thread? Using semaphore seems not elegant.
>>>>> So how about calling qemu_mutex_lock_iothread() before
>>>>> qemu_chr_fe_set_handlers() ?
>>>> Looks better, but I needs more information e.g how main thread can
>>>> touch it?
>>> Hmm, this happened quite occasionally, and we didn't catch the first
>>> place (backtrace)
>>> of removing fd from been watched, but  from the codes logic, we found
>>> there should
>>> be such possible cases:
>>> tcp_chr_write (Or tcp_chr_read/tcp_chr_sync_read/chr_disconnect)
>>>   ->tcp_chr_disconnect (Or char_socket_finalize)
>>>      ->tcp_chr_free_connection
>>>        -> remove_fd_in_watch(chr);
>>>
>>> Anyway, it needs the protection from been freed twice.
>>>
>>> Thanks,
>>> Hailiang
>> Still a little bit confused. The question is how could main thread still
>> call tcp_chr_write or other in the above case?
>
> Finally, we reproduced this bug (We use qemu 2.6), and got the follow 
> backtrace of this problem:
>
> (gdb) thread apply all bt
>
> Thread 7 (Thread 0x7f407a1ff700 (LWP 23144)):
> #0  0x00007f41037e0db5 in _int_malloc () from /usr/lib64/libc.so.6
> #1  0x00007f41037e3b96 in calloc () from /usr/lib64/libc.so.6
> #2  0x00007f41041ad4d7 in g_malloc0 () from /usr/lib64/libglib-2.0.so.0
> #3  0x00007f41041a5437 in g_source_new () from 
> /usr/lib64/libglib-2.0.so.0
> #4  0x00007f410a2cec9c in qio_channel_create_fd_watch 
> (ioc=ioc@entry=0x7f410d6238c0, fd=20, condition=condition@entry=
>     (G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL)) at 
> io/channel-watch.c:259
> #5  0x00007f410a2ced01 in qio_channel_create_socket_watch 
> (ioc=ioc@entry=0x7f410d6238c0, socket=<optimized out>,
>     condition=condition@entry=(G_IO_IN | G_IO_ERR | G_IO_HUP | 
> G_IO_NVAL)) at io/channel-watch.c:311
> #6  0x00007f410a2cbea7 in qio_channel_socket_create_watch 
> (ioc=0x7f410d6238c0, condition=(G_IO_IN | G_IO_ERR | G_IO_HUP | 
> G_IO_NVAL))
>     at io/channel-socket.c:732
> #7  0x00007f410a2c94d2 in qio_channel_create_watch 
> (ioc=0x7f410d6238c0, condition=condition@entry=
>     (G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL)) at io/channel.c:132
> #8  0x00007f410a003cd6 in io_watch_poll_prepare 
> (source=0x7f4070000d00, timeout_=<optimized out>) at qemu-char.c:883
> #9  0x00007f41041a72ed in g_main_context_prepare () from 
> /usr/lib64/libglib-2.0.so.0
> #10 0x00007f41041a7b7b in g_main_context_iterate.isra.24 () from 
> /usr/lib64/libglib-2.0.so.0
> #11 0x00007f41041a7fba in g_main_loop_run () from 
> /usr/lib64/libglib-2.0.so.0
> #12 0x00007f410a1e528f in colo_compare_thread (opaque=0x7f410d7d6800) 
> at net/colo-compare.c:651
> #13 0x00007f4103b2bdc5 in start_thread () from /usr/lib64/libpthread.so.0
> #14 0x00007f410385971d in clone () from /usr/lib64/libc.so.6 

It looks like we use main context which is wrong, maybe you can track 
io_add_watch_poll() and its caller get the reason.

Thanks

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-05-05  3:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-20  7:46 [Qemu-devel] [PATCH 0/3] colo-compare: fix three bugs zhanghailiang
2017-04-20  7:46 ` [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread zhanghailiang
2017-04-24  4:10   ` Jason Wang
2017-04-24  6:03     ` Hailiang Zhang
2017-04-25  8:41       ` Jason Wang
2017-04-25  9:59         ` Hailiang Zhang
2017-04-25 11:33           ` Jason Wang
2017-04-26  7:51             ` Hailiang Zhang
2017-05-04  2:51             ` Hailiang Zhang
2017-05-05  3:03               ` Jason Wang
2017-04-20  7:46 ` [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit zhanghailiang
2017-04-24  4:13   ` Jason Wang
2017-04-24  6:06     ` Hailiang Zhang
2017-04-20  7:46 ` [Qemu-devel] [PATCH 3/3] colo-compare: fix a memory leak zhanghailiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).