From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: Jason Wang <jasowang@redhat.com>, zhangchen.fnst@cn.fujitsu.com
Cc: xuquan8@huawei.com, qemu-devel@nongnu.org, lizhijian@cn.fujitsu.com
Subject: Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
Date: Tue, 14 Feb 2017 10:32:21 +0800
Message-ID: <58A26C35.9070309@huawei.com>
In-Reply-To: <2e301620-0362-f43e-9194-e438f2091546@redhat.com>
On 2017/2/7 17:21, Jason Wang wrote:
>
>
> On 2017-02-07 16:19, Hailiang Zhang wrote:
>> On 2017/2/7 15:57, Jason Wang wrote:
>>>
>>>
>>> On 2017-02-07 15:54, Hailiang Zhang wrote:
>>>> Hi Jason,
>>>>
>>>> On 2017/2/6 20:53, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017-02-06 19:11, Hailiang Zhang wrote:
>>>>>> On 2017/2/6 17:35, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017-02-06 16:13, Hailiang Zhang wrote:
>>>>>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2017-01-24 22:05, zhanghailiang wrote:
>>>>>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>>>>>> is used to protect the 'conn_list' queue and its child queues
>>>>>>>>>> which
>>>>>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>>>>>> and confusing
>>>>>>>>>>
>>>>>>>>>> To make it clearer, we rename 'timer_check_lock' to
>>>>>>>>>> 'conn_list_lock'
>>>>>>>>>> which is used to protect 'conn_list' queue, use another
>>>>>>>>>> 'conn_lock'
>>>>>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>>>>>
>>>>>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>>>>>
>>>>>>>>> Instead of sticking to such kind of mutex, I think it's time to
>>>>>>>>> make
>>>>>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>>>>>
>>>>>>>>
>>>>>>>> Er, it seems that, we still need these mutex locks even we make
>>>>>>>> colo
>>>>>>>> timer and colo thread run in the same thread, because we may access
>>>>>>>> the connect/primary/secondary list from colo (migration) thread
>>>>>>>> concurrently.
>>>>>>>
>>>>>>> Just make sure I understand the issue, why need it access the list?
>>>>>>>
>>>>>>>>
>>>>>>>> Besides, it seems to be a little complex to make colo timer run in
>>>>>>>> colo
>>>>>>>> compare thread, and it is not this series' goal.
>>>>>>>
>>>>>>> Seems not by just looking at how it was implemented in main loop,
>>>>>>> but
>>>>>>> maybe I was wrong.
>>>>>>>
>>>>>>>> This series is preparing
>>>>>>>> work for integrating COLO compare with COLO frame and it is
>>>>>>>> prerequisite.
>>>>>>>>
>>>>>>>> So, we may consider implementing it later in another series.
>>>>>>>> Zhang Chen, what's your opinion ?
>>>>>>>
>>>>>>> The problem is this patch make things even worse, it introduces one
>>>>>>> more
>>>>>>> mutex.
>>>>>>>
>>>>>>
>>>>>> Hmm, for most cases, we need to get these two locks at the same time.
>>>>>> We can use one lock to protect conn_list/primary_list/secondary_list,
>>>>>> (We need to get this lock before operate all these lists, as you can
>>>>>> see
>>>>>> in patch 2, while do checkpoint, we may operate these lists in
>>>>>> COLO checkpoint thread concurrently.)
>>>>>>
>>>>>> But for the original codes, we didn't got timer_check_lock in
>>>>>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>>>>>> and didn't got this lock in colo_compare_connection while operate
>>>>>> secondary_list either.
>>>>>>
>>>>>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>>>>>> add the lock where it is need ?
>>>>>
>>>>> I'd like to know if timer were run in colo thread (this looks as
>>>>> simple
>>>>> as a g_timeout_source_new() followed by a g_source_attach()), why
>>>>> do we
>>>>> still need a mutex. And if we need it now but plan to remove it in the
>>>>> future, I'd like to use big lock to simplify the codes.
>>>>>
>>>>
>>>> After investigated your above suggestion, I think it works by using
>>>> g_timeout_source_new() and g_source_attach(), but I'm not sure
>>>> if it is a good idea to use the big lock to protect the possible
>>>> concurrent cases which seem to only happen between COLO migration
>>>> thread and COLO compare thread, any further suggestions ?
>>>
>>> I think I need first understand why migration thread need to access the
>>> list?
>>>
>>
>> Er, sorry to confuse you here, to be exactly, it is COLO checkpoint
>> thread,
>> we reuse the migration thread to realize checkpoint process for COLO,
>> Because qemu enters into COLO state after a complete migration, so we
>> reuse it.
>>
>> While do checkpoint, we need to release all the buffered packets that
>> have not
>> yet been compared, so we need to access the list in COLO checkpoint
>> thread.
>
Hi Jason,
> I think the better way is notify the comparing thread and let it do the
> releasing. You probably need similar mechanism to notify from comparing
> thread to checkpoint thread.
>
It seems that glib has no API for notifying a thread that is running a
coroutine to do some work (an idle source, perhaps?). What about using an
anonymous pipe as the GPollFD to communicate between the COLO compare thread
and the COLO checkpoint thread?
Any ideas?
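
To make the pipe idea concrete, here is a rough sketch of the notification
path. It is only an illustration under assumptions: it uses plain POSIX
pipe()/poll()/pthreads instead of GPollFD and a GMainContext, and every name
in it (compare_ctx, compare_thread, notify_demo, EVENT_CHECKPOINT) is
hypothetical and not taken from colo-compare.c.

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* All names below are hypothetical; nothing here is from colo-compare.c. */
enum { EVENT_CHECKPOINT = 1 };

struct compare_ctx {
    int notify_fd[2];  /* [0]: read end polled by compare thread; [1]: write end */
    int flush_count;   /* number of checkpoint requests the compare thread handled */
};

/* Stand-in for the compare thread: it sleeps in poll() until notified. */
static void *compare_thread(void *opaque)
{
    struct compare_ctx *ctx = opaque;
    struct pollfd pfd = { .fd = ctx->notify_fd[0], .events = POLLIN };

    for (;;) {
        uint8_t event;
        ssize_t n;

        if (poll(&pfd, 1, -1) < 0) {
            if (errno == EINTR) {
                continue;
            }
            break;
        }
        n = read(ctx->notify_fd[0], &event, sizeof(event));
        if (n < 0 && errno == EINTR) {
            continue;
        }
        if (n <= 0) {
            break;  /* 0 means the write end was closed: shut down */
        }
        if (event == EVENT_CHECKPOINT) {
            /* Real code would release the queued packets here, inside the
             * thread that owns the lists, so the lists need no mutex. */
            ctx->flush_count++;
        }
    }
    return NULL;
}

/* Stand-in for the checkpoint thread: send one request, then tear down.
 * Returns the number of requests handled, or -1 on error. */
int notify_demo(void)
{
    struct compare_ctx ctx = { .flush_count = 0 };
    pthread_t tid;
    uint8_t event = EVENT_CHECKPOINT;
    int sent;

    if (pipe(ctx.notify_fd) < 0) {
        return -1;
    }
    if (pthread_create(&tid, NULL, compare_thread, &ctx) != 0) {
        close(ctx.notify_fd[0]);
        close(ctx.notify_fd[1]);
        return -1;
    }
    sent = write(ctx.notify_fd[1], &event, sizeof(event)) == (ssize_t)sizeof(event);
    close(ctx.notify_fd[1]);   /* EOF tells the compare thread to exit */
    pthread_join(tid, NULL);   /* join makes flush_count safe to read */
    close(ctx.notify_fd[0]);
    return sent ? ctx.flush_count : -1;
}
```

In the glib version, I assume the read end would be described by a GPollFD and
added with g_source_add_poll() to a custom GSource attached to the compare
thread's GMainContext; a second pipe in the other direction could carry the
"inconsistent packet" notification back to the checkpoint thread.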
Hailiang
> Thanks
>
>>
>> Thanks.
>
>
> .
>