From: dongmao zhang <dmzhang@suse.com>
To: lvm-devel@redhat.com
Subject: [PATCH] clvmd: closedown the cluster after finishing of lvm_thread
Date: Fri, 29 Nov 2013 14:06:39 +0800
Message-ID: <52982EEF.3020104@suse.com>
In-Reply-To: <52974BE7.1080206@redhat.com>
On 2013-11-28 21:57, Zdenek Kabelac wrote:
> On 27.11.2013 09:56, dongmao zhang wrote:
>> when lvm_thread is processing remote request, the clvmd
>> received a SIG_TERM, it will free cluster resource before
>> the realwork of lvm_thread is done. If freeing the cluster
>> resource happens before send_message, it would cause the
>> remote command hangs forever.
>>
>> this patch move closedown after the closing the working thread.
>> ---
>> daemons/clvmd/clvmd.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
>> index d57c0fd..b2f7dd5 100644
>> --- a/daemons/clvmd/clvmd.c
>> +++ b/daemons/clvmd/clvmd.c
>> @@ -621,6 +621,8 @@ int main(int argc, char *argv[])
>> if ((errno = pthread_join(lvm_thread, NULL)))
>> log_sys_error("pthread_join", "");
>>
>> + clops->cluster_closedown();
>> +
>> close_local_sock(local_sock);
>> destroy_lvm();
>>
>> @@ -979,7 +981,6 @@ static void main_loop(int local_sock, int
>> cmd_timeout)
>> }
>>
>> closedown:
>> - clops->cluster_closedown();
>> if (quit)
>> DEBUGLOG("SIGTERM received\n");
>> }
>
>
> It's not clear to me how this code move helps to anything.
>
> You just moved call of clops->cluster_closedown(); after joining thread?
>
> In which code path this patch is changing something ?
>
> Zdenek
>
>
Hi Zdenek,

Thank you for your reply. The main idea is that lvm_thread_fn uses
cluster resources (for example cpg_handler in send_message), so we must
not free them until lvm_thread_fn has finished.

The lvm_thread_fn thread runs process_work_item, which sends the reply
message (cluster_send_message) back to the remote nodes, and
cluster_send_message needs the cluster resources. With the current code,
cluster_closedown in the main thread can run before lvm_thread_fn calls
send_message. When that happens the send fails, the remote node never
gets the response, and it has to wait for a timeout before it can finish.
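
To illustrate the ordering problem, here is a small standalone sketch. It
is not the clvmd code: every fake_* function, the work_finished gate and
run_shutdown are names invented for this example, and cluster_open is only
a stand-in for the real cluster connection. The gate forces the unlucky
timing where the worker is still processing a request when shutdown starts.

```c
#include <assert.h>
#include <pthread.h>

/* All names below are hypothetical stand-ins, not clvmd symbols. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t work_done = PTHREAD_COND_INITIALIZER;
static int cluster_open;   /* 1 while the cluster connection is usable */
static int work_finished;  /* 1 once the pending request is processed */
static int reply_sent;     /* 1 if the worker could send its reply */

/* Stand-in for clops->cluster_closedown(). */
static void fake_cluster_closedown(void)
{
    pthread_mutex_lock(&lock);
    cluster_open = 0;
    pthread_mutex_unlock(&lock);
}

/* Lets the worker finish the request it is currently processing. */
static void fake_finish_work(void)
{
    pthread_mutex_lock(&lock);
    work_finished = 1;
    pthread_cond_signal(&work_done);
    pthread_mutex_unlock(&lock);
}

/* Stand-in for lvm_thread_fn: process one request, then send the reply. */
static void *fake_lvm_thread_fn(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!work_finished)
        pthread_cond_wait(&work_done, &lock);
    if (cluster_open)          /* the send only works on an open cluster */
        reply_sent = 1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if the reply went out, 0 if it was lost. */
int run_shutdown(int join_before_closedown)
{
    pthread_t worker;
    int rc;

    cluster_open = 1;
    work_finished = 0;
    reply_sent = 0;
    rc = pthread_create(&worker, NULL, fake_lvm_thread_fn, NULL);
    assert(rc == 0);
    (void)rc;

    if (join_before_closedown) {
        /* Patched order: wait for the worker, then free the cluster. */
        fake_finish_work();
        pthread_join(worker, NULL);
        fake_cluster_closedown();
    } else {
        /* Old order: the cluster is freed while work is still pending. */
        fake_cluster_closedown();
        fake_finish_work();
        pthread_join(worker, NULL);
    }
    return reply_sent;
}
```

In this sketch run_shutdown(0) loses the reply and run_shutdown(1)
delivers it, which is the difference the patch is meant to make.
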
I hit a bug like this on two nodes sharing a VG:
1. NodeA runs 'rcopenais stop'
2. NodeB runs 'vgscan'
Sometimes vgscan hangs for a while waiting for the responses from all
cluster nodes, because clvmd on NodeA cannot send its reply:
cluster_closedown has already happened before send_message.
Dongmao Zhang