* [PATCH] clvmd: closedown the cluster after finishing of lvm_thread
From: dongmao zhang @ 2013-11-27 8:56 UTC
To: lvm-devel
If clvmd receives SIGTERM while lvm_thread is processing a remote
request, it frees the cluster resources before lvm_thread has finished
its real work. If the cluster resources are freed before send_message
is called, the remote command hangs forever.

This patch moves the closedown to after the worker thread is joined.
---
daemons/clvmd/clvmd.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
index d57c0fd..b2f7dd5 100644
--- a/daemons/clvmd/clvmd.c
+++ b/daemons/clvmd/clvmd.c
@@ -621,6 +621,8 @@ int main(int argc, char *argv[])
 	if ((errno = pthread_join(lvm_thread, NULL)))
 		log_sys_error("pthread_join", "");
 
+	clops->cluster_closedown();
+
 	close_local_sock(local_sock);
 	destroy_lvm();
 
@@ -979,7 +981,6 @@ static void main_loop(int local_sock, int cmd_timeout)
 	}
 
 closedown:
-	clops->cluster_closedown();
 	if (quit)
 		DEBUGLOG("SIGTERM received\n");
 }
--
1.7.3.4
* [PATCH] clvmd: closedown the cluster after finishing of lvm_thread
From: Zdenek Kabelac @ 2013-11-28 13:57 UTC
To: lvm-devel
On 27.11.2013 09:56, dongmao zhang wrote:
> If clvmd receives SIGTERM while lvm_thread is processing a remote
> request, it frees the cluster resources before lvm_thread has finished
> its real work. If the cluster resources are freed before send_message
> is called, the remote command hangs forever.
>
> This patch moves the closedown to after the worker thread is joined.
> ---
> daemons/clvmd/clvmd.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
> index d57c0fd..b2f7dd5 100644
> --- a/daemons/clvmd/clvmd.c
> +++ b/daemons/clvmd/clvmd.c
> @@ -621,6 +621,8 @@ int main(int argc, char *argv[])
> if ((errno = pthread_join(lvm_thread, NULL)))
> log_sys_error("pthread_join", "");
>
> + clops->cluster_closedown();
> +
> close_local_sock(local_sock);
> destroy_lvm();
>
> @@ -979,7 +981,6 @@ static void main_loop(int local_sock, int cmd_timeout)
> }
>
> closedown:
> - clops->cluster_closedown();
> if (quit)
> DEBUGLOG("SIGTERM received\n");
> }
It's not clear to me how this code move helps anything.
You just moved the call to clops->cluster_closedown() to after joining the thread?
Which code path does this patch actually change?
Zdenek
* [PATCH] clvmd: closedown the cluster after finishing of lvm_thread
From: dongmao zhang @ 2013-11-29 6:06 UTC
To: lvm-devel
On 2013-11-28 21:57, Zdenek Kabelac wrote:
> On 27.11.2013 09:56, dongmao zhang wrote:
>> If clvmd receives SIGTERM while lvm_thread is processing a remote
>> request, it frees the cluster resources before lvm_thread has finished
>> its real work. If the cluster resources are freed before send_message
>> is called, the remote command hangs forever.
>>
>> This patch moves the closedown to after the worker thread is joined.
>> ---
>> daemons/clvmd/clvmd.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
>> index d57c0fd..b2f7dd5 100644
>> --- a/daemons/clvmd/clvmd.c
>> +++ b/daemons/clvmd/clvmd.c
>> @@ -621,6 +621,8 @@ int main(int argc, char *argv[])
>> if ((errno = pthread_join(lvm_thread, NULL)))
>> log_sys_error("pthread_join", "");
>>
>> + clops->cluster_closedown();
>> +
>> close_local_sock(local_sock);
>> destroy_lvm();
>>
>> @@ -979,7 +981,6 @@ static void main_loop(int local_sock, int cmd_timeout)
>> }
>>
>> closedown:
>> - clops->cluster_closedown();
>> if (quit)
>> DEBUGLOG("SIGTERM received\n");
>> }
>
>
> It's not clear to me how this code move helps anything.
>
> You just moved the call to clops->cluster_closedown() to after joining the thread?
>
> Which code path does this patch actually change?
>
> Zdenek
>
>
Hi Zdenek,

Thank you for your reply. The main idea is that lvm_thread_fn uses
cluster resources (for example, cpg_handler in send_message), so we
cannot free the cluster resources until lvm_thread_fn finishes.

The lvm_thread_fn thread runs process_work_item, in which it sends the
reply message (cluster_send_message) back to the remote nodes.
cluster_send_message uses the cluster resources, so we cannot free them
before lvm_thread_fn has really finished. The cluster_closedown call in
the main thread can happen before the lvm_thread_fn thread calls
send_message. If it does, sending the message fails and the remote node
never gets the response, so it has to wait for a timeout before it can
finish.

I hit a bug like this with two nodes sharing a VG resource:
1. NodeA runs 'rcopenais stop'
2. NodeB runs 'vgscan'
Sometimes vgscan hangs for a while, waiting for responses from all
cluster nodes, because clvmd on NodeA cannot send its reply:
cluster_closedown happened before send_message.
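
To make the ordering concrete, here is a minimal standalone sketch. It is not clvmd code: `struct cluster`, `cluster_open`, `cluster_close`, `send_reply`, `worker_fn` and `shutdown_after_join` are hypothetical stand-ins for the cluster connection, cluster_closedown, cluster_send_message and lvm_thread_fn. It only illustrates the idea of joining the worker before releasing what the worker still uses.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cluster connection; not clvmd's real type. */
struct cluster {
	pthread_mutex_t lock;
	bool open;         /* false once the resources have been released */
	int replies_sent;  /* replies that actually went out */
};

void cluster_open(struct cluster *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->open = true;
	c->replies_sent = 0;
}

/* Stand-in for cluster_closedown(): any later send will fail. */
void cluster_close(struct cluster *c)
{
	pthread_mutex_lock(&c->lock);
	c->open = false;
	pthread_mutex_unlock(&c->lock);
}

/* Stand-in for cluster_send_message(): 0 on success, -1 if already closed. */
int send_reply(struct cluster *c)
{
	int ret = -1;

	pthread_mutex_lock(&c->lock);
	if (c->open) {
		c->replies_sent++;
		ret = 0;
	}
	pthread_mutex_unlock(&c->lock);
	return ret;
}

/* Worker: "process" one request, then reply. Returns arg on success, NULL on failure. */
static void *worker_fn(void *arg)
{
	struct cluster *c = arg;

	return send_reply(c) == 0 ? arg : NULL;
}

/*
 * Shut down in the order the patch proposes: join the worker first, release
 * the cluster afterwards. Returns 1 if the reply was sent, 0 if it was lost,
 * -1 if the thread could not be created or joined.
 */
int shutdown_after_join(struct cluster *c)
{
	pthread_t worker;
	void *result = NULL;

	if (pthread_create(&worker, NULL, worker_fn, c))
		return -1;
	if (pthread_join(worker, &result))
		return -1;

	cluster_close(c);  /* safe: the worker can no longer be inside send_reply() */
	return result != NULL;
}
```

After `cluster_open(&c)`, calling `shutdown_after_join(&c)` always lets the worker's reply go out, because the close cannot start until the join returns. A `send_reply(&c)` issued after the close fails with -1, which corresponds to what the worker runs into when closedown happens first. Build with `cc -pthread`.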
Dongmao Zhang