cluster-devel.redhat.com archive mirror
* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd
@ 2017-08-09  5:51 tsutomu.owa
  2017-08-09 16:18 ` David Teigland
  0 siblings, 1 reply; 4+ messages in thread
From: tsutomu.owa @ 2017-08-09  5:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

If dlm_recoverd_stop() is called between the kthread_should_stop() check
and set_current_state(), the wakeup is lost and dlm_recoverd never wakes up.

Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp>
Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp>
---
 fs/dlm/recoverd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 6859b4b..d3956cc 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
 static int dlm_recoverd(void *arg)
 {
 	struct dlm_ls *ls;
+	unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1;
 
 	ls = dlm_find_lockspace_local(arg);
 	if (!ls) {
@@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg)
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
 		    !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
-			schedule();
+			schedule_timeout(timeout);
 		set_current_state(TASK_RUNNING);
 
 		if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) {
-- 
2.7.4


* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd
  2017-08-09  5:51 [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd tsutomu.owa
@ 2017-08-09 16:18 ` David Teigland
  2017-08-17 23:40   ` tsutomu.owa
  0 siblings, 1 reply; 4+ messages in thread
From: David Teigland @ 2017-08-09 16:18 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Aug 09, 2017 at 05:51:01AM +0000, tsutomu.owa at toshiba.co.jp wrote:
> If dlm_recoverd_stop() is called between the kthread_should_stop() check
> and set_current_state(), the wakeup is lost and dlm_recoverd never wakes up.

This works, but have you looked elsewhere in the kernel for kthread
examples we could copy that do a race-free check without a timeout?
Thanks

> 
> Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp>
> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp>
> ---
>  fs/dlm/recoverd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index 6859b4b..d3956cc 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
>  static int dlm_recoverd(void *arg)
>  {
>  	struct dlm_ls *ls;
> +	unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1;
>  
>  	ls = dlm_find_lockspace_local(arg);
>  	if (!ls) {
> @@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg)
>  		set_current_state(TASK_INTERRUPTIBLE);
>  		if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
>  		    !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
> -			schedule();
> +			schedule_timeout(timeout);
>  		set_current_state(TASK_RUNNING);
>  
>  		if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) {
> -- 
> 2.7.4

* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd
  2017-08-09 16:18 ` David Teigland
@ 2017-08-17 23:40   ` tsutomu.owa
  2017-08-22 18:21     ` David Teigland
  0 siblings, 1 reply; 4+ messages in thread
From: tsutomu.owa @ 2017-08-17 23:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

> This works, but have you looked elsewhere in the kernel for kthread
> examples we could copy that do a race-free check without a timeout?

Looking at other kthread implementations in the kernel, the following
modification may be better.
The important point is to call kthread_should_stop() after
set_current_state(TASK_INTERRUPTIBLE).  How about this fix?

--- fs/dlm/recoverd.c.orig      2017-08-10 16:42:31.131296351 +0900
+++ fs/dlm/recoverd.c   2017-08-10 16:43:22.483295232 +0900
@@ -287,8 +287,10 @@
	set_bit(LSFL_RECOVER_LOCK, &ls->ls_flags);
	wake_up(&ls->ls_recover_lock_wait);

-	while (!kthread_should_stop()) {
+	while (1) {
		set_current_state(TASK_INTERRUPTIBLE);
+		if (kthread_should_stop())
+			break;
		if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
		    !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
			schedule();

thanks,
-- owa
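
[Editorial note] For reference, this is roughly the loop shape the diff
above produces (an uncompiled sketch reconstructed from the diff, not taken
from a kernel tree):

```
	while (1) {
		/* Set the sleep state before checking the stop flag: a
		 * concurrent kthread_stop() is then either seen by the
		 * check below, or its wake_up_process() puts the task
		 * back in TASK_RUNNING so schedule() returns at once --
		 * the wakeup can no longer be lost. */
		set_current_state(TASK_INTERRUPTIBLE);
		if (kthread_should_stop()) {
			set_current_state(TASK_RUNNING);
			break;
		}
		if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
		    !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
			schedule();
		set_current_state(TASK_RUNNING);
		/* ... handle LSFL_RECOVER_DOWN / LSFL_RECOVER_WORK ... */
	}
```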


* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd
  2017-08-17 23:40   ` tsutomu.owa
@ 2017-08-22 18:21     ` David Teigland
  0 siblings, 0 replies; 4+ messages in thread
From: David Teigland @ 2017-08-22 18:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Thu, Aug 17, 2017 at 11:40:13PM +0000, tsutomu.owa at toshiba.co.jp wrote:
> Looking at other kthread implementations in the kernel, the following
> modification may be better.
> The important point is to call kthread_should_stop() after
> set_current_state(TASK_INTERRUPTIBLE).  How about this fix?

Thanks, I prefer this.  Perhaps put that explanation in a comment, e.g.

> --- fs/dlm/recoverd.c.orig      2017-08-10 16:42:31.131296351 +0900
> +++ fs/dlm/recoverd.c   2017-08-10 16:43:22.483295232 +0900
> @@ -287,8 +287,10 @@
> 	set_bit(LSFL_RECOVER_LOCK, &ls->ls_flags);
> 	wake_up(&ls->ls_recover_lock_wait);
> 
> -	while (!kthread_should_stop()) {
> +	while (1) {
                /* call kthread_should_stop() after set_current_state() */
> 		set_current_state(TASK_INTERRUPTIBLE);
> +		if (kthread_should_stop())
> +			break;
> 		if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
> 		    !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
> 			schedule();
> 
