* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd @ 2017-08-09 5:51 tsutomu.owa 2017-08-09 16:18 ` David Teigland 0 siblings, 1 reply; 4+ messages in thread From: tsutomu.owa @ 2017-08-09 5:51 UTC (permalink / raw) To: cluster-devel.redhat.com When dlm_recoverd_stop() is called between kthread_should_stop() and set_task_state(), dlm_recoverd will not wake up. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> --- fs/dlm/recoverd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c index 6859b4b..d3956cc 100644 --- a/fs/dlm/recoverd.c +++ b/fs/dlm/recoverd.c @@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls) static int dlm_recoverd(void *arg) { struct dlm_ls *ls; + unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1; ls = dlm_find_lockspace_local(arg); if (!ls) { @@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg) set_current_state(TASK_INTERRUPTIBLE); if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) - schedule(); + schedule_timeout(timeout); set_current_state(TASK_RUNNING); if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) { -- 2.7.4 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd 2017-08-09 5:51 [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd tsutomu.owa @ 2017-08-09 16:18 ` David Teigland 2017-08-17 23:40 ` tsutomu.owa 0 siblings, 1 reply; 4+ messages in thread From: David Teigland @ 2017-08-09 16:18 UTC (permalink / raw) To: cluster-devel.redhat.com On Wed, Aug 09, 2017 at 05:51:01AM +0000, tsutomu.owa at toshiba.co.jp wrote: > When dlm_recoverd_stop() is called between kthread_should_stop() and > set_task_state(), dlm_recoverd will not wake up. This works, but have you looked elsewhere in the kernel for kthread examples we could copy that do a race-free check without a timeout? Thanks > > Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> > Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> > --- > fs/dlm/recoverd.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c > index 6859b4b..d3956cc 100644 > --- a/fs/dlm/recoverd.c > +++ b/fs/dlm/recoverd.c > @@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls) > static int dlm_recoverd(void *arg) > { > struct dlm_ls *ls; > + unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1; > > ls = dlm_find_lockspace_local(arg); > if (!ls) { > @@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg) > set_current_state(TASK_INTERRUPTIBLE); > if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && > !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) > - schedule(); > + schedule_timeout(timeout); > set_current_state(TASK_RUNNING); > > if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) { > -- > 2.7.4 > > > ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd 2017-08-09 16:18 ` David Teigland @ 2017-08-17 23:40 ` tsutomu.owa 2017-08-22 18:21 ` David Teigland 0 siblings, 1 reply; 4+ messages in thread From: tsutomu.owa @ 2017-08-17 23:40 UTC (permalink / raw) To: cluster-devel.redhat.com Hi, > This works, but have you looked elsewhere in the kernel for kthread > examples we could copy that do a race-free check without a timeout? If you refer to other implementations in kernel, the following modifications may be better. The important thing is to call kthread_should_stop() after set_current_state(TASK_INTERRUPTIBLE). How is this fix? --- fs/dlm/recoverd.c.orig 2017-08-10 16:42:31.131296351 +0900 +++ fs/dlm/recoverd.c 2017-08-10 16:43:22.483295232 +0900 @@ -287,8 +287,10 @@ set_bit(LSFL_RECOVER_LOCK, &ls->ls_flags); wake_up(&ls->ls_recover_lock_wait); - while (!kthread_should_stop()) { + while (1) { set_current_state(TASK_INTERRUPTIBLE); + if (kthread_should_stop()) + break; if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) schedule(); thanks, -- owa -----Original Message----- From: David Teigland [mailto:teigland at redhat.com] Sent: Thursday, August 10, 2017 1:19 AM To: owa tsutomu(?? ? ??? ?????????????) Cc: cluster-devel at redhat.com; miyauchi tadashi(?? ?? ???? ?????????) Subject: Re: [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd On Wed, Aug 09, 2017 at 05:51:01AM +0000, tsutomu.owa at toshiba.co.jp wrote: > When dlm_recoverd_stop() is called between kthread_should_stop() and > set_task_state(), dlm_recoverd will not wake up. This works, but have you looked elsewhere in the kernel for kthread examples we could copy that do a race-free check without a timeout? Thanks > > Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> > Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> > --- > fs/dlm/recoverd.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c > index 6859b4b..d3956cc 100644 > --- a/fs/dlm/recoverd.c > +++ b/fs/dlm/recoverd.c > @@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls) > static int dlm_recoverd(void *arg) > { > struct dlm_ls *ls; > + unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1; > > ls = dlm_find_lockspace_local(arg); > if (!ls) { > @@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg) > set_current_state(TASK_INTERRUPTIBLE); > if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && > !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) > - schedule(); > + schedule_timeout(timeout); > set_current_state(TASK_RUNNING); > > if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) { > -- > 2.7.4 > > > ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd 2017-08-17 23:40 ` tsutomu.owa @ 2017-08-22 18:21 ` David Teigland 0 siblings, 0 replies; 4+ messages in thread From: David Teigland @ 2017-08-22 18:21 UTC (permalink / raw) To: cluster-devel.redhat.com On Thu, Aug 17, 2017 at 11:40:13PM +0000, tsutomu.owa at toshiba.co.jp wrote: > If you refer to other implementations in kernel, the following > modifications may be better. > The important thing is to call kthread_should_stop() after > set_current_state(TASK_INTERRUPTIBLE). How is this fix? Thanks, I prefer this. Perhaps put that explanation in a comment, e.g. > --- fs/dlm/recoverd.c.orig 2017-08-10 16:42:31.131296351 +0900 > +++ fs/dlm/recoverd.c 2017-08-10 16:43:22.483295232 +0900 > @@ -287,8 +287,10 @@ > set_bit(LSFL_RECOVER_LOCK, &ls->ls_flags); > wake_up(&ls->ls_recover_lock_wait); > > - while (!kthread_should_stop()) { > + while (1) { /* call kthread_should_stop() after set_current_state() / > set_current_state(TASK_INTERRUPTIBLE); > + if (kthread_should_stop()) > + break; > if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && > !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) > schedule(); > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-08-22 18:21 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-08-09 5:51 [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd tsutomu.owa 2017-08-09 16:18 ` David Teigland 2017-08-17 23:40 ` tsutomu.owa 2017-08-22 18:21 ` David Teigland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).