* [Ocfs2-devel] [RFC] Doubt about dlm_worker
@ 2015-09-06 13:11 Joseph Qi
  2015-09-10 11:49 ` Joseph Qi
  0 siblings, 1 reply; 4+ messages in thread
From: Joseph Qi @ 2015-09-06 13:11 UTC (permalink / raw)
  To: ocfs2-devel

The comment above dlm_dispatch_work reads:
/* Worker function used during recovery. */

But dlm_worker is actually used by four types of dlm message workers:
	dlm_assert_master_worker
	dlm_deref_lockres_worker
	dlm_request_all_locks_worker
	dlm_mig_lockres_worker

The first two are not related to dlm recovery. Moreover,
dlm_assert_master_worker sends DLM_ASSERT_MASTER_MSG to all other nodes,
and a lot of assert master work may be queued during recovery. In our
scenario, the count is in the tens of thousands.
This delays recovery, because dlm_worker is a single-threaded workqueue
and the cluster hangs while dlm recovery is in progress.
So I wonder whether we can move the assert master work to a new
workqueue, or just use a system workqueue.
Any suggestions?
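
To make the delay concrete, here is a minimal userspace sketch in C. It is
not ocfs2 code and uses no kernel workqueue API; it only models FIFO
ordering on a single-threaded queue. The enum, the function name and the
split_assert_master flag are all hypothetical, chosen for illustration.
The function counts how many queued items must finish before the first
recovery item can run, with and without the assert master work moved to
a queue of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work-item kinds, loosely named after the workers above. */
enum work_kind {
	WORK_ASSERT_MASTER,
	WORK_DEREF_LOCKRES,
	WORK_RECOVERY,
};

/*
 * Count the items a single-threaded FIFO queue must finish before it
 * reaches the first recovery item in backlog[0..n).
 *
 * If split_assert_master is non-zero, assert-master items are treated
 * as living on a separate queue, so they no longer sit in front of
 * recovery work. Returns the count of all remaining items if the
 * backlog holds no recovery item.
 */
size_t items_ahead_of_recovery(const enum work_kind *backlog, size_t n,
			       int split_assert_master)
{
	size_t ahead = 0;

	assert(backlog != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (backlog[i] == WORK_RECOVERY)
			return ahead;
		/* Assumption: a split queue removes these from the path. */
		if (split_assert_master && backlog[i] == WORK_ASSERT_MASTER)
			continue;
		ahead++;
	}
	return ahead;
}
```

For a backlog of two assert master items, one deref item and then one
recovery item, the function returns 3 for the shared queue and 1 after
the split. With tens of thousands of assert master items queued first,
the shared-queue count grows by the same amount, while the split count
does not change.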


* [Ocfs2-devel] [RFC] Doubt about dlm_worker
  2015-09-06 13:11 [Ocfs2-devel] [RFC] Doubt about dlm_worker Joseph Qi
@ 2015-09-10 11:49 ` Joseph Qi
  2015-09-10 19:18   ` Sunil Mushran
  2015-09-11  2:19   ` Junxiao Bi
  0 siblings, 2 replies; 4+ messages in thread
From: Joseph Qi @ 2015-09-10 11:49 UTC (permalink / raw)
  To: ocfs2-devel

Hi Junxiao & Sunil,
Your comments would be appreciated.

Thanks,
Joseph


* [Ocfs2-devel] [RFC] Doubt about dlm_worker
  2015-09-10 11:49 ` Joseph Qi
@ 2015-09-10 19:18   ` Sunil Mushran
  2015-09-11  2:19   ` Junxiao Bi
  1 sibling, 0 replies; 4+ messages in thread
From: Sunil Mushran @ 2015-09-10 19:18 UTC (permalink / raw)
  To: ocfs2-devel

Sure. It will need to be tested appropriately.


* [Ocfs2-devel] [RFC] Doubt about dlm_worker
  2015-09-10 11:49 ` Joseph Qi
  2015-09-10 19:18   ` Sunil Mushran
@ 2015-09-11  2:19   ` Junxiao Bi
  1 sibling, 0 replies; 4+ messages in thread
From: Junxiao Bi @ 2015-09-11  2:19 UTC (permalink / raw)
  To: ocfs2-devel

On 09/10/2015 07:49 PM, Joseph Qi wrote:
> Hi Junxiao & Sunil,
> Your comments would be appreciated.
> 
> Thanks,
> Joseph
> 
> On 2015/9/6 21:11, Joseph Qi wrote:
>> Comments for dlm_dispatch_work is described below:
>> /* Worker function used during recovery. */
>>
>> But actually dlm_worker is used by 4 types of dlm message workers:
>> 	dlm_assert_master_worker
>> 	dlm_deref_lockres_worker
>> 	dlm_request_all_locks_worker
>> 	dlm_mig_lockres_worker
>>
>> And the first 2 are not dlm recovery related. Moreover, it will send
>> DLM_ASSERT_MASTER_MSG to all other nodes in dlm_assert_master_worker.
>> And it may do a lot of assert master during recovery. In our scenario,
>> it is tens of thousands.
>> This will delay the recovery because dlm_worker is a single thread
>> workqueue and cluster is hanging during dlm recovery.
>> So I doubt if we can move the assert master to a new workqueue or just
>> use a system workqueue.
>> Any suggestions?
I took a look at the code and didn't see an obvious need for these four
workers to run in order, and they use locks for protection. So I think
it's OK to split it out. But it would be better to test this thoroughly,
in case the change exposes some hidden bug.

Thanks,
Junxiao.
