From: xuejiufei <xuejiufei@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] ocfs2: A race between refmap setting and clearing
Date: Wed, 13 Jan 2016 14:21:26 +0800
Message-ID: <5695ECE6.7050709@huawei.com>
In-Reply-To: <5695BA7D.90009@oracle.com>

Hi Junxiao,

I did not describe the issue clearly before. The complete sequence is:

Node 1                              Node 2 (master)
dlmlock
dlm_do_master_request
                                    dlm_master_request_handler
                                    -> dlm_lockres_set_refmap_bit
dlmlock succeeds
dlmunlock succeeds

dlm_purge_lockres
                                    dlm_deref_handler
                                    -> finds the lock resource in
                                       DLM_LOCK_RES_SETREF_INPROG state,
                                       so dispatches a deref work
dlm_purge_lockres succeeds

call dlmlock again
dlm_do_master_request
                                    dlm_master_request_handler
                                    -> dlm_lockres_set_refmap_bit

                                    deref work triggers, calls
                                    dlm_lockres_clear_refmap_bit
                                    to clear Node 1 from the refmap

                                    dlm_purge_lockres succeeds

dlm_send_remote_lock_request
                                    returns DLM_IVLOCKID because
                                    the lockres does not exist
BUG if the lockres is $RECOVERY
On 2016/1/13 10:46, Junxiao Bi wrote:
> On 01/12/2016 03:16 PM, xuejiufei wrote:
>> Hi, Junxiao
>>
>> On 2016/1/12 12:03, Junxiao Bi wrote:
>>> Hi Jiufei,
>>>
>>> On 01/11/2016 10:46 AM, xuejiufei wrote:
>>>> Hi all,
>>>> We have found a race between refmap setting and clearing which
>>>> can cause the lock resource on the master to be freed before other
>>>> nodes purge it.
>>>>
>>>> Node 1                              Node 2 (master)
>>>> dlm_do_master_request
>>>>                                     dlm_master_request_handler
>>>>                                     -> dlm_lockres_set_refmap_bit
>>>> call dlm_purge_lockres after unlock
>>>>                                     dlm_deref_handler
>>>>                                     -> find lock resource is in
>>>>                                        DLM_LOCK_RES_SETREF_INPROG state,
>>>>                                        so dispatch a deref work
>>>> dlm_purge_lockres succeed.
>>>>
>>>> dlm_do_master_request
>>>>                                     dlm_master_request_handler
>>>>                                     -> dlm_lockres_set_refmap_bit
>>>>
>>>>                                     deref work trigger, call
>>>>                                     dlm_lockres_clear_refmap_bit
>>>>                                     to clear Node 1 from refmap
>>>>
>>>> Now Node 2 can purge the lock resource, but the owner of the lock resource
>>>> on Node 1 is still Node 2, which may trigger a BUG if the lock resource
>>>> is $RECOVERY, or cause other problems.
>>>>
>>>> We have discussed 2 solutions:
>>>> 1) The master returns an error to Node 1 if DLM_LOCK_RES_SETREF_INPROG
>>>> is set. Node 1 will not retry, and the master sends another message to
>>>> Node 1 after clearing the refmap. Node 1 can purge the lock resource
>>>> after the refmap on the master is cleared.
>>>> 2) The master returns an error to Node 1 if DLM_LOCK_RES_SETREF_INPROG
>>>> is set, and Node 1 will retry the deref of the lockres.
>>>>
>>>> Does anybody have a better idea?
>>>>
>>> dlm_purge_lockres() will wait to drop the ref until
>>> DLM_LOCK_RES_SETREF_INPROG is cleared. So if this flag is set when the
>>> master is found during the master request, and then cleared when the
>>> assert master message is received, can this fix the issue?
>>>
>> I don't think this can fix it. Before the master request is made, the lock
>> resource has already been purged. The master should clear the refmap before
>> the client purges it.
> inflight_locks is increased in dlm_get_lock_resource(), which will stop the
> lockres from being purged? If DLM_LOCK_RES_SETREF_INPROG is set when the
> lockres owner is found during the master request, will this stop the
> lockres from being purged after unlock?
>
> Thanks,
> Junxiao.
>
>>
>> Thanks,
>> Jiufei
>>
>>> Thanks,
>>> Junxiao.
>>>> Thanks,
>>>> --Jiufei