From: piaojun <piaojun@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] ocfs2/dlm: disable BUG_ON when DLM_LOCK_RES_DROPPING_REF is cleared before dlm_deref_lockres_done_handler
Date: Mon, 11 Jul 2016 10:17:00 +0800
Message-ID: <5783019C.7030203@huawei.com>
In-Reply-To: <5782FC7A.90208@huawei.com>
On 2016-7-11 9:55, Joseph Qi wrote:
> Hi Jun,
>
> On 2016/7/10 18:01, piaojun wrote:
>> We found a situation in which DLM_LOCK_RES_DROPPING_REF is cleared
>> unexpectedly, as described below. To fix the bug, we disable the BUG_ON
>> and purge the lockres in dlm_do_local_recovery_cleanup.
>>
>> Node 1                                  Node 2 (master)
>> dlm_purge_lockres
>>                                         dlm_deref_lockres_handler
>>
>>                                         DLM_LOCK_RES_SETREF_INPROG is set
>>                                         response DLM_DEREF_RESPONSE_INPROG
>>
>> receive DLM_DEREF_RESPONSE_INPROG
>> stop purging in dlm_purge_lockres
>> and wait for DLM_DEREF_RESPONSE_DONE
>>
>>                                         dispatch dlm_deref_lockres_worker
>>                                         response DLM_DEREF_RESPONSE_DONE
>>
>> receive DLM_DEREF_RESPONSE_DONE and
>> prepare to purge lockres
>>
>>                                         Node 2 goes down
>>
>> find Node 2 down and do local
>> clean up for Node 2:
>> dlm_do_local_recovery_cleanup
>>   -> clear DLM_LOCK_RES_DROPPING_REF
>>
>> when purging lockres, BUG_ON happens
>> because DLM_LOCK_RES_DROPPING_REF is clear:
>> dlm_deref_lockres_done_handler
>>   -> BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
>>
>> Fixes: 60d663cb5273 ("ocfs2/dlm: add DEREF_DONE message")
>> Signed-off-by: Jun Piao <piaojun@huawei.com>
>> ---
>> fs/ocfs2/dlm/dlmmaster.c | 13 ++++++++++++-
>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>> index 9aed6e2..f72e7ae 100644
>> --- a/fs/ocfs2/dlm/dlmmaster.c
>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>> @@ -2416,7 +2416,16 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
>>  	}
>>
>>  	spin_lock(&res->spinlock);
>> -	BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
>> +	if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>> +		spin_unlock(&res->spinlock);
>> +		spin_unlock(&dlm->spinlock);
>> +		mlog(ML_NOTICE, "%s:%.*s: node %u sends deref done "
>> +			"but it is already derefed!\n", dlm->name,
>> +			res->lockname.len, res->lockname.name, node);
>> +		dlm_lockres_put(res);
> So we treat this case as normal?
> If so, we'd better return 0 other than -EINVAL.
>
> Thanks,
> Joseph
>
Good suggestion, I will fix this problem in the following [PATCH v2].
Thanks,
Jun Piao
>> +		goto done;
>> +	}
>> +
>>  	if (!list_empty(&res->purge)) {
>>  		mlog(0, "%s: Removing res %.*s from purgelist\n",
>>  		     dlm->name, res->lockname.len, res->lockname.name);
>> @@ -2455,6 +2464,8 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
>>
>>  	spin_unlock(&dlm->spinlock);
>>
>> +	ret = 0;
>> +
>>  done:
>>  	dlm_put(dlm);
>>  	return ret;
>>
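
To make the point about the return value concrete, here is a rough userspace
sketch of the early-exit path returning 0 when the flag is already clear. It
is not the kernel code and not the actual v2: the names sketch_lockres,
SKETCH_DROPPING_REF, sketch_recovery_cleanup and sketch_deref_done are
hypothetical stand-ins, and locking, refcounting and the purge list are left
out on purpose.

```c
#include <stdio.h>

/* Hypothetical stand-in for DLM_LOCK_RES_DROPPING_REF; the value is illustrative. */
#define SKETCH_DROPPING_REF 0x1u

/* Hypothetical, heavily reduced model of a lock resource. */
struct sketch_lockres {
	unsigned int state;	/* models res->state */
	int purged;		/* set to 1 once the normal purge path has run */
};

/* Models local recovery cleanup clearing the flag after the master dies. */
static void sketch_recovery_cleanup(struct sketch_lockres *res)
{
	res->state &= ~SKETCH_DROPPING_REF;
}

/*
 * Models the deref-done handler. If the flag was already cleared, the
 * message is treated as a harmless late arrival: log it and return 0
 * instead of crashing or reporting an error.
 */
static int sketch_deref_done(struct sketch_lockres *res)
{
	if (!(res->state & SKETCH_DROPPING_REF)) {
		fprintf(stderr, "deref done received, but ref already dropped\n");
		return 0;
	}

	/* Normal path: finish the purge and drop the flag. */
	res->purged = 1;
	res->state &= ~SKETCH_DROPPING_REF;
	return 0;
}
```

Calling sketch_recovery_cleanup() before sketch_deref_done() models the race
in the commit message: the handler returns 0 and leaves the resource alone,
while the normal ordering still completes the purge.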