ocfs2-devel.oss.oracle.com archive mirror
* [Ocfs2-devel] ocfs2/dlm: disable BUG_ON when DLM_LOCK_RES_DROPPING_REF is cleared before dlm_deref_lockres_done_handler
@ 2016-07-10 10:01 piaojun
  2016-07-11  1:55 ` Joseph Qi
  0 siblings, 1 reply; 3+ messages in thread
From: piaojun @ 2016-07-10 10:01 UTC (permalink / raw)
  To: ocfs2-devel

We found a situation in which DLM_LOCK_RES_DROPPING_REF is cleared
unexpectedly and a BUG_ON is triggered, as described below. To fix it,
we disable the BUG_ON and purge the lockres in
dlm_do_local_recovery_cleanup.

Node 1                               Node 2(master)
dlm_purge_lockres
                                     dlm_deref_lockres_handler

                                     DLM_LOCK_RES_SETREF_INPROG is set,
                                     respond with DLM_DEREF_RESPONSE_INPROG

receive DLM_DEREF_RESPONSE_INPROG
stop purging in dlm_purge_lockres
and wait for DLM_DEREF_RESPONSE_DONE

                                     dispatch dlm_deref_lockres_worker
                                     respond with DLM_DEREF_RESPONSE_DONE

receive DLM_DEREF_RESPONSE_DONE and
prepare to purge lockres

                                     Node 2 goes down

find Node 2 down and do local
cleanup for Node 2:
dlm_do_local_recovery_cleanup
  -> clear DLM_LOCK_RES_DROPPING_REF

when purging lockres, BUG_ON triggers
because DLM_LOCK_RES_DROPPING_REF is cleared:
dlm_deref_lockres_done_handler
  -> BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));

Fixes: 60d663cb5273 ("ocfs2/dlm: add DEREF_DONE message")
Signed-off-by: Jun Piao <piaojun@huawei.com>
---
 fs/ocfs2/dlm/dlmmaster.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
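
For readers who want to see the ordering problem in isolation, the
sequence in the changelog can be approximated with a small user-space
model. This is only a sketch under stated assumptions, not kernel code:
every name prefixed with model_ or MODEL_ is hypothetical, the flag
value is arbitrary, and locking, messaging and refcounting are left out
entirely. It only shows that the DEREF_DONE handler can run after local
recovery cleanup has already cleared the flag, and that a tolerant
check returns instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for DLM_LOCK_RES_DROPPING_REF; value is arbitrary. */
#define MODEL_DROPPING_REF 0x1u

/* Hypothetical, heavily reduced model of a lock resource. */
struct model_lockres {
	unsigned int state;	/* models res->state */
	bool purged;		/* set once the purge has completed */
};

/* Node 1 starts purging: the lockres is marked as dropping its ref. */
static void model_start_purge(struct model_lockres *res)
{
	res->state |= MODEL_DROPPING_REF;
}

/* Node 1 notices the master died: local recovery cleanup clears the flag. */
static void model_recovery_cleanup(struct model_lockres *res)
{
	res->state &= ~MODEL_DROPPING_REF;
}

/*
 * Models the tolerant check in the DEREF_DONE handler. Returns true if
 * the purge was completed here, false if the flag was already cleared,
 * which is the case that an unconditional BUG_ON would turn into a crash.
 */
static bool model_deref_done(struct model_lockres *res)
{
	if (!(res->state & MODEL_DROPPING_REF)) {
		printf("deref done received, but lockres is already derefed\n");
		return false;
	}
	res->state &= ~MODEL_DROPPING_REF;
	res->purged = true;
	return true;
}

/*
 * Replays the changelog scenario. When master_dies_first is true, the
 * recovery cleanup runs between the start of the purge and the handling
 * of DEREF_DONE, as in the reported sequence.
 */
bool model_run(bool master_dies_first)
{
	struct model_lockres res = { .state = 0, .purged = false };

	model_start_purge(&res);
	if (master_dies_first)
		model_recovery_cleanup(&res);
	return model_deref_done(&res);
}
```

Calling model_run(false) follows the normal path and completes the
purge; model_run(true) follows the reported ordering and takes the
early-return branch that replaces the BUG_ON.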

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 9aed6e2..f72e7ae 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -2416,7 +2416,16 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
 	}
 
 	spin_lock(&res->spinlock);
-	BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
+	if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
+		spin_unlock(&res->spinlock);
+		spin_unlock(&dlm->spinlock);
+		mlog(ML_NOTICE, "%s:%.*s: node %u sends deref done "
+			"but it is already derefed!\n", dlm->name,
+			res->lockname.len, res->lockname.name, node);
+		dlm_lockres_put(res);
+		goto done;
+	}
+
 	if (!list_empty(&res->purge)) {
 		mlog(0, "%s: Removing res %.*s from purgelist\n",
 			dlm->name, res->lockname.len, res->lockname.name);
@@ -2455,6 +2464,8 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
 
 	spin_unlock(&dlm->spinlock);
 
+	ret = 0;
+
 done:
 	dlm_put(dlm);
 	return ret;
-- 
1.8.4.3

