From: wangjian <wangjian161@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2/dlm: return DLM_CANCELGRANT if the lock is on granted list and the operation is canceled
Date: Mon, 3 Dec 2018 20:20:58 +0800
Message-ID: <98f0e80c-9c13-dbb6-047c-b40e100082b1@huawei.com>
In dlm_move_lockres_to_recovery_list(), if the lock is on the
granted queue and cancel_pending is set, we hit a BUG_ON. I think
this BUG_ON is meaningless, so this patch removes it. A scenario
that triggers it is given below.
At the beginning, Node 1 is the master and holds an NL lock;
Node 2 and Node 3 each hold a PR lock.

Node 2: requests conversion of its PR lock to EX.
Node 3: requests conversion of its PR lock to EX.
Node 1: Node 3's PR lock blocks Node 2's conversion to EX,
        so Node 1 sends Node 3 a BAST.
Node 3: receives the BAST from Node 1. The downconvert thread
        begins to cancel the PR-to-EX conversion. In
        dlmunlock_common() it has set lock->cancel_pending,
        but has not yet entered dlm_send_remote_unlock_request().
Node 2: dies because the host is powered down.
Node 1: in the recovery process, cleans up the locks related to
        Node 2, then finishes Node 3's PR-to-EX request and
        sends Node 3 an AST.
Node 3: receives the AST from Node 1, changes the lock level to
        EX and moves the lock to the granted list.
Node 1: dies because the host is powered down.
Node 3: in dlm_move_lockres_to_recovery_list(), the lock is on
        the granted queue and cancel_pending is set: BUG_ON.
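For reference, the code that trips here looks roughly like the sketch
below (a simplified paraphrase of the cancel_pending branch in
dlm_move_lockres_to_recovery_list(), from my reading of
fs/ocfs2/dlm/dlmrecovery.c; not the exact source). The function walks
each queue of the lockres, with i identifying the queue (granted,
converting or blocked), and the BUG_ON assumes that a lock with
cancel_pending set can only be found on the converting queue:

	/* simplified paraphrase, not the exact kernel source */
	queue = dlm_list_idx_to_ptr(res, i);
	list_for_each_entry_safe(lock, next, queue, list) {
		if (lock->convert_pending) {
			/* move the converting lock back to granted */
		} else if (lock->lock_pending) {
			/* drop the pending lock request */
		} else if (lock->unlock_pending) {
			/* treat the unlock as already completed */
		} else if (lock->cancel_pending) {
			/* a cancel was in progress when the node died;
			 * treat it as if it had completed successfully
			 * before sending this lock state to the new
			 * master */
			BUG_ON(i != DLM_CONVERTING_LIST);
			/* ^ this fires in the scenario above: the AST
			 * already moved the lock to the granted queue,
			 * so i == DLM_GRANTED_LIST here */
			dlm_commit_pending_cancel(res, lock);
			lock->cancel_pending = 0;
		}
	}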
However, after removing this BUG_ON, the process can hit a second
BUG(), in ocfs2_unlock_ast(). Here is a scenario that triggers the
second BUG in ocfs2_unlock_ast():
At the beginning, Node 1 is the master and holds an NL lock;
Node 2 and Node 3 each hold a PR lock.

Node 2: requests conversion of its PR lock to EX.
Node 3: requests conversion of its PR lock to EX.
Node 1: Node 3's PR lock blocks Node 2's conversion to EX,
        so Node 1 sends Node 3 a BAST.
Node 3: receives the BAST from Node 1. The downconvert thread
        begins to cancel the PR-to-EX conversion. In
        dlmunlock_common() it has released lock->spinlock and
        res->spinlock, but has not yet entered
        dlm_send_remote_unlock_request().
Node 2: dies because the host is powered down.
Node 1: in the recovery process, cleans up the locks related to
        Node 2, then finishes Node 3's PR-to-EX request and
        sends Node 3 an AST.
Node 3: receives the AST from Node 1, changes the lock level to
        EX, moves the lock to the granted list, and sets
        lockres->l_unlock_action to OCFS2_UNLOCK_INVALID in
        ocfs2_locking_ast().
Node 1: dies because the host is powered down.
Node 3: realizes that Node 1 is dead and removes Node 1 from the
        domain_map. The downconvert thread gets DLM_NORMAL back
        from dlm_send_remote_unlock_request() and sets *call_ast
        to 1. The downconvert thread then hits the BUG() in
        ocfs2_unlock_ast().
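To explain why the downconvert thread hits that BUG():
ocfs2_unlock_ast() only expects to be called while the fs layer still
has an unlock or a cancel in flight, so it switches on
lockres->l_unlock_action and treats anything else as fatal. Roughly
(a simplified paraphrase of fs/ocfs2/dlmglue.c, not the exact source):

	/* simplified paraphrase, not the exact kernel source */
	spin_lock_irqsave(&lockres->l_lock, flags);
	switch (lockres->l_unlock_action) {
	case OCFS2_UNLOCK_CANCEL_CONVERT:
		/* the cancel really completed before the grant */
		break;
	case OCFS2_UNLOCK_DROP_LOCK:
		/* the lock itself is being torn down */
		break;
	default:
		/* in the scenario above, ocfs2_locking_ast() has
		 * already reset l_unlock_action to
		 * OCFS2_UNLOCK_INVALID, so we land here */
		BUG();
	}
	spin_unlock_irqrestore(&lockres->l_lock, flags);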
To avoid hitting the second BUG, dlmunlock_common() should return
DLM_CANCELGRANT if the lock is on the granted list and the operation
being performed is a cancel.
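With that change, the existing DLM_CANCELGRANT handling in
dlmunlock_common() takes over; the fragment below is a simplified
paraphrase of that existing code (shown only for context, it is not
part of this patch):

	/* simplified paraphrase of the existing code in
	 * dlmunlock_common(); not the exact kernel source */
	if (status == DLM_CANCELGRANT) {
		/* the lock was already granted; let the ast
		 * handle all of these actions */
		actions &= ~(DLM_UNLOCK_REMOVE_LOCK |
			     DLM_UNLOCK_REGRANT_LOCK |
			     DLM_UNLOCK_CLEAR_CONVERT_TYPE);
	}

And, if I read fs/ocfs2/stack_o2cb.c correctly, the unlock AST wrapper
drops a DLM_CANCELGRANT status before it ever reaches
ocfs2_unlock_ast(), so the downconvert thread no longer trips the
BUG() there.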
Signed-off-by: Jian Wang <wangjian161@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
---
fs/ocfs2/dlm/dlmrecovery.c | 1 -
fs/ocfs2/dlm/dlmunlock.c | 5 +++++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 802636d..7489652 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
* if this had completed successfully
* before sending this lock state to the
* new master */
- BUG_ON(i != DLM_CONVERTING_LIST);
mlog(0, "node died with cancel pending "
"on %.*s. move back to granted list.\n",
res->lockname.len, res->lockname.name);
diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
index 63d701c..505bb6c 100644
--- a/fs/ocfs2/dlm/dlmunlock.c
+++ b/fs/ocfs2/dlm/dlmunlock.c
@@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
flags, owner);
spin_lock(&res->spinlock);
spin_lock(&lock->spinlock);
+
+ if ((flags & LKM_CANCEL) &&
+ dlm_lock_on_list(&res->granted, lock))
+ status = DLM_CANCELGRANT;
+
/* if the master told us the lock was already granted,
* let the ast handle all of these actions */
if (status == DLM_CANCELGRANT) {
--
1.8.3.1