From: xuejiufei <xuejiufei@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] ocfs2: A race between refmap setting and clearing
Date: Mon, 11 Jan 2016 10:46:29 +0800
Message-ID: <56931785.2090603@huawei.com>

Hi all,
We have found a race between refmap setting and clearing which
can cause the lock resource on the master to be freed before other
nodes purge it.

Node 1                               Node 2(master)
dlm_do_master_request
                                dlm_master_request_handler
                                -> dlm_lockres_set_refmap_bit
calls dlm_purge_lockres after unlock
                                dlm_deref_handler
                                -> finds the lock resource in
                                   DLM_LOCK_RES_SETREF_INPROG state,
                                   so dispatches a deref work
dlm_purge_lockres succeeds.

dlm_do_master_request
                                dlm_master_request_handler
                                -> dlm_lockres_set_refmap_bit

                                deref work triggers, calls
                                dlm_lockres_clear_refmap_bit
                                to clear Node 1 from refmap

Now Node 2 can purge the lock resource, but Node 1 still records
Node 2 as the owner of the lock resource. This may trigger a BUG if
the lock resource is $RECOVERY, or cause other problems.
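
To make the interleaving easier to follow, here is a minimal user-space
sketch that replays the sequence above in order. It is only a model, not
the kernel code: struct sim_lockres, the sim_* functions and the
deref_queued field are hypothetical names invented for illustration, and
setref_inprog merely stands in for DLM_LOCK_RES_SETREF_INPROG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of the lock resource on the master. */
struct sim_lockres {
    uint32_t refmap;     /* bit n set: node n holds a reference */
    bool setref_inprog;  /* stands in for DLM_LOCK_RES_SETREF_INPROG */
    bool deref_queued;   /* a deferred deref work item is pending */
    int deref_node;      /* node whose bit the pending work will clear */
};

/* Master request: the master sets the requester's refmap bit. */
static void sim_master_request(struct sim_lockres *res, int node)
{
    res->setref_inprog = true;
    res->refmap |= 1u << node;
}

/*
 * Deref request: if a setref is in progress the clear is deferred,
 * but success (0) is still reported to the sender.
 */
static int sim_deref(struct sim_lockres *res, int node)
{
    if (res->setref_inprog) {
        res->deref_queued = true;
        res->deref_node = node;
    } else {
        res->refmap &= ~(1u << node);
    }
    return 0;
}

/* Deferred deref work: clears the bit whenever it finally runs. */
static void sim_run_deref_work(struct sim_lockres *res)
{
    if (res->deref_queued) {
        res->refmap &= ~(1u << res->deref_node);
        res->deref_queued = false;
    }
}

/* Replays the timeline; returns true if the end state is inconsistent. */
static bool sim_replay_race(void)
{
    struct sim_lockres res = {0};
    const int node1 = 1;
    bool node1_has_lockres;

    /* Node 1 sends a master request; the master sets its refmap bit. */
    sim_master_request(&res, node1);
    node1_has_lockres = true;

    /* Node 1 unlocks and purges; the master only queues the deref. */
    if (sim_deref(&res, node1) == 0)
        node1_has_lockres = false;

    /* Node 1 masters the lockres again before the queued work runs. */
    sim_master_request(&res, node1);
    node1_has_lockres = true;

    /* The stale deref work now clears the freshly set bit. */
    sim_run_deref_work(&res);

    /* Node 1 still uses the lockres, yet the master sees no reference. */
    return node1_has_lockres && !(res.refmap & (1u << node1));
}
```

In this model sim_replay_race() ends with Node 1 holding the lock
resource while its bit in the master's refmap is clear, which is the
state in which the master is free to purge.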

We have discussed two solutions:
1) The master returns an error to Node 1 if DLM_LOCK_RES_SETREF_INPROG
is set. Node 1 does not retry; instead the master sends another message
to Node 1 after clearing the refmap. Node 1 can purge the lock resource
only after the refmap on the master is cleared.
2) The master returns an error to Node 1 if DLM_LOCK_RES_SETREF_INPROG
is set, and Node 1 retries the deref of the lockres.
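
As a rough illustration of solution 2 only, the sketch below models a
master that refuses the deref while a setref is in progress and a Node 1
that retries until the refmap bit is really cleared. It is a user-space
model under stated assumptions, not a proposed patch: the sim2_* names,
the SIM_DEREF_* reply codes and the busy_rounds parameter are all
hypothetical, and the point at which the in-progress state ends is
simulated rather than taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reply codes for the deref request. */
enum { SIM_DEREF_DONE = 0, SIM_DEREF_BUSY = 1 };

/* Hypothetical user-space model of the lock resource on the master. */
struct sim_lockres2 {
    uint32_t refmap;     /* bit n set: node n holds a reference */
    bool setref_inprog;  /* stands in for DLM_LOCK_RES_SETREF_INPROG */
};

static void sim2_master_request(struct sim_lockres2 *res, int node)
{
    res->setref_inprog = true;
    res->refmap |= 1u << node;
}

/* Models the master finishing the in-progress refmap set. */
static void sim2_setref_done(struct sim_lockres2 *res)
{
    res->setref_inprog = false;
}

/* Solution 2: refuse the deref instead of deferring it. */
static int sim2_deref(struct sim_lockres2 *res, int node)
{
    if (res->setref_inprog)
        return SIM_DEREF_BUSY;
    res->refmap &= ~(1u << node);
    return SIM_DEREF_DONE;
}

/*
 * Node 1 retries until the master accepts the deref. busy_rounds (>= 1)
 * is how many attempts are refused before the simulated setref finishes.
 * Returns the total number of attempts.
 */
static int sim2_purge_with_retry(struct sim_lockres2 *res, int node,
                                 int busy_rounds)
{
    int attempts = 0;

    for (;;) {
        attempts++;
        if (sim2_deref(res, node) == SIM_DEREF_DONE)
            return attempts;
        if (attempts >= busy_rounds)
            sim2_setref_done(res);
    }
}

/* Returns true if the refmap matches Node 1's view at every step. */
static bool sim2_replay(int busy_rounds, int *attempts_out)
{
    struct sim_lockres2 res = {0};
    const int node1 = 1;
    bool cleared_on_purge;

    sim2_master_request(&res, node1);
    *attempts_out = sim2_purge_with_retry(&res, node1, busy_rounds);

    /* The purge completes only once the bit is actually cleared. */
    cleared_on_purge = !(res.refmap & (1u << node1));

    /* A later master request sets the bit again; nothing stale remains. */
    sim2_master_request(&res, node1);
    return cleared_on_purge && (res.refmap & (1u << node1)) != 0;
}
```

Because no deref is left queued on the master in this model, a later
master request from Node 1 cannot be undone by stale work. The cost is
extra round trips for Node 1 while the setref is in progress.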

Does anybody have a better idea?

Thanks,
--Jiufei
