All of lore.kernel.org
 help / color / mirror / Atom feed
From: Srinivas Eeda <srinivas.eeda@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix recovery hung
Date: Wed, 19 Feb 2014 21:23:11 -0800	[thread overview]
Message-ID: <5305913F.5000707@oracle.com> (raw)
In-Reply-To: <1392798631-16510-1-git-send-email-junxiao.bi@oracle.com>

nice catch!
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>

On 02/19/2014 12:30 AM, Junxiao Bi wrote:
> There is a race window in dlm_do_recovery() between dlm_remaster_locks()
> and dlm_reset_recovery() when the recovery master nearly finish the recovery
> process for a dead node. After the master sends FINALIZE_RECO message in
> dlm_remaster_locks(), another node may become the recovery master for another
> dead node, and then send the BEGIN_RECO message to all the nodes included the
> old master, in the handler of this message dlm_begin_reco_handler() of old master,
> dlm->reco.dead_node and dlm->reco.new_master will be set to the second dead
> node and the new master, then in dlm_reset_recovery(), these two variables
> will be reset to default value. This will cause new recovery master can not finish
> the recovery process and hung, at last the whole cluster will hung for recovery.
>
> old recovery master:                                 new recovery master:
> dlm_remaster_locks()
>                                                    become recovery master for
>                                                    another dead node.
>                                                    dlm_send_begin_reco_message()
> dlm_begin_reco_handler()
> {
>   if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) {
>    return -EAGAIN;
>   }
>   dlm_set_reco_master(dlm, br->node_idx);
>   dlm_set_reco_dead_node(dlm, br->dead_node);
> }
> dlm_reset_recovery()
> {
>   dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM);
>   dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM);
> }
>                                                    will hung in dlm_remaster_locks() for
>                                                    request dlm locks info
>
> Before send FINALIZE_RECO message, recovery master should set DLM_RECO_STATE_FINALIZE
> for itself and clear it after the recovery done, this can break the race windows as
> the BEGIN_RECO messages will not be handled before DLM_RECO_STATE_FINALIZE flag is
> cleared.
>
> A similar race may happen between new recovery master and normal node which is in
> dlm_finalize_reco_handler(), also fix it.
>
> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
> ---
>   fs/ocfs2/dlm/dlmrecovery.c |   11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 7035af0..fe58e8b 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -537,7 +537,10 @@ master_here:
>   		/* success!  see if any other nodes need recovery */
>   		mlog(0, "DONE mastering recovery of %s:%u here(this=%u)!\n",
>   		     dlm->name, dlm->reco.dead_node, dlm->node_num);
> -		dlm_reset_recovery(dlm);
> +		spin_lock(&dlm->spinlock);
> +		__dlm_reset_recovery(dlm);
> +		dlm->reco.state &= ~DLM_RECO_STATE_FINALIZE;
> +		spin_unlock(&dlm->spinlock);
>   	}
>   	dlm_end_recovery(dlm);
>   
> @@ -695,6 +698,10 @@ static int dlm_remaster_locks(struct dlm_ctxt *dlm, u8 dead_node)
>   		if (all_nodes_done) {
>   			int ret;
>   
> +			spin_lock(&dlm->spinlock);
> +			dlm->reco.state |= DLM_RECO_STATE_FINALIZE;
> +			spin_unlock(&dlm->spinlock);
> +
>   			/* all nodes are now in DLM_RECO_NODE_DATA_DONE state
>   	 		 * just send a finalize message to everyone and
>   	 		 * clean up */
> @@ -2882,8 +2889,8 @@ int dlm_finalize_reco_handler(struct o2net_msg *msg, u32 len, void *data,
>   				BUG();
>   			}
>   			dlm->reco.state &= ~DLM_RECO_STATE_FINALIZE;
> +			__dlm_reset_recovery(dlm);
>   			spin_unlock(&dlm->spinlock);
> -			dlm_reset_recovery(dlm);
>   			dlm_kick_recovery_thread(dlm);
>   			break;
>   		default:

  parent reply	other threads:[~2014-02-20  5:23 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-19  8:30 [Ocfs2-devel] [PATCH] ocfs2: dlm: fix recovery hung Junxiao Bi
2014-02-19 10:10 ` Wengang
2014-02-20  5:23 ` Srinivas Eeda [this message]
  -- strict thread matches above, loose matches on Subject: below --
2014-02-19 13:44 Junxiao Bi
2014-02-20  2:32 ` Wengang
2014-02-20  5:51   ` Junxiao Bi
2014-02-20  5:51     ` Wengang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5305913F.5000707@oracle.com \
    --to=srinivas.eeda@oracle.com \
    --cc=ocfs2-devel@oss.oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.