[Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres


From: piaojun <piaojun@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres
Date: Wed, 1 Nov 2017 13:52:54 +0800
Message-ID: <59F96136.3070307@huawei.com>
In-Reply-To: <63ADC13FD55D6546B7DECE290D39E373CED73303@H3CMLB14-EX.srv.huawei-3com.com>

Hi Changwei,

On 2017/11/1 10:47, Changwei Ge wrote:
> Hi Jun,
> 
> Thanks for reporting.
> I am very interested in this issue. But first of all, I want to make 
> sure I understand it, so that I can provide some comments.
> 
> 
> On 2017/11/1 9:16, piaojun wrote:
>> Wait for dlm recovery to finish when migrating all lockres, in case a
>> new lockres is left behind after leaving the dlm domain.
> 
> What do you mean by 'a new lock resource to be left after leaving 
> domain'? Does it mean we leak a dlm lock resource if the situation below happens?
> 
The new lockres is the one collected by NodeA while it is doing recovery
for NodeB. Yes, that lockres is indeed leaked.
>>
>>        NodeA                       NodeB                NodeC
>>
>> umount and migrate
>> all lockres
>>
>>                                   node down
>>
>> do recovery for NodeB
>> and collect a new lockres
>> from other live nodes
> 
> You mean a lock resource whose owner was NodeB is just migrated from 
> other cluster member nodes?
> 
Yes, exactly.
>>
>> leave domain but the
>> new lockres remains
>>
>>                                                    mount and join domain
>>
>>                                                    request for the owner
>>                                                    of the new lockres, but
>>                                                    all the other nodes said
>>                                                    'NO', so NodeC decides to
>>                                                    be the owner, and sends an
>>                                                    assert msg to other nodes.
>>
>>                                                    other nodes receive the msg
>>                                                    and find that two masters exist,
>>                                                    which finally triggers BUG in
>>                                                    dlm_assert_master_handler()
>>                                                    -->BUG();
> 
> If this issue truly exists, can we handle it in 
> dlm_exit_domain_handler? Or perhaps we should kick dlm's work queue 
> before migrating all lock resources.
> 
Once NodeA has entered dlm_leave_domain(), we can hardly go back to
migrating lockres. That approach would probably need more work.
>>
>> Fixes: bc9838c4d44a ("dlm: allow dlm do recovery during shutdown")
>>
>> Signed-off-by: Jun Piao <piaojun@huawei.com>
>> Reviewed-by: Alex Chen <alex.chen@huawei.com>
>> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
>> ---
>>   fs/ocfs2/dlm/dlmcommon.h   |  1 +
>>   fs/ocfs2/dlm/dlmdomain.c   | 14 ++++++++++++++
>>   fs/ocfs2/dlm/dlmrecovery.c | 12 +++++++++---
>>   3 files changed, 24 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
>> index e9f3705..999ab7d 100644
>> --- a/fs/ocfs2/dlm/dlmcommon.h
>> +++ b/fs/ocfs2/dlm/dlmcommon.h
>> @@ -140,6 +140,7 @@ struct dlm_ctxt
>>   	u8 node_num;
>>   	u32 key;
>>   	u8  joining_node;
>> +	u8 migrate_done; /* set to 1 means node has migrated all lockres */
>>   	wait_queue_head_t dlm_join_events;
>>   	unsigned long live_nodes_map[BITS_TO_LONGS(O2NM_MAX_NODES)];
>>   	unsigned long domain_map[BITS_TO_LONGS(O2NM_MAX_NODES)];
>> diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
>> index e1fea14..98a8f56 100644
>> --- a/fs/ocfs2/dlm/dlmdomain.c
>> +++ b/fs/ocfs2/dlm/dlmdomain.c
>> @@ -461,6 +461,18 @@ static int dlm_migrate_all_locks(struct dlm_ctxt *dlm)
>>   		cond_resched_lock(&dlm->spinlock);
>>   		num += n;
>>   	}
>> +
>> +	if (!num) {
>> +		if (dlm->reco.state & DLM_RECO_STATE_ACTIVE) {
>> +			mlog(0, "%s: perhaps there are more lock resources need to "
>> +					"be migrated after dlm recovery\n", dlm->name);
> 
> If dlm is marked with DLM_RECO_STATE_ACTIVE, then a lock resource must 
> already be marked with DLM_LOCK_RES_RECOVERING, which can't be migrated. 
> So the code will goto redo_bucket in dlm_migrate_all_locks().
> So I don't understand why this check is added here.
> 
> 
> 
Because migration has already finished before recovery starts. The check
here guards against a recovery that may still follow.
>> +			ret = -EAGAIN;
>> +		} else {
>> +			mlog(0, "%s: we won't do dlm recovery after migrating all lockres\n",
>> +					dlm->name);
>> +			dlm->migrate_done = 1;
>> +		}
>> +	}
>>   	spin_unlock(&dlm->spinlock);
>>   	wake_up(&dlm->dlm_thread_wq);
>>
>> @@ -2052,6 +2064,8 @@ static struct dlm_ctxt *dlm_alloc_ctxt(const char *domain,
>>   	dlm->joining_node = DLM_LOCK_RES_OWNER_UNKNOWN;
>>   	init_waitqueue_head(&dlm->dlm_join_events);
>>
>> +	dlm->migrate_done = 0;
>> +
>>   	dlm->reco.new_master = O2NM_INVALID_NODE_NUM;
>>   	dlm->reco.dead_node = O2NM_INVALID_NODE_NUM;
>>
>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>> index 74407c6..3106332 100644
>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>> @@ -423,12 +423,11 @@ void dlm_wait_for_recovery(struct dlm_ctxt *dlm)
>>
>>   static void dlm_begin_recovery(struct dlm_ctxt *dlm)
>>   {
>> -	spin_lock(&dlm->spinlock);
>> +	assert_spin_locked(&dlm->spinlock);
>>   	BUG_ON(dlm->reco.state & DLM_RECO_STATE_ACTIVE);
>>   	printk(KERN_NOTICE "o2dlm: Begin recovery on domain %s for node %u\n",
>>   	       dlm->name, dlm->reco.dead_node);
>>   	dlm->reco.state |= DLM_RECO_STATE_ACTIVE;
>> -	spin_unlock(&dlm->spinlock);
>>   }
>>
>>   static void dlm_end_recovery(struct dlm_ctxt *dlm)
>> @@ -456,6 +455,12 @@ static int dlm_do_recovery(struct dlm_ctxt *dlm)
>>
>>   	spin_lock(&dlm->spinlock);
>>
>> +	if (dlm->migrate_done) {
>> +		mlog(0, "%s: no need do recovery after migrating all lockres\n",
>> +				dlm->name);
> 
> Don't we need to release the spin_lock taken above before returning?
> 
> And if we just return here, how can a dlm lock resource clear its 
> RECOVERING flag? I suppose this may cause a cluster hang.
> 
> I have also cc'ed the ocfs2 maintainers.
> 
> Thanks,
> Changwei
> 
Oh, good catch. I missed the spin_unlock(&dlm->spinlock) before the return.
>> +		return 0;
>> +	}
>> +
>>   	/* check to see if the new master has died */
>>   	if (dlm->reco.new_master != O2NM_INVALID_NODE_NUM &&
>>   	    test_bit(dlm->reco.new_master, dlm->recovery_map)) {
>> @@ -490,12 +495,13 @@ static int dlm_do_recovery(struct dlm_ctxt *dlm)
>>   	mlog(0, "%s(%d):recovery thread found node %u in the recovery map!\n",
>>   	     dlm->name, task_pid_nr(dlm->dlm_reco_thread_task),
>>   	     dlm->reco.dead_node);
>> -	spin_unlock(&dlm->spinlock);
>>
>>   	/* take write barrier */
>>   	/* (stops the list reshuffling thread, proxy ast handling) */
>>   	dlm_begin_recovery(dlm);
>>
>> +	spin_unlock(&dlm->spinlock);
>> +
>>   	if (dlm->reco.new_master == dlm->node_num)
>>   		goto master_here;
>>
> 
> .
>
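To make the two checks discussed above easier to follow, here is a minimal user-space sketch with the missing unlock added. It is only a model, not kernel code: struct ctxt, ctxt_lock(), ctxt_unlock(), migrate_all_locks_tail() and do_recovery_entry() are hypothetical stand-ins for dlm_ctxt, the dlm->spinlock calls, the end of dlm_migrate_all_locks() and the start of dlm_do_recovery(). Returning -EAGAIN for a non-zero num is an assumption about the existing code path, not something shown in the diff. It does not address the RECOVERING flag concern raised above.

```c
#include <assert.h>
#include <errno.h>

#define RECO_STATE_ACTIVE 0x1	/* stand-in for DLM_RECO_STATE_ACTIVE */

/* Hypothetical, reduced stand-in for struct dlm_ctxt. */
struct ctxt {
	int lock_held;			/* models dlm->spinlock: 1 while held */
	unsigned int reco_state;	/* models dlm->reco.state */
	unsigned char migrate_done;	/* 1 once all lockres are migrated */
};

static void ctxt_lock(struct ctxt *c)
{
	assert(!c->lock_held);		/* taking it twice would deadlock */
	c->lock_held = 1;
}

static void ctxt_unlock(struct ctxt *c)
{
	assert(c->lock_held);		/* must not unlock an unheld lock */
	c->lock_held = 0;
}

/*
 * Models the end of dlm_migrate_all_locks(). 'num' is the number of
 * lockres that still had to be handled in this pass.
 */
static int migrate_all_locks_tail(struct ctxt *c, int num)
{
	int ret = 0;

	ctxt_lock(c);
	if (num) {
		ret = -EAGAIN;		/* assumption: caller retries */
	} else if (c->reco_state & RECO_STATE_ACTIVE) {
		ret = -EAGAIN;		/* recovery may still add lockres */
	} else {
		c->migrate_done = 1;	/* recovery is refused from now on */
	}
	ctxt_unlock(c);
	return ret;
}

/* Models the entry of dlm_do_recovery(); returns 1 if recovery began. */
static int do_recovery_entry(struct ctxt *c)
{
	ctxt_lock(c);
	if (c->migrate_done) {
		ctxt_unlock(c);		/* the unlock missing in the posted patch */
		return 0;
	}
	c->reco_state |= RECO_STATE_ACTIVE;	/* as dlm_begin_recovery() does */
	ctxt_unlock(c);
	return 1;
}
```

With reco_state set to RECO_STATE_ACTIVE, migrate_all_locks_tail(&c, 0) returns -EAGAIN and leaves migrate_done at 0. Once recovery is no longer active it returns 0 and sets migrate_done, after which do_recovery_entry() returns 0 with the lock released and without marking recovery active.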

Thread overview: 11+ messages
2017-11-01  1:14 [Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres piaojun
2017-11-01  2:47 ` Changwei Ge
2017-11-01  5:52   ` piaojun [this message]
2017-11-01  7:13     ` Changwei Ge
2017-11-01  7:56       ` piaojun
2017-11-01  8:11         ` Changwei Ge
2017-11-01  8:45           ` piaojun
2017-11-01  9:00             ` Changwei Ge
2017-11-02  1:42               ` piaojun
2017-11-02  1:56                 ` Changwei Ge
2017-11-03  1:01                   ` piaojun
