From: piaojun <piaojun@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2/dlm: don't handle migrate lockres if already in shutdown
Date: Thu, 1 Mar 2018 20:37:50 +0800
Message-ID: <5A97F41E.5090800@huawei.com>
In-Reply-To: <63ADC13FD55D6546B7DECE290D39E373F292B279@H3CMLB12-EX.srv.huawei-3com.com>

Hi Changwei,

Thanks for your quick reply. Please see my comments below.

On 2018/3/1 17:39, Changwei Ge wrote:
> Hi Jun,
> 
> On 2018/3/1 17:27, piaojun wrote:
>> We should not handle migrate lockres if we are already in
>> 'DLM_CTXT_IN_SHUTDOWN', as that will leave lockres behind after
>> leaving the dlm domain. Eventually other nodes will get stuck in an
>> infinite loop when requesting a lock from us.
>>
>>      N1                             N2 (owner)
>>                                     touch file
>>
>> access the file,
>> and get pr lock
>>
>> umount
>>
> 
> Before migrating all lock resources, N1 should have already sent 
> DLM_BEGIN_EXIT_DOMAIN_MSG in dlm_begin_exit_domain().
> N2 will set ->exit_domain_map later.
> So N2 can't take N1 as migration target.
Before receiving N1's DLM_BEGIN_EXIT_DOMAIN_MSG, N2 has already picked N1 as
the migration target, so N2 will continue sending lockres to N1 even though
N1 has left the domain. Sorry for the confusion; I will give a more detailed
description.

    N1                             N2 (owner)
                                   touch file

access the file,
and get pr lock

                                   begin leave domain and
                                   pick up N1 as new owner

begin leave domain and
migrate all lockres done

                                   begin migrate lockres to N1

end leave domain, but
the lockres is left behind
unexpectedly, because the
migrate pass has already run
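
The timeline above can be replayed with a minimal user-space sketch. It is a model under stated assumptions, not kernel code: `struct model_dlm` and every `model_*` function are hypothetical stand-ins, and only the state names and the `DLM_CTXT_JOINED` check are taken from the patch. The kernel version performs the check under `dlm_domain_lock`, which this single-threaded model omits.

```c
#include <assert.h>
#include <errno.h>

/* State names follow fs/ocfs2/dlm; the ordering here is an assumption. */
enum dlm_ctxt_state {
	DLM_CTXT_NEW = 0,
	DLM_CTXT_JOINED,
	DLM_CTXT_IN_SHUTDOWN,
	DLM_CTXT_LEAVING,
};

/* Hypothetical, heavily simplified stand-in for struct dlm_ctxt. */
struct model_dlm {
	enum dlm_ctxt_state state;
	int nr_lockres;		/* lockres currently held by this node */
};

static void model_init(struct model_dlm *dlm, int nr_lockres)
{
	assert(nr_lockres >= 0);
	dlm->state = DLM_CTXT_JOINED;
	dlm->nr_lockres = nr_lockres;
}

/* Same test as dlm_joined() in the patch, without the spinlock. */
static int model_joined(const struct model_dlm *dlm)
{
	return dlm->state == DLM_CTXT_JOINED;
}

/* N1: begin leaving the domain and migrate away everything it holds. */
static void model_begin_leave(struct model_dlm *dlm)
{
	dlm->state = DLM_CTXT_IN_SHUTDOWN;
	dlm->nr_lockres = 0;
}

/*
 * Incoming migration from N2. With guarded != 0 the request is refused
 * unless the domain is still joined, mirroring the check the patch adds
 * to dlm_mig_lockres_handler().
 */
static int model_mig_lockres_handler(struct model_dlm *dlm, int guarded)
{
	if (guarded && !model_joined(dlm))
		return -EINVAL;

	dlm->nr_lockres++;
	return 0;
}

/* Replay the timeline; return how many lockres remain on N1 at the end. */
int model_replay(int guarded)
{
	struct model_dlm n1;

	model_init(&n1, 1);			/* N1 holds the PR lock */
	model_begin_leave(&n1);			/* N1 migrates all lockres */
	model_mig_lockres_handler(&n1, guarded);/* N2 migrates to N1 late */

	return n1.nr_lockres;
}
```

Calling `model_replay(0)` models the unpatched handler and leaves one lockres on N1 after its own migration pass; `model_replay(1)` models the patched handler, which refuses the late migration so nothing is left behind.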

thanks,
Jun
> 
> How can the scenario your changelog describes happen?
> Or am I missing something?
> 
> Thanks,
> Changwei
> 
>> migrate all lockres
>>
>>                                     umount and migrate lockres to N1
>>
>> leave dlm domain, but
>> the lockres is left behind
>> unexpectedly, because the
>> migrate pass has already run
>>
>> Signed-off-by: Jun Piao <piaojun@huawei.com>
>> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
>> ---
>>   fs/ocfs2/dlm/dlmdomain.c   | 14 ++++++++++++++
>>   fs/ocfs2/dlm/dlmdomain.h   |  1 +
>>   fs/ocfs2/dlm/dlmrecovery.c |  9 +++++++++
>>   3 files changed, 24 insertions(+)
>>
>> diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
>> index e1fea14..3b7ec51 100644
>> --- a/fs/ocfs2/dlm/dlmdomain.c
>> +++ b/fs/ocfs2/dlm/dlmdomain.c
>> @@ -675,6 +675,20 @@ static void dlm_leave_domain(struct dlm_ctxt *dlm)
>>   	spin_unlock(&dlm->spinlock);
>>   }
>>
>> +int dlm_joined(struct dlm_ctxt *dlm)
>> +{
>> +	int ret = 0;
>> +
>> +	spin_lock(&dlm_domain_lock);
>> +
>> +	if (dlm->dlm_state == DLM_CTXT_JOINED)
>> +		ret = 1;
>> +
>> +	spin_unlock(&dlm_domain_lock);
>> +
>> +	return ret;
>> +}
>> +
>>   int dlm_shutting_down(struct dlm_ctxt *dlm)
>>   {
>>   	int ret = 0;
>> diff --git a/fs/ocfs2/dlm/dlmdomain.h b/fs/ocfs2/dlm/dlmdomain.h
>> index fd6122a..2f7f60b 100644
>> --- a/fs/ocfs2/dlm/dlmdomain.h
>> +++ b/fs/ocfs2/dlm/dlmdomain.h
>> @@ -28,6 +28,7 @@
>>   extern spinlock_t dlm_domain_lock;
>>   extern struct list_head dlm_domains;
>>
>> +int dlm_joined(struct dlm_ctxt *dlm);
>>   int dlm_shutting_down(struct dlm_ctxt *dlm);
>>   void dlm_fire_domain_eviction_callbacks(struct dlm_ctxt *dlm,
>>   					int node_num);
>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>> index ec8f758..9b3bc66 100644
>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>> @@ -1378,6 +1378,15 @@ int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
>>   	if (!dlm_grab(dlm))
>>   		return -EINVAL;
>>
>> +	if (!dlm_joined(dlm)) {
>> +		mlog(ML_ERROR, "Domain %s not joined! "
>> +				"lockres %.*s, master %u\n",
>> +				dlm->name, mres->lockname_len,
>> +				mres->lockname, mres->master);
>> +		dlm_put(dlm);
>> +		return -EINVAL;
>> +	}
>> +
>>   	BUG_ON(!(mres->flags & (DLM_MRES_RECOVERY|DLM_MRES_MIGRATION)));
>>
>>   	real_master = mres->master;
>>
> .
> 
