All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
@ 2017-12-21  0:55 alex chen
  2017-12-21  1:30 ` Joseph Qi
  0 siblings, 1 reply; 6+ messages in thread
From: alex chen @ 2017-12-21  0:55 UTC (permalink / raw)
  To: ocfs2-devel

In dlm_reset_mleres_owner(), we will lock
dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
which breaks the spinlock lock ordering:
 dlm_domain_lock
 struct dlm_ctxt->spinlock
 struct dlm_lock_resource->spinlock
 struct dlm_ctxt->master_lock

Fix it by unlocking dlm_ctxt->master_lock before locking
dlm_lock_resource->spinlock and restarting to clean master list.

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
---
 fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 3e04279..d83ccdc 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
 {
 	struct dlm_lock_resource *res;

+	assert_spin_locked(&dlm->spinlock);
+	assert_spin_locked(&dlm->master_lock);
+
 	/* Find the lockres associated to the mle and set its owner to UNK */
-	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
+	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
 				   mle->mnamehash);
 	if (res) {
 		spin_unlock(&dlm->master_lock);

-		/* move lockres onto recovery list */
 		spin_lock(&res->spinlock);
-		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
-		dlm_move_lockres_to_recovery_list(dlm, res);
+		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
+			/* move lockres onto recovery list */
+			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
+			dlm_move_lockres_to_recovery_list(dlm, res);
+		}
+
 		spin_unlock(&res->spinlock);
 		dlm_lockres_put(res);

-- 
1.9.5.msysgit.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
  2017-12-21  0:55 [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner() alex chen
@ 2017-12-21  1:30 ` Joseph Qi
  2017-12-21  6:11   ` alex chen
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Qi @ 2017-12-21  1:30 UTC (permalink / raw)
  To: ocfs2-devel

Hi Alex,

On 17/12/21 08:55, alex chen wrote:
> In dlm_reset_mleres_owner(), we will lock
> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
> which breaks the spinlock lock ordering:
>  dlm_domain_lock
>  struct dlm_ctxt->spinlock
>  struct dlm_lock_resource->spinlock
>  struct dlm_ctxt->master_lock
> 
> Fix it by unlocking dlm_ctxt->master_lock before locking
> dlm_lock_resource->spinlock and restarting to clean master list.
> 
> Signed-off-by: Alex Chen <alex.chen@huawei.com>
> Reviewed-by: Jun Piao <piaojun@huawei.com>
> ---
>  fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> index 3e04279..d83ccdc 100644
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>  {
>  	struct dlm_lock_resource *res;
> 
> +	assert_spin_locked(&dlm->spinlock);
> +	assert_spin_locked(&dlm->master_lock);
> +
>  	/* Find the lockres associated to the mle and set its owner to UNK */
> -	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
> +	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>  				   mle->mnamehash);
>  	if (res) {
>  		spin_unlock(&dlm->master_lock);
> 
> -		/* move lockres onto recovery list */
>  		spin_lock(&res->spinlock);
> -		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
> -		dlm_move_lockres_to_recovery_list(dlm, res);
> +		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
> +			/* move lockres onto recovery list */
> +			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
> +			dlm_move_lockres_to_recovery_list(dlm, res);
> +		}
> +
I don't think this change is lock re-ordering *only*. It definitely
changes the logic of resetting mle resource owner.
Why do you detach mle from heartbeat if lock resource is in the process
of dropping its mastery reference? And why we have to restart in this
case?

Thanks,
Joseph

>  		spin_unlock(&res->spinlock);
>  		dlm_lockres_put(res);
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
  2017-12-21  1:30 ` Joseph Qi
@ 2017-12-21  6:11   ` alex chen
  2017-12-22  2:34     ` Changwei Ge
  0 siblings, 1 reply; 6+ messages in thread
From: alex chen @ 2017-12-21  6:11 UTC (permalink / raw)
  To: ocfs2-devel

Hi Joseph,

On 2017/12/21 9:30, Joseph Qi wrote:
> Hi Alex,
> 
> On 17/12/21 08:55, alex chen wrote:
>> In dlm_reset_mleres_owner(), we will lock
>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>> which breaks the spinlock lock ordering:
>>  dlm_domain_lock
>>  struct dlm_ctxt->spinlock
>>  struct dlm_lock_resource->spinlock
>>  struct dlm_ctxt->master_lock
>>
>> Fix it by unlocking dlm_ctxt->master_lock before locking
>> dlm_lock_resource->spinlock and restarting to clean master list.
>>
>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>> ---
>>  fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>> index 3e04279..d83ccdc 100644
>> --- a/fs/ocfs2/dlm/dlmmaster.c
>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>>  {
>>  	struct dlm_lock_resource *res;
>>
>> +	assert_spin_locked(&dlm->spinlock);
>> +	assert_spin_locked(&dlm->master_lock);
>> +
>>  	/* Find the lockres associated to the mle and set its owner to UNK */
>> -	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
>> +	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>>  				   mle->mnamehash);
>>  	if (res) {
>>  		spin_unlock(&dlm->master_lock);
>>
>> -		/* move lockres onto recovery list */
>>  		spin_lock(&res->spinlock);
>> -		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>> -		dlm_move_lockres_to_recovery_list(dlm, res);
>> +		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>> +			/* move lockres onto recovery list */
>> +			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>> +			dlm_move_lockres_to_recovery_list(dlm, res);
>> +		}
>> +
> I don't think this change is lock re-ordering *only*. It definitely
> changes the logic of resetting mle resource owner.
> Why do you detach mle from heartbeat if lock resource is in the process
> of dropping its mastery reference? And why we have to restart in this
> case?
I think if the lock resource is being purge we don't need to set its owner to UNKNOWN and
it is the same as the original logic. We should drop the master lock if we want to judge
if the state of the lock resource is DLM_LOCK_RES_DROPPING_REF. Once we drop the master lock
we should restart to clean master list.
Here the mle is not useful and will be released, so we detach it from heartbeat.
In fact, the mle has been detached in dlm_clean_migration_mle().

Thanks,
Alex
> 
> Thanks,
> Joseph
> 
>>  		spin_unlock(&res->spinlock);
>>  		dlm_lockres_put(res);
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
  2017-12-21  6:11   ` alex chen
@ 2017-12-22  2:34     ` Changwei Ge
  2017-12-22  3:17       ` Joseph Qi
  0 siblings, 1 reply; 6+ messages in thread
From: Changwei Ge @ 2017-12-22  2:34 UTC (permalink / raw)
  To: ocfs2-devel

On 2017/12/21 14:36, alex chen wrote:
> Hi Joseph,
> 
> On 2017/12/21 9:30, Joseph Qi wrote:
>> Hi Alex,
>>
>> On 17/12/21 08:55, alex chen wrote:
>>> In dlm_reset_mleres_owner(), we will lock
>>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>>> which breaks the spinlock lock ordering:
>>>   dlm_domain_lock
>>>   struct dlm_ctxt->spinlock
>>>   struct dlm_lock_resource->spinlock
>>>   struct dlm_ctxt->master_lock
>>>
>>> Fix it by unlocking dlm_ctxt->master_lock before locking
>>> dlm_lock_resource->spinlock and restarting to clean master list.
>>>
>>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>>> ---
>>>   fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>>> index 3e04279..d83ccdc 100644
>>> --- a/fs/ocfs2/dlm/dlmmaster.c
>>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>>> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>>>   {
>>>   	struct dlm_lock_resource *res;
>>>
>>> +	assert_spin_locked(&dlm->spinlock);
>>> +	assert_spin_locked(&dlm->master_lock);
>>> +
>>>   	/* Find the lockres associated to the mle and set its owner to UNK */
>>> -	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
>>> +	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>>>   				   mle->mnamehash);
>>>   	if (res) {
>>>   		spin_unlock(&dlm->master_lock);
>>>
>>> -		/* move lockres onto recovery list */
>>>   		spin_lock(&res->spinlock);
>>> -		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>> -		dlm_move_lockres_to_recovery_list(dlm, res);
>>> +		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>>> +			/* move lockres onto recovery list */
>>> +			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>> +			dlm_move_lockres_to_recovery_list(dlm, res);
>>> +		}
>>> +
>> I don't think this change is lock re-ordering *only*. It definitely
>> changes the logic of resetting mle resource owner.
>> Why do you detach mle from heartbeat if lock resource is in the process
>> of dropping its mastery reference? And why we have to restart in this
>> case?
> I think if the lock resource is being purge we don't need to set its owner to UNKNOWN and
> it is the same as the original logic. We should drop the master lock if we want to judge
> if the state of the lock resource is DLM_LOCK_RES_DROPPING_REF. Once we drop the master lock
> we should restart to clean master list.
> Here the mle is not useful and will be released, so we detach it from heartbeat.
> In fact, the mle has been detached in dlm_clean_migration_mle().
Hi Alex,

Perhaps, you can just judge if lock resource is marked  DLM_LOCK_RES_DROPPING_REF and if so directly
return NULL with ::master_lock locked :)

Thanks,
Changwei

> 
> Thanks,
> Alex
>>
>> Thanks,
>> Joseph
>>
>>>   		spin_unlock(&res->spinlock);
>>>   		dlm_lockres_put(res);
>>>
>>
>> .
>>
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
  2017-12-22  2:34     ` Changwei Ge
@ 2017-12-22  3:17       ` Joseph Qi
  2017-12-22  6:19         ` alex chen
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Qi @ 2017-12-22  3:17 UTC (permalink / raw)
  To: ocfs2-devel



On 17/12/22 10:34, Changwei Ge wrote:
> On 2017/12/21 14:36, alex chen wrote:
>> Hi Joseph,
>>
>> On 2017/12/21 9:30, Joseph Qi wrote:
>>> Hi Alex,
>>>
>>> On 17/12/21 08:55, alex chen wrote:
>>>> In dlm_reset_mleres_owner(), we will lock
>>>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>>>> which breaks the spinlock lock ordering:
>>>>   dlm_domain_lock
>>>>   struct dlm_ctxt->spinlock
>>>>   struct dlm_lock_resource->spinlock
>>>>   struct dlm_ctxt->master_lock
>>>>
>>>> Fix it by unlocking dlm_ctxt->master_lock before locking
>>>> dlm_lock_resource->spinlock and restarting to clean master list.
>>>>
>>>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>>>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>>>> ---
>>>>   fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>>>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>>>> index 3e04279..d83ccdc 100644
>>>> --- a/fs/ocfs2/dlm/dlmmaster.c
>>>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>>>> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>>>>   {
>>>>   	struct dlm_lock_resource *res;
>>>>
>>>> +	assert_spin_locked(&dlm->spinlock);
>>>> +	assert_spin_locked(&dlm->master_lock);
>>>> +
>>>>   	/* Find the lockres associated to the mle and set its owner to UNK */
>>>> -	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
>>>> +	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>>>>   				   mle->mnamehash);
>>>>   	if (res) {
>>>>   		spin_unlock(&dlm->master_lock);
>>>>
>>>> -		/* move lockres onto recovery list */
>>>>   		spin_lock(&res->spinlock);
>>>> -		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>>> -		dlm_move_lockres_to_recovery_list(dlm, res);
>>>> +		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>>>> +			/* move lockres onto recovery list */
>>>> +			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>>> +			dlm_move_lockres_to_recovery_list(dlm, res);
>>>> +		}
>>>> +
>>> I don't think this change is lock re-ordering *only*. It definitely
>>> changes the logic of resetting mle resource owner.
>>> Why do you detach mle from heartbeat if lock resource is in the process
>>> of dropping its mastery reference? And why we have to restart in this
>>> case?
>> I think if the lock resource is being purge we don't need to set its owner to UNKNOWN and
>> it is the same as the original logic. We should drop the master lock if we want to judge
>> if the state of the lock resource is DLM_LOCK_RES_DROPPING_REF. Once we drop the master lock
>> we should restart to clean master list.
>> Here the mle is not useful and will be released, so we detach it from heartbeat.
>> In fact, the mle has been detached in dlm_clean_migration_mle().
> Hi Alex,
> 
> Perhaps, you can just judge if lock resource is marked  DLM_LOCK_RES_DROPPING_REF and if so directly
> return NULL with ::master_lock locked :)
>Umm... We can't do this with unlocking master lock first and then
re-taking it. It breaks the logic.
What my concern is the behavior change. e.g. Currently for lock resource
in the process of dropping mastery reference, we just ignore it and
continue with the next. But after this patch, we have to restart at the
beginning.

> Thanks,
> Changwei
> 
>>
>> Thanks,
>> Alex
>>>
>>> Thanks,
>>> Joseph
>>>
>>>>   		spin_unlock(&res->spinlock);
>>>>   		dlm_lockres_put(res);
>>>>
>>>
>>> .
>>>
>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
  2017-12-22  3:17       ` Joseph Qi
@ 2017-12-22  6:19         ` alex chen
  0 siblings, 0 replies; 6+ messages in thread
From: alex chen @ 2017-12-22  6:19 UTC (permalink / raw)
  To: ocfs2-devel

Hi Joseph and Changwei,

On 2017/12/22 11:17, Joseph Qi wrote:
> 
> 
> On 17/12/22 10:34, Changwei Ge wrote:
>> On 2017/12/21 14:36, alex chen wrote:
>>> Hi Joseph,
>>>
>>> On 2017/12/21 9:30, Joseph Qi wrote:
>>>> Hi Alex,
>>>>
>>>> On 17/12/21 08:55, alex chen wrote:
>>>>> In dlm_reset_mleres_owner(), we will lock
>>>>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>>>>> which breaks the spinlock lock ordering:
>>>>>   dlm_domain_lock
>>>>>   struct dlm_ctxt->spinlock
>>>>>   struct dlm_lock_resource->spinlock
>>>>>   struct dlm_ctxt->master_lock
>>>>>
>>>>> Fix it by unlocking dlm_ctxt->master_lock before locking
>>>>> dlm_lock_resource->spinlock and restarting to clean master list.
>>>>>
>>>>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>>>>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>>>>> ---
>>>>>   fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>>>>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>>>>> index 3e04279..d83ccdc 100644
>>>>> --- a/fs/ocfs2/dlm/dlmmaster.c
>>>>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>>>>> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>>>>>   {
>>>>>   	struct dlm_lock_resource *res;
>>>>>
>>>>> +	assert_spin_locked(&dlm->spinlock);
>>>>> +	assert_spin_locked(&dlm->master_lock);
>>>>> +
>>>>>   	/* Find the lockres associated to the mle and set its owner to UNK */
>>>>> -	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
>>>>> +	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>>>>>   				   mle->mnamehash);
>>>>>   	if (res) {
>>>>>   		spin_unlock(&dlm->master_lock);
>>>>>
>>>>> -		/* move lockres onto recovery list */
>>>>>   		spin_lock(&res->spinlock);
>>>>> -		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>>>> -		dlm_move_lockres_to_recovery_list(dlm, res);
>>>>> +		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>>>>> +			/* move lockres onto recovery list */
>>>>> +			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>>>> +			dlm_move_lockres_to_recovery_list(dlm, res);
>>>>> +		}
>>>>> +
>>>> I don't think this change is lock re-ordering *only*. It definitely
>>>> changes the logic of resetting mle resource owner.
>>>> Why do you detach mle from heartbeat if lock resource is in the process
>>>> of dropping its mastery reference? And why we have to restart in this
>>>> case?
>>> I think if the lock resource is being purge we don't need to set its owner to UNKNOWN and
>>> it is the same as the original logic. We should drop the master lock if we want to judge
>>> if the state of the lock resource is DLM_LOCK_RES_DROPPING_REF. Once we drop the master lock
>>> we should restart to clean master list.
>>> Here the mle is not useful and will be released, so we detach it from heartbeat.
>>> In fact, the mle has been detached in dlm_clean_migration_mle().
>> Hi Alex,
>>
>> Perhaps, you can just judge if lock resource is marked  DLM_LOCK_RES_DROPPING_REF and if so directly
>> return NULL with ::master_lock locked :)
>> Umm... We can't do this with unlocking master lock first and then
> re-taking it. It breaks the logic.
> What my concern is the behavior change. e.g. Currently for lock resource
> in the process of dropping mastery reference, we just ignore it and
> continue with the next. But after this patch, we have to restart at the
> beginning.
Before this patch, we can continue when we find the lock resource is marked  DLM_LOCK_RES_DROPPING_REF,
that is because we use a wrong spin lock ordering.
Once we drop the master lock we should restart to iterate over the master list, otherwise, we may miss some mles
which are inserted during we are dropping the master lock.

In this patch, we may increase the times to restart, but we can not avoid it for solving the deadlock.

Thanks,
Alex

> 
>> Thanks,
>> Changwei
>>
>>>
>>> Thanks,
>>> Alex
>>>>
>>>> Thanks,
>>>> Joseph
>>>>
>>>>>   		spin_unlock(&res->spinlock);
>>>>>   		dlm_lockres_put(res);
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-12-22  6:19 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-21  0:55 [Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner() alex chen
2017-12-21  1:30 ` Joseph Qi
2017-12-21  6:11   ` alex chen
2017-12-22  2:34     ` Changwei Ge
2017-12-22  3:17       ` Joseph Qi
2017-12-22  6:19         ` alex chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.