* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
@ 2011-06-08 10:04 Wengang Wang
2011-06-09 5:17 ` Srinivas Eeda
2011-06-09 17:53 ` Sunil Mushran
0 siblings, 2 replies; 12+ messages in thread
From: Wengang Wang @ 2011-06-08 10:04 UTC (permalink / raw)
To: ocfs2-devel
When we are to purge a lockres, we check if it's unused.
the check includes
1. no locks attached to the lockres.
2. not dirty.
3. not in recovery.
4. no interested nodes in refmap.
If the the above four are satisfied, we are going to purge it(remove it
from hash table).
While, when a "create lock" is in progress especially when lockres is owned
remotely(spinlocks are dropped when networking), the lockres can satisfy above
four condition. If it's put to purge list, it can be purged out from hash table
when we are still accessing on it(sending request to owner for example). That's
not what we want.
I have met such a problem (orabug 12405575).
The lockres is in the purge list already(there is a delay for real purge work)
and the create lock request comes. When we are sending network message to the
owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
value and retries dlmlock_remote(). And before the owner crash, we have purged
the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
dlmlock_remote() infinitely.
fix:
we remove the lockres from purge list if it's there in dlm_get_lock_resource()
which is called for only createlock case. So that the lockres can't be purged
when we are in progress of createlock.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
---
fs/ocfs2/dlm/dlmmaster.c | 41 +++++++++++++++++++++++++++++------------
1 files changed, 29 insertions(+), 12 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 11eefb8..511d43c 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -709,28 +709,27 @@ lookup:
spin_lock(&dlm->spinlock);
tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
if (tmpres) {
- int dropping_ref = 0;
-
- spin_unlock(&dlm->spinlock);
-
spin_lock(&tmpres->spinlock);
/* We wait for the other thread that is mastering the resource */
if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
+ spin_unlock(&dlm->spinlock);
__dlm_wait_on_lockres(tmpres);
BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
+ spin_unlock(&tmpres->spinlock);
+ dlm_lockres_put(tmpres);
+ tmpres = NULL;
+ goto lookup;
}
if (tmpres->owner == dlm->node_num) {
BUG_ON(tmpres->state & DLM_LOCK_RES_DROPPING_REF);
dlm_lockres_grab_inflight_ref(dlm, tmpres);
- } else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF)
- dropping_ref = 1;
- spin_unlock(&tmpres->spinlock);
-
- /* wait until done messaging the master, drop our ref to allow
- * the lockres to be purged, start over. */
- if (dropping_ref) {
- spin_lock(&tmpres->spinlock);
+ } else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF) {
+ /*
+ * wait until done messaging the master, drop our ref to
+ * allow the lockres to be purged, start over.
+ */
+ spin_unlock(&dlm->spinlock);
__dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
spin_unlock(&tmpres->spinlock);
dlm_lockres_put(tmpres);
@@ -739,6 +738,24 @@ lookup:
}
mlog(0, "found in hash!\n");
+ /*
+ * we are going to do a create-lock next. so remove the lockres
+ * from purge list to avoid the case that we will access on the
+ * lockres which is already purged out from hash table in
+ * dlm_run_purge_list() path.
+ * otherwise, we could run into a problem:
+ * the owner dead(recovery didn't take care of this lockres
+ * since it's not in hashtable), and the code keeps sending
+ * request to the dead node and getting DLM_RECOVERING and
+ * then retrying infinitely.
+ */
+ if (!list_empty(&tmpres->purge)) {
+ list_del_init(&tmpres->purge);
+ dlm_lockres_put(tmpres);
+ }
+
+ spin_unlock(&tmpres->spinlock);
+ spin_unlock(&dlm->spinlock);
if (res)
dlm_lockres_put(res);
res = tmpres;
--
1.7.2.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-08 10:04 [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock Wengang Wang
@ 2011-06-09 5:17 ` Srinivas Eeda
2011-06-09 6:40 ` Wengang Wang
2011-06-09 18:34 ` Sunil Mushran
2011-06-09 17:53 ` Sunil Mushran
1 sibling, 2 replies; 12+ messages in thread
From: Srinivas Eeda @ 2011-06-09 5:17 UTC (permalink / raw)
To: ocfs2-devel
I think I have seen this problem in ocfs2-1.2 and it was addressed by
using a new state DLM_LOCK_RES_IN_USE. But we didn't merge into mainline
as sunil suggested we need to look for a different approach
http://oss.oracle.com/pipermail/ocfs2-devel/2010-June/006669.html
http://www.mail-archive.com/ocfs2-devel at oss.oracle.com/msg05733.html
one comment inline
On 6/8/2011 3:04 AM, Wengang Wang wrote:
> When we are to purge a lockres, we check if it's unused.
> the check includes
> 1. no locks attached to the lockres.
> 2. not dirty.
> 3. not in recovery.
> 4. no interested nodes in refmap.
> If the the above four are satisfied, we are going to purge it(remove it
> from hash table).
>
> While, when a "create lock" is in progress especially when lockres is owned
> remotely(spinlocks are dropped when networking), the lockres can satisfy above
> four condition. If it's put to purge list, it can be purged out from hash table
> when we are still accessing on it(sending request to owner for example). That's
> not what we want.
>
> I have met such a problem (orabug 12405575).
> The lockres is in the purge list already(there is a delay for real purge work)
> and the create lock request comes. When we are sending network message to the
> owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
> value and retries dlmlock_remote(). And before the owner crash, we have purged
> the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
> dlmlock_remote() infinitely.
>
> fix:
> we remove the lockres from purge list if it's there in dlm_get_lock_resource()
> which is called for only createlock case. So that the lockres can't be purged
> when we are in progress of createlock.
>
> Signed-off-by: Wengang Wang<wen.gang.wang@oracle.com>
> ---
> fs/ocfs2/dlm/dlmmaster.c | 41 +++++++++++++++++++++++++++++------------
> 1 files changed, 29 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> index 11eefb8..511d43c 100644
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -709,28 +709,27 @@ lookup:
> spin_lock(&dlm->spinlock);
> tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
> if (tmpres) {
> - int dropping_ref = 0;
> -
> - spin_unlock(&dlm->spinlock);
> -
> spin_lock(&tmpres->spinlock);
> /* We wait for the other thread that is mastering the resource */
> if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> + spin_unlock(&dlm->spinlock);
> __dlm_wait_on_lockres(tmpres);
> BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
> + spin_unlock(&tmpres->spinlock);
> + dlm_lockres_put(tmpres);
> + tmpres = NULL;
> + goto lookup;
> }
>
> if (tmpres->owner == dlm->node_num) {
> BUG_ON(tmpres->state& DLM_LOCK_RES_DROPPING_REF);
> dlm_lockres_grab_inflight_ref(dlm, tmpres);
> - } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF)
> - dropping_ref = 1;
> - spin_unlock(&tmpres->spinlock);
> -
> - /* wait until done messaging the master, drop our ref to allow
> - * the lockres to be purged, start over. */
> - if (dropping_ref) {
> - spin_lock(&tmpres->spinlock);
> + } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF) {
> + /*
> + * wait until done messaging the master, drop our ref to
> + * allow the lockres to be purged, start over.
> + */
> + spin_unlock(&dlm->spinlock);
> __dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
> spin_unlock(&tmpres->spinlock);
> dlm_lockres_put(tmpres);
> @@ -739,6 +738,24 @@ lookup:
> }
>
> mlog(0, "found in hash!\n");
> + /*
> + * we are going to do a create-lock next. so remove the lockres
> + * from purge list to avoid the case that we will access on the
> + * lockres which is already purged out from hash table in
> + * dlm_run_purge_list() path.
> + * otherwise, we could run into a problem:
> + * the owner dead(recovery didn't take care of this lockres
> + * since it's not in hashtable), and the code keeps sending
> + * request to the dead node and getting DLM_RECOVERING and
> + * then retrying infinitely.
> + */
> + if (!list_empty(&tmpres->purge)) {
> + list_del_init(&tmpres->purge);
> + dlm_lockres_put(tmpres);
> + }
> +
> + spin_unlock(&tmpres->spinlock);
> + spin_unlock(&dlm->spinlock);
lockres could still get added to purgelist at this point and we could
still have the same problem? I think, here we need some mechanism that
marks the lockres is in use that would protect it from adding to the
purgelist?
> if (res)
> dlm_lockres_put(res);
> res = tmpres;
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-09 5:17 ` Srinivas Eeda
@ 2011-06-09 6:40 ` Wengang Wang
2011-06-09 7:43 ` Srinivas Eeda
2011-06-09 18:34 ` Sunil Mushran
1 sibling, 1 reply; 12+ messages in thread
From: Wengang Wang @ 2011-06-09 6:40 UTC (permalink / raw)
To: ocfs2-devel
Srini,
On 11-06-08 22:17, Srinivas Eeda wrote:
> I think I have seen this problem in ocfs2-1.2 and it was addressed
> by using a new state DLM_LOCK_RES_IN_USE. But we didn't merge into
> mainline as sunil suggested we need to look for a different approach
>
> http://oss.oracle.com/pipermail/ocfs2-devel/2010-June/006669.html
> http://www.mail-archive.com/ocfs2-devel at oss.oracle.com/msg05733.html
Yes, this is the same problem.
>
> one comment inline
<..snip>
> >+ spin_unlock(&tmpres->spinlock);
> >+ spin_unlock(&dlm->spinlock);
> lockres could still get added to purgelist at this point and we
> could still have the same problem? I think, here we need some
> mechanism that marks the lockres is in use that would protect it
> from adding to the purgelist?
Seems there shouldn't be the possibility that the lockres was moved to purge
list again. We are in the case of a "create lock", there should not be any
outstanding convert/unlock request in progress concurrently. If there is one,
the there must be lock(s) attached to the lockres. But if there are locks
attached, the lockres can't be purged. Also for concurrent "create lock", there
is no problem either because there must be a lock attached to the lockres.
I think your patch works too. But changing
DLM_LOCK_RES_IN_USE
to
DLM_LOCK_RES_CREATING_LOCK
can help understand better.
I also thought about inflight_locks, but so far I failed to find correct
places to drop it.
thanks,
wengang.
> > if (res)
> > dlm_lockres_put(res);
> > res = tmpres;
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-09 6:40 ` Wengang Wang
@ 2011-06-09 7:43 ` Srinivas Eeda
2011-06-09 8:10 ` Wengang Wang
0 siblings, 1 reply; 12+ messages in thread
From: Srinivas Eeda @ 2011-06-09 7:43 UTC (permalink / raw)
To: ocfs2-devel
>>> + spin_unlock(&tmpres->spinlock);
>>> + spin_unlock(&dlm->spinlock);
>> lockres could still get added to purgelist at this point and we
>> could still have the same problem? I think, here we need some
>> mechanism that marks the lockres is in use that would protect it
>> from adding to the purgelist?
> Seems there shouldn't be the possibility that the lockres was moved to purge
> list again.
I think you are right about that once removed from the purge list it
will not be put back again on the purgelist in this case :)
I still think that there is a hole here. Is it possible that the
spinlocks are released here and right then dlm_thread was walking
through dirty list and then it could put this on purge list for the
first time?
If it's already on purgelist, then we seem to be safe by removing it.
But what is preventing the lockres getting onto to purge list right at
this point. locks are added to lockres a little later but till then it
is not safe?(I think).
> We are in the case of a "create lock", there should not be any
> outstanding convert/unlock request in progress concurrently. If there is one,
> the there must be lock(s) attached to the lockres. But if there are locks
> attached, the lockres can't be purged. Also for concurrent "create lock", there
> is no problem either because there must be a lock attached to the lockres.
>
> I think your patch works too. But changing
> DLM_LOCK_RES_IN_USE
> to
> DLM_LOCK_RES_CREATING_LOCK
> can help understand better.
I am ok with the name :). The thing we should do here is once we found a
lockres and moving ahead with create lock request, we should be sure
it's protected before releasing the spinlocks.
> I also thought about inflight_locks, but so far I failed to find correct
> places to drop it.
>
> thanks,
> wengang.
>>> if (res)
>>> dlm_lockres_put(res);
>>> res = tmpres;
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-09 7:43 ` Srinivas Eeda
@ 2011-06-09 8:10 ` Wengang Wang
0 siblings, 0 replies; 12+ messages in thread
From: Wengang Wang @ 2011-06-09 8:10 UTC (permalink / raw)
To: ocfs2-devel
Srini,
Yes, you are right! Agree!
Sunil,
My privious patch which checks if lockres is unhashed can't fix the
problem that "can't find lockres in ast handler". So I vote for Srini's
patch.
what's your comment?
thanks,
wengang.
On 11-06-09 00:43, Srinivas Eeda wrote:
>
> >>>+ spin_unlock(&tmpres->spinlock);
> >>>+ spin_unlock(&dlm->spinlock);
> >>lockres could still get added to purgelist at this point and we
> >>could still have the same problem? I think, here we need some
> >>mechanism that marks the lockres is in use that would protect it
> >>from adding to the purgelist?
> >Seems there shouldn't be the possibility that the lockres was moved to purge
> >list again.
> I think you are right about that once removed from the purge list it
> will not be put back again on the purgelist in this case :)
>
> I still think that there is a hole here. Is it possible that the
> spinlocks are released here and right then dlm_thread was walking
> through dirty list and then it could put this on purge list for the
> first time?
>
> If it's already on purgelist, then we seem to be safe by removing
> it. But what is preventing the lockres getting onto to purge list
> right at this point. locks are added to lockres a little later but
> till then it is not safe?(I think).
>
> > We are in the case of a "create lock", there should not be any
> >outstanding convert/unlock request in progress concurrently. If there is one,
> >the there must be lock(s) attached to the lockres. But if there are locks
> >attached, the lockres can't be purged. Also for concurrent "create lock", there
> >is no problem either because there must be a lock attached to the lockres.
> >
> >I think your patch works too. But changing
> >DLM_LOCK_RES_IN_USE
> >to
> >DLM_LOCK_RES_CREATING_LOCK
> >can help understand better.
> I am ok with the name :). The thing we should do here is once we
> found a lockres and moving ahead with create lock request, we should
> be sure it's protected before releasing the spinlocks.
> >I also thought about inflight_locks, but so far I failed to find correct
> >places to drop it.
> >
> >thanks,
> >wengang.
> >>> if (res)
> >>> dlm_lockres_put(res);
> >>> res = tmpres;
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-08 10:04 [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock Wengang Wang
2011-06-09 5:17 ` Srinivas Eeda
@ 2011-06-09 17:53 ` Sunil Mushran
2011-06-10 1:01 ` Wengang Wang
1 sibling, 1 reply; 12+ messages in thread
From: Sunil Mushran @ 2011-06-09 17:53 UTC (permalink / raw)
To: ocfs2-devel
On 06/08/2011 03:04 AM, Wengang Wang wrote:
> When we are to purge a lockres, we check if it's unused.
> the check includes
> 1. no locks attached to the lockres.
> 2. not dirty.
> 3. not in recovery.
> 4. no interested nodes in refmap.
> If the the above four are satisfied, we are going to purge it(remove it
> from hash table).
>
> While, when a "create lock" is in progress especially when lockres is owned
> remotely(spinlocks are dropped when networking), the lockres can satisfy above
> four condition. If it's put to purge list, it can be purged out from hash table
> when we are still accessing on it(sending request to owner for example). That's
> not what we want.
>
> I have met such a problem (orabug 12405575).
> The lockres is in the purge list already(there is a delay for real purge work)
> and the create lock request comes. When we are sending network message to the
> owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
> value and retries dlmlock_remote(). And before the owner crash, we have purged
> the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
> dlmlock_remote() infinitely.
>
> fix:
> we remove the lockres from purge list if it's there in dlm_get_lock_resource()
> which is called for only createlock case. So that the lockres can't be purged
> when we are in progress of createlock.
>
> Signed-off-by: Wengang Wang<wen.gang.wang@oracle.com>
> ---
> fs/ocfs2/dlm/dlmmaster.c | 41 +++++++++++++++++++++++++++++------------
> 1 files changed, 29 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> index 11eefb8..511d43c 100644
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -709,28 +709,27 @@ lookup:
> spin_lock(&dlm->spinlock);
> tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
> if (tmpres) {
> - int dropping_ref = 0;
> -
> - spin_unlock(&dlm->spinlock);
> -
> spin_lock(&tmpres->spinlock);
> /* We wait for the other thread that is mastering the resource */
> if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> + spin_unlock(&dlm->spinlock);
> __dlm_wait_on_lockres(tmpres);
> BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
> + spin_unlock(&tmpres->spinlock);
> + dlm_lockres_put(tmpres);
> + tmpres = NULL;
> + goto lookup;
> }
>
> if (tmpres->owner == dlm->node_num) {
> BUG_ON(tmpres->state& DLM_LOCK_RES_DROPPING_REF);
> dlm_lockres_grab_inflight_ref(dlm, tmpres);
> - } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF)
> - dropping_ref = 1;
> - spin_unlock(&tmpres->spinlock);
> -
> - /* wait until done messaging the master, drop our ref to allow
> - * the lockres to be purged, start over. */
> - if (dropping_ref) {
> - spin_lock(&tmpres->spinlock);
> + } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF) {
> + /*
> + * wait until done messaging the master, drop our ref to
> + * allow the lockres to be purged, start over.
> + */
> + spin_unlock(&dlm->spinlock);
> __dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
> spin_unlock(&tmpres->spinlock);
> dlm_lockres_put(tmpres);
> @@ -739,6 +738,24 @@ lookup:
> }
>
> mlog(0, "found in hash!\n");
> + /*
> + * we are going to do a create-lock next. so remove the lockres
> + * from purge list to avoid the case that we will access on the
> + * lockres which is already purged out from hash table in
> + * dlm_run_purge_list() path.
> + * otherwise, we could run into a problem:
> + * the owner dead(recovery didn't take care of this lockres
> + * since it's not in hashtable), and the code keeps sending
> + * request to the dead node and getting DLM_RECOVERING and
> + * then retrying infinitely.
> + */
> + if (!list_empty(&tmpres->purge)) {
> + list_del_init(&tmpres->purge);
> + dlm_lockres_put(tmpres);
> + }
> +
> + spin_unlock(&tmpres->spinlock);
> + spin_unlock(&dlm->spinlock);
> if (res)
> dlm_lockres_put(res);
> res = tmpres;
In short, you are holding onto the dlm->spinlock a bit longer and forcibly
removing the lockres from the purgelist.
I have two problems with this patch.
Firstly it ignores the fact that the resource can be added to the purgelist
right after we drop the dlm->spinlock. There is nothing to protect against
that. And I would think that is the more likely case. I had asked to you explore
inflight_locks for that reason. Did you explore that option? Currently it is used
for remote lock creates. That's why I suggested we use it for local creates too.
Secondly, we currently manipulate the purgelist in one function only.
__dlm_calc_lockres_usage(). We should stick to that.
BTW, how are you testing this?
I would think this issue will be more of an issue for userdlm (ovm). Not the fs.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-09 5:17 ` Srinivas Eeda
2011-06-09 6:40 ` Wengang Wang
@ 2011-06-09 18:34 ` Sunil Mushran
2011-06-10 0:32 ` Wengang Wang
1 sibling, 1 reply; 12+ messages in thread
From: Sunil Mushran @ 2011-06-09 18:34 UTC (permalink / raw)
To: ocfs2-devel
Different issue. That patch plugged a hole when the master was remote.
In this case the master is local.
In this case, the window is very tiny as we not only need the recovery to
kick-in in the that window, we also need the lockres to be moved to the
purgelist in that window.
But it just so happens that ovm via userdlm hits createlock a lot making
this issue more likely to trigger than with the fs.
On 06/08/2011 10:17 PM, Srinivas Eeda wrote:
> I think I have seen this problem in ocfs2-1.2 and it was addressed by
> using a new state DLM_LOCK_RES_IN_USE. But we didn't merge into mainline
> as sunil suggested we need to look for a different approach
>
> http://oss.oracle.com/pipermail/ocfs2-devel/2010-June/006669.html
> http://www.mail-archive.com/ocfs2-devel at oss.oracle.com/msg05733.html
>
> one comment inline
>
> On 6/8/2011 3:04 AM, Wengang Wang wrote:
>> When we are to purge a lockres, we check if it's unused.
>> the check includes
>> 1. no locks attached to the lockres.
>> 2. not dirty.
>> 3. not in recovery.
>> 4. no interested nodes in refmap.
>> If the the above four are satisfied, we are going to purge it(remove it
>> from hash table).
>>
>> While, when a "create lock" is in progress especially when lockres is owned
>> remotely(spinlocks are dropped when networking), the lockres can satisfy above
>> four condition. If it's put to purge list, it can be purged out from hash table
>> when we are still accessing on it(sending request to owner for example). That's
>> not what we want.
>>
>> I have met such a problem (orabug 12405575).
>> The lockres is in the purge list already(there is a delay for real purge work)
>> and the create lock request comes. When we are sending network message to the
>> owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
>> value and retries dlmlock_remote(). And before the owner crash, we have purged
>> the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
>> dlmlock_remote() infinitely.
>>
>> fix:
>> we remove the lockres from purge list if it's there in dlm_get_lock_resource()
>> which is called for only createlock case. So that the lockres can't be purged
>> when we are in progress of createlock.
>>
>> Signed-off-by: Wengang Wang<wen.gang.wang@oracle.com>
>> ---
>> fs/ocfs2/dlm/dlmmaster.c | 41 +++++++++++++++++++++++++++++------------
>> 1 files changed, 29 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>> index 11eefb8..511d43c 100644
>> --- a/fs/ocfs2/dlm/dlmmaster.c
>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>> @@ -709,28 +709,27 @@ lookup:
>> spin_lock(&dlm->spinlock);
>> tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
>> if (tmpres) {
>> - int dropping_ref = 0;
>> -
>> - spin_unlock(&dlm->spinlock);
>> -
>> spin_lock(&tmpres->spinlock);
>> /* We wait for the other thread that is mastering the resource */
>> if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
>> + spin_unlock(&dlm->spinlock);
>> __dlm_wait_on_lockres(tmpres);
>> BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
>> + spin_unlock(&tmpres->spinlock);
>> + dlm_lockres_put(tmpres);
>> + tmpres = NULL;
>> + goto lookup;
>> }
>>
>> if (tmpres->owner == dlm->node_num) {
>> BUG_ON(tmpres->state& DLM_LOCK_RES_DROPPING_REF);
>> dlm_lockres_grab_inflight_ref(dlm, tmpres);
>> - } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF)
>> - dropping_ref = 1;
>> - spin_unlock(&tmpres->spinlock);
>> -
>> - /* wait until done messaging the master, drop our ref to allow
>> - * the lockres to be purged, start over. */
>> - if (dropping_ref) {
>> - spin_lock(&tmpres->spinlock);
>> + } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF) {
>> + /*
>> + * wait until done messaging the master, drop our ref to
>> + * allow the lockres to be purged, start over.
>> + */
>> + spin_unlock(&dlm->spinlock);
>> __dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
>> spin_unlock(&tmpres->spinlock);
>> dlm_lockres_put(tmpres);
>> @@ -739,6 +738,24 @@ lookup:
>> }
>>
>> mlog(0, "found in hash!\n");
>> + /*
>> + * we are going to do a create-lock next. so remove the lockres
>> + * from purge list to avoid the case that we will access on the
>> + * lockres which is already purged out from hash table in
>> + * dlm_run_purge_list() path.
>> + * otherwise, we could run into a problem:
>> + * the owner dead(recovery didn't take care of this lockres
>> + * since it's not in hashtable), and the code keeps sending
>> + * request to the dead node and getting DLM_RECOVERING and
>> + * then retrying infinitely.
>> + */
>> + if (!list_empty(&tmpres->purge)) {
>> + list_del_init(&tmpres->purge);
>> + dlm_lockres_put(tmpres);
>> + }
>> +
>> + spin_unlock(&tmpres->spinlock);
>> + spin_unlock(&dlm->spinlock);
> lockres could still get added to purgelist at this point and we could
> still have the same problem? I think, here we need some mechanism that
> marks the lockres is in use that would protect it from adding to the
> purgelist?
>> if (res)
>> dlm_lockres_put(res);
>> res = tmpres;
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-09 18:34 ` Sunil Mushran
@ 2011-06-10 0:32 ` Wengang Wang
2011-06-10 0:43 ` Sunil Mushran
0 siblings, 1 reply; 12+ messages in thread
From: Wengang Wang @ 2011-06-10 0:32 UTC (permalink / raw)
To: ocfs2-devel
On 11-06-09 11:34, Sunil Mushran wrote:
> Different issue. That patch plugged a hole when the master was remote.
> In this case the master is local.
I think in my case, the master is remote too. The phenomemon is local
node keeps sending messages infinitely to the master which is dead.
>
> In this case, the window is very tiny as we not only need the recovery to
> kick-in in the that window, we also need the lockres to be moved to the
> purgelist in that window.
>
> But it just so happens that ovm via userdlm hits createlock a lot making
> this issue more likely to trigger than with the fs.
Yes, OVM create the lock in every 5 seconds via dlmfs.
thanks,
wengang.
> On 06/08/2011 10:17 PM, Srinivas Eeda wrote:
> >I think I have seen this problem in ocfs2-1.2 and it was addressed by
> >using a new state DLM_LOCK_RES_IN_USE. But we didn't merge into mainline
> >as sunil suggested we need to look for a different approach
> >
> >http://oss.oracle.com/pipermail/ocfs2-devel/2010-June/006669.html
> >http://www.mail-archive.com/ocfs2-devel at oss.oracle.com/msg05733.html
> >
> >one comment inline
> >
> >On 6/8/2011 3:04 AM, Wengang Wang wrote:
> >>When we are to purge a lockres, we check if it's unused.
> >>the check includes
> >>1. no locks attached to the lockres.
> >>2. not dirty.
> >>3. not in recovery.
> >>4. no interested nodes in refmap.
> >>If the the above four are satisfied, we are going to purge it(remove it
> >>from hash table).
> >>
> >>While, when a "create lock" is in progress especially when lockres is owned
> >>remotely(spinlocks are dropped when networking), the lockres can satisfy above
> >>four condition. If it's put to purge list, it can be purged out from hash table
> >>when we are still accessing on it(sending request to owner for example). That's
> >>not what we want.
> >>
> >>I have met such a problem (orabug 12405575).
> >>The lockres is in the purge list already(there is a delay for real purge work)
> >>and the create lock request comes. When we are sending network message to the
> >>owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
> >>value and retries dlmlock_remote(). And before the owner crash, we have purged
> >>the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
> >>dlmlock_remote() infinitely.
> >>
> >>fix:
> >>we remove the lockres from purge list if it's there in dlm_get_lock_resource()
> >>which is called for only createlock case. So that the lockres can't be purged
> >>when we are in progress of createlock.
> >>
> >>Signed-off-by: Wengang Wang<wen.gang.wang@oracle.com>
> >>---
> >> fs/ocfs2/dlm/dlmmaster.c | 41 +++++++++++++++++++++++++++++------------
> >> 1 files changed, 29 insertions(+), 12 deletions(-)
> >>
> >>diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> >>index 11eefb8..511d43c 100644
> >>--- a/fs/ocfs2/dlm/dlmmaster.c
> >>+++ b/fs/ocfs2/dlm/dlmmaster.c
> >>@@ -709,28 +709,27 @@ lookup:
> >> spin_lock(&dlm->spinlock);
> >> tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
> >> if (tmpres) {
> >>- int dropping_ref = 0;
> >>-
> >>- spin_unlock(&dlm->spinlock);
> >>-
> >> spin_lock(&tmpres->spinlock);
> >> /* We wait for the other thread that is mastering the resource */
> >> if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> >>+ spin_unlock(&dlm->spinlock);
> >> __dlm_wait_on_lockres(tmpres);
> >> BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
> >>+ spin_unlock(&tmpres->spinlock);
> >>+ dlm_lockres_put(tmpres);
> >>+ tmpres = NULL;
> >>+ goto lookup;
> >> }
> >>
> >> if (tmpres->owner == dlm->node_num) {
> >> BUG_ON(tmpres->state& DLM_LOCK_RES_DROPPING_REF);
> >> dlm_lockres_grab_inflight_ref(dlm, tmpres);
> >>- } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF)
> >>- dropping_ref = 1;
> >>- spin_unlock(&tmpres->spinlock);
> >>-
> >>- /* wait until done messaging the master, drop our ref to allow
> >>- * the lockres to be purged, start over. */
> >>- if (dropping_ref) {
> >>- spin_lock(&tmpres->spinlock);
> >>+ } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF) {
> >>+ /*
> >>+ * wait until done messaging the master, drop our ref to
> >>+ * allow the lockres to be purged, start over.
> >>+ */
> >>+ spin_unlock(&dlm->spinlock);
> >> __dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
> >> spin_unlock(&tmpres->spinlock);
> >> dlm_lockres_put(tmpres);
> >>@@ -739,6 +738,24 @@ lookup:
> >> }
> >>
> >> mlog(0, "found in hash!\n");
> >>+ /*
> >>+ * we are going to do a create-lock next. so remove the lockres
> >>+ * from purge list to avoid the case that we will access on the
> >>+ * lockres which is already purged out from hash table in
> >>+ * dlm_run_purge_list() path.
> >>+ * otherwise, we could run into a problem:
> >>+ * the owner dead(recovery didn't take care of this lockres
> >>+ * since it's not in hashtable), and the code keeps sending
> >>+ * request to the dead node and getting DLM_RECOVERING and
> >>+ * then retrying infinitely.
> >>+ */
> >>+ if (!list_empty(&tmpres->purge)) {
> >>+ list_del_init(&tmpres->purge);
> >>+ dlm_lockres_put(tmpres);
> >>+ }
> >>+
> >>+ spin_unlock(&tmpres->spinlock);
> >>+ spin_unlock(&dlm->spinlock);
> >lockres could still get added to purgelist at this point and we could
> >still have the same problem? I think, here we need some mechanism that
> >marks the lockres is in use that would protect it from adding to the
> >purgelist?
> >> if (res)
> >> dlm_lockres_put(res);
> >> res = tmpres;
> >_______________________________________________
> >Ocfs2-devel mailing list
> >Ocfs2-devel at oss.oracle.com
> >http://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-10 0:32 ` Wengang Wang
@ 2011-06-10 0:43 ` Sunil Mushran
0 siblings, 0 replies; 12+ messages in thread
From: Sunil Mushran @ 2011-06-10 0:43 UTC (permalink / raw)
To: ocfs2-devel
On 06/09/2011 05:32 PM, Wengang Wang wrote:
> On 11-06-09 11:34, Sunil Mushran wrote:
>> Different issue. That patch plugged a hole when the master was remote.
>> In this case the master is local.
> I think in my case, the master is remote too. The phenomemon is local
> node keeps sending messages infinitely to the master which is dead.
>> In this case, the window is very tiny as we not only need the recovery to
>> kick-in in the that window, we also need the lockres to be moved to the
>> purgelist in that window.
>>
>> But it just so happens that ovm via userdlm hits createlock a lot making
>> this issue more likely to trigger than with the fs.
> Yes, OVM create the lock in every 5 seconds via dlmfs.
Do you have a reproducible testcase? If not, work on that. That would
be most helpful in fixing this issue.
Right now I am also working on a problem in this area. Not the same
though. I can work on this issue too. But I need a testcase.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-09 17:53 ` Sunil Mushran
@ 2011-06-10 1:01 ` Wengang Wang
2011-06-10 1:10 ` Sunil Mushran
0 siblings, 1 reply; 12+ messages in thread
From: Wengang Wang @ 2011-06-10 1:01 UTC (permalink / raw)
To: ocfs2-devel
On 11-06-09 10:53, Sunil Mushran wrote:
> On 06/08/2011 03:04 AM, Wengang Wang wrote:
> >When we are to purge a lockres, we check if it's unused.
> >the check includes
> >1. no locks attached to the lockres.
> >2. not dirty.
> >3. not in recovery.
> >4. no interested nodes in refmap.
> >If the the above four are satisfied, we are going to purge it(remove it
> >from hash table).
> >
> >While, when a "create lock" is in progress especially when lockres is owned
> >remotely(spinlocks are dropped when networking), the lockres can satisfy above
> >four condition. If it's put to purge list, it can be purged out from hash table
> >when we are still accessing on it(sending request to owner for example). That's
> >not what we want.
> >
> >I have met such a problem (orabug 12405575).
> >The lockres is in the purge list already(there is a delay for real purge work)
> >and the create lock request comes. When we are sending network message to the
> >owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
> >value and retries dlmlock_remote(). And before the owner crash, we have purged
> >the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
> >dlmlock_remote() infinitely.
> >
> >fix:
> >we remove the lockres from purge list if it's there in dlm_get_lock_resource()
> >which is called for only createlock case. So that the lockres can't be purged
> >when we are in progress of createlock.
> >
> >Signed-off-by: Wengang Wang<wen.gang.wang@oracle.com>
> >---
> > fs/ocfs2/dlm/dlmmaster.c | 41 +++++++++++++++++++++++++++++------------
> > 1 files changed, 29 insertions(+), 12 deletions(-)
> >
> >diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> >index 11eefb8..511d43c 100644
> >--- a/fs/ocfs2/dlm/dlmmaster.c
> >+++ b/fs/ocfs2/dlm/dlmmaster.c
> >@@ -709,28 +709,27 @@ lookup:
> > spin_lock(&dlm->spinlock);
> > tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
> > if (tmpres) {
> >- int dropping_ref = 0;
> >-
> >- spin_unlock(&dlm->spinlock);
> >-
> > spin_lock(&tmpres->spinlock);
> > /* We wait for the other thread that is mastering the resource */
> > if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> >+ spin_unlock(&dlm->spinlock);
> > __dlm_wait_on_lockres(tmpres);
> > BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
> >+ spin_unlock(&tmpres->spinlock);
> >+ dlm_lockres_put(tmpres);
> >+ tmpres = NULL;
> >+ goto lookup;
> > }
> >
> > if (tmpres->owner == dlm->node_num) {
> > BUG_ON(tmpres->state& DLM_LOCK_RES_DROPPING_REF);
> > dlm_lockres_grab_inflight_ref(dlm, tmpres);
> >- } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF)
> >- dropping_ref = 1;
> >- spin_unlock(&tmpres->spinlock);
> >-
> >- /* wait until done messaging the master, drop our ref to allow
> >- * the lockres to be purged, start over. */
> >- if (dropping_ref) {
> >- spin_lock(&tmpres->spinlock);
> >+ } else if (tmpres->state& DLM_LOCK_RES_DROPPING_REF) {
> >+ /*
> >+ * wait until done messaging the master, drop our ref to
> >+ * allow the lockres to be purged, start over.
> >+ */
> >+ spin_unlock(&dlm->spinlock);
> > __dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
> > spin_unlock(&tmpres->spinlock);
> > dlm_lockres_put(tmpres);
> >@@ -739,6 +738,24 @@ lookup:
> > }
> >
> > mlog(0, "found in hash!\n");
> >+ /*
> >+ * we are going to do a create-lock next. so remove the lockres
> >+ * from purge list to avoid the case that we will access on the
> >+ * lockres which is already purged out from hash table in
> >+ * dlm_run_purge_list() path.
> >+ * otherwise, we could run into a problem:
> >+ * the owner dead(recovery didn't take care of this lockres
> >+ * since it's not in hashtable), and the code keeps sending
> >+ * request to the dead node and getting DLM_RECOVERING and
> >+ * then retrying infinitely.
> >+ */
> >+ if (!list_empty(&tmpres->purge)) {
> >+ list_del_init(&tmpres->purge);
> >+ dlm_lockres_put(tmpres);
> >+ }
> >+
> >+ spin_unlock(&tmpres->spinlock);
> >+ spin_unlock(&dlm->spinlock);
> > if (res)
> > dlm_lockres_put(res);
> > res = tmpres;
>
> In short, you are holding onto the dlm->spinlock a bit longer and forcibly
> removing the lockres from the purgelist.
Yes. that's what the patch does.
>
> I have two problems with this patch.
>
> Firstly it ignores the fact that the resource can be added to the purgelist
> right after we drop the dlm->spinlock. There is nothing to protect against
> that. And I would think that is the more likely case. I had asked to you explore
> inflight_locks for that reason. Did you explore that option? Currently it is used
> for remote lock creates. That's why I suggested we use it for local creates too.
Yes, you are right. There is such a problem that the lockres can be
added to purge list after we drop dlm->spinlock. I am not sure if it's the more
likely case since there is a 8 seconds delay between the the time the lockres is
added to purge list and the time it is purged. I guess in normal case, the create-
lock should already finished in the 8 seconds.
I considered about the inflight_locks. But I didn't think it well so far.
Because simply moving the lockres out from purge list is not good
enough, I will take a good think about making use of inflight_locks.
> Secondly, we currently manipulate the purgelist in one function only.
> __dlm_calc_lockres_usage(). We should stick to that.
If we make good use of inflight_locks, I think we can.
>
> BTW, how are you testing this?
I tested with the steps which can cause the original problem to prove it
works.(but without UT).
Also I made a regression test with the existing ocfs2-test suites(the
multiple-xxx with reducing interations).
> I would think this issue will be more of an issue for userdlm (ovm). Not the fs.
Yes, agree.
thanks,
wengang.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-10 1:01 ` Wengang Wang
@ 2011-06-10 1:10 ` Sunil Mushran
2011-06-10 1:21 ` Wengang Wang
0 siblings, 1 reply; 12+ messages in thread
From: Sunil Mushran @ 2011-06-10 1:10 UTC (permalink / raw)
To: ocfs2-devel
On 06/09/2011 06:01 PM, Wengang Wang wrote:
>
> Yes, you are right. There is such a problem that the lockres can be
> added to purge list after we drop dlm->spinlock. I am not sure if it's the more
> likely case since there is a 8 seconds delay between the the time the lockres is
> added to purge list and the time it is purged. I guess in normal case, the create-
> lock should already finished in the 8 seconds.
>
> I considered about the inflight_locks. But I didn't think it well so far.
> Because simply moving the lockres out from purge list is not good
> enough, I will take a good think about making use of inflight_locks.
>
>> Secondly, we currently manipulate the purgelist in one function only.
>> __dlm_calc_lockres_usage(). We should stick to that.
> If we make good use of inflight_locks, I think we can.
>> BTW, how are you testing this?
> I tested with the steps which can cause the original problem to prove it
> works.(but without UT).
> Also I made a regression test with the existing ocfs2-test suites(the
> multiple-xxx with reducing interations).
What's UT?
I am looking for is a testcase that reliably reproduces the problem.
Not necessarily that can be checked in. But something that triggers
the issue.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock
2011-06-10 1:10 ` Sunil Mushran
@ 2011-06-10 1:21 ` Wengang Wang
0 siblings, 0 replies; 12+ messages in thread
From: Wengang Wang @ 2011-06-10 1:21 UTC (permalink / raw)
To: ocfs2-devel
On 11-06-09 18:10, Sunil Mushran wrote:
> On 06/09/2011 06:01 PM, Wengang Wang wrote:
> >
> >Yes, you are right. There is such a problem that the lockres can be
> >added to purge list after we drop dlm->spinlock. I am not sure if it's the more
> >likely case since there is a 8 seconds delay between the the time the lockres is
> >added to purge list and the time it is purged. I guess in normal case, the create-
> >lock should already finished in the 8 seconds.
> >
> >I considered about the inflight_locks. But I didn't think it well so far.
> >Because simply moving the lockres out from purge list is not good
> >enough, I will take a good think about making use of inflight_locks.
> >
> >>Secondly, we currently manipulate the purgelist in one function only.
> >>__dlm_calc_lockres_usage(). We should stick to that.
> >If we make good use of inflight_locks, I think we can.
> >>BTW, how are you testing this?
> >I tested with the steps which can cause the original problem to prove it
> >works.(but without UT).
> >Also I made a regression test with the existing ocfs2-test suites(the
> >multiple-xxx with reducing interations).
>
> What's UT?
unit test :)
>
> I am looking for is a testcase that reliably reproduces the problem.
> Not necessarily that can be checked in. But something that triggers
> the issue.
sent via an another email.
thanks,
wengang.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2011-06-10 1:21 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-08 10:04 [Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock Wengang Wang
2011-06-09 5:17 ` Srinivas Eeda
2011-06-09 6:40 ` Wengang Wang
2011-06-09 7:43 ` Srinivas Eeda
2011-06-09 8:10 ` Wengang Wang
2011-06-09 18:34 ` Sunil Mushran
2011-06-10 0:32 ` Wengang Wang
2011-06-10 0:43 ` Sunil Mushran
2011-06-09 17:53 ` Sunil Mushran
2011-06-10 1:01 ` Wengang Wang
2011-06-10 1:10 ` Sunil Mushran
2011-06-10 1:21 ` Wengang Wang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.