From: Sunil Mushran <sunil.mushran@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: unlock open_lock immediately
Date: Tue, 30 Aug 2011 18:55:38 -0700 [thread overview]
Message-ID: <4E5D949A.4040802@oracle.com> (raw)
In-Reply-To: <201108260250.p7Q2oQH9017484@acsmt357.oracle.com>
Comments inlined.
BTW, how common place is this race in your testing? If you can
answer that, I would like to also know how you arrived at it.
On 08/25/2011 07:50 PM, Wengang Wang wrote:
> There is a race between 2(+) nodes that calls iput_final() on same inode.
> time sequence is like the following. The result is neither of the 2(+) node
> does real inode deletion work and the unlinked inode is left in orphandir.
>
> --------------------------------------
>
> node A node B
>
> open_lock PR
>
> open_LOCK PR
>
> .......
>
> .......
>
> #in ocfs2_delete_inode()
> inode_lock EX
> #in ocfs2_query_inode_wipe
> try open_lock EX -->cant grant(B has PR)
> ignore the deletion
> inode_unlock EX
>
> #in ocfs2_delete_inode()
> inode_lock EX
> #in ocfs2_query_inode_wipe
> try open_lock EX -->can't grant(A has PR)
> ignore the deletion
> inode_unlock EX
>
> #in ocfs2_clear_inode()
> open_unlock EX
> drop open_lock
>
> #in ocfs2_clear_inode()
> open_unlock EX
>
> --------------------------------------
>
> The fix is to force dlm_unlock on open_lock within inode_lock. see
> comment embedded in patch.
>
> Signed-off-by: Wengang Wang<wen.gang.wang@oracle.com>
While I am still wrapping my head around this, I see no harm in releasing
the open_lock early. Afterall the inode is in MAYBE_ORPHANED state.
> ---
> fs/ocfs2/dlmglue.c | 8 ++++++--
> fs/ocfs2/inode.c | 11 +++++++++++
> 2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
> index 7642d7c..f331310 100644
> --- a/fs/ocfs2/dlmglue.c
> +++ b/fs/ocfs2/dlmglue.c
> @@ -1752,12 +1752,16 @@ void ocfs2_open_unlock(struct inode *inode)
> if (ocfs2_mount_local(osb))
> goto out;
>
> - if(lockres->l_ro_holders)
> + if (lockres->l_ro_holders) {
> ocfs2_cluster_unlock(OCFS2_SB(inode->i_sb), lockres,
> DLM_LOCK_PR);
> - if(lockres->l_ex_holders)
> + lockres->l_ro_holders = 0;
> + }
> + if (lockres->l_ex_holders) {
> ocfs2_cluster_unlock(OCFS2_SB(inode->i_sb), lockres,
> DLM_LOCK_EX);
> + lockres->l_ex_holders = 0;
> + }
This bit looks incorrect. We cannot force these counts to zero.
We have to let dec_holders() to do that in cluster_unlock().
> out:
> return;
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index b4c8bb6..390a6fc 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -1052,6 +1052,17 @@ static void ocfs2_delete_inode(struct inode *inode)
> OCFS2_I(inode)->ip_flags |= OCFS2_INODE_DELETED;
>
> bail_unlock_inode:
> + /*
> + * since we don't take care of deleting the on disk inode any longer
> + * from now on, we must release the open_lock(dlm unlock) immediately
> + * within inode_lock. Otherwise, trying open_lock for EX from other node
> + * can fail if it comes before we release PR on open_lock later, so that
> + * both/all nodes think other node(s) is/are opening the inode thus
> + * neither/none of them do real inode deletion.
> + */
> + ocfs2_open_unlock(inode);
> + ocfs2_simple_drop_lockres(OCFS2_SB(inode->i_sb),
> + &OCFS2_I(inode)->ip_open_lockres);
> ocfs2_inode_unlock(inode, 1);
> brelse(di_bh);
>
We have to make corresponding changes in ocfs2_drop_inode_locks()
and ocfs2_clear_inode().
next prev parent reply other threads:[~2011-08-31 1:55 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-26 2:50 [Ocfs2-devel] [PATCH] ocfs2: unlock open_lock immediately Wengang Wang
2011-08-31 1:55 ` Sunil Mushran [this message]
2011-08-31 2:51 ` Wengang Wang
2011-08-31 17:36 ` Sunil Mushran
2011-09-01 1:00 ` Wengang Wang
2011-09-07 18:04 ` Joel Becker
2011-09-08 2:14 ` Wengang Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E5D949A.4040802@oracle.com \
--to=sunil.mushran@oracle.com \
--cc=ocfs2-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.