From: Xue jiufei <xuejiufei@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] May be deadlock for wrong locking order, patch request reviewed, thanks
Date: Fri, 12 Sep 2014 14:05:52 +0800
Message-ID: <54128D40.9010806@huawei.com>
In-Reply-To: <71604351584F6A4EBAE558C676F37CA42EC6B8CB@H3CMLB14-EX.srv.huawei-3com.com>
Hi, Zhonghua
On 2014/9/11 19:28, Guozhonghua wrote:
> While testing the ocfs2 cluster, we found that the cluster sometimes hangs.
>
>
>
> I got some information about the deadlock that causes the hang: the sys dir / lock is held, and the node holding it never releases it.
>
> root@cvknode-21:~# ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | grep D
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>  7489 D    jbd2/sdh-621    jbd2_journal_commit_transaction
> 16218 D    ls              iterate_dir
> 16533 D    mkdir           dlm_wait_for_lock_mastery
> 31195 D+   ls              iterate_dir
>
>
>
> So I reviewed the code and found that the locking order may be wrong.
>
> In the function dlm_master_request_handler, the resource lock is taken first and &dlm->master_lock is taken after it.
>
> But in the function dlm_get_lock_resource, &dlm->master_lock is taken first and the resource lock after it.
The resource lock is not required in dlm_get_lock_resource() because it
is a new lock resource.
Commit 8d400b81cc83 added this spinlock while cleaning up the code; I think
we can remove it.
Thanks
Xue Jiufei
>
> The two functions take the locks in opposite orders.
>
> Suppose there are two tasks: one, in dlm_master_request_handler, holds res->lock and waits for dlm->master_lock.
>
> The other, in dlm_get_lock_resource, holds &dlm->master_lock and waits for res->lock.
>
> So a deadlock can occur.
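The ordering problem described above can be illustrated with a minimal
user-space sketch. This is not the ocfs2 code: it assumes two pthread
mutexes standing in for dlm->master_lock and res->spinlock, and the names
struct path, run_path() and abba_conflict() are hypothetical. Each thread
takes its first lock, waits until the other thread holds one too, and then
probes its second lock with pthread_mutex_trylock() so that the conflict is
reported instead of hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for dlm->master_lock and res->spinlock. */
static pthread_mutex_t master_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t res_lock = PTHREAD_MUTEX_INITIALIZER;

static atomic_int holding;  /* threads that hold their first lock */
static atomic_int probed;   /* threads that have probed their second lock */

struct path {
    pthread_mutex_t *first;   /* lock taken first on this code path */
    pthread_mutex_t *second;  /* lock taken while holding the first */
    int second_rc;            /* result of trylock on the second lock */
};

/* Yield until the counter reaches the target value. */
static void wait_for(atomic_int *counter, int target)
{
    while (atomic_load(counter) < target)
        sched_yield();
}

static void *run_path(void *arg)
{
    struct path *p = arg;

    pthread_mutex_lock(p->first);
    atomic_fetch_add(&holding, 1);
    wait_for(&holding, 2);  /* both threads now hold one lock each */

    /* trylock rather than lock, so the sketch reports the conflict
     * instead of blocking forever */
    p->second_rc = pthread_mutex_trylock(p->second);

    atomic_fetch_add(&probed, 1);
    wait_for(&probed, 2);   /* keep the first lock until both have probed */

    if (p->second_rc == 0)
        pthread_mutex_unlock(p->second);
    pthread_mutex_unlock(p->first);
    return NULL;
}

/* Returns 1 if both paths found their second lock busy (the ABBA pattern). */
int abba_conflict(void)
{
    /* One path mimics the handler order (res, then master); the other
     * mimics the opposite order (master, then res). */
    struct path handler = { &res_lock, &master_lock, 0 };
    struct path getter = { &master_lock, &res_lock, 0 };
    pthread_t a, b;

    atomic_store(&holding, 0);
    atomic_store(&probed, 0);
    pthread_create(&a, NULL, run_path, &handler);
    pthread_create(&b, NULL, run_path, &getter);
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    return handler.second_rc == EBUSY && getter.second_rc == EBUSY;
}
```

In this sketch abba_conflict() returns 1, because each thread still holds
the lock the other one is probing. With pthread_mutex_lock() in place of the
trylock, both threads would wait on each other indefinitely, which is the
hang pattern described above.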
>
>
>
> I changed some code; please review the patch below.
>
> *** ocfs2-ko-3.16/dlm/dlmmaster.c 2014-09-11 12:45:45.821657634 +0800
> --- ocfs2-ko-3.16_compared/dlm/dlmmaster.c 2014-09-11 18:54:34.970243238 +0800
> *************** way_up_top:
> *** 1506,1512 ****
> --- 1506,1515 ----
>   }
>
>   // mlog(0, "lockres is in progress...\n");
> + spin_unlock(&res->spinlock);
> +
>   spin_lock(&dlm->master_lock);
> + spin_lock(&res->spinlock);
>   found = dlm_find_mle(dlm, &tmpmle, name, namelen);
>   if (!found) {
>   mlog(ML_ERROR, "no mle found for this lock!\n");
> *************** way_up_top:
> *** 1551,1558 ****
>   set_bit(request->node_idx, tmpmle->maybe_map);
>   spin_unlock(&tmpmle->spinlock);
>
> - spin_unlock(&dlm->master_lock);
>   spin_unlock(&res->spinlock);
> + spin_unlock(&dlm->master_lock);
>
>   /* keep the mle attached to heartbeat events */
>   dlm_put_mle(tmpmle);
>