From: Peng Tao <bergwolf@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org,
Andriy Skulysh <Andriy_Skulysh@xyratex.com>,
Peng Tao <bergwolf@gmail.com>,
Andreas Dilger <andreas.dilger@intel.com>
Subject: [PATCH 12/26] staging/lustre/ldlm: Fix flock detection for different mounts
Date: Fri, 15 Nov 2013 00:42:59 +0800
Message-ID: <1384447393-13838-13-git-send-email-bergwolf@gmail.com>
In-Reply-To: <1384447393-13838-1-git-send-email-bergwolf@gmail.com>
From: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
A deadlock can happen when two processes take concurrent locks
on files located under different mount points.
Modify the flock deadlock detection algorithm to identify a process
by the pair PID+NID instead of PID+export.
This is done by searching for the blocking owner in all of the OBD's
exports that have the same NID.
Lustre-change: http://review.whamcloud.com/3276
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1601
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
drivers/staging/lustre/lustre/ldlm/ldlm_flock.c | 45 +++++++++++++++++++++--
1 file changed, 41 insertions(+), 4 deletions(-)
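The idea in the commit message can be illustrated outside the kernel. What follows is a minimal sketch, not the Lustre implementation: `struct blocked_req`, `find_blocked()`, `would_deadlock()` and the `by_nid` switch are hypothetical names invented for this illustration, and a flat array stands in for the per-export flock hashes. It walks the chain of blocked owners and reports a deadlock when the chain leads back to the requester, matching either on owner+NID+export (the old behaviour, as described above) or on owner+NID only (the new behaviour).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one blocked flock request (not a Lustre struct). */
struct blocked_req {
	uint64_t owner;    /* blocked process (the PID in the commit message) */
	uint64_t nid;      /* client node the blocked process runs on */
	int exp;           /* export (mount) the request arrived through */
	uint64_t bl_owner; /* process holding the conflicting lock */
	uint64_t bl_nid;   /* node of the blocking process */
	int bl_exp;        /* export the conflicting lock was taken through */
};

/*
 * Find the blocked request of (owner, nid). With by_nid == 0 the export
 * must match as well, which models a lookup in a single export only.
 */
static const struct blocked_req *
find_blocked(const struct blocked_req *t, size_t n,
	     uint64_t owner, uint64_t nid, int exp, int by_nid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (t[i].owner != owner || t[i].nid != nid)
			continue;
		if (by_nid || t[i].exp == exp)
			return &t[i];
	}
	return NULL;
}

/* Return 1 if letting req wait would close a cycle of waiters, else 0. */
static int
would_deadlock(const struct blocked_req *t, size_t n,
	       const struct blocked_req *req, int by_nid)
{
	uint64_t bl_owner = req->bl_owner;
	uint64_t bl_nid = req->bl_nid;
	int bl_exp = req->bl_exp;
	size_t hops;

	assert(req != NULL);
	assert(t != NULL || n == 0);

	/* A chain without repeats has at most n links, so bound the walk. */
	for (hops = 0; hops < n; hops++) {
		const struct blocked_req *w;

		w = find_blocked(t, n, bl_owner, bl_nid, bl_exp, by_nid);
		if (w == NULL)
			return 0;	/* the blocker is not waiting itself */

		/* Step to whoever blocks the blocker. */
		bl_owner = w->bl_owner;
		bl_nid = w->bl_nid;
		bl_exp = w->bl_exp;

		if (bl_owner == req->owner && bl_nid == req->nid &&
		    (by_nid || bl_exp == req->exp))
			return 1;
	}
	return 0;
}
```

As a usage example under the same assumptions: suppose owner 200 on NID 7 is already blocked through export 1 by owner 100, and owner 100 on the same NID now requests, through export 2, a lock held by owner 200. With `by_nid` set to 1 the walk finds owner 200's blocked request and returns 1; with `by_nid` set to 0 the lookup is confined to export 2, finds nothing, and returns 0, so the cycle goes unnoticed.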
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index 37ebd2a..396e58b 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -161,6 +161,31 @@ ldlm_flock_destroy(struct ldlm_lock *lock, ldlm_mode_t mode, __u64 flags)
  * one client holds a lock on something and want a lock on something
  * else and at the same time another client has the opposite situation).
  */
+
+struct ldlm_flock_lookup_cb_data {
+	__u64 *bl_owner;
+	struct ldlm_lock *lock;
+	struct obd_export *exp;
+};
+
+static int ldlm_flock_lookup_cb(struct cfs_hash *hs, struct cfs_hash_bd *bd,
+				struct hlist_node *hnode, void *data)
+{
+	struct ldlm_flock_lookup_cb_data *cb_data = data;
+	struct obd_export *exp = cfs_hash_object(hs, hnode);
+	struct ldlm_lock *lock;
+
+	lock = cfs_hash_lookup(exp->exp_flock_hash, cb_data->bl_owner);
+	if (lock == NULL)
+		return 0;
+
+	/* Stop on first found lock. Same process can't sleep twice */
+	cb_data->lock = lock;
+	cb_data->exp = class_export_get(exp);
+
+	return 1;
+}
+
 static int
 ldlm_flock_deadlock(struct ldlm_lock *req, struct ldlm_lock *bl_lock)
 {
@@ -175,16 +200,26 @@ ldlm_flock_deadlock(struct ldlm_lock *req, struct ldlm_lock *bl_lock)
 	class_export_get(bl_exp);
 	while (1) {
+		struct ldlm_flock_lookup_cb_data cb_data = {
+					.bl_owner = &bl_owner,
+					.lock = NULL,
+					.exp = NULL };
 		struct obd_export *bl_exp_new;
 		struct ldlm_lock *lock = NULL;
 		struct ldlm_flock *flock;
-		if (bl_exp->exp_flock_hash != NULL)
-			lock = cfs_hash_lookup(bl_exp->exp_flock_hash,
-					       &bl_owner);
+		if (bl_exp->exp_flock_hash != NULL) {
+			cfs_hash_for_each_key(bl_exp->exp_obd->obd_nid_hash,
+				&bl_exp->exp_connection->c_peer.nid,
+				ldlm_flock_lookup_cb, &cb_data);
+			lock = cb_data.lock;
+		}
 		if (lock == NULL)
 			break;
+		class_export_put(bl_exp);
+		bl_exp = cb_data.exp;
+
 		LASSERT(req != lock);
 		flock = &lock->l_policy_data.l_flock;
 		LASSERT(flock->owner == bl_owner);
@@ -198,7 +233,9 @@ ldlm_flock_deadlock(struct ldlm_lock *req, struct ldlm_lock *bl_lock)
 		if (bl_exp->exp_failed)
 			break;
-		if (bl_owner == req_owner && bl_exp == req_exp) {
+		if (bl_owner == req_owner &&
+		    (bl_exp->exp_connection->c_peer.nid ==
+		     req_exp->exp_connection->c_peer.nid)) {
 			class_export_put(bl_exp);
 			return 1;
 		}
--
1.7.9.5